As you wrap up your data analysis journey, you face a fun decision: how to share your discoveries effectively. Tables and graphs both have their moments to shine. Tables work great when you want your audience to spot exact values and compare them. But, here’s the twist: tables can look overwhelming at first, especially if they’re crammed with data.

But don’t worry! Styling and formatting are here to help. Pandas comes to the rescue. It lets you turn your data into stylish tables effortlessly.

In this article, we’ll explore some tricks to make your Pandas DataFrames look awesome and tell a clear data story. For the demonstrations, we’ll dive into Seaborn’s built-in “tips” dataset. This dataset is a nifty data frame with 244 rows and 7 variables, offering a peek into the world of tipping. This comprehensive collection includes variables like the tip amount in dollars, the bill total in dollars, the gender of the bill payer, the presence of smokers in the party, the day of the week, the time of day, and the party size. Ready to roll? Let’s jump right in!

Setup

Code

import pandas as pd
import numpy as np

Code

df = pd.read_csv(
    "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv"
)
df.sample(5)

Table 1: Original table
	total_bill	tip	sex	smoker	day	time	size
109	14.31	4.00	Female	Yes	Sat	Dinner	2
42	13.94	3.06	Male	No	Sun	Dinner	2
44	30.40	5.60	Male	No	Sun	Dinner	4
37	16.93	3.07	Female	No	Sat	Dinner	3
40	16.04	2.24	Male	No	Sat	Dinner	3

Built-in styling

In this article, we won’t cover every single feature of the Styler class because that might make it too long. Instead, we’ll concentrate on the practical aspects that can simplify your data analysis, improve data comprehension, and make your message clearer to your audience. So, instead of taking a deep dive, we’ll start with the low-hanging fruits and explore the built-in options that Pandas offers.

Before we dive in, let’s prepare an aggregated dataframe for our demonstrations. Here it is:

Code

grouped = df.groupby(["day", "smoker", "time"]).median(numeric_only=True)
grouped

Table 2: How the Data Appears After Grouping and Calculating Medians (Before Pandas Styling)
			total_bill	tip	size
day	smoker	time
Fri	No	Dinner	22.490	3.250	2.0
	No	Lunch	15.980	3.000	3.0
	Yes	Dinner	16.320	3.000	2.0
	Yes	Lunch	12.790	2.100	2.0
Sat	No	Dinner	17.820	2.750	2.0
Sat	Yes	Dinner	20.390	2.690	2.0
Sun	No	Dinner	18.430	3.020	3.0
Sun	Yes	Dinner	23.100	3.500	2.0
Thur	No	Dinner	18.780	3.000	2.0
	No	Lunch	15.235	2.105	2.0
	Yes	Lunch	16.470	2.560	2.0

Highlighting minimum and maximum values

Check out these handy helpers: highlight_min and highlight_max. They’re your go-to for spotting the lowest and highest values in each column. It’s a speedy way to emphasize the most important data points in each category.

Also, please note the format(precision=1, thousands=",", decimal=".") snippet, this is not exactly a built-in style feature but has everything to do with keeping those float numbers tidy. Pandas tends to display more decimal places than we often require, which can be a bit distracting. To tone it down a notch, we can lean on the format() and format_index() methods to fine-tune the precision. Trust me, it’s super useful!

Code

(
    grouped.style.format(precision=1, thousands=",", decimal=".")
    .highlight_max(
        axis=0, props="color:white; font-weight:bold; background-color:green;"
    )
    .highlight_min(axis=0, props="color:white; font-weight:bold; background-color:red;")
)

Table 3: Highlighting Maximum and Minimum Values with Pandas Styler
			total_bill	tip	size
day	smoker	time
Fri	No	Dinner	22.5	3.2	2.0
	No	Lunch	16.0	3.0	3.0
	Yes	Dinner	16.3	3.0	2.0
	Yes	Lunch	12.8	2.1	2.0
Sat	No	Dinner	17.8	2.8	2.0
Sat	Yes	Dinner	20.4	2.7	2.0
Sun	No	Dinner	18.4	3.0	3.0
Sun	Yes	Dinner	23.1	3.5	2.0
Thur	No	Dinner	18.8	3.0	2.0
	No	Lunch	15.2	2.1	2.0
	Yes	Lunch	16.5	2.6	2.0

Emphasizing Values Within a Range

Imagine you’re want to find the days when tips fell between 3 and 5 dollars. In this scenario, the highlight_between method comes to the rescue. Don’t forget to use the subset argument; it’s your trusty sidekick when you only want to work with selected columns.

Code

(
    grouped.reset_index()
    .style.format(precision=1, thousands=",", decimal=".")
    .highlight_between(
        left=3,
        right=5,
        subset=["tip"],
        axis=1,
        props="color:white; font-weight:bold; background-color:purple;",
    )
)

Table 4: Highlighting Data within a Specified Range Using Pandas Styler
	day	smoker	time	total_bill	tip	size
0	Fri	No	Dinner	22.5	3.2	2.0
1	Fri	No	Lunch	16.0	3.0	3.0
2	Fri	Yes	Dinner	16.3	3.0	2.0
3	Fri	Yes	Lunch	12.8	2.1	2.0
4	Sat	No	Dinner	17.8	2.8	2.0
5	Sat	Yes	Dinner	20.4	2.7	2.0
6	Sun	No	Dinner	18.4	3.0	3.0
7	Sun	Yes	Dinner	23.1	3.5	2.0
8	Thur	No	Dinner	18.8	3.0	2.0
9	Thur	No	Lunch	15.2	2.1	2.0
10	Thur	Yes	Lunch	16.5	2.6	2.0

Highlight Column-Wise Outliers

Utilizing parameters such as q_left=0.05, q_right=0.95, axis=0, and defining props='opacity: 10%;', we can highlight values residing outside the 5-95 percentile range.

Code

(
    grouped.style.format(precision=1, thousands=",", decimal=".").highlight_quantile(
        q_left=0.05, q_right=0.95, axis=0, props="opacity: 10%;"
    )
)

Table 5: Highlighting Outliers
			total_bill	tip	size
day	smoker	time
Fri	No	Dinner	22.5	3.2	2.0
	No	Lunch	16.0	3.0	3.0
	Yes	Dinner	16.3	3.0	2.0
	Yes	Lunch	12.8	2.1	2.0
Sat	No	Dinner	17.8	2.8	2.0
Sat	Yes	Dinner	20.4	2.7	2.0
Sun	No	Dinner	18.4	3.0	3.0
Sun	Yes	Dinner	23.1	3.5	2.0
Thur	No	Dinner	18.8	3.0	2.0
	No	Lunch	15.2	2.1	2.0
	Yes	Lunch	16.5	2.6	2.0

Spot trends Using Color Gradients

We can show data trends using both the background_gradient and text_gradient methods. These methods introduce gradient-style background colors and text shading to our visualizations. To optimize their impact, it’s advisable to first arrange your data with the sort_values method before applying the background_gradient.

Code

(
    grouped.reset_index()
    .sort_values(by="total_bill")
    .style.format(precision=1, thousands=",", decimal=".")
    .background_gradient(cmap="viridis", axis=0)
)

Table 6: Unveiling Data Trends with Pandas Styler’s `background_gradient`
	day	smoker	time	total_bill	tip	size
3	Fri	Yes	Lunch	12.8	2.1	2.0
9	Thur	No	Lunch	15.2	2.1	2.0
1	Fri	No	Lunch	16.0	3.0	3.0
2	Fri	Yes	Dinner	16.3	3.0	2.0
10	Thur	Yes	Lunch	16.5	2.6	2.0
4	Sat	No	Dinner	17.8	2.8	2.0
6	Sun	No	Dinner	18.4	3.0	3.0
8	Thur	No	Dinner	18.8	3.0	2.0
5	Sat	Yes	Dinner	20.4	2.7	2.0
0	Fri	No	Dinner	22.5	3.2	2.0
7	Sun	Yes	Dinner	23.1	3.5	2.0

Code

(
    grouped.reset_index()
    .sort_values(by="total_bill")
    .style.format(precision=1, thousands=",", decimal=".")
    .text_gradient(cmap="bwr", axis=0)
)

Table 7: Unveiling Data Trends with Pandas Styler’s `text_gradient`
	day	smoker	time	total_bill	tip	size
3	Fri	Yes	Lunch	12.8	2.1	2.0
9	Thur	No	Lunch	15.2	2.1	2.0
1	Fri	No	Lunch	16.0	3.0	3.0
2	Fri	Yes	Dinner	16.3	3.0	2.0
10	Thur	Yes	Lunch	16.5	2.6	2.0
4	Sat	No	Dinner	17.8	2.8	2.0
6	Sun	No	Dinner	18.4	3.0	3.0
8	Thur	No	Dinner	18.8	3.0	2.0
5	Sat	Yes	Dinner	20.4	2.7	2.0
0	Fri	No	Dinner	22.5	3.2	2.0
7	Sun	Yes	Dinner	23.1	3.5	2.0

Display Bar Charts within Your Table

Let’s explore a technique for highlighting the significance of values by embedding bar charts right within the cells. The blend of bar heights and color gradients can pack a powerful punch in your data storytelling arsenal. Don’t forget to experiment with the ‘align’ option, a handy tool that helps you position these bars within the cells just right, giving your visuals a polished look. Feel free to play around with the settings and find what clicks best with your data tales.

Code

(
    grouped.reset_index()
    .sort_values(by="total_bill")
    .style.format(precision=1, thousands=",", decimal=".")
    .bar(
        align="mean",
        cmap="bwr",
        height=50,
        width=60,
        props="width: 120px; border-right: 1px solid black;",
    )
)

Table 8: Bar Charts in Your Table with Alignment Set to ‘Mean’
	day	smoker	time	total_bill	tip	size
3	Fri	Yes	Lunch	12.8	2.1	2.0
9	Thur	No	Lunch	15.2	2.1	2.0
1	Fri	No	Lunch	16.0	3.0	3.0
2	Fri	Yes	Dinner	16.3	3.0	2.0
10	Thur	Yes	Lunch	16.5	2.6	2.0
4	Sat	No	Dinner	17.8	2.8	2.0
6	Sun	No	Dinner	18.4	3.0	3.0
8	Thur	No	Dinner	18.8	3.0	2.0
5	Sat	Yes	Dinner	20.4	2.7	2.0
0	Fri	No	Dinner	22.5	3.2	2.0
7	Sun	Yes	Dinner	23.1	3.5	2.0

Code

(
    grouped.reset_index()
    .sort_values(by="total_bill")
    .style.format(precision=1, thousands=",", decimal=".")
    .bar(
        align=0,
        cmap="bwr",
        height=50,
        width=60,
        props="width: 120px; border-right: 1px solid black;",
    )
)

Table 9: Bar Charts in Your Table with Alignment Set using a float number
	day	smoker	time	total_bill	tip	size
3	Fri	Yes	Lunch	12.8	2.1	2.0
9	Thur	No	Lunch	15.2	2.1	2.0
1	Fri	No	Lunch	16.0	3.0	3.0
2	Fri	Yes	Dinner	16.3	3.0	2.0
10	Thur	Yes	Lunch	16.5	2.6	2.0
4	Sat	No	Dinner	17.8	2.8	2.0
6	Sun	No	Dinner	18.4	3.0	3.0
8	Thur	No	Dinner	18.8	3.0	2.0
5	Sat	Yes	Dinner	20.4	2.7	2.0
0	Fri	No	Dinner	22.5	3.2	2.0
7	Sun	Yes	Dinner	23.1	3.5	2.0

Formatting

Below is an illustrative example of the myriad possibilities when it comes to fine-tuning the style of a DataFrame. In this code, we’ve undertaken various stylistic enhancements:

Background Gradient: We’ve applied a nice background color gradient to the columns along the vertical axis.
Caption: We’ve set a descriptive caption for the DataFrame.
Visual Styling: We’ve specified text alignment and introduced hovering options to make highlighted values pop even more.
Column Names Formatting: We’ve reformatted our column names for clarity and aesthetics.
Column Hiding: We’ve concealed the ‘smoker’ column, decluttering our view.
Index Hiding: We’ve also hidden the index labels for a cleaner look.
Numerical Formatting: We’ve individually specified the numerical representation, including adding a dollar sign for currency values.

This demonstration underscores the virtually endless possibilities for customizing the appearance of your DataFrame. However, it’s important to note that the physical attributes set by the set_table_styles method won’t be exported to Excel, should you choose to do so. Just a handy tidbit to keep in mind.

Code

# Start by resetting the index and renaming columns with underscores
# Replace underscores with spaces for better readability
(
    grouped.reset_index().rename(columns=lambda x: x.replace("_", " "))
    # Sort the DataFrame by the 'total bill' column
    .sort_values(by="total bill")
    # Apply Pandas Styler to format the table
    .style
    # Apply background color gradient to columns along the vertical axis (axis=0)
    .background_gradient(cmap="viridis", axis=0)
    # Set a caption for the table
    .set_caption("Exploring Dining Trends: Bill Amounts, Tips, and Party Sizes")
    # Customize the table's visual styling
    .set_table_styles(
        [
            {
                "selector": "th.col_heading",
                "props": "text-align: center; font-size: 1.5em;",
            },
            {"selector": "td", "props": "text-align: center;"},
            {
                "selector": "td:hover",
                "props": "font-style: italic; color: black; font-weight:bold; background-color : #ffffb3;",
            },
        ],
        overwrite=False,
    )
    # Apply custom formatting to the index labels (convert to uppercase)
    .format_index(str.upper, axis=1)
    # Hide the 'smoker' column from the table
    .hide(subset=["smoker"], axis=1)
    # Hide the index label (row numbers)
    .hide(axis="index")
    # Format specific columns with dollar signs and one decimal place
    .format(
        {
            "total bill": "$ {:.1f}",
            "tip": "$ {:.1f}",
            "size": "{:.0f}",
        }
    )
)

Table 10: Formatting tables
DAY	TIME	TOTAL BILL	TIP	SIZE
Fri	Lunch	$ 12.8	$ 2.1	2
Thur	Lunch	$ 15.2	$ 2.1	2
Fri	Lunch	$ 16.0	$ 3.0	3
Fri	Dinner	$ 16.3	$ 3.0	2
Thur	Lunch	$ 16.5	$ 2.6	2
Sat	Dinner	$ 17.8	$ 2.8	2
Sun	Dinner	$ 18.4	$ 3.0	3
Thur	Dinner	$ 18.8	$ 3.0	2
Sat	Dinner	$ 20.4	$ 2.7	2
Fri	Dinner	$ 22.5	$ 3.2	2
Sun	Dinner	$ 23.1	$ 3.5	2

Defining Custom Styling

If the predefined styling methods don’t meet your needs, you have the flexibility to create your custom styling function and use it with your DataFrame. You can apply styling element by element using the .applymap() method, or you can work on columns or rows using the .apply() method. If you need to work specifically on column header rows or indexes, both .apply_index() and .applymap_index() are at your disposal.

Code

# Define a custom styling function that highlights 'Sun' in yellow
def highlight_sunday(value, color):
    return f"color: {color};" if value == "Sun" else None


# Sample 10 random rows from the DataFrame
(
    df.sample(10).style
    # Set formatting options for numeric values
    .format(precision=1, thousands=",", decimal=".")
    # Apply the highlight_sunday function to the 'day' column and set the color to yellow
    .applymap(highlight_sunday, color="yellow", subset=["day"])
    # Apply opacity styling to cells in the 'size' column where the value is less than or equal to 3
    .applymap(lambda value: "opacity: 50%;" if (value <= 3) else None, subset=["size"])
    # Apply red text color styling to cells in the 'tip' column where the value is less than or equal to 3
    .applymap(lambda value: "color:red;" if (value <= 3) else None, subset=["tip"])
    # Use apply_index to style index labels. If the label is 'tip', 'day', or 'size', set the color to white, otherwise black.
    .apply_index(
        lambda s: np.where(
            s.isin(["tip", "day", "size"]), "opacity: 100%;", "opacity: 10%;"
        ),
        axis=1,
    )
)

Table 11: Applying Custom Styling
	total_bill	tip	sex	smoker	day	time	size
179	34.6	3.5	Male	Yes	Sun	Dinner	2
3	23.7	3.3	Male	No	Sun	Dinner	2
83	32.7	5.0	Male	Yes	Thur	Lunch	2
208	24.3	2.0	Male	Yes	Sat	Dinner	2
53	9.9	1.6	Male	No	Sun	Dinner	2
17	16.3	3.7	Male	No	Sun	Dinner	3
6	8.8	2.0	Male	No	Sun	Dinner	2
182	45.4	3.5	Male	Yes	Sun	Dinner	3
48	28.6	2.0	Male	No	Sun	Dinner	3
61	13.8	2.0	Male	Yes	Sat	Dinner	2

Other Useful Features

Get a Clearer Overview with `set_sticky`

What if you find yourself facing a DataFrame with more columns than can comfortably fit on your screen, yet you still wish to inspect each column individually? In the past, you might have resorted to using pd.set_option('display.max_columns', xyz) to expand the display. However, there’s a much more elegant solution: set_sticky.

set_sticky introduces a clever CSS trick that permanently pins the index or column headers within a scrolling frame. In our case, although the ‘tips’ DataFrame doesn’t have an excessive number of columns, we’ve concatenated 10 DataFrames together to showcase the remarkable utility of set_sticky. As you scroll horizontally, you’ll notice that you can now conveniently inspect all the columns while the index remains firmly in place, thanks to the magic of set_sticky. Let’s explore this feature below.

Code

(pd.concat([df for i in range(10)], axis=1).head().style.set_sticky(axis="index"))

Table 12: Demonstrating `set_sticky`
	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size	total_bill	tip	sex	smoker	day	time	size
0	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2	16.990000	1.010000	Female	No	Sun	Dinner	2
1	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3	10.340000	1.660000	Male	No	Sun	Dinner	3
2	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3	21.010000	3.500000	Male	No	Sun	Dinner	3
3	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2	23.680000	3.310000	Male	No	Sun	Dinner	2
4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4	24.590000	3.610000	Female	No	Sun	Dinner	4

Concatenating DataFrame Outputs

You can combine two or more Stylers if they have the same columns. This is particularly handy when presenting summary statistics for a DataFrame

Code

(
    df.groupby("day")[["total_bill", "tip", "size"]]
    .mean()
    .style.format(precision=1)
    .concat(
        df[["total_bill", "tip", "size"]]
        .agg(["mean", "median", "sum"])
        .style.format(precision=1)
        .relabel_index(["Average", "Median", "Sum"])
    )
)

Table 13: Demonstrating Easily Concatenating Different DataFrame Outputs
	total_bill	tip	size
day
Fri	17.2	2.7	2.1
Sat	20.4	3.0	2.5
Sun	21.4	3.3	2.8
Thur	17.7	2.8	2.5
Average	19.8	3.0	2.6
Median	17.8	2.9	2.0
Sum	4827.8	731.6	627.0

Exporting

To save your styled DataFrame in Excel format, you can export it to an .xlsx file, preserving the styling. To do this, ensure you have the ‘openpyxl’ package installed.

And there you have it, folks! This concludes our brief exploration of the formatting options available right within Pandas. I hope you found this article insightful and enjoyable. Until next time, happy coding!

Setup

Built-in styling

Highlighting minimum and maximum values

Emphasizing Values Within a Range

Highlight Column-Wise Outliers

Spot trends Using Color Gradients

Display Bar Charts within Your Table

Formatting

Defining Custom Styling

Other Useful Features

Get a Clearer Overview with set_sticky

Concatenating DataFrame Outputs

Exporting

Get a Clearer Overview with `set_sticky`