Pandas table beauty skills

Public account: You and the cabin by: Peter Editor: Peter

Hello, I’m Peter

This article describes how to beautify Pandas’ DataFrame data. This is done primarily through two methods in Pandas:

Styler.applymap: Returns a single string with CSS property-value pairs, element by element
Styler.apply: column, table, row mode that returns a Series or DataFrame of the same shape, where each value is a string with a CSS attribute value pair. When this method is used, it is passed by the parameter Axis, axis=0 means applied by column, and axis=1 means applied by row.

Official website study address:

Pandas.pydata.org/pandas-docs…
Pandas.pydata.org/pandas-docs…

Pandas serial

Pandas’ article has been in serial:

Simulated data

import pandas as pd
import numpy as np

# Set random seed to ensure the same result every time
np.random.seed(20)  

# Simulation data
df = pd.DataFrame({'A': np.linspace(1.10.10)})
# merge two Dataframes
df1 = pd.concat([df, pd.DataFrame(np.random.randn(10.4), columns=list("BCDE"))],axis=1)
df1
Copy the code

Linspace (start,stop,setp) in numpy: linspace(start,stop,setp) in numpy: linspace(start,stop,setp)

To see the effect of nulls, three nulls are deliberately set:

View and hide styles

The first is to look at the DataFrame style Settings:

We can use the Render method to see the values of the data box styles: we find the default CSS style code

Hidden index

The code to hide index:

Hides the specified column

We find that two columns of AC are hidden, using parameter subset parameter:

Here is a case study of the various display styles:

Case 1: Positive and negative colors

Less than 0 is red, greater than 0 is blue

def color_change(val) :
    color = 'red' if val < 0 else 'blue'
    return 'color: %s' % color  
Copy the code

Use applymap and call the written function
df1.style.applymap(color_change)
Copy the code

The applyMap method is for the entire DataFrame

Case 2: Highlight data

Maximum, minimum, and missing values can be highlighted. Write a simple highlighting function:

def highlight_min(x) :
    is_min = x == x.min(a)return ["background-color: yellow" if v else ' ' for v in is_min]
Copy the code

Example 3: Use the default highlighting function

There are three built-in highlighting functions written in Pandas by default:

Highlight_max () : the biggest
Highlight_min () : a minimum
Highlight_null () : a null value

We can also use the axis argument to display rows or columns

1. Highlight maximum value

2. Highlight the minimum value for each column

3, highlight null value: the default is red

Of course we can change the color by using the null_color parameter:

Use the parameter axis=1 to indicate the direction of the row:

Case 4: Chain call

Chain calls refer to using multiple functions (methods) simultaneously in the same style operation:

# color_change + highlight_max
df1.style.applymap(color_change).apply(highlight_min)
Copy the code

If you have a lot of methods, you can write:

# color_change + highlight_max
df1.style.\
applymap(color_change).\
apply(highlight_min)

Add parentheses to indicate that the code is a whole
(df1.style.
applymap(color_change).
apply(highlight_min))
Copy the code

Null highlighting and color change arguments are used simultaneously to implement chain calls:

Use both hidden column attributes and custom parameters:

Example 5: Partial data beautification

Subset controls the range of functions with a parameter subset, and the parameters passed in can be:

The column tag
List (NUMpy array)
Tuples (row_indexer, column_indexer)

Partial column attribute

Parameter subset specifies the column attributes we want to beautify:

Pd.indexslice constructs the list

Construct with pd.indexslice

Example 6: Formatting display of values (styler.format)

Styler.format controls the formatted output of numeric values, similar to string formatting in Python

The whole show

For example, we want to display only two decimal places for all data:

Use dictionaries to format specific columns

Format the column attribute names as keys and the format specification as values in dictionary form:

Use lambda anonymous functions

A null value displayed

The null value is replaced by the na_rep argument:

Use the word “null” instead:

Chain calls also use “-” instead:

emoticons

Using Emoji:

Example 7: Built-in styles

Built-in function

The built-in null highlighting function is used here:

Thermal map making

We use the Seaborn library to implement:

import seaborn as sns

cm = sns.light_palette("blue", as_cmap=True)
s = df1.style.background_gradient(cmap=cm)
s 
Copy the code

Continuous chromatographic

The range of the chromatography is specified by the parameters low and high:

Set_properties use

# # # set_properties use

df1.style.set_properties(**{"background":"yellow".# the background color
                           "color":"red".# font
                           "border-color":"black"})  # boundary
Copy the code

Example 8: Personalize bar charts

The default mode

Operate via the bar method:

Parameters of the align

There are three alignments:

Left: The minimum starts on the left side of the cell
Zero: The zero value is in the center of the cell
Mid: The center of the cell is (max-min)/2

Example 9: Style sharing

What if we created a style for one DataFrame and then wanted to use it directly in another DataFrame?

Let’s create a style style1:

Apply style1 to style2:

Example 10: Set the precision set_precision

It is usually used to specify the precision (decimal place) of the data in the data box:

Example 11: Set the title

Use the set_caption method

Missing value setting

The set_na_rep function is used:

(df1.style.set_na_rep("FAIL").format(None, na_rep="PASS", subset=["D"])
    .highlight_null("yellow"))
Copy the code

Example 13: Composite case

(df1.style
  .set_na_rep(The '-')  # set null
  .format({'C':'{:.4f}'.# Precision setting
           'D':'{:.4f}'.'E':'{:.2%}'},na_rep="-")  Precision and null values are set at the same time
  .highlight_null("green")  # Null highlight
  .applymap(color_change,subset=['D'])  Column D uses the color_change function
  .apply(highlight_min,subset=['C'])  # C column uses highlight_min
  .background_gradient(cmap='cubehelix',subset=['B'.'D'])  # Background color Settings
  .bar(subset=['E'], align='mid', color=['#60BCD4'.'#6A1B9A'])  # Bar chart Settings
  .set_caption("Title of Pandas Style")  # title Settings
  .hide_index()  # hide index
  .hide_columns(subset=['A']))  # hide column A
Copy the code

Example 14: Ultimate weapon

Version requires at least Pandas1.2.0
style1 = [
    dict(selector="th", props=[("font-size"."125%"), 
                               ("text-align"."center"),
                               ("background-color"."#F0F3CF"),
                               ('width'."100px"),
                               ('height'.'80px')),dict(selector="td", props=[("font-size"."105%"), 
                               ("text-align"."right"),
                               ('width'."150px"),
                               ('height'.'50px')),dict(selector="caption", props=[("caption-side"."top"),
                                    ("font-size"."150%"),
                                    ("font-weight"."bold"),
                                    ("text-align"."left"),
                                    ('height'.'50px'),
                                    ('color'.'#E74C3C')])]

style2 = {
    'A': [dict(selector='td', props=[('text-align'.'center'),
                                     ("font-weight"."bold"),
                                     ("text-transform"."capitalize")])],
    'B': [dict(selector='td', props=[('text-align'.'left'),
                                     ("font-style"."italic")])],
    'C': [dict(selector='td', props=[('text-decoration'.'underline'),
                                     ('text-decoration-color'.'red'),
                                     ('text-decoration-style'.'wavy')])]}
# chain calls various setup methods
(df1.style
  .set_na_rep(The '-')  # global null
  .format({'C':'{:.4f}'.'D':'{:.4f}'.'E':'{:.2%}'},na_rep="-")  # precision
  .highlight_null("gray")  # highlight null value
  .applymap(color_change,subset=['B']).highlight_max(subset=['A'])  # color_change function and maximum highlighting
  .background_gradient(cmap='Pastel1',subset=['C'.'D'])  # background
  .bar(subset=['E'], align='mid', color=['#90BCD4'.'#6A1B9A'])  # column
  .set_caption("Advanced use of Pandas Style")  # titles
 	.hide_index()  # hide index
  .hide_columns(subset=['E'])  # hide column E
  .set_table_styles(style1).set_table_styles(style2,overwrite=False)  # Style passing
  .set_properties(**{'font-family': 'Microsoft Yahei'.'border-collapse': 'collapse'.# Attribute Settings
                     'border-top': '1px solid black'.'border-bottom': '1px solid black'}))
Copy the code

Output to Excel

Openyxl or XlsWriter is used as an engine to export DataFrame to Excel.

Here’s a simple example:

(df1.style
   .applymap(color_change)  # plus and minus change colors
   .apply(highlight_min)  # Highlight the minimum
   .to_excel('styled.xlsx', engine='openpyxl'))
Copy the code

If we don’t want the index number, add index=False:

(df1.style
   .applymap(color_change)  # plus and minus change colors
   .apply(highlight_min)  # Highlight the minimum
   .to_excel('styled.xlsx', engine='openpyxl', index=False))
Copy the code