[TOC]

Datawhale2021.9 Study in teams

Data Visualization _task01

Matplotlib is a Python 2D plotting library that produces publication-quality graphics in a variety of hard-copy formats and cross-platform interactive environments. It can be used to draw static, animated, and interactive charts.

Matplotlib is available for Python scripts, Python and IPython shells, Jupyter Notebooks, Web application servers, and various graphical user interface toolkits, among others.

Matplotlib is one of the most popular data visualization tools in Python. The plotting interfaces of pandas and Seaborn are also built on top of Matplotlib.

To get a better understanding of Matplotlib, let’s start with some of the most basic concepts and move on to some advanced techniques.

1. Drawing examples

Matplotlib draws its images on figures (for example a window or a Jupyter widget), and each figure contains one or more axes (a sub-region in which a coordinate system can be specified). The easiest way to create a figure and its axes is the pyplot.subplots command. Once the axes exist, you can call Axes.plot on them to draw a simple line chart.

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()  # create a figure containing a single axes
ax.plot([1, 2, 3, 4], [1, 4, 2, 3]);  # draw a line through the points
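Since a figure can hold more than one axes, a minimal sketch of that case may help. It assumes the headless "Agg" backend so it runs without a display (drop that line in a notebook); the variable names and data values are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# One figure holding two axes side by side (1 row, 2 columns).
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
ax_left.plot([1, 2, 3, 4], [1, 4, 2, 3])
ax_right.plot([1, 2, 3, 4], [3, 2, 4, 1])

# Both axes are registered on the same figure.
n_axes = len(fig.axes)
print(n_axes)
```

Each axes object has its own coordinate system, so the two lines are drawn independently even though they share one figure.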

Note: if you use Matplotlib in a Jupyter Notebook, running the code also prints something like [<matplotlib.lines.Line2D at 0x248c0f10c70>]. This happens because the notebook displays the value of the last expression in a cell, and plot returns a list of Line2D objects. If you do not want this output, there are three ways to suppress it, all of which appear in the code examples in this section:

  1. Add a semicolon at the end of the last line of the code block.
  2. Add plt.show() at the end of the code block.
  3. Assign the return value to a variable, for example change plt.plot([1, 2, 3, 4]) to line = plt.plot([1, 2, 3, 4]).

1.1 The plot() function

Signature: plt.plot(*args, scalex=True, scaley=True, data=None, **kwargs)

    plot([x], y, [fmt], *, data=None, **kwargs)
    plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)

  • x, y: the coordinates of the data points or line nodes. x is the x-axis data and is optional; y is the y-axis data. Both can be lists or arrays.
  • fmt: optional format string that defines basic formatting such as color, marker, and line style.
  • **kwargs: optional Line2D properties for the plotted lines, such as the label, the line width, and so on.

plot(x, y)        # plot x and y using default line style and color
plot(x, y, 'bo')  # plot x and y using blue circle markers
plot(y)           # plot y using x as index array 0..N-1
plot(y, 'r+')     # ditto, but with red plusses
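The plot(y) form, where x is omitted, can be checked with a short sketch. The data values are illustrative, and the "Agg" backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

y = [10, 20, 15, 30]  # illustrative values
fig, ax = plt.subplots()
(line,) = ax.plot(y, 'r+')  # no x given, so the index 0..N-1 is used

x_used = line.get_xdata()
print(x_used)
```

plot always returns a list of Line2D objects, which is why the single line is unpacked with `(line,) = ...`.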

Lines and markers can be styled with either keyword arguments or a format string. The two forms can be mixed, but if they conflict, the keyword arguments take precedence over the format string. The following two calls yield identical results:

plot(x, y, 'go--', linewidth=2, markersize=12)
plot(x, y, color='green', marker='o', linestyle='dashed',linewidth=2, markersize=12)
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [4, 1, 3, 2], 'or', linewidth=2, markersize=14)
plt.show()
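To illustrate the precedence rule for conflicting styles, here is a small sketch in which the format string asks for green while the keyword asks for red. It assumes the "Agg" backend for headless runs; warnings are silenced because some Matplotlib versions warn about the redundant definition.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
with warnings.catch_warnings():
    # Some Matplotlib versions warn when fmt and a keyword define the same property.
    warnings.simplefilter("ignore")
    (line,) = ax.plot([1, 2, 3, 4], [4, 1, 3, 2], 'go--', color='red', linewidth=2)

# The marker and line style still come from the format string; the color does not.
print(line.get_color(), line.get_marker(), line.get_linestyle())
```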

1.1.1 Plotting labeled data

There is a convenient way to plot objects with labeled data (that is, data that can be accessed by index, such as obj['y']). Instead of passing the data in x and y, you can provide the object in the data parameter and give only the labels for x and y:

plot('xlabel', 'ylabel', data=obj)

All indexable objects are supported. This could be, for example, a dict, a pandas.DataFrame, or a structured NumPy array.

import matplotlib.pyplot as plt
import numpy as np

# The step of np.arange and the definition of y were lost in the original
# listing; 0.02 and np.sin(x) are assumed stand-ins.
x = np.arange(0, 2*np.pi, 0.02)
y = np.sin(x)
fig, ax = plt.subplots()
plotdict = {'dx': x, 'dy': y}
ax.plot('dx', 'dy', data=plotdict)
plt.show()

The key call is ax.plot('dx', 'dy', data=plotdict).
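The same data mechanism should also work with a structured NumPy array, one of the indexable types mentioned above. The sketch below is illustrative: the field names time and signal and the cosine data are hypothetical, and the "Agg" backend is assumed for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Structured array with two named fields (hypothetical names and data).
samples = np.zeros(50, dtype=[('time', float), ('signal', float)])
samples['time'] = np.linspace(0, 2 * np.pi, 50)
samples['signal'] = np.cos(samples['time'])

fig, ax = plt.subplots()
# The strings are looked up in `samples` instead of being plotted directly.
(line,) = ax.plot('time', 'signal', data=samples)
print(line.get_xdata()[:3])
```

It is safer to avoid field names that are also valid format strings (for example 'v' or 'o'), because the second positional string could then be read as fmt.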

1.1.2 Drawing multiple groups of data

There are several ways to plot multiple sets of data.

  • The most direct way is to call plot multiple times.
   plot(x1, y1, 'bo')
   plot(x2, y2, 'go')
  • Or, if your data is already a two-dimensional array, you can pass it directly to x, y. Each column is plotted as a separate data set.

Example: an array a whose first column holds the x values and whose other columns hold the y values:

   plot(a[:, 0], a[:, 1:])

In the complete example, the key call is plt.plot(plotarr[:, 0], plotarr[:, 1:]):

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 100)
y1 = x
y2 = x**2
y3 = x**3
plotarr = np.array([x, y1, y2, y3]).transpose()
plt.plot(plotarr[:, 0], plotarr[:, 1:])
plt.show()

  • The third way is to specify multiple [x], y, [fmt] groups in a single call:
   plot(x1, y1, 'g^', x2, y2, 'g-')

In this case, any additional keyword arguments apply to all data sets. Also, this syntax cannot be combined with the data parameter.
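The three approaches can be compared side by side by counting the Line2D objects each one returns. This is a sketch with illustrative data; the variable names are hypothetical and the "Agg" backend is assumed so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 20)
fig, ax = plt.subplots()

# 1) Separate calls: each call returns a list with one Line2D.
lines_a = ax.plot(x, x, 'bo') + ax.plot(x, x**2, 'go')

# 2) 2D array: one line per column of the y argument.
table = np.column_stack([x, x, x**2, x**3])
lines_b = ax.plot(table[:, 0], table[:, 1:])

# 3) Several [x], y, [fmt] groups; the linewidth keyword applies to both lines.
lines_c = ax.plot(x, x, 'g^', x, x**3, 'g-', linewidth=3)

print(len(lines_a), len(lines_b), len(lines_c))
```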

By default, each line is assigned a different style specified by a "style cycle". The fmt and line property parameters are only necessary if you want to deviate explicitly from these defaults. Alternatively, you can change the style cycle itself through rcParams["axes.prop_cycle"].
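As a rough sketch of changing the style cycle, the snippet below sets axes.prop_cycle temporarily through plt.rc_context, using the cycler package that ships as a Matplotlib dependency. The three colors are illustrative choices, and the "Agg" backend is assumed for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler
from matplotlib.colors import same_color

x = np.linspace(0, 2, 20)
custom_cycle = cycler(color=['tab:red', 'tab:green', 'tab:blue'])  # illustrative colors

# rc_context restores the previous settings when the block ends.
with plt.rc_context({'axes.prop_cycle': custom_cycle}):
    fig, ax = plt.subplots()
    lines = [ax.plot(x, x**k)[0] for k in (1, 2, 3)]

# Successive lines pick up successive entries of the cycle.
colors = [line.get_color() for line in lines]
print(colors)
```

Setting the cycle once avoids repeating a color in every plot call; for a single axes, ax.set_prop_cycle offers a narrower alternative.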

1.1.3 Parameter Dictionary

1.1.3.1 **kwargs

**kwargs: Line2D properties, optional. They specify properties such as the line label (for automatic legends), line width, antialiasing, and marker face color.

     plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
     plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')

If you create multiple lines with a single plot call, the kwargs apply to all of those lines.

Properties:

agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array

alpha: float or None

animated: bool

antialiased or aa: bool

clip_box: Bbox

clip_on: bool

clip_path: Patch or (Path, Transform) or None

color or c: color

contains: unknown

dash_capstyle: {'butt', 'round', 'projecting'}

dash_joinstyle: {'miter', 'round', 'bevel'}

dashes: sequence of floats (on/off ink in points) or (None, None)

data: (2, N) array or two 1D arrays

drawstyle or ds: {'default', 'steps', 'steps-pre', 'steps-mid', 'steps-post'}, default: 'default'

figure: Figure

fillstyle: {'full', 'left', 'right', 'bottom', 'top', 'none'}

gid: str

in_layout: bool

label: object

linestyle or ls: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}

linewidth or lw: float

marker: marker style string, Path or MarkerStyle

markeredgecolor or mec: color

markeredgewidth or mew: float

markerfacecolor or mfc: color

markerfacecoloralt or mfcalt: color

markersize or ms: float

markevery: None or int or (int, int) or slice or List[int] or float or (float, float) or List[bool]

path_effects: AbstractPathEffect

picker: unknown

pickradius: float

rasterized: bool or None

sketch_params: (scale: float, length: float, randomness: float)

snap: bool or None

solid_capstyle: {'butt', 'round', 'projecting'}

solid_joinstyle: {'miter', 'round', 'bevel'}

transform: matplotlib.transforms.Transform

url: str

visible: bool

xdata: 1D array

ydata: 1D array

zorder: float
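A few of these properties can be exercised in a short sketch: some are passed as keyword arguments to plot, others are changed afterwards with the matching setters on the returned Line2D. The chosen property values are illustrative, and the "Agg" backend is assumed for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line1,) = ax.plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
(line2,) = ax.plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')

# Properties can also be changed after creation with the matching setters.
line2.set_markersize(10)
line2.set(alpha=0.5, markerfacecolor='none')

# The legend entries are built from the label property of each line.
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```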

1.1.3.2 fmt

A format string consists of a part for the marker, a part for the line, and a part for the color: fmt = '[marker][line][color]'

Each of them is optional. If a part is not provided, the value from the style cycle is used. Exception: if line is given but no marker, the data is drawn as a line without markers. Other combinations such as [color][marker][line] are also supported, but note that their parsing may be ambiguous.

Markers

=============   ===============================
character       description
=============   ===============================
'.'             point marker
','             pixel marker
'o'             circle marker
'v'             triangle_down marker
'^'             triangle_up marker
'<'             triangle_left marker
'>'             triangle_right marker
'1'             tri_down marker
'2'             tri_up marker
'3'             tri_left marker
'4'             tri_right marker
's'             square marker
'p'             pentagon marker
'*'             star marker
'h'             hexagon1 marker
'H'             hexagon2 marker
'+'             plus marker
'x'             x marker
'D'             diamond marker
'd'             thin_diamond marker
'|'             vline marker
'_'             hline marker
=============   ===============================

Line Styles

=============   ===============================
character       description
=============   ===============================
'-'             solid line style
'--'            dashed line style
'-.'            dash-dot line style
':'             dotted line style
=============   ===============================

Example format strings:

'b'    # blue markers with default shape
'or'   # red circles
'-g'   # green solid line
'--'   # dashed line with default color
'^k:'  # black triangle_up markers connected by a dotted line
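How a format string is split into marker, line style, and color can be inspected on the returned Line2D objects. The sketch below uses three of the example strings; the helper style_of is hypothetical, and the "Agg" backend is assumed for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import same_color

fig, ax = plt.subplots()
y = [1, 3, 2, 4]  # illustrative values
(circles,) = ax.plot(y, 'or')   # marker + color, no line part
(solid,) = ax.plot(y, '-g')     # line + color, no marker part
(dotted,) = ax.plot(y, '^k:')   # marker + color + line


def style_of(line):
    """Hypothetical helper: return (marker, linestyle) of a Line2D."""
    return line.get_marker(), line.get_linestyle()


for line in (circles, solid, dotted):
    print(style_of(line), line.get_color())
```

Matplotlib reports a missing marker or line as the string 'None', which is how the "line without markers" exception shows up in practice.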

Colors

The supported color abbreviations are the single letter codes

=============   ===============================
character       color
=============   ===============================
'b'             blue
'g'             green
'r'             red
'c'             cyan
'm'             magenta
'y'             yellow
'k'             black
'w'             white
=============   ===============================

and the 'CN' colors that index into the default property cycle.

If the color is the only part of the format string, you can additionally use any matplotlib.colors specification, such as a full name ('green') or a hexadecimal string ('#008000').
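A color-only format string can be tried with a full name, a hex string, and a 'CN' code. This sketch is illustrative and assumes the "Agg" backend for headless runs; colors are compared with matplotlib.colors.same_color because the stored representation may differ between forms.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import same_color

fig, ax = plt.subplots()
y = [1, 2, 3]  # illustrative values
(by_name,) = ax.plot(y, 'green')    # full color name
(by_hex,) = ax.plot(y, '#008000')   # hexadecimal string
(by_cycle,) = ax.plot(y, 'C1')      # second color of the property cycle

# 'C1' indexes into the current default property cycle.
second_default = plt.rcParams['axes.prop_cycle'].by_key()['color'][1]
print(same_color(by_name.get_color(), by_hex.get_color()))
print(same_color(by_cycle.get_color(), second_default))
```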

2. The composition of a figure

Now let us take a closer look at the composition of a figure. Dissecting a figure shows that a complete Matplotlib image usually consists of the following four levels, also known as containers, which are described in detail in the next section. In Matplotlib we manipulate each part of the image through various command methods to achieve the final visualization; a complete image is really a collection of sub-elements at these levels.

  • Figure: the top level, which holds all drawing elements

  • Axes: the core of the Matplotlib world; it contains a large number of elements and is used to construct a subplot. A figure can consist of one or more subplots

  • Axis: the level below Axes, which handles all elements related to the coordinate axes and the grid

  • Tick: the level below Axis, which handles all elements related to the ticks
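The four container levels can be walked from the top down on a live figure. The following sketch is only illustrative: the variable names are hypothetical, and the "Agg" backend is assumed so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.axis import Axis, Tick
from matplotlib.figure import Figure

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

first_axes = fig.axes[0]          # Figure -> Axes
x_axis = first_axes.xaxis         # Axes -> Axis
ticks = x_axis.get_major_ticks()  # Axis -> Tick

for obj in (fig, first_axes, x_axis, ticks[0]):
    print(type(obj).__name__)
```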

3. Two drawing interfaces

Matplotlib provides two commonly used drawing interfaces:

  1. Explicitly create the figure and axes and call drawing methods on them, also known as the object-oriented style
x = np.linspace(0, 2, 100)

fig, ax = plt.subplots()  
ax.plot(x, x, label='linear')  
ax.plot(x, x**2, label='quadratic')  
ax.plot(x, x**3, label='cubic')  
ax.set_xlabel('x label') 
ax.set_ylabel('y label') 
ax.set_title("Simple Plot")  
ax.legend() 
plt.show()

  2. Rely on pyplot to automatically create the figure and axes, and draw through pyplot functions

x = np.linspace(0, 2, 100)

plt.plot(x, x, label='linear') 
plt.plot(x, x**2, label='quadratic')  
plt.plot(x, x**3, label='cubic')
plt.xlabel('x label')
plt.ylabel('y label')
plt.title("Simple Plot")
plt.legend()
plt.show()
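The two interfaces act on the same underlying objects: pyplot functions operate on the current axes. The sketch below mixes them deliberately to show this, which is for illustration only rather than a recommended style; the "Agg" backend is assumed for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # object-oriented style: explicit figure and axes

# pyplot style: these calls are routed to the current axes, which is `ax`.
plt.plot([0, 1, 2], [0, 1, 4], label='quadratic')
plt.title("Simple Plot")

same_axes = plt.gca() is ax
print(same_axes, ax.get_title(), len(ax.get_lines()))
```

With several figures or axes in play, the explicit object-oriented style makes it clearer which axes each call affects.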