This article uses the scatter function in the matplotlib.pyplot module to draw scatter plots. The code below imports the required libraries. The make_blobs function generates the data, and the first five rows of the resulting df are shown.

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_blobs
import numpy as np
data=make_blobs(n_samples=150,n_features=4,centers=3,random_state=20220203)
df=pd.DataFrame(data[0],columns=['v1','v2','v3','v4'])
df['target']=data[1]
df.head()
         v1        v2        v3        v4  target
0  1.046182  1.747420  1.680394  6.627963       2
1  3.654120  2.169111  1.494711  4.880651       2
2  2.227285  0.497507  2.115962  8.216452       2
3  3.995025  1.423546  0.993872  5.559633       2
4  1.994595  1.276570  0.905848  7.997673       2

We first explain each parameter, so that the plots can be tuned later.

plt.scatter() parameters:

x, y: The input data sequences to plot.

s: The marker size; can be a scalar or a vector.

c: A color sequence or a single color. It can be one of:

1. A scalar or a sequence of the same length as x, mapped to colors through the cmap and norm parameters.
2. A 2D array whose rows are RGB or RGBA values.
3. A sequence of colors of the same length as x.
4. A string naming a single color.

marker: The marker shape; defaults to 'o'.

cmap: A colormap name or Colormap instance; used only when c is a sequence of floating-point numbers.

norm: Normalizes the numeric values in c. If omitted, the default normalization is used.

vmin, vmax: Floating-point numbers that set the normalization range of the color values when norm is not given.

alpha: The transparency of the markers.

linewidths: The width of the marker edges; a float or a vector.

edgecolors: The edge color of the markers. By default it matches the face color. It can also be 'none' or a sequence of colors.
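
To make the roles of norm, vmin/vmax and cmap more concrete, the following minimal sketch reproduces the value-to-color mapping by hand. The array values is a hypothetical stand-in for a numeric column passed as c, and the 'viridis' colormap is chosen only for illustration; neither comes from the article's data.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Hypothetical values standing in for a numeric column passed as c
values = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Linear normalization: vmin maps to 0.0 and vmax maps to 1.0
norm = Normalize(vmin=values.min(), vmax=values.max())
scaled = norm(values)

# The colormap turns each normalized value into one RGBA color
cmap = plt.get_cmap('viridis')
rgba = cmap(scaled)

print(scaled)      # normalized values between 0 and 1
print(rgba.shape)  # one RGBA row per input value
```

Passing c=values together with cmap='viridis', vmin=2 and vmax=10 to plt.scatter should yield the same colors, since scatter applies a linear normalization when norm is not given.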

1.1 Basic plotting

We start with the simplest case: a scatter plot of v1 against v2 with all other parameters left at their defaults.

plt.scatter(x=df['v1'],y=df['v2'])
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.1 Set the point size with a scalar

Setting s to a scalar makes all points the same size.

plt.scatter(x=df['v1'],y=df['v2'],s=100)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.2 Set the point size with a vector

Setting the point size with a vector lets one plot carry three dimensions of information: the horizontal position, the vertical position, and the point size.

plt.scatter(x=df['v1'],y=df['v2'],s=df['target'] *100+50)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
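
One detail worth keeping in mind when sizing by a vector: Matplotlib documents s as an area in points squared, so doubling s does not double the marker width. The sketch below, which assumes three hypothetical class labels in place of df['target'], squares the desired widths before passing them to s. The variable names and the 10-point step are illustrative choices, not part of the article's code.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical class labels standing in for df['target']
labels = np.array([0, 1, 2])
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 1.0])

# s is an area (points squared), so square the desired marker widths
diameters = labels * 10 + 10   # desired widths in points: 10, 20, 30
sizes = diameters ** 2         # areas passed to s: 100, 400, 900

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes)
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()
```

With this scaling the marker widths grow roughly linearly with the label, whereas passing the widths directly as s would make the differences look much smaller.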

1.1.3 Set the point shape

First of all, we selected several common point types and drew them as follows:

markers=['.',',','o','v','^','<','>','1','8','s','p','P','*','H','h','+','x','X','D','d']  # commonly used marker shapes
fig,axs=plt.subplots(4,5,figsize=(25,20))
for j in range(4):
    for i in range(1,6):
        # draw one large marker per subplot and label it with its marker code
        axs[j][i-1].scatter(1,1,marker=markers[i-1+j*5],s=1000)
        axs[j][i-1].text(x=1.01,y=1.01,s=str(markers[i-1+j*5]))
plt.show()

Here is an example that changes the marker shape, using the data from this article:

plt.scatter(x=df['v1'],y=df['v2'],s=100,marker='*')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.4 Set a single color

Passing c='red' colors every point red.

plt.scatter(x=df['v1'],y=df['v2'],s=100,c='red')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.5 Set transparency

The alpha parameter sets the transparency (alpha=0.4 here). With semi-transparent points, it is easier to see where the data overlap.

plt.scatter(x=df['v1'],y=df['v2'],s=100,c='red',alpha=0.4)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.6 Add edge color

Here edgecolor='black' gives each point a black outline, purely for appearance.

plt.scatter(x=df['v1'],y=df['v2'],c='red',s=100,alpha=0.6,edgecolor='black')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.1.7 Setting the Edge Width

Setting linewidths=2 thickens the outline, again purely for appearance.

plt.scatter(x=df['v1'],y=df['v2'],c='red',s=100,alpha=0.6,linewidths=2,edgecolor='black')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.2 Advanced plotting

1.2.1 Use a color vector to represent classes

plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=100,alpha=0.6,
            linewidths=2,edgecolor='black',cmap='Set1')
plt.colorbar()
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

1.2.2 Use continuous color vectors

plt.scatter(x=df['v1'],y=df['v2'],c=df['v4'],s=100,alpha=0.7,
            linewidths=2,edgecolor='black',cmap='Set1',vmin=min(df['v4']),vmax=max(df['v4']))
plt.colorbar()
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()

plt.scatter(x=df['v1'],y=df['v2'],c=df['v4'],s=100,alpha=0.7,
            linewidths=2,edgecolor='black',cmap='hsv',vmin=min(df['v4']),vmax=max(df['v4']))
plt.xlabel('v1')
plt.ylabel('v2')
plt.colorbar()
plt.show()
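
For continuous values, a sequential colormap is often easier to read than a qualitative one such as 'Set1', which groups values into a few unrelated colors. The following is a minimal sketch under stated assumptions: the random arrays are hypothetical stand-ins for df['v1'], df['v2'] and df['v4'], and 'viridis' is just one possible sequential colormap.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for df['v1'], df['v2'] and df['v4']
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = rng.normal(size=50)
value = rng.uniform(0, 10, size=50)

fig, ax = plt.subplots()
# Without vmin/vmax the color range is taken from the data itself
sc = ax.scatter(x, y, c=value, cmap='viridis', s=100, alpha=0.7,
                linewidths=2, edgecolor='black')

# Attach the colorbar to the scatter and label it
cbar = fig.colorbar(sc, ax=ax)
cbar.set_label('value')
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()
```

Because the color limits default to the minimum and maximum of c, passing vmin and vmax explicitly is mainly useful when several plots should share one color scale.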

1.2.3 Bubble chart

plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=abs(df['v4']) *80,alpha=0.6,
            linewidths=2,edgecolor='black',cmap='Set1')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
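
A bubble chart is easier to interpret when the reader can tell what the bubble sizes mean. One possible approach, sketched here, is the legend_elements method of the object returned by scatter, called with prop='sizes'. The random arrays are hypothetical stand-ins for the article's columns, and the scale variable simply mirrors the multiplier of 80 used in the bubble chart.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for df['v1'], df['v2'] and df['v4']
rng = np.random.default_rng(0)
x = rng.normal(size=40)
y = rng.normal(size=40)
v4 = rng.uniform(1, 9, size=40)

scale = 80  # same multiplier as in the bubble chart
fig, ax = plt.subplots()
sc = ax.scatter(x, y, s=np.abs(v4) * scale, alpha=0.6,
                linewidths=2, edgecolor='black')

# func inverts the size scaling so the legend labels show data values
handles, labels = sc.legend_elements(prop='sizes', num=4,
                                     func=lambda s: s / scale, alpha=0.6)
ax.legend(handles, labels, title='|v4|', loc='upper right')
plt.show()
```

The num argument only suggests how many legend entries to create, so the exact number of entries can differ from the requested value.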

1.2.4 Legend and labels: method 1

import matplotlib.patches as mpatches
plt.rcParams['font.sans-serif'] = ['fangsong']  # use the FangSong font for sans-serif text
plt.rcParams['font.size'] = 12  # set the font size
plt.rcParams['axes.unicode_minus'] = False  # draw minus signs with the ASCII hyphen so they display correctly
plt.rcParams["axes.facecolor"] = "cornsilk"  # set the axes background color
plt.figure(figsize=(10,6))
plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=abs(df['v4']) *80,alpha=0.6,
            linewidths=2,edgecolor='black',cmap='Set1')
plt.xlabel('horizontal')
plt.ylabel('vertical')
plt.title('Bubble chart of fictitious data')
cmap1=plt.get_cmap('Set1')
colors=cmap1(np.arange(0,9,4))  # Set1 indices 0, 4 and 8: the colors assigned to targets 0, 1 and 2
labels=[mpatches.Patch(color=colors[i],label=i) for i in range(len(colors))]
plt.legend(handles=labels,loc='lower right')
plt.grid()
plt.show()

Method 2

plt.figure(figsize=(10,6))
colors=['red','green','blue']
for class1 in df['target'].value_counts().index:
    x=df[df['target']==class1]['v1']
    y=df[df['target']==class1]['v2']
    plt.scatter(x,y,c=colors[class1],s=100,alpha=0.6,
            linewidths=2,edgecolor='black',label=class1)
plt.grid()
plt.xlabel('horizontal')
plt.ylabel('vertical')
plt.title('Bubble chart of fictitious data')
plt.legend()
plt.show()
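
A third option, offered here only as a sketch, is to let Matplotlib build the class legend from a single scatter call through the legend_elements method of the returned collection (available in Matplotlib 3.1 and later). The arrays target, v1 and v2 are hypothetical stand-ins for the article's df columns.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for df['target'], df['v1'] and df['v2']
rng = np.random.default_rng(0)
target = np.repeat([0, 1, 2], 20)
v1 = rng.normal(loc=target * 3.0, size=60)
v2 = rng.normal(size=60)

fig, ax = plt.subplots(figsize=(10, 6))
sc = ax.scatter(v1, v2, c=target, cmap='Set1', s=100, alpha=0.6,
                linewidths=2, edgecolor='black')

# One legend handle is created per distinct value in c
handles, labels = sc.legend_elements()
ax.legend(handles, labels, title='target', loc='lower right')
ax.grid()
ax.set_xlabel('horizontal')
ax.set_ylabel('vertical')
plt.show()
```

Compared with the two methods above, this avoids both the manual patches and the per-class loop, at the cost of less control over each legend entry.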