PCA (Principal Component Analysis) is a method for dimensionality reduction. It maps high-dimensional data into a low-dimensional space, discarding the directions that carry little information, which makes the data easier to analyze.

An implementation is provided in Python's scikit-learn (sklearn) library:

sklearn.decomposition.PCA(n_components=None, copy=True, whiten=False)

Parameter Description:

(1) n_components

The number of principal components to keep, i.e., how many features remain after the reduction.

Type: int, float, or string. Default: None, which keeps all components. Assign an int, such as n_components=1, to reduce the original data to one dimension.

Assign the string 'mle', as in n_components='mle', to estimate the number of components automatically using Minka's MLE. Alternatively, a float between 0 and 1, such as n_components=0.95, keeps just enough components to explain that fraction of the variance.
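A minimal sketch of the three ways to set n_components described above (the data here is random and purely illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((10, 4))  # 10 samples, 4 features

# An int keeps exactly that many components.
pca_int = PCA(n_components=2).fit(X)
print(pca_int.n_components_)  # 2

# A float in (0, 1) keeps enough components to explain that
# fraction of the total variance.
pca_frac = PCA(n_components=0.95).fit(X)
print(pca_frac.n_components_)

# 'mle' estimates the dimensionality automatically
# (requires n_samples >= n_features).
pca_mle = PCA(n_components='mle').fit(X)
print(pca_mle.n_components_)
```

The fitted attribute n_components_ (with a trailing underscore) reports the number of components actually kept.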

(2) copy

Type: bool. Default: True.

Indicates whether the original data is copied before the algorithm runs. With True, the algorithm runs on a copy, leaving the original data set unchanged.

(3) whiten

Type: bool. Default: False.

Whitening rescales the transformed components so that each one has unit variance.

Here’s an example:

import numpy as np
from sklearn.decomposition import PCA
D = np.random.rand(10, 4)  # 10 samples with 4 features
pca = PCA()
pca.fit(D)
print(pca.components_)                # principal axes in feature space
print(pca.explained_variance_ratio_)  # fraction of variance per component

Output (the exact values vary from run to run, since the input is random):

array([[ 0.50337803, -0.58076386, -0.57386558,  0.28284659],
       [-0.16288191, -0.73178568,  0.65241137,  0.11098921],
       [ 0.26130446, -0.21547305, -0.01640765, -0.94075615],
       [-0.80734133, -0.2842084 , -0.49474083, -0.15052265]])

array([0.48584656, 0.30710106, 0.12088463, 0.08616775])
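The example above only fits the model; to actually reduce the data, call transform (or fit_transform). A short sketch, again on random illustrative data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
D = rng.random((10, 4))

# Project the 10x4 data onto its first 2 principal components.
pca = PCA(n_components=2)
D2 = pca.fit_transform(D)
print(D2.shape)  # (10, 2)

# Fraction of the original variance retained by the 2 components.
print(pca.explained_variance_ratio_.sum())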