Link to original article: tecdat.cn/?p=22492

Original source: Tuoduan Data Tribe public account

 

 

We will use the wine data set for principal component analysis.

data

The data frame contains 177 samples and 13 variables; Vintages contains the class labels. The data are the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars: Nebbiolo, Barbera and Grignolino. The wine made from the Nebbiolo grape is called Barolo.

The data record the amounts of several constituents found in each of the three types of wine.
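The original post loads the data before this point; the file name and column layout below are assumptions, shown only to illustrate how the `no` data frame and the `vint` class factor used in the rest of the post could be created:

```r
## hypothetical loading step: file name and column names are assumptions
wine <- read.csv("wine.csv")               # 177 samples, 13 measurements + class label
no   <- wine[, names(wine) != "Vintages"]  # the 13 chemical variables
vint <- factor(wine$Vintages)              # class labels: Barolo, Grignolino, Barbera
table(vint)                                # samples per wine type
```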

 

```r
head(no)
```

The output

Transform and standardize data

Apply a logarithmic transformation and then standardize, putting all variables on the same scale.

 

```r
no_log <- log(no)           # log transform
log_scale <- scale(no_log)  # center and scale each variable
head(log_scale)
```

Principal Component Analysis (PCA)

The singular value decomposition (SVD) algorithm is used for the principal component analysis.
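As a quick illustration of that relationship (on a small random matrix, not the wine data): the loadings returned by prcomp() are the right singular vectors from svd(), and the component standard deviations are the singular values divided by sqrt(n - 1).

```r
set.seed(1)
X <- matrix(rnorm(20), nrow=5)                  # small example matrix
p <- prcomp(X, center=FALSE)                    # PCA via SVD
s <- svd(X)                                     # the SVD itself: X = U D V'
all.equal(abs(unname(p$rotation)), abs(s$v))    # loadings match V (up to sign)
all.equal(p$sdev, s$d / sqrt(nrow(X) - 1))      # sdev matches D / sqrt(n - 1)
```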

 

```r
PCA <- prcomp(log_scale, center=FALSE)
summary(PCA)
```
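summary(PCA) reports each component's standard deviation, proportion of variance, and cumulative proportion. The variance figures can also be computed directly from the stored standard deviations:

```r
prop <- PCA$sdev^2 / sum(PCA$sdev^2)  # proportion of variance per component
round(prop, 3)
round(cumsum(prop), 3)                # cumulative proportion
```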

 

Basic graphics (default settings)

Principal component score plot and loading plot with base graphics

```r
scores   <- PCA$x         # principal component scores
loadings <- PCA$rotation  # variable loadings

## score plot
plot(scores[,1:2],  # x and y data
     pch=21,        # point shape
     cex=1.5)       # point size
legend("topright",  # legend position
       legend=levels(vint))

## loading plot
plot(loadings[,1:2],  # x and y data
     pch=21)          # point shape
text(loadings[,1:2],  # label positions
     labels=rownames(loadings))
```

 

 

In addition, we can add 95% confidence ellipses for the groups in the score plot.

Confidence ellipse plot function

 

```r
## Score plot with per-group confidence ellipses.
## Uses plyr::dlply to split the scores by group and car::dataEllipse
## to compute the ellipses.
library(plyr)
library(car)

pcaplot <- function(x, y,               # score coordinates (e.g. PC1, PC2)
                    factr,              # class factor
                    elev=0.95,          # ellipse probability level
                    pcol=NULL,          # manual colors; length must match the number of classes
                    pbgcol=TRUE,        # black point borders?
                    cexsize=1,          # point size
                    ppch=21,            # point shapes; length must match the number of classes
                    legpos="topright",  # legend position
                    legcexsize=2,       # legend font size
                    legptsize=2,        # legend point size
                    axissize=1,         # axis annotation size
                    linewidth=1)        # axis line width
{
  ## coerce the grouping variable to a factor
  if (is.factor(factr)) {
    f <- factr
  } else {
    f <- factor(factr, levels=unique(as.character(factr)))
  }
  intfactr <- as.integer(f)  # integer codes index the colors and shapes

  ## one robust confidence ellipse per group
  edf <- data.frame(LV1=x, LV2=y, factr=f)
  ellipses <- dlply(edf, .(factr), function(d)
    dataEllipse(d$LV1, d$LV2, levels=elev, robust=TRUE, draw=FALSE))

  ## axis limits that cover both the points and the ellipses
  xlim <- range(c(as.vector(sapply(ellipses, function(e) e[, 1])), x))
  ylim <- range(c(as.vector(sapply(ellipses, function(e) e[, 2])), y))

  ## block colors: append "7e" to supplied hex colors for translucent fills;
  ## the fallback palette is a reconstruction
  if (!is.null(pcol)) {
    pcol <- paste(pcol, "7e", sep="")
  } else {
    pcol <- rainbow(length(levels(f)), alpha=0.5)
  }

  ## points, reference lines and ellipses
  plot(x, y, xlim=xlim, ylim=ylim, main="",
       pch=ppch[intfactr], bg=pcol[intfactr],
       col=if (pbgcol) "black" else pcol[intfactr],
       cex=cexsize, cex.axis=axissize, lwd=linewidth)
  abline(h=0, v=0, col="gray", lty=2)  # dashed lines through the origin
  for (e in ellipses) lines(e, col="gray", lwd=linewidth)

  ## legend
  legend(x=legpos, legend=levels(f), pch=ppch,
         pt.bg=pcol, cex=legcexsize, pt.cex=legptsize)
}

## percentage of variance explained, used to build axis labels;
## the exact label format of `explain` is an assumption
pcavar  <- round(PCA$sdev^2 / sum(PCA$sdev^2), 3) * 100
explain <- list(PC1=paste0("PC1 (", pcavar[1], "%)"),
                PC2=paste0("PC2 (", pcavar[2], "%)"))
```

Basic graphics (custom settings)

Draw the principal component score plot (with confidence ellipses) and the loading plot using the function above with customized settings

```r
## score plot with confidence ellipses
pcaplot(scores[,1],            # x-axis data
        scores[,2],            # y-axis data
        vint,                  # class factor
        pcol=c(),              # plotting colors (must match the number of classes)
        pbgcol=FALSE,          # black point borders?
        cexsize=1.5,           # point size
        ppch=c(21:23),         # point shapes (must match the number of classes)
        legpos="bottomright",  # legend position
        linewidth=1.5,         # axis line width
        axissize=1.5)          # axis annotation size
title(xlab=explain[["PC1"]],   # percentage of variance explained by PC1
      ylab=explain[["PC2"]],   # percentage of variance explained by PC2
      main="Scores",           # title
      cex.lab=1.5,             # label text size
      cex.main=1.5)            # title text size

## loading plot with non-overlapping labels
plot(loadings[,1:2],  # x and y data
     type="n",        # draw no points yet; labels are added next
     axes=FALSE,      # suppress the default axes
     xlab="", ylab="")
pointLabel(loadings[,1:2],  # pointLabel() (maptools package) tries to
           labels=rownames(loadings),  # place the text around the points
           cex=1.5)
axis(1, cex.axis=1.5, lwd=1.5)  # draw the x axis
axis(2, cex.axis=1.5, lwd=1.5)  # draw the y axis
title(xlab=explain[["PC1"]],    # percentage of variance explained by PC1
      ylab=explain[["PC2"]],    # percentage of variance explained by PC2
      cex.lab=1.5)              # label text size
```

 


Most popular insights

1. Partial least squares regression (PLSR) and principal component regression (PCR) in Matlab

2. Dimensionality reduction and visualization of high-dimensional data in R based on PCA and the t-SNE algorithm

3. Principal component analysis (PCA) principles and analysis examples

4. LASSO regression analysis in R

5. Forecasting stock-earnings data with LASSO regression

6. Lasso regression, ridge regression and Elastic-Net models in R

7. Partial least squares PLS-DA data analysis in R

8. The partial least squares (PLS) regression algorithm in R

9. Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and regularized discriminant analysis (RDA)