Link to original article: tecdat.cn/?p=22492

Original source: Tuoduan Data Tribe public account

 

 

We will use the wine data set for principal component analysis.

data

The data frame contains 177 samples and 13 variables; Vintages contains the class labels. The data are the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars: Nebbiolo, Barbera and Grignolino. The wine made from the Nebbiolo grape is called Barolo.

The data record the amounts of several constituents found in each of the three types of wine.
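The original post loads the data before this point; the file name and column layout below are assumptions, shown only to illustrate how the `no` data frame and the `vint` class factor used in the rest of the post could be created:

```r
## hypothetical loading step: file name and column names are assumptions
wine <- read.csv("wine.csv")               # 177 samples, 13 measurements + class label
no   <- wine[, names(wine) != "Vintages"]  # the 13 chemical variables
vint <- factor(wine$Vintages)              # class labels: Barolo, Grignolino, Barbera
table(vint)                                # samples per wine type
```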

 

```r
head(no)
```

The output

Transform and standardize data

Apply a logarithmic transformation and then standardize, putting all variables on the same scale.

 

```r
no_log <- log(no)           # log transform
log_scale <- scale(no_log)  # center and scale each variable
head(log_scale)
```

Principal Component Analysis (PCA)

The singular value decomposition (SVD) algorithm is used for the principal component analysis.
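As a quick illustration of that relationship (on a small random matrix, not the wine data): the loadings returned by prcomp() are the right singular vectors from svd(), and the component standard deviations are the singular values divided by sqrt(n - 1).

```r
set.seed(1)
X <- matrix(rnorm(20), nrow=5)                  # small example matrix
p <- prcomp(X, center=FALSE)                    # PCA via SVD
s <- svd(X)                                     # the SVD itself: X = U D V'
all.equal(abs(unname(p$rotation)), abs(s$v))    # loadings match V (up to sign)
all.equal(p$sdev, s$d / sqrt(nrow(X) - 1))      # sdev matches D / sqrt(n - 1)
```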

 

```r
PCA <- prcomp(log_scale, center=FALSE)
summary(PCA)
```
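summary(PCA) reports each component's standard deviation, proportion of variance, and cumulative proportion. The variance figures can also be computed directly from the stored standard deviations:

```r
prop <- PCA$sdev^2 / sum(PCA$sdev^2)  # proportion of variance per component
round(prop, 3)
round(cumsum(prop), 3)                # cumulative proportion
```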

 

Basic graphics (default settings)

Principal component score plot and loading plot with base graphics

```r
scores   <- PCA$x         # principal component scores
loadings <- PCA$rotation  # variable loadings

## score plot
plot(scores[,1:2],  # x and y data
     pch=21,        # point shape
     cex=1.5)       # point size
legend("topright",  # legend position
       legend=levels(vint))

## loading plot
plot(loadings[,1:2],  # x and y data
     pch=21)          # point shape
text(loadings[,1:2],  # label positions
     labels=rownames(loadings))
```

 

 

In addition, we can add 95% confidence ellipses for the groups in the score plot.

Confidence ellipse plot function

 

```r
## Score plot with per-group confidence ellipses.
## Uses plyr::dlply to split the scores by group and car::dataEllipse
## to compute the ellipses.
library(plyr)
library(car)

pcaplot <- function(x, y,               # score coordinates (e.g. PC1, PC2)
                    factr,              # class factor
                    elev=0.95,          # ellipse probability level
                    pcol=NULL,          # manual colors; length must match the number of classes
                    pbgcol=TRUE,        # black point borders?
                    cexsize=1,          # point size
                    ppch=21,            # point shapes; length must match the number of classes
                    legpos="topright",  # legend position
                    legcexsize=2,       # legend font size
                    legptsize=2,        # legend point size
                    axissize=1,         # axis annotation size
                    linewidth=1)        # axis line width
{
  ## coerce the grouping variable to a factor
  if (is.factor(factr)) {
    f <- factr
  } else {
    f <- factor(factr, levels=unique(as.character(factr)))
  }
  intfactr <- as.integer(f)  # integer codes index the colors and shapes

  ## one robust confidence ellipse per group
  edf <- data.frame(LV1=x, LV2=y, factr=f)
  ellipses <- dlply(edf, .(factr), function(d)
    dataEllipse(d$LV1, d$LV2, levels=elev, robust=TRUE, draw=FALSE))

  ## axis limits that cover both the points and the ellipses
  xlim <- range(c(as.vector(sapply(ellipses, function(e) e[, 1])), x))
  ylim <- range(c(as.vector(sapply(ellipses, function(e) e[, 2])), y))

  ## block colors: append "7e" to supplied hex colors for translucent fills;
  ## the fallback palette is a reconstruction
  if (!is.null(pcol)) {
    pcol <- paste(pcol, "7e", sep="")
  } else {
    pcol <- rainbow(length(levels(f)), alpha=0.5)
  }

  ## points, reference lines and ellipses
  plot(x, y, xlim=xlim, ylim=ylim, main="",
       pch=ppch[intfactr], bg=pcol[intfactr],
       col=if (pbgcol) "black" else pcol[intfactr],
       cex=cexsize, cex.axis=axissize, lwd=linewidth)
  abline(h=0, v=0, col="gray", lty=2)  # dashed lines through the origin
  for (e in ellipses) lines(e, col="gray", lwd=linewidth)

  ## legend
  legend(x=legpos, legend=levels(f), pch=ppch,
         pt.bg=pcol, cex=legcexsize, pt.cex=legptsize)
}

## percentage of variance explained, used to build axis labels;
## the exact label format of `explain` is an assumption
pcavar  <- round(PCA$sdev^2 / sum(PCA$sdev^2), 3) * 100
explain <- list(PC1=paste0("PC1 (", pcavar[1], "%)"),
                PC2=paste0("PC2 (", pcavar[2], "%)"))
```

Basic graphics (custom settings)

Draw the principal component score plot (with confidence ellipses) and the loading plot using the function above with customized settings

```r
## score plot with confidence ellipses
pcaplot(scores[,1],            # x-axis data
        scores[,2],            # y-axis data
        vint,                  # class factor
        pcol=c(),              # plotting colors (must match the number of classes)
        pbgcol=FALSE,          # black point borders?
        cexsize=1.5,           # point size
        ppch=c(21:23),         # point shapes (must match the number of classes)
        legpos="bottomright",  # legend position
        linewidth=1.5,         # axis line width
        axissize=1.5)          # axis annotation size
title(xlab=explain[["PC1"]],   # percentage of variance explained by PC1
      ylab=explain[["PC2"]],   # percentage of variance explained by PC2
      main="Scores",           # title
      cex.lab=1.5,             # label text size
      cex.main=1.5)            # title text size

## loading plot with non-overlapping labels
plot(loadings[,1:2],  # x and y data
     type="n",        # draw no points yet; labels are added next
     axes=FALSE,      # suppress the default axes
     xlab="", ylab="")
pointLabel(loadings[,1:2],  # pointLabel() (maptools package) tries to
           labels=rownames(loadings),  # place the text around the points
           cex=1.5)
axis(1, cex.axis=1.5, lwd=1.5)  # draw the x axis
axis(2, cex.axis=1.5, lwd=1.5)  # draw the y axis
title(xlab=explain[["PC1"]],    # percentage of variance explained by PC1
      ylab=explain[["PC2"]],    # percentage of variance explained by PC2
      cex.lab=1.5)              # label text size
```

 


Most popular insights

1. Partial least squares regression (PLSR) and principal component regression (PCR) in Matlab

2. Dimensionality reduction and visualization of high-dimensional data in R based on PCA and the t-SNE algorithm

3. Principal component analysis (PCA) principles and analysis examples

4. LASSO regression analysis in R

5. Forecasting stock-earnings data with LASSO regression

6. Lasso regression, ridge regression and Elastic-Net models in R

7. Partial least squares PLS-DA data analysis in R

8. The partial least squares (PLS) regression algorithm in R

9. Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and regularized discriminant analysis (RDA)