Original link: tecdat.cn/?p=22772

Original source: Tecdat Data Tribe (拓端数据部落) public account

 

A measure of relationship that we often use is correlation. Data frames and plots can help explore these relationships.

This article first creates a data frame of correlations and then plots the correlation structure as a network.

Libraries

We will use the following libraries.

library(tidyverse)
library(corrr)    # provides correlate() and stretch()
library(igraph)   # graph objects and plotting

Basic method

Given a data frame d consisting of numeric variables, we want to plot its correlations as a network. Here is the basic approach.

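# Create the correlation data frame
cors <- d %>%
  correlate() %>%
  stretch()

# Keep correlations stronger than some threshold and
# convert them into an undirected graph object
cors <- cors %>%
  filter(abs(r) > threshold) %>%   # threshold: a cutoff of your choice between 0 and 1
  graph_from_data_frame(directed = FALSE)

# Plot
plot(cors)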

Example 1: Correlated variables in car specifications

Let’s follow this approach with the mtcars data set. All of its variables are numeric, so we don’t need to do any preprocessing.

We start by creating a correlation data frame, which we will then convert to a graph object.


cors <- mtcars %>%
  correlate() %>%
  stretch()
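For readers who want to see the shape of such a data frame without any extra packages, here is a minimal sketch in base R. It assumes only the built-in mtcars data. The names cor_mat and cors_long are illustrative, and the column names x, y and r are assumptions chosen to match the r column that the article filters on; the result approximates this layout rather than reproducing any package's exact output.

```r
# Correlation matrix of all numeric columns (Pearson by default)
cor_mat <- cor(mtcars)

# Reshape the matrix into long format: one row per ordered pair of variables
cors_long <- as.data.frame(as.table(cor_mat), stringsAsFactors = FALSE)
names(cors_long) <- c("x", "y", "r")

# Drop self-correlations (the diagonal), which carry no information
cors_long <- cors_long[cors_long$x != cors_long$y, ]

head(cors_long)
```

Each pair appears twice here (once in each order), which is harmless for filtering but worth remembering when counting edges.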

Next, we convert these values into an undirected graph object. The graph is undirected because correlation has no direction: it is not causation.

Because we generally do not want to see all correlations, we first filter() out any correlations whose absolute value is below a certain threshold. For example, let’s include correlations of 0.3 or stronger (positive or negative).

cors <- cors %>%
  filter(abs(r) > .3) %>%
  graph_from_data_frame(directed = FALSE)
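As a self-contained sketch of this step, the following uses base R's cor() in place of the correlation data frame and igraph's graph_from_data_frame() to build the graph. The variable names (edges, strong, g) and the choice to keep each pair of variables only once are assumptions made for illustration, not part of the original code.

```r
library(igraph)

# Long-format correlations for mtcars, one row per unordered pair
cor_mat <- cor(mtcars)
idx <- which(upper.tri(cor_mat), arr.ind = TRUE)
edges <- data.frame(
  x = rownames(cor_mat)[idx[, "row"]],
  y = colnames(cor_mat)[idx[, "col"]],
  r = cor_mat[idx]
)

# Keep correlations stronger than 0.3 in absolute value
strong <- edges[abs(edges$r) > 0.3, ]

# The first two columns define the edges; r is stored as an edge attribute
g <- graph_from_data_frame(strong, directed = FALSE)

is_directed(g)     # FALSE: the graph is undirected
summary(E(g)$r)    # correlations carried on the edges
```

Storing r as an edge attribute is what later allows edge width or color to be driven by the strength of each correlation.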

Let’s draw this object. Here is a basic diagram.

plot(cors) 

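An improved version scales the edge width by the strength of each correlation and colors the edges by its sign.

plot(cors,
     edge.width = abs(E(cors)$r) * 5,                     # thicker edges for stronger correlations
     edge.color = ifelse(E(cors)$r > 0, "blue", "red"),   # blue = positive, red = negative
     main = " ")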

 

Example 2: Countries with similar drinking habits

This example requires some data preprocessing, and we will look only at strong positive correlations.

Let’s take a look at data on beer, wine and spirits consumption in countries around the world.

 


drinkdata

I want to find out which countries in Europe and the Americas have similar beer, wine and spirits drinking habits, and where Australia fits in. To do this, we attach geographic information, keep the countries of interest, and reshape the data for correlation analysis.

# Standardize the data to examine relative, rather than absolute,
# quantities of beer, wine and spirits
d %>% mutate_if(is.numeric, scale)

# Reshape to long format and drop missing values
# (the start of this call is missing from the source)
#   ... -country) %>%
#   drop_na() %>%

# Convert to wide format for correlation analysis
#   ... %>%
#   spread(country, downstream) %>%

The result contains z-scores of the amounts of beer, wine and spirits consumed in each country.
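To make the standardize-then-reshape idea concrete, here is a minimal base R sketch. The data frame is a hypothetical toy example (country labels and values are invented for illustration), and the names num_cols, z and wide are not from the original code.

```r
# Hypothetical toy data: servings per country (values invented for illustration)
d <- data.frame(
  country = c("A", "B", "C", "D"),
  beer    = c(100, 200, 150, 50),
  wine    = c(50, 20, 200, 10),
  spirits = c(30, 120, 60, 90)
)

# Standardize each numeric column to z-scores (mean 0, standard deviation 1)
num_cols <- sapply(d, is.numeric)
z <- d
z[num_cols] <- lapply(d[num_cols], function(v) as.numeric(scale(v)))

# Put countries in columns so that cor() compares countries, not drink types
wide <- t(as.matrix(z[num_cols]))
colnames(wide) <- z$country

# Country-by-country correlation of drinking profiles
round(cor(wide), 2)
```

Standardizing first means a country is compared on how far above or below average it is for each drink, rather than on raw volume.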

We can now continue with our standard approach. Since I’m only interested in which countries are really similar, we keep only the high correlations (r > 0.9).

plot(cors,
     edge.color = rgb(0, 0, 1, alpha = E(cors)$r),   # more opaque edges for higher correlations
     main = "Which countries have similar drinking habits?")

Drinking behavior in these countries falls into three clusters.

Australia, for example, appears in the upper left cluster along with many Western and Northern European countries such as the UK, France, the Netherlands, Norway and Sweden.
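One way to check such a grouping programmatically, which is not part of the original analysis, is to list the connected components of the filtered graph with igraph's components(). The sketch below uses a hypothetical edge list; the country labels and r values are placeholders, not results from the data.

```r
library(igraph)

# Hypothetical edges that survived the r > 0.9 filter (placeholder values)
edges <- data.frame(
  x = c("A", "B", "D", "F"),
  y = c("B", "C", "E", "G"),
  r = c(0.95, 0.97, 0.92, 0.99)
)

g <- graph_from_data_frame(edges, directed = FALSE)

# Each connected component is a group of countries linked by high correlations
comp <- components(g)
comp$no                                          # number of groups
split(names(comp$membership), comp$membership)   # countries in each group
```

Components depend on the chosen threshold: lowering it tends to merge groups, so the number of clusters should be read together with the cutoff used.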

