I first came into contact with data visualization four years ago. At the time, one of my college teachers, S, asked us to use R for statistical analysis of data (yes, I also studied statistics for a while). Part of the coursework was using R to draw plots such as box plots and scatter plots.

Since then, I have had a strong interest in data visualization, and I am now studying it systematically. Today I’m going to share the basic process of data visualization.

Everything has its rules, and data visualization is no exception. The basic steps and processes of visualization are universal, so different kinds of data can be visualized by following the model below.

The steps given here will not teach you how to draw a visualization that achieves “faithfulness, expressiveness and elegance”. Instead, they convey an idea: the methodology of visualization.

The basic visualization process

Visualization is not an algorithm but a process. It is a bit like an assembly line, except that the stages can interact with each other and the flow can go in both directions.

We can roughly divide the visualization process into three parts: front end, processing, and back end. These terms do not mean the same thing as they do in software development.

Text alone is not very intuitive, so here is the diagram:

There are several main parts involved in the diagram:

  1. Data acquisition. How the data is acquired directly determines its format, dimension, size, resolution, accuracy and other important properties, and to a large extent determines the quality of the visualization result.
  2. Data transformation. This step includes noise removal, data cleaning and feature extraction.
  3. Visual mapping. Visual mapping is the core of the whole visualization process. It maps data values, spatial positions, and the relationships between data at different positions onto different visual channels. For more on visual channels, see the article “Basic principles of data visualization: visual channels”.
  4. User perception. The biggest difference between data visualization and other data analysis and processing methods is the user. With the help of the visualization result, users can perceive differences in the data and extract information, knowledge and inspiration from it.
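The first three steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the temperature readings, the `max_jump` noise threshold, and all function names are hypothetical, and bar length stands in for a single visual channel.

```python
from statistics import median

def acquire():
    """Data acquisition: hypothetical hourly readings (None = missing)."""
    return [21.0, 21.4, None, 22.1, 85.0, 22.8, 23.5]

def transform(raw, max_jump=10.0):
    """Data transformation: drop missing values, then remove noise,
    here defined as readings more than max_jump away from the median."""
    present = [v for v in raw if v is not None]
    mid = median(present)
    return [v for v in present if abs(v - mid) <= max_jump]

def visual_map(values, width=20):
    """Visual mapping: encode each value in the 'length' channel,
    scaled so the smallest value gets 1 mark and the largest gets width."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [1 + round((v - lo) / span * (width - 1)) for v in values]

def render(values, lengths):
    """Turn the mapped lengths into text bars a person can read."""
    return [f"{v:5.1f} | {'#' * n}" for v, n in zip(values, lengths)]

clean = transform(acquire())
bars = visual_map(clean)
for line in render(clean, bars):
    print(line)
```

The fourth step, user perception, happens outside the code: the reader looks at the bars and judges how the values differ.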

The process above is simple, but there are two points to note:

  • All of these steps are grounded in the natural or social phenomena behind the data, not in the data itself.
  • The modules are not necessarily connected in the order drawn. The connections between them are mostly non-linear, and any two modules may be connected.

Other visualization process models

Scientific visualization process

This model [2] is similar to the simplified process above, but it spells out the steps more explicitly, organizing visualization into data collection, processing, mapping, and so on.

Information visualization process

This model, proposed by Card et al. [3], upgrades the pipeline-style process into a loop in which the user can intervene at any stage. Most visualization workflows today follow this model, although individual systems differ in implementation details.
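A rough sketch of such a loop is shown below. The three adjustable stages and the `PipelineState` fields are illustrative assumptions chosen for this example, not names taken from the original model; the point is only that the user can change a setting at any stage and rerun the pipeline.

```python
from dataclasses import dataclass

@dataclass
class PipelineState:
    """User-adjustable settings, one per stage (all names are illustrative)."""
    min_value: float = 0.0  # data stage: filter threshold
    symbol: str = "#"       # mapping stage: mark drawn for each unit of value
    max_rows: int = 10      # view stage: how many rows are shown

def run(raw, state):
    """Run every stage once with the current settings."""
    table = [v for v in raw if v >= state.min_value]  # data transformation
    marks = [state.symbol * int(v) for v in table]    # visual mapping
    return marks[: state.max_rows]                    # view transformation

raw = [3, 1, 4, 1, 5, 9, 2, 6]
state = PipelineState()
first_view = run(raw, state)

# The user inspects the view, adjusts any stage, and reruns the pipeline.
state.min_value = 3  # interaction at the data stage
state.symbol = "*"   # interaction at the mapping stage
state.max_rows = 3   # interaction at the view stage
second_view = run(raw, state)
print("\n".join(second_view))
```

Keeping all user settings in one state object makes the loop explicit: each interaction changes the state, and the view is always recomputed from the raw data.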

Visual analysis model with human-computer interaction

Visual analysis tightly couples automatic processing with visual analysis methods through human-computer interaction. The following diagram shows a recent visual analysis model:

There are two paths from data to knowledge:

  • Visualize the data interactively to help users perceive the patterns contained in it.
  • Apply data mining, guided by given priors, to extract a model directly from the data.

In both paths, users can visualize the model and build the model from the visualization results.
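The two paths can be illustrated with a small sketch. Everything here is an assumption made for the example: the five data points are made up, the prior is that y depends linearly on x, and the model is an ordinary least-squares line.

```python
from statistics import mean

def fit_line(xs, ys):
    """Automatic path: extract a model directly from the data, given the
    prior that y is roughly linear in x (ordinary least squares)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def text_scatter(xs, ys, model=None):
    """Visual path: draw each point so a person can perceive the pattern.
    If a model is given, its prediction is printed next to each point."""
    rows = []
    for x, y in zip(xs, ys):
        row = f"x={x:>2} " + " " * round(y) + "o"
        if model is not None:
            slope, intercept = model
            row += f"   (model: {slope * x + intercept:.1f})"
        rows.append(row)
    return rows

xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
model = fit_line(xs, ys)
print("\n".join(text_scatter(xs, ys, model)))
```

Calling `text_scatter(xs, ys)` without a model corresponds to looking at the data alone; passing the fitted model corresponds to visualizing the model alongside the data.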

In many applications, visual analysis operates on multi-source, heterogeneous data that contains a lot of noise, unstructured data and anomalies. A visual interface lets analysts see directly the effect of changing a parameter or choosing a different algorithm during automatic analysis, which makes model evaluation more efficient.
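As a minimal sketch of seeing the effect of a parameter change, the example below runs one automatic step (a moving average) with several window sizes and draws each result as a one-line text chart. The signal, the window sizes, and the character ramp are all assumptions made for illustration.

```python
def moving_average(values, window):
    """Automatic analysis step with one tunable parameter: the window size."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def sparkline(values, lo, hi, levels=" .:-=+*#"):
    """Map each value to a character on a fixed scale [lo, hi], so lines
    drawn for different parameter settings are directly comparable."""
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    top = len(levels) - 1
    return "".join(levels[round((v - lo) / span * top)] for v in values)

signal = [1, 9, 2, 8, 3, 7, 4, 6, 5, 5, 6, 4]
lo, hi = min(signal), max(signal)
for window in (1, 2, 4):
    smoothed = moving_average(signal, window)
    print(f"window={window}: {sparkline(smoothed, lo, hi)}")
```

Using the same scale for every line is a deliberate choice: if each line were rescaled to its own minimum and maximum, the smoothing effect of a larger window would be hidden.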

In addition, letting users freely combine automatic analysis with interactive visual analysis is an essential feature of the visual analysis process. Along the way, visualization can be used to spot errors or contradictions in intermediate steps in time, which improves the credibility of the results.

To sum up, data visualization as it has developed so far is a product of combining human and machine. On the one hand, machine intelligence can do things that humans have never been able to do. On the other hand, through long evolution humans have acquired skills that “can only be understood, not spoken”, namely the ability to reason and analyze.

References:

[1] Chen Wei, Shen Zeqian, Tao Yubo. Data Visualization. Publishing House of Electronics Industry, 2013.

Chen Wei, Wu Yingcai. Data Visualization course, Zhejiang University.

[2] Haber R B, McNabb D A. Visualization idioms: A conceptual model for scientific visualization systems. 1990.

[3] Card S K, Mackinlay J D, Shneiderman B. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers, 1999: 647-650.


Personal website: blog.kurrywaf.com

I go by KurryLuo on every sharing platform.

Study hard, live hard and work hard!