As a research field, data visualization has branched into many research topics. As the second part of the front-end visualization report, this article describes those topics and the difficulties involved in implementing them on the front end.

Because my time is limited and there is a lot of material to collect, parts of this article read more like a showcase of results. I will keep improving it, and a third part will be devoted to side-by-side comparison. This article mainly surveys AntV’s work, with some other excellent data visualization projects mixed in. Please check out the links in the reference section.

1. Digital twins

If you love Iron Man, you were probably impressed by how the Iron Man suit was designed in the first film: a 360-degree model that Tony could change with a wave of his hand, resulting in a powerful suit.

Digital twin (DT, or Data Twin), also translated as digital mapping or digital double, is a concept that has become popular in recent years and has been embraced by PDs and designers alike. First, what is a digital twin?

Wiki: A digital twin is a counterpart of a physical entity, process, or system that is simulated within an information platform. With the help of this digital mapping, the state of the physical entity can be understood on the information platform, and even the predefined interface components in the physical entity can be controlled.

The main papers are as follows:

I also found a relatively recent review paper from 2018, which covers data modeling, data fusion, and other technologies.

The digital twin pipeline can be roughly divided into four stages: data acquisition, data fusion, data modeling, and interactive mapping.

As an example, Alibaba Group’s Cainiao uses a large number of Internet-of-Things devices (such as LEMO handheld devices) as building blocks, digitizes the data those devices collect, and then uses simulation to aid decision-making. Finally, the feedback drives the physical world to transform and upgrade.

2. Graph analysis and visualization

A graph is a data structure: a collection of entities and the relationships between them. Graph visualization presents that structure visually to help users analyze complex relational data.
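As a minimal sketch of that definition, a graph can be held as a list of nodes plus a list of edges. The sample data, the `source`/`target` field names, and the helper functions below are assumptions made for illustration, not taken from any particular library.

```python
# Hypothetical sample data: entities (nodes) and relationships (edges).
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "d"}]
edges = [
    {"source": "a", "target": "b"},
    {"source": "a", "target": "c"},
    {"source": "c", "target": "d"},
]


def build_adjacency(nodes, edges):
    """Return an undirected adjacency map: node id -> set of neighbor ids."""
    adjacency = {node["id"]: set() for node in nodes}
    for edge in edges:
        adjacency[edge["source"]].add(edge["target"])
        adjacency[edge["target"]].add(edge["source"])
    return adjacency


def degrees(adjacency):
    """Return the number of neighbors of each node."""
    return {node_id: len(neighbors) for node_id, neighbors in adjacency.items()}


adjacency = build_adjacency(nodes, edges)
node_degrees = degrees(adjacency)
```

A per-node quantity such as degree is the kind of value a graph view might map to node size or color when helping users read relational data.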

For more information on the origins of graphs, their typical application scenarios, and their major problems and challenges, see AntV-2019-Shigo-Hello World Graph Visualization.

2.1 Graph analysis and insight

In @g6-graphin, I have been paying close attention to automatic layout, which is both interesting and useful.

2.2 Graph visualization of large-scale data

To verify how much data G6 can carry, the G6 team ran performance tests on samples of 5000+ primitives, nearly 20000 primitives, and 50000+ primitives (default layout, no FitView). The results show that G6 remains fully interactive at around 20000 primitives. At 50000+ primitives, interaction starts to lag. For most businesses, however, displaying that much data on the canvas is not recommended.

The image above shows G6 at 20,000 primitives, which looks a bit like a colony, with clusters of closely related nodes.

What about force-directed layout? Setting aside node interaction, G6 (4.0+) uses GPU parallel computing for its gForce and Fruchterman layouts, and has drawn up to 5500W+ nodes. The following two tables compare the GPU and CPU computing time of the two algorithms at different data scales:
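For readers unfamiliar with force-directed layout, here is a minimal CPU sketch of the classic Fruchterman-Reingold idea: all node pairs repel, connected nodes attract, and a cooling "temperature" caps each move. It is not G6's implementation; the function name, parameters, and defaults are assumptions for illustration.

```python
import math
import random


def fruchterman_reingold(node_ids, edges, width=1.0, height=1.0, iterations=50, seed=0):
    """Return {node_id: [x, y]} after a fixed number of force iterations."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in node_ids}
    k = math.sqrt(width * height / len(node_ids))  # ideal distance between nodes
    temperature = width / 10  # caps how far a node may move per iteration
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in node_ids}

        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, u in enumerate(node_ids):
            for v in node_ids[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Attraction: connected nodes pull together with force d^2 / k.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node along its displacement, limited by the temperature,
        # and keep it inside the drawing frame.
        for n in node_ids:
            dx, dy = disp[n]
            length = max(math.hypot(dx, dy), 1e-9)
            step = min(length, temperature)
            pos[n][0] = min(width, max(0.0, pos[n][0] + dx / length * step))
            pos[n][1] = min(height, max(0.0, pos[n][1] + dy / length * step))

        temperature -= cooling

    return pos


positions = fruchterman_reingold(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
```

The pairwise repulsion loop costs O(n²) per iteration, and each pair is computed independently of the others, which is why this kind of step is a natural candidate for parallel execution.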

2.3 3D graph visualization

Most G6 visualizations are drawn on a two-dimensional plane using Canvas. The following figure shows Graph Studio, which is built on G6 Graphin. The sample data set contains hundreds of nodes, and the tool lets you switch between layouts, such as circular, directed layered, and radial, to reduce the visual clutter of the graph.

But if you are dealing with a graph scenario with more data and more complex relationships, are there other ways to reduce visual clutter besides switching layouts with tools like Graphin? The answer may be 3D visualization based on WebGL.

A team at Alibaba Data offers one answer in this direction:

I had some similar ideas before, but they were still a long way from being realized. Seeing this polished 3D topology visualization, I was stunned. Because the view is three-dimensional, it offers solutions to many difficulties that large-scale graph analysis runs into on the plane, such as cluttered views, dense and tangled relationships, and massive node counts.

Node weight is distinguished by shape and volume, and the hierarchy between nodes is also visible in the view. A large number of leaf nodes are tucked under the ring of their parent node, which keeps the whole view clean and uncluttered.

For layout, the team extended the three-dimensional “perspective”, with each perspective presenting a different meaning of the data. I believe there is more room to explore in this area.

More information can be found in the reference article at the bottom of this article.

This is not to say that every visualization needs to be presented in three dimensions. The ultimate aim of visualization is to help users understand data better and faster. Compared with a two-dimensional plane, three-dimensional space adds one more dimension to the data, which can get in the way of quick understanding. For some simple data sets, a two-dimensional view combined with data cleaning and aggregation works better.

Conversely, a well-designed 3D scheme can reduce the visual clutter of a 2D view, for example on large-scale data sets. So whether to use 3D depends on the actual business scenario.

2.4 Graph visualization solutions

The first figure shows an upper-layer application of G6 and how it designs its own technical flow and front-end solution.

The second figure shows the graph visualization capabilities that the G6 team currently regards as ideal.

The G6 Graphin white paper was released in November 2020. If you haven’t seen it yet, download it. The white paper consists of six sections, as shown below:

3. Visualization of geographic information data

  1. Uber – kepler

    Uber’s Kepler project classifies geographic information visualization into seven basic categories: Arc, Line, Hexagon, Point, HeatMap, GeoJSON, and Buildings.

  2. Baidu Du Xiaoman – Buildings

    The work in the picture above falls into the “Buildings” category. Only the design draft has been released so far, so presumably Baidu’s Du Xiaoman team is still working on it. The degree of completion is unknown, but we can roughly infer that it is a geographic information visualization platform that first models the city with C4D and other tools and then builds on that model.

  3. Ali DataV

    DataV focuses on large data screens that users can interact with. Geographic information visualization is one of the best ways to build such screens. The picture above shows a DataV demo project: a school district analysis view of Hangzhou.

    After the user clicks a school icon on the map, the school details in the lower-left corner immediately switch to that school’s information.

4. The fourth dimension — time

Time is a wondrous dimension that enlivens stagnant data.

What I mean here is not treating time as a static dimension, but using time to drive changes in the chart, producing an effect similar to animation playback. In essence, a static chart becomes a dynamic visual presentation.

The chart below is a typical bar chart race. You often see this kind of dynamic bar chart in summaries of GDP growth and epidemic data.
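One way to picture time driving a chart is to precompute one frame per time step and let the renderer play the frames in order. The sketch below groups records by time and keeps the top-N bars of each frame; the records, field names, and `build_race_frames` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: one value per name per year.
records = [
    {"year": 2019, "name": "A", "value": 10},
    {"year": 2019, "name": "B", "value": 30},
    {"year": 2019, "name": "C", "value": 20},
    {"year": 2020, "name": "A", "value": 40},
    {"year": 2020, "name": "B", "value": 35},
    {"year": 2020, "name": "C", "value": 5},
]


def build_race_frames(records, time_key, name_key, value_key, top_n):
    """Group records by time and rank each group, yielding one frame per time step."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[time_key]].append(record)

    frames = []
    for moment in sorted(grouped):
        ranked = sorted(grouped[moment], key=lambda r: r[value_key], reverse=True)
        bars = [(r[name_key], r[value_key]) for r in ranked[:top_n]]
        frames.append({"time": moment, "bars": bars})
    return frames


frames = build_race_frames(records, "year", "name", "value", top_n=2)
```

An animation loop would then draw `frames[0]`, `frames[1]`, and so on, interpolating bar lengths and positions between consecutive frames.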

G6 added a time-based component called TimeBar to its component design last year, and it currently offers three types: trend chart, simplified, and time scale. In actual use, the slider sometimes overflows or cannot reach the boundary, so the component is presumably still being refined.

5. Unit data visualization

Each item of data is called a DataUnit or a DataItem.

Identity (ID) is the unique identifier of a data unit, and maintaining it for every data unit can cause visual clutter in large data sets.

To address this problem, many visualization techniques are based on data abstractions, such as aggregation, segmentation, or filtering. Instead of maintaining a strict one-to-one mapping between data items and visual marks, these abstract or aggregated visualizations combine multiple data items into visual aggregates that can no longer be separated, so the identity attribute no longer exists. Examples of aggregated visualizations are bar charts, column charts, and histograms.
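The difference between the two mappings can be sketched in a few lines. The data items and both helper functions below are hypothetical: the aggregated view keeps only a count per bin, while the unit view keeps one mark per item together with its ID.

```python
# Hypothetical data items, each with a unique id.
items = [
    {"id": "u1", "age": 23},
    {"id": "u2", "age": 27},
    {"id": "u3", "age": 31},
    {"id": "u4", "age": 38},
    {"id": "u5", "age": 45},
]


def aggregate_histogram(items, key, bin_width):
    """Aggregated view: bin start -> count. Item identity is discarded."""
    counts = {}
    for item in items:
        start = (item[key] // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return counts


def unit_layout(items, key, bin_width):
    """Unit view: one mark per item, stacked inside its bin, identity preserved."""
    stack_height = {}
    marks = []
    for item in items:
        start = (item[key] // bin_width) * bin_width
        row = stack_height.get(start, 0)
        marks.append({"id": item["id"], "bin": start, "row": row})
        stack_height[start] = row + 1
    return marks


histogram = aggregate_histogram(items, "age", 10)
marks = unit_layout(items, "age", 10)
```

Both results have the same silhouette (two, two, and one item per bin), but only `marks` can answer which item sits where, at the cost of one mark per data item.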

Advantages of unit visualization:

  1. Awareness: Maintains identifiers that allow users to track data units during animation transitions and interactions
  2. Physics: Provides a unique way to physically transform and build views
  3. Interaction: The user can obtain detailed information about each individual data item, while also supporting some filtering operations

Weaknesses of unit visualization:

  1. Computing scalability: Memory, computing, and rendering performance can be limiting factors
  2. Display scalability: Unit visualization is only valuable if the view can distinguish individual data units, so it places requirements on the pixel count or resolution of the display
  3. Perceptual scalability (visual clutter): Very large data sets are not well suited to unit visualizations, and aggregated visualizations (such as stacked bar charts) may work better

Unit visualization also involves the concept of a visual grammar. The abstraction provided by a grammar-based approach makes it easier for people to think, reason, and communicate with graphics. For the grammar of graphics, see the foundational work recommended by both Microsoft and Ant: Wilkinson’s The Grammar of Graphics.

5.1 Microsoft – SandDance

SandDance is a free Web application for data visualization launched by Microsoft Research in 2016. It was created by Microsoft Research’s Visualization and Interactive Data Analysis (VIDA) group, which focuses on human-centered data processing and explores the areas of data visualization, immersive analysis, and understanding of machine learning models.

5.2 Tableau works

5.3 Related Papers

  1. 2018 – Atom: A Grammar for Unit Visualizations

    • Deokgun Park, Steven Drucker, Roland Fernandez, Niklas Elmqvist
    • IEEE Transactions on Visualization and Computer Graphics | December 2018, Vol 24(12): pp. 3032-3043
  2. 2015 – A Unifying Framework for Animated and Interactive Unit Visualizations

    • Steven Drucker, Roland Fernandez
    • MSR-TR-2015-65 | August 2015
    • Articles & videos

6. Charting tools

6.1 AntV-ChartCube Chart Rubik’s Cube

7. AI-assisted visualization

At present, AI-assisted visualization has two main directions: intelligent chart recommendation (closer to a decision engine) and D2C (closer to image algorithms). The following sections introduce AI from these two perspectives.

7.1 Automatic Visualization Framework (Chart Advisor / Intelligent Chart Recommendation)

7.1.1 AntV – AVA

AVA is a technical framework for easier visual analysis. VA stands for Visual Analytics, while the first A has multiple meanings: its goal is to become an Automated, AI-driven, Augmented Visual Analytics solution.

Just prepare a valid data set, write an API call, and the rest of the tweaking can be done through the visual UI.

The image below shows AVA’s design thinking.

AVA’s key capability is rule-based recommendation. From this core it has derived features such as smart charts, one-click chart optimization, and a chart dictionary.

(At first, I assumed the most suitable chart type was selected by training a multi-layer neural network classifier on a labeled sample set and then feeding the real data set through the classifier to obtain a matching score. It turned out that AVA’s recommender is implemented by maintaining a knowledge graph and a rule table…)
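To make the rule-table idea concrete, here is a minimal sketch of a rule-based chart recommender. It is not AVA's actual rule set or API: the `RULES` table, the scores, the field-type names, and the `recommend` function are all invented for illustration.

```python
# Hypothetical rule table: (chart type, predicate over a data summary, score).
RULES = [
    ("line", lambda s: s["temporal"] >= 1 and s["quantitative"] >= 1, 1.0),
    ("scatter", lambda s: s["quantitative"] >= 2, 0.9),
    ("bar", lambda s: s["nominal"] >= 1 and s["quantitative"] >= 1, 0.8),
    ("pie", lambda s: s["nominal"] == 1 and s["quantitative"] == 1
        and s["max_categories"] <= 6, 0.6),
]


def summarize(rows, field_types):
    """Count fields per type and record the largest number of distinct categories."""
    summary = {"temporal": 0, "nominal": 0, "quantitative": 0, "max_categories": 0}
    for field, kind in field_types.items():
        summary[kind] += 1
        if kind == "nominal":
            distinct = len({row[field] for row in rows})
            summary["max_categories"] = max(summary["max_categories"], distinct)
    return summary


def recommend(rows, field_types):
    """Return (chart type, score) pairs for every rule that matches, best first."""
    summary = summarize(rows, field_types)
    matches = [(chart, score) for chart, predicate, score in RULES if predicate(summary)]
    return sorted(matches, key=lambda match: match[1], reverse=True)


# Hypothetical data set: sales per city.
city_rows = [
    {"city": "Hangzhou", "sales": 120},
    {"city": "Shanghai", "sales": 200},
    {"city": "Beijing", "sales": 180},
]
city_advice = recommend(city_rows, {"city": "nominal", "sales": "quantitative"})
```

Compared with a trained classifier, a table like this is easy to inspect and extend: adding a chart type means adding a rule rather than collecting labeled samples.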

Two sets of demos are shown below, one for AVA’s smart recommendation list and the other for AVA’s smart chart generation.

7.2 Generating visualizations from design drafts

7.2.1 DATAV-DCc-LADV

In generating visualizations from design drafts, AI is responsible for recognizing the chart elements on the design draft so that the chart view can be built quickly.

The model design diagram shows that the core of object detection on the design draft is Faster R-CNN (Region-based CNN).

The LADV team also built another feature: it accepts a style image as a second input, extracts the color scheme from it, and uses the DBSCAN density-based clustering algorithm to extract the dominant colors.
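As an illustration of dominant-color extraction with DBSCAN, the sketch below clusters pixels by Euclidean distance in RGB space and averages each cluster. The source does not describe LADV's color space, distance metric, or parameters, so the `eps` and `min_samples` values and the function names here are assumptions.

```python
import math


def region_query(points, index, eps):
    """Indices of all points within eps of points[index] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels


def dominant_colors(pixels, eps=30.0, min_samples=3):
    """Average color of each cluster, largest cluster first; noise is ignored."""
    labels = dbscan(pixels, eps, min_samples)
    clusters = {}
    for pixel, label in zip(pixels, labels):
        if label != -1:
            clusters.setdefault(label, []).append(pixel)
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [tuple(round(sum(channel) / len(group)) for channel in zip(*group))
            for group in ranked]


# Hypothetical pixels: one stray green, four reds, three blues.
pixels = [
    (120, 200, 40),
    (250, 10, 10), (252, 12, 8), (248, 8, 12), (250, 10, 10),
    (10, 10, 250), (12, 8, 252), (8, 12, 248),
]
palette = dominant_colors(pixels)
```

A density-based method does not need the number of colors in advance and treats isolated pixels as noise, which suits palette extraction from an arbitrary style image.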

8. Large data screens and BI enablement

Strictly speaking, both large data screens and BI reports are applications of data visualization, so I give only a brief overview here.

8.1 Ant Financial – DeepInsight

In the era of big data, data-driven user behavior analysis, operations analysis, and business analysis are among the hottest topics. For large and medium-sized enterprises with massive data in particular, the demand for data has far outgrown what traditional data reports can provide. How to use self-service BI to run a modern enterprise with fine-grained operations has become a new topic in enterprise operations management.

BI (Business Intelligence) is a complete set of solutions that collects, cleans, integrates, mines, and analyzes an enterprise’s existing data, quickly produces reports, and offers forecasts and decision support to help managers make business decisions.

Related articles and resources:

  1. DeepInsight’s product home page
  2. DeepInsight experience video
  3. SEEConf – 2020 – Create the ultimate visual graphic experience

DeepInsight is a data insight and analysis platform developed in-house by Ant Financial. Aimed mainly at business analysts, business staff, and developers, it gives enterprises a next-generation BI tool for data-driven business growth. It includes visual charts and an intelligent analysis module and supports secondary development, which makes building business analysis platforms more flexible. Data can circulate quickly within an enterprise, and reports help the business quickly discover problems and locate their causes, so that data delivers greater value.

A screenshot of DeepInsight in action is shown below, and a video is also available for those interested.

Report analysis in DeepInsight roughly takes three steps:

  1. Connect the data

    For the data insight analytics platform to analyze your data, you first need to establish a connection between the platform and the data. Supported data sources include ODPS, MySQL, Explorer, HybridDB, and other databases.

  2. Analyze the data

    Once the data is connected, create a workbook, build a simple report that contains tables, and add interactive filters to analyze the data for actionable insights.

  3. Sharing and publishing

    The final work can be shared with designated users for access, or published to a fixed portal for all users to view.

8.2 Aliyun – DataV

DataV is a product that uses visual applications to analyze and present complex data. It aims to let more people see the appeal of data visualization by helping non-specialist engineers build professional-grade visualization applications through a graphical interface, covering display needs such as exhibitions, business monitoring, risk warning, and geographic information analysis.

DataV is Aliyun’s (Alibaba Cloud’s) professional large-screen data visualization product. Alibaba Cloud is not alone: nearly every cloud platform offers a similar product to consumer (C-end) or business (B-end) users. These large data screens are eye-catching, which makes them popular with businesses, governments, and enterprises. One of the most influential applications is the data screen project for the annual Double 11 gala.

9. Enterprise data visualization solutions

AntV is Ant Financial’s new generation of data visualization solutions, committed to providing simple, professional, and reliable data visualization best practices with unlimited possibilities. The diagram below shows the architecture of AntV, including the drawing engine, general-purpose, framework, middleware, and application layers.

Breaking down the bottom part of the architecture diagram above gives the following flow chart:

10. Other directions

  1. Multi-screen and multi-device

  2. Urban modeling -> Urban Digital Twin -> Smart Community/Smart City

References

  1. This article draws on material from some speakers at the Front-End Early Chat conference. It is not listed item by item here, and you are welcome to follow the conference.
  2. Zhihu – 2020 – Application of Digital Twin in Logistics Industry
  3. IEEE – 2018 – F. Tao, H. Zhang, A. Liu and A. Y. C. Nee, “Digital Twin in Industry: State-of-the-Art,” in IEEE Transactions on Industrial Informatics, vol. 15, no. 4, pp. 2405-2415, April 2019, doi: 10.1109/TII.2018.2873186.
  4. Wiki – 2020 – Digital Twin
  5. Heart of the Machine – 2020 – Digital Twins
  6. AntV – 2019 – Shigo – Hello World Graph Visualization
  7. Zcool – 2020 – The Beauty of Data: Value and Associated Data Visualization
  8. G6 Components – TimeBar
  9. AntV – 2020 – Ant Design 4.0 is coming! What is the 3rd SEE Conf about?
  10. AntV AVA – 2020 – Welcome to 2020 Era of Intelligent Research and Development of Data Visualization
  11. AntV AVA – 2020 – AVA 1.0 Your Chart Advisor
  12. AntV AVA – 2020 – PPT – Visual Intelligent R&D Process AVA
  13. Tableau – 2020 – Why Tableau
  14. Video – 2018 – AntV, return to nature and turn into thousands of visual expressions – Absolute cloud & bending
  15. DeepInsight: Everyone is a Data Analyst
  16. DataV – What is DataV Data Visualization?