Large-screen dashboard for spatio-temporal analysis of the epidemic situation

Task

Analyze the spatio-temporal situation of the epidemic, monitor its development, and assess epidemic prevention and control measures.

Introduction

Topic: spatio-temporal analysis of the epidemic situation. Using visual analysis techniques, we analyze the spatio-temporal distribution pattern of the epidemic, monitor its development, and evaluate prevention and control measures.

Approach: we start from how the cumulative number of confirmed cases in each province grows over time and how its spatial distribution changes over time. Using the collected data, we present the provincial cumulative confirmed cases as a spatio-temporal distribution map, a line chart, and a stacked bar chart to give a rough overall picture. We then drill down at different granularities into the detailed information of the relevant provinces to find the causes (cases imported from abroad, promulgation of relevant policies, etc.) that changed the situation in different periods. In addition, the COVID-19 case and death data are compared with the GDP, education level, urbanization rate, and medical and health level of each province, to find out whether they are related. The target users are the government and other prevention and control agencies, who can use the system to analyze the spatial and temporal distribution pattern of the epidemic, monitor its development, and evaluate prevention and control measures.

Data sources

Two datasets were obtained through the Tianxing Data API (the data come from the epidemic data published by the National Health Commission):

(1) Weibo.json: top 50 real-time hot searches on Sina Weibo

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| hotword | string | The missing girl was confirmed to have been in Zhangzhou | Hot search topic |
| hotwordnum | string | 129940 | Hot search index |

(2) ProvinceData.json: province data

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| cityName | string | Wuhan | City name |
| confirmedCount | int | 495 | Number of confirmed cases |
| suspectedCount | int | 0 | Number of suspected cases |
| curedCount | int | 31 | Number of cured cases |
| deadCount | int | 0 | Number of deaths |
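As an illustration of how records with these fields can feed the rate-style indicators the dashboard shows, the sketch below derives a mortality rate and a cure rate from one record. It is a minimal example, not the project's code: the function name `case_rates`, the definition of each rate as a fraction of confirmed cases, and the cure rate itself are assumptions made for the example.

```python
def case_rates(record):
    """Return mortality and cure rates as fractions of confirmed cases."""
    confirmed = record["confirmedCount"]
    if confirmed == 0:
        # Avoid division by zero for areas without confirmed cases.
        return {"mortality_rate": 0.0, "cure_rate": 0.0}
    return {
        "mortality_rate": record["deadCount"] / confirmed,
        "cure_rate": record["curedCount"] / confirmed,
    }


# Record built from the sample values listed for ProvinceData.json.
wuhan = {"cityName": "Wuhan", "confirmedCount": 495, "suspectedCount": 0,
         "curedCount": 31, "deadCount": 0}
rates = case_rates(wuhan)
```

With the sample record, the mortality rate is 0 and the cure rate is 31/495, roughly 6.3%.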

Data from the National Bureau of Statistics of China (China Statistical Yearbook 2018):

CityPopulation.json: population of each city

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| Name | string | Chengdu | City name |
| Value | int | 1633 | Population (10,000 people) |

Cityrate.json: urbanization rate of each province

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| Name | string | Beijing | Province name |
| Value | float | 86.5 | Urbanization rate (%) |

JSON file: higher education rate of each province (sample survey data, sampling ratio 0.82‰; combined proportion of postgraduates and undergraduates)

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| Name | string | Sichuan | Province name |
| Value | float | 0.056785446 | Higher education rate |

Doctor.json: number of health technicians per 10,000 people, by province

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| Name | string | Sichuan | Province name |
| Value | int | 67 | Number of health technicians per 10,000 people |

Wuhan-2019-nCoV.csv

Data from 2020-01-10 to 2020-02-06 were collected from the epidemic bulletins of the national, provincial, and Wuhan health commissions. Data after 2020-02-07 were collected from the Toutiao interface.

| Field type | Field name | Description |
| --- | --- | --- |
| Date | date | Update date |
| String | country | Country name |
| String | countryCode | Country code |
| String | province | Province (empty for foreign countries) |
| Integer | provinceCode | Province code |
| String | city | City or district (empty for foreign countries) |
| Integer | cityCode | City code |
| Integer | confirmed | Cumulative number of confirmed cases |
| Integer | suspected | Number of suspected cases |
| Integer | cured | Number of cured cases |
| Integer | dead | Cumulative number of deaths |

DXYArea.csv: data obtained from DXY (Ding Xiang Yuan)

| Field name | Type | Description |
| --- | --- | --- |
| continentName | string | Continent name (Chinese) |
| continentEnglishName | string | Continent name (English) |
| countryName | string | Country name |
| countryEnglishName | string | Country name (English) |
| provinceName | string | Province name |
| provinceEnglishName | string | Province name (English) |
| province_confirmedCount | int | Cumulative number of confirmed cases |
| province_suspectedCount | int | Cumulative number of suspected cases |
| province_curedCount | int | Cumulative number of cured cases |
| province_deadCount | int | Cumulative number of deaths |
| updateTime | date | Update time |

Datas.json: data on cases imported from abroad (collected manually from online news)

| Name | Type | Sample value | Description |
| --- | --- | --- | --- |
| Date | date | February 2 | Date |
| provinceName | string | Liaoning | Province of entry |
| countryName | string | Japan | Source country |
| Count | int | 2 | Number of imported cases that day |
| Total | int | 10 | Cumulative number of imported cases |

Analysis tasks and overall visual analysis process

Analysis tasks: analyze the spatial and temporal distribution pattern of the epidemic, monitor the development trend of the epidemic, and evaluate the prevention and control measures.

Overall process of visual analysis:

1. Observe the national case curve and the map to get an overall sense of the spatio-temporal distribution of the epidemic.

By observing the national epidemic curve and how the national map of cumulative confirmed cases changes over time, we identify turning points in the curve and abnormally colored regions on the map, and then click the corresponding province on the map to view its details.

2. Click the "urbanization rate vs. confirmed cases" button and examine, in the pop-up panel, how various factors relate to the number of confirmed cases. Potential relationships between each province's urbanization rate, GDP, or education level and its number of confirmed cases can be viewed there.

3. Click a province on the map and view its detailed information in the linked panels, to further analyze the causes of different patterns at different granularities. For example, clicking Hubei displays the detailed information for Hubei in the linked panels.

Data processing and algorithm model

For the Wuhan-2019-nCoV.csv file, the data originate from the National Health Commission, which publishes its figures as bulletin documents. Changes in the document format caused some errors and gaps during crawling, so missing values were filled in manually by checking them against the data published by the National Health Commission.
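One way to locate such gaps before filling them by hand is to list the dates that have no record. The sketch below is only an illustration using the standard library; the helper name `missing_dates` and the sample dates are assumptions and not part of the project.

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return every date in [start, end] that is absent from `present`."""
    have = set(present)
    all_days = (start + timedelta(days=i) for i in range((end - start).days + 1))
    return [day for day in all_days if day not in have]


# Hypothetical example: records exist for January 10, 11 and 13 only.
present = [date(2020, 1, 10), date(2020, 1, 11), date(2020, 1, 13)]
gaps = missing_dates(present, date(2020, 1, 10), date(2020, 1, 14))
```

Here `gaps` contains January 12 and January 14, the two days that would need to be checked against the official bulletins.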

The DXYArea.csv file is produced by crawling at short intervals, so a single day can contain many records for the same region, which creates a large amount of redundant data. We processed the file with pandas: for each province or country and each day, we keep only the record with the latest release time on that day.
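The deduplication rule described above can be sketched with pandas as follows. This is a minimal illustration rather than the project's actual script: the column names follow the DXYArea.csv field table, while the function name `keep_latest_per_day` and the sample rows are assumptions made for the example.

```python
import pandas as pd


def keep_latest_per_day(df: pd.DataFrame) -> pd.DataFrame:
    """Keep, for each country/province and calendar day, only the latest record."""
    out = df.copy()
    out["updateTime"] = pd.to_datetime(out["updateTime"])
    # Calendar day of each crawl record, used as part of the grouping key.
    out["day"] = out["updateTime"].dt.date
    out = out.sort_values("updateTime")
    # After sorting, the last duplicate in each group is the most recent one.
    out = out.drop_duplicates(
        subset=["countryName", "provinceName", "day"], keep="last"
    )
    return out.reset_index(drop=True)


# Hypothetical sample rows (values invented for illustration only).
sample = pd.DataFrame({
    "countryName": ["China", "China", "China"],
    "provinceName": ["Hubei", "Hubei", "Hubei"],
    "province_confirmedCount": [100, 120, 150],
    "updateTime": [
        "2020-01-25 08:00:00",
        "2020-01-25 20:00:00",
        "2020-01-26 09:00:00",
    ],
})
latest = keep_latest_per_day(sample)
```

In the sample, the two records crawled on January 25 collapse into the later one, leaving one row per day.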

On April 17, the National Health Commission revised the cumulative number of confirmed cases in Hubei, so the data show a small jump on that date.

Visualization and interaction design

Main interface:

(1) The system is divided into the following modules: the hot-search word cloud, the proportion of confirmed cases in each province, the distribution of new cases (local vs. imported from abroad), the map, the epidemic case counts, the relationship between urbanization rate and the number of confirmed cases, the top 10 provinces by imported cases, and the case distribution in each province (mortality rate and proportion of infected people).

(2) Scrolling down reveals two more modules: the curve of epidemic case counts by province, and the curve of epidemic case counts for the whole country (with a stacked bar chart of new cases nationwide).

Three panels are linked to the map: the distribution of new cases (local/overseas), the case distribution in each province (mortality rate, proportion of infected people), and the provincial case curve. Clicking a province on the map updates these three panels with the detailed information for that province.

(3) In the map module, the timeline can be dragged to show the national distribution of cumulative confirmed cases at different times.

(4) Clicking the button for the distribution of imported cases opens a panel that displays the spread chart of cases imported from abroad. The timeline can also be dragged to view the cumulative imported-case data at different times.

(5) Clicking a province on the map displays its detailed information in the linked panels. For example, clicking Heilongjiang displays the information shown in the figure.

(6) Clicking the "urbanization rate vs. confirmed cases" button opens a pop-up panel in which the potential relationship between different factors (such as GDP, higher education level, and urbanization rate) and the number of confirmed cases can be viewed.
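The relationships viewed in the pop-up panel of item (6) could also be summarized numerically with a correlation coefficient. The sketch below computes Pearson's r in plain Python. It is an illustration only: the function name `pearson_r` and the numbers are assumptions, not project data, and the source does not state that the system computes this statistic.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sequence gives zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


# Made-up values for four hypothetical provinces (not real statistics).
urbanization_rate = [50.0, 60.0, 70.0, 80.0]
confirmed = [100, 200, 300, 400]
r = pearson_r(urbanization_rate, confirmed)
```

Values of r near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship. A high r alone does not show that one factor causes the other.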

Experimental case analysis

By observing how the epidemic distribution on the map changes over time, we found the following spatial pattern:

In early January, only Hubei province was affected.

From January 20, the virus spread outward, and almost all neighboring provinces reported confirmed cases.

The map for January 24 shows that almost all provinces except Qinghai and Tibet had confirmed cases. By February 10, every province in China had confirmed cases, and the number of confirmed cases in the provinces neighboring Hubei was significantly higher than in other provinces.

According to the epidemic distribution map for May 13, the overall pattern is that the areas with more infections lie almost entirely southeast of the Hu Huanyong Line (Heihe-Tengchong Line). The Hu Huanyong Line is extremely important in geography as a dividing line for agriculture, population, and GDP, so we click the "urbanization rate vs. confirmed cases" button and view the scatterplot of provincial GDP against the number of confirmed cases (Figure 6-2). Almost all provinces with a higher GDP have far more confirmed cases than other provinces, plausibly because a high GDP goes together with high population mobility and population density, which favor the spread of the virus. Therefore, in cities with high population density, every link of epidemic prevention should be strictly checked to prevent a local rebound, and sampled nucleic acid tests should be conducted regularly.

At the same time, Figure 6-1 shows an anomaly in the map distribution: Heilongjiang, far away from Hubei, is dark in color, indicating a large number of confirmed cases. This is inconsistent with the pattern that provinces neighboring Hubei have more infections. We then analyze the stacked bar chart of new confirmed cases. Because Hubei's new and cumulative counts are far higher than those of other provinces, we first hide Hubei in the chart (the red series is Hubei). Because the data for Hong Kong, Macao, and Taiwan are incomplete, we hide those regions as well.

From March to mid-April, some provinces showed small increases, with Heilongjiang, Inner Mongolia, and Shanghai accounting for the largest share, so we click these three provinces to see the detailed breakdown of their new cases.

According to the distribution of new cases (local/overseas), the abnormal increase in confirmed cases in these three provinces in April was due to a large number of cases imported from abroad. Clicking the button for the distribution of imported confirmed cases shows the details.

A large number of imported confirmed cases have recently entered China, most of them from the UK, the US, Russia, Brazil, and Spain, which are among the most severely affected countries in the world. A key point of prevention and control is therefore to strictly prevent imported cases: tighten entry procedures, limit the number of people entering the country, and require nucleic acid tests and 14 days of quarantine upon entry.

Discussion and Summary

Through this analysis, we extracted the temporal and spatial distribution trends of the epidemic, as follows.

Temporal distribution trend:

From December 1, 2019 to January 20, 2020, a small increase was observed in Hubei province, while no confirmed cases were reported in other provinces.

From January 20, 2020 to February 12, 2020, there was a large outbreak and a sharp rise in the number of confirmed cases across the country, leading up to the inflection point around February 12.

The inflection point occurred on February 13, 2020, when the single-day increase began to decline.

From February 13 to March 1, 2020, the daily increase kept declining.

After March 1, 2020, new cases in provinces other than Hubei had essentially fallen to zero.

From March 10, 2020 to April 15, 2020, a small increase was observed in a few provinces outside Hubei, mainly from cases imported from abroad and asymptomatic infections.
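The periods above are described in terms of the single-day increase, which can be derived from a cumulative series by taking first differences. The sketch below shows this step and how the day with the largest increase can be located. The function names and the numbers are assumptions made for illustration; they are not the project's code or real case counts.

```python
def daily_increase(cumulative):
    """First differences of a cumulative series (one value per day)."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def peak_day(dates, cumulative):
    """Return (date, increase) for the largest single-day increase.

    dates[i] is the day on which cumulative[i] was reported.
    """
    increases = daily_increase(cumulative)
    idx = max(range(len(increases)), key=increases.__getitem__)
    # increases[idx] is the change from day idx to day idx + 1.
    return dates[idx + 1], increases[idx]


# Made-up cumulative counts for five consecutive days.
dates = ["day1", "day2", "day3", "day4", "day5"]
cumulative = [10, 30, 80, 100, 110]
increases = daily_increase(cumulative)
peak = peak_day(dates, cumulative)
```

In this toy series the increases are 20, 50, 20, and 10, so the peak falls on `day3` and the daily increase declines afterwards.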

Spatial distribution trend (the periods parallel the temporal trend):

From December 1, 2019 to January 20, 2020, confirmed cases were reported in Hubei province, while no confirmed cases were reported in other provinces.

From January 21, 2020 to January 29, 2020, confirmed cases were reported across the country.

From January 30, 2020 to March 1, 2020, there were large-scale outbreaks in the provinces neighboring Hubei.

From March 2, 2020 to the present, small outbreaks have been reported in a few provinces, such as Heilongjiang, Guangdong, Beijing, and Inner Mongolia.

Overall, the areas with more confirmed cases are mostly located southeast of the Hu Line, which is closely related to population density and population flow.
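To make the "southeast of the Hu Line" criterion concrete, the sketch below tests which side of the Heihe-Tengchong line a point falls on. It is a rough illustration under stated assumptions: the endpoint and city coordinates are approximate values supplied for the example, longitude and latitude are treated as flat x/y coordinates, and the function name is hypothetical.

```python
# Approximate endpoint coordinates (longitude, latitude) in degrees; assumed values.
HEIHE = (127.53, 50.24)
TENGCHONG = (98.49, 25.02)


def is_southeast_of_hu_line(lon, lat):
    """True if the point lies on the southeast side of the Heihe-Tengchong line.

    Longitude/latitude are treated as flat x/y coordinates, which is only a
    rough approximation of the line drawn on a projected map.
    """
    dx = HEIHE[0] - TENGCHONG[0]
    dy = HEIHE[1] - TENGCHONG[1]
    # 2-D cross product of the line direction (Tengchong -> Heihe) with the
    # vector Tengchong -> point; a negative value means the point is to the
    # right of that direction, i.e. on the southeast side.
    cross = dx * (lat - TENGCHONG[1]) - dy * (lon - TENGCHONG[0])
    return cross < 0


# Approximate city coordinates, used only to exercise the function.
wuhan_side = is_southeast_of_hu_line(114.31, 30.59)  # Wuhan
lhasa_side = is_southeast_of_hu_line(91.10, 29.65)   # Lhasa
```

With these coordinates Wuhan falls on the southeast side and Lhasa on the northwest side, matching their positions relative to the line.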

Key prevention and control measures:

Prevention and control in the post-epidemic period should focus mainly on preventing cases imported from abroad and on preventing and controlling asymptomatic cases.

For imported cases:

Entry procedures and the number of people entering must be strictly controlled. The medical history and contact history of people entering China (including whether they come from countries seriously affected by the epidemic) must be strictly reviewed, and every person entering China should take a nucleic acid test and be strictly isolated during testing. Attention should focus on the four major aviation hubs (Beijing, Shanghai, Guangzhou, and Chengdu), which are the top priority for prevention and control because of their huge daily passenger flow and large outbound passenger flow. Attention should also be paid to China's border provinces, such as Inner Mongolia, Heilongjiang, Tibet, and Xinjiang. Many border areas are vast and sparsely populated, which makes illegal border crossings more likely, and their medical capacity is limited, so timely detection and treatment are difficult.

For asymptomatic infections: currently, little information is published about asymptomatic infected persons; generally only their number is released. We hope that more detailed information, such as movement tracks, can be released later. In addition, scientific and effective nucleic acid sampling tests should be carried out at regular intervals, in accordance with medical standards, to understand the situation of asymptomatic infections in each region.