This post uses Python to crawl the epidemic data published by Tencent News, Pyecharts to visualize domestic and international case counts on maps, and Matplotlib to draw square charts of confirmed, recovered, and death counts.

A note up front: this is no longer a new topic, so experts, please go easy on me.

Import related modules

import time
import json
import requests
from datetime import datetime
import pandas as pd
import numpy as np

1. Epidemic data collection

Crawl the data published by Tencent News

Address: news.qq.com/zt2020/page…

For a static web page, requesting the URL shown in the address bar is enough to get the page's data. For a dynamic page, the key is to first work out how the page loads its data and which requests it makes, and only then write the code.

Right-click the page and choose Inspect, open the Network tab, then press Ctrl+R to reload and watch the requests.

# Define the data-fetching functions: https://beishan.blog.csdn.net/
def Domestic():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5'
    response = requests.get(url=url).json()
    data = json.loads(response['data'])
    return data


def Oversea():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign'
    response = requests.get(url=url).json()
    data = json.loads(response['data'])
    return data


domestic = Domestic()
oversea = Oversea()
print(domestic.keys())
print(oversea.keys())

Output:

dict_keys(['lastUpdateTime', 'chinaTotal', 'chinaAdd', 'isShowAdd', 'showAddSwitch', 'areaTree'])
dict_keys(['foreignList', 'globalStatis', 'globalDailyHistory', 'importStatis', 'countryAddConfirmRankList', 'countryConfirmWeekCompareRankList', 'continentStatis'])
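The endpoint wraps its payload twice: the HTTP body is JSON, and its `data` field is itself a JSON-encoded string, which is why the functions parse the `data` field a second time with `json.loads`. A minimal sketch of that decoding step, using a made-up sample string in place of a live response (the helper name `decode_payload` and all sample values are assumptions for illustration):

```python
import json


def decode_payload(raw_text):
    """Decode a response whose 'data' field is a JSON string nested in JSON."""
    outer = json.loads(raw_text)       # first pass: the HTTP body
    return json.loads(outer['data'])   # second pass: the embedded string


# Made-up sample in the same general shape as the response; not real data.
inner = json.dumps({'lastUpdateTime': '2020-01-01 00:00:00', 'areaTree': []})
sample = json.dumps({'data': inner})

payload = decode_payload(sample)
print(sorted(payload.keys()))
```

Keeping the decoding separate from the network call makes it possible to test the parsing on a saved response without hitting the server.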

2. Preliminary analysis

Extract data details of each region

areaTree = domestic['areaTree']
# View and analyze the specific data
areaTree

Extract data details of foreign regions

foreignList = oversea['foreignList']
# View and analyze the specific data
foreignList

The output shows how the JSON data is structured.

3. Data processing

3.1 Extraction of epidemic data from domestic provinces

# Address: https://beishan.blog.csdn.net/
china_data = areaTree[0]['children']
china_list = []
for a in range(len(china_data)):
    province = china_data[a]['name']
    confirm = china_data[a]['total']['confirm']
    heal = china_data[a]['total']['heal']
    dead = china_data[a]['total']['dead']
    # Current cases = cumulative confirmed - recovered - deaths
    nowConfirm = confirm - heal - dead
    china_dict = {}
    china_dict['province'] = province
    china_dict['nowConfirm'] = nowConfirm
    china_list.append(china_dict)

china_data = pd.DataFrame(china_list)
# Store as an Excel file (the file name here is a placeholder)
china_data.to_excel("china_data.xlsx", index=False)
china_data.head()
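The same extraction can be written more compactly with a list comprehension. The sketch below runs on a small made-up list shaped like `areaTree[0]['children']`; the province names, the numbers, and the helper `current_confirmed` are invented for illustration.

```python
# Made-up records shaped like areaTree[0]['children']; not real data.
sample_children = [
    {'name': 'A', 'total': {'confirm': 100, 'heal': 90, 'dead': 5}},
    {'name': 'B', 'total': {'confirm': 20, 'heal': 20, 'dead': 0}},
]


def current_confirmed(total):
    """Current cases = cumulative confirmed - recovered - deaths."""
    return total['confirm'] - total['heal'] - total['dead']


rows = [
    {'province': item['name'], 'nowConfirm': current_confirmed(item['total'])}
    for item in sample_children
]
print(rows)
```

Passing `rows` to `pd.DataFrame` gives the same two-column table as the loop version.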

3.2 International epidemic data extraction

world_data = foreignList
world_list = []
for a in range(len(world_data)):
    country = world_data[a]['name']
    nowConfirm = world_data[a]['nowConfirm']
    confirm = world_data[a]['confirm']
    dead = world_data[a]['dead']
    heal = world_data[a]['heal']
    world_dict = {}
    world_dict['country'] = country
    world_dict['nowConfirm'] = nowConfirm
    world_dict['confirm'] = confirm
    world_dict['dead'] = dead
    world_dict['heal'] = heal
    world_list.append(world_dict)

world_data = pd.DataFrame(world_list)
# Store as an Excel file (the file name here is a placeholder)
world_data.to_excel("world_data.xlsx", index=False)
world_data.head()

3.3 Data Integration

Merge domestic data with overseas data

Check whether world_data already contains China's epidemic data

world_data.loc[world_data['country'] == "中国"]

Extract China's data from areaTree and append it to world_data

confirm = areaTree[0]['total']['confirm']  # Cumulative confirmed cases in China
heal = areaTree[0]['total']['heal']  # Cumulative recovered cases in China
dead = areaTree[0]['total']['dead']  # Cumulative deaths in China
nowConfirm = confirm - heal - dead  # Current confirmed cases in China
# DataFrame.append was removed in pandas 2.0, so build a one-row frame and concatenate
china_row = pd.DataFrame([{
    'country': "中国",
    'nowConfirm': nowConfirm,
    'confirm': confirm,
    'heal': heal,
    'dead': dead
}])
world_data = pd.concat([world_data, china_row], ignore_index=True)

Check again whether the data contains Chinese epidemic data

world_data.loc[world_data['country'] == "中国"]

4. Visualization

4.1 Visualization of domestic epidemic situation

Import the Pyecharts libraries

import pyecharts.options as opts
from pyecharts.charts import Map
from pyecharts.globals import CurrentConfig, NotebookType
CurrentConfig.NOTEBOOK_TYPE = NotebookType.JUPYTER_LAB

Map of the number of confirmed cases in each region of the country

m = Map()
m.add("",
      [list(z) for z in zip(list(china_data["province"]),
                            list(china_data["nowConfirm"]))],
      maptype="china",
      is_map_symbol_show=False)
m.set_global_opts(
    title_opts=opts.TitleOpts(title="Map of current confirmed COVID-19 cases by region in China"),
    visualmap_opts=opts.VisualMapOpts(
        is_piecewise=True,
        pieces=[
            {"min": 5000, "label": '>5000', "color": "#893448"},
            {"min": 1000, "max": 4999, "label": '1000-4999', "color": "#ff585e"},
            {"min": 500, "max": 999, "label": '500-999', "color": "#fb8146"},
            {"min": 101, "max": 499, "label": '101-499', "color": "#ffA500"},
            {"min": 10, "max": 100, "label": '10-100', "color": "#ffb248"},
            {"min": 1, "max": 9, "label": '1-9', "color": "#fff2d1"},
            {"max": 0, "label": '0', "color": "#ffffff"}
        ]))
m.render_notebook()
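Piecewise ranges are easy to get subtly wrong, with gaps or overlaps between neighbouring buckets. A small pure-Python check, independent of Pyecharts, can confirm which label a given count falls under. The helper `bucket_label` is hypothetical, the list is modeled on the `pieces` used for the map with the zero bucket written as `max: 0`, and the bounds are assumed to be inclusive on both ends.

```python
# Ranges modeled on the map's pieces list; colors omitted for brevity.
PIECES = [
    {'min': 5000, 'label': '>5000'},
    {'min': 1000, 'max': 4999, 'label': '1000-4999'},
    {'min': 500, 'max': 999, 'label': '500-999'},
    {'min': 101, 'max': 499, 'label': '101-499'},
    {'min': 10, 'max': 100, 'label': '10-100'},
    {'min': 1, 'max': 9, 'label': '1-9'},
    {'max': 0, 'label': '0'},
]


def bucket_label(value, pieces=PIECES):
    """Return the label of the first piece whose inclusive range contains value."""
    for piece in pieces:
        low = piece.get('min', float('-inf'))
        high = piece.get('max', float('inf'))
        if low <= value <= high:
            return piece['label']
    return None  # value falls in a gap between pieces


for count in (0, 9, 10, 100, 101, 999, 5000):
    print(count, bucket_label(count))
```

Running the helper over a range of integers and checking that it never returns `None` is a quick way to spot a gap before rendering the map.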

4.2 Visualization of international epidemic situation

The world map expects English country names, so the Chinese names must be converted first. pandas' merge function does this by joining against a name lookup table.

pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)

how: one of 'left', 'right', 'outer', 'inner'; the default is 'inner'. 'inner' keeps the intersection of the keys, 'outer' keeps the union.
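To make the difference between 'inner' and 'outer' concrete, here is a minimal sketch on two tiny made-up tables (the rows and the column name `English` are assumptions for illustration, and pandas must be installed):

```python
import pandas as pd

# Made-up tables for illustration only.
left = pd.DataFrame({'country': ['中国', '日本', '某国'],
                     'confirm': [3, 2, 1]})
right = pd.DataFrame({'中文': ['中国', '日本', '法国'],
                      'English': ['China', 'Japan', 'France']})

inner = pd.merge(left, right, left_on='country', right_on='中文', how='inner')
outer = pd.merge(left, right, left_on='country', right_on='中文', how='outer')

print(len(inner))  # rows whose key appears in both tables
print(len(outer))  # rows whose key appears in either table
```

With how='inner', any country that is missing from the name table is silently dropped, so it is worth comparing row counts before and after the merge.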

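# Chinese-English country name lookup table (the file name here is a placeholder)
world_name = pd.read_excel("country_names.xlsx")
world_data_t = pd.merge(world_data,
                        world_name,
                        left_on="country",
                        right_on="中文",
                        how="inner")
world_data_t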

169 rows × 7 columns

Map of the number of confirmed cases in countries around the world

m2 = Map()
m2.add("",
       [list(z) for z in zip(list(world_data_t["English"]),
                             list(world_data_t["nowConfirm"]))],
       maptype="world",
       is_map_symbol_show=False)
m2.set_global_opts(
    title_opts=opts.TitleOpts(title="Map of current COVID-19 cases in countries around the world"),
    visualmap_opts=opts.VisualMapOpts(
        is_piecewise=True,
        pieces=[
            {"min": 5000, "label": '>5000', "color": "#893448"},
            {"min": 1000, "max": 4999, "label": '1000-4999', "color": "#ff585e"},
            {"min": 500, "max": 999, "label": '500-999', "color": "#fb8146"},
            {"min": 101, "max": 499, "label": '101-499', "color": "#ffA500"},
            {"min": 10, "max": 100, "label": '10-100', "color": "#ffb248"},
            {"min": 0, "max": 9, "label": '0-9', "color": "#fff2d1"}
        ]))
# Hide the country name labels on the map
m2.set_series_opts(label_opts=opts.LabelOpts(is_show=False))
m2.render_notebook()

4.3 Square chart of the domestic epidemic

Take out China’s epidemic data separately

China_data = world_data.loc[world_data['country'] == "中国"].reset_index(drop=True)
China_data

Extract the cumulative confirmed, recovered, and death counts from China_data

# data.at[n, 'name'] reads the value at row index n and column 'name'
w_confirm = China_data.at[0, 'confirm']
w_heal = China_data.at[0, 'heal']
w_dead = China_data.at[0, 'dead']

Import matplotlib libraries

import matplotlib.pyplot as plt
import matplotlib.patches as patches

Build the square chart of the epidemic in China

# -*- coding: utf-8 -*-
%matplotlib inline
fig1 = plt.figure()
ax1 = fig1.add_subplot(111, aspect='equal', facecolor='#fafaf0')
ax1.set_xlim(-w_confirm / 2, w_confirm / 2)
ax1.set_ylim(-w_confirm / 2, w_confirm / 2)
ax1.spines['top'].set_color('none')
ax1.spines['right'].set_color('none')
ax1.spines['bottom'].set_position(('data', 0))
ax1.spines['left'].set_position(('data', 0))
ax1.set_xticks([])
ax1.set_yticks([])
p0 = patches.Rectangle((-w_confirm / 2, -w_confirm / 2),
                       width=w_confirm,
                       height=w_confirm,
                       facecolor='#29648c',
                       label='confirm')
p1 = patches.Rectangle((-w_heal / 2, -w_heal / 2),
                       width=w_heal,
                       height=w_heal,
                       facecolor='#69c864',
                       label='heal')
p2 = patches.Rectangle((-w_dead / 2, -w_dead / 2),
                       width=w_dead,
                       height=w_dead,
                       facecolor='#000000',
                       label='dead')
plt.gca().add_patch(p0)
plt.gca().add_patch(p1)
plt.gca().add_patch(p2)
plt.title('COVID-19 Square - China', fontdict={'size': 20})
plt.legend(loc='best')
plt.show()
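In the chart above, each count sets the side length of its square, so the areas grow with the square of the counts. If you would rather have the area proportional to the count, one option (an alternative, not what the code above does) is to use the square root of the count as the side length. A minimal sketch with made-up numbers; `side_for_area` is a hypothetical helper:

```python
import math


def side_for_area(count):
    """Side length of a square whose area equals count."""
    return math.sqrt(count)


# Made-up counts for illustration; not real data.
counts = {'confirm': 10000, 'heal': 8100, 'dead': 400}
sides = {name: side_for_area(value) for name, value in counts.items()}
print(sides)
```

Each side length would then replace the raw count in the `width`, `height`, and corner offset passed to `patches.Rectangle`.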

4.4 Square charts of the international epidemic

Sort the data by cumulative confirmed cases, in descending order

world_data.sort_values("confirm", ascending=False, inplace=True)
world_data.reset_index(drop=True, inplace=True)
world_data

162 rows × 5 columns

Build the square charts for the 20 countries with the most confirmed cases

# -*- coding: utf-8 -*-
plt.rcParams['font.sans-serif'] = [u'SimHei']
plt.rcParams['axes.unicode_minus'] = False
fig1 = plt.figure(figsize=(25, 25))
for a in range(20):
    w_confirm = world_data.at[a, 'confirm']
    w_heal = world_data.at[a, 'heal']
    w_dead = world_data.at[a, 'dead']
    ax1 = fig1.add_subplot(20 // 4,
                           4,
                           a + 1,
                           aspect='equal',
                           facecolor='#fafaf0')
    ax1.set_xlim(-w_confirm / 2, w_confirm / 2)
    ax1.set_ylim(-w_confirm / 2, w_confirm / 2)
    ax1.spines['top'].set_color('none')
    ax1.spines['right'].set_color('none')
    ax1.spines['bottom'].set_position(('data', 0))
    ax1.spines['left'].set_position(('data', 0))
    ax1.set_xticks([])
    ax1.set_yticks([])
    p0 = patches.Rectangle((-w_confirm / 2, -w_confirm / 2),
                           width=w_confirm,
                           height=w_confirm,
                           alpha=min(1.0, w_confirm / 90000),
                           facecolor='#29648c',
                           label='confirm')
    p1 = patches.Rectangle((-w_heal / 2, -w_heal / 2),
                           width=w_heal,
                           height=w_heal,
                           alpha=1,
                           facecolor='#69c864',
                           label='heal')
    p2 = patches.Rectangle((-w_dead / 2, -w_dead / 2),
                           width=w_dead,
                           height=w_dead,
                           alpha=1,
                           facecolor='black',
                           label='dead')
    plt.gca().add_patch(p0)
    plt.gca().add_patch(p1)
    plt.gca().add_patch(p2)
    plt.title(world_data.at[a, 'country'], fontdict={'size': 20})
    plt.legend(loc='best')
plt.show()

These charts show at a glance how the numbers of confirmed cases, recoveries, and deaths compare in each country.