
A modest gentleman, with a big river.

Preface

The previous article covered plotting PPI-CPI and M0-M1-M2. This article continues with PMI, an indicator of how buoyant economic activity is. The data is obtained with a crawler, and the PMI data over the years is then plotted with matplotlib. Along the way, beginners can pick up the basics of Python, crawlers, and plotting.

PMI data acquisition

Before fetching the numbers, here is what PMI means. Manufacturing is widely regarded as the foundation of a country's economy, and the PMI (Purchasing Managers' Index) measures how that sector is developing and operating. Under normal circumstances, 50% is the dividing line for economic strength: a reading above 50% means manufacturing is expanding, a reading between 40% and 50% means manufacturing is in recession, and a reading below 40% means the economy as a whole is in recession.
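
The thresholds above can be written as a small helper. This is only a sketch: the function name `classify_pmi` and the returned labels are made up for illustration, and how the exact boundary values 50 and 40 are treated is an assumption, since the text does not say.

```python
def classify_pmi(value):
    """Map a PMI reading (in percent) to the interpretation described above."""
    if value > 50:
        return "manufacturing expanding"
    # Assumption: readings of exactly 50 and exactly 40 fall into the middle band.
    if value >= 40:
        return "manufacturing in recession"
    return "economy in recession"


for reading in (51.2, 47.0, 38.5):
    print(reading, "->", classify_pmi(reading))
```

The same function could later be applied to every value in the manufacturing list to label each month.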

Since this is about data acquisition, we need an authoritative website to get the data from. Here I use data from Eastmoney (Oriental Fortune), and the page address is given directly:

    # PMI data page
    https://data.eastmoney.com/cjsj/pmi.html

The PMI data source is shown in the figure below. Only the manufacturing and non-manufacturing index values are extracted; the year-on-year growth figures are left out.

Now that we know where the purchasing managers' index comes from, how do we obtain the data? Copying the page into Excel and parsing it there would take a lot of time and effort. You may have noticed the pagination at the bottom of the table, which suggests the page talks to a backend over Ajax. Inspecting the requests reveals the following interface, and the data returned is shown in the figure below:

    # Purchasing managers' index (PMI) interface
    https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=21

    # The money supply, PPI and CPI interfaces are listed here as well.
    # They are identical except for the mkt parameter.
    # Money supply interface
    https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=11
    # PPI and CPI interfaces
    https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=10&mkt=22
    https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=10&mkt=19
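
Because the interfaces differ only in the mkt parameter, the URL can be assembled from one template. The sketch below is illustrative: `build_url` and `MKT_CODES` are hypothetical names, reading `p` as the page number and `ps` as the page size is an inference from the URLs rather than documented behavior, and the PPI/CPI codes follow the order in which the two URLs are listed.

```python
from urllib.parse import urlencode

BASE_URL = "https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx"

# mkt codes taken from the URLs listed above.
# Assumption: PPI is 22 and CPI is 19, following the order of the listing.
MKT_CODES = {
    "pmi": 21,
    "money_supply": 11,
    "ppi": 22,
    "cpi": 19,
}


def build_url(indicator, page=1, page_size=200):
    """Build the interface URL for one indicator."""
    params = {
        "type": "GJZB",
        "sty": "ZGZB",
        "p": page,        # assumed to be the page number
        "ps": page_size,  # assumed to be the number of rows per page
        "mkt": MKT_CODES[indicator],
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_url("pmi"))
```

Keeping the codes in one dictionary means the same fetching code can be reused for the other indicators by changing a single argument.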

The data is fetched in Python with the Requests library:

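    import requests

    # PMI interface (mkt=21)
    req_url = "https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=21"

    body = requests.get(req_url).text
    # Strip the parentheses wrapping the response, then split it into records
    body = body.replace("(", "").replace(")", "")
    data_list = body.split('","')

    # Lists for dates, manufacturing PMI and non-manufacturing PMI
    date_list, pmi1_list, pmi2_list = [], [], []

    for node in data_list:
        # Remove leftover brackets and quotes from each record
        node = node.replace("]", "").replace("[", "").replace('"', "")
        arr_list = node.split(",")
        date = arr_list[0]
        # Skip records before 2010
        if date < "2010-01-01":
            continue
        # Time data
        date_list.append(date)
        # Manufacturing index (field 1) and non-manufacturing index (field 3)
        pmi1_list.append(float(arr_list[1]))
        pmi2_list.append(float(arr_list[3]))
        print(node)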

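
The string replacements above can also be swapped for JSON parsing. The sketch below assumes the response body is a JSON array of comma-separated strings wrapped in parentheses, which is inferred from the string handling in the code rather than from any interface documentation. The function name `parse_pmi_payload` is hypothetical, and the sample string contains made-up values purely to exercise the parser.

```python
import json


def parse_pmi_payload(body, start_date="2010-01-01"):
    """Parse a body assumed to look like (["date,mfg,mfg_yoy,non_mfg,non_mfg_yoy", ...])."""
    text = body.strip()
    # Drop the wrapping parentheses so a plain JSON array remains
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    rows = json.loads(text)

    dates, mfg, non_mfg = [], [], []
    for row in rows:
        fields = row.split(",")
        # ISO-formatted dates compare correctly as strings
        if fields[0] < start_date:
            continue
        dates.append(fields[0])
        mfg.append(float(fields[1]))
        non_mfg.append(float(fields[3]))
    return dates, mfg, non_mfg


# Made-up sample in the assumed format (not real PMI values)
sample = '(["2022-02-01,50.0,-1.0,51.0,0.5","2009-12-01,56.0,0.0,55.0,0.0"])'
print(parse_pmi_payload(sample))
```

Parsing with `json.loads` avoids accidentally stripping brackets or quotes that belong to the data, at the cost of failing loudly if the response is not shaped as assumed.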

The resulting data is shown in the figure below:

PMI graph drawing

Before drawing the graph, the data needs to be processed:

  • 1. The fetched data has to be extracted and converted into a format suitable for display.
  • 2. As before, the data is organized into three lists: manufacturing, non-manufacturing, and time.
  • 3. As before, np.asarray is used to create the arrays in preparation for plotting.

Based on these points, the code for data processing is shown below:
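
A minimal sketch of this processing step, assuming the three lists produced by the fetching code (`date_list`, `pmi1_list`, `pmi2_list`) and ISO-formatted date strings. The function name `to_arrays` is hypothetical, and sorting by date is an added assumption so that the curves run from oldest to newest regardless of the order the interface returns.

```python
import numpy as np


def to_arrays(date_list, pmi1_list, pmi2_list):
    """Convert the parsed lists to NumPy arrays sorted by date, oldest first."""
    dates = np.asarray(date_list, dtype="datetime64[D]")
    mfg = np.asarray(pmi1_list, dtype=float)
    non_mfg = np.asarray(pmi2_list, dtype=float)
    # Reorder all three arrays with the same index so the rows stay aligned
    order = np.argsort(dates)
    return dates[order], mfg[order], non_mfg[order]
```

Calling `to_arrays(date_list, pmi1_list, pmi2_list)` returns three aligned arrays that can be passed straight to matplotlib.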

For drawing the graph, there are the following points:

  • 1. Both the manufacturing and non-manufacturing data should appear in the graph, with a legend to tell them apart.
  • 2. Horizontal lines are drawn at index values 50 and 40 to serve as reference lines.
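
These two points can be sketched with matplotlib as follows. The function name `plot_pmi`, the colors, the figure size, and the English labels are illustrative choices, not taken from the original code; the inputs are assumed to be a date sequence and two value sequences of equal length.

```python
import matplotlib.pyplot as plt


def plot_pmi(dates, mfg, non_mfg):
    """Plot both PMI series with reference lines at 50 and 40."""
    fig, ax = plt.subplots(figsize=(10, 5))
    # Two labeled series so the legend can tell them apart
    ax.plot(dates, mfg, label="Manufacturing PMI")
    ax.plot(dates, non_mfg, label="Non-manufacturing PMI")
    # Reference lines: 50 is the dividing line, 40 the lower threshold
    ax.axhline(50, color="gray", linestyle="--", linewidth=1)
    ax.axhline(40, color="red", linestyle="--", linewidth=1)
    ax.set_xlabel("Date")
    ax.set_ylabel("PMI (%)")
    ax.set_title("Manufacturing vs. non-manufacturing PMI")
    ax.legend()
    return fig, ax
```

After calling `plot_pmi(...)`, `plt.show()` displays the figure, or `fig.savefig("pmi.png")` writes it to a file.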

With this code in place, the final graph comparing the manufacturing and non-manufacturing indices is as follows:

Conclusion

In this article, we wrote a simple Python crawler, used NumPy for simple data processing, and finally used Matplotlib to draw the graph, giving an intuitive view of the manufacturing and non-manufacturing indices. Because the data comes from the interface, it can be fetched again at any time to refresh and redraw the graph, with no need to capture the data by hand again.