Python quantitative data warehouse construction 3: data warehousing code encapsulation

This series of tutorials for quantitative developers, to provide local quantitative financial data warehouse construction tutorials and full set of source code. We use the UDATA financial data community as the data source to dump the financial base data into a local database. Tutorial provides a full set of source code, including historical data download and incremental data update, data update task deployment and daily monitoring operations.Copy the code

In the previous section, we wrapped a Python custom class that operates on the MySQL database as MySQLOperation. This section operates on the database by calling the MySQLOperation method and the pandas. To_sql and pandas. Read_sql operations.

I. Introduction to UDATA operation

1. Obtain the Token

A. Register and log in on udata.hs.net, and order A free experience package on the subscription page;

B. In the upper right corner, enter the overview page from the avatar drop-down menu to copy the Token;

C. On the data page, view the data interface document to obtain the interface name, request parameters, return parameters, and Python code examples;

2. Install HS_UDATA

pip install hs_udata
Copy the code

The following is an example:

import hs_udata as hs
# set the Token
hs.set_token(token = 'xxxxxxxxxxxxxxx')  # Overview page to obtain personal tokens
SZ in 2021-05-01 to 2021-06-01 min. line data
See # interface documentation: https://udata.hs.net/datas/342/
df = hs.stock_quote_minutes(en_prod_code="000001.SZ",begin_date="20210501",end_date="20210601")
# add ticker column
df['hs_code'] ='000001.SZ'
df.head()
Copy the code

The process of using other interfaces is similar;

Two, the data database example

Take stock_list as an example to explain the operation of table building, storage, query and so on. The complete code is in chapter 3 of this article.

1. Preparation

(1) In MySQL database, create database UDATA, see lecture 1 “Python Quantitative Data Warehouse Construction Series 1: Database Installation and Operation”;

MySQL > create table udata.tb_update_records (udATa.tb_update_records);

Create table SQL as follows:

CREATE TABLE udata.tb_update_records (
				table_name CHAR(40),
				data_date CHAR(20),
				update_type CHAR(20),
				data_number INT(20),
				elapsed_time INT(20),
				updatetime CHAR(20))Copy the code

(3) write Token and database parameters into the db_mysql. config file as follows:

[udata] token=' your token 'host='127.0.0.1' port=3306 user='root' passwd=' password' db='udata'Copy the code

(4) Read the parameters in the configuration file

import configparser
# Read config file, constant number and database parameters
configFilePath = 'DB_MySQL.config'
section = 'udata'
config = configparser.ConfigParser()
config.read(configFilePath)
Read UData tokens
token = eval(config.get(section=section, option='token'))
MySQL > select * from mysql_connect
host = eval(config.get(section=section, option='host'))
port = int(config.get(section=section, option='port'))
db = eval(config.get(section=section, option='db'))
user = eval(config.get(section=section, option='user'))
passwd = eval(config.get(section=section, option='passwd'))
Copy the code

2, build table

On the Community Data page, view the data table to return parameter fields to the stock list (stock_list, udata.hs.net/datas/202/)…

DB_Table. Config file DB_Table. Config file DB_Table.

[tb_stock_list]
DROP_TABLE='DROP TABLE IF EXISTS tb_stock_list'
CREATE_TABLE='''CREATE TABLE tb_stock_list (
				secu_code CHAR(20),
				hs_code CHAR(20),
				secu_abbr CHAR(20),
				chi_name CHAR(40),
				secu_market CHAR(20), 
				listed_state CHAR(20),
				listed_sector CHAR(20),
				updatetime CHAR(20)
				)'''
DELETE_DATA = 'truncate table tb_stock_list'
Copy the code

Table code is as follows:

MySQL > open, close, execute SQL, execute SQL to read data
MySQL = MySQLOperation(host, port, db, user, passwd)
# Read config file, constant number and database parameters
configFilePath = 'DB_Table.config'
section = 'tb_stock_list'
config = configparser.ConfigParser()
config.read(configFilePath)
DROP_TABLE = eval(config.get(section=section, option='DROP_TABLE'))
CREATE_TABLE = eval(config.get(section=section, option='CREATE_TABLE'))
DELETE_DATA = eval(config.get(section=section, option='DELETE_DATA'))
Drop table structure
MySQL.Execute_Code(DROP_TABLE)
Create table structure
MySQL.Execute_Code(CREATE_TABLE)
Copy the code

3. Data warehousing

import hs_udata as hs
from sqlalchemy import create_engine
from datetime import datetime
# set the token
hs.set_token(token)
Get stock list data
df = hs.stock_list()
Add the system timestamp in the last column
dt = datetime.now()
df['updatetime'] = dt.strftime('%Y-%m-%d %H:%M:%S')
# Since the stock list data is fully updated, delete the data in the table before inserting the data
MySQL.Execute_Code(DELETE_DATA)
# sqlalchemy connection Settings available for pandas. Read_sql, pandas
engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}? charset=utf8'.format(user,passwd,host,port,db))
Write data to a table in MySQL
df.to_sql(name='tb_stock_list', con=engine, index=False, if_exists='append')
Copy the code

The result is as follows:

4. Read data

import pandas as pd
result = pd.read_sql('''SELECT * FROM tb_stock_list ''', con=engine)
print(result.head())
Copy the code

Third, the library code encapsulation

Encapsulate the above steps to define and invoke properties and methods of the Python class class. The main technical points in the code are as follows:

(1) Use pymysql, pandas. To_sql and pandas. Read_sql to operate MySQL database.

(2) Use the method of class class to integrate the operation of building tables, inserting data and querying data;

(3) the use of configuration files, from the local file, read database parameters and table operation SQL code;

(4) Use the try fault tolerance mechanism and log function to print the execution log to the local db_mysql_log. TXT file.

import pandas as pd
import hs_udata as hs
from MySQLOperation import *
from sqlalchemy import create_engine
from datetime import datetime
import time
import configparser
import logging
import traceback
import warnings
warnings.filterwarnings("ignore")

class TB_Stock_List:

    def __init__(self,MySQL_Config,BD_Name,Table_Config,Table_Name) :
        # create log
        self.logging = logging
        self.logging.basicConfig(filename='DB_MySQL_LOG.txt', level=self.logging.DEBUG
                                 , format='%(asctime)s - %(levelname)s - %(message)s')
        # Read config file, constant number and database parameters
        configFilePath = MySQL_Config
        self.section1 = BD_Name
        config = configparser.ConfigParser()
        config.read(configFilePath)
        Read UData tokens
        self.token = eval(config.get(section=self.section1, option='token'))
        MySQL > select * from mysql_connect
        self.host = eval(config.get(section=self.section1, option='host'))
        self.port = int(config.get(section=self.section1, option='port'))
        self.db = eval(config.get(section=self.section1, option='db'))
        self.user = eval(config.get(section=self.section1, option='user'))
        self.passwd = eval(config.get(section=self.section1, option='passwd'))
        # pymysql connection Settings, open, close, execute SQL, execute SQL read data
        self.MySQL = MySQLOperation(self.host, self.port, self.db, self.user, self.passwd)
        # sqlalchemy connection Settings available for pandas. Read_sql, pandas
        self.engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}? charset=utf8'.format(self.user
                                                                                      , self.passwd
                                                                                      , self.host
                                                                                      , self.port
                                                                                      , self.db))
        # Read config file, constant number and database parameters
        configFilePath = Table_Config
        self.section2 = Table_Name
        config = configparser.ConfigParser()
        config.read(configFilePath)
        self.DROP_TABLE_SQL = eval(config.get(section=self.section2, option='DROP_TABLE'))
        self.CREATE_TABLE_SQL = eval(config.get(section=self.section2, option='CREATE_TABLE'))
        self.DELETE_DATA_SQL = eval(config.get(section=self.section2, option='DELETE_DATA'))

        self.logging.info('* * * * * * * * * * * * * * * * * * * * * {0} {1} * * * * * * * * * * * * * * * * * * * * *'.format(self.section1, self.section2))

    def CREATE_TABLE(self) :
        try:
            Drop table structure
            self.MySQL.Execute_Code('SET FOREIGN_KEY_CHECKS = 0')
            self.MySQL.Execute_Code(self.DROP_TABLE_SQL)
            Create table structure
            self.MySQL.Execute_Code(self.CREATE_TABLE_SQL)
            self.logging.info('Table {0}.{1}, table created successfully'.format(self.section1,self.section2))
        except:
            self.logging.info('Table {0}.{1}, table creation failed'.format(self.section1,self.section2))
            self.logging.debug(traceback.format_exc())

    def UPDATE_DATA(self) :
        try:
            # set the token
            hs.set_token(self.token)
            time_start = time.time()  # timing
            Get stock list data
            df = hs.stock_list()
            Add the system timestamp in the last column
            dt = datetime.now()
            df['updatetime'] = dt.strftime('%Y-%m-%d %H:%M:%S')
            # Since the stock list data is fully updated, delete the data in the table before inserting the data
            self.MySQL.Execute_Code(self.DELETE_DATA_SQL)
            Write data to a table in MySQL
            df.to_sql(name='tb_stock_list', con=self.engine, index=False, if_exists='append')
            time_end = time.time()  # timing
            elapsed_time = round(time_end-time_start,2)
            Mysql > update table name, data date, update method, update number, update time, system time
            self.RECORDS_SQL = . "' INSERT INTO udata tb_update_records VALUES (' {0} ', '{1}', 'whole quantity, {2}, {3}, SYSDATE ()' ' '.format(self.section2
                                                                   ,dt.strftime('%Y-%m-%d'),len(df),elapsed_time)
            self.MySQL.Execute_Code(self.RECORDS_SQL)
            self.logging.info('Table {0}.{1}, data updated successfully'.format(self.section1,self.section2))
        except:
            self.logging.info('Table {0}.{1}, data update failed'.format(self.section1,self.section2))
            self.logging.debug(traceback.format_exc())

    def READ_DATA(self, WhereCondition=' ') :
        try:
            result = pd.read_sql('''SELECT * FROM {0}.{1} '''.format(self.section1,self.section2)
                                 + WhereCondition, con=self.engine)
            self.logging.info('Table {0}.{1}, data read successful'.format(self.section1,self.section2))
            return result
        except:
            self.logging.info('Table {0}.{1}, data read failed'.format(self.section1,self.section2))
            self.logging.debug(traceback.format_exc())
            return 0
        
if __name__ == '__main__':
    MySQL_Config = 'DB_MySQL.config'
    BD_Name = 'udata'
    Table_Config = 'DB_Table.config'
    Table_Name = 'tb_stock_list'
    # instantiation
    TB_Stock_List_Main = TB_Stock_List(MySQL_Config,BD_Name,Table_Config,Table_Name)
    Create table structure
    TB_Stock_List_Main.CREATE_TABLE()
    # update data
    TB_Stock_List_Main.UPDATE_DATA()
    # fetch data
    data = TB_Stock_List_Main.READ_DATA()
Copy the code

Download code and configuration file

(1) Attachment file Directory:

DB_MySQL.config

DB_Table.config

MySQLOperation.py

TB_Stock_List.py

(2) After downloading the attached file, modify the Token and database parameters in the db_mysql. config configuration file as your own parameters. Place the code in the same directory for execution.

Next, Python Quantitative Data Warehouse Building Series 4: Stock Data Warehousing