Machine Learning 045 - Hidden Markov Modeling of Stock Data

(Python libraries and versions used in this article: Python 3.6, Numpy 1.14, Scikit-learn 0.19, matplotlib 2.2)

Stock data is a typical example of time-series data: the records are ordered by date, and the price is the observation sequence we can see. The mechanism driving the price changes, that is, the hidden states and their transition probabilities, is hard to observe directly. A hidden Markov model (HMM) can therefore be used to model a stock and to try to predict its movements. Any real breakthrough in stock data research would be worth a great deal of money.


1. Prepare data sets

Here I use Tushare to fetch the data for one particular stock, then model its daily percentage change and trading volume to see what can be predicted.

# 1. Prepare the data set: use Tushare to fetch the stock data
import tushare as ts
stock_df=ts.get_k_data('600123',start='2008-10-01',end='2018-10-01') # daily data for stock 600123 over the last ten years
stock_df.info() # check that the data loaded without errors (info() prints its own summary and returns None)
print(stock_df.head())

The code above only downloads the daily data of stock 600123 for the last ten years. What we actually want is the daily percentage change of the closing price, so the data needs further processing.

# Prepare the data set. This time the HMM uses two features: the daily percentage change of the closing price and the trading volume
import numpy as np
close=stock_df.close.values
feature1=100*np.diff(close)/close[:-1] # daily percentage change of the closing price
print(close[:10])
print(feature1[:10]) # check that the calculation is correct

------------------------------ Output ------------------------------

[6.775 6.291 6.045 5.899 5.361 5.436 5.299 4.994 4.494 4.598]
[ -7.14391144  -3.91034812  -2.41521919  -9.12018986   1.39899273  -2.52023547  -5.75580298 -10.01201442   2.31419671   9.98260113]

--------------------------------------------------------------------
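
To make the calculation concrete, the following minimal sketch spells out in plain Python what `100*np.diff(close)/close[:-1]` computes, using the first few closing prices from the output. The helper name `pct_change` is hypothetical and is not part of the original code.

```python
def pct_change(prices):
    """Return the percentage change between consecutive prices.

    Element i is 100 * (prices[i+1] - prices[i]) / prices[i], so the
    result has one element fewer than the input.
    """
    return [100.0 * (curr - prev) / prev
            for prev, curr in zip(prices[:-1], prices[1:])]


# First closing prices taken from the article's printed output
close_head = [6.775, 6.291, 6.045, 5.899]
changes = pct_change(close_head)

print(len(close_head), len(changes))      # the result is one element shorter
print([round(c, 4) for c in changes])
```

The first day has no previous close to compare against, which is why the result is one element shorter than the price series.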

# The percentage-change sequence is one element shorter than the closing-price sequence (the first day has no previous close to compare with), so drop the first volume value to keep the two features aligned
feature2=stock_df.volume.values[1:]
dataset_X=np.c_[feature1,feature2]
print(dataset_X[:5]) # check the result
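
The alignment step can be checked on a toy example. This is a minimal sketch with made-up prices and volumes (not real market data); it shows that dropping the first volume value gives both features the same length, and that `np.c_` stacks them as two columns with one row per day.

```python
import numpy as np

# Made-up values for illustration only
close = np.array([10.0, 11.0, 9.9, 9.9])
volume = np.array([1000.0, 1200.0, 900.0, 1100.0])

change = 100 * np.diff(close) / close[:-1]   # length is len(close) - 1
volume_aligned = volume[1:]                  # drop day 0, which has no change value

X = np.c_[change, volume_aligned]            # one row per day, two columns
print(X.shape)
print(X)
```

Note that the two columns live on very different scales (percentages versus raw volume), which is worth keeping in mind when interpreting the fitted model.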


2. Create the HMM model

I explained the HMM model in detail in my previous article; please refer to Machine Learning 044 - Creating Hidden Markov Models.

# Create the HMM model and train it
from hmmlearn.hmm import GaussianHMM
model=GaussianHMM(n_components=5,n_iter=1000) # assume for now that the stock has five hidden states
model.fit(dataset_X)

After training the HMM, how do we know whether the model is good or bad? We need to compare the values it generates with the actual values to see whether they are consistent.

# Use the model to generate data and see how well it matches
import matplotlib.pyplot as plt
N=500
samples,_=model.sample(N) # draws a random sequence of N samples from the fitted model
# The percentage change was the first feature and the volume the second,
# so the first column of samples is the generated percentage change and the second column is the generated volume
plt.plot(feature1[:N],c='red',label='Rise%') # plot the actual and generated percentage change together for easy comparison
plt.plot(samples[:,0],c='blue',label='Predicted%')
plt.legend()
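
When reading this comparison, it helps to keep in mind that `model.sample(N)` draws a new random sequence from the fitted model rather than forecasting specific dates. The sketch below imitates that generative process in plain Python for a toy two-state model with a single feature. The function name and all parameters (`startprob`, `transmat`, `means`, `stds`) are invented for illustration and are not taken from the fitted model.

```python
import random


def sample_gaussian_hmm(n_samples, startprob, transmat, means, stds, seed=0):
    """Draw (observations, states) from a one-dimensional Gaussian HMM.

    Each step emits a value from the current state's Gaussian, then picks
    the next hidden state from the current state's transition-matrix row.
    """
    rng = random.Random(seed)
    state_ids = list(range(len(startprob)))
    observations, states = [], []
    state = rng.choices(state_ids, weights=startprob)[0]
    for _ in range(n_samples):
        states.append(state)
        observations.append(rng.gauss(means[state], stds[state]))
        state = rng.choices(state_ids, weights=transmat[state])[0]
    return observations, states


# Toy parameters (assumed): a calm state and a volatile state
startprob = [0.5, 0.5]
transmat = [[0.9, 0.1],
            [0.2, 0.8]]
means = [0.1, -0.3]
stds = [0.5, 3.0]

obs, states = sample_gaussian_hmm(500, startprob, transmat, means, stds)
print(len(obs), sorted(set(states)))
```

Because every draw follows its own random path through the hidden states, a sampled curve would not be expected to line up day by day with the actual series; comparing overall statistics, such as the spread of values, is likely to be more informative.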

The match does not look good. Next, look at the result for trading volume:

plt.plot(feature2[:N],c='red',label='volume')
plt.plot(samples[:,1],c='blue',label='Predicted')
plt.legend()

Neither result is good: the generated values differ considerably from the actual values, which shows that this model struggles with the problem. Looking at it another way, if stock movements were this easy to predict, everyone would be taking money out of the stock market, and eventually the market would shut down. Interested readers can try tuning the number of hidden states of the HMM, which may give a better match but may also lead to overfitting, so I do not think the optimization is very useful.
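
For readers who do want to tune the number of hidden states, one common way to guard against overfitting is to penalize model complexity, for example with the Bayesian information criterion (BIC). The sketch below is a plain-Python helper and is not part of the original code. It assumes a Gaussian HMM with diagonal covariances (the default `covariance_type` of hmmlearn's `GaussianHMM`), and the log-likelihood values in the usage loop are invented placeholders that would in practice come from something like `model.score(dataset_X)`.

```python
import math


def gaussian_hmm_n_params(n_states, n_features):
    """Count the free parameters of a Gaussian HMM with diagonal covariances."""
    start = n_states - 1                    # initial state probabilities sum to 1
    trans = n_states * (n_states - 1)       # each transition row sums to 1
    means = n_states * n_features
    variances = n_states * n_features       # diagonal covariance entries
    return start + trans + means + variances


def bic(log_likelihood, n_states, n_features, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    k = gaussian_hmm_n_params(n_states, n_features)
    return k * math.log(n_samples) - 2.0 * log_likelihood


# Placeholder log-likelihoods (invented), for 2 features and 2400 samples
for n_states, log_lik in [(3, -21000.0), (5, -20950.0), (8, -20940.0)]:
    print(n_states, round(bic(log_lik, n_states, 2, 2400), 1))
```

A larger number of states always fits the training data at least as well, so a penalized score of this kind, or the log-likelihood on held-out data, is one hedge against simply picking the largest model.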

######################## Summary ###############################

1. This is just a simple example of using an HMM to analyze stock data. Although it has little practical value, it can provide some ideas for other, more complex algorithms.

Stay out of the stock market, stay out of harm's way!

#################################################################


Note: The code for this article has been uploaded to (my Github); you are welcome to download it.

References:

1. Classic Examples of Python Machine Learning, by Prateek Joshi, translated by Tao Junjie and Chen Xiaoli