Can we write trading strategies with an RNN?


Xiaopian: One of our users likes to build trading strategies with machine learning. He described several models very vividly and wrote a demo strategy using PonderLSTM. Today, I would like to share it with you.

The ACT (Adaptive Computation Time) model simulates the process of thinking through a complex problem by performing multiple computations at each time step (each node of the time series). The algorithm is especially valuable when long time sequences are processed with RNN variants that have external memory (such as DNC and NTM).
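To make "multiple computations at each time step" concrete, here is a minimal sketch of the halting rule from the ACT formulation: the model emits a halting probability after each computation and stops once the running sum reaches 1 - eps, with the last computation receiving the remainder. The function name `act_halting_weights` and the hard-coded probabilities are illustrative assumptions; in a real model the probabilities come from a sigmoid unit on the hidden state.

```python
def act_halting_weights(halt_probs, eps=0.01):
    """Return the ACT mixing weights for one time step.

    halt_probs: halting probabilities in (0, 1), one per computation
    (hypothetical inputs; a real model produces them itself).
    """
    weights = []
    total = 0.0
    for i, h in enumerate(halt_probs):
        is_last = i == len(halt_probs) - 1
        if total + h >= 1.0 - eps or is_last:
            # Final computation gets the remainder so the weights sum to 1.
            weights.append(1.0 - total)
            break
        weights.append(h)
        total += h
    return weights


# Three computations are used; the fourth probability is never consumed.
weights = act_halting_weights([0.1, 0.3, 0.7, 0.5])
```

The length of `weights` is the number of computations N(t) spent at this time step, and the weights are what the model uses to average its intermediate outputs and states.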

The following is a brief, vivid description of the models (not a very accurate one, but a helpful one). Suppose the VanillaRNN model is an examinee taking an English listening test. The examinee is required to "listen dry", that is, no pen or paper may be used for notes, and all listening questions and answers are given orally. Unless the examinee happens to be a born genius, the results will of course be miserable.

PonderDNC carries not only an "E-person E-book" electronic notebook but also a machine that makes time stand still. That is to say, the examinee can freeze time at any point he or she sees fit during the listening test (note that this pauses time, it does not rewind it) and then calmly consult the records in the E-person E-book to answer the question.

PonderDNC

By embedding the DNC computing unit into the ACT computing architecture, PonderDNC can perform multiple operations in each time step. This means that at time step t, the DNC does not have to produce output immediately after receiving the external input for time t. Instead, it can compute and "think" repeatedly at time t, output its decision, and only then move on to time step t+1. The following figure illustrates this.

The original DNC model generally uses an external memory matrix of 500 rows, controlled by 1 to 3 write heads and 2 to 6 read heads that exchange information (memory) with the matrix at each time step t. That is, the DNC model generally interacts with no more than 10 memory locations at time t. This works well for simple problems such as memory retrieval, but not so well for decisions that depend on information from far back in the sequence.

The PonderDNC computing unit, formed by embedding the DNC computing unit into the ACT computing architecture, can exchange information with the external memory matrix several times in each time step. For example, a PonderDNC with 2 write heads, 4 read heads and 1000 rows of external memory that operates 50 times at time t can interact with at most 300 positions of the external memory matrix. At time t, it can derive its output from at most 50 short-term memories (controller states) and 200 read-head memories. That is, at time t, PonderDNC can infer and judge based on up to 20% of the total stored memory.
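The arithmetic behind these figures can be checked with a few lines of code. This is only a back-of-the-envelope sketch: the function name `ponder_memory_budget` is made up for illustration, and the counts are upper bounds that assume every head addresses a distinct memory row on every operation.

```python
def ponder_memory_budget(write_heads, read_heads, memory_rows, ponder_steps):
    """Upper bounds on memory interaction within a single time step."""
    # Every head touches at most one row per operation.
    touched_rows = ponder_steps * (write_heads + read_heads)
    # One controller state ("short-term memory") per operation.
    controller_states = ponder_steps
    # One read vector per read head per operation.
    read_vectors = ponder_steps * read_heads
    # Share of the memory matrix that can be read in this time step.
    read_fraction = read_vectors / memory_rows
    return touched_rows, controller_states, read_vectors, read_fraction


budget = ponder_memory_budget(write_heads=2, read_heads=4,
                              memory_rows=1000, ponder_steps=50)
print(budget)  # (300, 50, 200, 0.2)
```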

Put simply, if these models are regarded as different traders, the LSTM trader judges the future market based on the candlestick (K-line) chart of the past two weeks, while the PonderDNC trader judges the future based on a variety of technical analysis indicators from the past quarter.

The PonderDNC model is suited to long-sequence tasks and to tasks that require complex look-back over earlier data, such as processing high-frequency futures data. The model needs a larger number of trainable weight parameters to achieve good results, so care should be taken to increase the amount of training data and to eliminate multicollinearity among the input factors.
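One simple way to reduce multicollinearity among input factors is a pairwise correlation filter: keep a factor only if it is not strongly correlated with any factor already kept. The sketch below is an illustration of that heuristic, not the method used in the original strategy; the factor names, the sample values and the 0.95 threshold are all assumptions, and more thorough diagnostics such as variance inflation factors exist.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def drop_collinear(factors, threshold=0.95):
    """Greedily keep factors whose |correlation| with every kept factor
    stays below the threshold. Earlier factors take priority."""
    kept = {}
    for name, series in factors.items():
        if all(abs(pearson(series, other)) < threshold
               for other in kept.values()):
            kept[name] = series
    return list(kept)


# Hypothetical factors: "ma10" moves almost in lockstep with "ma5".
factors = {
    "ma5": [1, 2, 3, 4, 5],
    "ma10": [2, 4, 6, 8, 10.5],
    "volume": [5, 1, 4, 2, 3],
}
selected = drop_collinear(factors)
print(selected)  # ['ma5', 'volume']
```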

Note: The information that PonderDNC passes from each time step t to the next time step t+1 consists of the accumulated controller state, the accumulated memory vectors read by the read heads, and the external memory matrix after the N(t) operations performed at time t. The memory matrix is not a weighted sum: the matrix left after the last operation is passed on directly.
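The note above can be sketched as a single "ponder step". This is a toy illustration under stated assumptions, not the author's implementation: `toy_cell` is a made-up stand-in for a DNC unit, the state and read vector are reduced to scalars, the memory is a plain list, and the halting probability is a constant instead of a learned value.

```python
def ponder_step(cell, x, state, read, memory, eps=0.01, max_steps=50):
    """Run the cell repeatedly on one input x and return what is
    carried over to the next time step."""
    acc_state, acc_read, total = 0.0, 0.0, 0.0
    s, r, m = state, read, memory
    for n in range(1, max_steps + 1):
        s, r, m, h = cell(x, s, r, m)
        halt = total + h >= 1.0 - eps or n == max_steps
        w = (1.0 - total) if halt else h  # last operation gets the remainder
        acc_state += w * s                # weighted accumulation of states
        acc_read += w * r                 # weighted accumulation of reads
        total += w
        if halt:
            break
    # The memory is NOT averaged: the last matrix is passed on as-is.
    return acc_state, acc_read, m, n


def toy_cell(x, s, r, m):
    """Hypothetical stand-in for a DNC unit."""
    new_s = 0.5 * s + x                  # controller update
    new_m = m + [new_s]                  # "write": append one row
    new_r = sum(new_m) / len(new_m)      # "read": mean of the memory
    return new_s, new_r, new_m, 0.4      # constant halting probability


acc_state, acc_read, mem, n_steps = ponder_step(toy_cell, 1.0, 0.0, 0.0, [])
```

With a constant halting probability of 0.4, the loop stops after three operations with weights 0.4, 0.4 and 0.2, while `mem` is simply the memory produced by the third operation.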

Update 2017-08-18: daily-frequency futures backtest

In the out-of-sample backtest from January 1, 2017 to August 1, 2018, the model's generalization ability appears to be good.
