Convolutional neural networks are usually associated with image classification tasks, but with appropriate modifications, they have proved to be a valuable tool for sequence modeling and prediction.

Motivation

So far, the topic of sequence modeling in the context of deep learning has been mainly associated with recurrent neural network architectures such as the LSTM and the GRU.

The basic model

An overview of the architecture

TCN is short for Temporal Convolutional Network, an architecture consisting of dilated, causal 1D convolutional layers that have the same input and output lengths.

One-dimensional convolutional networks

One-dimensional convolutional networks take a three-dimensional tensor as input and also output a three-dimensional tensor. Our TCN implementation has input tensors of shape (batch_size, input_length, input_size) and output tensors of shape (batch_size, input_length, output_size). Since every layer in the TCN has the same input and output length, only the third dimension differs between input and output. In the univariate case, both input_size and output_size are equal to 1. In the more general multivariate case, input_size and output_size may differ, since we may not want to predict every component of the input sequence.

A single 1D convolutional layer receives an input tensor of shape (batch_size, input_length, nr_input_channels) and outputs a tensor of shape (batch_size, input_length, nr_output_channels). To understand how a single layer converts its input into its output, let's look at one element of the batch (every element in the batch is processed in the same way). Let's start with the simplest case, where nr_input_channels and nr_output_channels are both equal to 1, so we are looking at one-dimensional input and output tensors. The following figure shows how one element of the output tensor is computed.
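To make the tensor shapes concrete, here is a minimal sketch of a single causal 1D convolutional layer, written in PyTorch (the choice of framework is my own assumption). Causality is obtained by left-padding the input with (kernel_size - 1) * dilation zeros, so each output element depends only on inputs at the same or earlier positions; the transposes are needed because PyTorch's Conv1d expects the channel dimension before the length dimension.

import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """A 1D convolution that only looks at current and past time steps."""
    def __init__(self, nr_input_channels, nr_output_channels, kernel_size, dilation=1):
        super().__init__()
        # Left-padding by (kernel_size - 1) * dilation keeps the output length
        # equal to the input length while preventing any look-ahead.
        self.padding = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(nr_input_channels, nr_output_channels,
                              kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch_size, input_length, nr_input_channels)
        x = x.transpose(1, 2)                        # -> (batch, channels, length)
        x = nn.functional.pad(x, (self.padding, 0))  # pad on the left only
        y = self.conv(x)                             # -> (batch, out_channels, length)
        return y.transpose(1, 2)                     # -> (batch, length, out_channels)

# Univariate example: a batch of 8 series of length 32 with a single channel.
x = torch.randn(8, 32, 1)
layer = CausalConv1d(nr_input_channels=1, nr_output_channels=1, kernel_size=3)
print(layer(x).shape)  # torch.Size([8, 32, 1])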

Dilation

A desirable property of a forecasting model is that the value of a particular entry in the output depends on all previous entries in the input, that is, on all entries whose index is less than or equal to its own. This is achieved when the receptive field (the set of input entries that influence a particular entry of the output) has size input_length; we also call this "full history coverage". As we have seen, a traditional convolutional layer makes an entry in the output depend on the kernel_size entries of the input whose indices are less than or equal to its own. For example, if our kernel_size is 3, the 5th element of the output depends on elements 3, 4 and 5 of the input. This range grows as we stack multiple layers on top of each other. In the figure below, we can see that by stacking two layers with kernel_size 3 we obtain a receptive field of size 5.
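As a rough illustration, the receptive field of such a stack can be computed directly. The following is a sketch under the assumption of one dilated causal convolution per layer, with the dilation growing as dilation_base**i in layer i; with dilation_base = 1 it reduces to plain stacking, reproducing the receptive field of 5 from the example above.

def receptive_field(num_layers, kernel_size, dilation_base=1):
    """Receptive field size of stacked dilated causal convolutions, assuming
    one convolution per layer and dilation = dilation_base**i in layer i."""
    field = 1
    for i in range(num_layers):
        # Each layer extends the receptive field by (kernel_size - 1) * dilation.
        field += (kernel_size - 1) * dilation_base ** i
    return field

print(receptive_field(num_layers=2, kernel_size=3))                   # 5, plain stacking
print(receptive_field(num_layers=4, kernel_size=3, dilation_base=2))  # 31, dilated stacking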

Overview of the basic TCN

Given input_length, kernel_size, dilation_base and the minimum number of layers needed to cover the entire history, the shape of the basic TCN network is fully determined. A sketch of how this minimum number of layers can be computed is shown below.
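The following is a minimal sketch, under the same assumption as above of one dilated causal convolution per layer with dilation dilation_base**i in layer i, of how many layers are needed before the receptive field reaches input_length.

def minimum_num_layers(input_length, kernel_size, dilation_base):
    """Smallest number of layers whose receptive field covers the full input,
    assuming one dilated causal convolution per layer with dilation dilation_base**i."""
    num_layers, field = 0, 1
    while field < input_length:
        field += (kernel_size - 1) * dilation_base ** num_layers
        num_layers += 1
    return num_layers

print(minimum_num_layers(input_length=32, kernel_size=3, dilation_base=2))  # 5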


Prediction

So far, we have only talked about "input sequences" and "output sequences" without any insight into how the two are related. In the context of forecasting, we want to predict the next entries of a time series into the future. To train our TCN for forecasting, the training set consists of pairs of equally sized subsequences (input sequence, target sequence) of a given time series. The target sequence is shifted forward by output_length entries relative to its respective input sequence. This means that a target sequence of length input_length contains the last (input_length - output_length) elements of its input sequence as its first elements, and the output_length entries that follow the last entry of the input sequence as its final elements. In terms of forecasting, this means that the maximum forecast horizon the model can cover is equal to output_length. Using a sliding-window approach, many overlapping pairs of input and target sequences can be created out of a single time series.
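A minimal sketch of this sliding-window construction follows; the function name and the NumPy-based layout are my own illustration rather than part of any particular library.

import numpy as np

def make_training_pairs(series, input_length, output_length):
    """Create overlapping (input, target) pairs from a univariate series.
    The target is the input window shifted forward by output_length steps."""
    inputs, targets = [], []
    # The last input window must leave output_length future values for its target.
    for start in range(len(series) - input_length - output_length + 1):
        inputs.append(series[start : start + input_length])
        targets.append(series[start + output_length : start + output_length + input_length])
    return np.stack(inputs), np.stack(targets)

series = np.arange(10, dtype=float)  # toy time series 0..9
X, Y = make_training_pairs(series, input_length=5, output_length=2)
print(X[0], Y[0])  # [0. 1. 2. 3. 4.] [2. 3. 4. 5. 6.]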