Original link: tecdat.cn/?p=3609

Original source: Tuoduan Data Tribe official account

 

Reading time series data

The first thing you will want to do to analyse your time series data is to read it into R and plot the time series. You can read data into R using the scan() function, which assumes that the data for successive time points are stored in a simple text file with one column.

 

The data set is as follows:

Age of death of successive kings of England
60 43 67 50 56 42 65 68 43 65 34 ...

Only the first few lines of the file are shown. The first three lines contain some comments on the data, which we want to ignore when we read the data into R. We can do this by using the "skip" parameter of the scan() function, which specifies how many lines at the top of the file to ignore. To read the file into R, ignoring the first three lines, we type:

 
> kings <- scan("kings.dat", skip=3)   # hypothetical file path; point this at wherever the data is saved
Read 42 items
> kings
  [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43 69 59 48
 [26] 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

In this case, the ages at death of 42 successive kings of England have been read into the variable "kings".

Once you have read the time series data into R, the next step is to store the data in a time series object in R, so that you can use R's many functions for analysing time series data. To store the data in a time series object, we use the ts() function in R. For example, to store the data in the variable "kings" as a time series object in R, we type:

 
> kingstimeseries <- ts(kings)
> kingstimeseries
Time Series:
Start = 1
End = 42
Frequency = 1
 [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43 69 59 48
[26] 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

Sometimes the time series data set that you have may have been collected at regular intervals that are less than one year, for example, monthly or quarterly. In this case, you can specify the number of times that the data was collected per year by using the "frequency" parameter in the ts() function. For monthly time series data you set frequency=12, while for quarterly time series data you set frequency=4.

You can also specify the first year that the data was collected, and the first interval in that year, by using the "start" parameter in the ts() function. For example, if the first data point corresponds to the second quarter of 1986, you would set start=c(1986,2).
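As a minimal sketch (with made-up quarterly figures, not one of the data sets analysed here), the two parameters work together like this:

> myvector <- c(10.1, 12.4, 11.8, 13.5, 10.9, 12.9)   # made-up quarterly figures
> myts <- ts(myvector, frequency=4, start=c(1986,2))  # first observation is the second quarter of 1986
> myts
     Qtr1 Qtr2 Qtr3 Qtr4
1986      10.1 12.4 11.8
1987 13.5 10.9 12.9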

 

For example, consider the number of births per month in New York City, from January 1946 to December 1959. We can read the data into R and store it as a time series object by typing:

> births <- scan("nybirths.dat")   # hypothetical file path; point this at wherever the data is saved
Read 168 items
> birthstimeseries <- ts(births, frequency=12, start=c(1946,1))   # convert to a time series object
> birthstimeseries
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
1946 26.663 23.598 26.931 24.740 25.806 24.364 24.477 23.901 23.175 23.227 21.672 21.870
1947 21.439 21.089 23.709 21.669 21.752 20.761 23.479 23.824 23.105 23.110 21.759 22.073
1948 21.937 20.035 23.590 21.672 22.222 22.123 23.950 23.504 22.238 23.142 21.059 21.573
1949 21.548 20.000 22.424 20.615 21.761 22.874 24.104 23.748 23.262 22.907 21.519 22.025
1950 22.604 20.894 24.677 23.673 25.320 23.583 24.671 24.454 24.122 24.252 22.084 22.991
1951 23.287 23.049 25.076 24.037 24.430 24.667 26.451 25.618 25.014 25.110 22.964 23.981
1952 23.798 22.270 24.775 22.646 23.988 24.737 26.276 25.816 25.210 25.199 23.162 24.707
1953 24.364 22.644 25.565 24.062 25.431 24.635 27.009 26.606 26.268 26.462 25.246 25.180
1954 24.657 23.304 26.982 26.199 27.210 26.122 26.706 26.878 26.152 26.379 24.712 25.688
1955 24.990 24.239 26.721 23.475 24.767 26.219 28.361 28.599 27.914 27.784 25.693 26.881
1956 26.217 24.218 27.914 26.975 28.527 27.139 28.982 28.169 28.056 29.136 26.291 26.987
1957 26.589 24.848 27.543 26.896 28.878 27.390 28.065 28.141 29.048 28.484 26.634 27.735
1958 27.132 24.924 28.963 26.589 27.931 28.009 29.229 28.759 28.405 27.945 25.912 26.619
1959 26.076 25.286 27.660 25.951 26.398 25.565 28.865 30.000 29.261 29.012 26.992 27.897

Similarly, consider the monthly sales for a souvenir shop at a beach resort town in Queensland, Australia, for January 1987 - December 1993 (original data from Wheelwright and Hyndman, 1998). We can read the data into R by typing:

> souvenir <- scan("fancy.dat")   # hypothetical file path; point this at wherever the data is saved
Read 84 items
> souvenirtimeseries <- ts(souvenir, frequency=12, start=c(1987,1))
> souvenirtimeseries
          Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov       Dec
1987  1664.81  2397.53  2840.71  3547.29  3752.96  3714.74  4349.61  3566.34  5021.82  6423.48  7600.60  19756.21
1988  2499.81  5198.24  7225.14  4806.03  5900.88  4951.34  6179.12  4752.15  5496.43  5835.10 12600.08  28541.72
1989  4717.02  5702.63  9957.58  5304.78  6492.43  6630.80  7349.62  8176.62  8573.17  9690.50 15151.84  34061.01
1990  5921.10  5814.58 12421.25  6369.77  7609.12  7224.75  8121.22  7979.25  8093.06  8476.70 17914.66  30114.41
1991  4826.64  6470.23  9638.77  8821.17  8722.37 10209.48 11276.55 12552.22 11637.39 13606.89 21822.11  45060.69
1992  7615.03  9849.69 14558.40 11587.33  9332.56 13082.09 16732.78 19888.61 23933.38 25391.35 36024.80  80721.71
1993 10243.24 11266.88 21826.84 17357.33 15997.79 18601.53 26155.15 28586.52 30505.41 30821.33 46634.38 104660.67

Plotting time series

Once you have read a time series into R, the next step is usually to make a plot of the time series data, which you can do with the plot.ts() function in R.

For example, to plot the time series of the ages at death of 42 successive kings of England, we type:

> plot.ts(kingstimeseries)

We can see from the time plot that this time series could probably be described using an additive model, since the random fluctuations in the data are roughly constant in size over time.

Similarly, to plot a time series of monthly births in New York City, we enter:

 
> plot.ts(birthstimeseries)

As we can see from this time series, there seems to be a seasonal variation in monthly births: a peak every summer and a trough every winter. Similarly, it seems that this time series may be described by an additive model, since the size of seasonal fluctuations is roughly constant over time and does not seem to depend on the level of the time series, and random fluctuations also seem to be constant over time.

Similarly, to plot the time series of the monthly sales for the souvenir shop at a beach resort town in Queensland, Australia, we type:

 
> plot.ts(souvenirtimeseries)

In this case, it appears that an additive model is not appropriate for describing this time series, since the size of the seasonal and random fluctuations seems to increase with the level of the time series. Thus, we may need to transform the time series in order to get a transformed series that can be described using an additive model. For example, we can transform the time series by calculating the natural log of the original data:

> logsouvenirtimeseries <- log(souvenirtimeseries)
> plot.ts(logsouvenirtimeseries)

Here we can see that the size of the seasonal and random fluctuations in the log-transformed time series seem to be roughly constant over time, and do not depend on the level of the time series. Thus, the log-transformed time series can probably be described using an additive model.

Decomposing time series

Decomposing a time series means separating it into its constituent components, which are usually a trend component and an irregular component, and, if it is a seasonal time series, a seasonal component.

Decomposing non-seasonal data

A non-seasonal time series consists of a trend component and an irregular component. Decomposing the time series involves trying to separate the time series into these components, that is, estimating the trend component and the irregular component.

To estimate the trend component of a non-seasonal time series that can be described using an additive model, it is common to use a smoothing method, such as calculating the simple moving average of the time series.

The SMA() function in the "TTR" R package can be used to smooth time series data using a simple moving average. To use this function, we first need to install the "TTR" R package. Once you have installed the "TTR" R package, you can load it by typing:

 
> library("TTR")

You can then use the SMA() function to smooth time series data. To use the SMA() function, you need to specify the order (span) of the simple moving average, using the parameter "n". For example, to calculate a simple moving average of order 5, we set n=5 in the SMA() function.
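A simple moving average of order n is just the mean of the current observation and the previous n-1 observations. As an illustrative aside (our own sketch, not part of the original tutorial), the same trailing average can be reproduced with base R's filter() function:

> library("TTR")
> sma5  <- SMA(kingstimeseries, n=5)                             # trailing moving average of order 5
> same5 <- stats::filter(kingstimeseries, rep(1/5, 5), sides=1)  # the equivalent calculation in base R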

For example, as discussed above, the time series of the ages at death of 42 successive kings of England appears to be non-seasonal, and can probably be described using an additive model, since the random fluctuations in the data are roughly constant in size over time.

Thus, we can try to estimate the trend component of this time series by smoothing it using a simple moving average. To smooth the time series using a simple moving average of order 3, and plot the smoothed time series data, we type:

> kingstimeseriesSMA3 <- SMA(kingstimeseries,n=3)
> plot.ts(kingstimeseriesSMA3)

There still appears to be quite a lot of random fluctuation in the time series smoothed using a simple moving average of order 3. Thus, to estimate the trend component more accurately, we might want to try smoothing the data with a simple moving average of a higher order. This takes a little bit of trial and error, to find the right amount of smoothing. For example, we can try using a simple moving average of order 8:

> kingstimeseriesSMA8 <- SMA(kingstimeseries,n=8)
> plot.ts(kingstimeseriesSMA8)

The data smoothed with a simple moving average of order 8 gives a clearer picture of the trend component: we can see that the age of death of the kings of England seems to have decreased from about 55 years to about 38 years during the reigns of the first 20 kings, and then increased after that to about 73 years by the end of the reign of the 40th king in the time series.

Decomposing seasonal data

A seasonal time series consists of a trend component, a seasonal component and an irregular component. Decomposing the time series means separating the time series into these three components: that is, estimating these three components.

To estimate the trend component and seasonal component of a seasonal time series that can be described using an additive model, we can use the decompose() function in R. This function estimates the trend, seasonal, and irregular components of a time series that can be described using an additive model.

The decompose() function returns a list object as its result, where the estimates of the seasonal component, trend component and irregular component are stored in named elements of that list object, called "seasonal", "trend", and "random", respectively.

For example, as discussed above, the time series of the number of births per month in New York City is seasonal, with a peak every summer and a trough every winter, and can probably be described using an additive model, since the seasonal and random fluctuations seem to be roughly constant in size over time:

To estimate the trend, seasonal and irregular components of this time series, we type:

> birthstimeseriescomponents <- decompose(birthstimeseries)

The estimates of the seasonal, trend and irregular components are now stored in the variables birthstimeseriescomponents$seasonal, birthstimeseriescomponents$trend and birthstimeseriescomponents$random. For example, we can print out the estimated values of the seasonal component by typing:

> birthstimeseriescomponents$seasonal   # get the estimated values of the seasonal component
            Jan        Feb        Mar        Apr        May        Jun        Jul        Aug        Sep        Oct        Nov        Dec
1946 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
1947 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197
...
1959 -0.6771947 -2.0829607  0.8625232 -0.8016787  0.2516514 -0.1532556  1.4560457  1.1645938  0.6916162  0.7752444 -1.1097652 -0.3768197

The estimated seasonal factors are given for the months January-December, and are the same for each year. The largest seasonal factor is for July (about 1.46), and the lowest is for February (about -2.08), indicating that there seems to be a peak in births in July and a trough in births in February each year.

We can plot the estimated trend, seasonal, and irregular components of the time series by using the plot() function, for example:

> plot(birthstimeseriescomponents)

The plot above shows the original time series (top), the estimated trend component (second from top), the estimated seasonal component (third from top), and the estimated irregular component (bottom). We see that the estimated trend component shows a small decrease from about 24 in 1947 to about 22 in 1948, followed by a steady increase from then on to about 27 in 1959.

Seasonal adjustment

If you have a seasonal time series that can be described using an additive model, you can seasonally adjust the time series by estimating the seasonal component, and subtracting the estimated seasonal component from the original time series. We can do this using the estimate of the seasonal component calculated by the decompose() function.

For example, to seasonally adjust the time series of the number of births per month in New York City, we can estimate the seasonal component using decompose(), and then subtract the seasonal component from the original time series:

> birthstimeseriescomponents <- decompose(birthstimeseries)
> birthstimeseriesseasonallyadjusted <- birthstimeseries - birthstimeseriescomponents$seasonal

We can then plot the seasonally adjusted time series using the plot() function, by typing:

> plot(birthstimeseriesseasonallyadjusted)

You can see that the seasonal variation has been removed from the seasonally adjusted time series. The seasonally adjusted time series now just contains the trend component and an irregular component.

Forecasting using exponential smoothing

Exponential smoothing can be used to make short-term forecasts for time series data.

Simple exponential smoothing

If you have a time series that can be described using an additive model with constant level and no seasonality, you can use simple exponential smoothing to make short-term forecasts.

The simple exponential smoothing method provides a way of estimating the level at the current time point. Smoothing is controlled by the parameter alpha, which is used for the estimate of the level at the current time point. The value of alpha lies between 0 and 1; values of alpha that are close to 0 mean that little weight is placed on the most recent observations when making forecasts of future values.
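To make the role of alpha concrete, here is a minimal sketch (our own, not part of the original tutorial) of the level-updating equation that simple exponential smoothing is built on, level[t] = alpha * x[t] + (1 - alpha) * level[t-1]:

ses_level <- function(x, alpha) {
  level <- numeric(length(x))
  level[1] <- x[1]                      # initialise the level with the first observation
  for (t in 2:length(x)) {
    level[t] <- alpha * x[t] + (1 - alpha) * level[t - 1]
  }
  level                                 # the forecast for time t+1 is level[t]
}

With alpha close to 0, level[t] barely moves in response to each new observation, which is why a small alpha means that older observations still carry weight in the forecasts.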

 

For example, consider a data set of total annual rainfall in inches for London, from 1813 to 1912. We can read the data into R and plot it by typing:

> rain <- scan("precip1.dat", skip=1)   # hypothetical file path; point this at wherever the data is saved
Read 100 items
> rainseries <- ts(rain, start=c(1813))
> plot.ts(rainseries)

You can see from the plot that there is roughly constant level (the mean stays constant at about 25 inches). The random fluctuations in the time series seem to be roughly constant in size over time, so it is probably appropriate to describe the data using an additive model. Thus, we can make forecasts using simple exponential smoothing.

To make forecasts using simple exponential smoothing in R, we can fit a simple exponential smoothing predictive model using the HoltWinters() function in R. To use HoltWinters() for simple exponential smoothing, we need to set the parameters beta=FALSE and gamma=FALSE in the HoltWinters() function (the beta and gamma parameters are used for Holt's exponential smoothing, or Holt-Winters exponential smoothing, as described below).

The HoltWinters() function returns a list variable that contains several named elements.

For example, to use simple exponential smoothing to make forecasts for the time series of annual rainfall in London, we type:

> rainseriesforecasts <- HoltWinters(rainseries, beta=FALSE, gamma=FALSE)
> rainseriesforecasts
Smoothing parameters:
 alpha:  0.02412151
 beta :  FALSE
 gamma:  FALSE
Coefficients:
      [,1]
a 24.67819

The output of HoltWinters() tells us that the estimated value of the alpha parameter is about 0.024. This is very close to zero, telling us that the forecasts are based on both recent and less recent observations (although somewhat more weight is placed on recent observations).

By default, HoltWinters() just makes forecasts for the same time period covered by our original time series. In this case, our original time series included rainfall for London from 1813 to 1912, so the forecasts are also for 1813 to 1912.

In the example above, we have stored the output of the HoltWinters() function in the list variable "rainseriesforecasts". The forecasts made by HoltWinters() are stored in a named element of this list variable called "fitted", so we can get their values by typing:

> rainseriesforecasts$fitted
Time Series:
Start = 1814
End = 1912
Frequency = 1
         xhat    level
1814 23.56000 23.56000
1815 23.62054 23.62054
1816 23.57808 23.57808
1817 23.76290 23.76290
1818 23.76017 23.76017
1819 23.76306 23.76306
1820 23.82691 23.82691
...
1905 24.62852 24.62852
1906 24.58852 24.58852
1907 24.58059 24.58059
1908 24.54271 24.54271
1909 24.52166 24.52166
1910 24.57541 24.57541
1911 24.59433 24.59433
1912 24.59905 24.59905

We can plot the original time series and forecast by typing:

> plot(rainseriesforecasts)

The plot shows the original time series in black, and the forecasts as a red line. The time series of forecasts is much smoother than the time series of the original data.

As a measure of the accuracy of the forecasts, we can calculate the sum of squared errors for the in-sample forecast errors, that is, the forecast errors for the time period covered by our original time series. The sum of squared errors is stored in a named element of the list variable "rainseriesforecasts" called "SSE", so we can get its value by typing:

> rainseriesforecasts$SSE
[1] 1828.855

In other words, the sum of the squared errors here is 1,828.855.
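As a quick sanity check (our own sketch, not part of the original tutorial), the same figure can be recomputed from the one-step-ahead forecast errors:

> rainerrors <- rainseries[-1] - rainseriesforecasts$fitted[, "xhat"]   # observed minus forecast, 1814-1912
> sum(rainerrors^2)                                                     # should reproduce rainseriesforecasts$SSE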

It is common in simple exponential smoothing to use the first value in the time series as the initial value for the level. For example, in the time series for rainfall in London, the first value is 23.56 (inches) for rainfall in 1813. You can specify the initial value for the level in the HoltWinters() function by using the "l.start" parameter. For example, to make forecasts with the initial value of the level set to 23.56, we type:

> HoltWinters(rainseries, beta=FALSE, gamma=FALSE, l.start=23.56)

As explained above, by default HoltWinters() just makes forecasts for the time period covered by the original data, which is 1813-1912 for the rainfall time series. We can make forecasts for further time points by using the forecast.HoltWinters() function in the R "forecast" package. To use the forecast.HoltWinters() function, we first need to install the "forecast" R package (see Installing R Packages for instructions on how to install an R package).

Once you have installed the "forecast" R package, you can load it by typing:

> library("forecast")

When using the forecast.HoltWinters() function, you pass it the predictive model that you have already fitted using the HoltWinters() function, as its first argument (input). For example, in the case of the rainfall time series, we stored the predictive model made using HoltWinters() in the variable "rainseriesforecasts". You specify how many further time points you want to make forecasts for by using the "h" parameter in forecast.HoltWinters(). For example, to make a forecast of rainfall for the years 1913-1920 (8 more years) using forecast.HoltWinters(), we type:

> rainseriesforecasts2 <- forecast.HoltWinters(rainseriesforecasts, h=8)
> rainseriesforecasts2
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
1913       24.67819 19.17493 30.18145 16.26169 33.09470
1914       24.67819 19.17333 30.18305 16.25924 33.09715
1915       24.67819 19.17173 30.18465 16.25679 33.09960
1916       24.67819 19.17013 30.18625 16.25434 33.10204
1917       24.67819 19.16853 30.18785 16.25190 33.10449
1918       24.67819 19.16694 30.18945 16.24945 33.10694
1919       24.67819 19.16534 30.19105 16.24701 33.10938
1920       24.67819 19.16374 30.19265 16.24456 33.11182

The forecast.HoltWinters() function gives you the forecast for each year, an 80% prediction interval for the forecast, and a 95% prediction interval for the forecast. For example, the forecasted rainfall for 1920 is about 24.68 inches, with a 95% prediction interval of (16.24, 33.11).
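The object returned by forecast.HoltWinters() stores these pieces in named elements (names as used by the "forecast" package), so they can be pulled out individually:

> rainseriesforecasts2$mean    # point forecasts for 1913-1920
> rainseriesforecasts2$lower   # lower limits of the 80% and 95% prediction intervals
> rainseriesforecasts2$upper   # upper limits of the 80% and 95% prediction intervals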

To plot the predictions made by forecast.HoltWinters(), we can use the plot.forecast() function:

> plot.forecast(rainseriesforecasts2)

Here the forecasts for 1913-1920 are plotted as a blue line, the 80% prediction interval as an orange shaded area, and the 95% prediction interval as a yellow shaded area.

The "forecast errors" are calculated as the observed values minus the predicted values, for each time point. We can only calculate the forecast errors for the time period covered by our original time series, which is 1813-1912 for the rainfall data. As mentioned above, one measure of the accuracy of the predictive model is the sum of squared errors (SSE) for the in-sample forecast errors.

The in-sample forecast errors are stored in the named element "residuals" of the list variable returned by forecast.HoltWinters(). If the predictive model cannot be improved upon, there should be no correlations between forecast errors for successive predictions. In other words, if there are correlations between forecast errors for successive predictions, it is likely that the simple exponential smoothing forecasts could be improved upon by another forecasting technique.

To find out whether this is the case, we can obtain a correlogram of the in-sample forecast errors for lags 1-20. We can calculate a correlogram of the forecast errors using the acf() function in R. To specify the maximum lag that we want to look at, we use the "lag.max" parameter in acf().

For example, to calculate a correlogram of the in-sample forecast errors for the London rainfall data, we type:

> acf(rainseriesforecasts2$residuals, lag.max=20)

You can see from the sample correlogram that the autocorrelation at lag 3 is just touching the significance bounds. To test whether there is significant evidence for non-zero correlations at lags 1-20, we can carry out a Ljung-Box test. This can be done in R using the Box.test() function. The maximum lag that we want to look at is specified using the "lag" parameter in the Box.test() function. For example, to test whether there are non-zero autocorrelations at lags 1-20 for the in-sample forecast errors for the London rainfall data, we type:

> Box.test(rainseriesforecasts2$residuals, lag=20, type="Ljung-Box")

        Box-Ljung test

data:  rainseriesforecasts2$residuals
X-squared = 17.4008, df = 20, p-value = 0.6268

Here the Ljung-Box test statistic is 17.4, and the p-value is 0.6, so there is little evidence of non-zero autocorrelations in the in-sample forecast errors at lags 1-20.

To be sure that the predictive model cannot be improved upon, it is also a good idea to check whether the forecast errors are normally distributed with mean zero and constant variance. To check whether the forecast errors have constant variance, we can make a time plot of the in-sample forecast errors:

> plot.ts(rainseriesforecasts2$residuals)

The plot shows that the in-sample forecast errors seem to have roughly constant variance over time, although the size of the fluctuations at the start of the time series (1820-1830) may be slightly smaller than that at later dates (e.g. 1840-1850).

To check whether the forecast errors are normally distributed with mean zero, we can plot a histogram of the forecast errors, with an overlaid normal curve that has mean zero and the same standard deviation as the distribution of forecast errors. To do this, we can define an R function plotForecastErrors(), below:

 
plotForecastErrors <- function(forecasterrors)
{
  # make a histogram of the forecast errors:
  mybinsize <- IQR(forecasterrors)/4
  mysd  <- sd(forecasterrors)
  mymin <- min(forecasterrors) - mysd*5
  mymax <- max(forecasterrors) + mysd*3
  # generate normally distributed data with mean 0 and standard deviation mysd:
  mynorm <- rnorm(10000, mean=0, sd=mysd)
  mymin2 <- min(mynorm)
  mymax2 <- max(mynorm)
  if (mymin2 < mymin) { mymin <- mymin2 }
  if (mymax2 > mymax) { mymax <- mymax2 }
  # make a red histogram of the forecast errors, with the normally distributed data overlaid:
  mybins <- seq(mymin, mymax, mybinsize)
  hist(forecasterrors, col="red", freq=FALSE, breaks=mybins)
  # freq=FALSE ensures the area under the histogram = 1
  myhist <- hist(mynorm, plot=FALSE, breaks=mybins)
  # plot the normal curve as a blue line on top of the histogram of forecast errors:
  points(myhist$mids, myhist$density, type="l", col="blue", lwd=2)
}

You will need to copy the function above into R in order to use it. You can then use plotForecastErrors() to plot a histogram (with overlaid normal curve) of the forecast errors for the rainfall predictions:

> plotForecastErrors(rainseriesforecasts2$residuals)

The plot shows that the distribution of forecast errors is roughly centred on zero, and is more or less normally distributed, although it seems to be slightly skewed to the right compared to a normal curve. However, the right skew is relatively small, and so it is plausible that the forecast errors are normally distributed with mean zero.

The Ljung-Box test showed that there is little evidence of non-zero autocorrelations in the in-sample forecast errors, and the distribution of forecast errors seems to be normally distributed with mean zero. This suggests that the simple exponential smoothing method provides an adequate predictive model for London rainfall, which probably cannot be improved upon. In addition, the assumptions that the 80% and 95% prediction intervals were based upon (that there are no autocorrelations in the forecast errors, and that the forecast errors are normally distributed with mean zero and constant variance) are probably valid.

Holt’s exponential smoothing

If you have a time series that can be described using an additive model with increasing or decreasing trend and no seasonality, you can use Holt's exponential smoothing to make short-term forecasts.

Holt's exponential smoothing estimates the level and slope at the current time point. Smoothing is controlled by two parameters: alpha, for the estimate of the level at the current time point, and beta, for the estimate of the slope b of the trend component at the current time point. As with simple exponential smoothing, the parameters alpha and beta have values between 0 and 1, and values that are close to 0 mean that little weight is placed on the most recent observations when making forecasts of future values.
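As an illustrative sketch (our own, not part of the original tutorial), the two update equations behind Holt's method can be written out directly in R:

holt_update <- function(x, alpha, beta) {
  n <- length(x)
  level <- numeric(n)
  slope <- numeric(n)
  level[1] <- x[1]
  slope[1] <- x[2] - x[1]   # a common choice of initial values (see l.start and b.start below)
  for (t in 2:n) {
    level[t] <- alpha * x[t] + (1 - alpha) * (level[t - 1] + slope[t - 1])
    slope[t] <- beta * (level[t] - level[t - 1]) + (1 - beta) * slope[t - 1]
  }
  list(level = level, slope = slope)   # the forecast for time n+h is level[n] + h * slope[n]
}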

An example of a time series that can probably be described using an additive model with a trend and no seasonality is the time series of the annual diameter of women's skirts at the hem, from 1866 to 1911. We can read in and plot the data in R by typing:

 
> skirts <- scan("skirts.dat", skip=5)   # hypothetical file path; point this at wherever the data is saved
> skirtsseries <- ts(skirts, start=c(1866))
> plot.ts(skirtsseries)

We can see from the plot that there was an increase in hem diameter from about 600 in 1866 to about 1050 in 1880, and that afterwards the hem diameter decreased to about 520 in 1911.

To make forecasts, we can fit a predictive model using the HoltWinters() function in R. To use HoltWinters() for Holt's exponential smoothing, we need to set the parameter gamma=FALSE (the gamma parameter is used for Holt-Winters exponential smoothing, as described below).

For example, to use Holt's exponential smoothing to fit a predictive model for skirt hem diameter, we type:

> skirtsseriesforecasts <- HoltWinters(skirtsseries, gamma=FALSE)
> skirtsseriesforecasts
  Smoothing parameters:
  alpha:  0.8383481
  beta :  1
  gamma:  FALSE
  Coefficients:
    [,1]
  a 529.308585
  b   5.690464
> skirtsseriesforecasts$SSE
  [1] 16954.18

The estimated value of alpha is 0.84, and of beta is 1.00. These are both high, telling us that both the estimate of the current value of the level, and of the slope b of the trend component, are based mostly upon very recent observations in the time series. This makes good intuitive sense, since the level and the slope of the time series both change quite a lot over time. The value of the sum of squared errors for the in-sample forecast errors is 16,954.

We can plot the original time series as a black line, with the forecasted values as a red line on top of that, by typing:

> plot(skirtsseriesforecasts)

We can see from the plot that the in-sample forecasts agree pretty well with the observed values, although they tend to lag behind the observed values a little bit.

If you wish, you can specify the initial values of the level and the slope b of the trend component by using the "l.start" and "b.start" arguments of the HoltWinters() function. It is common to set the initial value of the level to the first value in the time series (608 for the skirts data), and the initial value of the slope to the second value minus the first value (9 for the skirts data). For example, to fit a predictive model to the skirt hem data using Holt's exponential smoothing, with initial values of 608 for the level and 9 for the slope b of the trend component, we type:

> HoltWinters(skirtsseries, gamma=FALSE, l.start=608, b.start=9)

As for simple exponential smoothing, we can make forecasts for future times not covered by the original time series by using the forecast.HoltWinters() function in the "forecast" package. For example, our time series data for the skirt hems was for 1866 to 1911, so we can make predictions for 1912 to 1930 (19 more data points), and plot them, by typing:

> skirtsseriesforecasts2 <- forecast.HoltWinters(skirtsseriesforecasts, h=19)
> plot.forecast(skirtsseriesforecasts2)

The forecasts are shown as a blue line, with the 80% prediction interval as an orange shaded area, and the 95% prediction interval as a yellow shaded area.

As for simple exponential smoothing, we can check whether the predictive model could be improved upon by checking whether the in-sample forecast errors show non-zero autocorrelations at lags 1-20. For example, for the skirt hem data, we can make a correlogram, and carry out a Ljung-Box test, by typing:

> acf(skirtsseriesforecasts2$residuals, lag.max=20)
> Box.test(skirtsseriesforecasts2$residuals, lag=20, type="Ljung-Box")

        Box-Ljung test

data:  skirtsseriesforecasts2$residuals
X-squared = 19.7312, df = 20, p-value = 0.4749

Here the correlogram shows that the sample autocorrelation for the in-sample forecast errors at lag 5 exceeds the significance bounds. However, we would expect one in 20 of the autocorrelations for the first twenty lags to exceed the 95% significance bounds by chance alone. Indeed, when we carry out the Ljung-Box test, the p-value is 0.47, indicating that there is little evidence of non-zero autocorrelations in the in-sample forecast errors at lags 1-20.

As for simple exponential smoothing, we should also check that the forecast errors have constant variance over time, and are normally distributed with mean zero. We can do this by making a time plot of the forecast errors, and a histogram of the distribution of forecast errors with an overlaid normal curve:

> plot.ts(skirtsseriesforecasts2$residuals)            # make a time plot
> plotForecastErrors(skirtsseriesforecasts2$residuals) # make a histogram

The time plot of forecast errors shows that the forecast errors have roughly constant variance over time. The histogram of forecast errors shows that it is plausible that the forecast errors are normally distributed with mean zero and constant variance.

Thus, the Ljung-Box test shows that there is little evidence of autocorrelations in the forecast errors, while the time plot and histogram of forecast errors show that it is plausible that the forecast errors are normally distributed with mean zero and constant variance. Therefore, we can conclude that Holt's exponential smoothing provides an adequate predictive model for skirt hem diameters, which probably cannot be improved upon. In addition, it means that the assumptions that the 80% and 95% prediction intervals were based upon are probably valid.

Holt-Winters exponential smoothing

If you have a time series that can be described using an additive model with increasing or decreasing trend and seasonality, you can use Holt-Winters exponential smoothing to make short-term forecasts.

Holt-Winters exponential smoothing estimates the level, slope and seasonal component at the current time point. Smoothing is controlled by three parameters: alpha, beta and gamma, for the estimates of the level, the slope b of the trend component, and the seasonal component, respectively, at the current time point. The parameters alpha, beta and gamma all have values between 0 and 1, and values that are close to 0 mean that relatively little weight is placed on the most recent observations when making forecasts of future values.

An example of a time series that can probably be described using an additive model with a trend and seasonality is the time series of the log of monthly sales for the souvenir shop at a beach resort town in Queensland, Australia (discussed above):

To make forecasts, we can fit a predictive model using the HoltWinters() function. For example, to fit a predictive model for the log of the monthly sales in the souvenir shop, we type:

> logsouvenirtimeseries <- log(souvenirtimeseries)
> souvenirtimeseriesforecasts <- HoltWinters(logsouvenirtimeseries)
> souvenirtimeseriesforecasts
Holt-Winters exponential smoothing with trend and additive seasonal component.
Smoothing parameters:
 alpha:  0.413418
 beta :  0
 gamma:  0.9561275
Coefficients:
           [,1]
a   10.3766666
b    0.02996319
s1  -0.80952063
s2  -0.60576477
s3   0.01103238
s4  -0.24160551
s5  -0.35933517
s6  -0.18076683
s7   0.07788605
s8   0.10147055
s9   0.09649353
s10  0.05197826
s11  0.41793637
s12  1.18088423
> souvenirtimeseriesforecasts$SSE
[1] 2.011491

The estimated values of alpha, beta and gamma are 0.41, 0.00, and 0.96, respectively. The value of alpha (0.41) is relatively low, indicating that the estimate of the level at the current time point is based upon both recent observations and some observations in the more distant past. The value of beta is 0.00, indicating that the estimate of the slope b of the trend component is not updated over the time series, and instead is set equal to its initial value. This makes good intuitive sense, as the level changes quite a lot over the time series, but the slope b of the trend component remains roughly the same. In contrast, the value of gamma (0.96) is high, indicating that the estimate of the seasonal component at the current time point is just based upon very recent observations.
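The fitted values kept inside the HoltWinters object keep these components separate; as a small sketch (column names as used by stats::HoltWinters for a seasonal fit), they can be inspected with:

> head(souvenirtimeseriesforecasts$fitted)   # columns: xhat, level, trend, season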

As for simple exponential smoothing and Holt's exponential smoothing, we can plot the original time series as a black line, with the forecasted values as a red line on top of that, by typing:

> plot(souvenirtimeseriesforecasts)

We can see from the plot that Holt-Winters exponential smoothing is very successful in predicting the seasonal peaks, which occur roughly in November every year.

To make forecasts for future times not included in the original time series, we use the forecast.HoltWinters() function in the "forecast" package. For example, the original data for the souvenir sales is from January 1987 to December 1993. If we wanted to make forecasts for January 1994 to December 1998 (48 more months), and plot the forecasts, we would type:

> souvenirtimeseriesforecasts2 <- forecast.HoltWinters(souvenirtimeseriesforecasts, h=48)
> plot.forecast(souvenirtimeseriesforecasts2)

The forecasts are shown as a blue line, and the orange and yellow shaded areas show 80% and 95% prediction intervals, respectively.

We can investigate whether the predictive model can be improved upon by checking whether the in-sample forecast errors show non-zero autocorrelations at lags 1-20, by making a correlogram and carrying out a Ljung-Box test:

> acf(souvenirtimeseriesforecasts2$residuals, lag.max=20)
> Box.test(souvenirtimeseriesforecasts2$residuals, lag=20, type="Ljung-Box")

        Box-Ljung test

data:  souvenirtimeseriesforecasts2$residuals
X-squared = 17.5304, df = 20, p-value = 0.6183

The correlogram shows that the autocorrelations for the in-sample forecast errors do not exceed the significance bounds for lags 1-20. Furthermore, the p-value for the Ljung-Box test is 0.6, indicating that there is little evidence of non-zero autocorrelations at lags 1-20.

We can check whether the forecast errors have constant variance over time, and are normally distributed with mean zero, by making a time plot of the forecast errors and a histogram (with overlaid normal curve):

> plot.ts(souvenirtimeseriesforecasts2$residuals)            # make a time plot
> plotForecastErrors(souvenirtimeseriesforecasts2$residuals) # make a histogram

 

From the time plot, it appears plausible that the forecast errors have constant variance over time. From the histogram of forecast errors, it seems plausible that the forecast errors are normally distributed with mean zero.

Thus, there is little evidence of autocorrelation at lags 1-20 for the forecast errors, and the forecast errors appear to be normally distributed with mean zero and constant variance over time. This suggests that Holt-Winters exponential smoothing provides an adequate predictive model of the log of sales at the souvenir shop, which probably cannot be improved upon. Furthermore, the assumptions upon which the prediction intervals were based are probably valid.

ARIMA models

Exponential smoothing methods are useful for making forecasts, and make no assumptions about the correlations between successive values of the time series. However, if you want to make prediction intervals for forecasts made using exponential smoothing methods, the prediction intervals require that the forecast errors are uncorrelated and are normally distributed with mean zero and constant variance.

While exponential smoothing methods do not make any assumptions about correlations between successive values of the time series, in some cases you can make a better predictive model by taking correlations in the data into account. Autoregressive Integrated Moving Average (ARIMA) models include an explicit statistical model for the irregular component of a time series, that allows for non-zero autocorrelations in the irregular component.

Differencing a time series

ARIMA models are defined for stationary time series. Therefore, if you start off with a non-stationary time series, you will first need to "difference" the time series until you obtain a stationary time series. If you have to difference the time series d times to obtain a stationary series, then you have an ARIMA(p,d,q) model, where d is the order of differencing used.
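As an aside, the "forecast" package provides an ndiffs() function that estimates the required order of differencing automatically; a minimal sketch (ndiffs() is part of the forecast package, not the original tutorial):

> library("forecast")
> ndiffs(skirtsseries)   # estimated number of differences needed to make the series stationary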

You can difference a time series using the diff() function in R. For example, the time series of the annual diameter of women's skirts at the hem, from 1866 to 1911, is not stationary in mean, as the level changes a lot over time:

We can difference the time series (which we stored in "skirtsseries", see above) once, and plot the differenced series, by typing:

> skirtsseriesdiff1 <- diff(skirtsseries, differences=1)
> plot.ts(skirtsseriesdiff1)

The resulting time series of first differences (above) does not appear to be stationary in mean. Therefore, we can difference the time series twice, to see if that gives us a stationary time series:

> skirtsseriesdiff2 <- diff(skirtsseries, differences=2)
> plot.ts(skirtsseriesdiff2)

Testing for stationarity

Formal tests for stationarity, called "unit root tests", are available in the fUnitRoots package.
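For instance, here is a sketch of how one might apply that package's Augmented Dickey-Fuller test, adfTest(), to the twice-differenced series (the call and its defaults are our assumption, not part of the original tutorial):

> library("fUnitRoots")
> adfTest(skirtsseriesdiff2)   # a small p-value suggests the differenced series is stationary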

The time series of second differences (above) does appear to be stationary in mean and variance, as the level of the series stays roughly constant over time, and the variance of the series appears roughly constant over time. Thus, it appears that we need to difference the time series of skirt diameters twice in order to achieve a stationary series.

If you need to difference your original time series data d times in order to obtain a stationary time series, this means that you can use an ARIMA(p,d,q) model for your time series, where d is the order of differencing used. For example, for the time series of the diameter of women's skirts, we had to difference the time series twice, and so the order of differencing (d) is 2. This means that you can use an ARIMA(p,2,q) model for your time series. The next step is to figure out the values of p and q for the ARIMA model.

Another example is the time series of the ages at death of the kings of England (see above):

From the time plot (above), we can see that the time series is not stationary in mean. To calculate the time series of first differences, and plot it, we type:

> kingtimeseriesdiff1 <- diff(kingstimeseries, differences=1)
> plot.ts(kingtimeseriesdiff1)

The time series of first differences appears to be stationary in mean and variance, and so an ARIMA(p,1,q) model is probably appropriate for the time series of the ages at death of the kings of England. By taking the time series of first differences, we have removed the trend component of the time series of ages at death, and are left with an irregular component. We can now examine whether there are correlations between successive terms of this irregular component; if so, this could help us to make a predictive model for the ages at death of the kings.

Selecting a candidate ARIMA model

If your time series is stationary, or if you have transformed it to a stationary time series by differencing d times, the next step is to select the appropriate ARIMA model, which means finding the values of the most appropriate p and q for an ARIMA(p,d,q) model. To do this, you usually need to examine the correlogram and partial correlogram of the stationary time series.

To plot a correlogram and partial correlogram, we can use the acf() and pacf() functions in R, respectively. To get the actual values of the autocorrelations and partial autocorrelations, we set plot=FALSE in the acf() and pacf() functions.

Example: the ages at death of the kings of England

For example, to plot the correlogram for lags 1-20 of the time series of first differences of the ages at death of the kings of England, and to get the values of the autocorrelations, we type:

> acf(kingtimeseriesdiff1, lag.max=20)             # plot a correlogram
> acf(kingtimeseriesdiff1, lag.max=20, plot=FALSE) # get the autocorrelation values
Autocorrelations of series 'kingtimeseriesdiff1', by lag
     0      1      2      3      4      5      6      7      8      9     10
 1.000 -0.360 -0.162 -0.050  0.227 -0.042 -0.181  0.095  0.064 -0.116 -0.071
    11     12     13     14     15     16     17     18     19     20
 0.206 -0.017 -0.212  0.130  0.114 -0.009 -0.192  0.072  0.113 -0.093

We see from the correlogram that the autocorrelation at lag 1 (-0.360) exceeds the significance bounds, but all other autocorrelations between lags 1-20 do not exceed the significance bounds.

To plot the partial correlogram for lags 1-20 for the time series of first differences of the ages at death of the English kings, and to get the values of the partial autocorrelations, we use the pacf() function, by typing:

> pacf(kingtimeseriesdiff1, lag.max=20)             # plot a partial correlogram
> pacf(kingtimeseriesdiff1, lag.max=20, plot=FALSE) # get the partial autocorrelation values
Partial autocorrelations of series 'kingtimeseriesdiff1', by lag
     1      2      3      4      5      6      7      8      9     10     11
-0.360 -0.335 -0.321  0.005  0.025 -0.144 -0.022 -0.007 -0.143 -0.167  0.065
    12     13     14     15     16     17     18     19     20
 0.034 -0.161  0.036  0.066  0.081 -0.005 -0.027 -0.006 -0.037

The partial correlogram shows that the partial autocorrelations at lags 1, 2 and 3 exceed the significance bounds, are negative, and are slowly decreasing in magnitude with increasing lag (lag 1: -0.360, lag 2: -0.335, lag 3: -0.321). The partial autocorrelations tail off to zero after lag 3.

Since the correlogram is zero after lag 1, and the partial correlogram tails off to zero after lag 3, this means that the following ARMA (autoregressive moving average) models are possible for the time series of first differences:

  • an ARMA(3,0) model, that is, an autoregressive model of order p=3, since the partial autocorrelogram is zero after lag 3, and the autocorrelogram tails off to zero (although perhaps too abruptly for this model to be appropriate)
  • an ARMA(0,1) model, that is, a moving average model of order q=1, since the autocorrelogram is zero after lag 1, and the partial autocorrelogram tails off to zero
  • an ARMA(p,q) model, that is, a mixed model with p and q greater than 0, since the autocorrelogram and partial correlogram tail off to zero (although the correlogram probably tails off to zero too abruptly for this model to be appropriate)

We use the principle of parsimony to decide which model is best: that is, we assume that the model with the fewest parameters is best. The ARMA(3,0) model has 3 parameters, the ARMA(0,1) model has 1 parameter, and the ARMA(p,q) model has at least 2 parameters. Therefore, the ARMA(0,1) model is taken as the best model.

An ARMA(0,1) model is a moving average model of order 1, or MA(1) model. This model can be written as: X_t - mu = Z_t - theta * Z_{t-1}, where X_t is the stationary time series we are studying (the first-differenced series of ages at death of English kings), mu is the mean of the time series X_t, Z_t is white noise with mean zero and constant variance, and theta is a parameter that can be estimated.
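To get a feel for what an MA(1) process looks like, one can simulate one with base R's arima.sim() function; an illustrative sketch (our own; note that arima.sim() uses the sign convention X_t = Z_t + theta * Z_{t-1}):

> set.seed(1)                                  # for reproducibility
> ma1 <- arima.sim(model=list(ma=0.7), n=100)  # simulate 100 values of an MA(1) process
> acf(ma1)                                     # its correlogram should cut off after lag 1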

An MA (moving average) model is usually used to model a time series that shows short-term dependencies between successive observations. Intuitively, it makes sense that an MA model could be used to describe the irregular component in the time series of ages at death of English kings, as we might expect the age at death of a particular king to have some effect on the ages at death of the next king or two, but not much effect on the ages at death of kings who reign much longer after that.

The auto.arima() function

The auto.arima() function can be used to find an appropriate ARIMA model, for example, by typing library("forecast") and then auto.arima(kings). The output says that an appropriate model is ARIMA(0,1,1).
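That is, assuming the "kings" vector read in at the start of this tutorial is still in the workspace:

> library("forecast")
> auto.arima(kings)   # selects ARIMA(0,1,1) for this series, as noted above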

Since an ARMA(0,1) model (with p=0, q=1) is taken to be the best candidate model for the time series of first differences of the ages at death of English kings, the original time series of the ages at death can be modelled using an ARIMA(0,1,1) model (with p=0, d=1, q=1, where d is the order of differencing required).

 

Forecasting using an ARIMA model

Once you have selected the best candidate ARIMA(p,d,q) model for your time series data, you can estimate the parameters of that ARIMA model, and use it as a predictive model for making forecasts of future values of your time series.

You can estimate the parameters of an ARIMA(p,d,q) model using the arima() function in R.

Example: the ages at death of the kings of England

For example, we discussed above that an ARIMA(0,1,1) model seems a plausible model for the ages at death of the kings of England. You can specify the values of p, d and q in the ARIMA model by using the "order" argument of the arima() function in R. To fit an ARIMA(p,d,q) model to this time series (which we stored in the variable "kingstimeseries", see above), we type:

> kingstimeseriesarima <- arima(kingstimeseries, order=c(0,1,1))   # fit an ARIMA(0,1,1) model
> kingstimeseriesarima
Coefficients:
          ma1
      -0.7218
s.e.   0.1208
sigma^2 estimated as 230.4:  log likelihood = -170.06
AIC = 344.13   AICc = 344.44   BIC = 347.56

As mentioned above, if we are fitting an ARIMA(0,1,1) model to our time series, it means we are fitting an ARMA(0,1) model to the time series of first differences. An ARMA(0,1) model can be written X_t - mu = Z_t - theta * Z_{t-1}, where theta is a parameter to be estimated. From the output of the arima() R function (above), the estimated value of theta (given as 'ma1' in the R output) is -0.7218 in the case of the ARIMA(0,1,1) model fitted to the time series of ages at death of kings.

Specifying the confidence level for prediction intervals

You can specify the confidence level for prediction intervals in forecast.Arima() by using the "level" argument. For example, to get a 99.5% prediction interval, we would type forecast.Arima(kingstimeseriesarima, h=5, level=c(99.5)).

We can then use the ARIMA model to make forecasts of future values of the time series, using the forecast.Arima() function in the "forecast" R package. For example, to forecast the ages at death of the next five English kings, we type:

> kingstimeseriesforecasts <- forecast.Arima(kingstimeseriesarima, h=5)
> kingstimeseriesforecasts
   Point Forecast    Lo 80    Hi 80    Lo 95     Hi 95
43       67.75063 48.29647 87.20479 37.99806  97.50319
44       67.75063 47.55748 87.94377 36.86788  98.63338
45       67.75063 46.84460 88.65665 35.77762  99.72363
46       67.75063 46.15524 89.34601 34.72333 100.77792
47       67.75063 45.48722 90.01404 33.70168 101.79958

The original time series for the English kings includes the ages at death of 42 English kings. The forecast.Arima() function gives us a forecast of the ages at death of the next five English kings (kings 43-47), as well as 80% and 95% prediction intervals for those predictions. The age at death of the 42nd English king was 56 years (the last observed value in our time series), and the ARIMA model gives the forecasted age at death of the next five kings as 67.8 years.

We can plot the observed ages at death for the first 42 kings, as well as the ages that would be predicted for these 42 kings and for the next 5 kings, using our ARIMA(0,1,1) model:

> plot.forecast(kingstimeseriesforecasts)

As in the case of exponential smoothing models, it is a good idea to investigate whether the forecast errors of an ARIMA model are normally distributed with mean zero and constant variance, and whether there are correlations between successive forecast errors.

For example, we can make a correlogram of the forecast errors for our ARIMA(0,1,1) model for the ages at death of kings, and carry out the Ljung-Box test for lags 1-20, by typing:

> acf(kingstimeseriesforecasts$residuals, lag.max=20)
> Box.test(kingstimeseriesforecasts$residuals, lag=20, type="Ljung-Box")

        Box-Ljung test

data:  kingstimeseriesforecasts$residuals
X-squared = 13.5844, df = 20, p-value = 0.851

Since the correlogram shows that none of the sample autocorrelations for lags 1-20 exceed the significance bounds, and the p-value for the Ljung-Box test is 0.85, we can conclude that there is very little evidence of non-zero autocorrelations in the forecast errors at lags 1-20.

To investigate whether the forecast errors are normally distributed with mean zero and constant variance, we can make a time plot and histogram (with overlaid normal curve) of the forecast errors:

> plot.ts(kingstimeseriesforecasts$residuals)            # make a time plot
> plotForecastErrors(kingstimeseriesforecasts$residuals) # make a histogram

The time plot of the in-sample forecast errors shows that the variance of the forecast errors seems to be roughly constant over time (though possibly slightly higher for the second half of the time series). The histogram of the forecast errors shows that they are roughly normally distributed and the mean seems to be close to zero. Therefore, it is plausible that the forecast errors are normally distributed with mean zero and constant variance.

Since successive forecast errors do not seem to be correlated, and the forecast errors seem to be normally distributed with mean zero and constant variance, the ARIMA(0,1,1) model does seem to provide an adequate predictive model for the ages at death.

 

Any other questions? Contact us!

 


Most popular insights

1. Use LSTM and PyTorch for time series prediction in Python

2. Long and short-term memory model LSTM is used in Python for time series prediction analysis

3. Time series (ARIMA, exponential smoothing) analysis using R language

4. R language multivariate Copula – Garch – model time series prediction

5. R language Copulas and financial time series cases

6. Use R language random wave model SV to process random fluctuations in time series

7. Tar threshold autoregressive model for R language time series

8. R language K-Shape time series clustering method for stock price time series clustering

9. Python3 uses an ARIMA model for time series prediction