# Forecasting GDP growth

We replicate the example provided in Ghysels (2013): a MIDAS regression that forecasts quarterly GDP growth from monthly non-farm payroll employment growth. The forecasting equation is

$$y_{t+1} = \alpha + \rho y_t + \sum_{j=0}^{8} \theta_j x_{3t-j} + \varepsilon_{t+1},$$

where $y_t$ is the log difference of quarterly seasonally adjusted real GDP and $x_{3t}$ is the log difference of monthly total non-farm payroll employment.

First, we load the data and perform the transformation.

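```r
R> data("USqgdp", package = "midasr")
R> data("USpayems", package = "midasr")
R> y <- window(USqgdp, end = c(2011, 2))
R> x <- window(USpayems, end = c(2011, 7))
R> yg <- diff(log(y)) * 100   # quarterly growth, in percent
R> xg <- diff(log(x)) * 100   # monthly growth, in percent
R> nx <- ts(c(NA, xg, NA, NA), start = start(x), frequency = 12)
R> ny <- ts(c(rep(NA, 33), yg, NA), start = start(x), frequency = 4)
```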

The last two lines equalize the sample sizes, which differ in the original data: we simply pad the beginning and end of each series with additional NA values. A graphical representation of the data is shown in the figure. To specify the model for the midas_r function, we rewrite it in the following equivalent form:

$$y_t = \alpha + \rho y_{t-1} + \sum_{j=3}^{11} \theta_j x_{3t-j} + \varepsilon_t.$$

As in Ghysels (2013), we restrict the estimation sample to the period from the first quarter of 1985 to the first quarter of 2009. We estimate the model with three weight specifications: the Beta polynomial, the Beta polynomial with non-zero last lag, and unrestricted (U-MIDAS) weights.
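To see which monthly observations enter each quarterly equation, the lag structure can be laid out by hand. The sketch below is a base R illustration, not the package's own alignment code: `hf_lag_matrix` is a hypothetical helper, and the toy series simply numbers the months so the alignment is easy to read.

```r
# Hypothetical helper (not part of midasr): each row is one low-frequency
# period t, each column holds the high-frequency value x_{m*t - j}.
hf_lag_matrix <- function(x, lags, m) {
  stopifnot(length(x) %% m == 0)
  n_low <- length(x) / m
  out <- matrix(NA_real_, nrow = n_low, ncol = length(lags),
                dimnames = list(NULL, paste0("lag", lags)))
  for (col in seq_along(lags)) {
    idx <- m * seq_len(n_low) - lags[col]  # position of x_{m*t - j}
    ok <- idx >= 1                         # lags before the sample stay NA
    out[ok, col] <- x[idx[ok]]
  }
  out
}

x_monthly <- 1:12   # four quarters of toy monthly data, month k has value k
X <- hf_lag_matrix(x_monthly, lags = 3:5, m = 3)
print(X)
```

With lags starting at 3, the row for quarter t only contains months up to the end of quarter t - 1, which is what makes the regression usable for forecasting.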

```r
R> coef(beta0)
(Intercept)          yy         xx1         xx2         xx3
  0.8315274   0.1058910   2.5887103   1.0201202  13.6867809
R> coef(betan)
(Intercept)          yy         xx1         xx2         xx3         xx4
 0.93778705  0.06748141  2.26970646  0.98659174  1.49616336 -0.09184983
R> coef(um)
(Intercept)          yy         xx1         xx2         xx3         xx4
 0.92989757  0.08358393  2.00047205  0.88134597  0.42964662 -0.17596814
        xx5         xx6         xx7         xx8         xx9
 0.28351010  1.16285271 -0.53081967 -0.73391876 -1.18732001
```
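For the two Beta specifications, the `xx` entries are parameters of a weight function rather than individual lag coefficients. As a rough illustration, the sketch below implements a generic normalized Beta lag polynomial in base R. It assumes the first parameter acts as an overall scale and the next two as shape parameters; `beta_weights` is a hypothetical name, the parameter values are made up, and the package's own weight functions may treat edge cases differently.

```r
# Generic normalized Beta lag polynomial (illustrative sketch only).
beta_weights <- function(scale, a, b, d) {
  eps <- .Machine$double.eps
  xi <- (seq_len(d) - 1) / (d - 1)   # evenly spaced grid on [0, 1]
  xi[1] <- xi[1] + eps               # keep the end points strictly inside
  xi[d] <- xi[d] - eps               # (0, 1) so negative powers stay finite
  w <- xi^(a - 1) * (1 - xi)^(b - 1)
  scale * w / sum(w)                 # weights sum to `scale`
}

# Made-up parameter values, chosen only to show a decaying shape over 9 lags
w <- beta_weights(scale = 2.5, a = 1, b = 5, d = 9)
print(round(w, 4))
```

Three parameters therefore generate all nine lag coefficients, whereas the U-MIDAS fit estimates each of the nine freely.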

We can evaluate the forecasting performance of these three models on out-of-sample data covering 9 quarters, from 2009Q2 to 2011Q2.

```r
R> fulldata <- list(xx = window(nx, start = c(1985, 1), end = c(2011, 6)),
+    yy = window(ny, start = c(1985, 1), end = c(2011, 2)))
R> insample <- 1:length(yy)
R> outsample <- (1:length(fulldata$yy))[-insample]
R> avgf <- average_forecast(list(beta0, betan, um), data = fulldata,
+    insample = insample, outsample = outsample)
R> sqrt(avgf$accuracy$individual$MSE.out.of.sample)
[1] 0.5361953 0.4766972 0.4457144
```

We see that the unrestricted MIDAS (U-MIDAS) regression model gives the best out-of-sample RMSE.
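The ranking can be read off the printed RMSE values directly. The snippet below is a small illustration that assumes the three values are reported in the order in which the models were passed (`beta0`, `betan`, `um`); the names attached to the vector are added here for readability.

```r
# Out-of-sample RMSE values copied from the output above
rmse <- c(beta0 = 0.5361953, betan = 0.4766972, um = 0.4457144)

best <- names(which.min(rmse))             # model with the smallest RMSE
excess <- 100 * (rmse / min(rmse) - 1)     # percent above the best model

print(best)
print(round(excess, 1))
```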

# Forecasting realized volatility

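As another demonstration, we use midasr to forecast daily realized volatility. Corsi (2009) proposed a simple model for this purpose, the heterogeneous autoregressive model of realized volatility (HAR-RV), defined as

$$RV_{t+1}^{(d)} = c + \beta^{(d)} RV_t^{(d)} + \beta^{(w)} RV_t^{(w)} + \beta^{(m)} RV_t^{(m)} + w_{t+1},$$

where $RV_t^{(d)}$ is the daily realized volatility and $RV_t^{(w)}$ and $RV_t^{(m)}$ are its weekly and monthly averages,

$$RV_t^{(w)} = \frac{1}{5} \sum_{j=0}^{4} RV_{t-j}^{(d)}, \qquad RV_t^{(m)} = \frac{1}{20} \sum_{j=0}^{19} RV_{t-j}^{(d)},$$

assuming that a week has five days and a month has four weeks. This model is a special case of a MIDAS regression:

$$RV_{t+1}^{(d)} = c + \sum_{j=0}^{19} \beta_j RV_{t-j}^{(d)} + w_{t+1},$$

where

$$\beta_j = \begin{cases} \beta^{(d)} + \frac{1}{5}\beta^{(w)} + \frac{1}{20}\beta^{(m)}, & j = 0, \\ \frac{1}{5}\beta^{(w)} + \frac{1}{20}\beta^{(m)}, & j = 1, \dots, 4, \\ \frac{1}{20}\beta^{(m)}, & j = 5, \dots, 19. \end{cases}$$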
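The restriction that HAR-RV places on the 20 daily lag coefficients can be written as a step-shaped weight function. The sketch below is a standalone base R version. The name `har_weights` is hypothetical (midasr ships a comparable weight function, `harstep`), and the parameter values are purely illustrative.

```r
# Map the three HAR-RV parameters (daily, weekly, monthly) to 20 lag weights.
har_weights <- function(p, d = 20) {
  stopifnot(length(p) == 3, d == 20)
  w <- numeric(d)
  w[1]    <- p[1] + p[2] / 5 + p[3] / 20   # most recent day
  w[2:5]  <- p[2] / 5 + p[3] / 20          # rest of the last week
  w[6:20] <- p[3] / 20                     # rest of the last month
  w
}

p <- c(0.3, 0.4, 0.2)      # illustrative values, not estimates
set.seed(1)
rv <- runif(20)            # toy daily series, most recent day first

lhs <- sum(har_weights(p) * rv)                               # MIDAS form
rhs <- p[1] * rv[1] + p[2] * mean(rv[1:5]) + p[3] * mean(rv)  # HAR-RV form
print(c(midas = lhs, har = rhs))
```

Both forms give the same fitted value, which is why HAR-RV can be estimated as a restricted MIDAS regression with three parameters.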

For the empirical demonstration, we use the realized variance data on stock indices provided by Heber, Lunde, Shephard, and Sheppard (2009). We estimate the model for the annualized realized volatility of the S&P 500 index, which is based on 5-minute returns data.

```
Parameters:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.83041    0.36437   2.279 0.022726 *
rv1          0.34066    0.04463   7.633 2.95e-14 ***
rv2          0.41135    0.06932   5.934 3.25e-09 ***
rv3          0.19317    0.05081   3.802 0.000146 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.563 on 3435 degrees of freedom
```

For comparison, we also estimate the model with normalized exponential Almon weights:
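As a rough guide to what this weight scheme looks like, here is a base R sketch of a normalized exponential Almon lag. It assumes the first parameter scales the weights and the remaining ones are coefficients of a polynomial in the lag index; `nealmon_weights` is a hypothetical name and the parameter values are illustrative, not estimates.

```r
# Normalized exponential Almon lag weights (illustrative sketch).
nealmon_weights <- function(p, d) {
  i <- seq_len(d)
  # Polynomial in the lag index: p[2] * i + p[3] * i^2 + ...
  poly_part <- outer(i, seq_along(p[-1]), "^") %*% p[-1]
  e <- exp(poly_part - max(poly_part))   # shift by the max for stability
  as.vector(p[1] * e / sum(e))           # weights sum to p[1]
}

w <- nealmon_weights(c(1, -0.5, 0.01), d = 20)
print(round(w[1:5], 4))
```

Unlike the step function implied by HAR-RV, these weights change smoothly from one lag to the next.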

```
Parameters:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.837660   0.377536   2.219   0.0266 *
rv1         0.944719   0.027748  34.046  < 2e-16 ***
rv3         0.029084   0.005604   5.190 2.23e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.535 on 3435 degrees of freedom
```

We can test which of these restrictions is compatible with the data using hAhr_test, the heteroscedasticity- and autocorrelation-robust weight specification test.

```
hAh restriction test (robust version)
data:
hAhr = 28.074, df = 17, p-value = 0.04408

hAh restriction test (robust version)
data:
hAhr = 19.271, df = 17, p-value = 0.3132
```

The null hypothesis that the HAR-RV-implied constraints hold in the MIDAS regression model is rejected at the 0.05 significance level, whereas the null hypothesis that the exponential Almon lag restriction is adequate cannot be rejected.
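The reported p-values can be cross-checked from the printed statistics. The snippet below assumes the statistic is referred to a chi-squared distribution with the printed degrees of freedom (17, which equals the 20 lag coefficients minus the 3 estimated weight parameters). That assumption about how the p-value is formed is not stated in the output itself.

```r
# Test statistics copied from the output above
stat <- c(har_rv = 28.074, nealmon = 19.271)
df <- 17

# Upper-tail chi-squared probabilities
p_val <- pchisq(stat, df = df, lower.tail = FALSE)
print(round(p_val, 4))

# TRUE marks a restriction rejected at the 5% level
print(p_val < 0.05)
```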

The figure illustrates the fitted coefficients of the MIDAS and U-MIDAS regressions and their corresponding 95% confidence bounds. For the exponential Almon lag specification, we can choose the number of lags by AIC or BIC.

We use two optimization methods to improve convergence. The test function is applied to each candidate model. The hAhr_test function takes a lot of computing time, especially for models with a large number of lags, so we compute it only in the second step. AIC selects the model with 9 lags:
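The idea of choosing the lag order by an information criterion can be shown in base R on simulated data. This is a generic sketch with an unrestricted autoregression fitted by `lm`, not the package's selection routine. The simulated series, `max_lag`, and the variable names are all assumptions made for the illustration.

```r
set.seed(1)
y <- as.numeric(arima.sim(list(ar = c(0.5, 0.3)), n = 500))  # toy AR(2) data

max_lag <- 8
# Column 1 is y_t, column k + 1 is y_{t-k}; every candidate uses the same rows
E <- embed(y, max_lag + 1)

aic <- sapply(seq_len(max_lag), function(p) {
  fit <- lm(E[, 1] ~ E[, 2:(p + 1), drop = FALSE])  # regression on p lags
  AIC(fit)
})

best_lag <- which.min(aic)   # lag order with the smallest AIC
print(round(aic, 2))
print(best_lag)
```

Fitting all candidates on the same rows matters, because information criteria are only comparable across models estimated on an identical sample.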

```
Selected model with AIC = 21551.97
Based on restricted MIDAS regression model
The p-value for the null hypothesis of the test hAhr_test is 0.5531733

Parameters:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.96102    0.36944   2.601  0.00933 **
rv1          0.93707    0.02729  34.337  < 2e-16 ***
rv2         -1.19233    0.19288  -6.182 7.08e-10 ***
rv3          0.09657    0.02190   4.411 1.06e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.524 on 3440 degrees of freedom
```

The HAC-robust version of hAh_test again does not reject the null hypothesis of the exponential Almon lag specification. We can examine the forecasting performance of both models using a rolling forecast with a window of 1000 observations. For comparison, we also compute the forecasts of the unrestricted AR(20) model.
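A rolling forecast re-estimates the model on a fixed-length window that slides forward one observation at a time. The sketch below shows the mechanics in base R with a simple AR(1) on simulated data. The window length, the number of forecasts, and the model are all scaled-down assumptions chosen to keep the example short.

```r
set.seed(42)
y <- as.numeric(arima.sim(list(ar = 0.7), n = 300))  # toy series

window_size <- 200   # length of each estimation window
n_out <- 50          # number of one-step-ahead forecasts
pred <- numeric(n_out)

for (h in seq_len(n_out)) {
  idx <- h:(h + window_size - 1)                   # window slides forward
  d <- data.frame(y = y[idx[-1]], ylag = y[idx[-window_size]])
  fit <- lm(y ~ ylag, data = d)                    # re-estimate on the window
  last_obs <- data.frame(ylag = y[h + window_size - 1])
  pred[h] <- predict(fit, newdata = last_obs)      # forecast the next point
}

actual <- y[(window_size + 1):(window_size + n_out)]
mse_out <- mean((actual - pred)^2)                 # out-of-sample MSE
print(mse_out)
```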

```
  Model                          MSE.out.of.sample MAPE.out.of.sample
1 rv ~ mls(rv, 1:20, 1)                   10.82516           26.60201
2 rv ~ mls(rv, 1:20, 1, harstep)
3 rv ~ mls(rv, 1:9, 1, nealmon)
  MASE.out.of.sample MSE.in.sample MAPE.in.sample MASE.in.sample
1                                                      0.8333858
2          0.8019687      29.24989       21.59220      0.8367377
3          0.7945121      29.08284       21.81484      0.8401646
```

We see that the exponential Almon lag model slightly outperforms the HAR-RV model, and that both models outperform the AR(20) model.
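For readers unfamiliar with the accuracy measures in the table, the sketch below computes MSE, MAPE, and one common variant of MASE on toy numbers. The function name `accuracy_measures` is hypothetical, and the MASE scaling used here (mean absolute change of the actual series) is an assumption; the package's exact definition may differ.

```r
# Illustrative accuracy measures for a vector of forecasts.
accuracy_measures <- function(actual, predicted) {
  err <- actual - predicted
  c(MSE  = mean(err^2),
    MAPE = 100 * mean(abs(err / actual)),                # percent error
    MASE = mean(abs(err)) / mean(abs(diff(actual))))     # scaled by naive MAE
}

actual    <- c(10, 12, 11, 13)   # toy data
predicted <- c(11, 12, 10, 12)
acc <- accuracy_measures(actual, predicted)
print(round(acc, 4))
```

A MASE below 1 means the forecasts have a smaller mean absolute error than a naive forecast that repeats the previous observation.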

### References

Andreou E, Ghysels E, Kourtellos A (2010). "Regression models with mixed sampling frequencies." Journal of Econometrics, 158, 246-261. doi:10.1016/j.jeconom.2010.01.004.

Andreou E, Ghysels E, Kourtellos A (2011). "Forecasting with mixed-frequency data." In MP Clements, DF Hendry (eds.), Oxford Handbook of Economic Forecasting, pp. 225-245.