The Dancing Economist: Forecasting In R: The Greatest Shortcut That Failed The Ljung-Box

Okay so you want to forecast in R, but don't want to manually find the best model and go through the drudgery of plotting and so on. I have recently found the perfect function for you. Its called auto.arima and it automatically fits the best arima model to your time series. In a word- it is "brilliant". Lets take a look at its brilliance with the following examples.

So in our last post the last thing we plotted was de-trended GDP and were hoping to forecast it. R makes this not only really easy, but also hilariously fun.

Just type in the following code with dt being your de-trended series.

fitdt<-auto.arima(dt)

plot(forecast(fitdt,h=25))

Here is the marvelous plot:

Here are the summary statistics:

Series: dt
ARIMA(2,0,2)(0,0,1)[4] with zero mean

Coefficients:
ar1 ar2 ma1 ma2 sma1
1.4753 -0.5169 0.0519 0.1909 0.1539
s.e. 0.1201 0.1166 0.1236 0.0909 0.0726

sigma^2 estimated as 1528: log likelihood = -1314.11
AIC = 2640.22 AICc = 2640.55 BIC = 2661.53

In-sample error measures:
ME RMSE MAE MPE MAPE MASE
-0.02875014 39.08953811 22.33132382 -30.62351536 75.51060679 0.83683534

So it fitted the de-trended series to a ARMA(2,2). Lets take a look at what it fits regular ol' GDP to.

> fitx<-auto.arima(GDP)
> plot(forecast(fitx,h=25))
> summary(fitx)

Series: GDP
ARIMA(2,2,2)

Coefficients:
ar1 ar2 ma1 ma2
-0.078 0.4673 -0.3520 -0.5813
s.e. 0.158 0.0928 0.1611 0.1525

sigma^2 estimated as 1653: log likelihood = -1312.34
AIC = 2634.69 AICc = 2634.93 BIC = 2652.42

In-sample error measures:
ME RMSE MAE MPE MAPE MASE
3.7546824 40.4951109 22.4820107 0.1552383 0.7111309 0.3613070

So it fit it also to an ARMA(2,2), but we have that extra 2 there in the middle. That 2 was clearly not there when we fit for the trendless de-trended GDP and shows up to account for the quadratic trend of the regular ol' GDP series.

Located above is the plot of the h=25 step ahead forecast and below is some code for the actual values in list form

> fit7=forecast(fitx,h=20)
> list(fit7)

Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
65 Q3 15124.35 15072.25 15176.45 15044.67 15204.03
65 Q4 15245.80 15148.82 15342.78 15097.48 15394.11
66 Q1 15359.95 15215.32 15504.58 15138.76 15581.15
66 Q2 15475.10 15285.46 15664.74 15185.07 15765.12
66 Q3 15586.75 15352.89 15820.62 15229.09 15944.42
66 Q4 15699.15 15423.24 15975.05 15277.19 16121.11
67 Q1 15809.85 15493.02 16126.69 15325.30 16294.41
67 Q2 15921.03 15564.75 16277.32 15376.14 16465.92
67 Q3 16031.39 15636.52 16426.26 15427.49 16635.29
67 Q4 16142.03 15709.51 16574.56 15480.54 16803.52
68 Q1 16252.27 15782.66 16721.87 15534.07 16970.47
68 Q2 16362.67 15856.53 16868.81 15588.59 17136.74
68 Q3 16472.86 15930.53 17015.20 15643.44 17302.29
68 Q4 16583.15 16004.92 17161.39 15698.82 17467.48
69 Q1 16693.34 16079.38 17307.30 15754.37 17632.31
69 Q2 16803.58 16154.01 17453.15 15810.15 17797.01
69 Q3 16913.77 16228.64 17598.89 15865.96 17961.57
69 Q4 17023.98 16303.31 17744.65 15921.81 18126.15
70 Q1 17134.17 16377.91 17890.43 15977.57 18290.77
70 Q2 17244.37 16452.46 18036.29 16033.25 18455.50

Let's see if this model passes the Ljung-Box test for white noise.

> Box.test(fitx$resid,type="Ljung-Box")

Box-Ljung test

data: fitx$resid
X-squared = 0.0511, df = 1, p-value = 0.8212

No! It doesn't pass the test! AHHHHHH!!! WHAT ARE WE GOING TO DO!??!?! Find out in the next post or look it up, but whatever you do-

Keep Dancin'

Steven J.

Update: This model does actually pass the "Ljung-Box" test. Please read the next post for details.

The Dancing Economist

Pages

Saturday, August 27, 2011

Forecasting In R: The Greatest Shortcut That Failed The Ljung-Box

1 comment:

Disclaimer

Search The Dancing Economist