Introduction

Data Exploration

Both data are downloaded from the U.S. Bureau of Larbor Statistics.

Time Plot

We first look at the time plot of two series from January 1, 1978 to January 1, 2020.

  • The blue line shows the CPI for all urban consumers on alcoholic beverages increases with time.

  • Compared to CPI, the black line, unemployment rate, is more wiggly and exhibiting several spikes. We can know that the two spikes in 2001 and 2008 indicate the U.S. recession periods.

  • To examine the relationship between these two series, we need to decompose the trend, noise and business cycles and assess if the cycles match. The movement of business cycles may comtain low-frequency trends and high-frequency noise, here we use Loess smoothing to extract the trends, noise, and cycles component.

Decomposition

# Function to decompose
decomp_loess = function(response, time, start_time, freq = 12){
  low = ts(loess(response ~ time, span = .5)$fitted, start = start_time,
           frequency = freq)
  high = ts(response - loess(response ~ time, span = .1)$fitted, 
            start = start_time, frequency = freq)
  cycles = ts(response - low - high, start = start_time, frequency = freq)
  ts = ts.union(response, low, high, cycles)
  colnames(ts) = c("Observed", "Trend", "Noise", "Cycles")
  return(list(cycles, ts))
}

# Decompose CPI for alcoholic beverages
decomp_alc = decomp_loess(alc, time, 1978, freq = 12)
plot(decomp_alc[[2]], main = "Decomposition of CPI as trend + noise + cycles")

# Decompose unemployment rate
decomp_unemp = decomp_loess(unemp, time, 1978, freq = 12)
plot(decomp_unemp[[2]], main = "Decomposition of unemployment as trend + noise + cycles")

We combine the cycle components of CPI for alcoholic beverages and unemployment rate into the same plot.

The figure suggests that he detrended unemployment rate and the detrended CPI for alcoholic beverages cycle together.

Model

To confirm this hypothesis, we further analyze the \(A_{1:N}\) using a regression with ARMA errors model, \[A_n = \alpha + \beta u_n + \epsilon_n\], where {\(\epsilon_n\)} is a Gausian ARMA process.

AIC Table

We will use the AIC table to choose the parameters \(p\), \(q\) of ARMA(p,q) model.

# Function for AIC table
aic_table = function(data, p, q, xreg=NULL){
  table = matrix(NA, p+1, q+1)
  for(p in 0:p) {
    for(q in 0:q) {
      table[p+1,q+1] = arima(data, order = c(p,0,q), xreg = xreg)$aic
    }
  }
  dimnames(table) = list(paste("AR", 0:p, sep = ""), paste("MA", 0:q, sep = ""))
  return(table)
}

# AIC
aic_tbl = aic_table(decomp_alc[[1]], 3, 3, xreg = decomp_unemp[[1]])
##            MA0       MA1        MA2       MA3
## AR0  1907.6641  1218.982   536.1389  -125.888
## AR1  -544.4689 -1229.396 -1875.7121 -2345.931
## AR2 -2563.2860 -3137.285 -3346.6649 -3368.524
## AR3 -3073.7108 -3367.321 -3374.3532 -3374.362

The AIC table shows that ARMA(3,3) has the lowest AIC value followed by ARMA(3,2). Since smaller model is less likely to encounter issues like parameter redundancy, causality, and invertibility. We will leave out ARMA(3,3) and consider ARMA(3,2) for further discussion.

ARMA(3,2)

alcohol_cycles = decomp_alc[[1]]
unemploy_cycles = decomp_unemp[[1]]
arma32 = arima(alcohol_cycles, xreg = unemploy_cycles, order = c(3,0,2))
cat("Polyroot for AR Coefficient: ", abs(polyroot(c(1,-arma32$coef[1:3]))),
    "\nPolyroot for MA Coefficient: ", abs(polyroot(c(1,arma32$coef[4:5]))))
## Polyroot for AR Coefficient:  1.022995 1.022995 1.808804 
## Polyroot for MA Coefficient:  1.241376 3.804177

The result shows that ARMA(3,2) is causality and invertibility.

## 
## Call:
## arima(x = alcohol_cycles, order = c(3, 0, 2), xreg = unemploy_cycles)
## 
## Coefficients:
##          ar1      ar2     ar3     ma1     ma2  intercept  unemploy_cycles
##       2.5017  -2.0330  0.5283  1.0684  0.2118    -0.0063           0.8671
## s.e.  0.0595   0.1179  0.0590  0.0686  0.0699     0.2831           0.0903
## 
## sigma^2 estimated as 6.977e-05:  log likelihood = 1695.18,  aic = -3374.35

The standard errors suggest a statistically significant association between cyclical variation in unemployment and CPI for alcoholic beverages.

Likelihood Ratio Test

We can also test the significance of coefficients from the likelihood ratio test. Suppose we have two nested hypotheses \[ \begin{eqnarray*} H^{\langle 0\rangle} &:& \theta\in \Theta^{\langle 0\rangle}, \\ H^{\langle 1\rangle} &:& \theta\in \Theta^{\langle 1\rangle}, \end{eqnarray*} \] defined via two nested parameter subspaces, \(\Theta^{\langle 0\rangle}\subset \Theta^{\langle 1\rangle}\), with respective dimensions \(D^{\langle 0\rangle}< D^{\langle 1\rangle}\le D\).

We consider the log likelihood maximized over each of the hypotheses, \[ \begin{eqnarray*} \ell^{\langle 0\rangle} &=& \sup_{\theta\in \Theta^{\langle 0\rangle}} \ell(\theta), \\ \ell^{\langle 1\rangle} &=& \sup_{\theta\in \Theta^{\langle 1\rangle}} \ell(\theta). \end{eqnarray*} \] A useful approximation asserts that, under the hypothesis \(H^{\langle 0\rangle}\), \[ \ell^{\langle 1\rangle} - \ell^{\langle 0\rangle} \approx (1/2) \chi^2_{D^{\langle 1\rangle}- D^{\langle 0\rangle}}, \] where \(\chi^2_d\) is a chi-squared random variable on \(d\) degrees of freedom and \(d\) = 1 in our case.

We now test whether \(\alpha\) = 0 or not.

# test alpha
loglik = as.numeric(
  logLik(arima(decomp_alc[[1]], xreg = decomp_unemp[[1]], order = c(3,0,2))) 
  - logLik(arima(decomp_alc[[1]], xreg = decomp_unemp[[1]], order = c(3,0,2),
                 include.mean = FALSE))
)
cat("P-value: ", 1 - pchisq(2*loglik, df = 1))
## P-value:  0.9817706

The p-value of 0.9817706 suggests that we can not reject the null hypothesis. Hence, \(\alpha\) should be 0.

We also test whether \(\beta\) = 0 or not.

# test beta
loglik = as.numeric(
  logLik(arima(decomp_alc[[1]], xreg = decomp_unemp[[1]], order = c(3,0,2))) 
  - logLik(arima(decomp_alc[[1]], order = c(3,0,2))))

# p-value
cat("P-value: ", 1 - pchisq(2*loglik, df = 1))
## P-value:  0

The p-value of 0 suggests that we can reject the null hypothesis at level \(\alpha=.05\) and suggest that \(\beta \neq 0\).

ARMA(3,2) with α = 0

Based on the result, we adjust our model to \(A_n = \alpha + \beta u_n + \epsilon_n\), where {\(\epsilon_n\)} is a Gausian ARMA(3,2) process.

arma32_adj = arima(alcohol_cycles, xreg = unemploy_cycles, order = c(3,0,2),
                   include.mean = FALSE)
## 
## Call:
## arima(x = alcohol_cycles, order = c(3, 0, 2), xreg = unemploy_cycles, include.mean = FALSE)
## 
## Coefficients:
##          ar1      ar2     ar3     ma1     ma2  unemploy_cycles
##       2.5016  -2.0329  0.5282  1.0684  0.2117           0.8670
## s.e.  0.0595   0.1179  0.0590  0.0686  0.0699           0.0902
## 
## sigma^2 estimated as 6.977e-05:  log likelihood = 1695.18,  aic = -3376.35
## Polyroot for AR Coefficient:  1.022995 1.022995 1.808804 
## Polyroot for MA Coefficient:  1.241376 3.804177

All the coefficients are significant and the model remains causal and invertible.

Diagnostic

We proceed to check the residuals for the fitted model and inspect their sample autocorrelation.

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,0,2) with zero mean
## Q* = 319.99, df = 18, p-value < 2.2e-16
## 
## Model df: 6.   Total lags used: 24

Bootstrap Simulation

# set up
B = 500
ar = c(2.5016252, -2.0328500, 0.5282242)
ma = c(1.0684262, 0.2117450)
xreg_coef = 0.8669930
sigma = sqrt(arma32_adj$sigma2)

theta = matrix(NA, nrow = B, ncol = 6, 
               dimnames = list(NULL, names(coef(arma32_adj))))
s = rep(NA, B)

# Bootstrap B times
for(b in 1:B){
  x = ts(arima.sim(list(ar, ma), n = length(alcohol_cycles), sd = sigma),
         start = start_tm, frequency = 12) + xreg_coef*unemploy_cycles
  mod = arima(x, order = c(3, 0, 2), xreg = unemploy_cycles, include.mean = F)
  theta[b, ] = coef(mod)
  s[b] = var(mod$residuals)
}

# s.e.
sqrt(diag(var(theta)))
##             ar1             ar2             ar3             ma1 
##    0.7397657951    0.6299757690    0.0580636189    0.7482964335 
##             ma2 unemploy_cycles 
##    0.6432908230    0.0003844257
## Bootstrap Confidence Interval:
##                       2.5%     97.5%
## ar1             -1.5317233 1.5621234
## ar2             -1.0155861 0.8179436
## ar3             -0.1135062 0.1066840
## ma1             -1.6047708 1.5429964
## ma2             -0.8580770 0.9999945
## unemploy_cycles  0.8662586 0.8677405
## Fisher Information:
##                        2.5%      97.5%
## ar1              2.38509992  2.6181504
## ar2             -2.26396832 -1.8017317
## ar3              0.41260831  0.6438400
## ma1              0.93398091  1.2028714
## ma2              0.07469427  0.3487957
## unemploy_cycles  0.69026225  1.0437238

Unfortunately, the result is not robust.

Conclusion

Reference