1.Introduction



2.Data and Model Selection

##install.packages("devtools")
library(devtools)
##install_github('quandl/R-package')
library(Quandl)
##install.packages("mFilter")
library(mFilter)
##install.packages("forecast")
library(forecast)
##read data "yield" and "sp500"
mydata=Quandl("FED/SVENY",start_date="2011-02-26")
yield=mydata$SVENY30-mydata$SVENY01
sp500=Quandl("SPDJ/SPX",start_date="2011-02-26")
t <- intersect(mydata$Date,sp500$`Effective date`)
yield <- yield[mydata$Date %in% t]
sp <- sp500$`S&P 500`[sp500$`Effective date` %in% t]
t=as.Date(t)

plot(as.Date(t),yield,type="l",col="blue",xlab="Year",ylab="",ylim=c(1.0,9),
     main="Yield curve and S&P 500 Index")
par(new=TRUE)
plot(as.Date(t),sp,col="red",type="l",axes=FALSE,xlab="",ylab="")
axis(side=4, col="red")
legend("topleft",legend=c("Yield","S&P 500"),col=c("blue","red"),
       cex=0.8,lty=1,bty="n")

spectrum(yield,spans=c(3,5,3), main="Smoothed periodogram of Yield")

spectrum(sp,spans=c(3,5,3), main="Smoothed periodogram of S&P 500")

yield_hp <- hpfilter(yield, freq=12960000,type="lambda",drift=F)$cycle
sp500_hp <- hpfilter(sp, freq=12960000,type="lambda",drift=F)$cycle

plot(t,yield_hp,type="l",xlab="Year",ylab="",col="blue",
     main="Detrended yield (blue; left axis) and detrended S&P 500 (red; right axis)")
par(new=TRUE)
plot(t,sp500_hp,col="red",type="l",axes=FALSE,xlab="",ylab="")
axis(side=4, col="red")
legend("bottom",legend=c("Yield","S&P 500"),col=c("blue","red"),
       cex=0.8,lty=1,bty="n")

arima(yield_hp,xreg=sp500_hp,order=c(1,0,0))
## 
## Call:
## arima(x = yield_hp, order = c(1, 0, 0), xreg = sp500_hp)
## 
## Coefficients:
##          ar1  intercept  sp500_hp
##       0.9642    -0.0056    0.0016
## s.e.  0.0075     0.0365    0.0001
## 
## sigma^2 estimated as 0.002232:  log likelihood = 2048.79,  aic = -4089.58
log_lik_ratio <- as.numeric(
  logLik(arima(yield_hp,xreg=sp500_hp,order=c(1,0,0))) -
    logLik(arima(yield_hp,order=c(1,0,0)))
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
plot(as.Date(t),yield_hp,type="l",xlab="Year",ylab="",col="blue",
     main="Yield and fitted value")
par(new=TRUE)
plot(as.Date(t),fitted(arima(yield_hp,xreg=sp500_hp,order=c(1,0,0))),col="red",type="l",lty=2,axes=FALSE,xlab="",ylab="")



3.Supplementary analysis

3.1 Model Selection by AIC

aic_table <- function(data,P,Q,xreg=NULL){
  table <- matrix(NA,(P+1),(Q+1))
  for(p in 0:P) {
    for(q in 0:Q) {
      table[p+1,q+1] <- arima(data,order=c(p,0,q),xreg=xreg)$aic
    }
  }
  dimnames(table) <- list(paste("<b> AR",0:P, "</b>", sep=""),paste("MA",0:Q,sep=""))
  table
}
e_aic_table <- aic_table(yield_hp,4,5,xreg=sp500_hp)
require(knitr)
kable(e_aic_table,digits=2)
MA0 MA1 MA2 MA3 MA4 MA5
AR0 -783.14 -2095.25 -2813.78 -3223.26 -3477.90 -3610.37
AR1 -4089.58 -4089.49 -4089.90 -4088.01 -4090.00 -4090.06
AR2 -4089.32 -4089.27 -4090.76 -4088.77 -4084.92 -4088.47
AR3 -4089.66 -4087.73 -4086.19 -4091.56 -4088.28 -4086.96
AR4 -4087.71 -4085.82 -4086.81 -4082.90 -4053.45 -4086.93
  • This suggests that the model with ARMA(1,0) error is the best small model. We have also try some larger models with impressive AIC values. But the coefficients of higher-order show that the impact is rather weak, so we decide to use this smaller model to avoid overfitting.

3.2 Residual analysis

  • We should check the residuals for the fitted model, and look at their sample autocorrelation. From the results, we see that fluctuations changing in amplitude over time, which might be evidence for non-constant variance. It is not severe here but can be studied in further research since most financial time series are modeled by GARCH model. The ACF looks close to that of white noise. Although the Q-Q plot shows a sign of long-tailed, the model is still acceptable.
r <- resid(arima(yield_hp,xreg=sp500_hp,order=c(1,0,0)))
plot(r)

acf(r,lag=100)

qqnorm(r)
qqline(r)

3.3 Extracting cycles using a band pass filter

  • We are going to analyze the association by another filter, which uses local regression smoother. For the yield and S&P 500 data, high frequency variation might be considered “noise” and low frequency variation might be considered trend. A band of mid-range frequencies might be considered to correspond to the cycle component. Thus we decompose both yield and S&P 500, and extract cycle components for further study.
t=as.numeric(t)
u_low <- ts(loess(yield~t,span=0.05)$fitted,start = 2011,frequency=312)
u_hi <- ts(yield - loess(yield~t,span=0.1)$fitted,start = 2011,frequency=312)
u_cycles <- yield - u_hi - u_low
plot(ts.union(yield, u_low,u_hi,u_cycles),
     main="Decomposition of yield as trend + noise + cycles")

sp_low <- ts(loess(sp~t,span=0.05)$fitted,start = 2011,frequency=312)
sp_hi <- ts(sp - loess(sp~t,span=0.1)$fitted,start = 2011,frequency=312)
sp_cycles <- sp - sp_hi - sp_low
plot(ts.union(sp, sp_low,sp_hi,sp_cycles),
     main="Decomposition of sp500 as trend + noise + cycles")

  • After decomposing, we fit a regression with ARMA(2,1) errors model of the cycle components of yield and S&P 500. Comparing with the results of HP-filter, band pass filter looks smoother and extracts the cycle components clearer. We see clearly these two variables cycle together.
plot(as.Date(t),u_cycles,type="l",xlab="Year",ylab="",col="blue",ylim=c(-0.2,0.5),
     main="Detrended yield (blue; left axis) and detrended S&P 500 (red; right axis)")
par(new=TRUE)
plot(as.Date(t),sp_cycles,col="red",type="l",axes=FALSE,xlab="",ylab="",ylim=c(-81,90))
axis(side=4, col="red")
legend("topleft",legend=c("Yield","S&P 500"),col=c("blue","red"),
       cex=0.8,lty=1,bty="n")

arima(u_cycles,xreg=sp_cycles,order=c(2,0,1))
## 
## Call:
## arima(x = u_cycles, order = c(2, 0, 1), xreg = sp_cycles)
## 
## Coefficients:
##          ar1      ar2     ma1  intercept  sp_cycles
##       1.9615  -0.9752  -0.732     0.0002      1e-03
## s.e.  0.0072   0.0071   0.020     0.0022      1e-04
## 
## sigma^2 estimated as 1.527e-05:  log likelihood = 5174.81,  aic = -10337.62
  • This model is causal and invertible. The standard errors also suggest a statistically significant positive association between cyclical variation in yield and S&P 500.And from the true value vs. fitted value plot, we see this model fits very well.
plot(as.Date(t),u_cycles,type="l",xlab="Year",ylab="",col="blue",
     main="Yield and fitted value")
par(new=TRUE)
plot(as.Date(t),fitted(arima(u_cycles,xreg=sp_cycles,order=c(2,0,1))),col="red",type="l",lty=2,axes=FALSE,xlab="",ylab="")

  • From the residual analysis, heteroskedasticity seems to be very serious. Log-transformation has also been applied later, but it does not improve the situation any more. And the ACF shows there are significant correlations among the first 30 lags. The Q-Q plot suggests long-tailed distribution of residuals. But since the residuals are all within (-0.2,0.2), which are relatively small, residuals might not be a serious problem and this model is also acceptable.
r <- resid(arima(u_cycles,xreg=sp_cycles,order=c(2,0,1)))
plot(r)

acf(r,lag=100)

qqnorm(r)
qqline(r)



4.Conclusion



5.Reference

[1] Constable, Simon and Wright, Robert E. 2011. The Wall Street Journal guide to the 50 economic Indicators that Really Matter: From Big Macs to “Zombie Banks,” the Indicators Smart Investor Watch to Beat the Market

[2] Edward Ionides. 2016. Case study: An association between unemployment and mortality? http://ionides.github.io/531w16/notes10/notes10.html#some-supplementary-analysis

[3] Jeff DeMaso. 2012. The Yield Curve and Equity Markets. http://adviserinvestments.com/pdf/contentmgmt/Analyst-Spotlight-Yield-Curve-and-Equity-Markets.pdf