Licensed under the Creative Commons attribution-noncommercial license, http://creativecommons.org/licenses/by-nc/3.0/. Please share and remix noncommercially, mentioning its origin.



Objectives




Historical investigations linking business cycles to mortality




Some data and a fitted model

# Annual USA life expectancy at birth: female (e0F), male (e0M), combined (e0)
e_data <- read.table(file="life_expectancy_usa.txt",header=TRUE)
head(e_data)
##   Obs   Yr   e0F   e0M    e0
## 1   1 1933 62.78 59.17 60.88
## 2   2 1934 62.34 58.34 60.23
## 3   3 1935 63.04 58.96 60.89
## 4   4 1936 62.60 58.35 60.35
## 5   5 1937 63.37 59.00 61.05
## 6   6 1938 64.54 60.45 62.39

  • To measure fluctuations in economic growth, we use [monthly national USA unemployment figures](http://data.bls.gov/timeseries/LNU04000000) published by the Bureau of Labor Statistics.

# Monthly unadjusted USA unemployment percentages, one row per year (Jan-Dec)
u_data <- read.table(file="unadjusted_unemployment.csv",sep=",",header=TRUE)
head(u_data)
##   Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1 1948 4.0 4.7 4.5 4.0 3.4 3.9 3.9 3.6 3.4 2.9 3.3 3.6
## 2 1949 5.0 5.8 5.6 5.4 5.7 6.4 7.0 6.3 5.9 6.1 5.7 6.0
## 3 1950 7.6 7.9 7.1 6.0 5.3 5.6 5.3 4.1 4.0 3.3 3.8 3.9
## 4 1951 4.4 4.2 3.8 3.2 2.9 3.4 3.3 2.9 3.0 2.8 3.2 2.9
## 5 1952 3.7 3.8 3.3 3.0 2.9 3.2 3.3 3.1 2.7 2.4 2.5 2.5
## 6 1953 3.4 3.2 2.9 2.8 2.5 2.7 2.7 2.4 2.6 2.5 3.2 4.2
t <- intersect(e_data$Yr,u_data$Year)   # years present in both datasets
e <- e_data$e0[e_data$Yr %in% t]        # combined life expectancy at birth
u <- apply(u_data[u_data$Year %in% t, 2:13],1,mean)   # annual mean of the monthly unemployment figures
plot(ts(cbind(e,u),start=1948),main="Percent unemployment (u) and life expectancy (e) for USA",xlab="Year")

Figure 1. Percent unemployment (u) and life expectancy at birth (e) for the USA.

To separate each series into a slowly varying trend and fluctuations around it, we use the Hodrick-Prescott (HP) filter: for a time series \({y^*_{1:N}}\), the HP trend \({s^*_{1:N}}\) is defined by

\[{s_{1:N}^*} = \arg\min_{s_{1:N}} \left\{ \sum^{N}_{n=1}\big({y_n^*}-s_{n}\big)^2 + \lambda\sum^{N-1}_{n=2}\big(s_{n+1}-2s_{n}+s_{n-1}\big)^2 \right\},\]

where the smoothing parameter \(\lambda\) penalizes curvature in the trend; \(\lambda=100\) is a standard choice for annual data, corresponding to freq=100 in the code below.
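The minimizer has a closed form: setting the gradient of this objective to zero gives \((I + \lambda D^\top D)\,s = y^*\), where \(D\) is the \((N-2)\times N\) second-difference matrix. A minimal sketch of a direct check follows; the helper hp_cycle is our own, under the assumption that hpfilter with type="lambda" solves exactly this objective.

# Direct solution of the HP objective: solve (I + lambda * D'D) s = y,
# then take the cycle as data minus trend. hp_cycle is a hypothetical
# helper, not part of the mFilter package.
hp_cycle <- function(y, lambda=100) {
  N <- length(y)
  D <- diff(diag(N), differences=2)   # (N-2) x N second-difference matrix
  trend <- solve(diag(N) + lambda * crossprod(D), y)
  y - trend
}
# This should agree, up to numerical tolerance, with
# hpfilter(e, freq=100, type="lambda", drift=FALSE)$cycle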

require(mFilter)
# HP filter with smoothing parameter lambda=100, a standard choice for annual data
e_hp <- hpfilter(e, freq=100,type="lambda",drift=FALSE)$cycle
u_hp <- hpfilter(u, freq=100,type="lambda",drift=FALSE)$cycle
plot(t,u_hp,type="l",xlab="Year",ylab="")   # left axis: detrended unemployment
par(new=TRUE)                               # overlay the second series on the same plot
plot(t,e_hp,col="red",type="l",axes=FALSE,xlab="",ylab="")   # right axis: detrended life expectancy
axis(side=4, col="red")

Figure 2. Detrended unemployment (black; left axis) and detrended life expectancy at birth (red; right axis).


[M1] \(\quad\quad E^{HP}_n = \alpha + \beta\, u^{HP*}_n + \epsilon_n\),


where \(\{\epsilon_n\}\) is a Gaussian ARMA process. We use an ARMA(1,0) model, as discussed in the supplementary analysis.

arima(e_hp,xreg=u_hp,order=c(1,0,0))   # fit model [M1] with ARMA(1,0) errors
## 
## Call:
## arima(x = e_hp, order = c(1, 0, 0), xreg = u_hp)
## 
## Coefficients:
##          ar1  intercept    u_hp
##       0.4816    -0.0038  0.0678
## s.e.  0.1082     0.0320  0.0186
## 
## sigma^2 estimated as 0.01859:  log likelihood = 37.72,  aic = -67.44


\(H_0\): There is no association. We suppose model [M1] applies with \(\beta=0\).


\(H_1\): Some association is present. Model [M1] applies with \(\beta\neq 0\).


We can also compute the ratio of the likelihoods, maximized under \(H_1\) and \(H_0\), to construct the likelihood ratio test. Under the null hypothesis \(H_0\), twice the log likelihood ratio has an approximate chi-squared distribution with one degree of freedom.

log_lik_ratio <- as.numeric(
   logLik(arima(e_hp,xreg=u_hp,order=c(1,0,0))) -   # maximized under H1
   logLik(arima(e_hp,order=c(1,0,0)))               # maximized under H0
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)

For comparison, the Fisher information standard errors reported above give a \(z\)-statistic of \(0.0678/0.0186 \approx 3.6\) for the coefficient on detrended unemployment. As an informal check on the stability of the result, we can also refit the model using only the first 45 years of data:

tt <- 1:45   # the first 45 years of the 1948-2013 sample
arima(e_hp[tt],xreg=u_hp[tt],order=c(1,0,0))
## 
## Call:
## arima(x = e_hp[tt], order = c(1, 0, 0), xreg = u_hp[tt])
## 
## Coefficients:
##          ar1  intercept  u_hp[tt]
##       0.4927     0.0027    0.0809
## s.e.  0.1291     0.0417    0.0229
## 
## sigma^2 estimated as 0.02103:  log likelihood = 22.9,  aic = -37.8




Conclusions




Some supplementary analysis

Model selection by Akaike’s information criterion (AIC)

# Tabulate the AIC for regression with ARMA(p,q) errors, for p=0,...,P and q=0,...,Q
aic_table <- function(data,P,Q,xreg=NULL){
  table <- matrix(NA,(P+1),(Q+1))
  for(p in 0:P) {
    for(q in 0:Q) {
       table[p+1,q+1] <- arima(data,order=c(p,0,q),xreg=xreg)$aic
    }
  }
  dimnames(table) <- list(paste("AR",0:P,sep=""),paste("MA",0:Q,sep=""))
  table
}
e_aic_table <- aic_table(e_hp,4,5,xreg=u_hp)
require(knitr)
kable(e_aic_table,digits=2)
        MA0     MA1     MA2     MA3     MA4     MA5
AR0  -52.36  -64.26  -65.64  -66.97  -66.60  -64.78
AR1  -67.44  -65.44  -64.78  -65.96  -71.30  -69.30
AR2  -65.44  -63.50  -64.53  -64.69  -69.30  -69.34
AR3  -64.75  -71.32  -63.31  -63.22  -67.56  -71.26
AR4  -67.00  -66.28  -65.10  -71.19  -68.32  -69.85
  • This suggests that the model with ARMA(1,0) errors is the best small model.

  • There are some larger models with impressive AIC values. For example, let’s look at the fitted model with ARMA(3,1) errors.

arima(e_hp,xreg=u_hp,order=c(3,0,1))
## 
## Call:
## arima(x = e_hp, order = c(3, 0, 1), xreg = u_hp)
## 
## Coefficients:
##          ar1      ar2      ar3      ma1  intercept    u_hp
##       1.3092  -0.3168  -0.1674  -1.0000    -0.0011  0.0764
## s.e.  0.1203   0.1981   0.1229   0.0406     0.0043  0.0187
## 
## sigma^2 estimated as 0.0153:  log likelihood = 42.66,  aic = -71.32
  • We have a non-invertible solution: the fitted MA(1) coefficient is \(-1\), so the root of the MA polynomial lies on the unit circle. Nevertheless, the regression coefficient for \(u^{HP*}_{1:N}\) is apparently statistically significant (according to the Fisher information standard errors).

  • Probably, the ARMA(3,1) analysis is not very stable, meaning that we might not expect to find a similar fit for similar data.

  • We can find inconsistencies in the AIC table. For nested models, adding a parameter cannot decrease the maximized log likelihood, so mathematically adding one parameter cannot increase the AIC by more than two units. Yet the table contains larger jumps: for example, moving from ARMA(3,1) errors to ARMA(3,2) errors raises the AIC from \(-71.32\) to \(-63.31\), an increase of \(8.01\) units. Such evidence of numerical problems in evaluating and/or maximizing the ARMA likelihood is common, and it suggests caution about using AIC to choose relatively large models. Both issues noted above can be checked directly, as sketched below.
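The following sketch checks both problems, reusing the e_aic_table object computed above; the name fit31 and the use of polyroot for the invertibility check are our own additions.

# Invertibility check: the root of the fitted MA polynomial 1 + theta*z
# should lie strictly outside the unit circle; a modulus of 1 confirms
# the non-invertible boundary solution.
fit31 <- arima(e_hp, xreg=u_hp, order=c(3,0,1))
abs(polyroot(c(1, coef(fit31)["ma1"])))

# Consistency check: one step right along a row (adding an MA coefficient)
# or down a column (adding an AR coefficient) cannot legitimately increase
# the AIC by more than 2 units; any hits below indicate numerical problems.
which(t(apply(e_aic_table, 1, diff)) > 2, arr.ind=TRUE)   # MA steps
which(apply(e_aic_table, 2, diff) > 2, arr.ind=TRUE)      # AR steps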




Residual analysis

  • We should check the residuals for the fitted model, and look at their sample autocorrelation.
r <- resid(arima(e_hp,xreg=u_hp,order=c(1,0,0)))   # residuals from model [M1] with ARMA(1,0) errors
plot(r)

  • There is some evidence for fluctuations decreasing in amplitude over time. This is an example of heteroskedasticity. It is not extreme here, but could be studied in a future analysis.
acf(r)

  • It is not disastrous to have one of 18 lags fall narrowly outside the dashed lines: these show pointwise acceptance regions at the 5% level under a null hypothesis of Gaussian white noise, so roughly one lag in twenty is expected to cross them by chance alone (see also the portmanteau check after this list).

  • The presence of some small amount of sample autocorrelation is consistent with the AIC table, which finds the possibility of small gains by fitting some larger models to the regression errors.
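Rather than inspecting each lag separately, one can aggregate the evidence across lags with a portmanteau test. A minimal sketch, which is our addition rather than part of the original analysis:

# Ljung-Box test on the first 18 residual autocorrelations; fitdf=1
# adjusts the degrees of freedom for the estimated AR(1) coefficient.
Box.test(r, lag=18, type="Ljung-Box", fitdf=1)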




Analysis of temporal differences

  • One might model annual changes in life expectancy, rather than difference from a trend. In this case, we consider the variable \[ \Delta {e_n^*} = {e_n^*} - {e_{n-1}^*}.\]

  • We compute this as

delta_e <- e - e_data$e0[e_data$Yr %in% (t-1)]   # first difference: e(t) - e(t-1)
plot(t,u,type="l",xlab="Year",ylab="")   # left axis: unemployment
par(new=TRUE)
plot(t,delta_e,col="red",type="l",axes=FALSE,xlab="",ylab="")   # right axis: differenced life expectancy
axis(side=4,col="red")

Figure 3. Unemployment (black; left axis) and differenced life expectancy (red; right axis).

  • The relationship between unemployment and differenced life expectancy is harder to see than when HP-detrended.

  • The relationship is also harder to find by statistical methods. For example,

arima(delta_e,xreg=u,order=c(1,0,1))
## 
## Call:
## arima(x = delta_e, order = c(1, 0, 1), xreg = u)
## 
## Coefficients:
##         ar1      ma1  intercept       u
##       0.746  -0.6354     0.0804  0.0187
## s.e.  0.232   0.2451     0.1119  0.0184
## 
## sigma^2 estimated as 0.03895:  log likelihood = 13.43,  aic = -16.86
  • Here, we do not see statistically convincing evidence for the relationship: the estimated coefficient on unemployment (0.0187) barely exceeds its standard error (0.0184).

  • A scientific principle for interpreting experimental results is as follows: An experiment which finds evidence of an effect is usually a better foundation for future investigations than one which fails to find evidence. The experiment which found no evidence of an effect might have been a bad choice of experiment, or might have been carried out poorly.

  • The principle of giving preference to methods giving positive results must be balanced with consideration of multiple testing. There is a danger in trying many analyses and settling on the first that claims statistical significance. The generalizability of any result is tentative until confirmed in other studies.

  • It appears that temporal differencing has destroyed too much of the evidence we were hoping it would clarify.




Exploratory analysis using correlations and linear detrending

  • A simple analysis is to remove a linear trend from both the life expectancy and unemployment time series, and compute the correlation coefficient. The correlation coefficient provides a widely used numerical summary of an association. The familiarity and simplicity of the correlation coefficient is convenient for comparing results across a range of time periods and population subgroups.

  • Since the detrended series have non-negligible autocorrelation, the standard p-value associated with the correlation coefficient is unreliable as a formal test. Nevertheless, it provides a simple measure that helps us see patterns in a table such as the following. The source code for this table is not included in this document, but you can see it in the source file (report.Rmd) or in the extracted R code file (report.R); a sketch of the computation for one entry follows the table.

Correlation between linearly detrended USA national unemployment and life expectancy at birth for different time periods and populations.
            National    Male      Female
1948-2013   0.42***     0.11      0.47***
1948-1959   0.19        0.098     0.21
1950-1969   0.35        0.25      0.49*
1960-1979   0.70***     0.52*     0.71***
1970-1989   0.67**      0.67**    0.66**
1980-1999   0.80***     0.21      0.85***
1990-2009   0.31       -0.18      0.37
2000-2013   0.50†       0.74**    0.33
*P<0.05, **P<0.01, ***P<0.001, †P<0.1, using the usual t-test that ignores autocorrelation.
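For concreteness, here is a minimal sketch of how one entry of this table might be computed. The function detrended_cor is our reconstruction, not the actual code in report.Rmd; it uses the e, u, and t variables defined above, and cor.test reports the usual t-test p-value, which ignores the autocorrelation of the detrended series.

# Correlation between linearly detrended life expectancy and unemployment
# over a given window of years.
detrended_cor <- function(y1, y2, years, window) {
  keep <- years >= window[1] & years <= window[2]
  r1 <- resid(lm(y1[keep] ~ years[keep]))   # remove a linear trend in time
  r2 <- resid(lm(y2[keep] ~ years[keep]))
  cor.test(r1, r2)                          # estimate and (naive) p-value
}
detrended_cor(e, u, t, c(1960, 1979))       # e.g., the 1960-1979 national entry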



