t-tests versus z-tests. In Question 6.6 of the notes, we use a normal approximation for the statistic \(\hat\beta \big/\mathrm{SE}(\hat\beta)\). When carrying out linear regression analysis, it is good and customary practice to use Student’s t distribution instead. Should we do that here? What are the arguments for and against it? Think about the justification for the t-test versus the justification for the z-test.
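As a reminder of the practical stakes, the gap between the t and normal quantiles shrinks quickly with sample size. A small base-R sketch; the degrees of freedom below are hypothetical values chosen only to illustrate the comparison, not tied to any particular regression in the notes:

```r
## Compare 97.5% quantiles of Student's t and the standard normal at a
## few illustrative (hypothetical) degrees of freedom.
for (df in c(10, 30, 100)) {
  cat("df =", df,
      "  t:", round(qt(0.975, df), 3),
      "  normal:", round(qnorm(0.975), 3), "\n")
}
```

For the sample sizes typical of time series regression, the two quantiles are close; the more interesting question is whether the t distribution's small-sample justification (exact under i.i.d. Gaussian errors) applies at all when the errors are serially dependent.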
The multiplicative structure for SARIMA. Question 6.2 raised the issue of whether there is a scientific reason to think that practical models for seasonal phenomena should have a product structure to their ARMA polynomials, leading to a preference for [S3] over [S2] that goes beyond methodological convenience. Can you suggest a reason, or alternatively suggest a specific case where a non-multiplicative model like [S2] makes more sense?
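One concrete way to compare the two forms is to multiply out the polynomials: a multiplicative model is a special case of the non-multiplicative one, with the high-order cross term constrained to be the product of the lower-order coefficients. A base-R sketch, using arbitrary illustrative coefficients 0.5 and 0.3:

```r
## Expand the multiplicative SAR polynomial (1 - 0.5 x)(1 - 0.3 x^12)
## into its equivalent order-13 AR polynomial. The coefficients 0.5 and
## 0.3 are arbitrary illustrative choices.
p1 <- c(1, -0.5)                # 1 - 0.5 x
p2 <- c(1, rep(0, 11), -0.3)    # 1 - 0.3 x^12
prod_poly <- convolve(p1, rev(p2), type = "open")
round(prod_poly, 3)             # coefficients of x^0, x^1, ..., x^13
```

The only nonzero coefficients are at lags 0, 1, 12 and 13, and the lag-13 coefficient (0.15) is forced to equal the product of the lag-1 and lag-12 coefficients. A non-multiplicative model like [S2] frees that constraint at the price of an extra parameter.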
Annual cycles modeled by local AR(2) vs seasonal SAR(1). The following code shows that monthly SAR models have an ACF with peaks at multiples of 12 lags, with no correlation elsewhere. By contrast, an AR(2) model can have an oscillating ACF with period 12 months, as described in Chapter 4. How does this help us interpret the residuals on Slide 10 of Chapter 6?
library(astsa)
set.seed(123)
## AR(2) chosen to give a damped oscillation with period 12
omega <- 2*pi/12
y1 <- sarima.sim(ar=c(2,-1)/(1+omega^2))
acf(y1,lag.max=50)
## Seasonal AR(1) with period 12: correlation only at multiples of 12 lags
y2 <- sarima.sim(sar=c(0.6),S=12)
acf(y2,lag.max=50)
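The simulated ACFs above can be cross-checked against the exact theoretical ACFs using stats::ARMAacf, writing the SAR(1) model as an AR(12) with a single nonzero coefficient (a sketch using the same coefficients as the chunk above):

```r
## Theoretical ACFs for the two models above (no Monte Carlo noise).
omega <- 2*pi/12
acf_ar2 <- ARMAacf(ar = c(2, -1)/(1 + omega^2), lag.max = 50)
## SAR(1) with S = 12 and coefficient 0.6, as a sparse AR(12)
acf_sar <- ARMAacf(ar = c(rep(0, 11), 0.6), lag.max = 50)
acf_sar[c("6", "12", "24")]  # zero off the seasonal lags; 0.6 and 0.36 on them
acf_ar2[c("6", "12", "24")]  # damped oscillation: negative at lag 6
```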
## Four trend functions on n = 0,...,100, two with added white noise
n <- 0:100
set.seed(42)
epsilon <- rnorm(n=length(n),mean=0,sd=1)
y1 <- 2*n/100 + epsilon          # linear trend plus noise
y2 <- exp(1.5*n/100) + epsilon   # exponential trend plus noise
y3 <- sin(2*pi*n/200)            # half a sine cycle, no noise
y4 <- n*(100-n)/2500             # parabolic arch, no noise
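The unit root tests reported below use tseries::adf.test with its default settings; the calls are along these lines (repeating the simulation so the chunk runs on its own):

```r
library(tseries)               # provides adf.test
## Repeat the simulation from the chunk above
n <- 0:100
set.seed(42)
epsilon <- rnorm(length(n))
y1 <- 2*n/100 + epsilon
y2 <- exp(1.5*n/100) + epsilon
y3 <- sin(2*pi*n/200)
y4 <- n*(100 - n)/2500
adf.test(y1)
adf.test(y2)
adf.test(y3)
adf.test(y4)
```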
Using the default settings of adf.test in the tseries R package, y1, y2 and y3 are found to have clear evidence against the unit root null hypothesis, and are therefore determined to be stationary. The trends in y1 and y2 are highly significant if one fits a simple linear model using lm(). Evidently, ADF is not sensitive to this non-stationarity, at least using the default tuning parameters. y3 is found to have extremely strong evidence for stationarity (using the common abuse of language when there is strong evidence against the null), with an ADF statistic of order \(10^{13}\). However, a rather similar looking function, y4, is found to be compatible with the unit root null hypothesis.
##
## Augmented Dickey-Fuller Test
##
## data: y1
## Dickey-Fuller = -4.0813, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
##
## Augmented Dickey-Fuller Test
##
## data: y2
## Dickey-Fuller = -3.656, Lag order = 4, p-value = 0.03161
## alternative hypothesis: stationary
##
## Augmented Dickey-Fuller Test
##
## data: y3
## Dickey-Fuller = -1.251e+13, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
##
## Augmented Dickey-Fuller Test
##
## data: y4
## Dickey-Fuller = -1.438, Lag order = 4, p-value = 0.8096
## alternative hypothesis: stationary
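The claim above about the significance of the fitted trend can be checked directly, regenerating the same simulated series:

```r
## Regenerate y1 from the earlier chunk and test its linear trend with lm().
n <- 0:100
set.seed(42)
epsilon <- rnorm(length(n))
y1 <- 2*n/100 + epsilon
fit <- lm(y1 ~ n)
summary(fit)$coefficients["n", ]  # slope estimate, SE, t value, p-value
```

The slope is highly significant, even though adf.test on the same series reports strong evidence for stationarity.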
The ADF approach is to argue that a time series is well modeled as stationary by rejecting the hypothesis that it is well modeled as a unit-root Gaussian autoregressive process. A problem here is that many time series models are in neither of these categories, a common example being a nonlinear trend with stationary noise. One has to be rather careful about treating rejection of the null as evidence for the alternative. This is a common issue with all inference, Bayesian and frequentist: if all models under consideration are silly, then the inference is silly. Do you think that the ADF test leads to silly conclusions in this case? How do you know whether it is appropriate for your data analysis, and how do you convince others of this?