STATS 531 HW3 Solution

1.Explore the data

This is open-ended problem. An example of the analysis is as follows. First we read in the data. It is the Ann Arbor January Low temperature time series. The data set consists of 115 observations and 12 variables. It records the weather change from 1900 to 2014. There is one missing value within the Low variable, so we delete it before analyzing the data.

x <- read.table(file="http://ionides.github.io/531w16/intro/ann_arbor_weather.csv",header=TRUE)
x1 <- na.omit(x)

The five number summary of the lowest temperature in Fahrenheit for each year are as follows.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -22.000  -7.750  -3.000  -2.833   2.000  19.000

Then we plot the data over time to study its pattern. The blue line represents the mean for this time series.

The variation of the data seems to increase a bit during the last 35 years but there is no visible trend. Now we analyze this time series.

2.Fitting an ARMA model

Let’s start by fitting a stationary ARMA\((p,q)\) model under the null hypothesis that there is no trend. This hypothesis, which asserts that nothing has substantially changed in this system over the last 150 years, is not entirely unreasonable from looking at the data.
We seek to fit a stationary Gaussian ARMA(p,q) model with parameter vector \(\theta=({\phi}_{1:p},{\psi}_{1:q},\mu,\sigma^2)\) given by \[ {\phi}(B)(X_n-\mu) = {\psi}(B) \epsilon_n,\] where \[\begin{eqnarray} \mu &=& {\mathbb{E}}[X_n] \\ {\phi}(x)&=&1-{\phi}_1 x-\dots -{\phi}_px^p, \\ {\psi}(x)&=&1+{\psi}_1 x+\dots +{\psi}_qx^q, \\ \epsilon_n&\sim&\mathrm{ iid }\, N[0,\sigma^2]. \end{eqnarray}\]
We need to decide where to start in terms of values of \(p\) and \(q\). Viewed as a way to select a model with reasonable predictive skill from a range of possibilities.
Akaike’s information criterion AIC is given by \[ AIC = -2 \times {\ell}({\theta^*}) + 2D\]

AIC is often useful. Let’s tabulate some AIC values for a range of different choices of \(p\) and \(q\).

	MA0	MA1	MA2	MA3	MA4	MA5
AR0	780.92	782.90	784.52	785.84	787.83	789.41
AR1	782.90	784.27	785.86	787.81	789.81	791.21
AR2	784.51	785.89	785.94	789.64	791.60	787.10
AR3	785.65	787.61	789.56	786.39	790.84	789.09
AR4	787.64	789.61	791.52	793.38	792.84	794.37

From the AIC table, I select ARMA(0,0), ARMA(1,0), and ARMA(0,1) as candidates. However, from the results below, we can check that ar1, ma1 parameters have a small fitted value that is inside the unit circle. Moreover, their standard errors seem to be large compared to the coefficient estimates. Thus, I choose ARMA(0,0) model to analyze this dataset.

ar1 <- arima(x1$Low, order=c(1,0,0))
ar1

## 
## Call:
## arima(x = x1$Low, order = c(1, 0, 0))
## 
## Coefficients:
##          ar1  intercept
##       0.0217    -2.8363
## s.e.  0.0943     0.6992
## 
## sigma^2 estimated as 53.34:  log likelihood = -388.43,  aic = 782.86

ma1 <- arima(x1$Low, order=c(0,0,1))
ma1

## 
## Call:
## arima(x = x1$Low, order = c(0, 0, 1))
## 
## Coefficients:
##          ma1  intercept
##       0.0196    -2.8360
## s.e.  0.0897     0.6974
## 
## sigma^2 estimated as 53.34:  log likelihood = -388.44,  aic = 782.87

ARMA(0,0) model can be written as: \[X_n=\mu+\epsilon_n\] where \(\{\epsilon_n\}\) is a white noise process.
Here, \(\mu\) is a constant, since I assumed there is no trend. In addition, I will assume that \(\{\epsilon_n\}\) is a Gaussian white noise process.
Now, the null hypothesis become \[H_0 : X_n\sim IID\ N[\mu,\sigma^2].\]

3.Testing for Normality

I first plot qqplot. From the qqplot below, we can see that the quantiles lie roughly on a straight line.

qqnorm(x1$Low)
qqline(x1$Low)

Moreover, I also perform Shapiro-Wilk’s normality test on the dataset.

shapiro.test(x1$Low)

## 
##  Shapiro-Wilk normality test
## 
## data:  x1$Low
## W = 0.99149, p-value = 0.7062

I get p-value of 0.7062. Null hypothesis of normality cannot be rejected, thus it is reasonable to assume \(\{X_n\}\) are normally distributed.

4.Inference on \(\mu\)

Now we fit ARMA(0,0) model to the dataset.

arma00 <- arima(x1$Low, order=c(0,0,0))
arma00

## 
## Call:
## arima(x = x1$Low, order = c(0, 0, 0))
## 
## Coefficients:
##       intercept
##         -2.8333
## s.e.     0.6842
## 
## sigma^2 estimated as 53.37:  log likelihood = -388.46,  aic = 780.92

We have esimate of \(\mu\), \(\mu^*=-2.83\), with standard error of 0.68. Moreover, estimate of \(\sigma^2\) is given by 53.37. Now, I consider simulation method to check if this result is reasonable.
Below figures show the histogram (left) and a density plot (right) of simulated \(\mu^*\) under the hypothesis \(X_n\sim IID\ N[-2.83,53.37]\).

Confidence interval derived from Fisher information is \[[-2.83-1.96(0.6842),-2.83+1.96(0.6842)]=[-4.17,-1.49].\]
By comparing this confidence interval with above plots, it seems like standard error computed by Fisher information is reasonable.

5. Diagnostics

For the diagnostic analysis, I look at the residuals.

plot(arma00$resid, ylab="residuals")

acf(arma00$resid,na.action=na.pass,main="ACF of residuals")

ACF plot shows not much sign of autocorrelation. However, residual plot shows that there are some up-down trend that occurs every 20-30 years, suggesting further analysis with more complex model than ARMA.

6.Summary

We find ARMA(0,0)(Gaussian white noise) model fits the Ann Arbor January Low temperature time series. We compute the autocovariance of the data and compare the data with a normal distribution. All of these results suggest that ARMA(0,0)(Gaussian white noise) model is a good fit for the data.
As white noise model fits the data well, it means that there’s no pattern, just random variation in the low temperature in January for Ann Arbor, showing that we can not predict next year’s low temperature based on data from previous years. Maybe one year is too long for predicting the temperature. Last year(2014) it was quite cold in January and there was lot of snow, however this year it is too warm for Michigan winter, resulting in little snow and difficulty for snow resorts to make snow.

STATS 531 HW3 Solution

Adapted from Xiaoyue Liu and Yura Kim

February 22, 2016