Question 4.1.

Part A.

Given the AR2 model \[X_{n}=1.5X_{n-1}-0.8X_{n-2}+\epsilon_{n},\] we have \[\epsilon_{n}=X_{n}-1.5X_{n-1}+0.8X_{n-2}.\] By definition, the autocovariance of the white noise process \(\{\epsilon_n\}\) is \(\gamma_\epsilon(h) = \sigma^2\) for \(h=0\) and \(\gamma_\epsilon(h) = 0\) otherwise. Thus the spectrum of the white noise is \[\lambda_\epsilon(\omega) = \sum_{h=-\infty}^\infty \gamma_\epsilon(h) e^{-i\omega h} = \sigma^2\] for all \(\omega\). Now we observe that the ACF for \(\left\{\epsilon_{n}\right\}\) can be written as \[\begin{split} \gamma_{\epsilon}\left(h\right)&=&\mathrm{Cov}\left(\epsilon_{n+h},\epsilon_{n}\right)\\ &=&\mathrm{Cov}\left(X_{n+h}-1.5X_{n+h-1}+0.8X_{n+h-2},X_{n}-1.5X_{n-1}+0.8X_{n-2}\right)\\ &=&(1+1.5^2+0.8^2)\gamma_{X}\left(h\right)+(-1.5-1.5\times0.8)\left[\gamma_{X}\left(h+1\right)+\gamma_{X}\left(h-1\right)\right]+0.8\left[\gamma_{X}\left(h+2\right)+\gamma_{X}\left(h-2\right)\right]. \end{split}\]

Thus we see that \[\begin{split} \lambda_\epsilon(\omega) &= \sum_{h=-\infty}^\infty \gamma_\epsilon(h) e^{-i\omega h}\\ &= \sum_{h=-\infty}^\infty \left\{ 3.89 \gamma_X(h) - 2.7 [\gamma_X(h+1)+\gamma_X(h-1)] + 0.8[\gamma_X(h+2)+\gamma_X(h-2)] \right\} e^{-i\omega h}. \end{split}\] Note that \[\begin{split} \sum_{h=-\infty}^\infty \gamma_X(h+1)e^{-i\omega h} &= \sum_{h=\infty}^\infty \gamma_X(h+1) e^{-i\omega (h+1)} e^{i\omega} \\ &= e^{i\omega} \sum_{h'=-\infty}^\infty \gamma_X(h') e^{-i\omega h'}\\ &= e^{i\omega} \lambda_X(\omega) \end{split}\] where we used the change of variable \(h' = h+1\). Similarly, we have \[ \sum_{h=-\infty}^\infty \gamma_X(h-1)e^{-i\omega h} = e^{-i\omega} \lambda_X(\omega), \qquad \sum_{h=-\infty}^\infty \gamma_X(h+2)e^{-i\omega h} = e^{2i\omega} \lambda_X(\omega), \qquad \sum_{h=-\infty}^\infty \gamma_X(h-2)e^{-i\omega h} = e^{-2i\omega} \lambda_X(\omega). \]

It follows that \[ \sigma^2 = \lambda_\epsilon(\omega) = 3.89 \lambda_X(\omega) - 2.7 (e^{i\omega} + e^{-i\omega}) \lambda_X(\omega) + 0.8 (e^{2i\omega} + e^{-2i\omega})\lambda_X(\omega). \] Therefore, \[ \lambda_{X}(\omega)=\frac{\sigma^{2}}{3.89-5.4\cos(\omega)+1.6\cos(2\omega)}. \]

The plot of the spectral density and autocovariance are given as follows. Note that \(\omega = 2\pi f\), where \(f\) denotes the frequency in cycles per unit time and \(\omega\) measures radians per unit time.

library('TSA')
ARMAspec(model = list(ar = c(1.5, -0.8)))

model_AR = arima.sim(model = list( ar = c(1.5, -0.8)), n=1000)
acf(model_AR, type="covariance")

Part B.

Given the MA2 model

\[ X_{n}=\epsilon_{n-2}+\epsilon_{n-1}+\epsilon_{n} \] with \(\mathrm{Var}\left(\epsilon_{n}\right)=\sigma^{2}\), we start by writing the ACF \[\begin{eqnarray} \gamma\left(0\right)&=&\mathrm{Cov}\left(X_{n},X_{n}\right)=3\sigma^{2}\\ \gamma\left(1\right)&=&\gamma\left(-1\right)=\mathrm{Cov}\left(X_{n},X_{n-1}\right)=2\sigma^{2}\\ \gamma\left(2\right)&=&\gamma\left(-2\right)=\mathrm{Cov}\left(X_{n},X_{n-2}\right)=\sigma^{2}\\ \gamma\left(h\right)&=&0\;\forall|h|\geq3. \end{eqnarray}\] So we have \[\begin{split} \lambda(\omega)&=\sum_{h=-\infty}^\infty \gamma\left(h\right)e^{- i\omega h}\\ &=\sum_{h=-2}^{2}\gamma(h)e^{-i\omega h}\\ &=\gamma(0)+2 \gamma(1)\cos(\omega)+2\gamma(2)\cos(2\omega)\\ &=\sigma^{2}\left[3+4\cos(\omega)+2\cos(2\omega)\right] \end{split}\] The plot of the spectral density and autocovariance are given as follows:

ARMAspec(model = list(ma = c(1, 1)))

model_MA = arima.sim(model = list( ma = c(1, 1)), n=1000)
acf(model_MA, type="covariance", plot= TRUE)

Part C.

For part A, if we look closely at the spectrum density plot, we can see a peak at frequency around 0.1 (actually slightly less than 0.1), which is the dominant frequency. This indicates that the dominant period is around 10 (precisely larger than 10). If we look at the ACF, we can see there exists an oscillatory behavior characteristic of period = 11. We see that these two are matched with each other.

For part B, there is no appearant peak on the spectrum density plot, and there is no periodic behavior on the ACF plot either. Again, these two correspond to each other.


Question 4.2.
We first read in the data from the source. We seek to find out the relationship between time and number. We now make a time plot of the data to explore.

mydata = read.table(file="https://ionides.github.io/531w22/hw04/sunspots.txt", header = TRUE)
year = mydata$Time
number = mydata$Number
plot(year, number, type= "l")

From the plot, we see that there is a periodic behavior, with regularly spaced peaks. The time interval between these peaks are about 11 years. This is as expected from reading the NASA website [1].

Now we take a look at the inconsistent spectral density estimate provided by the raw periodogram:

spectrum(number, main = "Unsmoothed periodogram",
  xlab="frequency (cycles per month)",sub="")

This obtains a maximum at 0.00770 \(\mathrm{month}^{-1}\), which corresponds to a period of 10.8 years.

We compare with using repeated rectangular smoothing windows to obtain a non-parametrically smoothed periodogram ([2], slide 24).

smoothed_r = spectrum(number, spans=c(30,30),
  main = "Smoothed periodogram", xlab="frequency (cycles per month)", sub="")

We now determine the dominant frequency.

freq_smoothed <- smoothed_r$freq[which.max(smoothed_r$spec)]

We see that the dominant frequency is 0.00741 \(\mathrm{month}^{-1}\), which corresponds to a period of 11.2 years.

Now we use parametric method to estimate the spectral density.

estimated = spectrum(number, method = "ar",
  main = "Spectrum estimated via AR model picked by AIC",
  xlab="frequency (cycles per month)")
freq_parametric <- estimated$freq[which.max(estimated$spec)]  
abline(v=freq_parametric, lty="dotted")

We find that the dominant frequency is 0.00802 \(\mathrm{month}^{-1}\), which corresponds to a period of 10.4 years.

We see that the parametric approach gives a sharper peak estimate, which may be appropriate in this siutation.

These two estimates are somewhat similar to each other, to the result from the raw periodogram, and to the 11 year period discussed by [1]. In the absence of a full stochastic model for the sunspots, it is hard to say which of these is most trustworthy.


Sources

Parts of this solution are adapted from a previous homework submission by Xiang Gao. This is a vague attribution, perhaps appropriate here since it is not necessary to evaluation individual contributions. For homework submissions, a generic attribution like this is not appropriate. One should help the grader by being more precise about which parts of the code and/or explanation are derived from the source.

References

1. Aeronautics, N., and Administration, S. (2017). The sunspot cycle. Available at: https://solarscience.msfc.nasa.gov/SunspotCycle.shtml.

2. Ionides, E. (2022). Notes for STATS/DATASCI 531, Modeling and Analysis of Time Series Data. Available at: https://ionides.github.io/531w22/.