In the unpredictable and ever-changing world of finance, predicting the stock market is a notoriously difficult task due to volatility and the numerous variables that influence the price of a stock. However, time series analysis can help us uncover patterns in the stock prices and provide valuable insights to investors hoping to make a profit.
In this analysis, we will focus on the log-returns of the daily opening stock price of Apple Inc., one of the largest and best-known tech companies of all time. Apple stock is traded on the National Association of Securities Dealers Automated Quotations (NASDAQ) Global Select Market under the ticker symbol AAPL.
We aim to investigate the log-returns of Apple stock prices from January 2, 2004 to February 9, 2024 and answer the following questions:
Our data is taken from Yahoo Finance and contains 5,061 records of Apple stock price from January 2, 2004 to February 9, 2024 1. We begin our analysis with a time series plot of the Apple opening stock price over time.
From the time series plot, we notice an increasing trend in the opening Apple stock price over the 20-year period. There appears to be a steady increase from 2004 to 2018 with some small spikes, but we see a dramatic increase in stock price around 2020.
Given that the data appear to be non-stationary, it is helpful to work with the log-returns of the opening stock price, since this transformation removes much of the trend. In finance and economics, we often care about the percent change in price rather than the absolute price change. Since log-returns are additive over time, they have become a popular method to measure the performance of an investment 2.
We take the log and then the first difference to stabilize the time series and reduce the trend. Writing \(y^{*}_n\) for the log of the opening price on day \(n\), the first difference time series \(z_{2:N}\) is defined by \(z_n =\Delta y^{*}_n = y^{*}_n - y^{*}_{n-1}\) 3. We then plot the log-returns of the opening stock price against time with a red dotted line at 0% change in price.
From the plot, the log-returns appear to be centered around 0. We see some signs of high volatility around 2008 and the end of 2015 since there are some spikes in the log-returns indicating a large percent change in stock price.
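As a minimal sketch of the log-return transformation described above, the Python snippet below computes \(z_n\) from a list of prices and checks the additivity property. Python is used here purely for illustration and is not necessarily the language of the original analysis; the function name log_returns and the sample prices are hypothetical.

```python
import math


def log_returns(prices):
    """Return z_n = log(p_n) - log(p_{n-1}) for n = 2, ..., N."""
    logs = [math.log(p) for p in prices]
    return [logs[i] - logs[i - 1] for i in range(1, len(logs))]


# Hypothetical opening prices for illustration only (not the AAPL data).
opening = [100.0, 102.0, 101.0, 105.0]
z = log_returns(opening)

# Additivity: the log-returns sum to the log of the overall price ratio.
total_log_return = sum(z)
print(z, total_log_return)
```

Because the daily log-returns sum to the log-return over the whole period, multi-day performance can be obtained by simple addition.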
To confirm our decision to analyze the log-returns of the stock price rather than the actual stock prices, we will compare their autocorrelation (ACF) plots.
The ACF plot of the log-returns shows that the autocorrelations are not significantly different from 0 (except around lag 1 and lag 19). Overall, the log-returns appear to be approximately uncorrelated. On the other hand, the ACF plot of the stock prices provides evidence that the stock prices are correlated. Hence, we will move forward by analyzing the log-returns.
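The comparison above can be illustrated with a small, self-contained sketch of the sample autocorrelation function. The data here are simulated (white noise and its cumulative sum, a random walk) rather than the AAPL series, and the band \(\pm 1.96/\sqrt{N}\) is the usual approximation under a white noise null, assumed to correspond to the dashed lines on the ACF plots.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for h in range(max_lag + 1):
        ch = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n
        acf.append(ch / c0)
    return acf


# Simulated stand-ins: white noise plays the role of the returns,
# and its running sum plays the role of a trending price series.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

band = 1.96 / math.sqrt(len(noise))  # approximate 95% band under white noise
acf_noise = sample_acf(noise, 5)
acf_walk = sample_acf(walk, 5)
print(band, acf_noise[1], acf_walk[1])
```

The random walk shows autocorrelations close to 1 at short lags, while the white noise autocorrelations stay small, which mirrors the contrast between the price series and the log-returns.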
Before fitting a model to the log-returns, it may be helpful to identify if there are any cyclic patterns in the log-returns. We plot a smoothed periodogram of the log-returns below using the default method to estimate the spectral density. Note that the default method is pgram, which is a form of non-parametric spectral estimation 4.
The periodogram of the log-returns of Apple stock price does not have a dominant frequency, which indicates that seasonality may not be present in the data. This could be attributed to the high volatility of the stock market, specifically for Apple stock.
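To make the idea of a dominant frequency concrete, here is a hedged sketch of a raw periodogram computed directly from the discrete Fourier transform, followed by a flat moving-average smoother. The moving average is a stand-in chosen for simplicity and is an assumption, not the smoother used to produce the plot; the synthetic series with a known 10-step cycle is for illustration only.

```python
import cmath
import math


def periodogram(x):
    """Raw periodogram at the Fourier frequencies k/n, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coef = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                   for t, c in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coef) ** 2 / n)
    return freqs, power


def moving_average(values, half_width):
    """Smooth with a flat window, truncated at the edges."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Synthetic series with a known 10-step cycle (illustration only).
series = [math.cos(2 * math.pi * t / 10) for t in range(100)]
freqs, power = periodogram(series)
smoothed = moving_average(power, half_width=2)
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

A series with a genuine cycle produces a sharp peak at the corresponding frequency; the absence of such a peak in the log-returns is what motivates the conclusion above.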
Since the time series of the log-returns appears to be approximately stationary, we begin our modeling process with the ARMA(p, q) model. If a time series \(Y_n\) has a nonzero mean \(\mu\), we can set \(\alpha = \mu(1-\phi_1-...-\phi_p)\) and write the ARMA(p,q) model as follows:
\[Y_n = \alpha + \phi_{1} Y_{n-1} + ... + \phi_{p} Y_{n-p} + \epsilon_{n} + \psi_{1} \epsilon_{n-1} + ... + \psi_{q} \epsilon_{n-q}\]
where \(\phi_{p} \ne 0\), \(\psi_{q} \ne 0\), and \(\sigma^2_{\epsilon} > 0\). We also assume \(\epsilon_{n}\) is a white noise process which follows \(N(0, \sigma^2_{\epsilon})\) 5. Note that \(\phi_1, \dots, \phi_p\) are the autoregressive (AR) parameters and \(\psi_1, \dots, \psi_q\) are the moving average (MA) parameters, where \(p\) is the order of the AR polynomial and \(q\) is the order of the MA polynomial.
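The ARMA(p,q) recursion above can be sketched as a short simulation. This is an illustrative implementation under stated assumptions: pre-sample values of \(Y\) and \(\epsilon\) are treated as zero, and the function name simulate_arma and all parameter values are hypothetical.

```python
import random


def simulate_arma(alpha, phi, psi, sigma, n, seed=0):
    """Simulate Y_n = alpha + sum(phi_i * Y_{n-i}) + eps_n + sum(psi_j * eps_{n-j})."""
    rng = random.Random(seed)
    y, eps = [], []
    for t in range(n):
        e = rng.gauss(0.0, sigma)
        # Terms that would reach before the start of the series are dropped.
        ar = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(psi[j] * eps[t - 1 - j] for j in range(len(psi)) if t - 1 - j >= 0)
        y.append(alpha + ar + e + ma)
        eps.append(e)
    return y


# Hypothetical ARMA(1,1) with mean mu = 2, so alpha = mu * (1 - phi_1) = 1.
path = simulate_arma(alpha=1.0, phi=[0.5], psi=[0.4], sigma=1.0, n=200)

# With the noise switched off, the recursion converges to mu = alpha / (1 - phi_1).
noiseless = simulate_arma(alpha=1.0, phi=[0.5], psi=[], sigma=0.0, n=60)
print(noiseless[:3], noiseless[-1])
```

The noiseless run makes the role of \(\alpha\) visible: the series settles at the mean \(\mu = \alpha / (1 - \phi_1)\).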
To choose values of \(p\) and \(q\), we will begin with a plausible range of values for \(p\) and \(q\) and compare the ARMA(p,q) models using the Akaike Information Criterion (AIC). The AIC is given by:
\[ AIC = -2\,\ell(\theta^{*}) + 2D\] where \(\ell(\theta^{*})\) is the maximized log-likelihood and \(D\) is the number of parameters in the model 6. The AIC is often used for model selection: it rewards a high likelihood but penalizes each additional parameter, so choosing the model with the lowest AIC favors ARMA models that fit the data well without unnecessary complexity.
| | MA0 | MA1 | MA2 | MA3 | MA4 |
|---|---|---|---|---|---|
| AR0 | -24430.41 | -24468.06 | -24466.66 | -24464.78 | -24466.05 |
| AR1 | -24468.67 | -24466.70 | -24464.69 | -24462.84 | -24465.02 |
| AR2 | -24466.76 | -24464.74 | -24462.95 | -24460.85 | -24466.09 |
| AR3 | -24465.03 | -24462.96 | -24460.96 | -24471.72 | -24473.33 |
| AR4 | -24465.24 | -24464.68 | -24465.81 | -24473.42 | -24470.55 |
We notice some inconsistencies in the AIC table above. Mathematically, adding a parameter to the ARMA model cannot decrease the maximized log-likelihood, so the AIC should not increase by more than 2 units when a single parameter is added 7. One such inconsistency appears between the ARMA(4,3) and ARMA(4,4) models, where the AIC increases by roughly 3 units (from -24473.42 to -24470.55).
Due to inconsistencies in the AIC table, we will be careful not to choose models that are too large 8. Although the ARMA(4, 3) model is associated with the lowest AIC score, we will narrow our focus to small models in the table. Hence, we choose the ARMA(1,0) model which gives the lowest AIC score among small models.
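The consistency argument above can be checked mechanically. The sketch below encodes the AIC values from the table and flags neighboring models (one extra AR or MA parameter) whose AIC rises by more than 2 plus a slack. The slack of 0.5 is an arbitrary assumption made here to focus on the larger violations, and the function names are hypothetical.

```python
def aic(max_loglik, n_params):
    """AIC = -2 * maximized log-likelihood + 2 * number of parameters."""
    return -2.0 * max_loglik + 2.0 * n_params


# AIC values from the table above: rows are AR order 0..4, columns MA order 0..4.
aic_table = [
    [-24430.41, -24468.06, -24466.66, -24464.78, -24466.05],
    [-24468.67, -24466.70, -24464.69, -24462.84, -24465.02],
    [-24466.76, -24464.74, -24462.95, -24460.85, -24466.09],
    [-24465.03, -24462.96, -24460.96, -24471.72, -24473.33],
    [-24465.24, -24464.68, -24465.81, -24473.42, -24470.55],
]


def find_inconsistencies(table, slack=0.5):
    """Flag nested neighbors whose AIC increases by more than 2 + slack."""
    found = []
    for p, row in enumerate(table):
        for q, value in enumerate(row):
            if q + 1 < len(row) and row[q + 1] - value > 2.0 + slack:
                found.append(((p, q), (p, q + 1), round(row[q + 1] - value, 2)))
            if p + 1 < len(table) and table[p + 1][q] - value > 2.0 + slack:
                found.append(((p, q), (p + 1, q), round(table[p + 1][q] - value, 2)))
    return found


flagged = find_inconsistencies(aic_table)
for smaller, larger, jump in flagged:
    print(f"ARMA{smaller} -> ARMA{larger}: AIC rises by {jump}")
```

If the maximized log-likelihood stays the same when a parameter is added, the AIC rises by exactly 2, so any larger jump points to imperfect numerical optimization rather than a property of the data.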
We continue our analysis by fitting the ARMA(1,0) model to the log-returns. The model summary is displayed below.

| | Intercept | SE (Intercept) | AR 1 Coefficient |
|---|---|---|---|
| ARMA(1,0) | 0.0012 | 0.0003 | -0.089 |

Using the fitted values from the table, we can write the ARMA(1,0) model as follows:
\[\text{ARMA(1,0): } Y_n = \alpha + \phi_{1} Y_{n-1} + \epsilon_{n}\] Note that if we use the backshift operator 9 and denote \(\phi(x)\) and \(\psi(x)\) as the AR and MA polynomials respectively, we can rewrite the model as such:
\[\phi(B)(Y_n - 0.0012) = \psi(B) \epsilon_n \text{ where } \phi(x) = 1+0.089x \text{ and } \psi(x) = 1\] Here the sign follows from the fitted AR coefficient \(\phi_1 = -0.089\), since \(\phi(x) = 1 - \phi_1 x\). We can focus on the roots of the AR and MA polynomials from the model to determine causality and invertibility. An ARMA process is causal only when the roots of \(\phi(x)\) lie outside the unit circle, and invertible only when the roots of \(\psi(x)\) lie outside the unit circle 10. Note that the ARMA(1,0) model is always invertible since it does not contain MA roots, so we are mostly focused on determining causality.
From the visualization above, the inverse root of \(\phi(x)\) lies inside the unit circle for the ARMA model. Hence, the root of the AR polynomial is outside the unit circle which indicates causality. Therefore, the ARMA(1,0) model is both causal and invertible.
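The causality check can also be done by hand for an AR(1) model, since \(\phi(x) = 1 - \phi_1 x\) has the single root \(x = 1/\phi_1\). The sketch below uses the fitted values from the table; treating the reported intercept as the process mean \(\mu\) is an assumption consistent with the \((Y_n - 0.0012)\) form used above, and the function name is hypothetical.

```python
def ar1_root(phi1):
    """Root of the AR(1) polynomial phi(x) = 1 - phi1 * x."""
    return 1.0 / phi1


phi1 = -0.089   # fitted AR(1) coefficient from the table
mu = 0.0012     # fitted intercept, treated here as the process mean (assumption)

root = ar1_root(phi1)
inverse_root = 1.0 / root          # equals phi1 for an AR(1) model
is_causal = abs(root) > 1.0        # causal when the root lies outside the unit circle
alpha = mu * (1.0 - phi1)          # constant term in Y_n = alpha + phi1 * Y_{n-1} + eps_n

print(root, inverse_root, is_causal, alpha)
```

The root has magnitude of roughly 11.2, far outside the unit circle, and the inverse root has magnitude 0.089, well inside it, which agrees with the plot.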
Note: Out of curiosity, we also fit the ARMA(4,3) model, which had the lowest AIC of all the fitted ARMA(p,q) models, and observed the roots of the AR and MA polynomials. There appeared to be canceling roots, since the inverse AR roots were located at almost exactly the same positions as the three inverse MA roots. Additionally, most of the inverse roots were located close to the boundary of the unit circle. This observation gave us more motivation to continue our analysis with the ARMA(1,0) model instead of the ARMA(4,3) model.
We continue by analyzing the residuals of the ARMA model and checking the validity of our model assumptions. If the model is correctly specified, the residuals should have the properties of white noise 11. Hence, we will determine whether the residuals of the model appear to have mean 0, to be uncorrelated, and to be normally distributed.
We notice that the residuals are mostly centered at 0. However, the variance does not appear to be constant as the variance is quite large around 2008 and 2016.
The ACF plot of the residuals shows that the autocorrelations are not significantly different from 0. Overall, the errors appear to be uncorrelated.
The points on the QQ-plot are quite close to the identity line except for the left and right tail ends where we see deviation from the line. Hence, the normality assumption seems dubious as the tails are heavier than the normal distribution. The heavy tails could be caused by the large variances seen in the residual plot.
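One simple numerical complement to the QQ-plot is the sample excess kurtosis, which is close to 0 for Gaussian data and positive for heavy-tailed data. This check is not part of the original analysis; the sketch below uses simulated data only, with a two-variance mixture standing in for alternating calm and turbulent periods.

```python
import random


def excess_kurtosis(x):
    """Sample excess kurtosis: fourth central moment over squared variance, minus 3."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1)

# Simulated Gaussian data with constant variance.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Simulated data whose standard deviation alternates between 1 and 3,
# a crude stand-in for non-constant variance.
mixed = [rng.gauss(0.0, 1.0 if i % 2 == 0 else 3.0) for i in range(20000)]

print(excess_kurtosis(gaussian), excess_kurtosis(mixed))
```

Even though every individual draw in the mixture is Gaussian, the changing variance alone produces heavier tails than a single normal distribution, which is the mechanism suggested above.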
Since we have possible evidence that the ARMA(1,0) model residuals may violate some of the model assumptions, we continue our analysis by fitting a new model that may be more appropriate for the data.
Researchers have long noticed that stock returns often have “heavy-tailed” distributions since the conditional variance is not constant, and outliers can occur when variance is large. ARMA models are unable to account for volatile behavior (non-constant variance) of the return values. Therefore, a Generalized Autoregressive Conditionally Heteroscedastic (GARCH) model may be better suited to the data as it can model both the conditional heteroskedasticity and the heavy-tailed distributions of financial markets data 12.
We begin by discussing the ARCH model. The ARCH(m) model represents the returns as:
\[ \begin{aligned} y_t &= \sigma_t \epsilon_t \\ \sigma^2_{t} &= \alpha_{0} + \alpha_{1} y_{t-1}^2 + \dots + \alpha_{m} y_{t-m}^2 \end{aligned} \]
where \(\epsilon_t\) is standard Gaussian white noise, i.e., \(\epsilon_t \sim \text{ iid } N(0,1)\). We also impose constraints \(\alpha_0, \alpha_1, \dots, \alpha_m \geq 0\) on the model parameters to avoid negative variance.
We further extend the ARCH model to the Generalized ARCH (GARCH) model, which uses both past squared observations and past conditional variances to model the variance at time \(t\). The GARCH(m, r) model retains \(y_t = \sigma_t \epsilon_t\) and adds \(r\) lagged conditional variances to the variance equation 13.
\[ \sigma_t^{2} = \alpha_0 + \sum_{j=1}^{m} \alpha_{j} y_{t-j}^{2} + \sum_{j=1}^{r} \beta_{j} \sigma_{t-j}^{2} \]
Many recent studies have selected the GARCH(1,1) model to analyze time series data since it is one of the simplest and most robust among volatility models. The GARCH(1,1) model equation is written as \(\sigma^2_{t} = \alpha_{0} + \alpha_{1} y_{t-1}^2 + \beta_{1}\sigma^2_{t-1}\) where \(\alpha_1 + \beta_1 < 1\).
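To illustrate how the GARCH(1,1) recursion generates volatility clustering, the sketch below simulates a series from the equations above. The parameter values are hypothetical and are not the fitted values from this analysis; starting the recursion at the unconditional variance \(\alpha_0 / (1 - \alpha_1 - \beta_1)\) is a common convention adopted here as an assumption.

```python
import math
import random


def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate y_t = sigma_t * eps_t with a GARCH(1,1) conditional variance."""
    if not (alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and alpha1 + beta1 < 1):
        raise ValueError("need alpha0 > 0, alpha1, beta1 >= 0 and alpha1 + beta1 < 1")
    rng = random.Random(seed)
    var = alpha0 / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    y, sigma2 = [], []
    for _ in range(n):
        sigma2.append(var)
        obs = math.sqrt(var) * rng.gauss(0.0, 1.0)
        y.append(obs)
        # sigma^2_{t+1} = alpha0 + alpha1 * y_t^2 + beta1 * sigma^2_t
        var = alpha0 + alpha1 * obs ** 2 + beta1 * var
    return y, sigma2


# Hypothetical parameters, chosen only so that alpha1 + beta1 < 1.
returns, variances = simulate_garch11(alpha0=1e-5, alpha1=0.10, beta1=0.85, n=200)
print(max(variances), min(variances))
```

A large shock \(y_t\) raises the next conditional variance through \(\alpha_1\), and the \(\beta_1\) term makes that elevated variance decay slowly, so large moves tend to be followed by further large moves.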
ARMA-GARCH models are commonly used to model the volatility of financial time series data, so we will continue our analysis by fitting an ARMA(1,0)-GARCH(1,1) model 14.
We use the fGarch package to fit an ARMA(1,0)-GARCH(1,1) model to the log-returns 15.
Since the ARMA-GARCH model assumes dependency of the conditional variance with previous values over time, observing no autocorrelations in the ARMA-GARCH standardized residuals or squared standardized residuals would indicate that the ARMA-GARCH model is appropriate for the data 16. Hence, we create ACF plots of the standardized residuals and squared standardized residuals below.
For both the standardized residuals and squared standardized residuals, the autocorrelations do not appear to be significantly different from 0. Although we see one autocorrelation above the blue line around lag 23 for the squared standardized residuals, we can conclude that overall, the standardized residuals and squared standardized residuals are uncorrelated.
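The diagnostic described above can be sketched as follows: given residuals from the mean model and a set of GARCH(1,1) parameters, filter the conditional variances, divide each residual by its conditional standard deviation, and examine the autocorrelations of the squared result. The residuals and parameters below are hypothetical placeholders, not output from the fitted model, and the function names are illustrative.

```python
import math


def garch11_filter(resid, alpha0, alpha1, beta1):
    """Return conditional variances and standardized residuals e_t / sigma_t."""
    var = alpha0 / (1.0 - alpha1 - beta1)  # assumed starting value
    sigma2, standardized = [], []
    for e in resid:
        sigma2.append(var)
        standardized.append(e / math.sqrt(var))
        var = alpha0 + alpha1 * e ** 2 + beta1 * var
    return sigma2, standardized


def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n / c0
        for h in range(max_lag + 1)
    ]


# Hypothetical mean-model residuals and GARCH parameters (illustration only).
resid = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.15, -0.25]
sigma2, std_resid = garch11_filter(resid, alpha0=0.01, alpha1=0.1, beta1=0.8)

squared = [v * v for v in std_resid]
acf_squared = sample_acf(squared, 3)
print(sigma2[:2], std_resid[0], acf_squared)
```

The sequence of conditional standard deviations produced by the filter is also what a volatility plot displays over time.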
One of the advantages of including the GARCH model is the ability to visualize volatility over time and further understand periods of high and low volatility. Hence, we use our ARMA-GARCH model to plot the volatility and identify significant events that may have caused spikes in volatility.
From the plot, we can observe at least three periods of significant increases in volatility:
2007-2009: We notice the largest increase in volatility during this period. This is likely related to the Great Recession, which was the worst economic downturn in the United States since the Great Depression. The Great Recession was related to the subprime mortgage crisis, in which many high-risk mortgages went into default at the beginning of 2007 17.
Although Apple shares initially took a hit during the Great Recession, the release of the iPhone 3G in 2008 was a massive success, with one million units sold in the first weekend 18. Apple also saw an increase in iPod sales, as the iPod was still an affordable product during the post-recession period. Through its innovation, Apple was able to survive the recession and faced fewer financial troubles than its competitors.
2015-2016: The second highest increase in volatility from the plot above occurs from roughly 2015-2016. In October 2015, Apple’s CEO, Tim Cook, stated that fiscal 2015 was Apple’s most successful year ever with a revenue of nearly $234 billion 19. During this time, consumers were captivated by the wide variety of Apple products such as the iPhone, iPad, Mac, and Apple Watch.
2020-2021: The increase in volatility during this period is likely linked to the COVID-19 pandemic, where millions of people all over the world heavily relied on technology to keep in touch with others and continue their professional careers. Although Apple dealt with massive layoffs and hiring freezes, their stock performance remained quite strong 20. Apple also released the iPhone 12 in 2020 which included 5G technology and significantly improved download speeds and display quality.
After fitting the ARMA(1,0) and ARMA(1,0)-GARCH(1,1) models to the log-returns, we found the ARMA(1,0)-GARCH(1,1) model to be more appropriate for modeling the log-returns of Apple stock prices. Although ARMA models have useful applications in real-world problems, the addition of the conditional variance model, GARCH(1,1), provided better insight on the behavior of the stock market, especially for high-volatility periods which are the most concerning for investors.
The ARMA(1,0)-GARCH(1,1) also allowed us to observe volatility over time and extract information about the history of Apple Inc. as well as possible internal and external factors that influenced the financial market, which would not have been possible with just the ARMA model. We speculate that the periods of high volatility may be related to extreme economic decline, successful product launches, and the COVID-19 pandemic.
We did not find strong evidence of seasonality in our frequency analysis, which may be attributed to the high volatility of the stock market.
For future studies, it may be interesting to attempt to forecast the log-returns and predict spikes in volatility in the short term. However, because the stock market is notoriously difficult to predict, there is no guarantee that our ARMA(1,0)-GARCH(1,1) model would provide reliable predictions of future volatility levels.
Additionally, when creating the AIC table for the ARMA model, we noticed inconsistencies in the table which led us to only consider small ARMA models. In the future, we could try the arima2 package (created by Jesse Wheeler and Professor Ionides), which uses multiple starting values to improve the optimization performance of the arima function in the stats package and could lead to fewer inconsistencies in the AIC table 21.
[1] Apple reports record fourth quarter results. Apple Newsroom. (2015, October 27). https://www.apple.com/newsroom/2015/10/27Apple-Reports-Record-Fourth-Quarter-Results/#.
[2] Belzile, L. (n.d.). Spectral Estimation in R. 4.3 Spectral Estimation in R. https://lbelzile.github.io/timeseRies/spectral-estimation-in-r.html
[3] Choy, Yoke & Chong, CY. Effect of Subprime Crisis on U.S. Stock Market Return and Volatility.
[4] Downey, L. (2021, August 20). How COVID affects Apple (AAPL). Investopedia. https://www.investopedia.com/how-covid-affects-apple-aapl-5198334
[5] Ionides, Edward. Analysis of Time Series Chapter 3 Lecture Slides
[6] Ionides, Edward. Analysis of Time Series Chapter 4 Lecture Slides
[7] Ionides, Edward. Analysis of Time Series Chapter 5 Lecture Slides
[8] Ionides, Edward. Analysis of Time Series Chapter 7 Lecture Slides
[9] Ionides, Edward. Analysis of Time Series Chapter 9 Lecture Slides
[10] Modeling and Forecasting of Volatility using ARMA-GARCH: Case Study on Malaysia Natural Rubber Prices. (2019). https://iopscience.iop.org/article/10.1088/1757-899X/548/1/012023/pdf
[11] Ruppert, D. (2010). Statistics and Data Analysis for Financial Engineering (Springer Texts in Statistics). Springer, Berlin.
[12] Shumway, R. H., & Stoffer, D. S. (2006). Time Series Analysis and Its Applications with R Examples. New York: Springer.
[13] STATS 531 Midterm Project : Bitcoin Historical Data
[14] Time Series Analysis for Log Returns of S&P500
[15] Weir, D. (2009, April 23). Apple’s iphone defies recession as mobile takes over. CBS News. https://www.cbsnews.com/news/apples-iphone-defies-recession-as-mobile-takes-over/
[16] Wikimedia Foundation. (2024, January 25). Rate of Return. Wikipedia. https://en.wikipedia.org/wiki/Rate_of_return
[17] Yahoo! (2024, February 21). Apple Inc. (AAPL) Stock Price, News, Quote & History. Yahoo! Finance. https://finance.yahoo.com/quote/AAPL
Chapter 3 Slide 25 Class Notes
https://lbelzile.github.io/timeseRies/spectral-estimation-in-r.html
Chapter 3.2 (Page 92) of Time Series Analysis and Its Applications with R Examples
Chapter 5 Slide 21 Class Notes
Chapter 5 Slide 30 Annotated Class Notes
Chapter 7 Slide 23 Class Notes
Chapter 4 Slide 13 Class Notes
Chapter 3.2 (Page 95) of Time Series Analysis and Its Applications with R Examples
Chapter 3.8 (Page 149) of Time Series Analysis and Its Applications with R Examples
Chapter 18.7 (Page 484) of Statistics and Data Analysis for Financial Engineering
Chapter 5.4 (Page 286) of Time Series Analysis and Its Applications with R Examples
https://iopscience.iop.org/article/10.1088/1757-899X/548/1/012023/pdf
Chapter 5.4 (Page 284) of Time Series Analysis and Its Applications with R Examples
Chapter 18.12 (Page 496) of Statistics and Data Analysis for Financial Engineering
https://www.researchgate.net/publication/265233838_Effect_of_Subprime_Crisis_on_US_Stock_Market_Return_and_Volatility
https://www.cbsnews.com/news/apples-iphone-defies-recession-as-mobile-takes-over/
https://www.apple.com/newsroom/2015/10/27Apple-Reports-Record-Fourth-Quarter-Results/#
https://www.investopedia.com/how-covid-affects-apple-aapl-5198334
Chapter 9 Slide 23 Class Notes