In modern financial markets, the Efficient Market Hypothesis[1] suggests that stock prices incorporate all available information, making it difficult to systematically outperform a simple random walk model. However, as new technologies emerge and industries evolve, certain sectors experience long-term structural growth. This raises the possibility that time series analysis could uncover persistent trends in stock prices within such industries.
In this report, we focus on Tesla, the leading company in the electric vehicle industry, to analyze its stock price behavior over time. Specifically, we aim to investigate the following questions:
The dataset used in this study originates from Yahoo Finance and is archived on Kaggle[2]. For our analysis, we consider only closing prices before July 7, 2022. This cutoff is motivated by the significant market disruptions caused by the COVID-19 pandemic, which introduced volatility that is difficult to separate from natural stock price fluctuations. By restricting the sample in this way, we aim to better isolate the underlying trends in Tesla’s stock price. In the later part of the project, we also study the Ford Motor Company stock price to explore the effect of a trading variable such as volume.
As shown in the plot below, the raw stock price follows a stable trend until 2020, increases roughly exponentially until about November 2021, and then declines. It is apparent that the data must be transformed before we can fit an ARMA-like model.
As suggested by a previous year’s project[3], an intuitive way to transform stock price data is to compute the log return, which captures relative changes while removing the impact of absolute price levels. Let \(\{P_t\}\) denote the daily closing price. The log return is defined as \[ R_t = \log \left(\frac{P_t}{P_{t-1}}\right) = \log(P_t) - \log(P_{t-1}) \approx \frac{P_t - P_{t-1}}{P_{t-1}} \] In addition to its de-trending effect, a desirable property of the log return is that it is additive. Namely, let \(R_{t_1,t_2}\) represent the log return between times \(t_1\) and \(t_2\), where \(t_2 > t_1\). Then for \(t_3 > t_2\) we have \(R_{t_1,t_3} = R_{t_1,t_2} + R_{t_2,t_3}\). This additive property allows the application of linear models.
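The definition and the additive property can be checked numerically. The following is a minimal sketch in Python, assuming a short list of made-up closing prices (the values are hypothetical and are not taken from the Tesla dataset):

```python
import math

def log_returns(prices):
    """Return the log returns R_t = log(P_t / P_{t-1}) of a price series."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices, used only for illustration.
prices = [100.0, 104.0, 101.0, 108.0]
returns = log_returns(prices)

# Additivity: the daily log returns sum to the log return over the whole span.
total = math.log(prices[-1] / prices[0])
print(returns)
print(sum(returns), total)
```

The two numbers on the last line agree up to floating-point rounding, which is the additive property stated above.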
As the plot of the log returns shows, the transformed data appear to have approximately zero mean and are better suited to a stationary model. However, the approximation of the exact return by the log return deteriorates as the true rate of return grows in magnitude (positive or negative). In the Tesla stock dataset, even extreme daily returns rarely exceed \(\pm 20\%\). Thus, according to the error table below, the absolute approximation error for each data point should mostly stay below about \(2\%\) (the largest tabulated error is 0.0231, at a return of \(-20\%\)). This error matters when high-precision forecasting is required, but it is small enough for the purpose of studying the general trend.
Exact return | Log return | Absolute error |
---|---|---|
-0.20 | -0.2231 | 0.0231 |
-0.15 | -0.1625 | 0.0125 |
-0.10 | -0.1054 | 0.0054 |
-0.05 | -0.0513 | 0.0013 |
0.00 | 0.0000 | 0.0000 |
0.05 | 0.0488 | 0.0012 |
0.10 | 0.0953 | 0.0047 |
0.15 | 0.1398 | 0.0102 |
0.20 | 0.1823 | 0.0177 |
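The error table above can be reproduced in a few lines. This is a small sketch, assuming that the error column is the absolute difference between the exact return \(r\) and \(\log(1+r)\); the function name is ours and purely illustrative:

```python
import math

def log_return_error(exact_return):
    """Return (log return, absolute error) for a given exact return r."""
    log_ret = math.log(1.0 + exact_return)
    return log_ret, abs(exact_return - log_ret)

for r in [-0.20, -0.15, -0.10, -0.05, 0.00, 0.05, 0.10, 0.15, 0.20]:
    log_ret, err = log_return_error(r)
    print(f"{r:6.2f} | {log_ret:8.4f} | {err:.4f}")
```

The error grows roughly like \(r^2/2\), which is why it stays small for typical daily returns.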
To start, we investigate the autocorrelation function (ACF). The number of significant lags can be informative for choosing the \(q\) parameter of the \(MA(q)\) model. As shown, among lags greater than 0, only lag 6 and lag 24 are borderline significant. This suggests that \(q=0\) might be the best choice for the model.
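For readers who want to see what the ACF plot computes, here is a minimal sketch of the sample autocorrelation together with the usual approximate 95% band \(\pm 1.96/\sqrt{n}\) for white noise. The input series is a hypothetical alternating sequence, not the Tesla returns:

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag (divisor n at every lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

def white_noise_band(n):
    """Half-width of the approximate 95% band under the white noise hypothesis."""
    return 1.96 / math.sqrt(n)

# Hypothetical series that alternates in sign, so lag 1 is strongly negative.
series = [1.0, -1.0] * 50
acf = sample_acf(series, max_lag=3)
band = white_noise_band(len(series))
significant = [k for k, r in enumerate(acf) if k > 0 and abs(r) > band]
print(acf, band, significant)
```

A lag is flagged as significant when its autocorrelation falls outside the band, which is the visual rule applied to the plot discussed above.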
To further investigate the autocorrelation, we plot the partial autocorrelation function. As defined in Wikipedia[4], the “partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, regressed the values of the time series at all shorter lags.” Since it controls for shorter lags, it is helpful for choosing the \(p\) parameter of the \(AR(p)\) model. As shown by the plot, there is a cyclic structure, although it is not statistically significant.
To check whether the cyclic structure is significant, we estimate the spectral density of the series. Since the analysis of the ACF and PACF did not indicate a strong autoregressive relationship in the series, we smooth the periodogram with a series of modified Daniell smoothers ( method)[5] with spans of \((40,40)\).
As shown, the dominant frequency is around 0.4 cycles per day, which corresponds to a period of about \(1/0.4 = 2.5\) days. However, there are several other comparable peaks close to the chosen frequency, which suggests that the seasonality in this data is rather weak.
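The smoothing step and the conversion from frequency to period can be sketched as follows. This is an illustrative Python version, not the code used for the plot: the series is a hypothetical pure cosine with frequency 0.4, the kernel half-width is 2 rather than the value implied by a span of 40, and the reflection at the ends of the frequency range is a simplification chosen for this sketch.

```python
import cmath
import math

def periodogram(x):
    """Raw periodogram of the mean-removed series at frequencies k/n, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        dft = sum(dev[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / n)
        power.append(abs(dft) ** 2 / n)
    return freqs, power

def modified_daniell_weights(m):
    """Weights of a modified Daniell kernel with half-width m (end points get half weight)."""
    weights = [1.0 / (2 * m)] * (2 * m + 1)
    weights[0] = weights[-1] = 1.0 / (4 * m)
    return weights

def smooth(values, weights):
    """Weighted moving average; indices past either end are reflected (a simplification)."""
    m = len(weights) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        total = 0.0
        for offset, w in enumerate(weights, start=-m):
            idx = i + offset
            if idx < 0:
                idx = -idx
            elif idx >= n:
                idx = 2 * (n - 1) - idx
            total += w * values[idx]
        smoothed.append(total)
    return smoothed

# Hypothetical series: a pure cosine with frequency 0.4 cycles per observation.
series = [math.cos(2 * math.pi * 0.4 * t) for t in range(100)]
freqs, raw = periodogram(series)
kernel = modified_daniell_weights(2)
# Two passes of the same kernel, mirroring a pair of equal spans.
smoothed = smooth(smooth(raw, kernel), kernel)
peak = max(range(len(smoothed)), key=smoothed.__getitem__)
dominant_frequency = freqs[peak]
period = 1.0 / dominant_frequency
print(dominant_frequency, period)
```

Applying the kernel twice gives a smoother with a single central maximum, so the peak of a sharp spectral line stays at its original frequency while nearby noise is averaged out.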
In this section, we will explore various models and test if they perform better than the baseline model. We will then draw inference from the best model.
An \(ARMA(p,q)\) model serves as the baseline against which the other models are later compared. The model takes the form \[ Y_n = \mu + \epsilon_n + \sum_{i=1}^{p}\phi_i(Y_{n-i} - \mu) + \sum_{i=1}^{q}\psi_i\epsilon_{n-i} \] For computational efficiency, we only consider models with \((p,q) \in \{0,1,\dots,4\} \times \{0,1,\dots,4\}\). Here’s the AIC table, produced using code borrowed from the lecture notes[6].
AIC | MA0 | MA1 | MA2 | MA3 | MA4 |
---|---|---|---|---|---|
AR0 | -11583.28 | -11581.30 | -11579.30 | -11577.78 | -11576.00 |
AR1 | -11581.30 | -11579.97 | -11578.17 | -11575.78 | -11574.00 |
AR2 | -11579.30 | -11578.16 | -11579.05 | -11575.91 | -11574.81 |
AR3 | -11577.74 | -11576.62 | -11577.98 | -11577.63 | -11573.61 |
AR4 | -11575.93 | -11574.92 | -11573.90 | -11582.27 | -11583.21 |
As shown, the models with the lowest AIC are \(ARMA(0,0)\), \(ARMA(4,4)\), and \(ARMA(4,3)\). However, on further investigation, the models overfit when either \(p > 3\) or \(q > 3\) (their standard errors are null because of convergence issues). Thus, the AIC values shown above are not trustworthy for the higher-order models. Because we want to investigate the potential structure of the Tesla returns, we prefer a model more complex than white noise, and we proceed with \(ARMA(1,1)\). Here’s the fitted model.
##
## Call:
## arima(x = x, order = c(1, 0, 1))
##
## Coefficients:
## ar1 ma1 intercept
## -0.8753 0.8826 0.0016
## s.e. 0.2564 0.2515 0.0007
##
## sigma^2 estimated as 0.001277: log likelihood = 5793.98, aic = -11579.97
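The reported AIC can be recovered from the log likelihood in the output above. The sketch below assumes the usual definition \(AIC = -2\log L + 2k\) and counts four estimated parameters (ar1, ma1, the intercept, and the innovation variance); small differences arise because the printed log likelihood is rounded.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

# ARMA(1,1) with intercept: ar1, ma1, intercept, and sigma^2.
arma11_aic = aic(5793.98, 4)
print(arma11_aic)
```

The result agrees with the printed value of -11579.97 to within rounding, and also with the AR1/MA1 entry of the AIC table.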
To test the validity of the \(ARMA(1,1)\) model, we investigate whether the residuals satisfy the i.i.d. normal assumption. As shown by the residual plot, there is no apparent pattern, and the histogram approximately follows a bell curve. The residual autocorrelation function has no significant lags; however, it appears to have a sinusoidal pattern, which suggests that we might need to add Fourier terms to capture the cyclic behavior. The plots are produced by the \(\texttt{checkresiduals()}\) function in the \(\texttt{forecast}\) library[7].
To formally test whether the residuals are serially correlated, we apply the Ljung-Box test[8] to them. As defined in Wikipedia, the null hypothesis is that the data are not serially correlated and the alternative hypothesis is that the data exhibit serial correlation. As shown below, the p-value (0.4898) is greater than \(\alpha=0.05\). Thus, there is insufficient evidence to reject the null hypothesis. Combining the test result with the residual plots, we conclude that the \(ARMA(1,1)\) model is adequate for the given data.
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,1) with non-zero mean
## Q* = 7.4422, df = 8, p-value = 0.4898
##
## Model df: 2. Total lags used: 10
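To make the test output above concrete, the sketch below shows how the statistic is formed from residual autocorrelations and how the reported p-value follows from \(Q^* = 7.4422\) with 8 degrees of freedom (10 lags minus 2 model parameters). The closed-form tail probability used here is valid only for an even number of degrees of freedom, and the function names are ours:

```python
import math

def ljung_box_q(acf_values, n):
    """Ljung-Box statistic; acf_values[k-1] is the residual autocorrelation at lag k."""
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf_values, start=1))

def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-square distribution with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Values taken from the test output above.
p_value = chi2_sf_even_df(7.4422, 8)
print(round(p_value, 4))
```

The computed tail probability matches the p-value printed by the test, which confirms that the reference distribution is chi-square with 8 degrees of freedom.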
We can check the roots of the AR and MA polynomials to see whether this model is causal and invertible. From the fitted coefficients, the AR polynomial is \(\phi(x)=1 - \phi_1x\) and the MA polynomial is \(\psi(x)=1+\psi_1x\), where \(\phi_1=\) -0.8753121 and \(\psi_1 =\) 0.8825971. As shown below, the absolute values of both roots exceed 1, so both roots lie outside the unit circle. Thus, the model is both causal and invertible. However, note that the two roots are nearly equal. The AR and MA factors might therefore nearly cancel, in which case the model falls back to \(ARMA(0,0)\).
## AR Root: 1.14245
## MA Root: 1.13302
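The root calculation, and the near-cancellation it reveals, can be reproduced directly from the fitted coefficients. This is a short sketch using the polynomial conventions stated above; the variable names are ours:

```python
phi1 = -0.8753121   # fitted AR coefficient
psi1 = 0.8825971    # fitted MA coefficient

ar_root = 1.0 / phi1     # solves 1 - phi1 * x = 0
ma_root = -1.0 / psi1    # solves 1 + psi1 * x = 0

# First coefficient of the implied MA(infinity) representation,
# i.e. the coefficient of B in (1 + psi1 * B) / (1 - phi1 * B).
first_weight = phi1 + psi1

print(ar_root, ma_root, first_weight)
```

Both roots are close to -1.14, and the first implied moving-average weight is only about 0.007, which illustrates how little the fitted \(ARMA(1,1)\) differs from white noise.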
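To investigate the presence of seasonality in the data further, we analyze two models: our previous \(ARMA(1,1)\) combined with seasonal components[9], and a SARIMA model with the non-seasonal components removed, defined as: \[ (1 - \Phi_1B^5)(Y_n - \mu) = (1 + \Psi_1B^5)\epsilon_n \] where \(B\) is the backshift operator, \(\Phi_1\) and \(\Psi_1\) are the seasonal AR and MA coefficients, and \(\mu\) is the mean (reported as the intercept).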
The period (5) is chosen to match the number of trading days in a week. Our analysis shows that the above model, with the non-seasonal AR/MA parts removed, attains a slightly lower AIC (-11580.63) than the previous \(ARMA(1,1)\) model (-11579.97). The combined model, which keeps both the \(ARMA(1,1)\) terms and the seasonal terms, returns a higher AIC, indicating a relatively weaker fit.
##
## Call:
## arima(x = x, order = c(0, 0, 0), seasonal = list(order = c(1, 0, 1), period = 5))
##
## Coefficients:
## sar1 sma1 intercept
## -0.1707 0.1497 0.0016
## s.e. 0.4964 0.4939 0.0006
##
## sigma^2 estimated as 0.001276: log likelihood = 5794.32, aic = -11580.63
These results hint at a weekly pattern in the stock price that is slightly more informative than the day-to-day dependence. We also experimented with other periods to investigate monthly and quarterly cycles, but the trading-week period (period = 5) describes the cyclical pattern best. However, the seasonal coefficients are small relative to their standard errors and are not statistically significant. Thus, even the relatively stronger weekly seasonal pattern is too small to trade profitably after costs. This inference aligns with the weak form of the Efficient Market Hypothesis[10] and suggests that deeper fundamental analysis is required to complement the technical statistics.
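The weakness of the seasonal terms can be quantified from the fitted output above. The sketch below computes a z-statistic and a two-sided p-value for each seasonal coefficient, assuming a normal approximation for the estimates (an assumption of this sketch, not something the fitted output reports):

```python
import math

def two_sided_p(estimate, std_error):
    """Return (z, p) for H0: coefficient = 0 under a normal approximation."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Seasonal coefficients and standard errors from the fitted model above.
z_sar, p_sar = two_sided_p(-0.1707, 0.4964)
z_sma, p_sma = two_sided_p(0.1497, 0.4939)
print(z_sar, p_sar)
print(z_sma, p_sma)
```

Both p-values are far above 0.05, so neither seasonal coefficient can be distinguished from zero at the usual level.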
As indicated by the above analysis, traditional ARMA-like models cannot, in theory, perform better than the white noise model for the log return. However, there exist modern methods that are claimed to be able to deal directly with raw stock price data. As suggested by Zhang et al., a CNN-BiLSTM-Attention-based model[11] can be used for accurate stock price prediction. The proposed model has three major parts. The first part is a Convolutional Neural Network[12] that aims to capture the temporal structure of the input data. The second part is a Bidirectional Long Short-Term Memory network[13] that aims to capture the inter-dependence between lags of the input data. Note that “Bidirectional” means that it is a modified version of the original LSTM that allows the model to learn patterns in both the forward and backward directions. The last major part of the proposed model is an attention layer that aims to pick out the more relevant inter-dependencies returned by the BiLSTM layer. This attention mechanism is the same one used in modern large language models, as introduced in the paper “Attention Is All You Need”[14], written by scientists at Google.
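To give a feel for what the attention layer computes, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the code of the model described above: the inputs are tiny hypothetical vectors, and the learned projection matrices and multiple heads of a real attention layer are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax of a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the values and the weights used."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical example: three time steps with two features each.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
queries = [[1.0, 0.0]]
outputs, weights = scaled_dot_product_attention(queries, keys, values)
print(outputs, weights)
```

Each output is a weighted average of the value vectors, with larger weights on the time steps whose keys align with the query. This is the sense in which the layer emphasizes the more relevant dependencies.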
For the Tesla stock price data, we use the daily open, close, high, and low prices as predictors, and the target variable is the daily close price. The look-back period is 5 days, and the model is set up for one-step prediction (one day into the future in this case). We adopted a model structure we found on GitHub[15]. The model from GitHub is implemented in TensorFlow[16]; we re-implemented it in PyTorch[17] and replaced the single-head attention layer with a multi-head attention layer to allow the model to capture more dependencies between the data and its lagged values. The model also uses a min-max scaler[18] to preprocess the data, which maps all data points into the range \([0,1]\) and improves numerical stability. Here’s a visualization of the model structure:
Due to the high complexity of the model, our Tesla dataset (about 3000 observations) is not large enough for the model to converge. Thus, we apply a transfer learning technique, which leverages knowledge learned from another dataset to improve convergence when predicting the Tesla stock price. The other dataset is the China and U.S. currency exchange rate data we gathered from GitHub[15] (about 26000 observations), which contains the open, close, high, and low exchange rates in CNY per USD. The time between observations in this dataset is not uniform (it ranges from 1 minute to 7 minutes), but this should not matter because predicting the currency exchange rate is not our primary task. After fitting the model on the currency exchange rate data, we fine-tuned the learned parameters on the Tesla stock price dataset, and the model reached convergence.
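The preprocessing described above (min-max scaling, a 5-day look-back, and a one-step-ahead target) can be sketched as follows. This is an illustrative version in plain Python with hypothetical prices; the function names are ours, and the actual pipeline may differ in its details.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly into [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def make_windows(rows, targets, look_back=5):
    """Pair each block of `look_back` consecutive rows with the next day's target."""
    inputs, outputs = [], []
    for end in range(look_back, len(rows)):
        inputs.append(rows[end - look_back:end])
        outputs.append(targets[end])
    return inputs, outputs

# Hypothetical daily rows: [open, high, low, close].
rows = [
    [10.0, 11.0, 9.5, 10.5],
    [10.5, 11.5, 10.0, 11.0],
    [11.0, 11.2, 10.4, 10.6],
    [10.6, 11.8, 10.5, 11.7],
    [11.7, 12.4, 11.5, 12.0],
    [12.0, 12.1, 11.0, 11.2],
    [11.2, 11.9, 11.1, 11.8],
    [11.8, 12.6, 11.6, 12.5],
]

# Scale each column separately, then rebuild the rows.
scaled_columns = [min_max_scale(list(col)) for col in zip(*rows)]
scaled_rows = [list(r) for r in zip(*scaled_columns)]
closes = scaled_columns[3]

X, y = make_windows(scaled_rows, closes, look_back=5)
print(len(X), len(X[0]), len(X[0][0]), y)
```

With 8 days and a look-back of 5, this yields 3 training examples, each a 5-by-4 block of scaled prices paired with the following day's scaled close. In practice the scaling constants are usually computed on the training portion only, so that no information from the test period leaks into the inputs.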
For the Tesla data, we used 80% of the observations for training and the rest for testing. On the testing data, the model attains \(R^2=0.8897\) and a mean absolute percentage error of 0.1502. Here’s a visualization of the prediction on the testing data. Note that this is the best fit we obtained, and re-running the provided code may result in a different model fit. So, does this result indicate that the model contradicts the Efficient Market Hypothesis? After fitting the model multiple times, we found that the performance of the resulting fits varies quite a bit. Some of the worst fits have a mean absolute percentage error of up to 0.3, which is about twice that of the best fit. Additionally, a common theme appears in the fitted versus actual values across all the fits: the predictive power falls apart after the spike around October 2021. Thus, we believe that the Efficient Market Hypothesis still holds, as it is likely that the early portion of the Tesla testing data exhibits a trend very similar to one seen in either the pre-training data or the training data. This could explain the sudden change in predictive power in the testing set after October 2021.
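For reference, the two evaluation metrics quoted above can be computed as in the sketch below. The formulas are the standard definitions, which we assume match the ones used for the reported numbers; the actual and predicted values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, expressed as a fraction (0.15 means 15%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical closing prices and predictions.
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 127.0]
r2 = r_squared(actual, predicted)
err = mape(actual, predicted)
print(r2, err)
```

Note that a strongly trending price series can give a high \(R^2\) even when individual predictions are off by a sizable percentage, which is one reason to read the two metrics together.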
Note: Tesla’s stock price rose sharply in October 2021 following the announcement of a large contract with rental companies, as reported in online news[19].
Understanding volume, the number of shares traded daily, is key to appreciating its effect on the stock price. Volume gives us an idea of market participation and of growing or declining interest in a stock. Whether a stock is being bought or sold in high volumes tells the trader about bullish and bearish trends in the stock[20], allowing more informed decisions when setting up trades. We look at how the daily trading volume has varied for Ford and Tesla over the years. A 10-day moving average line is included, as it is often a useful gauge of market interest[21].
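The 10-day moving average mentioned above is a trailing mean of the most recent ten observations. A minimal sketch, using a hypothetical volume series rather than the Ford or Tesla data:

```python
def moving_average(values, window=10):
    """Trailing simple moving average; the first value uses observations 1..window."""
    averages = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages

# Hypothetical daily volumes (in millions of shares).
volumes = [float(v) for v in range(1, 13)]
ma10 = moving_average(volumes, window=10)
print(ma10)
```

The series of averages is shorter than the input by nine observations, because the first full 10-day window ends on the tenth day.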
Along with volume, the CBOE Volatility Index[22] is used to determine the market’s stress, fear and risk centering around the position of a stock. The index represents the market’s expectation of volatility over the next 30 days. The trend stability can be interpreted by following the VIX along with the candlestick representation of the stock. To visualize how it has varied since 2010:
Given this evidence of their influence on the closing price, we investigate these two quantities as exogenous variables and observe any difference they make to the model fit. We consider the base models \(ARMA(1,1)\) and \(MA(1)\), both with and without seasonal components, together with the exogenous variables. We use the logarithm of the volume so that its scale is comparable to that of our closing price series. We fit six variations of each base model, introducing different combinations of exogenous variables and seasonality. The Volatility Index yields a marked improvement in fit: the AIC drops by roughly 15 points relative to the corresponding models that use only the log volume.
Parameters | M1_LogVol | M2_VIX | M3_Both | M4_LogVol_Seasonal | M5_VIX_Seasonal | M6_Both_Seasonal |
---|---|---|---|---|---|---|
MA(1) | -11582.95 | -11597.95 | -11582.95 | -11580.39 | -11594.82 | -11580.39 |
ARMA(1,1) | -11580.95 | -11595.95 | -11597.17 | -11578.38 | -11592.84 | -11594.07 |
The question that now arises is how statistically significant the exogenous term is. The lowest AIC (-11597.95) is attained by the \(MA(1)\) model with VIX as the exogenous variable, which is consistent with the claim that moving averages can describe trends better by smoothing over short-term noise[23], although the fitted MA(1) coefficient itself is not significant (p = 0.983), so most of the improvement is attributable to the VIX term. Looking further at the coefficients and p-values, we observe that although the VIX coefficient is too small to have much practical impact, it is highly significant statistically, and it influences the model more than the logarithm of the trading volume does. Since including VIX markedly improves the fit of the model, whether to use it depends largely on the context of the forecasting task.
##
##
## Table: ARIMA Model Results
##
## | |Variable | Coefficient| Std.Error| t.value| p.value|Significance |
## |:--------------|:---------|-----------:|---------:|---------:|--------:|:------------|
## |ma1 |MA1 | -0.0004| 1.82e-02| -2.11e-02| 9.83e-01| |
## |intercept |Intercept | 0.0095| 1.76e-03| 5.43e+00| 5.75e-08|*** |
## |as.matrix(VIX) |VIX | -0.0004| 9.12e-05| -4.73e+00| 2.25e-06|*** |
##
## Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Because the automobile industry is capital-intensive, its stocks often require careful analysis. Through this project, we investigate our initial three research questions and make substantial progress toward answering them. Our analysis aligns with historical findings that moving average models provide the best fit for stock data. While standalone ARMA models are less effective in capturing trends in daily closing prices, incorporating influential variables like the Volatility Index or Volume improves their fit. This highlights that stock prices are influenced by other financial factors, emphasizing the need for both fundamental[20] and technical analysis in forecasting. We extend our trend analysis by leveraging state-of-the-art CNN models with Bi-LSTM layers and transfer learning to capture the temporal structure of the data. While our model achieves a promising \(R^2\) score, its predictions begin to diverge from actual stock performance starting in October 2021, reflecting the impact of external factors on stock prices[10]. Future research should focus on accurately modeling pandemic-driven stock fluctuations, particularly for essential products that experienced sudden disruptions. Additionally, this study can be expanded to analyze key financial ratios[24], such as debt-to-equity and inventory turnover, which influence the performance of companies within the same industry. Such insights would be valuable for market analysts and investors alike.
In addition to the explicit sources cited above, we have also consulted the general structure of the Apple stock[25] and NVIDIA stock[3] projects from last year. Although some of the methods used are similar, we have developed our own insights into the results of those methods. (Those methods are also taught as standard tools in lecture.)