In 2019, the intensification of the China–United States trade war weighed heavily on the global economy. The U.S. economy failed to achieve the 3% growth target set by the Trump administration for the second consecutive year, and annual growth in 2019 was the lowest in the past three years as corporate investment declined further under the damaging trade tensions. Tesla, the largest electric vehicle and solar panel company in the United States, is therefore worth a closer look. In the first half of 2019, Tesla was still struggling with a declining market and its stock price fell to about $190 per share; in the second half, the stock repeatedly hit record highs, rising above $500 per share. In early 2020, Tesla became the first listed automobile company in the United States with a market value of over 100 billion dollars. In this paper, we concentrate on the stochastic volatility of Tesla's adjusted closing price using a GARCH model and a POMP model.
The adjusted closing price amends a stock's closing price to accurately reflect its value after accounting for any corporate actions. It is considered the true price of the stock and is often used when examining historical returns or performing a detailed analysis of past performance. We select Tesla's adjusted closing price over the whole of 2019 for further analysis.
Our data source: Tesla stock data from 2010 to 2020 (https://www.kaggle.com/timoboz/tesla-stock-data-from-2010-to-2020)
DATE
: Date from 2010 to 2020
Open
: Opening price
High
: Highest price that day
Low
: Lowest price that day
Close
: Closing Price
Adj Close
: Adjusted closing price, taking stock splits, etc., into account
Volume
: Trading volume
# Read data and preprocessing: subset to the 2019 trading days
tsla <- read.csv("TSLA.csv",header = TRUE)
tsla <- tsla[2143:2394,]
tsla <- tsla[!is.na(tsla$Adj.Close),]
# Data visualization: demeaned log returns of the adjusted closing price
tsla_log <- diff(log(tsla$Adj.Close))
tsla_dm <- tsla_log - mean(tsla_log)
y1 = range(tsla$Adj.Close, na.rm=TRUE)
y2 = range(tsla_dm,na.rm = TRUE)
plot(tsla$Adj.Close,type = "l",main = "Adjusted Close Price",ylim = y1,xlab = "Date",ylab = "Adj.Close")
abline(h=mean(tsla$Adj.Close),col = "blue")
plot(tsla_dm,type = "l",main = "Demeaned log Adjusted Close Price",ylim = y2,xlab = "Date",ylab = "Demeaned log return")
abline(h=mean(tsla_dm),col = "blue")
As shown above, we obtain two plots, one for the original adjusted closing price and one for the demeaned log adjusted closing price. The original adjusted closing price is turbulent throughout 2019: broadly declining in the first half of the year and broadly rising in the second half. The demeaned log series looks like a random perturbation around 0, but it is worth noting that the variance of the process differs considerably across different periods of 2019, which motivates a volatility model.
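For clarity, the demeaned series plotted above is the log return of the adjusted closing price with its sample mean removed; writing \(z_n\) for the adjusted close on day \(n\), the transformation carried out in the code above is
\(y_n = \log(z_n) - \log(z_{n-1})\), \(\quad y_n^{*} = y_n - \frac{1}{N}\sum_{k=1}^{N} y_k\),
and this demeaned return series \(y^{*}_{1:N}\) is what the models below treat as data.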
Following the model representation of Bretó (2014), we propose the model
\(Y_n = \exp\{H_n/2\}\,\epsilon_n\)
\(H_n = \mu_h(1-\phi) + \phi H_{n-1} + \beta_{n-1} R_n \exp\{-H_{n-1}/2\} + \omega_n\)
\(G_n = G_{n-1} + \nu_n\)
where \(\beta_n = Y_n\sigma_\eta\sqrt{1-\phi^2}\), \(R_n = \tanh(G_n)\) is the leverage on day \(n\), \(\{\epsilon_n\}\) is an iid N(0,1) sequence, \(\{\nu_n\}\) is an iid N(0,\(\sigma_\nu^2\)) sequence, and \(\omega_n\) is Gaussian with mean 0 and variance \(\sigma_{\omega,n}^2 = \sigma_\eta^2(1-\phi^2)(1-R_n^2)\), as implemented in the process snippet below.
Here, \(H_n\) is the log volatility.
We use the state variable \(X_n = (G_n, H_n, Y_n)\) and model the measurement process as a perfect observation of the \(Y_n\) component of the state space.
Firstly, we construct the POMP model as follows.
# Building a POMP model
require(pomp)
tsla_statenames <- c("H","G","Y_state")
tsla_rp_names <- c("sigma_nu","mu_h","phi","sigma_eta")
tsla_ivp_names <- c("G_0","H_0")
tsla_paramnames <- c(tsla_rp_names,tsla_ivp_names)
rproc1 <- "
double beta,omega,nu;
omega = rnorm(0,sigma_eta * sqrt( 1- phi*phi ) *
sqrt(1-tanh(G)*tanh(G)));
nu = rnorm(0, sigma_nu);
G += nu;
beta = Y_state * sigma_eta * sqrt( 1- phi*phi );
H = mu_h*(1 - phi) + phi*H + beta * tanh( G )
* exp(-H/2) + omega;
"
rproc2.sim <- "
Y_state = rnorm( 0,exp(H/2) );
"
rproc2.filt <- "
Y_state = covaryt;
"
tsla_rproc.sim <- paste(rproc1,rproc2.sim)
tsla_rproc.filt <- paste(rproc1,rproc2.filt)
tsla_rinit <- "
G = G_0;
H = H_0;
Y_state = rnorm( 0,exp(H/2) );
"
tsla_rmeasure <- "
y=Y_state;
"
tsla_dmeasure <- "
lik=dnorm(y,0,exp(H/2),give_log);
"
# Parameter transformations
tsla_partrans <- parameter_trans(
log=c("sigma_eta","sigma_nu"),
logit="phi"
)
Secondly, we filter on simulated data to check that the basic particle filter is working. We use three run levels to control the computational effort; filtering the simulated data yields the log-likelihood estimate shown below (about -380.8, with a Monte Carlo standard error of 0.23 at this run level), which gives us a sense of the plausible range of the related parameters.
tsla.filt <- pomp(data=data.frame(
y=tsla_dm,time=1:length(tsla_dm)),
statenames=tsla_statenames,
paramnames=tsla_paramnames,
times="time",
t0=0,
covar=covariate_table(
time=0:length(tsla_dm),
covaryt=c(0,tsla_dm),
times="time"),
rmeasure=Csnippet(tsla_rmeasure),
dmeasure=Csnippet(tsla_dmeasure),
rprocess=discrete_time(step.fun=Csnippet(tsla_rproc.filt),
delta.t=1),
rinit=Csnippet(tsla_rinit),
partrans=tsla_partrans
)
params_test <- c(
sigma_nu = exp(-4.5),
mu_h = -0.25,
phi = expit(4),
sigma_eta = exp(-0.07),
G_0 = 0,
H_0=0
)
sim1.sim <- pomp(tsla.filt,
statenames=tsla_statenames,
paramnames=tsla_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(tsla_rproc.sim),delta.t=1)
)
sim1.sim <- simulate(sim1.sim,seed=1,params=params_test)
sim1.filt <- pomp(sim1.sim,
covar=covariate_table(
time=c(timezero(sim1.sim),time(sim1.sim)),
covaryt=c(obs(sim1.sim),NA),
times="time"),
statenames=tsla_statenames,
paramnames=tsla_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(tsla_rproc.filt),delta.t=1)
)
run_level <- 1
tsla_Np <- switch(run_level, 100, 1e3, 2e3)
tsla_Nmif <- switch(run_level, 10, 100, 200)
tsla_eval <- switch(run_level, 4, 10, 20)
tsla_local <- switch(run_level, 10, 20, 20)
tsla_global <- switch(run_level, 10, 20, 40)
library(doParallel)
registerDoParallel()
library(doRNG)
registerDoRNG(34118892)
stew(file=sprintf("pf1-%d.rda",run_level),{ t.pf1 <- system.time(
pf1 <- foreach(i=1:tsla_eval,
.packages='pomp') %dopar% pfilter(sim1.filt,Np=tsla_Np))
},seed=493536993,kind="L'Ecuyer")
(L.pf1 <- logmeanexp(sapply(pf1,logLik),se=TRUE))
## se
## -380.7714716 0.2316179
Thirdly, we carry out a local search of the log-likelihood surface of the stochastic volatility model fitted to the Tesla data.
run_level <- 2
tsla_Np <- switch(run_level, 100, 1e3, 1e4)
tsla_Nmif <- switch(run_level, 10, 100, 200)
tsla_eval <- switch(run_level, 4, 10, 20)
tsla_local <- switch(run_level, 10, 20, 20)
tsla_global <- switch(run_level, 10, 20, 40)
tsla_rw.sd_rp <- 0.02
tsla_rw.sd_ivp <- 0.1
tsla_cooling.fraction.50 <- 0.5
tsla_rw.sd <- rw.sd(
sigma_nu = tsla_rw.sd_rp,
mu_h = tsla_rw.sd_rp,
phi = tsla_rw.sd_rp,
sigma_eta = tsla_rw.sd_rp,
G_0 = ivp(tsla_rw.sd_ivp),
H_0 = ivp(tsla_rw.sd_ivp)
)
stew(file=sprintf("mif1-%d.rda",run_level),{ t.if1 <- system.time({
if1 <- foreach(i=1:tsla_local,
.packages='pomp', .combine=c) %dopar% mif2(tsla.filt,
params=params_test,
Np=tsla_Np,
Nmif=tsla_Nmif,
cooling.fraction.50=tsla_cooling.fraction.50,
rw.sd = tsla_rw.sd)
L.if1 <- foreach(i=1:tsla_local,
.packages='pomp', .combine=rbind) %dopar% logmeanexp(
replicate(tsla_eval, logLik(pfilter(tsla.filt,
params=coef(if1[[i]]),Np=tsla_Np))), se=TRUE) })
},seed=318817883,kind="L'Ecuyer")
r.if1 <- data.frame(logLik=L.if1[,1],logLik_se=L.if1[,2],
t(sapply(if1,coef)))
# Write the search results to a separate parameter file (name chosen here) so that
# the raw TSLA.csv data file is not accidentally overwritten
if (run_level>1) write.table(r.if1,file="tsla_params.csv",
append=TRUE,col.names=FALSE,row.names=FALSE)
summary(r.if1$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 509.0 511.5 512.2 513.7 513.6 542.0
r.if1[which.max(r.if1$logLik),]
## logLik logLik_se sigma_nu mu_h phi sigma_eta
## result.8 542.0496 0.1547801 0.02360043 -7.483303 0.3974036 0.9031391
## G_0 H_0
## result.8 0.1674713 -1.123017
plot(if1)
pairs(~logLik+sigma_nu+mu_h+phi+sigma_eta,
data=subset(r.if1,logLik>max(logLik)-500))
From the results above:
The largest log-likelihood found is 542.0 for this POMP model.
The likelihood converges quickly, with a mean of 513.7 across the replications, and once it has converged it remains quite stable. The perturbations of \(H_0\) do not shrink; however, since \(H_0\) is just the starting point of the POMP model, large fluctuations there are natural and need not concern us. Most of the other parameters do not fully converge either, but on closer inspection they all stay within fairly small intervals. Given the small sample size, we consider this result broadly satisfactory.
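To put the claim that the estimates stay within small intervals on a numerical footing, one can inspect the spread of the fitted parameters across the local-search replications; a minimal sketch using the r.if1 data frame constructed above:
# Range of each fitted parameter across the local-search replications
apply(r.if1[, tsla_paramnames], 2, range)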
Finally, we search the likelihood surface starting from parameter values drawn randomly from a large box. We obtain results and conclusions similar to those of the local search above. A natural next step would be to refine the parameter box and rerun the algorithm to sharpen the results; a sketch of such a refined box follows the global-search diagnostics below.
tsla_box <- rbind(
sigma_nu=c(0.005,0.05),
mu_h =c(-1,0),
phi = c(0.95,0.99),
sigma_eta = c(0.5,1),
G_0 = c(-2,2),
H_0 = c(-1,1)
)
stew(file=sprintf("box_eval-%d.rda",run_level),{ t.box <- system.time({
if.box <- foreach(i=1:tsla_global,
.packages='pomp',.combine=c) %dopar% mif2(if1[[1]],
params=apply(tsla_box,1,function(x)runif(1,x[1],x[2]))) # draw each starting value uniformly within its box limits
L.box <- foreach(i=1:tsla_global,
.packages='pomp',.combine=rbind) %dopar% { logmeanexp(replicate(tsla_eval, logLik(pfilter(
tsla.filt,params=coef(if.box[[i]]),Np=tsla_Np))),
se=TRUE)
} })
},seed=290860873,kind="L'Ecuyer")
r.box <- data.frame(logLik=L.box[,1],logLik_se=L.box[,2],
t(sapply(if.box,coef)))
# As above, append the results to the separate parameter file rather than TSLA.csv
if(run_level>1) write.table(r.box,file="tsla_params.csv",
append=TRUE,col.names=FALSE,row.names=FALSE)
summary(r.box$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 502.5 510.4 513.3 514.8 514.9 541.1
r.box[which.max(r.box$logLik),]
## logLik logLik_se sigma_nu mu_h phi sigma_eta
## result.11 541.1146 0.1306042 0.03139986 -7.453165 0.4908952 0.9498469
## G_0 H_0
## result.11 0.2242531 -1.289617
pairs(~logLik+log(sigma_nu)+mu_h+phi+sigma_eta+H_0,
data=subset(r.box,logLik>max(logLik)-500))
plot(if.box)
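As a concrete follow-up to the refinement idea mentioned above, a narrower search box could be centered near the best estimates found by the local and global searches. The limits below are purely illustrative and would need tuning against the diagnostics; rerunning the global search with such a box is left for future work.
# Hypothetical refined search box around the best estimates reported above
# (illustrative limits only)
tsla_box_refined <- rbind(
 sigma_nu = c(0.01,0.05),
 mu_h = c(-8,-7),
 phi = c(0.3,0.6),
 sigma_eta = c(0.7,1),
 G_0 = c(-1,1),
 H_0 = c(-2,0)
)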
The generalized ARCH model, known as GARCH(p,q), has the form
\(Y_n = \epsilon_n\sqrt{V_n}\),
where \(V_n = \alpha_0 + \sum_{j=1}^{p}\alpha_j Y_{n-j}^2 + \sum_{k=1}^{q}\beta_k V_{n-k}\) and \(\epsilon_{1:N}\) is white noise.
library(tseries)
require(fGarch)
garch11 <- garchFit(~garch(1,1), data = tsla_dm, cond.dist = c("norm"), include.mean = FALSE, algorithm = c("nlminb"), hessian = c("ropt"))
##
## Series Initialization:
## ARMA Model: arma
## Formula Mean: ~ arma(0, 0)
## GARCH Model: garch
## Formula Variance: ~ garch(1, 1)
## ARMA Order: 0 0
## Max ARMA Order: 0
## GARCH Order: 1 1
## Max GARCH Order: 1
## Maximum Order: 1
## Conditional Dist: norm
## h.start: 2
## llh.start: 1
## Length of Series: 251
## Recursion Init: mci
## Series Scale: 0.0309068
##
## Parameter Initialization:
## Initial Parameters: $params
## Limits of Transformations: $U, $V
## Which Parameters are Fixed? $includes
## Parameter Matrix:
## U V params includes
## mu -9.42554e-17 9.42554e-17 0.0 FALSE
## omega 1.00000e-06 1.00000e+02 0.1 TRUE
## alpha1 1.00000e-08 1.00000e+00 0.1 TRUE
## gamma1 -1.00000e+00 1.00000e+00 0.1 FALSE
## beta1 1.00000e-08 1.00000e+00 0.8 TRUE
## delta 0.00000e+00 2.00000e+00 2.0 FALSE
## skew 1.00000e-01 1.00000e+01 1.0 FALSE
## shape 1.00000e+00 1.00000e+01 4.0 FALSE
## Index List of Parameters to be Optimized:
## omega alpha1 beta1
## 2 3 5
## Persistence: 0.9
##
##
## --- START OF TRACE ---
## Selected Algorithm: nlminb
##
## R coded nlminb Solver:
##
## 0: 375.92963: 0.100000 0.100000 0.800000
## 1: 371.02639: 0.153815 0.0925624 0.830288
## 2: 361.32900: 0.154894 0.0391875 0.798370
## 3: 360.41207: 0.255934 1.00000e-08 0.806464
## 4: 355.64989: 0.244806 1.00000e-08 0.753278
## 5: 355.64918: 0.242331 1.00000e-08 0.754316
## 6: 355.64885: 0.240671 1.00000e-08 0.756424
## 7: 355.64863: 0.235130 1.00000e-08 0.762129
## 8: 355.64720: 0.145973 1.00000e-08 0.852938
## 9: 355.64131: 0.133187 1.00000e-08 0.865267
## 10: 355.63422: 0.121769 1.00000e-08 0.875887
## 11: 355.62639: 0.0925786 1.00000e-08 0.904244
## 12: 355.61260: 0.0640802 1.00000e-08 0.933297
## 13: 355.60003: 0.0469642 1.00000e-08 0.951452
## 14: 355.59763: 0.0294289 1.00000e-08 0.969203
## 15: 355.59486: 0.0295567 1.00000e-08 0.969326
## 16: 355.59481: 0.0294208 1.00000e-08 0.969440
## 17: 355.59470: 0.0274447 1.00000e-08 0.971480
## 18: 355.59467: 0.0261738 1.00000e-08 0.972797
## 19: 355.59467: 0.0259853 1.00000e-08 0.972992
## 20: 355.59467: 0.0259755 1.00000e-08 0.973002
##
## Final Estimate of the Negative LLH:
## LLH: -517.0769 norm LLH: -2.060067
## omega alpha1 beta1
## 2.481255e-05 1.000000e-08 9.730024e-01
##
## R-optimhess Difference Approximated Hessian Matrix:
## omega alpha1 beta1
## omega -159742800803 -127210120.9 -146519356.6
## alpha1 -127210121 -106147.0 -120973.2
## beta1 -146519357 -120973.2 -135274.9
## attr(,"time")
## Time difference of 0.003530025 secs
##
## --- END OF TRACE ---
##
##
## Time to Estimate Parameters:
## Time difference of 0.03997803 secs
summary(garch11)
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~garch(1, 1), data = tsla_dm, cond.dist = c("norm"),
## include.mean = FALSE, algorithm = c("nlminb"), hessian = c("ropt"))
##
## Mean and Variance Equation:
## data ~ garch(1, 1)
## <environment: 0x7fda388bf548>
## [data = tsla_dm]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## omega alpha1 beta1
## 2.4813e-05 1.0000e-08 9.7300e-01
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## omega 2.481e-05 1.104e-05 2.248 0.0246 *
## alpha1 1.000e-08 NA NA NA
## beta1 9.730e-01 NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 517.0769 normalized: 2.060067
##
## Description:
## Mon May 4 13:36:48 2020 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 388.5955 0
## Shapiro-Wilk Test R W 0.9137103 7.339919e-11
## Ljung-Box Test R Q(10) 9.327889 0.5012941
## Ljung-Box Test R Q(15) 13.87553 0.5349893
## Ljung-Box Test R Q(20) 14.97234 0.7779881
## Ljung-Box Test R^2 Q(10) 3.862239 0.9533477
## Ljung-Box Test R^2 Q(15) 4.959775 0.9924615
## Ljung-Box Test R^2 Q(20) 6.900504 0.9969988
## LM Arch Test R TR^2 5.257037 0.948838
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -4.096230 -4.054093 -4.096511 -4.079273
t11 = garch11@sigma.t
plot(tsla_dm, ylim = c(-0.2,0.2), ylab = 'Demeaned tsla Adjusted Close Price', xlab = 'Date', type = 'l', main = 'Garch(1,1)', lwd = 1)
lines(-2*t11, lty=2, col='grey', lwd = 1.5)
lines(2*t11, lty=2, col='grey', lwd = 1.5)
legend('topright', c('Demeaned tsla Adjusted Close Price','95% interval'), col = c('black','grey'), lty = c(1,2), lwd = c(1,1.5))
qqnorm(tsla_dm)
qqline(tsla_dm)
From the results above, the maximized log-likelihood for GARCH(1,1) is 517.1, which is smaller than that of the POMP model.
Comparing the maximized log-likelihoods of the POMP model and the GARCH(1,1) model, the values are fairly close to each other. On an AIC basis, either model could be selected, since they show similarly good performance in describing the stochastic volatility of the Tesla adjusted closing price.
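To make this comparison concrete, a rough AIC calculation can be done directly from the fitted log-likelihoods; this is a sketch that counts 6 estimated parameters for the POMP model (those in tsla_paramnames) and 3 for the GARCH(1,1) model.
# AIC = -2*logLik + 2*(number of estimated parameters)
aic_pomp <- -2 * max(r.if1$logLik) + 2 * 6 # POMP local-search maximum, 6 parameters
aic_garch <- -2 * 517.0769 + 2 * 3 # GARCH(1,1) log-likelihood from summary(garch11)
c(POMP = aic_pomp, GARCH11 = aic_garch)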
The 2019 Tesla adjusted closing price violates the assumption of this GARCH specification that the standardized residuals are normal: the Jarque-Bera and Shapiro-Wilk tests in the summary above both reject normality, and the QQ plot points in the same direction. This deserves further attention; expanding the sample size is one possible way to improve the GARCH fit.
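Another option, shown only as a sketch here rather than a fitted result, is to relax the normality assumption by using a heavier-tailed conditional distribution, which garchFit supports through its cond.dist argument:
# Hypothetical variant: GARCH(1,1) with Student-t innovations to accommodate
# possible heavy tails in the residuals (not fitted in this report)
garch11_t <- garchFit(~garch(1,1), data = tsla_dm, cond.dist = "std",
 include.mean = FALSE, algorithm = "nlminb")
summary(garch11_t)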
From the diagnostics of the stochastic volatility POMP model, most of the parameters do not fully converge; however, they fluctuate within relatively small ranges. We suspect that refining the parameter box and expanding the sample size could alleviate this problem.
Frankly, both the GARCH model and the stochastic volatility model for the Tesla adjusted closing price are somewhat unsatisfactory. Further work is needed, either optimizing the existing models or putting forward new approaches.
Stochastic Volatility of the SPX500 Index (“https://ionides.github.io/531w16/final_project/Project14/Final.html”)
Financial Volatility of Google Stock (“https://ionides.github.io/531w18/final_project/1/final.html”)
Investigation on Financial Volatility of NASDAQ (“https://ionides.github.io/531w18/final_project/2/final.html”)
Stochastic Volatility of Nasdaq index (“https://ionides.github.io/531w18/final_project/16/final.html”)
Analysis of SP500 Volatility (“https://ionides.github.io/531w18/final_project/19/final.html”)
Time Series Analysis of Nintendo stock price (“https://ionides.github.io/531w18/final_project/27/final.html”)
Analyzing the Volatility of Bitcoin Market Price (“https://ionides.github.io/531w18/final_project/35/final.html”)
Case study: POMP modeling to investigate financial volatility (“https://ionides.github.io/531w20/14/notes14.pdf”)
Can log likelihood function be positive - Cross Validated ("https://stats.stackexchange.com/questions/319859/can-log-likelihood-funcion-be-positive")
Tesla 2019 Financial Report ("https://auto.sina.cn/news/hy/2020-02-06/detail-iimxxste9398111.d.html")