1 Introduction

Stochastic volatility models are widely used in econometrics and finance. GARCH time series models are a standard choice for modeling time-varying volatility; another class of stochastic volatility model is the Heston model. Whereas GARCH parameters are typically estimated by maximum likelihood, we will estimate the Heston model parameters with a particle filter.


2 Data

For our example, we will use daily Google stock prices from 2015-03-06 to 2020-03-06. The stock price data comes from Yahoo Finance.
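The data-retrieval step is not shown in this report. A minimal sketch of how the series could be obtained and turned into the log-return series lrets used in the code below, assuming the quantmod package and closing prices (the exact price series used is an assumption):

library(quantmod)

# download daily GOOG prices from Yahoo Finance (quantmod's default source)
getSymbols("GOOG", from = "2015-03-06", to = "2020-03-06", auto.assign = TRUE)

# daily log-returns from closing prices; drop the leading NA from diff()
lrets <- as.numeric(diff(log(Cl(GOOG)))[-1])

summary(lrets)   # summary statistics (cf. the output below)
var(lrets)       # log-return variance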


Figure 2.1: Google Log-Returns.

##       GOOG           
##  Min.   :-0.0800893  
##  1st Qu.:-0.0058661  
##  Median : 0.0006316  
##  Mean   : 0.0006724  
##  3rd Qu.: 0.0082107  
##  Max.   : 0.1488719
##                     log-return variance
## log-return variance        0.0002323917

In order to detect the presence of volatility clustering, we plot the ACF of the log-returns and of the squared log-returns. In the plots below, we see that both the log-returns and the squared log-returns exhibit some serial correlation. This indicates that log-returns are not independent and that there is some evidence of volatility clustering.
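These plots can be reproduced with base R, assuming lrets holds the log-return series:

par(mfrow = c(1, 2))
acf(lrets,   main = "log-returns")          # serial correlation in returns
acf(lrets^2, main = "squared log-returns")  # proxy for volatility clustering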


Figure 2.2: ACF plots of Google log-returns.


3 Methodology

To model Google’s log-returns, we propose a Heston model structure with time varying mean and variance [2]. This project builds upon a previous project on stochastic volatility [7]. Denote \(S_{t}\) as the stock price process, \(v_{t}\) as the volatility process, and \(\mu_{t}\) as the mean process. Then the dynamics can be written as, \[\begin{align*} dS_{t} & =\mu_{t}S_{t}dt+\sqrt{v_{t}}S_{t}dW_{t}^{S}\\ dv_{t} & =\kappa\left(\bar{\sigma}-v_{t}\right)dt+\sigma v_{t}dW_{t}^{v}\\ d\mu_{t} & =\lambda\left(\bar{\mu}-\mu_{t}\right)dt+\eta dW_{t}^{\mu}\\ \end{align*}\]

where \(W^{S}\) and \(W^{v}\) are correlated Brownian motions with \(d\left\langle W^{S},W^{v}\right\rangle_{t} =\rho\,dt\), both independent of \(W^{\mu}\), and each Brownian motion at time \(t\) is distributed \(N\left(0,t\right)\). The volatility and mean processes are Vasicek-type models [5], mean-reverting toward the long-run levels \(\bar{\sigma}\) and \(\bar{\mu}\) at rates \(\kappa\) and \(\lambda\), respectively.

Next we derive a discretization scheme for the log-returns process [6]. Holding \(\mu_{t}\) and \(v_{t}\) fixed over a small interval \(\Delta t\), the price process \(S_{t}\) is log-normal, \[\begin{align} S_{t+\Delta t} & =S_{t}\exp\left\{ \left(\mu_{t}-\frac{1}{2}v_{t}\right)\Delta t+\sqrt{v_{t}}W_{\Delta t}\right\} \label{eq:1} \end{align}\]

Dividing by \(S_{t}\) and taking log of equation (\ref{eq:1}), the log return process is given by, \[\begin{align*} R_{t+\Delta t} & :=\log\left(\frac{S_{t+\Delta t}}{S_{t}}\right)\\ & =\left(\mu_{t}-\frac{1}{2}v_{t}\right)\Delta t+\sqrt{v_{t}}W_{\Delta t} \end{align*}\]

This implies that the distribution of log-returns is \(R_{t+\Delta t}\sim N\left(\left(\mu_{t}-\frac{1}{2}v_{t}\right)\Delta t,v_{t}\Delta t\right)\). Next, to enforce the condition that volatility be nonnegative, we work with its logarithm and compute the dynamics of the log-volatility process. Define \(\nu_{t}:=\log\left(v_{t}\right)\). Using Ito's formula [8], the dynamics of \(\nu_{t}\) are given by, \[\begin{align*} d\nu_{t} & =\left[\kappa\left(\frac{\bar{\sigma}}{v_{t}}-1\right)-\frac{1}{2}\sigma^{2}\right]dt+\sigma dW_{t}^{\nu}\\ & =\left[\kappa\left(\bar{\sigma}e^{-\nu_{t}}-1\right)-\frac{1}{2}\sigma^{2}\right]dt+\sigma dW_{t}^{\nu} \end{align*}\]
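For completeness, the Ito step behind this expression: with \(\nu_{t}=\log\left(v_{t}\right)\), \(dv_{t}=\kappa\left(\bar{\sigma}-v_{t}\right)dt+\sigma v_{t}dW_{t}^{v}\), and \(d\langle v\rangle_{t}=\sigma^{2}v_{t}^{2}dt\),

\[\begin{align*} d\nu_{t} & =\frac{1}{v_{t}}dv_{t}-\frac{1}{2v_{t}^{2}}d\langle v\rangle_{t}\\ & =\kappa\left(\frac{\bar{\sigma}}{v_{t}}-1\right)dt+\sigma dW_{t}^{v}-\frac{1}{2}\sigma^{2}dt \end{align*}\]

which rearranges to the expression above (writing \(W^{\nu}\) for \(W^{v}\)).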

Taking \(\Delta t=1\) and writing \(V_{n}:=\nu_{n}\) for the log-volatility state, we have the following Euler discretization. \[\begin{align*} R_{n+1} & =\left(M_{n}-\frac{1}{2}e^{V_{n}}\right)+e^{V_{n}/2}\epsilon_{1}\\ V_{n+1} & =V_{n}+\kappa\left(\bar{\sigma}e^{-V_{n}}-1\right)-\frac{1}{2}\sigma^{2}+\sigma\left(\rho\epsilon_{1}+\sqrt{1-\rho^{2}}\epsilon_{2}\right)\\ M_{n+1} & =M_{n}+\lambda\left(\bar{\mu}-M_{n}\right)+\eta\epsilon_{3} \end{align*}\]

where \(\epsilon_{1},\epsilon_{2},\epsilon_{3}\stackrel{iid}{\sim}N\left(0,1\right)\).
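As a quick sanity check, this recursion can be simulated directly in R. This is a minimal sketch; the parameter values are illustrative placeholders, not estimates.

# simulate n steps of the discretized (R, V, M) system; V is the log-volatility
sim_heston <- function(n, k, s_bar, s, l, m_bar, e, rho, M0, V0) {
  R <- V <- M <- numeric(n)
  M[1] <- M0; V[1] <- V0
  for (i in 1:(n - 1)) {
    eps <- rnorm(3)
    R[i + 1] <- (M[i] - 0.5 * exp(V[i])) + exp(V[i] / 2) * eps[1]
    V[i + 1] <- V[i] + k * (s_bar * exp(-V[i]) - 1) - 0.5 * s^2 +
      s * (rho * eps[1] + sqrt(1 - rho^2) * eps[2])
    M[i + 1] <- M[i] + l * (m_bar - M[i]) + e * eps[3]
  }
  data.frame(R = R, V = V, M = M)
}

# illustrative run with placeholder parameters
sim_path <- sim_heston(n = 1000, k = 0.1, s_bar = 2e-4, s = 0.5, l = 2,
                       m_bar = 7e-4, e = 4e-4, rho = 0.9,
                       M0 = 7e-4, V0 = log(2e-4))
plot(sim_path$R, type = "l", ylab = "simulated log-return")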


4 Results

4.1 Normally Distributed

4.1.1 Create pomp object

##### POMP object
library(pomp)   # pomp modeling and inference framework [3]
path = "Code/final_version/"

lret_statenames <- c("Z", "M","Y_state")
lret_rp_names <- c("k","s_bar","s", # volatility parameters
                   "l", "m_bar", "e", # growth parameters
                   "rho") # corr between brownian motions
lret_ivp_names <- c("M_0", "Z_0")
lret_paramnames <- c(lret_rp_names, lret_ivp_names)
lret_covarnames <- "covaryt"


rproc1 <- "
  double dW, dW_v, dW_m, delta_t;
  delta_t = 1;
  dW = rnorm(0,sqrt(delta_t));
  dW_v = rnorm(0,sqrt(delta_t));
  dW_m = rnorm(0,sqrt(delta_t));
  Z = Z + (k*(exp(-Z) * s_bar - 1) - 0.5*pow(s,2))*delta_t + s*(rho*dW + sqrt(1-pow(rho,2))*dW_v);
  M = M + l*(m_bar - M)*delta_t + e*dW_m;
"

rproc2.sim <- "
  Y_state = (M - 0.5*exp(Z))*delta_t + exp(Z/2)*dW;
 "
rproc2.filt <- "
  Y_state = covaryt;
 "
lret_rproc.sim <- paste(rproc1,rproc2.sim)
lret_rproc.filt <- paste(rproc1,rproc2.filt)

# Define the initializer and assume that the measurement process is a perfect 
# observation of Yt component of the state space.
lret_rinit <- "
  M = M_0;
  Z = Z_0;
  Y_state = rnorm( M - 0.5*exp(Z), exp(Z/2) );
"

lret_rmeasure = "
  y = Y_state;
"


lret_dmeasure = "
  lik = dnorm(y, M - 0.5*exp(Z), exp(Z/2), give_log);
"

# Perform log and logit transformations on parameters.
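# (note: the logit transform constrains rho to (0,1); a shifted transform would be
#  needed if negative correlation were to be allowed)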
lret_partrans <- parameter_trans(
  log = c("s_bar","k", "s", "l", "e"), 
  logit = "rho"
)

df = data.frame(y= lrets,
                time=1:length(lrets))
colnames(df) = c("y", "time")
lret.filt <- pomp(data=df,
                  statenames=lret_statenames, 
                  paramnames=lret_paramnames, 
                  times="time",
                  t0=0,
                  covar=covariate_table(time=0:length(lrets), 
                                        covaryt=c(0,lrets), 
                                        times="time"),
                  rmeasure=Csnippet(lret_rmeasure), 
                  dmeasure=Csnippet(lret_dmeasure), 
                  rprocess= discrete_time(step.fun = Csnippet(lret_rproc.filt),
                                          delta.t=1), 
                  rinit=Csnippet(lret_rinit), 
                  partrans=lret_partrans
                  )
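Once the pomp object is built, it can be run through a particle filter. A minimal sketch, assuming params_init is a named parameter vector matching lret_paramnames (the actual initial values are printed below) and a placeholder particle count:

# evaluate the log-likelihood at the initial parameters with a particle filter
pf <- pfilter(lret.filt, params = params_init, Np = 1000)
logLik(pf)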

From the recursions derived in the previous section, we obtain a POMP model framework. We take \(M_n\) and \(V_n\) (the log-volatility) as our latent variables and model the measurement process as a perfect observation of the state variable \(R_n\). Next we set the \(\bar{\mu},\bar{\sigma}\) parameters equal to the mean and variance of our historical log-returns. Below we print our initial parameters. We notice that \(\kappa=0.019\) is low, which can be interpreted through the concept of half-life [1]. \[\textrm{Half-Life} = \frac{\ln(2)}{0.019}=36.5 \textrm{ business days}\]

which means it takes roughly 37 business days for the volatility to travel halfway back to its long-run level from the current level. On the other hand \(\lambda =1.95\), so the mean process reverts much more quickly. We also notice that \(\rho=0.95\) is quite high, which implies that the Brownian motion driving our log-volatility process is highly correlated with the Brownian motion driving the log-returns process.

##             k        s_bar        s        l        m_bar            e
## 94 0.09724461 0.0002323917 0.558105 1.952343 0.0006723645 0.0003688726
##          rho         M_0       Z_0
## 94 0.9518341 0.007153135 -12.03925
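As a quick check of the half-life arithmetic quoted above (the \(\ln(2)/\kappa\) rule for a mean-reverting process, in units of \(\Delta t = 1\) business day, using the values quoted in the paragraph above):

log(2) / 0.019   # ~36.5 business days for the log-volatility process
log(2) / 1.95    # ~0.36 business days for the mean process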

4.1.2 Simulate with initial parameters

Below we show a couple of simulations to see how close they are to the data. We notice that our simulations tend to exhibit slightly higher volatility, but still provide a reasonably good fit to the data. In Section 4.1.5, we compare our model to an ARMA(1,1)+GARCH(1,1) benchmark. We also print the log-likelihoods of each simulation after filtering, which shows that our simulations can vary widely.


Figure 4.1: Comparing Simulations.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3435    3555    3653    3622    3686    3751
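A sketch of how such simulations could be produced: rebuild the pomp object with the simulation version of the process model and call simulate(). Here params_init is a hypothetical named parameter vector matching lret_paramnames; this is an outline, not the project's exact code.

# simulation version of the model: Y_state is drawn from the model rather than
# copied from the covariate
lret.sim <- pomp(lret.filt,
                 statenames = lret_statenames,
                 paramnames = lret_paramnames,
                 rprocess = discrete_time(step.fun = Csnippet(lret_rproc.sim),
                                          delta.t = 1))
sims <- simulate(lret.sim, params = params_init, nsim = 3, format = "data.frame")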

4.1.5 Benchmark model

We take the benchmark model to be a standard ARMA(1,1)+GARCH(1,1) model. From the coefficient table, we see that the GARCH parameters are significant while the mean (ARMA) parameters are not. However, the log-likelihood of this model is far below that of our Heston model. In this case we would conclude that our Heston model has done a better job of fitting the data.

However, if we take into account that log-returns are heavy-tailed and change the conditional error distribution of the GARCH model to a t-distribution, the log-likelihood of the benchmark increases to 3681, which is better than our Heston model. See Section 4.2.5. Thus in the next section we modify our Heston model accordingly.

          Estimate  Std. Error     t value   Pr(>|t|)
mu       0.0014833   0.0007320   2.0262632  0.0427378
ar1     -0.4561431   0.4619580  -0.9874126  0.3234404
ma1      0.4816157   0.4529011   1.0634015  0.2875999
omega    0.0000355   0.0000094   3.7778921  0.0001582
alpha1   0.1801861   0.0389327   4.6281415  0.0000037
beta1    0.6861493   0.0641409  10.6975309  0.0000000
## LogLikelihood 
##       3543.59
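A minimal sketch of how such a benchmark could be fit, assuming the fGarch package and the lrets series (the exact specification behind the table above is an assumption):

library(fGarch)

# ARMA(1,1)+GARCH(1,1) with normal conditional errors
bench <- garchFit(~ arma(1, 1) + garch(1, 1), data = lrets,
                  cond.dist = "norm", trace = FALSE)
summary(bench)   # prints the coefficient table and the log-likelihood

Setting cond.dist = "std" instead fits the t-distributed benchmark discussed in Section 4.2.5.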


Figure 4.5: Benchmark with 95% confidence bands.

4.2 t-distributed

4.2.1 Create pomp object

We make a small change to our POMP model from the previous section. Seeing from the data that returns are heavy-tailed, it might be better to model log-returns with a heavy-tailed distribution. Thus we will model the measurement process as, \[R_{n+1}\sim t_\nu(M_n - \frac{1}{2}\exp(V_n), \exp(\frac{1}{2}V_n))\]

where \(\hat{\nu}=\frac{6}{\hat{K}}+4\) is estimated from the excess kurtosis \(\hat{K}\) of the log-returns. Since \(R_{n+1}\) is no longer driven by a Brownian motion, it is trickier to correlate its innovations with those of the volatility process as in the previous model. Instead, we take the driving noises to be independent.

##### POMP object
path = "Code/t_dist/"

library(e1071)   # for kurtosis()
library(fGarch)  # for rstd()/dstd(), the standardized t distribution used below

# nu = 6/K + 4, a moment estimate from the excess kurtosis K
# (e1071::kurtosis already reports excess kurtosis by default)
kur = kurtosis(lrets) - 3
nu = (6 / kur) + 4


lret_statenames <- c("Z", "M","Y_state")
lret_rp_names <- c("k","s_bar","s", # volatility parameters
                   "l", "m_bar", "e", # growth parameters
                   "rho","nu") # correlation parameter, t degrees of freedom
lret_ivp_names <- c("M_0", "Z_0")
lret_paramnames <- c(lret_rp_names, lret_ivp_names)
lret_covarnames <- "covaryt"

### New method
lret_rproc.filt = function (Z,M,k,s_bar,s,rho,l,m_bar,e,covaryt,...,delta.t) {
  dW = rnorm(1,0,sqrt(delta.t))
  dW_v = rnorm(1,0,sqrt(delta.t))
  dW_m = rnorm(1,0,sqrt(delta.t))
  c(Z = Z + (k*(exp(-Z) * s_bar - 1) - 0.5*s^2)*delta.t + s*(rho*dW + sqrt(1-rho^2)*dW_v),
    M = M + l*(m_bar - M)*delta.t + e*dW_m,
    Y_state = covaryt)
}

lret_rproc.sim = function (Z,M,Y_state,k,s_bar,s,rho,l,m_bar,e,nu,...,delta.t) {
  dW = rnorm(1,0,sqrt(delta.t))
  dW_v = rnorm(1,0,sqrt(delta.t))
  dW_m = rnorm(1,0,sqrt(delta.t))
  c(Z = Z + (k*(exp(-Z) * s_bar - 1) - 0.5*s^2)*delta.t + s*(rho*dW + sqrt(1-rho^2)*dW_v),
    M = M + l*(m_bar - M)*delta.t + e*dW_m,
    Y_state = rstd(1,M - 0.5*exp(Z), exp(Z/2), nu))
}

lret_rmeasure = function (Y_state, ...) {
  c(y=Y_state)
}

lret_dmeasure = function (M,y,Z,nu, ...,log) {
  dstd(y, M - 0.5*exp(Z), exp(Z/2), nu,log = log)
}

# Define the initializer and assume that the measurement process is a perfect 
# observation of Yt component of the state space.

lret_rinit = function (M_0, Z_0,nu,...) {
  c(M = M_0,
    Z = Z_0,
    Y_state = rstd(1, M_0 - 0.5*exp(Z_0), exp(Z_0/2), nu))
}

expit <- function(real){1/(1+exp(-real))}
logit <- function(p.arg){log(p.arg/(1-p.arg))}


lret_partrans <- parameter_trans(
  toEst = function(s_bar, k, s, l, e,rho,...) {
    c(s_bar = log(s_bar),
      k = log(k),
      s = log(s),
      l = log(l),
      e = log(e),
      rho = logit(rho))
  }, fromEst = function(s_bar, k, s, l, e,rho,...) {
    c(s_bar = exp(s_bar),
      k = exp(k),
      s = exp(s),
      l = exp(l),
      e = exp(e),
      rho = expit(rho))
  }
)


df = data.frame(y= lrets,
                time=1:length(lrets))
colnames(df) = c("y", "time")
lret.filt <- pomp(data=df,
                  times="time",
                  t0=0,
                  rinit=lret_rinit,
                  covar=covariate_table(time=0:length(lrets), 
                                        covaryt=c(0,lrets), 
                                        times="time"),
                  rmeasure=lret_rmeasure, 
                  dmeasure=lret_dmeasure, 
                  rprocess= discrete_time(
                    step.fun=lret_rproc.filt,
                    delta.t=1 
                  ), 
                  partrans=lret_partrans)

These are our initial parameters,

##              k        s_bar         s        l        m_bar            e
## 142 0.01860055 0.0002323917 0.1461322 1.962586 0.0006723645 2.421022e-05
##           rho         M_0       Z_0       nu
## 142 0.8024777 0.007937706 -8.665614 4.762563

4.2.2 Simulate with initial parameters

Below we show a couple of simulations to see how close they are to the data. We notice that our simulations seem to represent the data better than the previous model's. The summary table below also shows that the maximum log-likelihood among our filtered simulations is higher than what was achieved by the previous model.


Figure 4.6: Comparing Simulations.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3487    3540    3568    3597    3646    3842

4.2.4 Global Search

Our global search found parameters that give a higher log-likelihood of 3684. This is the highest log-likelihood of any model in our experiment. These values are similar to our initial parameters and have the same interpretation.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3669    3679    3680    3680    3682    3684
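In outline, such a global search could be run with pomp's iterated filtering (mif2). The particle numbers, iteration counts, and random-walk standard deviations below are placeholders rather than the settings used for the results above, and params_start stands for one randomly drawn starting parameter vector.

# one iterated-filtering run from a random starting point; a global search repeats
# this over a design of starting points and compares the filtered log-likelihoods
m1 <- mif2(lret.filt,
           params = params_start,
           Np = 1000, Nmif = 100,
           cooling.fraction.50 = 0.5,
           rw.sd = rw_sd(k = 0.02, s_bar = 0.02, s = 0.02,
                         l = 0.02, m_bar = 0.02, e = 0.02, rho = 0.02,
                         M_0 = ivp(0.02), Z_0 = ivp(0.02)))  # nu held fixed here
logLik(pfilter(m1, Np = 1000))   # re-evaluate the likelihood at the final estimate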

We notice that the parameters that maximized the likelihood in our global search remain similar to parameters we have seen before. Much of the improvement in the log-likelihood likely comes from changing the measurement distribution to a t-distribution.

##            logLik logLik_se          k        s_bar         s        l
## result.3 3684.359  0.145336 0.01860055 0.0002323917 0.1461322 1.962586
##                 m_bar            e       rho         M_0       Z_0
## result.3 0.0006723645 2.421022e-05 0.8024777 0.007937706 -8.665614
##                nu
## result.3 4.762563

Below we show the convergence of our estimated parameters. Again, we can see that the mean parameters \(l\) and \(e\) have more difficulty converging, but the log-likelihood does seem to have converged.


Figure 4.7: MIF2 Convergence Diagnostics.


Figure 4.8: Filter diagnostics (last iteration).


Figure 4.9: Likelihood surface.

4.2.5 Benchmark model (t-dist)

Below we show the coefficients of our ARMA(1,1)+GARCH(1,1) model with a conditional t-distribution. We see that the ARMA coefficients are not significant, but the GARCH coefficients are. We also notice that the log-likelihood improved after changing the conditional distribution to a t-distribution. This benchmark performs about as well as, or slightly below, our new Heston model. The advantage of the Heston model is that it is more interpretable.

          Estimate  Std. Error     t value   Pr(>|t|)
mu       0.0013943   0.0007519   1.8543092  0.0636949
ar1     -0.2472329   0.5784889  -0.4273770  0.6691047
ma1      0.2695307   0.5739062   0.4696424  0.6386105
omega    0.0000043   0.0000021   2.0192991  0.0434561
alpha1   0.0730921   0.0194996   3.7483995  0.0001780
beta1    0.9179424   0.0201980  45.4472189  0.0000000
shape    3.5523933   0.3773704   9.4135464  0.0000000
## LogLikelihood 
##      3681.482


Figure 4.10: Benchmark with 95% confidence bands.


5 Conclusion

In our experiment we find that the t-distributed Heston model performed best in terms of maximizing the log-likelihood. However, note that imposing a t-distribution on the driving noise of the return process is not typically done in practice, since a Brownian motion driver implies normally distributed increments. On the other hand, the normally distributed Heston model also performs well and is able to outperform the ARMA(1,1)+GARCH(1,1) model with normally distributed conditional errors. In the future it would be interesting to try different kinds of jump processes, or to incorporate stochastic correlation between the Brownian motions.


6 References

[1] https://quant.stackexchange.com/questions/18602/speed-of-mean-reversion-of-an-interest-rate-model
[2] https://en.wikipedia.org/wiki/Heston_model
[3] https://kingaa.github.io/pomp/vignettes/getting_started.html
[4] Lecture Notes from Stats 531
[5] https://en.wikipedia.org/wiki/Vasicek_model
[6] https://srdas.github.io/MLBook/FinanceModels.html
[7] https://ionides.github.io/531w18/final_project/16/final.html
[8] https://en.wikipedia.org/wiki/Itô%27s_lemma