In this project, we investigate the volatility of the USD/CNY exchange rate, mainly with partially observed Markov process (POMP) stochastic volatility models. Understanding the dynamics of volatility helps investors better manage the risks and opportunities in the foreign exchange market.
The data are obtained from Yahoo Finance [1]. We select 500 time points ranging from 2018-05-21 to 2020-04-17 to build the models. The original series clearly shows some trend and seasonality, so we calculate log-returns of the exchange rate and demean them for further analysis. The plot of the demeaned log-returns shows some volatility clustering.
dat = read.csv('USDCNY.csv',header=TRUE)
dat = dat[801:1300,]   # 500 trading days, 2018-05-21 to 2020-04-17
dat$Date = as.Date(dat$Date)
dat$Adj.Close = as.numeric(as.character(dat$Adj.Close))
USX = na.omit(dat)
ret = diff(log(USX$Adj.Close))        # log-returns
USX.ret.demeaned = ret - mean(ret)    # demeaned log-returns
par(mfrow=c(1,2))
plot(as.Date(USX$Date),USX$Adj.Close,type='l',xlab='Date',ylab='USD/CNY')
plot(as.Date(USX$Date[-1]),USX.ret.demeaned,type='l',xlab='Date',ylab='USD/CNY (demeaned log-returns)')
The Heston model is one of the most popular stochastic volatility models. It is used for option pricing under the assumption that volatility is itself stochastic rather than constant [3]. This contrasts with the famous Black-Scholes model, which assumes geometric Brownian motion (GBM) dynamics with constant volatility.
The basic Heston model assumes that the asset price \(S_t\) is determined by a stochastic process [2]:
\[dS_t = \mu S_t dt + \sqrt{\nu_t}S_t dW_t^S \]
where \(\mu\) is the rate of return and \(\nu_t\) is the instantaneous variance. The variance \(\nu_t\) follows a Cox–Ingersoll–Ross (CIR) process \(d\nu_t = \kappa(\theta-\nu_t)dt+\xi\sqrt{\nu_t}dW_t^{\nu}\), where \(W_t^S\) and \(W_t^{\nu}\) are Wiener processes.
The discrete-time representation for \(\nu_t\) is [4]:
\[\begin{split} Y_n &= \sqrt{\nu_n}\,\epsilon_n \\ \nu_n &= (1-\phi)\theta + \phi\nu_{n-1} + \sqrt{\nu_{n-1}}\,\omega_n \end{split}\]
with \(0<\phi<1\), \(\theta>0\), where \(\epsilon_n\) are iid \(N(0,1)\) and \(\omega_n\) are iid \(N(0,\sigma_{\omega}^2)\).
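Before building the POMP model, the recursion itself can be simulated directly in base R as a quick illustration of how it generates volatility clustering. This is a minimal sketch with arbitrary illustrative parameter values (not the fitted ones):

```{r}
# Minimal sketch of the discrete-time variance recursion above (illustrative values only)
set.seed(1)
N     <- 500
theta <- 2e-5    # long-run variance level (hypothetical)
phi   <- 0.9     # persistence (hypothetical)
sigma <- 1e-3    # sd of the variance shocks omega_n (hypothetical)
V <- numeric(N); Y <- numeric(N)
V[1] <- theta
for (n in 2:N) {
  V[n] <- (1 - phi) * theta + phi * V[n-1] + sqrt(V[n-1]) * rnorm(1, 0, sigma)
  if (V[n] <= 0) V[n] <- (1 - phi) * theta   # same fallback used in the rprocess below
  Y[n] <- sqrt(V[n]) * rnorm(1)
}
plot(Y, type = "l", xlab = "n", ylab = expression(Y[n]))
```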
The state variables and the parameters to be optimized are declared below.
HES_statenames <- c("V","Y_state")
HES_rp_names <- c("sigma_omega","phi","theta")
HES_ivp_names <- c("V_0")
HES_paramnames <- c(HES_rp_names,HES_ivp_names)
HES_covarnames <- "covaryt"
We build the components of the POMP model as well as those needed for simulation. Note that there is an implicit positivity restriction on the instantaneous variance \(V_n\): although the continuous-time CIR process stays nonnegative, the discretized recursion can produce negative values of \(V_n\) in the rprocess, in which case both \(\sqrt{V_n}\) and the resulting likelihood are meaningless. We first adopt the approach of replacing any negative value by the constant term \(\theta(1-\phi)\). Moreover, we apply an expit/logit transform to \(\phi\) since \(0<\phi<1\), and exp/log transforms to \(\sigma_\omega, \theta, V_0\) since they are positive.
rproc1 <- "
  double omega;
  omega = rnorm(0,sigma_omega);
  /* one step of the discrete-time variance recursion given above */
  V = theta*(1 - phi) + phi*V + sqrt(V)*omega;
  /* guard against a negative variance: fall back to the constant term */
  if (V <= 0) {
    V = theta*(1 - phi);
  }
"
rproc2.sim <- "
Y_state = rnorm(0,sqrt(V));
"
rproc2.filt <- "
Y_state = covaryt;
"
HES_rproc.sim <- paste(rproc1,rproc2.sim)
HES_rproc.filt <- paste(rproc1,rproc2.filt)
HES_rinit <- "
V = V_0;
Y_state = rnorm(0,sqrt(V));
"
HES_rmeasure <- "
y=Y_state;
"
HES_dmeasure <- "
lik=dnorm(y,0,sqrt(V),give_log);
"
HES_partrans <- parameter_trans(
log=c("sigma_omega","theta","V_0"),
logit="phi"
)
HES.filt <- pomp(data=data.frame(
y=USX.ret.demeaned,time=1:length(USX.ret.demeaned)),
statenames=HES_statenames,
paramnames=HES_paramnames,
times="time",
t0=0,
covar=covariate_table(
time=0:length(USX.ret.demeaned),
covaryt=c(0,USX.ret.demeaned),
times="time"),
rmeasure=Csnippet(HES_rmeasure),
dmeasure=Csnippet(HES_dmeasure),
rprocess=discrete_time(step.fun=Csnippet(HES_rproc.filt),
delta.t=1),
rinit=Csnippet(HES_rinit),
partrans=HES_partrans
)
After building the POMP model, we first test the code on simulated data to check whether the test parameters are in a reasonable range.
params_test <- c(
sigma_omega = 0.001,
phi = 0.001,
theta = 0.00002,
V_0 = 0.002
)
sim1.sim <- pomp(HES.filt,
statenames=HES_statenames,
paramnames=HES_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(HES_rproc.sim),delta.t=1)
)
sim1.sim <- simulate(sim1.sim,seed=1,params=params_test)
sim1.filt <- pomp(sim1.sim,
covar=covariate_table(
time=c(timezero(sim1.sim),time(sim1.sim)),
covaryt=c(obs(sim1.sim),NA),
times="time"),
statenames=HES_statenames,
paramnames=HES_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(HES_rproc.filt),delta.t=1)
)
run_level <- 3
HES_Np <- switch(run_level, 10, 1e3, 2e3)
HES_Nmif <- switch(run_level, 10, 100, 200)
HES_Nreps_eval <- switch(run_level, 4, 10, 20)
HES_Nreps_local <- switch(run_level, 10, 20, 20)
HES_Nreps_global <- switch(run_level, 10, 20, 100)
stew(file=sprintf("pf1-HES-final-%d.rda",run_level),{
t.pf1 <- system.time(
pf1 <- foreach(i=1:HES_Nreps_eval,
.packages="pomp") %dopar% pfilter(sim1.filt,Np=HES_Np))
},seed=493536993,kind="L'Ecuyer")
L.pf1.hes <- logmeanexp(sapply(pf1,logLik),se=TRUE)
L.pf1.hes
## se
## 1.961259e+03 1.402184e-02
In a short time, we obtain a log-likelihood estimate of around 1961 with a standard error of 0.014. Below we compare the original demeaned log-returns with the simulated data; the two curves show a similar volatility clustering pattern. We treat the first few points as a burn-in period and remove them.
plot(Y_state[-c(1:5)]~time[-c(1:5)], data=sim1.sim, type='l',col='red',xlab='Time',ylab='Log-returns')
lines(USX.ret.demeaned)
legend('top', legend=c('Simulated data','Observed data'), col=c('red','black'), lty=1, cex=0.8)
The simulated curve can look dramatically different if we try different test parameters, as illustrated below; this gives a rough sense of how to set the starting parameters for iterated filtering on the original data.
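For instance, re-simulating with a larger volatility-of-volatility and stronger persistence (hypothetical values chosen purely for illustration) produces a visibly different path:

```{r}
# Re-simulate with an alternative (purely illustrative) parameter set
params_alt <- c(
  sigma_omega = 0.005,   # larger volatility-of-volatility than params_test
  phi         = 0.9,     # stronger persistence
  theta       = 0.00002,
  V_0         = 0.002
)
sim.alt <- simulate(sim1.sim, seed = 2, params = params_alt)
plot(Y_state ~ time, data = sim.alt, type = 'l', col = 'blue',
     xlab = 'Time', ylab = 'Log-returns (alternative parameters)')
```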
We first run a local search starting from the test parameters, with the random walk standard deviations set below. The run level is set to 3 for better accuracy. The maximum log-likelihood we obtain is 2243.
HES_rw.sd_rp <- 0.001
HES_cooling.fraction.50 <- 0.5
HES_rw.sd <- rw.sd(
sigma_omega = HES_rw.sd_rp,
phi = HES_rw.sd_rp,
theta = HES_rw.sd_rp,
V_0 = ivp(HES_rw.sd_rp)
)
stew(file=sprintf("mif1-HES-final-%d.rda",run_level),{
t.if1 <- system.time({
if1 <- foreach(i=1:HES_Nreps_local,
.packages="pomp", .combine=c) %dopar% mif2(HES.filt,
params=params_test,
Np=HES_Np,
Nmif=HES_Nmif,
cooling.fraction.50=HES_cooling.fraction.50,
rw.sd = HES_rw.sd)
L.if1 <- foreach(i=1:HES_Nreps_local,
.packages="pomp", .combine=rbind) %dopar% logmeanexp(
replicate(HES_Nreps_eval, logLik(pfilter(HES.filt,params=coef(if1[[i]]),Np=HES_Np))), se=TRUE)
})
},seed=20200426,kind="L'Ecuyer")
r.if1 <- data.frame(logLik=L.if1[,1],logLik_se=L.if1[,2],t(sapply(if1,coef)))
summary(r.if1$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2240 2240 2241 2241 2242 2243
plot(if1)
The filter and convergence diagnostics indicate that the effective sample size is sufficiently large in the last iteration. The log-likelihood, \(\sigma_\omega\), \(\phi\) and \(\theta\) generally converge to a small interval after 200 mif iterations, while \(V_0\) converges in most cases.
pairs(~logLik+sigma_omega+phi+theta+V_0,data=subset(r.if1,logLik>max(logLik)-20))
From the log-likelihood surface in a neighborhood of the maximum, we can see that the replications end up with comparable likelihoods, though this local search need not find the global optimum.
To deal with the negative values of \(V\) that appear during the rprocess, we also tried the modified measurement density below. The resulting maximum log-likelihood is 2222, so we keep the previous approach.
HES_dmeasure <- "
  if (V > 0.0) {
    lik = dnorm(y,0,sqrt(V),give_log);
  } else {
    lik = (give_log) ? R_NegInf : 0.0;  /* zero likelihood when the variance is not positive */
  }
"
r.if1.2 <- data.frame(logLik=L.if1[,1],logLik_se=L.if1[,2],t(sapply(if1,coef)))
summary(r.if1.2$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2220 2222 2222 2222 2222 2222
We next carry out likelihood maximization using randomized starting values: we set a parameter box and randomly sample the starting values from it. The best likelihood we obtain is around 1762.
HES_box <- rbind(
sigma_omega=c(0.001,0.003),
phi = c(0.0001,0.0005),
theta = c(0.000012,0.000015),
V_0 = c(0.001,0.002)
)
stew(file=sprintf("box_eval-HES-final-%d.rda",run_level),{
t.box <- system.time({
if.box <- foreach(i=1:HES_Nreps_global,
.packages='pomp',.combine=c) %dopar% mif2(if1[[1]],
params=apply(HES_box,1,function(x) runif(1,min=x[1],max=x[2])))
L.box <- foreach(i=1:HES_Nreps_global,
.packages='pomp',.combine=rbind) %dopar% {
logmeanexp(replicate(HES_Nreps_eval, logLik(pfilter(
HES.filt,params=coef(if.box[[i]]),Np=HES_Np))),
se=TRUE)
}
})
},seed=20200426,kind="L'Ecuyer")
r.box <- data.frame(logLik=L.box[,1],logLik_se=L.box[,2],t(sapply(if.box,coef)))
summary(r.box$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 21.18 318.58 494.98 547.82 677.74 1761.90
We then construct the POMP stochastic volatility model developed by Bretó (2014) [5][6], which incorporates leverage \(R_n\):
\[R_n = \frac{\exp\{2G_n\}-1}{\exp\{2G_n\}+1} \]
where \(G_n\) is a Gaussian random walk.
The whole model is:
\[\begin{split} Y_n &= \exp(H_n/2) \epsilon_n \\ H_n &= \mu_h(1-\phi) + \phi H_{n-1} + \beta_{n-1}R_n\exp(-H_{n-1}/2)+\omega_n \\ G_n &= G_{n-1} + \nu_n \end{split}\]
where \(H_n\) is the log volatility, \(\beta_n = Y_n \sigma_\eta \sqrt{1-\phi^2}\), \(\epsilon_n\) are iid \(N(0,1)\), \(\nu_n\) are iid \(N(0, \sigma_{\nu}^2)\), and \(\omega_n\) are iid \(N(0,\sigma_\eta^2(1-\phi^2)(1-R_n^2))\).
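Note that the leverage term is simply a hyperbolic tangent of the random walk, which is why the process snippets below compute it as `tanh(G)`:
\[R_n = \frac{\exp\{2G_n\}-1}{\exp\{2G_n\}+1} = \tanh(G_n) \in (-1,1).\]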
One prominent advantage of this model over the Heston model is that we do not need to worry about taking the square root of a negative value, since the volatility enters as \(\exp(H_n/2)\).
The POMP model and the corresponding parameters to be optimized are presented below.
USX_statenames <- c("H","G","Y_state")
USX_rp_names <- c("sigma_nu","mu_h","phi","sigma_eta")
USX_ivp_names <- c("G_0","H_0")
USX_paramnames <- c(USX_rp_names,USX_ivp_names)
USX_covarnames <- "covaryt"
rproc1 <- "
double beta,omega,nu;
omega = rnorm(0,sigma_eta * sqrt( 1- phi*phi ) * sqrt(1-tanh(G)*tanh(G)));
nu = rnorm(0, sigma_nu);
G += nu;
beta = Y_state * sigma_eta * sqrt( 1- phi*phi );
H = mu_h*(1 - phi) + phi*H + beta * tanh( G ) * exp(-H/2) + omega;
"
rproc2.sim <- "
Y_state = rnorm( 0,exp(H/2) );
"
rproc2.filt <- "
Y_state = covaryt;
"
USX_rproc.sim <- paste(rproc1,rproc2.sim)
USX_rproc.filt <- paste(rproc1,rproc2.filt)
USX_rinit <- "
G = G_0;
H = H_0;
Y_state = rnorm( 0,exp(H/2) );
"
USX_rmeasure <- "
y=Y_state;
"
USX_dmeasure <- "
lik=dnorm(y,0,exp(H/2),give_log);
"
USX_partrans <- parameter_trans(
log=c("sigma_eta","sigma_nu"),
logit="phi"
)
USX.filt <- pomp(data=data.frame(
y=USX.ret.demeaned,time=1:length(USX.ret.demeaned)),
statenames=USX_statenames,
paramnames=USX_paramnames,
times="time",
t0=0,
covar=covariate_table(
time=0:length(USX.ret.demeaned),
covaryt=c(0,USX.ret.demeaned),
times="time"),
rmeasure=Csnippet(USX_rmeasure),
dmeasure=Csnippet(USX_dmeasure),
rprocess=discrete_time(step.fun=Csnippet(USX_rproc.filt),
delta.t=1),
rinit=Csnippet(USX_rinit),
partrans=USX_partrans
)
As before, we first run the particle filter on simulated data to check the POMP model and to choose a set of reasonable test parameters to start from.
params_test <- c(
sigma_nu = 0.001,
mu_h = -12,
phi = expit(0.8),
sigma_eta = exp(-0.8),
G_0 = 0,
H_0 = 0
)
sim1.sim <- pomp(USX.filt,
statenames=USX_statenames,
paramnames=USX_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(USX_rproc.sim),delta.t=1)
)
sim1.sim <- simulate(sim1.sim,seed=1,params=params_test)
sim1.filt <- pomp(sim1.sim,
covar=covariate_table(
time=c(timezero(sim1.sim),time(sim1.sim)),
covaryt=c(obs(sim1.sim),NA),
times="time"),
statenames=USX_statenames,
paramnames=USX_paramnames,
rprocess=discrete_time(
step.fun=Csnippet(USX_rproc.filt),delta.t=1)
)
run_level <- 3
USX_Np <- switch(run_level, 10, 1e3, 2e3)
USX_Nmif <- switch(run_level, 10, 100, 200)
USX_Nreps_eval <- switch(run_level, 4, 10, 20)
USX_Nreps_local <- switch(run_level, 10, 20, 20)
USX_Nreps_global <- switch(run_level, 10, 20, 100)
stew(file=sprintf("pf1-final-%d.rda",run_level),{
t.pf1 <- system.time(
pf1 <- foreach(i=1:USX_Nreps_eval,
.packages="pomp") %dopar% pfilter(sim1.filt,Np=USX_Np))
},seed=493536993,kind="L'Ecuyer")
L.pf1 <- logmeanexp(sapply(pf1,logLik),se=TRUE)
plot(Y_state[-c(1:5)]~time[-c(1:5)], data=sim1.sim, type='l',col='red',xlab='Time',ylab='Log-returns')
lines(USX.ret.demeaned)
legend('top', legend=c('Simulated data','Observed data'), col=c('red','black'), lty=1, cex=0.8)
The simulated sequence demonstrates a similar pattern to the original data; we can clearly identify volatility variation within some time windows.
We then fit the model using a local search on the original data. The best log-likelihood is 2248.
USX_rw.sd_rp <- 0.02
USX_rw.sd_ivp <- 0.1
USX_cooling.fraction.50 <- 0.5
USX_rw.sd <- rw.sd(
sigma_nu = USX_rw.sd_rp,
mu_h = USX_rw.sd_rp,
phi = USX_rw.sd_rp,
sigma_eta = USX_rw.sd_rp,
G_0 = ivp(USX_rw.sd_ivp),
H_0 = ivp(USX_rw.sd_ivp)
)
stew(file=sprintf("mif1-final-%d.rda",run_level),{
t.if1 <- system.time({
if1 <- foreach(i=1:USX_Nreps_local,
.packages="pomp", .combine=c) %dopar% mif2(USX.filt,
params=params_test,
Np=USX_Np,
Nmif=USX_Nmif,
cooling.fraction.50=USX_cooling.fraction.50,
rw.sd = USX_rw.sd)
L.if1 <- foreach(i=1:USX_Nreps_local,
.packages="pomp", .combine=rbind) %dopar% logmeanexp(
replicate(USX_Nreps_eval, logLik(pfilter(USX.filt,params=coef(if1[[i]]),Np=USX_Np))), se=TRUE)
})
},seed=20200426,kind="L'Ecuyer")
r.if1 <- data.frame(logLik=L.if1[,1],logLik_se=L.if1[,2],t(sapply(if1,coef)))
summary(r.if1$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2247 2247 2247 2247 2247 2248
plot(if1)
pairs(~logLik+sigma_nu+mu_h+phi+sigma_eta,data=subset(r.if1,logLik>max(logLik)-20))
The convergence diagnostics indicate that \(\mu_h\), \(\phi\), \(\sigma_{\eta}\) and \(G_0\) already converge fairly well. The likelihood tends to concentrate in a small range for those parameters.
To further optimize the likelihood function, we conduct a global search using randomized starting values.
USX_box <- rbind(
sigma_nu = c(0.0005,0.003),
mu_h = c(-15,-10),
phi = c(0.5,0.99),
sigma_eta = c(0.3,0.8),
G_0 = c(-2,2),
H_0 = c(-3,3)
)
stew(file=sprintf("box_eval-final-%d.rda",run_level),{
t.box <- system.time({
if.box <- foreach(i=1:USX_Nreps_global,
.packages='pomp',.combine=c) %dopar% mif2(if1[[1]],
params=apply(USX_box,1,function(x) runif(1,min=x[1],max=x[2])))
L.box <- foreach(i=1:USX_Nreps_global,
.packages='pomp',.combine=rbind) %dopar% {
logmeanexp(replicate(USX_Nreps_eval, logLik(pfilter(
USX.filt,params=coef(if.box[[i]]),Np=USX_Np))),
se=TRUE)
}
})
},seed=20200426,kind="L'Ecuyer")
r.box <- data.frame(logLik=L.box[,1],logLik_se=L.box[,2],t(sapply(if.box,coef)))
summary(r.box$logLik,digits=5)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2184 2203 2247 2232 2248 2248
plot(if.box)
pairs(~logLik+sigma_nu+mu_h+phi+sigma_eta,data=subset(r.box,logLik>max(logLik)-20))
The best likelihood is still 2248, with the third quartile also increasing to 2248, meaning that more of the likelihood values are concentrated at the higher end. In the scatterplot, the likelihood surface becomes more concentrated even though the parameters have diverse starting values. The diagnostic plots show that the effective sample size is large enough and the filtering results are reasonable.
Finally, we fit a GARCH(1,1) model to the demeaned log-returns for comparison. The conditional log-likelihood is around 2204.7 with 3 parameters. The fit is not entirely satisfactory: the Jarque–Bera test indicates that the normality assumption on the residuals does not hold, and a normal Q–Q plot confirms heavier-than-normal tails.
library(tseries)
fit.garch <- garch(USX.ret.demeaned,grad = "numerical",trace = FALSE)
Lik.garch <- tseries:::logLik.garch(fit.garch)
summary(fit.garch)
##
## Call:
## garch(x = USX.ret.demeaned, grad = "numerical", trace = FALSE)
##
## Model:
## GARCH(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.98286 -0.56364 -0.05953 0.46306 5.12131
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 7.211e-06 NA NA NA
## a1 5.000e-02 NA NA NA
## b1 5.000e-02 NA NA NA
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 239.28, df = 2, p-value < 2.2e-16
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.74715, df = 1, p-value = 0.3874
Lik.garch
## 'log Lik.' 2204.738 (df=3)
qqnorm(fit.garch$residuals)
qqline(fit.garch$residuals)
In this report, we have fitted two POMP models to the demeaned log-returns of the USD/CNY exchange rate. The data have small magnitude but obvious volatility variation over the selected time window. The best log-likelihood for the Heston model is 2243 with 4 fitted parameters, while that for the stochastic volatility model with leverage is 2248 with 6 fitted parameters. For comparison, we also fitted a GARCH(1,1) model, whose conditional log-likelihood is around 2204.7 with 3 parameters. AIC prefers the stochastic volatility model with leverage. Although the Heston model has fewer parameters, we have to pay additional attention to handling possible negative variance values. In terms of computation time, the two stochastic volatility models are comparable.
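The AIC comparison can be checked directly from these figures (a quick sketch using the reported maximum log-likelihoods, with \(\mathrm{AIC} = 2k - 2\ell\)):

```{r}
# AIC = 2k - 2*(maximized log-likelihood), using the values reported above
aic <- function(loglik, k) 2*k - 2*loglik
aic(2243,   4)   # Heston model:    -4478.0
aic(2248,   6)   # leverage model:  -4484.0  (smallest AIC, preferred)
aic(2204.7, 3)   # GARCH(1,1):      -4403.4
```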
With the fitted parameters, we can simulate the instantaneous volatility of the log-returns of the USD/CNY exchange rate, and therefore better understand the potential risks and opportunities in the foreign exchange market.
[1] Data source: https://finance.yahoo.com/quote/USDCNY%3DX/history?p=USDCNY%3DX
[2] Wikipedia: Heston model. https://en.wikipedia.org/wiki/Heston_model
[3] Janek, A. et al. FX smile in the Heston model. https://link.springer.com/chapter/10.1007/978-3-642-18062-0_4
[4] Backus, D. et al. Discrete-Time Models of Bond Pricing. https://pdfs.semanticscholar.org/94be/95f51edb40d372e9a8cf42354930e096827e.pdf
[5] Bretó, C. (2014). On idiosyncratic stochasticity of financial leverage effects. Statistics & Probability Letters.
[6] Edward L. Ionides, Stats 531 (Winter 2020) ‘Analysis of Time Series’ class notes: https://ionides.github.io/531w20/