1. Exploratory Analysis

Cryptocurrency (or crypto) seems to be one of those words that is always buzzing around, and it has really blown up in recent years. Broadly speaking, crypto is a digital form of currency, but one that carries its own record of who created each unit, who has owned it, and where it came from. This differs greatly from the early days of the internet, when piracy was much more rampant than it is now (e.g., Napster and music rights).1 One of the main attractions of crypto is that it is not maintained by, or reliant on, a central banking authority,2 which is meant to make crypto more egalitarian and less gatekept, although we live in a world where large companies already dominate the technology market. Understanding how this technology is used in markets will be important for understanding how it will shape our economy and society in the coming years.

Our research question of interest is: how can we best model cryptocurrency returns? We have three candidate models to analyze: the GARCH-ARIMA model, a stochastic volatility model with leveraged returns (based on Breto and work done in class), and a simple stochastic volatility model based on the Black-Scholes stochastic differential equation (SDE). Work with the Black-Scholes model was previously done for a 2018 final project by group 16.

##  chr [1:11129] "2021-01-01 05:00:00" "2021-01-01 06:00:00" ...

## [1] 3325
##                TimeStamp ClosePrice TradeVolume BuyVolume     Return
## 3325 2021-05-19 11:00:00    2721.08   955394402 413489401 -0.1307937
##                PriceTime       time
## 3325 2021-05-19 11:00:00 2021-05-19

Notice that there is a large negative spike in returns on May 19th, 2021. That corresponds to the day that Russian hackers reportedly stole 90 million in Bitcoin. While Bitcoin is not Ethereum, both are cryptocurrencies, and their prices are likely highly linked. There does not seem to be much evidence of a trend, so stationarity appears to be a reasonable assumption; at least, we do not have much evidence against it. We also leave that data point in: it is not a typo, and a random event such as hackers stealing crypto is good to have in the model, because hackers will continue to hack.
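The spike-hunting step above (finding the row with the most negative return, printed in the output block earlier) can be sketched as follows. This is an illustrative Python sketch on synthetic data, not the project's R code; the injected 12% drop stands in for the May 19th crash.

```python
import math
import numpy as np

# Synthetic hourly close prices with one injected crash-like drop.
rng = np.random.default_rng(1)
prices = 2500 * np.exp(np.cumsum(0.01 * rng.standard_normal(200)))
prices[120] = prices[119] * 0.88          # inject a 12% drop at hour 120

returns = np.diff(np.log(prices))         # hourly log returns
worst = int(np.argmin(returns))           # index of the largest negative return
print(worst, returns[worst])
```

With real data, the analogous step is to sort or index the return series by magnitude and inspect the timestamp of the extreme observation, as was done to identify the 2021-05-19 11:00:00 row.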

The analysis of the Ethereum data (a type of cryptocurrency) continues the work of midterm project #2.

The Ethereum data are incredibly noisy, with very frequent measurements and low autocorrelation. This is one of the advantages of working with crypto data: unlike traditional stock markets, we do not have to wait until market close to get an idea of returns. We can get hourly data quite easily!

2. Benchmarks based on simple time-series models

Benchmarks can be very helpful for our analysis since they provide a baseline for comparison. Their simplicity and interpretability can give us a basic understanding of the data and its characteristics. In this section, we apply the basic time-series models that we have learned in this course to our data, providing a reasonable benchmark for the analysis of more complex models.

2.1 ARMA

The ARMA model is one of the most basic models in time series analysis.3 Although its performance on financial data is limited, a simple, stable, and invertible ARMA model can still capture some characteristics of the data and serve as a useful benchmark. This code is a continuation of the code written by Group 2 for the midterm project (coded by Chongdan Pan).4

AIC for Ethereum Return

           MA0        MA1        MA2        MA3        MA4
AR0  -54698.88  -54699.73  -54712.70  -54715.31  -54721.70
AR1  -54699.50  -54703.11  -54719.06  -54717.49  -54719.74
AR2  -54712.26  -54719.59  -54722.18  -54715.07  -54720.76
AR3  -54714.60  -54718.01  -54715.59  -54719.38  -54721.40
AR4  -54723.34  -54721.77  -54722.32  -54720.32  -54737.98

Based on the AIC table, ARMA(2,2), AR(4), and ARMA(4,4) achieve notably low AIC values. For simplicity, we use AR(4) as the benchmark.

## 
## Call:
## arima(x = eth_demeaned, order = c(4, 0, 0))
## 
## Coefficients:
##          ar1      ar2      ar3      ar4  intercept
##       0.0161  -0.0415  -0.0214  -0.0345      0e+00
## s.e.  0.0105   0.0105   0.0105   0.0105      1e-04
## 
## sigma^2 estimated as 0.0001337:  log likelihood = 27367.67,  aic = -54723.34

The inverses of all four characteristic roots lie inside the unit circle, implying that AR(4) is a reasonable (causal) choice. Its AIC value of -54723.34 can serve as a benchmark for further analysis.
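The unit-circle check described above can be sketched directly from the fitted coefficients in the `arima()` output. This is an illustrative Python sketch (the project itself works in R), using the AR characteristic polynomial \(\phi(z)=1-\phi_1 z-\phi_2 z^2-\phi_3 z^3-\phi_4 z^4\):

```python
import numpy as np

# Fitted AR(4) coefficients from the arima() output above.
ar = [0.0161, -0.0415, -0.0214, -0.0345]

# Characteristic polynomial 1 - ar1*z - ar2*z^2 - ar3*z^3 - ar4*z^4;
# np.roots expects coefficients in descending order of degree.
coefs = [-a for a in ar[::-1]] + [1.0]
roots = np.roots(coefs)
inverse_roots = 1.0 / roots

# Causality requires all inverse roots strictly inside the unit circle.
print(np.abs(inverse_roots))
```

Because the fitted coefficients are all small in magnitude, the inverse roots sit well inside the unit circle, consistent with the conclusion above.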

Based on the plot of the residuals, some correlation remains within them. What's more, the Q-Q plot shows heavy tails in the residuals, which is critical in financial analysis.

2.2 AR-GARCH

GARCH models are typically used for volatility analysis, thanks to their assumption that there is internal time-series correlation within the volatility. In this section, we combine the AR and GARCH models and examine their joint performance.5 Based on the previous results, we fit AR(p) + GARCH(1,1) models for several AR orders:

      GARCH(1,1)  AR(1)+GARCH(1,1)  AR(2)+GARCH(1,1)  AR(3)+GARCH(1,1)  AR(4)+GARCH(1,1)
AIC    -55175.78         -57161.75         -57162.84         -57161.79         -57161.75
## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~arma(2, 0) + garch(1, 1), data = eth_demeaned, 
##     cond.dist = c("norm"), include.mean = TRUE) 
## 
## Mean and Variance Equation:
##  data ~ arma(2, 0) + garch(1, 1)
## <environment: 0x55cc4dd1b8c0>
##  [data = eth_demeaned]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##          mu          ar1          ar2        omega       alpha1        beta1  
##  7.7253e-05   1.3308e-02  -1.9424e-02   9.7893e-07   4.7019e-02   9.4591e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##          Estimate  Std. Error  t value Pr(>|t|)    
## mu      7.725e-05   9.581e-05    0.806    0.420    
## ar1     1.331e-02   1.129e-02    1.179    0.239    
## ar2    -1.942e-02   1.128e-02   -1.722    0.085 .  
## omega   9.789e-07   1.629e-07    6.009 1.87e-09 ***
## alpha1  4.702e-02   4.179e-03   11.250  < 2e-16 ***
## beta1   9.459e-01   4.800e-03  197.052  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  28587.42    normalized:  3.17638 
## 
## Description:
##  Mon Apr 18 14:31:40 2022 by user:  
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value    
##  Jarque-Bera Test   R    Chi^2  8301.537  0          
##  Shapiro-Wilk Test  R    W      NA        NA         
##  Ljung-Box Test     R    Q(10)  13.78231  0.1831548  
##  Ljung-Box Test     R    Q(15)  16.83657  0.3287278  
##  Ljung-Box Test     R    Q(20)  19.68004  0.4780979  
##  Ljung-Box Test     R^2  Q(10)  29.2359   0.001141029
##  Ljung-Box Test     R^2  Q(15)  31.38268  0.00780473 
##  Ljung-Box Test     R^2  Q(20)  33.67617  0.0284006  
##  LM Arch Test       R    TR^2   31.45032  0.001681191
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.351427 -6.346691 -6.351428 -6.349816

Based on the AIC table, AR(2) works best in our case. More importantly, combining AR and GARCH greatly decreases the AIC, showing a clear improvement over the pure ARMA benchmark.
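The GARCH(1,1) conditional-variance recursion underlying the fit above can be sketched with the fitted coefficients from the `garchFit` output. This is an illustrative Python sketch (the project's own fit uses R's fGarch package), and it also shows why the fit implies a finite unconditional variance:

```python
import numpy as np

# Fitted GARCH(1,1) coefficients from the garchFit output above.
omega, alpha1, beta1 = 9.7893e-07, 4.7019e-02, 9.4591e-01

# alpha1 + beta1 < 1 implies a finite unconditional variance
# omega / (1 - alpha1 - beta1).
persistence = alpha1 + beta1
uncond_var = omega / (1.0 - persistence)

# Simulate the recursion sigma^2_t = omega + alpha1*eps^2_{t-1} + beta1*sigma^2_{t-1}.
rng = np.random.default_rng(0)
n = 1000
eps = np.empty(n)          # simulated demeaned returns
sigma2 = np.empty(n)       # conditional variances
sigma2[0] = uncond_var
eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, n):
    sigma2[t] = omega + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

print(persistence, uncond_var)
```

Note that the fitted persistence is about 0.993, very close to 1, so volatility shocks decay slowly; this near-integrated behavior is common in financial return series.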

However, even with the combined AR-GARCH model, we still suffer from the problem of non-normal residuals. We hope that a POMP stochastic volatility model can assuage some of these issues, or at least fit better.

3. Using the Breto Model

We build a modified stochastic volatility model from Breto,6 which introduces leverage to reflect a phenomenon commonly observed in financial markets: negative shocks to returns tend to be followed by increases in volatility. The basic setting for the model is the following:

3.1 Implementing Breto's Model

\[Y_n=\exp(H_n/2)\epsilon_n\]
\[H_n = \mu_h(1-\phi)+\phi H_{n-1}+\beta_{n-1}R_n\exp(-H_{n-1}/2)+\omega_n\]
\[G_n = G_{n-1}+\nu_n\]
\[R_n=\frac{\exp(2G_n)-1}{\exp(2G_n)+1}\]

The latent state is \(X_n = (G_n, H_n)\), where \(Y_n\) is the observed return and \(\beta_n=Y_n\sigma_\eta \sqrt{1-\phi^2}\). Here \(\{\epsilon_n\}\) is an i.i.d. \(N(0,1)\) sequence, \(\{\nu_n\}\) is an i.i.d. \(N(0,\sigma_{\nu}^2)\) sequence, and \(\{\omega_n\}\) is a sequence of independent \(N(0,\sigma_{\omega,n}^2)\) variables with \(\sigma_{\omega,n}^2=\sigma_\eta^2(1-\phi^2)(1-R_n^2)\). In this model, \(H_n\) is the log volatility and \(G_n\) is a Gaussian random walk.
Building the model7
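A minimal simulation sketch of the recursions above may help build intuition before fitting. This is illustrative Python (the project's model is coded as a pomp object in R), and the parameter values below are assumptions chosen for plausibility, not fitted values; note that \(R_n=(e^{2G_n}-1)/(e^{2G_n}+1)\) is simply \(\tanh(G_n)\), so the leverage is constrained to \((-1,1)\):

```python
import numpy as np

# Illustrative (not fitted) parameter values.
mu_h, phi, sigma_eta, sigma_nu = -9.0, 0.98, 0.2, 0.05
rng = np.random.default_rng(2)

n = 500
G = np.empty(n); H = np.empty(n); Y = np.empty(n)
G[0], H[0] = 0.0, mu_h
Y[0] = np.exp(H[0] / 2) * rng.standard_normal()
for t in range(1, n):
    G[t] = G[t - 1] + sigma_nu * rng.standard_normal()   # Gaussian random walk
    R = np.tanh(G[t])                                    # leverage in (-1, 1)
    beta = Y[t - 1] * sigma_eta * np.sqrt(1 - phi ** 2)
    sigma_w = sigma_eta * np.sqrt((1 - phi ** 2) * (1 - R ** 2))
    H[t] = (mu_h * (1 - phi) + phi * H[t - 1]
            + beta * R * np.exp(-H[t - 1] / 2)
            + sigma_w * rng.standard_normal())            # log volatility
    Y[t] = np.exp(H[t] / 2) * rng.standard_normal()       # observed return

print(H[-1], np.tanh(G[-1]))
```

A useful observation from the algebra: since \(\beta_{n-1}\exp(-H_{n-1}/2)=\sigma_\eta\sqrt{1-\phi^2}\,\epsilon_{n-1}\), the leverage term feeds the previous return shock back into the volatility with a bounded coefficient, which keeps the simulation stable.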

4. Simple Stochastic Volatility Model

In class we analyzed the data using leveraged returns and the fairly complex model of Breto et al. We wanted to see whether a simpler model would suffice, and perhaps even perform better, for Ethereum returns.

Our proposed model comes from the Heston model,9 in which the price and volatility evolve, respectively, as: \[dS_t=\mu S_t dt +\sqrt{v_t}S_t dW_t\] \[dv_t=\kappa(\theta-v_t)dt+\xi \sqrt{v_t}dW_t^v\]

where \(W\) is a Brownian motion process, \(\mu\) is the drift of the stock, \(\theta\) is the long-run mean of \(\{v_t\}\), \(\kappa\) is the rate at which \(v_t\) reverts toward \(\theta\), and \(\xi\) is the volatility of \(v_t\) (the "vol of vol"). This model is generally seen as an improvement over the Black-Scholes model for stock prices because its variance is non-constant.

Our data are hourly observations of returns, and we have de-meaned the data, so following the work of Project 16, Winter '18,10 we rewrite the Heston model for the volatility of returns with \(\mu=1\) as: \[ V_n=(1-\phi) \theta +\phi V_{n-1}+\sqrt{V_{n-1}}\omega_n\]

with constraints \(\omega_n\sim N(0, \sigma_\omega^2)\), \(\phi \in (0,1)\), and \(\theta, \sigma_\omega>0\).

We build a POMP model based on the Heston model as follows. Most of the following code was adapted from a previous final project, but with a key difference: unlike Project 16, we do not treat the observed process as a random process in the state, a choice that made their POMP model more challenging to implement than necessary.

4.1 Implementing the Heston Model

Below we have defined the POMP variables, the random process, and the measurement model. Both are relatively simple, but because we have discretized the stochastic differential equation, \(V_n\) can sometimes drop below \(0\), which makes calculation of \(V_{n+1}\) impossible since \(\sqrt{V_n}\) would be imaginary. As a result, we included an 'if' statement that maps any negative \(V_n\) to 0. We also applied parameter transformations to enforce the constraints noted above.
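The discretized volatility recursion with the truncation at zero can be sketched as follows. This is an illustrative Python sketch of the process model only (the project's actual implementation is C snippets inside a pomp object); the parameter values and the measurement equation \(Y_n=\sqrt{V_n}\,\epsilon_n\) are assumptions for illustration:

```python
import numpy as np

# Illustrative (not fitted) parameter values.
theta, phi, sigma_omega = 1.3e-4, 0.95, 2e-3
rng = np.random.default_rng(3)

n = 1000
V = np.empty(n)
V[0] = theta
for t in range(1, n):
    v = ((1 - phi) * theta + phi * V[t - 1]
         + np.sqrt(V[t - 1]) * sigma_omega * rng.standard_normal())
    V[t] = max(v, 0.0)     # the 'if' statement: map negative V to 0

# Assumed measurement model for illustration: Y_n = sqrt(V_n) * eps_n.
Y = np.sqrt(V) * rng.standard_normal(n)
print(V.mean())
```

The truncation is a pragmatic fix; more elaborate discretizations of square-root diffusions (e.g., reflection rather than absorption at zero) exist, but absorption at zero is the behavior described above.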

Now we create a POMP object and particle filter for the Ethereum data.

4.2 Local Search

After filtering and simulating, we explore the likelihood surface. We start with a Local Search, following code that Dr. Ionides wrote for the Breto model.11

##                        se 
## 2.754766e+04 1.684144e-02

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   34536   34647   34719   34720   34781   34975
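The replicated particle-filter log-likelihood evaluations summarized above are combined on the likelihood scale rather than averaged directly. A minimal Python sketch of the log-mean-exp computation (the point estimate that pomp's `logmeanexp` reports; its jackknife standard error is omitted here):

```python
import math

def logmeanexp(logliks):
    """Log of the mean of the likelihoods, given log likelihoods.

    Subtracting the max first avoids overflow when exponentiating."""
    m = max(logliks)
    return m + math.log(sum(math.exp(x - m) for x in logliks) / len(logliks))

# Identical replicates recover the common value exactly.
print(logmeanexp([34719.0, 34719.0, 34719.0]))
```

Averaging on the likelihood scale means the combined estimate is dominated by the largest replicate log likelihoods, which is why it can sit well above the median of the replicates.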

Looking at some of the diagnostic plots after running at run level 1, we notice problems with the effective sample size, with too many particles regularly being killed off. This might just be an artifact of run level 1 and the limited particle diversity it provides in the first place.

4.3 Global Search

After exploring the local surface, we move to the global.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   34365   34649   34723   34724   34816   34944

The diagnostic plots and convergence diagrams indicate that the parameters converge fairly well. However, there are many iterations where we lose particles and the effective sample size drops below 5. This may indicate that the data have some strange behaviors at certain points in time that the model cannot explain. We noted earlier that there was an outlier in the Ethereum data corresponding to the reported Bitcoin theft; future analysis should take this into account and attempt to find a better fit, or at least add some type of hacking or human-behavior term to the model.

5. Conclusion and Comparison Between Benchmark and POMP Models

The log-likelihood for the ARIMA-GARCH model is 28587.42 with 7 parameters. The log-likelihood for the Heston model is roughly 34975.32 with 6 parameters. The log-likelihood for the Breto model is 28977 with 6 parameters.

We conclude that the Heston model is likely the best of the three for understanding the dynamics of crypto volatility. It also has the benefit of interpretable parameters. Diagnostically it performs a little better as well, with more parameters converging than in the Breto model.

6. References