1. Motivation and Project Goal

Natural gas is the second largest energy consumption source in the United States. Analyzing natural gas price is interesting because natural gas is an important commodity product as well as an important future product. Its price will affect trading company’s revenue, and manufacturing company’s cost structure. It is also a signal to make energy production plan. There are significant amount of former work analyzing stock price, yet not that many on the natural gas price. Thus, this project is interested in using time series analysis methods that are popular in analyzing stock price (GARCH and POMP) to study the pattern of Henry Hub US Natural Gas Spot Price from 2013 to 2017.



2. Preliminary Data Analysis

Here is the glance of the dataset to get a better sense of the data. The original dataset includes historical daily price data from 1997. Considering the computation feasibility, this project selected daily data since 2013.

##       Day                 Price         Weekday               Time     
##  Min.   :2013-01-02   Min.   :1.490   Length:1237        Min.   :   1  
##  1st Qu.:2014-03-26   1st Qu.:2.750   Class :character   1st Qu.: 310  
##  Median :2015-06-17   Median :3.070   Mode  :character   Median : 619  
##  Mean   :2015-06-09   Mean   :3.249                      Mean   : 619  
##  3rd Qu.:2016-08-23   3rd Qu.:3.790                      3rd Qu.: 928  
##  Max.   :2017-10-30   Max.   :8.150                      Max.   :1237
## [1] "Total data points: 1237"

From Figure 1, we can see that there is a significant pump in the winter of 2014 due to extreme weather, but there is no show of seasonality.

From Figure 2, we can see that the demeaned log return of natural gas price has some similar features to the stock market, which is said that there are time periods having higher volatility, and high volatility usually clusters together. Thus, we choose to fit a GARCH Model – a widely used model for financial time series – in section 3.



3. GARCH Model

A GARCH(p,q) Model can be notate as below: \[ Y_n = \epsilon_n \sqrt{V_n},\] where \[ V_n = \alpha_0 + \sum_{j=1}^p \alpha_j Y_{n-j}^2 + \sum_{k=1}^q \beta_k V_{n-k}\] and \(\epsilon_{1:N}\) is white noise.

The GARCH model assumes that \(y_n\) needs to be uncorrelated, but \(y_n^2\) is correlated. From Figure 3, we can see that \(y_n^2\) is well correlated, but \(y_n\) also has some lag in 1, 2, 3, 9, 12, 14, 16, and 18. So \(y_n\) is further processed to use time series decomposition to extract the random part as the less corrrelated data. We choose to fit a GARCH\((1,1)\) model since it is generally a popular choice (Cowpertwait and Metcalfe 2009).

## 
## Call:
## garch(x = ddPrice, grad = "numerical", trace = FALSE)
## 
## Model:
## GARCH(1,1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -5.31392 -0.54084  0.01836  0.54326  5.31253 
## 
## Coefficient(s):
##     Estimate  Std. Error  t value Pr(>|t|)    
## a0 4.614e-05   5.796e-06    7.960 1.78e-15 ***
## a1 1.501e-01   1.818e-02    8.255 2.22e-16 ***
## b1 8.156e-01   1.362e-02   59.887  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Diagnostic Tests:
##  Jarque Bera Test
## 
## data:  Residuals
## X-squared = 658.48, df = 2, p-value < 2.2e-16
## 
## 
##  Box-Ljung test
## 
## data:  Squared.Residuals
## X-squared = 0.19883, df = 1, p-value = 0.6557
## 'log Lik.' 1853.099 (df=3)

The loglikelihood of GARCH(1,1) model is 1853.099, and all the coefficient are significant. Also, the p-value of Box-Ljung test is not significant, so it is not sufficient to reject the null hypothesis that the squared residuals are uncorrelated. However, the p-value of Jarque Bera test is very small, so we should reject the null hypothesis of the residuals are normally distributed.

The Q-Q Plot in Figure 4 also shows that the data is quite heavy tail. We can tell that the natural gas price log-returns is also similar to the stock price log-returns in that they both show heavy-tails.

Since it is hard to interprete parameters of GARCH for financial market behavior, we need to go beyond GARCH model to POMP model for further analysis.

4. POMP Model

4.1 Theory

In the stock market, there is a well studied observation called leverage, which is said that negative shocks to a stockmarket index are associated with a subsequent increase in volatility. (Ionides, STATS531 notes14) This section will implement the stochastic volatility with random-walk leverage (Breto, 2014) and analyze this phenomenon in the natural gas market.

We define the following variables:

\(Y_t\): a financial rate of return in time period \(t\)

\(σ_t^2\): the volatility of \(y_t\)

\(H_t\): log volatility of \(y_t\)

\(R_t\): the Fisher-transformed leverage process driven by the random-walk leverage factor process \(G_t\)

\(\epsilon_t\): Gaussian unit-variance white noise processes

\(ν_t\): a Gaussian white fnoise process

\[Y_t = σ_t^2 \epsilon_t = \exp\{H_t/2\} \epsilon_t\] \[H_t = \mu_h (1-\phi) +\phi H_{t-1} + \beta_{t-1}R_t \exp\{-H_{t-1}/2\} +\omega_t\] \[R_t = \frac{\exp\{2G_t\}-1}{\exp\{2G_t\}+1}\] \[G_t = G_{t-1}+\nu_t\] where \(\beta_n=Y_n\sigma_\eta\sqrt{1-\phi^2}\), \(\{\epsilon_n\}\) is an iid \(N(0,1)\) sequence, \(\{\nu_n\}\) is an iid \(N(0,\sigma_{\nu}^2)\) sequence, and \(\{\omega_n\}\) is an iid \(N(0,\sigma_\omega^2)\) sequence.

4.2 Model Fitting

After created the pomp object, we set the following starting values to initiate the model.

We first use IF2 algorithm (Ionides, STATS531 notes12) and set the following variables to find the local maximum likelihood.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1867    1870    1875    1881    1900    1902

The local maximization log likelihood is 1902, which is 2.6% higher than the result of GARCH(1,1), so that the pomp model better fits this dataset.

Here is the local geometry of the likelihood surface.

Now we use the iterative algorithm again, but randomize the starting values of each parameters from a space and search for the global maximum likelihood.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1850    1870    1874    1876    1878    1902

The global maximization log likelihood is also 1902.

Here is the global geometry of the likelihood surface.

4.3 Diagnostics

We can see the likelihood converges quite well after coupou iterations, but the effective sample size frequently drop dramatically, which usually occurs when the volatility changes suddenly and dramatically. Phi and one of the sigma eta did not converge well, but the other variables bacame stable after about 100 iterations.



5. Conclusion

This project used time series models to analyze the pattern of the Henry Hub natural gas spot price. We examined the GARCH model and POMP model. We found that the natural gas price also has sudden changes of volatility. The POMP model gave us better performance in terms of the maximum likelihood.



6. Reference

  1. Ionides, E. (n.d.). Stats 531 (Winter 2018) ‘Analysis of Time Series’ Retrieved April 22, 2018, from http://ionides.github.io/531w18/

  2. Henry Hub Natural Gas Spot Price. Retrieved April 22, 2018, from https://www.eia.gov/naturalgas/data.php

  3. Carles Bretó, 2014. On idiosyncratic stochasticity of financial leverage effects. Statistics and Probability Letters 91 (2014), 20–26.