STATS 531 Final Project: POMP Model on Brazilian Industrial Production Index

Introduction
Data Exploration
Work from Midterm Project
POMP Model Description
Conclusion
Acknowledgment
Reference

Introduction

In this project, we continue the work from midterm project to analyze a monthly economic time series (specifically, Industrial Production Index, or IPI for short) from the Brazilian economy.
IPI is an economic indicator that measures the real production output of manufacturing, mining, and utilities. The monthly IPI is usually used to reflect short-term changes in industrial production. The growth in IPI from month to month indicates the growth in the industry. (“Industrial Production Index”, n.d.)
In the midterm project, we focus on ARMA model and SARIMA model. In this project, we will use partially observed Markov process (POMP) model with geometric Brownian Motion (GBM).
The question of interests is that whether POMP model with GBM is suitable for IPI data.

Data Exploration

The dataset comes from the website (West, n.d.), which in turn comes from the article by Huerta and Lopes, 1999.
The dataset contains monthly IPI in Brazil from January 1980 to December 1997, 216 months in total.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   72.21   94.86  103.67  104.26  114.20  134.75

The above is a brief summary of IPI over 216 months. There are no missing values.

From the time series plot above, there seems to be some evidences for an increasing trend, especially in the later months. However, the evidences are not strong. There are also considerable fluctuations.

Work from Midterm Project

In midterm project, we find SARIMA\((0,1,1)\times(0,1,1)_{12}\) model fits the data pretty closely, which outperforms ARMA(4,3) model and SARIMA\((1,0,0)\times(0,1,0)_{12}\) model.

## 
## Call:
## arima(x = IPI, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
## 
## Coefficients:
##           ma1     sma1
##       -0.2907  -1.0000
## s.e.   0.0690   0.1252
## 
## sigma^2 estimated as 21.39:  log likelihood = -616.27,  aic = 1238.53

The log likelihood for SARIMA\((0,1,1)\times(0,1,1)_{12}\) model is about -616.27.
We see from the plot above that this model fits the data very well. The model not only captures most of the fluctuations, but also fits the peaks closely.

POMP Model Description

Background

There are some observations of Gross Domestic Product (GDP) being a GBM. (Conover, n.d.) Since IPI is closely related to GDP, it is reasonable to use GBM to model IPI.
A GBM \(\{S_t\}\) is an exponentiated Brownian motion defined through \[\frac{dS_t}{S_t} = \mu dt + \sigma dW_t\] where \(W_t\) is a Brownian motion. (Avdiu, n.d.)
There is a recursive procedure for simulating values of \(\{S_t\}\): \[S_{t+1} = S_t \exp( (\mu-\frac{1}{2}\sigma^2) (t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i} Z_i)\] with \(0 < t_0 < t_1 < ...\) and \(Z_i \sim N(0,1)\). (Avdiu, n.d.)

Our Model

In our monthly data, \(t_{i+1} - t_i = 1\).
We present a POMP model with GBM as \[S_{n+1} = S_n \exp((\mu-\frac{1}{2}\sigma^2) + \sigma E_n)\] where \(S\) and \(E\) are state variables, and \(E_n \sim N(0, \phi^2)\).
The measurement model is \[IPI_n | S_n \sim N(0,S_n)\].
We initialize \(S_0 = 1\) and \(E_0 = 0\).
The following is the corresponding pomp implementation.

ipi_statenames <- c("S","E")
ipi_paramnames <- c("mu","sigma","phi")
ipi_obsnames <- "IPI"

ipi_step <- Csnippet("
E = rnorm(0,phi);
S = S*exp((mu-sigma*sigma/2)+sigma*E);
")

ipi_init <- Csnippet("
S=1;
E=0;
")

ipi_fromEstimationScale <- "
 Tmu = exp(mu);
 Tsigma = exp(sigma);
 Tphi = exp(phi);
"

ipi_toEstimationScale <- "
 Tmu = log(mu);
 Tsigma = log(sigma);
 Tphi = log(phi);
"

dmeas <- Csnippet("lik = dnorm(IPI,0,S,give_log);")
rmeas <- Csnippet("IPI = rnorm(0,S);")

pomp(dt,times="Date",t0=0,
     rprocess=discrete.time.sim(step.fun=ipi_step,delta.t=1),
     initializer=ipi_init,rmeasure=rmeas,dmeasure=dmeas,
     statenames=ipi_statenames,
     paramnames=ipi_paramnames,
     fromEstimationScale=Csnippet(ipi_fromEstimationScale),
    toEstimationScale=Csnippet(ipi_toEstimationScale)
     ) -> ipi_pomp

Run Level

We use three run levels to develop and refine the model.

run_level <- 3
switch(run_level,
       {ipi_Np=100; ipi_Nmif=10; ipi_Neval=10; ipi_Nglobal=10; ipi_Nlocal=10}, 
       {ipi_Np=20000; ipi_Nmif=100; ipi_Neval=10; ipi_Nglobal=10; ipi_Nlocal=10}, 
       {ipi_Np=60000; ipi_Nmif=300; ipi_Neval=10; ipi_Nglobal=100; ipi_Nlocal=20}
)

Likelihood Slice

To have an idea of what the likelihood surface looks like, we design the following slice.

sliceDesign(
  c(mu=0.1,sigma=0.2,phi=0.5),
  mu=rep(seq(from=-0.5,to=0.5,length=50),each=3),
  sigma=rep(seq(from=0,to=4,length=50),each=3),
  phi=rep(seq(from=0,to=4,length=50),each=3)
  ) -> p

Along \(\mu\) direction, the log likelihood is maximized when \(\mu\) is about 0.1.

Along \(\sigma\) direction, the log likelihood is maximized when \(\sigma\) is about 0.5.

Along \(\phi\) direction, the log likelihood is maximized when \(\phi\) is about 2.

Running a Partical Filter

Using \(\mu=0.1\), \(\sigma=0.5\) and \(\phi=2\) as point estimate, we run a basic particle filter.

##                          se 
## -1.453785e+03  8.526962e-02

The log likelihood estimate is about -1453.78 with a Monte Carlo standard error of 0.08.

Local Search

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -1440   -1435   -1432   -1432   -1431   -1422

A local search gives a max log likelihood of -1422.
The figure shows the geometry of the likelihood surface in a neighborhood of this point estimate.

Global Search

We now perform a global search in a parameter box defined below, hoping to improve the log likelihood.

ipi_box <- rbind(
  mu=c(0,0.5),
  sigma=c(0.2,4),
  phi = c(0.5,4)
)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -2165   -1439   -1432   -1453   -1427   -1420

The global search gives a max log likelihood of -1420, which is an improvement over that from the local search, though not much.

Diagnostic

The fact that both local search and global search give similar max log likelihood shows confidence in our maximization procedure.
We also look at the diagnostic plots.

plot(mifs_global)

The diagnostic plots show that the parameter estimations are quite stable.

Conclusion

In this project, we fit POMP model with GBM on Brazilian monthly IPI data.
The POMP model gives a max log likelihood of -1420, and the log likelihood is stable, which means the method is suitable for the data.
However, the max log likelihood from the POMP model is still quite small, compared with that from SARIMA model studied in the midterm project.

Acknowledgment

The concepts and implementations of POMP model is based on Professor Ionides’ notes for STATS 531 (Winter 2018) at the University of Michigan. (Ionides, n.d.)

Reference

[1] Avdiu, K. Brownian Motion & Geometric Brownian Motion. Retrieved April 26, 2018, from http://homepage.univie.ac.at/kujtim.avdiu/dateien/BrownianMotion.pdf.

[2] Conover, J. Historical Economics. Retrieved April 26, 2018, from http://www.johncon.com/john/historical.economics/.

[3] Huerta, G., Lopes, H. F. 1999. Bayesian forecasting and inference in latent structure for the Brazilian GDP and Industrial Production Index. Retrieved April 26, 2018, from ftp://ftp.stat.duke.edu/pub/WorkingPapers/99-08.html.

[4] Industrial Production Index. Retrieved April 26, 2018, from https://fred.stlouisfed.org/series/INDPRO.

[5] Ionides, E. Stats 531 (Winter 2018) ‘Analysis of Time Series’. Retrieved April 26, 2018, from https://ionides.github.io/531w18/.

[6] West, M. SOME TIME SERIES DATA SETS. Retrieved April 26, 2018, from http://www2.stat.duke.edu/~mw/ts_data_sets.html.