All group members collaboratively reviewed a set of past project reports related to financial data, chosen for their similarities to our CPI dataset. The list includes:
We examined each report individually and discussed the methods used for modeling and analysis, without using any of their code or results. Since we are employing SARIMA models, we compared the strengths and weaknesses of their approaches. For instance, reports such as “Unemployment and Federal Interest Rate” and “Nvidia Stock Price” demonstrated strong diagnostic procedures.
However, the presence of unit roots (likely due to high-order AR and MA parameters) raised concerns about model interpretability and assumption violations, so we will be more cautious during model selection. We also noted that SARIMA models perform poorly on high-frequency (daily or hourly) data. In “Ethereum and Investment”, for example, cryptocurrency prices, which typically behave like a Wiener process, may not be well suited to ARMA models even after log transformation and differencing; phenomena such as volatility clustering and sudden stops are common, suggesting that a GARCH model, as used in the “Apple Stock Price” report, might be more appropriate.
Inflation is a critical economic indicator that affects monetary policy, financial markets, and consumer purchasing power. Recent economic and political changes have made the inflation rate a focal point for all market participants, from central banks to consumers. The Consumer Price Index (CPI) is a widely used measure of inflation. In this project, we test time series models on CPI inflation data to forecast future inflation rates. In particular, we test (S)ARIMA models and assess how macroeconomic events influence the statistical properties of the CPI series. At the end of this report, we present an advanced time series model designed to handle fluctuations caused by structural breaks, along with a 5-year forecast for testing and intuition for further study.
The Consumer Price Index (CPI) measures the overall change in consumer prices based on a representative basket of goods and services over time. To calculate how much prices are rising, hundreds of government workers spend their days tracking down costs of individual goods and services.
\[ P_t^{\$} = \sum_{i=1}^{N} \left( w_{i,t} \times P_{i,t}^{\$} \right) \]
\[ PI_{t+k}^{\$} = \left( \frac{P_{t+k}^{\$}}{P_t^{\$}} \right) \times 100 \]
\[ \pi_{t,t+k}^{\$} = 100 \times \left( \frac{PI_{t+k}^{\$}}{PI_t^{\$}} - 1 \right) \]
\[ \pi_{t,t+k}^{\$} \approx 100 \times \left( \log PI_{t+k}^{\$} - \log PI_t^{\$} \right) \]
(Source: Session 8_Purchasing Power Parity, Page 6-10)
We use monthly CPI data from the last five years (2020-2025) to check our forecasts. Rather than feeding all earlier data into model training, we select data from 1985 to 2020 because the base period of the FRED CPI series is 1982-1984. Due to changes in the basket of goods and services used to measure monthly CPI, data prior to 1984 is subject to adjustments and reduced accuracy. Thus, to avoid introducing noise into our model, we use only the 1985-2020 data for modeling.
We will use the following libraries for this project:
## [1] "CPIAUCSL"
| Library | Description |
|---|---|
| `knitr` | Enables dynamic report generation and facilitates seamless integration of R code in Markdown documents. |
| `tidyverse` | A collection of R packages for data manipulation, visualization, and analysis (e.g., `ggplot2`, `dplyr`, `tidyr`). |
| `xts` | Provides an extensible time series structure for managing and analyzing time-dependent data. |
| `lmtest` | Implements tests for linear models, including heteroskedasticity and autocorrelation diagnostics. |
| `quantmod` | Offers tools for financial modeling and quantitative trading strategy development. |
| `forecast` | Includes functions for time series forecasting, such as ARIMA, ETS, and decomposition methods. |
| `tseries` | Provides time series analysis tools, including unit root tests, GARCH models, and bootstrapping methods. |
| `stats4` | Implements maximum likelihood estimation for statistical modeling. |
| `astsa` | Supports applied statistical time series analysis, including spectral analysis and forecasting. |
| `moments` | Computes statistical moments (skewness, kurtosis) for assessing data distribution properties. |
| `rugarch` | Implements univariate GARCH models for volatility modeling and forecasting in financial time series. |
| `FinTS` | Provides diagnostic tools for financial time series, including ARCH effect tests. |
| `PerformanceAnalytics` | Offers performance analysis and risk metrics for portfolio and asset return evaluation. |
Using the equation developed in the previous section, we compute and plot the monthly inflation rate from 1985-2020:
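As a sketch of this step (assuming `quantmod` from the library list above, with the date window and the object name `df` inferred from the printed output):

```r
# A minimal sketch: pull CPIAUCSL from FRED and turn the index into a
# monthly inflation rate via log differences, as in the formulas above.
library(quantmod)
getSymbols("CPIAUCSL", src = "FRED")          # loads an xts object from FRED
cpi <- CPIAUCSL["1985-01-01/2019-12-01"]      # restrict to the modeling window
df  <- 100 * diff(log(cpi))                   # monthly inflation rate in percent
plot(df, main = "Monthly CPI inflation, 1985-2019")
```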
## An xts object on 1985-01-01 / 2019-12-01 containing:
## Data: double [420, 1]
## Columns: CPIAUCSL
## Index: Date [420] (TZ: "UTC")
## xts Attributes:
## $ src : chr "FRED"
## $ updated: POSIXct[1:1], format: "2025-02-21 23:45:56"
## Index CPIAUCSL
## Min. :1985-01-01 Min. :-1.7864
## 1st Qu.:1993-09-23 1st Qu.: 0.1050
## Median :2002-06-16 Median : 0.2210
## Mean :2002-06-16 Mean : 0.2136
## 3rd Qu.:2011-03-08 3rd Qu.: 0.3408
## Max. :2019-12-01 Max. : 1.3675
## NA's :1
Based on the plot above, we discover the following features of the data:
From these observations, we expect an ARMA model or an ARMA-GARCH model would be a good fit for this time series. We further study the data with hypothesis tests and plots of key statistical parameters.
We remove the empty value:
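A one-line sketch of this step, followed by the ACF computation discussed next:

```r
df <- na.omit(df)   # drop the single NA introduced by differencing
acf(df$CPIAUCSL, main = "ACF of monthly CPI inflation")
```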
The ACF plot provides the following insights:
The ACF plot shows significant autocorrelation at lag 1. To further support this observation, we conduct a Box-Ljung test, which tests for autocorrelation up to a given lag:
If p-value < 0.05, it suggests significant autocorrelation exists (Statistics 509, Winter 2024, Lecture 8, p. 21; Thelen, 2024).
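A sketch of the corresponding call; `lag = 20` is inferred from the degrees of freedom reported below:

```r
Box.test(df$CPIAUCSL, lag = 20, type = "Ljung-Box")
```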
##
## Box-Ljung test
##
## data: df$CPIAUCSL
## X-squared = 136.45, df = 20, p-value < 2.2e-16
We reject the null hypothesis that there is no autocorrelation within the data. This result confirms our observation of the ACF plot that the time series cannot be modeled by white noise.
Apart from autocorrelation tests, we also want to support our observations about trends. The ACF plot lends only minor support to the stationarity of our data, and there appears to be significant volatility clustering. We first check whether a significant trend exists:
(Ionides, 2025, Modeling and Analysis of Time Series Data, Chapter 8, p. 14).
The red line in the plot above is the LOESS-smoothed inflation rate with span = 0.5, and the blue line is the LOESS-smoothed inflation rate with span = 0.15. LOESS (Locally Estimated Scatterplot Smoothing) shows that the time series contains some local trend but no significant global trend. The LOESS curve fluctuates around a relatively constant mean, while sharp spikes caused by structural breaks produce temporary deviations. Beyond these short-term fluctuations, the plot also shows cyclical patterns. To analyze them further, we first decompose the data into trend + noise + cycles, as sketched below:
(Ionides, 2025, Modeling and Analysis of Time Series Data, Chapter 8, pp. 17-18).
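A hedged sketch of this decomposition, reusing the two LOESS spans from the plot above: the slow fit serves as the trend, the difference between the two fits as the cycles, and the remainder as noise.

```r
u <- as.numeric(df$CPIAUCSL)
t <- seq_along(u)
trend  <- predict(loess(u ~ t, span = 0.5))    # low-frequency component (red line)
smooth <- predict(loess(u ~ t, span = 0.15))   # low + mid frequencies (blue line)
cycles <- smooth - trend                       # mid-frequency, roughly periodic part
noise  <- u - smooth                           # high-frequency remainder
plot(ts(cbind(inflation = u, trend, cycles, noise), start = 1985, frequency = 12),
     main = "Inflation decomposed into trend + noise + cycles")
```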
We tune the spans so that the cycles’ plot contains roughly periodic patterns. The decomposition plot shows the following signs:
(Ionides, 2025, Modeling and Analysis of Time Series Data, Chapter 7, p. 23).
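A sketch of how such a smoothed periodogram can be produced; the smoothing spans are assumptions to be tuned, and declaring the series monthly (frequency 12) makes the frequency axis read in cycles per year.

```r
infl_ts <- ts(as.numeric(df$CPIAUCSL), start = 1985, frequency = 12)
pgram   <- spectrum(infl_ts, spans = c(3, 5, 3),
                    main = "Smoothed periodogram of monthly inflation")
pgram$freq[which.max(pgram$spec)]   # dominant frequency, in cycles per year
```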
The periodogram has peaks at frequencies of roughly 0.66, 1.2, 2.15, and 3.1 cycles per year. The highest peak is at a frequency of about 1.2 cycles per year, corresponding to a cycle of roughly 1/1.2 ≈ 0.83 years (about 10 months); the remaining peaks are also sizable. This evidence of periodicity near the annual scale provides strong support for seasonal models.
Based on our earlier observations and the test results, we propose that a stationary SARMA model may be compatible with our data. Although financial crises do affect inflation greatly, these shocks do not appear to alter the long-run trend of inflation. With no strong contrary evidence from the EDA so far, we will fit a seasonal ARMA model. However, we remain cautious about the changing variance and the effects of historical economic shocks, and we will also fit a seasonal ARMA-GARCH model to analyze these effects. A feasible way to retain the seasonal structure while addressing volatility clustering is to fit a seasonal ARMA model and then fit a GARCH-type model to the residuals.
(Statistics 509, Winter 2024, Lecture 10, pp. 14-15; Thelen, 2024).
In this section, we test the normality of our data, since this property affects the residual distribution of a fitted ARMA model. If the data are too far from normal, a simple ARMA model may have non-normal residuals. We check normality through the following channels:

- a Q-Q plot against the Normal distribution;
- the Shapiro-Wilk test;
- the Jarque-Bera test.

The Shapiro-Wilk test checks normality by comparing the sample distribution to a Normal distribution, which can corroborate the Q-Q plot observations. It works best for small samples, which suits our data well. Its hypotheses are:

- \(H_0\): the sample comes from a normally distributed population.
- \(H_1\): the sample does not come from a normally distributed population.

The test statistic is calculated as:
\[ W = \frac{(\sum_{i=1}^na_ix_{(i)})^2}{\sum_{i=1}^n(x_i - \bar x)^2} \]
where:

- \(x_{(i)}\) is the \(i\)-th order statistic (the \(i\)-th smallest value in the sample);
- \(a_i\) are constants computed from the expected values and covariances of the order statistics of a standard Normal sample;
- \(\bar x\) is the sample mean.

The test statistic follows a \(W\) distribution, and the cutoff values for the statistic are obtained through Monte Carlo simulations.
(Statistics 509, Winter 2024, Lecture 4, p. 34; Shapiro-Wilk test, Wikipedia).
The Jarque-Bera test assesses normality using the skewness and kurtosis of the sample (via a weighted sum of the two). Skewness measures how symmetric the distribution is around the mean, and kurtosis measures how fat the tails are:
\[ S = \frac{\hat{\mu}_3}{\hat{\sigma}^3} = \frac{\frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^3}{(\frac{1}{n} \sum_{i=1}^n(x_i - \bar{x})^2)^{3/2}}, \]
\[ K = \frac{\hat{\mu}_4}{\hat{\sigma}^4} = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4}{(\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2)^{2}} \]
A Normal distribution has skewness of 0 and kurtosis of 3. We include this test to further support our discovery.
The test statistic is given by:
\[ JB = \frac{n}{6} (S^2 + \frac{(K - 3)^2}{4}) \]
where:

- \(n\) is the sample size;
- \(S\) is the sample skewness;
- \(K\) is the sample kurtosis.

The test statistic follows a \(\chi^2\) distribution with 2 degrees of freedom, and the test has the following hypotheses:

- \(H_0\): the data have skewness 0 and kurtosis 3 (consistent with normality).
- \(H_1\): the data do not come from a Normal distribution.
(Statistics 509, Winter 2024, Lecture 4, p. 34; Jarque-Bera test, Wikipedia).
We start from the Q-Q plot:
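A sketch of the plot and the two tests whose output follows:

```r
qqnorm(df$CPIAUCSL); qqline(df$CPIAUCSL, col = "red")   # Q-Q plot vs. Normal
shapiro.test(df$CPIAUCSL)                               # Shapiro-Wilk test
tseries::jarque.bera.test(df$CPIAUCSL)                  # Jarque-Bera test
```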
The Q-Q plot indicates a symmetric distribution around the mean, but much fatter tails than a Normal distribution.
##
## Shapiro-Wilk normality test
##
## data: df$CPIAUCSL
## W = 0.88695, p-value < 2.2e-16
##
## Jarque Bera Test
##
## data: df$CPIAUCSL
## X-squared = 2464, df = 2, p-value < 2.2e-16
The Shapiro-Wilk and Jarque-Bera tests both reject the null hypothesis, supporting our reading of the Q-Q plot. Thus, we claim that the data do not follow a Normal distribution. A log transformation can sometimes resolve this issue, but it is unavailable here because the data contain negative values. We may therefore need the alternative approach of fitting a seasonal ARMA model with t-distributed errors.
Having found evidence of changing variance and volatility clustering, and anticipating a seasonal ARMA model with t-distributed errors, our model choice narrows to a seasonal ARIMA-GARCH model. We will nevertheless fit a simple seasonal ARMA model for comparison. To conclude the EDA, we compute the excess kurtosis of the sample and the degrees of freedom \(\nu\) implied if the sample were described by a t-distribution:
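A sketch of the computation: a \(t_\nu\) distribution has excess kurtosis \(6/(\nu - 4)\) for \(\nu > 4\), so matching moments gives the implied degrees of freedom.

```r
library(moments)
excess_kurt <- kurtosis(df$CPIAUCSL) - 3   # moments::kurtosis returns raw kurtosis
excess_kurt                                # ~8.55 (first value printed below)
6 / excess_kurt + 4                        # invert 6/(nu - 4): implied nu ~ 4.70
```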
## [1] 8.550781
## [1] 4.70169
(Statistics 509, Winter 2024, Lecture 3, p. 32; Thelen, 2024).
The sample data have an excess kurtosis of 8.55, far from 0. Described as t-distributed, they have about 4.7 degrees of freedom. These departures from normality may strongly affect the residuals of an ARMA fit.
We first investigated the possibility of a linear trend by fitting a simple linear regression against time and observed a small but statistically significant trend. However, the regression residuals exhibited strong autocorrelation, violating the white noise assumption. To account for this trend while preserving the integrity of seasonal and autoregressive patterns, we consider including a linear term in the SARIMA model, allowing it to focus more effectively on short-term dependencies and seasonal effects without the confounding influence of a slow-moving mean shift.
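A sketch of the regression, assuming `df` carries a numeric `time` column alongside the inflation series:

```r
trend_fit <- lm(CPIAUCSL ~ time, data = df)
summary(trend_fit)      # output shown below
acf(resid(trend_fit))   # residuals exhibit strong autocorrelation
```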
##
## Call:
## lm(formula = CPIAUCSL ~ time, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.96580 -0.11858 0.00976 0.12505 1.17099
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.888e-01 4.070e-02 9.552 < 2e-16 ***
## time -1.476e-05 3.275e-06 -4.508 8.52e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2468 on 417 degrees of freedom
## Multiple R-squared: 0.04646, Adjusted R-squared: 0.04418
## F-statistic: 20.32 on 1 and 417 DF, p-value: 8.525e-06
(Statistics 531, Winter 2025, Lecture 9, p. 12; Ionides, 2025).
We first fit a non-seasonal model to determine the optimal values of \(p\) and \(q\) for the SARIMA model using an AIC table. Additionally, we compare AIC values across different model specifications, including the effect of adding a trend term and of incorporating a differencing parameter, to evaluate their impact on model performance. To increase the accuracy of our AIC estimation, we use the `arima()` function from the `arima2` package.
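A sketch of the table construction under these assumptions; the helper name is ours, and `arima2::arima()` is used as a drop-in replacement for `stats::arima()`:

```r
library(arima2)
aic_table <- function(data, P, Q, d = 0, xreg = NULL) {
  tab <- matrix(NA, P + 1, Q + 1,
                dimnames = list(paste0("AR", 0:P), paste0("MA", 0:Q)))
  for (p in 0:P) for (q in 0:Q) {
    tab[p + 1, q + 1] <- arima2::arima(data, order = c(p, d, q), xreg = xreg)$aic
  }
  tab
}
aic_table(df$CPIAUCSL, 4, 4)                  # no linear trend, no differencing
aic_table(df$CPIAUCSL, 4, 4, xreg = df$time)  # linear trend, no differencing
aic_table(df$CPIAUCSL, 4, 4, d = 1)           # differencing, no linear trend
```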
## [1] "Model with no linear trend and differencing:"
| | MA0 | MA1 | MA2 | MA3 | MA4 |
|---|---|---|---|---|---|
| AR0 | 38.49 | -56.27 | -59.03 | -58.36 | -57.32 |
| AR1 | -48.62 | -58.15 | -58.76 | -58.21 | -58.97 |
| AR2 | -57.99 | -56.77 | -57.47 | -58.43 | -66.60 |
| AR3 | -57.30 | -59.03 | -58.44 | -59.16 | -65.10 |
| AR4 | -56.02 | -57.31 | -65.61 | -63.63 | -63.76 |
## [1] "Model with linear trend and no differencing:"
| | MA0 | MA1 | MA2 | MA3 | MA4 |
|---|---|---|---|---|---|
| AR0 | 20.56 | -66.07 | -66.86 | -67.56 | -65.75 |
| AR1 | -54.83 | -66.12 | -67.08 | -65.73 | -63.75 |
| AR2 | -67.57 | -65.85 | -65.83 | -65.68 | -73.80 |
| AR3 | -65.94 | -63.95 | -65.50 | -70.95 | -72.88 |
| AR4 | -64.06 | -62.90 | -74.63 | -72.96 | -67.59 |
## [1] "Model with differencing and no linear trend:"
| | MA0 | MA1 | MA2 | MA3 | MA4 |
|---|---|---|---|---|---|
| AR0 | 85.13 | 26.77 | -57.43 | -58.34 | -58.91 |
| AR1 | 75.89 | -46.58 | -57.58 | -58.52 | -57.10 |
| AR2 | 32.77 | -58.91 | -57.20 | -57.20 | -58.50 |
| AR3 | 11.52 | -57.30 | -55.32 | -57.04 | -58.42 |
| AR4 | 8.23 | -55.44 | -54.31 | -66.01 | -64.32 |
Based on the results, the models with a linear trend achieve the lowest AIC values overall. Although a few high-order specifications (e.g., ARMA(4,2)) report slightly lower AIC, such large models are prone to optimization error and overfitting, so we select the parsimonious optimum \(p=2\) and \(q=0\). The models with differencing have the highest AIC values throughout, so we set \(d=0\). Additionally, we check for the existence of a unit root with the Augmented Dickey-Fuller test:
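The corresponding call, a one-line sketch with `tseries`:

```r
tseries::adf.test(df$CPIAUCSL)
```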
##
## Augmented Dickey-Fuller Test
##
## data: df$CPIAUCSL
## Dickey-Fuller = -7.3207, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
We obtain a small \(p\)-value, leading to the rejection of \(H_0\) and confirming that our time series does not have a unit root. However, this does not necessarily mean that the series is stationary or that differencing is unnecessary; it merely provides additional evidence for not including a differencing parameter. We now determine the seasonal parameters \(P\) and \(Q\) with a period of 12, as suggested by the EDA. Here we use the standard `stats::arima()` function instead of the one in `arima2`, as the latter's repeated optimization takes too long; a sketch follows below.
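A sketch of the seasonal search, with the non-seasonal order fixed at the chosen \((2,0,0)\) and the helper name ours:

```r
seasonal_aic_table <- function(data, P, Q, xreg = NULL) {
  tab <- matrix(NA, P + 1, Q + 1,
                dimnames = list(paste0("AR", 0:P), paste0("MA", 0:Q)))
  for (p in 0:P) for (q in 0:Q) {
    tab[p + 1, q + 1] <- stats::arima(
      data, order = c(2, 0, 0),
      seasonal = list(order = c(p, 0, q), period = 12),
      xreg = xreg)$aic
  }
  tab
}
seasonal_aic_table(df$CPIAUCSL, 3, 3)                  # no linear trend
seasonal_aic_table(df$CPIAUCSL, 3, 3, xreg = df$time)  # with linear trend
forecast::auto.arima(df$CPIAUCSL)                      # sanity check, output below
```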
## [1] "Model with no linear trend:"
| | MA0 | MA1 | MA2 | MA3 |
|---|---|---|---|---|
| AR0 | -57.99 | -61.91 | -59.92 | -62.66 |
| AR1 | -61.61 | -59.92 | -58.58 | -66.76 |
| AR2 | -60.47 | -59.38 | -60.27 | -70.73 |
| AR3 | -61.81 | -65.70 | -70.62 | -67.70 |
## [1] "Model with linear trend:"
| | MA0 | MA1 | MA2 | MA3 |
|---|---|---|---|---|
| AR0 | -67.57 | -74.87 | -73.32 | -73.84 |
| AR1 | -73.84 | -73.08 | -72.13 | -75.52 |
| AR2 | -74.34 | -72.88 | -71.89 | -77.75 |
| AR3 | -73.48 | -74.77 | -77.57 | -74.34 |
## [1] "Model from auto.arima:"
## Series: df$CPIAUCSL
## ARIMA(1,1,2)
##
## Coefficients:
## ar1 ma1 ma2
## 0.1398 -0.6337 -0.3446
## s.e. 0.0931 0.0848 0.0820
##
## sigma^2 = 0.05007: log likelihood = 32.79
## AIC=-57.58 AICc=-57.48 BIC=-41.44
We also include the results of the models with and without a linear trend. The model with a linear trend again performs better, with optimal parameters \(P=0\) and \(Q=1\), although there is a possibility of optimization and numerical error. As a sanity check we also run the `auto.arima()` function from the `forecast` package, and we observe that our selected model is much better than the one it returns, as its stepwise search algorithm is not guaranteed to find the optimum.
(Statistics 531, Winter 2025, Lecture 5, p. 21; Ionides, 2025).
To further examine the evidence for including the linear trend parameter in our SARIMA model, we provide diagnostics for both options, with the other parameters fixed at the values determined above.
## [1] "Model with no linear trend:"
##
## Call:
## arima2::arima(x = df$CPIAUCSL, order = c(2, 0, 0), seasonal = list(order = c(0,
## 0, 1), period = 12))
##
## Coefficients:
## ar1 ar2 sma1 intercept
## 0.5184 -0.1514 -0.1213 0.2137
## s.e. 0.0484 0.0486 0.0488 0.0151
##
## sigma^2 estimated as 0.04926: log likelihood = 35.96, aic = -61.91
## [1] "Model with linear trend:"
##
## Call:
## arima2::arima(x = df$CPIAUCSL, order = c(2, 0, 0), seasonal = list(order = c(0,
## 0, 1), period = 12), xreg = df$time)
##
## Coefficients:
## ar1 ar2 sma1 intercept df$time
## 0.4906 -0.1777 -0.1549 0.3913 0e+00
## s.e. 0.0481 0.0484 0.0516 0.0453 1e-04
##
## sigma^2 estimated as 0.04753: log likelihood = 43.43, aic = -74.87
## Coefficients SE Z p_value
## ar1 4.905872e-01 4.808999e-02 10.2014397 0.0000000000
## ar2 -1.776910e-01 4.837815e-02 -3.6729604 0.0002397567
## sma1 -1.548533e-01 5.156151e-02 -3.0032725 0.0026709316
## intercept 3.913307e-01 4.526435e-02 8.6454500 0.0000000000
## df$time -1.498269e-05 5.266563e-05 -0.2844871 0.7760371205
We first notice that although the model with a linear trend has a lower AIC, the trend parameter has a \(p\)-value of 0.78, which suggests we cannot conclude that the trend differs from 0. However, this \(p\)-value relies on the typical confidence interval for an MLE estimate \(\hat{\theta}\):

\[ \hat{\theta} \pm z_{\alpha / 2} \cdot \sqrt{\operatorname{Var}(\hat{\theta})} \]

where:

- \(z_{\alpha / 2}\) is the critical value (e.g., 1.96 for a 95% Confidence Interval (CI));
- \(\operatorname{Var}(\hat{\theta})\) is estimated using the observed Fisher information.
This might be a problematic estimate if the likelihood surface is non-quadratic or asymmetric, or if the sample size is small and asymptotic normality does not hold. In our case, using the Likelihood Ratio Test (LRT) might be a better option than the MLE-based Fisher Information approximation, and we will investigate that later.
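As a preview, a minimal sketch of that comparison using the two log-likelihoods reported above (35.96 without the trend, 43.43 with it), which differ by one parameter:

```r
lrt_stat <- 2 * (43.43 - 35.96)   # = 14.94
1 - pchisq(lrt_stat, df = 1)      # ~1e-4: strong evidence for keeping the trend
```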
We continue our analysis by examining the residual distribution. The residuals remain heavy-tailed, showing little improvement over the original data. As discussed in Section 2.4, the Q-Q plot against a Normal distribution does not align well; instead it closely resembles a t-distribution, as observed in Section 2.5. This confirms that the SARIMA model struggles to account for volatility clustering and time-varying variance.
Indeed, we can check the excess kurtosis and the implied degrees of freedom \(\nu\) again for our residuals.
## [1] 3.539722
## [1] 5.695048
Also, similar to Section 2.4, we further perform the Shapiro-Wilk test and the Jarque-Bera test to formally examine normality. The test results for both the Shapiro-Wilk Test and the Jarque-Bera Test reject the null hypothesis, supporting our observation from the Q-Q plot that the data is not normally distributed.
##
## Shapiro-Wilk normality test
##
## data: residuals_model2
## W = 0.92427, p-value = 1.054e-13
##
## Jarque Bera Test
##
## data: residuals_model2
## X-squared = 766.03, df = 2, p-value < 2.2e-16
To better address the issue of volatility clustering, we have to introduce an additional structure, such as GARCH, to model the conditional heteroskedasticity in the data. This will be an extension of our SARIMA model in the next section to improve the AIC and the forecasting capability.
We also perform the Ljung-Box test on the residuals of both models and plot the \(p\)-values across lags. For the model with a linear trend, we fail to reject the null hypothesis at all lags, adding evidence that the residuals are white noise (no autocorrelation, though not normal, as observed above), although some higher lags still have lower \(p\)-values.
Additionally, the model with a linear trend performs better than the model without a trend, which exhibits autocorrelation at large lags, as indicated by the small \(p\)-values (rejections) in the plot. This further supports including the linear trend parameter.
We also examine the Ljung-Box test \(p\)-values plot for the original data below, which confirms that the original data exhibits significant autocorrelations across all lags. Therefore, our SARIMA model successfully captures the autocorrelated structure of the data.
(Statistics 531, Winter 2025, Lecture 4, Lecture 5, Lecture 6; Ionides, 2025. Statistics 509, Winter 2024, Lecture 4, Lecture 5, Lecture 6; Thelen, 2024).
Based on trend analysis, residual diagnostics, and profile likelihood evaluation, our final SARIMA model is:
\[ (1 - \phi_1 B - \phi_2 B^2) (Y_t -\mu) = \beta X_t + (1 + \theta_{12} B^{12}) \varepsilon_t, \]
where:

- \(Y_t\) = CPI inflation rate
- \(B\) = backshift operator
- \(\phi_1, \phi_2\) = non-seasonal AR coefficients
- \(X_t\) = exogenous regressor (time trend)
- \(\beta\) = coefficient of the time trend
- \(\theta_{12}\) = seasonal MA coefficient at lag 12
- \(\varepsilon_t \sim WN(0, \sigma^2)\) = white noise error term
Given remaining volatility clustering, we extend our analysis to SARIMA-GARCH models in the next section.
(Statistics 531, Winter 2025, Lecture 5, p. 25; Ionides, 2025).
## initial value -1.378959
## iter 2 value -1.479685
## iter 3 value -1.503867
## iter 4 value -1.504500
## iter 5 value -1.504505
## iter 6 value -1.504506
## iter 7 value -1.504507
## iter 8 value -1.504507
## iter 8 value -1.504507
## iter 8 value -1.504507
## final value -1.504507
## converged
## initial value -1.504731
## iter 2 value -1.504748
## iter 3 value -1.504750
## iter 4 value -1.504753
## iter 5 value -1.504755
## iter 6 value -1.504755
## iter 6 value -1.504755
## iter 6 value -1.504755
## final value -1.504755
## converged
## <><><><><><><><><><><><><><>
##
## Coefficients:
## Estimate SE t.value p.value
## ar1 0.5184 0.0484 10.7063 0.0000
## ar2 -0.1514 0.0486 -3.1153 0.0020
## sma1 -0.1213 0.0488 -2.4860 0.0133
## xmean 0.2137 0.0151 14.1489 0.0000
##
## sigma^2 estimated as 0.04926281 on 415 degrees of freedom
##
## AIC = -0.1477666 AICc = -0.147536 BIC = -0.09958199
##
The SARIMA(2,0,0)(0,0,1)[12] model is designed to capture both short-term and seasonal dependencies in CPI inflation dynamics. In the fit above, the non-seasonal AR coefficients are AR(1) = 0.5184 (p < 0.001) and AR(2) = -0.1514 (p = 0.0020), both statistically significant. This indicates that inflation exhibits autoregressive persistence, meaning past inflation values have a moderate influence on present inflation movements. The seasonal moving average term SMA(1) = -0.1213 (p = 0.0133) is also significant, indicating a modest but detectable seasonal effect in the residual structure.
Residual diagnostics indicate that SARIMA removes most autocorrelation, as shown by the Ljung-Box test p-values, which remain above 0.05 across multiple lags. Additionally, the ACF plot of residuals does not show significant serial correlation, suggesting that the model captures inflation trends effectively. However, the Q-Q plot reveals deviations from normality, particularly in the tails, implying that extreme inflation movements are not fully captured. This suggests that SARIMA alone is insufficient to model volatility clustering, a key feature in financial and macroeconomic time series.
From a numerical standpoint, SARIMA achieves an AIC of -0.1478 and a BIC of -0.0996. Given these limitations, we extend our analysis by integrating GARCH-type models to explicitly model conditional heteroskedasticity, a common feature in inflation series.
##
## ARCH LM-test; Null hypothesis: no ARCH effects
##
## data: garch_resid
## Chi-squared = 4.527, df = 10, p-value = 0.9205
To address SARIMA’s inability to model volatility clustering, we introduce the SARIMA-GARCH model, which integrates a GARCH(1,1) component to capture time-varying inflation uncertainty. The GARCH(1,1) model, originally introduced by Bollerslev (1986) as an extension of the ARCH model by Engle (1982), assumes that past volatility and past squared shocks influence future inflation uncertainty. This aligns with discussions in Tsay (2010, pp. 134-139), where ARMA models are coupled with GARCH to improve time series forecasting.
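A hedged sketch of this step with `rugarch`, following the residual-based strategy proposed in the EDA: a GARCH(1,1) with Student-t innovations fitted to the SARIMA residuals (`residuals_model2` from above). Swapping `model = "eGARCH"` or `model = "gjrGARCH"` yields the asymmetric variants examined in the next sections.

```r
library(rugarch)
spec <- ugarchspec(
  variance.model     = list(model = "sGARCH", garchOrder = c(1, 1)),
  mean.model         = list(armaOrder = c(0, 0), include.mean = FALSE),
  distribution.model = "std")                         # Student-t innovations
garch_fit   <- ugarchfit(spec, data = residuals_model2)
garch_resid <- residuals(garch_fit, standardize = TRUE)
FinTS::ArchTest(garch_resid, lags = 10)               # ARCH-LM test, shown above
```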
Examining the conditional variance dynamics, the GARCH(1,1) coefficients (\(\alpha_1 = 0.2339\), \(\beta_1 = 0.6834\)) suggest persistent volatility, with a total persistence measure of 0.917, indicating that inflation uncertainty remains elevated over extended periods. The Student-t shape parameter (4.3499) confirms the presence of fat-tailed residuals, further justifying the choice of a t-distributed error model.
Residual diagnostics, including ARCH-LM tests, confirm that the SARIMA-GARCH model successfully removes significant autocorrelation in squared residuals. The ARCH-LM test on standardized residuals yields a p-value of 0.9205, indicating no remaining ARCH effects and supporting the model's effectiveness in capturing volatility clustering. From a numerical perspective, SARIMA-GARCH achieves an AIC of -0.4401 and a BIC of -0.3726.
However, a key limitation of SARIMA-GARCH is that it assumes symmetric volatility responses to inflation shocks. This means that positive and negative inflation surprises are treated equally, despite evidence that inflation often reacts asymmetrically to economic shocks (e.g., monetary policy tightening vs. expansionary measures). To address this, we extend our analysis by integrating EGARCH and GJR-GARCH models, which allow for asymmetric volatility effects.
##
## ARCH LM-test; Null hypothesis: no ARCH effects
##
## data: egarch_resid
## Chi-squared = 4.4604, df = 10, p-value = 0.9242
The SARIMA-EGARCH model extends SARIMA-GARCH by introducing asymmetric volatility effects, allowing inflation uncertainty to react differently to positive and negative shocks. Unlike standard GARCH, EGARCH, originally developed by Nelson (1991), uses a log-transformation of variance, ensuring that volatility remains positive without restrictive parameter constraints. Additionally, EGARCH captures leverage effects, meaning that negative economic shocks can increase inflation volatility more than positive shocks of the same magnitude. The effectiveness of EGARCH in modeling asymmetric macroeconomic volatility has been emphasized in Tsay (2010, pp. 143-145).
Examining the variance dynamics, the EGARCH(1,1) parameters (\(\beta = 0.8761\), \(\alpha = 0.0421\), \(\gamma = 0.3925\)) suggest that past volatility strongly influences current uncertainty. Importantly, the asymmetry coefficient (\(\gamma = 0.3925\)) is highly significant (p < 0.01), meaning that negative inflation shocks increase volatility more than positive ones, aligning with theoretical expectations. The omega coefficient (-0.3909) reflects the baseline log-volatility, and the Student-t shape parameter (4.3509) confirms fat-tailed residuals, reinforcing the choice of a non-Gaussian error structure.
ARCH-LM test results show no remaining ARCH effects (p-value = 0.9242), confirming that the model successfully captures conditional heteroskedasticity. From a numerical standpoint, SARIMA-EGARCH achieves an AIC of -0.4302 and a BIC of -0.3531. Given the importance of capturing asymmetric volatility, we next explore the GJR-GARCH model, which models leverage effects differently than EGARCH.
##
## ARCH LM-test; Null hypothesis: no ARCH effects
##
## data: gjr_garch_resid
## Chi-squared = 3.9942, df = 10, p-value = 0.9476
The SARIMA-GJR-GARCH model incorporates asymmetric volatility effects, specifically focusing on leverage effects where negative shocks may increase volatility more than positive ones. This model, developed by Glosten, Jagannathan, and Runkle (1993), is widely used in financial volatility modeling.
Examining the variance dynamics, the GJR-GARCH(1,1) parameters (\(\alpha = 0.2867\), \(\beta = 0.7003\), \(\gamma = -0.1283\)) suggest that past volatility and past shocks play an important role in explaining conditional variance. However, the asymmetry parameter (\(\gamma\)) is not statistically significant, implying that negative inflation shocks do not drastically increase volatility more than positive ones. The Student-t shape parameter (4.4078) confirms fat-tailed residuals.
ARCH-LM test results confirm no remaining ARCH effects (p-value = 0.9476), indicating that the model sufficiently captures conditional heteroskedasticity. From a numerical perspective, SARIMA-GJR-GARCH achieves the best log-likelihood (100.0056) and an AIC of -0.4392, essentially tied with SARIMA-GARCH's -0.4401, making it one of the most effective models for capturing CPI inflation volatility.
## Model AIC BIC Log_Likelihood
## 1 SARIMA -0.1477666 -0.1475360 35.95710
## 2 SARIMA-GARCH -0.4400526 -0.3725941 99.19102
## 3 SARIMA-EGARCH -0.4302243 -0.3531289 98.13198
## 4 SARIMA-GJR-GARCH -0.4391678 -0.3620724 100.00565
All three GARCH-based models effectively eliminate autocorrelation and capture seasonal inflation volatility using Student-t distributed errors, ensuring a robust inflation modeling framework. Among them, SARIMA-GJR-GARCH(1,1) achieves the highest log-likelihood (100.0056) and an AIC (-0.4392) essentially tied with SARIMA-GARCH's lowest (-0.4401). The ARCH-LM tests confirm that none of the models exhibit significant remaining ARCH effects, indicating proper volatility modeling.
While SARIMA-GARCH and SARIMA-EGARCH provide valuable insights into volatility clustering and asymmetric effects, SARIMA-GJR-GARCH stands out as the best-performing model, balancing numerical performance with economic interpretability. Given its statistical fit and ability to incorporate volatility effects, SARIMA(2,0,0)(0,0,1)[12]-GJR-GARCH(1,1) emerges as the most appropriate model for CPI inflation forecasting.
Rolling window cross-validation is a widely used approach in time series forecasting to evaluate model performance under dynamically changing conditions. As Tsay (2010) explains, the rolling forecasting procedure helps account for the fact that new observations continuously become available, requiring the model to adapt over time.
This approach ensures that forecast accuracy measures reflect the model’s ability to generalize to unseen data (p. 216). By shifting the training and test windows forward iteratively, this method mimics the real-world scenario of incrementally updated forecasts, improving model robustness.
Tsay (2010) also notes that rolling evaluations are particularly useful for assessing models that undergo structural changes, such as regime-switching processes or those affected by economic cycles (p. 222). Consequently, rolling window validation is an essential tool for evaluating forecasting accuracy in financial and economic time series.
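A minimal rolling-window sketch for the baseline SARIMA model; the window length and one-step refit scheme are our assumptions, and the GARCH variants can be evaluated analogously (e.g., with `rugarch::ugarchroll()`).

```r
y   <- as.numeric(df$CPIAUCSL)
win <- 300                                   # assumed: 25 years of monthly data
err <- numeric(length(y) - win)
for (i in seq_along(err)) {                  # shift train/test windows forward
  fit <- arima(y[i:(i + win - 1)], order = c(2, 0, 0),
               seasonal = list(order = c(0, 0, 1), period = 12))
  err[i] <- y[i + win] - predict(fit, n.ahead = 1)$pred
}
c(MAE = mean(abs(err)), RMSE = sqrt(mean(err^2)))
```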
## Model MAE RMSE
## 1 SARIMA 0.1934351 0.2574763
## 2 SARIMA-GARCH 0.1836068 0.2487512
## 3 SARIMA-EGARCH 0.1797050 0.2448797
## 4 SARIMA-GJR-GARCH 0.1826161 0.2477861
The SARIMA-EGARCH model emerges as the best-performing model in forecasting accuracy, achieving the lowest Mean Absolute Error (MAE = 0.1797) and Root Mean Squared Error (RMSE = 0.2448). This result supports the findings of Nelson (1991) and Tsay (2010, pp. 143-145), which emphasize EGARCH’s effectiveness in capturing asymmetric volatility effects. The ability to differentiate between positive and negative economic shocks allows SARIMA-EGARCH to better account for periods of heightened uncertainty, leading to superior predictive accuracy.
The SARIMA-GJR-GARCH model ranks as the second-best performer, with MAE = 0.1826 and RMSE = 0.2478. Designed to capture leverage effects (Glosten, Jagannathan, & Runkle, 1993), this model effectively accounts for asymmetric responses in inflation volatility. While it slightly underperforms compared to SARIMA-EGARCH, it still demonstrates strong forecasting capabilities and an improvement over standard SARIMA-GARCH.
The SARIMA-GARCH model, originally introduced by Bollerslev (1986), performs slightly worse than SARIMA-GJR-GARCH, with MAE = 0.1836 and RMSE = 0.2488. Although integrating seasonal ARIMA components with GARCH-based volatility structures enhances forecasting accuracy by capturing both short-term dependencies and long-term volatility clustering, its lack of asymmetry handling limits its adaptability in economic environments where inflation reacts differently to supply-side versus demand-side shocks.
Finally, the baseline SARIMA model exhibits the weakest predictive performance (MAE = 0.1934, RMSE = 0.2575), reinforcing the importance of incorporating volatility modeling techniques when forecasting inflation. SARIMA’s relatively narrow confidence intervals (CI) stem from its lack of a volatility component, leading to an underestimation of uncertainty in future predictions. The wider confidence bands in EGARCH and GJR-GARCH models reflect their ability to adjust for conditional variance, particularly in volatile periods.
Therefore, based on forecasting performance and statistical robustness, SARIMA-EGARCH emerges as the preferred model for CPI inflation analysis, followed by SARIMA-GJR-GARCH, SARIMA-GARCH, and SARIMA.
Given its numerical superiority and ability to model asymmetric volatility, SARIMA-EGARCH is the most effective model for inflation forecasting. While SARIMA-GARCH and SARIMA-GJR-GARCH provide solid alternatives, they do not outperform EGARCH in predictive accuracy. The results highlight the importance of accounting for asymmetric volatility shocks when modeling inflation dynamics, supporting the broader literature on financial and macroeconomic volatility forecasting.
The analysis of Consumer Price Index (CPI) inflation using SARIMA and SARIMA-EGARCH models provides critical insights into inflationary trends and their macroeconomic implications. The SARIMA(2,0,0)(0,0,1)[12] model captures both short-term and seasonal dependencies in inflation, highlighting cyclical patterns that align with broader economic fluctuations. However, inflation volatility, which reflects uncertainty in price movements, necessitates models like EGARCH, which explicitly model conditional heteroskedasticity and asymmetric volatility responses. The findings from SARIMA-EGARCH reveal key insights into inflation persistence, volatility clustering, and the asymmetric response of inflation to economic shocks, which are crucial for monetary policy and economic forecasting.
The SARIMA-EGARCH model emerges as the best-performing model in terms of forecasting accuracy, achieving the lowest Mean Absolute Error (MAE = 0.1797) and Root Mean Squared Error (RMSE = 0.2448). This supports findings from Nelson (1991) and Tsay (2010, pp. 143-145), emphasizing EGARCH’s ability to capture asymmetric volatility effects. Given that inflation dynamics are often driven by unpredictable supply and demand shocks, the flexibility of EGARCH in differentiating between positive and negative shocks allows for a more realistic representation of inflation volatility.
Inflation forecasts play a fundamental role in monetary policy formulation, as central banks, particularly the Federal Reserve, rely on inflation expectations to guide interest rate decisions. The Federal Reserve follows a dual mandate—maintaining price stability and fostering maximum employment—where inflation modeling aids in determining appropriate monetary policy responses. According to Mankiw (2010, Ch. 4), inflation trends are heavily influenced by money supply growth, as illustrated by the quantity theory of money, which links long-run inflation directly to the expansion of monetary aggregates. Furthermore, the New Keynesian framework, as outlined in Galí (2008, Ch. 5), emphasizes the role of forward-looking expectations in shaping inflation dynamics, where credibility in monetary policy can anchor inflation expectations.
The results from SARIMA-EGARCH highlight that CPI inflation exhibits volatility clustering, indicating that periods of high inflation uncertainty are followed by continued uncertainty. This aligns with empirical findings on monetary policy shocks, which can induce long-lasting effects on inflation expectations and macroeconomic stability. Additionally, the EGARCH model captures asymmetric inflation responses, showing that negative economic shocks (e.g., recessions, financial crises) have a greater impact on inflation volatility than positive shocks, reinforcing the notion of leverage effects in macroeconomic fluctuations.
To contextualize these findings, we can compare historical inflation trends with past economic crises and stagflation periods. The 1970s stagflation (high inflation and high unemployment) was characterized by supply-side shocks, excessive monetary expansion, and oil price volatility, which resulted in persistent inflationary pressures. During the 2008 financial crisis, the Federal Reserve implemented quantitative easing (QE) to counteract deflationary pressures, highlighting the trade-offs between inflation targeting and economic stabilization. More recently, the COVID-19 pandemic induced a sharp increase in inflation volatility, driven by supply chain disruptions, labor shortages, and fiscal stimulus measures. The SARIMA-EGARCH model confirms that inflation volatility has become more persistent post-pandemic, reflecting heightened macroeconomic uncertainty.
The empirical results underscore the importance of SARIMA-EGARCH in inflation analysis, particularly for monetary policymakers, financial analysts, and economic forecasters. While SARIMA captures trend and seasonality, EGARCH provides critical insights into inflation uncertainty and risk. Given the Federal Reserve’s reliance on inflation forecasts to adjust interest rates, the ability to model conditional heteroskedasticity and asymmetric volatility responses enhances the predictive power of inflation models.
Future research could explore the integration of exogenous macroeconomic indicators (e.g., oil prices, labor costs, interest rates) using ARIMAX and VAR models to refine inflation forecasting and provide richer insights into macroeconomic policy effects.
As financial forecasting becomes more complex, traditional statistical models may struggle to capture nonlinear interactions and high-dimensional dependencies in economic data. Machine learning techniques like Random Forest (RF) and Extreme Gradient Boosting (XGBoost) provide alternative approaches by leveraging ensemble learning to improve predictive accuracy. Unlike SARIMA-GARCH models, which rely on predefined statistical assumptions, these machine learning methods are data-driven, allowing them to detect hidden patterns in time series without requiring explicit model specifications. Random Forest constructs multiple decision trees using bootstrapped samples and averages predictions to reduce overfitting (James et al., 2021, pp. 344–345). However, RF does not inherently capture temporal dependencies, requiring feature engineering with lagged variables for time series forecasting.
XGBoost, an advanced boosting algorithm, sequentially builds trees to minimize residual errors, resulting in higher predictive power and better generalization (James et al., 2021, pp. 346–348). Unlike RF, which grows trees independently, XGBoost adjusts each tree based on the weaknesses of prior iterations, making it highly effective for macroeconomic forecasting. Research has shown that XGBoost outperforms traditional models in predicting economic trends by capturing complex relationships and efficiently optimizing hyperparameters. Future research could explore hybrid approaches integrating SARIMA-GARCH with XGBoost or RF, leveraging the interpretability of econometric models while enhancing predictive accuracy with machine learning’s ability to capture nonlinear dependencies and dynamic interactions in macroeconomic data.
Traditional statistical models like SARIMA-GARCH, while effective, struggle to capture highly non-linear relationships and long-term dependencies in financial and macroeconomic time series. Long Short-Term Memory (LSTM), a specialized deep learning model, overcomes these limitations by leveraging recurrent neural network (RNN) architecture to retain long-term dependencies and mitigate the vanishing gradient problem in sequential data. LSTMs have been widely applied in financial forecasting, outperforming conventional models in volatility modeling and macroeconomic trend prediction. Siami-Namini et al. (2018) compared ARIMA and LSTM and found that LSTM models reduced forecasting errors by 84–87%, demonstrating their superior ability to learn intricate patterns in time series.
Unlike SARIMA-GARCH hybrids, which assume linear dependencies and conditional heteroskedasticity, LSTMs adaptively learn from data without predefined structures, making them particularly well-suited for inflation forecasting. Clark et al. (2020) highlight that deep learning models, particularly LSTMs, have gained traction in time series regression due to their ability to handle complex economic shocks dynamically. Future research could explore hybrid models combining SARIMA-GARCH with LSTM, where SARIMA-GARCH captures short-term dependencies and volatility clustering, while LSTM detects long-term macroeconomic dynamics. This fusion could enhance inflation forecasting, particularly in volatile economic conditions where traditional models may fall short.
Incorporating exogenous variables into time series forecasting significantly enhances predictive accuracy, particularly in macroeconomic contexts where external factors influence inflation dynamics. The Autoregressive Integrated Moving Average with Exogenous Variables (ARIMAX) model extends ARIMA by integrating explanatory macroeconomic indicators such as oil prices, labor costs, or Federal Reserve policy changes. Unlike SARIMA, which relies solely on past values of the target variable, ARIMAX improves forecasting by incorporating economic shocks directly into the model. Stock & Watson (2001) demonstrate how ARIMAX models capture interdependencies in inflation and monetary policy, improving model interpretability.
In contrast, Vector Autoregression (VAR) models, commonly used in macroeconomic forecasting, treat all included variables as interdependent, allowing researchers to examine inflation’s response to external policy shocks. Koop & Korobilis (2010) highlight that Bayesian estimation techniques can mitigate overfitting in high-dimensional datasets, making VAR a robust alternative for modeling macroeconomic linkages. Future research could compare ARIMAX models against SARIMA-GARCH hybrids and machine learning techniques like XGBoost and Random Forest, evaluating their effectiveness in inflation forecasting. A hybrid approach combining the statistical rigor of ARIMAX with the adaptability of machine learning could provide a more flexible and responsive forecasting framework for real-time economic analysis.
This study examined time series modeling approaches for CPI inflation forecasting, emphasizing the importance of volatility modeling in capturing inflationary dynamics. We compared SARIMA and SARIMA-GARCH hybrid models, incorporating standard GARCH, EGARCH, and GJR-GARCH models to assess their effectiveness in handling conditional heteroskedasticity and asymmetric volatility effects. Our findings indicate that SARIMA-EGARCH delivers the highest forecasting accuracy, achieving the lowest Mean Absolute Error (MAE = 0.1797) and Root Mean Squared Error (RMSE = 0.2448). This aligns with the literature highlighting EGARCH's effectiveness in capturing asymmetric responses to economic shocks (Nelson, 1991; Tsay, 2010). The SARIMA-GJR-GARCH model ranks second, demonstrating strong performance in modeling leverage effects, though it slightly underperforms compared to SARIMA-EGARCH in overall accuracy. Meanwhile, the SARIMA-GARCH model effectively captures volatility clustering but lacks asymmetry handling, and the baseline SARIMA model exhibits the weakest predictive performance due to its inability to model time-varying variance.

From a macroeconomic perspective, our results confirm that inflation volatility is persistent, supporting theories that inflation uncertainty remains elevated following economic shocks. The asymmetric response captured by EGARCH and GJR-GARCH suggests that negative economic events contribute disproportionately to inflation volatility, a critical consideration for monetary policy formulation and financial risk management.

Despite the strong performance of SARIMA-EGARCH, certain limitations remain. The model does not incorporate exogenous macroeconomic variables, such as interest rates or oil prices, which can significantly influence inflation. Future research could explore ARIMAX models, machine learning techniques (e.g., XGBoost, LSTM), or hybrid approaches to enhance forecasting robustness.

In conclusion, SARIMA-EGARCH emerges as the most effective model for CPI inflation forecasting, offering both statistical rigor and macroeconomic interpretability. However, integrating external economic indicators, machine learning techniques, and non-linear methods could further improve inflation prediction, particularly in periods of heightened economic uncertainty.