1. Introduction

With the development of globalization, exchange rates now play a very important role in finance; the exchange rate is often described as the most important price in any economy. When a currency appreciates, the country becomes more expensive and less competitive internationally. At the same time, its citizens enjoy a higher standard of living, since they can buy international products at lower prices. [1]

In this report, I analyze the exchange rate between the US Dollar (USD) and the British Pound (GBP) from 2009 to 2015. The United States and the United Kingdom are two of the most powerful western countries, and I would like to understand how their currencies’ relationship behaves. The reason I am interested in this particular time period is that exchange rates are usually significantly affected by influential international events. As we know, the huge financial crisis of 2007-2008 greatly affected not only GBP and USD but the whole currency system, and the “Brexit” vote that emerged in 2016 also significantly affected the value of the pound. Since such high-impact global events are not very predictable, I am more interested in investigating the relationship in relatively normal years, from 2009 to 2015.

2. Data Overview

I downloaded the data from “FRED” (Federal Reserve Bank of St. Louis) [2]. The data contain weekly exchange rates between USD and GBP from the first week of 2009 to the first week of 2015.

First, let’s read in the data and get some general ideas.
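A minimal sketch of the read-in step is shown below. The file name DEXUSUK.csv and the object names are my own choices; the DATE column is converted to decimal years, as explained after the output.

usuk_data <- read.csv("DEXUSUK.csv")            # weekly USD/GBP rates downloaded from FRED
usuk <- usuk_data$DEXUSUK                       # the exchange rate series itself
usuk_data$DATE <- 2009 + seq_along(usuk) / 52   # week k becomes the decimal year 2009 + k/52
head(usuk_data)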

##       DATE  DEXUSUK
## 1 2009.019 1.453125
## 2 2009.038 1.501240
## 3 2009.058 1.466220
## 4 2009.077 1.381225
## 5 2009.096 1.422000
## 6 2009.115 1.450040

There are a total of 312 equally spaced time points. Each data point has two fields. The field “DATE” is the time of the observation, and the field “DEXUSUK” is the average exchange rate for that week, i.e., how many US dollars, on average, 1 pound was worth during that week. For example, a value of 1.46 means that, in that week, 1 pound was worth about 1.46 dollars on average. (Note that, for the purpose of analysis, I have transformed the original dates into numerical form by adding multiples of 1/52 to 2009, so that the series is easier to plot.)

Now let’s look at a brief summary of this time series data:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.381   1.552   1.593   1.585   1.627   1.714

We can see from the summary above that the mean exchange rate is around 1.585, the minimum is around 1.381, and the maximum is around 1.714. To get a more thorough picture of the data, let’s make a time plot. Note that the blue line represents the mean of the exchange rates.
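A sketch of the time plot, assuming the objects defined in the read-in step above:

date <- usuk_data$DATE
plot(date, usuk, type = "l", xlab = "Year", ylab = "Exchange rate (USD per GBP)")
abline(h = mean(usuk), col = "blue")   # blue horizontal line at the mean rate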

From the plot above, we notice some oscillation from week to week, but the series overall tends to bounce back toward the mean line. At the beginning of the data, the exchange rate is relatively low, which might be explained by the lingering effect of the great financial crisis of 2008. Other than that, there are some highs, such as around mid-2014 where the series reached its peak (1 pound to 1.714 dollars), and some lows, such as around mid-2010 (1 pound to about 1.43 dollars).

To make the plot easier to read, let’s make a smoothed time plot using Loess smoothing.
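The smoothed curve can be obtained with loess(); the span value below is an illustrative choice rather than the exact one used for the plot:

usuk_loess <- loess(usuk ~ date, span = 0.5)    # local regression smoother
plot(date, usuk, type = "l", col = "grey", xlab = "Year", ylab = "Exchange rate (USD per GBP)")
lines(date, fitted(usuk_loess), col = "red")    # Loess-smoothed trend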

From the plot, we do not see any obvious periodicity in the time series, but we cannot confirm that until we check.

To identify whether or not any periodicity exists, we can carry out a smoothed spectrum analysis.
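A smoothed periodogram can be produced with spectrum(); the smoothing spans below are illustrative choices:

spectrum(usuk, spans = c(5, 5), main = "Smoothed periodogram of the USD/GBP exchange rate")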

From the frequency plot, the dominant frequency is around 0.0125 cycles per week, or in other words, about 80 weeks per cycle. If we look back at the time plot, we can find a very weak period of around 80 weeks: there are some dips roughly at multiples of 80 weeks. However, this pattern is very weak and does not make much intuitive sense, so I will not pursue it further in this analysis.

3. Fitting an ARMA(p, q) model

From the time plot we see that the trend in this time series is very weak, if not absent. Therefore, starting with an ARMA(p,q) model seems reasonable. The null hypothesis is that the average exchange rate between GBP and USD does not change much over these years, despite the oscillations that almost every time series exhibits.

In order to choose an optimal pair of (p,q) values, I tabulate the AIC values for different (p,q) combinations below:

|      | MA 0     | MA 1     | MA 2     | MA 3     |
|------|----------|----------|----------|----------|
| AR 0 | -855.32  | -1198.15 | -1412.68 | -1498.71 |
| AR 1 | -1695.01 | -1709.81 | -1708.04 | -1706.13 |
| AR 2 | -1707.74 | -1708.01 | -1706.05 | -1705.12 |
| AR 3 | -1707.90 | -1706.12 | -1708.95 | -1704.68 |
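The table above can be generated with a small helper that loops over the candidate orders, along the lines of the following sketch (the function name and the d argument are my own additions; d = 1 is reused later for the ARIMA models):

aic_table <- function(data, P, Q, d = 0) {
  table <- matrix(NA, P + 1, Q + 1)
  for (p in 0:P) {
    for (q in 0:Q) {
      table[p + 1, q + 1] <- arima(data, order = c(p, d, q))$aic   # AIC of ARIMA(p,d,q)
    }
  }
  dimnames(table) <- list(paste("AR", 0:P), paste("MA", 0:Q))
  table
}
aic_table(usuk, 3, 3)   # d = 0 gives the ARMA(p,q) table above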

The formula of AIC is as follows: \[AIC=-2 \times \ell(\theta^*) + 2D\] where \(\ell(\theta^*)\) is the maximized log likelihood, i.e., the log of \(\mathscr{L}(\theta)=f_{Y_{1:N}}(y^*_{1:N};\theta)\) evaluated at the parameter estimate \(\theta^*\), and \(D\) is the number of parameters. The second term \(2D\) is a penalty against over-complicated models that might cause overfitting.

Therefore, we would like to minimize the AIC value. From the table above, ARMA(1,1) and ARMA(3,2) give the lowest AIC values, at -1709.81 and -1708.95 respectively. Let’s start with ARMA(1,1), since we prefer the simpler model when two models perform similarly.

3.1 Fitting ARMA(1,1)
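The fit below comes from a call of the following form (the object name arma11 is reused in later snippets):

arma11 <- arima(usuk, order = c(1, 0, 1))   # ARMA(1,1) with an estimated mean (intercept)
arma11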

## 
## Call:
## arima(x = usuk, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1     ma1  intercept
##       0.9542  0.2539     1.5751
## s.e.  0.0177  0.0583     0.0229
## 
## sigma^2 estimated as 0.0002441:  log likelihood = 858.9,  aic = -1709.81

The fitted ARMA(1,1) model given the result above can be written as: \[\phi(B)(Y_n - 1.5751) = \psi(B)\varepsilon_n\] where \(B\) is the backshift operator, and the polynomials are defined as follows: \[\phi(x) = 1 - 0.954x \hspace{3cm} \psi(x) = 1 + 0.254x\]

Diagnostics of ARMA(1,1)

There are several things we need to check to make sure ARMA(1,1) satisfies the modeling assumptions.

First, I want to look more closely at the roots of the AR and MA polynomials of the fitted ARMA(1,1) model. In other words, we want to find the roots of \(\phi(x)\) and \(\psi(x)\) above.
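A sketch of this check: polyroot() returns the (possibly complex) roots, and their moduli are what causality and invertibility depend on. Note that arima() writes the MA polynomial as \(1 + \theta_1 x\), so its coefficient enters with a plus sign; the original output below prints the roots themselves rather than their moduli.

abs(polyroot(c(1, -coef(arma11)["ar1"])))   # modulus of the AR root of phi(x)
abs(polyroot(c(1,  coef(arma11)["ma1"])))   # modulus of the MA root of psi(x)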

## [1] 1.047998+0i
## [1] 3.938886+0i

Since both roots have modulus greater than 1 (they lie outside the unit circle), causality and invertibility are satisfied. Also, there does not seem to be any parameter redundancy, since the two roots are not close to each other.

Then, we want to check whether the assumptions are satisfied for ARMA(1,1). Let’s start with checking the residuals of the fitted ARMA(1,1) model as a time series plot:
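The residual diagnostics discussed in this and the following paragraphs can be drawn as in the sketch below, using the arma11 object from the fit above:

r11 <- resid(arma11)
plot(r11, type = "l", ylab = "ARMA(1,1) residuals")   # residuals over time
acf(r11, main = "ACF of ARMA(1,1) residuals")         # autocorrelation of the residuals
qqnorm(r11); qqline(r11)                              # normal QQ-plot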

From this plot we notice that, except for one extreme value near the beginning, the residuals look unremarkable, so this should not be too worrisome.

Next, we look at the autocorrelation plot of the ARMA(1,1) residuals.

Since the autocorrelations at all lags lie within the dashed confidence bands, the assumption that the errors \(\{\varepsilon_n\}\) are uncorrelated appears reasonable.

Another important assumption of the model is that \(\{\varepsilon_n\} \sim \mathcal{N}(0,\sigma^2)\), thus we will check the normality assumption by QQ-plot.

From this plot, we see that apart from a few points that deviate slightly from the line, most points lie close to it, so the normality assumption is roughly satisfied.

Let’s try the other model!

3.2 Fitting ARMA(3,2)
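As before, the fit is obtained with arima(); the object name arma32 is reused below:

arma32 <- arima(usuk, order = c(3, 0, 2))   # ARMA(3,2) with an estimated mean
arma32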

## 
## Call:
## arima(x = usuk, order = c(3, 0, 2))
## 
## Coefficients:
##          ar1      ar2      ar3      ma1      ma2  intercept
##       1.7568  -0.5721  -0.1891  -0.5736  -0.3956     1.5822
## s.e.  0.2018   0.3935   0.1939   0.1860   0.1754     0.0071
## 
## sigma^2 estimated as 0.0002397:  log likelihood = 861.47,  aic = -1708.95

The fitted ARMA(3,2) model given the result above can be written as: \[\phi(B)(Y_n - 1.5822) = \psi(B)\varepsilon_n\] where \(B\) is the backshift operator, and the polynomials are defined as follows: \[\phi(x) = 1 - 1.757x + 0.5721x^2 + 0.1891x^3 \hspace{3cm} \psi(x) = 1 - 0.574x - 0.3956x^2\]

Diagnostics of ARMA(3,2)

Again, we check AR and MA polynomial roots:
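A sketch of the computation, using the same conventions as for ARMA(1,1):

abs(polyroot(c(1, -coef(arma32)[c("ar1", "ar2", "ar3")])))   # moduli of the AR roots
abs(polyroot(c(1,  coef(arma32)[c("ma1", "ma2")])))          # moduli of the MA roots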

## [1] 1.021745 1.021745 5.065511
## [1] 1.022422 2.472371

The AR root moduli are 1.02, 1.02 and 5.07, while the MA root moduli are 1.02 and 2.47. The nearly coincident AR and MA roots around 1.02 indicate that there might be some parameter redundancy in the model, but causality and invertibility are satisfied.

We then check the residuals of ARMA(3,2) by plotting them against time and making the autocorrelation plot.
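The same diagnostics as before, now for the ARMA(3,2) residuals (including the QQ-plot discussed two paragraphs below):

r32 <- resid(arma32)
plot(r32, type = "l", ylab = "ARMA(3,2) residuals")
acf(r32, main = "ACF of ARMA(3,2) residuals")
qqnorm(r32); qqline(r32)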

Here we see plots very similar to those for ARMA(1,1); therefore, ARMA(3,2) also appears to have uncorrelated residuals.

Similarly, from this QQ-plot, we think the residuals are roughly normally distributed.

Often, directly modeling the raw time series might not be the best idea. Instead, taking differences between data points and using those as the new data can give a better fit.

4. Fitting an ARIMA model

In time series analysis, it is usually worth trying the differenced data instead of the original, since differencing can sometimes produce a more mean-stationary series that an ARMA model fits better. Therefore, let’s consider fitting some ARIMA models.

First, we generate the differenced data by taking differences between adjacent data points and using them as the new data points.
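A sketch of the differencing step and the corresponding time plot:

usuk_diff <- diff(usuk)                          # z_n = y_n - y_{n-1}
plot(date[-1], usuk_diff, type = "l", xlab = "Year", ylab = "Weekly change in rate")
abline(h = mean(usuk_diff), col = "blue")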

From this time plot, we notice that the differenced series behaves quite similarly to the untransformed one.

In order to build a proper model, we tabulate the AIC values for ARIMA(p,1,q) under different pairs of (p,q) values.

|      | MA 0     | MA 1     | MA 2     | MA 3     |
|------|----------|----------|----------|----------|
| AR 0 | -1691.91 | -1704.66 | -1703.41 | -1702.02 |
| AR 1 | -1702.26 | -1703.21 | -1701.86 | -1701.04 |
| AR 2 | -1703.76 | -1701.78 | -1699.88 | -1713.32 |
| AR 3 | -1701.78 | -1699.95 | -1710.59 | -1711.36 |
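This table can be produced with the same helper as before, simply setting the differencing order to 1:

aic_table(usuk, 3, 3, d = 1)   # ARIMA(p,1,q) models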

From the AIC table, we notice that ARIMA(3,1,2), ARIMA(2,1,3) and ARIMA(3,1,3) have the lowest AIC values.

# Fit the three candidate ARIMA(p,1,q) models with the lowest AIC values
arma213 <- arima(usuk, order = c(2,1,3))
arma312 <- arima(usuk, order = c(3,1,2))
arma313 <- arima(usuk, order = c(3,1,3))
# Check invertibility: moduli of the MA polynomial roots
abs(polyroot(c(1,-coef(arma213)["ma1"],-coef(arma213)["ma2"],-coef(arma213)["ma3"])))
## [1] 0.6261053 2.2832564 2.2832564
abs(polyroot(c(1,-coef(arma312)["ma1"],-coef(arma312)["ma2"])))
## [1] 0.8007233 1.3006628
abs(polyroot(c(1,-coef(arma313)["ma1"],-coef(arma313)["ma2"],-coef(arma313)["ma3"])))
## [1] 0.636418 2.357448 2.357448

However, after building these three models and checking the roots of their MA polynomials, each of them has an MA root with absolute value smaller than 1, which means they all have an invertibility problem. Thus I will check the model with the next smallest AIC, which is ARIMA(0,1,1).

4.1 Fitting ARIMA(0,1,1)

The ARIMA(0,1,1) model is simply an ARMA(0,1) model built on the once-differenced data. Let’s fit the model and check the root of its MA polynomial as usual.
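A sketch of the fit and the invertibility check (the object name arima011 is mine and is reused below):

arima011 <- arima(usuk, order = c(0, 1, 1))
arima011
abs(polyroot(c(1, coef(arima011)["ma1"])))   # modulus of the MA root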

## 
## Call:
## arima(x = usuk, order = c(0, 1, 1))
## 
## Coefficients:
##          ma1
##       0.2355
## s.e.  0.0581
## 
## sigma^2 estimated as 0.0002493:  log likelihood = 854.33,  aic = -1704.66
## [1] 4.246285

Therefore, we know that this model satisfies both causality and invertibility (with no AR part, causality is automatic). Next, we check its residuals for normality and correlation by making a time plot, a QQ-plot and an autocorrelation plot.
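As before, the three diagnostic plots can be drawn from the residuals:

r011 <- resid(arima011)
plot(r011, type = "l", ylab = "ARIMA(0,1,1) residuals")
qqnorm(r011); qqline(r011)
acf(r011, main = "ACF of ARIMA(0,1,1) residuals")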

From the time plot and QQ-plot, we notice that even though a few points are slightly off the line, most points lie on it, so we consider the residuals roughly normally distributed. Looking at the ACF plot, all lags lie within the dashed bands, meaning the residuals appear uncorrelated. Therefore, the assumptions of the model seem to be satisfied.

5. Forecasting future rates

After the analysis above, I want to use the ARMA(1,1), ARMA(3,2) and ARIMA(0,1,1) models to forecast future rates (the next 30 weeks) and compare the results with the real values. This can be done with the “forecast” package in R. [3]
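The forecast() function from the forecast package works directly on fits returned by arima(); a sketch of the 30-week forecasts, using the model objects named earlier:

library(forecast)
plot(forecast(arma11,   h = 30))   # ARMA(1,1) forecast
plot(forecast(arma32,   h = 30))   # ARMA(3,2) forecast
plot(forecast(arima011, h = 30))   # ARIMA(0,1,1) forecast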

From the three plots above, we note that ARMA(1,1) predicts the rate going up, ARIMA(0,1,1) predicts the rate staying at its current level, and ARMA(3,2) predicts the rate dropping first and then going back up.

Then I download the extended data set from “FRED” [2] and compare the forecasts with the real data.
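A sketch of the comparison plot; the file name DEXUSUK_extended.csv is an assumption for the extended download:

usuk_ext <- read.csv("DEXUSUK_extended.csv")$DEXUSUK   # original 312 weeks plus 30 more
date_ext <- 2009 + seq_along(usuk_ext) / 52
n <- length(usuk)                                      # last index of the original data
plot(date_ext, usuk_ext, type = "n", xlab = "Year", ylab = "Exchange rate (USD per GBP)")
lines(date_ext[1:n], usuk_ext[1:n], col = "red")       # data used for model fitting
lines(date_ext[n:length(usuk_ext)], usuk_ext[n:length(usuk_ext)], col = "green")   # extended real data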

In the time plot of the real data above, the red part is the original data that we analyzed earlier, and the green part is the extended real data (30 more weeks after the first week of 2015). From this, we can see that the ARMA(3,2) model gives a better forecast than the other two.

6. Conclusion

After all the analysis, we conclude that ARMA(3,2) is the most appropriate model in terms of forecasting accuracy, while ARIMA(0,1,1) and ARMA(1,1) might be chosen if we care more about model simplicity than about forecasting.

7. References

[1] https://www.theglobaleconomy.com/guide/article/107/

[2] https://fred.stlouisfed.org/series/DEXUSUK#0

[3] https://www.datascience.com/blog/introduction-to-forecasting-with-arima-in-r-learn-data-science-tutorials

Besides the references above, I also consulted the following two reports from previous years:

[4] https://ionides.github.io/531w18/midterm_project/project5/ProjectPaper.html#fitting-an-armapq-model

[5] https://ionides.github.io/531w16/midterm_project/project22/mid-project.html