The world economy is gradually developing in the direction of anti-globalization. And the development of world trade faces many uncertain factors.The concern is partially comes from the reading of articles about trade and globalization[1]. Therefore, the project will focus on the trade between the Asian country Japan and the United States, and analyze the fluctuation of trade through the US import data. The data is downloaded from UN Comtrade database[2], and for the analysis, I choose the month trade values between two countries from 2010 to 2019.
The analysis will focus on studying how the trade value changed during past ten years, and try to see if there are some obvious fluctrations among the data. Moreover, I’ll try to use appropriate models to fit the data, which may help predict the future trend of the import trade value.
Starting by importing the data, we can see the month trade value went as below:
data<-read.csv('USA-JAPAN import month trade value data.csv',sep=',',header=TRUE)
data.TV<-data[,c("Trade.Value")]
data.GR<-data[,c("Growth.Rate")]
plot(data.TV, type='l')
And we can calculate some properties of the data:
library(plyr)
each(max,min,mean,median,sd)(data.TV)
## max min mean median sd
## 7833918854 4991606306 6232363006 6162532844 578622844
From the plot and the properties of the data, we can see that the trade value is large and has a large volatility, so then I calculate the growth rate of the value each year to see if that is more proper for analyzing.
plot(data.GR, type='l')
The properties of growth rate is as follows:
library(plyr)
data.GR[is.na(data.GR)] <- 0
each(max,min,mean,median,sd)(data.GR)
## max min mean median sd
## 0.317910964 -0.142626667 0.006071650 -0.005013895 0.085350082
From both the plot and the properties, the growth rate seems more stable and worth to analyze. Furthermore, we can use the acf and pacf plots to speculate if it’s stationary.
acf(data.GR,type="correlation")
acf(data.GR,type="partial")
Above analyzing processes are partly refered to my hw3 project[3]
It’s clear that the seasonality property exists for the growth rate data, so I’ll then try to smooth the data and find the period. The following processes are also applied in my hw4 project[4].
spectrum(data.GR, main="Unsmoothed Periodogram of Growth rate")
plot1 = spectrum(data.GR,spans=c(3,3),main="Smoothed Periodogram of Growth rate")
The dominant frequency seems to be around 0.42. Use the following code to find the dominant frequency point:
plot1$freq[which.max(plot1$spec)]
## [1] 0.4166667
It seems that the dominant frequency is 0.417, the accordingly period is around 2.4 month, while the other frequencies is not much significant. From this value, we can find that the ocillatory period is just close to what is shown in the time plot, which is around 2 month.
Then we try to use loess method to smooth.
month = data$Period
GR_loess <-loess(data.GR~month, span=0.2)
plot(data.GR,type="l")
lines(GR_loess$fitted,type="l")
After smoothing, we can see the growth rate is close to 0.01 and maintained at a rather stable level.Then we can start to fit the models with the above preperties of data set.
We start by assuming the data is stable wihout trend, and trying to let the data fit to an ARMA model.
The general equation of an ARMA model is given by:
\[ \phi(B) (Y_{n}-\mu )=\psi(B)\varepsilon _{n} \]
Witin the above equation, \(\phi(B)\) is the autoregressive operator, and \(\psi(B)\) is the moving average operator.\(\mu\) is the mean value of the data, and \(\varepsilon _{n}\) is the white noise. What we need to do is to find the pair of (p,q) which can make the data fit the model most.
A good method to choose the model is using AIC, which is given by:
\[ (-2)\times l(\theta)+2D \]
So, we can make a table of AIC values of different pairs of(p,q), and then try to compare them.
## Warning in arima(data, order = c(p, 0, q), method = "ML"): possible
## convergence problem: optim gave code = 1
## Warning in arima(data, order = c(p, 0, q), method = "ML"): possible
## convergence problem: optim gave code = 1
## Warning in arima(data, order = c(p, 0, q), method = "ML"): possible
## convergence problem: optim gave code = 1
## Warning in arima(data, order = c(p, 0, q), method = "ML"): possible
## convergence problem: optim gave code = 1
## Loading required package: knitr
MA0 | MA1 | MA2 | MA3 | MA4 | MA5 | |
---|---|---|---|---|---|---|
AR0 | -222.09 | -268.22 | -266.89 | -269.52 | -267.83 | -267.65 |
AR1 | -263.60 | -267.31 | -266.07 | -267.65 | -266.54 | -266.37 |
AR2 | -264.80 | -267.68 | -265.69 | -266.32 | -276.07 | -266.50 |
AR3 | -262.87 | -265.73 | -266.07 | -276.24 | -274.44 | -272.95 |
AR4 | -271.41 | -269.71 | -269.11 | -274.72 | -273.73 | -276.18 |
AR5 | -269.48 | -268.18 | -267.12 | -266.84 | -274.93 | -271.94 |
Comparing with their AIC values, we can achieve that ARMA(0,1) model may better reflect the data, which has a lower AIC value and less numbers of parameters. So then we can fit the ARMA(0,1) model to the data:
##
## Call:
## arima(x = data.GR, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## -0.6911 0.0056
## s.e. 0.0678 0.0021
##
## sigma^2 estimated as 0.004594: log likelihood = 137.11, aic = -268.22
We can get the plot of residuals:
Compare the estimated data fiited with ARMA(0,1) model with the original data:
plot(data.GR,type="l")
lines(data.fitted,type="l",lty=2,col='blue')
Then calculate the bias at each point:
bias.GR = (data.GR-data.fitted)/data.GR
plot(bias.GR,type="l")
We can find that ARMA(0,1) model describes the similar trend, but there are some deviations in the values of the vertices. What’s more, the bias at these vertices can be quite large. As for it’s the estimation of growth rate, we may need to find a better model to describe the data. Considering the rotating period, I decide to try SARMA model.
Using \(SARIMA(0,0,1)\times(1,0,1)_{3}\) model to fit the data:
##
## Call:
## arima(x = data.GR, order = c(0, 0, 1), seasonal = list(order = c(1, 0, 1), period = 3))
##
## Coefficients:
## ma1 sar1 sma1 intercept
## -0.6443 0.8346 -1.0000 5e-03
## s.e. 0.0744 0.0706 0.0519 9e-04
##
## sigma^2 estimated as 0.004209: log likelihood = 139.67, aic = -269.34
The plot of residuals is below:
Again we can compare the estimated data under \(SARIMA(0,0,1)\times(1,0,1)_{3}\) model between the growth rate data:
We can calculate the bias plot as follows:
bias2.GR = (data.GR-data.fitted2)/data.GR
plot(bias2.GR,type="l")
The bias at certain points is still large, while some of the biases is smaller than ARMA(0,1) model. It seems that it’s better to change the estimated value with large biases, for growth rate can have accumulated results on trade value, or find a better model to fit the growth rate data.
We can just use the growth data estimated by \(SARIMA(0,0,1)\times(1,0,1)_{3}\) model to predict trade value. Comparing with the real value, we can get the plot below:
In conclusion, through all above analyzing processes, we can use \(SARIMA(0,0,1)\times(1,0,1)_{3}\) model to fit growth rate data, and then predict the future trade value. This method can totally catch the trend of the value, however, we can find directly from the plot that the certain value at vertices have some biases compared with the true value. So it still needs better model if we want to predict the value of trade amount.
By achieving the trend of the trade value, we can use this to further analyzing the economic growth, or sometime may also evaluate the relationship between two countries. From the result achieved by this project, we can see the United States’ import from Japan has a rather clear trend
[1]Our World in Data https://ourworldindata.org/trade-and-globalization
[2]UN Comtrade Database https://comtrade.un.org/data/
[3]Homework 3 https://ionides.github.io/531w20/hw03/hw03.html
[4]Homework 4 https://ionides.github.io/531w20/hw04/hw04.html