Licensed under the Creative Commons attribution-noncommercial license, http://creativecommons.org/licenses/by-nc/3.0/. |
Please share and remix noncommercially, mentioning its origin. |
A cup of coffee is ubiquitous in everyday American life. A Gallup poll in 2015 reports that just under two thirds of US adults drink a cup of coffee a day and that one-sixth of Americans say that they are addicted to coffee. The US is the top coffee importing country in the world and it imports most of its beans from Brazil. Brazil is the world’s largest exporter of coffee by far, exporting 64% more metric tons than the next largest exporter, Vietnam, in 2014. It is fair to say that the mechanics of coffee demand and supply in the US and Brazil largely determine how much you pay for your morning coffee.
Objectives
In this report, we seek to answer the following questions:
How do the amount of precipitation in Brazil, Brazilian Real/US Dollar (BRD/USD) exchange rate, and US coffee bean import volume affect futures prices at the New York “C” contract market? Which of these has the largest impact on how futures contracts are priced?
Coffea Arabica, the main type of coffee tree grown in Brazil, is sensitive to changes in climate, particularly to droughts and frosts which decrease its yield. Is Brazil receiving more extreme amounts of precipitation over time?
Does the price of coffee futures contracts exhibit a trend, or can we attribute fluctuations in price merely to random noise?
Sources:
http://www.ers.usda.gov/data-products/us-food-imports.aspx
http://www.macroadvisers.com/monthly-gdp/ma-monthly-gdp-index-41/
One common issue we have to deal with when analyzing time series data is when the data sets being analyzed have different frequencies. We want all our time series data sets to have monthly frequencies when we perform succeeding analyses. This data has an annual frequency, and we disaggregate the data into a monthly frequency as follows:
library(tempdisagg)
#Total USA Import Volume in thousand 60kg bags
volume <- c(24378.01304, 26093.39476, 26056.16257, 27016, 27559)
year <- seq(from=2010,to=2014,by=1)
importvol <- data.frame(cbind(year,volume),row.names = 1)
importvolts <- ts(importvol , frequency=1 , start=2010)
gdp <- read.csv("usamonthlyrealgdp.csv",header=TRUE,row.names = 1)
gdpts <- ts(gdp, frequency = 12, start = 2010)
summary(model <- td(importvolts ~ gdpts))
##
## Call:
## td(formula = importvolts ~ gdpts)
##
## Residuals:
## Time Series:
## Start = 2010
## End = 2014
## Frequency = 1
## [1] -485.3 653.0 -198.0 204.6 -174.2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -929.6674 701.2565 -1.326 0.2768
## gdpts 0.2030 0.0457 4.443 0.0212 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## 'chow-lin-maxlog' disaggregation with 'sum' conversion
## 5 low-freq. obs. converted to 60 high-freq. obs.
## Adjusted R-squared: 0.8241 AR1-Parameter: 0 (truncated)
monthlyimportvol <- data.frame(matrix(model$values,60,1))
monthlyimportvolts <- ts(monthlyimportvol,frequency=12,start=2010)
ts.plot(monthlyimportvolts,xlab = "Year", ylab = "Thousand 60kg Bags", main = "Monthly US Import Volume")
We see from the above plot that the US imports of coffee has been increasing over the past five years.
Source: https://data.hdx.rwlabs.org/dataset/daily-summaries-of-precipitation-indicators-for-brazil
In the original data set, precipitation was recorded at several weather stations in Brazil per month from 2010 to 2014. To obtain the data set below, we averaged the precipitation over all weather stations with available data for each month. Note that the units of precipitation are in 0.1mm.
precip <- read.csv(file="brazilprecipitation.csv",header = TRUE, row.names = 1)
precipts <- ts(precip, frequency = 12, start = 2010)
ts.plot(precipts, xlab="Year",ylab="Precipitation in 0.1mm", main = "Average Monthly Precipitation in Brazil")
head(precip)
## precipitation
## 2010-01-01T00:00:00 2444.538
## 2010-02-01T00:00:00 1618.223
## 2010-03-01T00:00:00 1631.500
## 2010-04-01T00:00:00 1619.286
## 2010-05-01T00:00:00 749.500
## 2010-06-01T00:00:00 924.000
The above plot shows that the amount of precipitation Brazil receives is cyclical, and seems to have decreased in magnitude in recent years. In the second half of 2014, Brazil experienced drought and as a result arabica production dropped, which led to increased coffee prices during that time. With the phenominon of global warming being an important topic of discussion in the scientific, business, and political communities we ask whether the decreasing magnitude of precipitation we observe visually from the graph is indeed present.
Source: https://research.stlouisfed.org/fred2/series/DEXBZUS#
The original data set contains daily prices from 2010 to 2014. To obtain the data set below, we get the average daily BRD/USD exchange rate per month.
erate <- read.csv("BRLUSDrates.csv",header=TRUE,row.names = 1)
eratets <- ts(erate, frequency = 12, start = 2010)
ts.plot(eratets, xlab="Year",ylab="USD/BRL Exchange Rate", main = "Average Monthly USD/BRL Exchange Rate")
head(erate)
## DEXBZUS
## Jan-10 1.781653
## Feb-10 1.842005
## Mar-10 1.785539
## Apr-10 1.756782
## May-10 1.814155
## Jun-10 1.804200
We see from the plot above that the USD/BRL exchange rate shows a distinct upward trend.
Source: https://www.quandl.com/data/CHRIS/ICE_KC2-Coffee-C-Futures-Continuous-Contract-2-KC2
The original data set contains daily prices from 2010 to 2014. To obtain the data set below, we get the monthly average daily coffee futures prices in US cents per lb. The USD/BRL Exchange rate is a proxy for broader macroeconomic factors affecting both the US and Brazil.
futures <- read.csv("coffeeC.csv",header=TRUE,row.names = 1)
futurests <- ts(futures, frequency = 12, start = 2010)
ts.plot(futurests, xlab="Year",ylab="US cents per lb", main = "Average Monthly New York Coffee C Futures Contract Prices")
head(futures)
## ave_settlement_price
## Jan-10 141.9763
## Feb-10 133.5816
## Mar-10 134.0326
## Apr-10 134.3738
## May-10 134.9450
## Jun-10 152.0705
We are interested in changes over business cycle time scales once trends have been removed.To extract the cyclical component, we use an econometric method called the Hodrick-Prescott (HP) filter .
library(mFilter)
monthlyimportvolts_hp <- hpfilter(monthlyimportvolts, freq=100,type="lambda",drift=F)$cycle
precipts_hp <- hpfilter(precipts, freq=100,type="lambda",drift=F)$cycle
eratets_hp <- hpfilter(eratets, freq=100,type="lambda",drift=F)$cycle
futurests_hp <- hpfilter(futurests, freq=100,type="lambda",drift=F)$cycle
plot(futurests_hp,type="l",xlab="Year",ylab="Average Monthly Futures Contract Prices (USD cents per lb)")
par(new=TRUE)
plot(monthlyimportvolts_hp,col="red",type="l",axes=FALSE,xlab="",ylab="",main="Average Monthly US Coffee Import Volume (in Thousands of 60kg Bags per Month) against Average Monthly Futures Contract Prices")
axis(side=4, col="red")
plot(futurests_hp,type="l",xlab="Year",ylab="Average Monthly Futures Contract Prices (USD cents per lb)")
par(new=TRUE)
plot(precipts_hp,col="red",type="l",axes=FALSE,xlab="",ylab="",main="Average Monthly Precipitation in Brazil (in 0.1mm per month) against Average Monthly Futures Contract Prices")
axis(side=4, col="red")
plot(futurests_hp,type="l",xlab="Year",ylab="Average Monthly Futures Contract Prices (USD cents per lb)")
par(new=TRUE)
plot(eratets_hp,col="red",type="l",axes=FALSE,xlab="",ylab="",main="Average Monthly BRD/USD Exchange Rate against Average Monthly Futures Contract Prices")
axis(side=4, col="red")
From the above three graphs, we observe that detrended futures prices (FP), monthly import volumes (IV), monthly precipitation values (PV), and exchange rates (ER) seem to cycle together. We will perform some tests to see if this is so. Let us first analyze each of these time series using a regression with ARMA errors models. We discuss model selection in more detail in the supplementary analysis.
arima(futurests_hp,xreg=monthlyimportvolts_hp,order=c(2,0,1))
##
## Call:
## arima(x = futurests_hp, order = c(2, 0, 1), xreg = monthlyimportvolts_hp)
##
## Coefficients:
## ar1 ar2 ma1 intercept monthlyimportvolts_hp
## 1.5308 -0.7197 -1.0000 0.0203 0.0255
## s.e. 0.0905 0.0903 0.0573 0.3224 0.0702
##
## sigma^2 estimated as 73.6: log likelihood = -216.01, aic = 444.01
log_lik_ratio <- as.numeric(
logLik(arima(futurests_hp,xreg=monthlyimportvolts_hp,order=c(2,0,1))) -
logLik(arima(futurests_hp,order=c(2,0,1)))
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 0.7164655
The Likelihood Ratio Test yields a p-value of 0.716 so the association between futures prices and monthly import volume, which is a proxy metric for demand for coffee in the US, is not statistically significant.
arima(futurests_hp,xreg=precipts_hp,order=c(2,0,1))
##
## Call:
## arima(x = futurests_hp, order = c(2, 0, 1), xreg = precipts_hp)
##
## Coefficients:
## ar1 ar2 ma1 intercept precipts_hp
## 1.5113 -0.6986 -1.0000 -0.0014 0.0043
## s.e. 0.0953 0.0950 0.0567 0.3216 0.0040
##
## sigma^2 estimated as 72.54: log likelihood = -215.5, aic = 443.01
log_lik_ratio <- as.numeric(
logLik(arima(futurests_hp,xreg=precipts_hp,order=c(2,0,1))) -
logLik(arima(futurests_hp,order=c(2,0,1)))
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 0.2862275
Performing a Likelihood Ratio Test on the above models yields a p-value of 0.286. While the above do not suggest a statistically significant association between futures prices and precipitation in Brazil, a proxy metric for supply of coffee to the US, the above does show that precipitation is much more predictive of futures prices than import volume.
arima(futurests_hp,xreg=eratets_hp,order=c(2,0,1))
##
## Call:
## arima(x = futurests_hp, order = c(2, 0, 1), xreg = eratets_hp)
##
## Coefficients:
## ar1 ar2 ma1 intercept eratets_hp
## 1.4693 -0.7043 -1.0000 0.0310 -78.0678
## s.e. 0.0936 0.0891 0.0608 0.2494 32.7428
##
## sigma^2 estimated as 66.58: log likelihood = -213.06, aic = 438.13
log_lik_ratio <- as.numeric(
logLik(arima(futurests_hp,xreg=eratets_hp,order=c(2,0,1))) -
logLik(arima(futurests_hp,order=c(2,0,1)))
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 0.01415695
The Likelihood Ratio Test yields a p-value of 0.014 which indicates that USD/BRL exchange rates have a highly statistically significant association with futures prices.
Now we want to see whether combining our knowledge of detrended monthly US coffee import volume, precipitation in Brazil, and exchange rates will help us predict futures prices better than the models above.
covariates <- cbind(monthlyimportvolts_hp,precipts_hp,eratets_hp)
arima(futurests_hp,xreg=covariates,order=c(2,0,1))
##
## Call:
## arima(x = futurests_hp, order = c(2, 0, 1), xreg = covariates)
##
## Coefficients:
## ar1 ar2 ma1 intercept monthlyimportvolts_hp
## 1.4559 -0.6894 -1.0000 0.0202 0.0075
## s.e. 0.0978 0.0937 0.0602 0.2498 0.0689
## precipts_hp eratets_hp
## 0.0026 -73.5324
## s.e. 0.0040 33.0528
##
## sigma^2 estimated as 66.21: log likelihood = -212.85, aic = 441.7
log_lik_ratio <- as.numeric(
logLik(arima(futurests_hp,xreg=covariates,order=c(2,0,1))) -
logLik(arima(futurests_hp,order=c(2,0,1)))
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 0.01112603
The Likelihood Ratio Test yields a p-value of 0.011 which indicates that using all three factors will yield a model with even better predictive ability of futures prices.
We now want to check if there is a systematic trend in precipitation.
aic_table <- function(data,P,Q,xreg=NULL){
table <- matrix(NA,(P+1),(Q+1))
for(p in 0:P) {
for(q in 0:Q) {
table[p+1,q+1] <- arima(data,order=c(p,0,q),xreg=xreg)$aic
}
}
dimnames(table) <- list(paste("<b> AR",0:P, "</b>", sep=""),paste("MA",0:Q,sep=""))
table
}
aic_table(precipts,4,5,xreg=NULL)
## MA0 MA1 MA2 MA3 MA4 MA5
## <b> AR0</b> 943.7698 913.0685 898.6696 893.2338 891.3840 885.9707
## <b> AR1</b> 894.9997 895.2154 891.1400 891.1506 888.5265 886.4222
## <b> AR2</b> 893.1697 874.8741 853.5505 855.0038 884.7201 858.7203
## <b> AR3</b> 880.7106 867.0372 854.9427 857.3912 886.6988 860.6995
## <b> AR4</b> 872.6592 866.5707 856.8147 858.6678 860.9455 862.2198
The above table of AIC values tells us that the ARMA(2,2).
Below we fit the null and the alternative models and observe that autocorrelation function plots of the residuals of both models show that the redisuals are uncorrelated.
fit0 <- arima(precipts,xreg=NULL,order=c(2,0,2))
acf(resid(fit0))
fit1 <- arima(precipts,xreg=seq(from=2010,to=(2014+11/12),by=1/12),order=c(2,0,2))
acf(resid(fit1))
And then we perform a Likelihood Ratio Test
log_lik_ratio <- as.numeric(
logLik(fit1) - logLik(fit0)
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 0.442264
The test yields a p-value of 0.44 and so we do not accept out null hypothesis that there is a systematic increasing or decreasing trend in precipitation values. The period of droughts and frosts Brazil is experiencing in the recent years appears to be just part of the cyclical nature of climate.
We now want to check if there is a systematic trend in futures prices.
aic_table <- function(data,P,Q,xreg=NULL){
table <- matrix(NA,(P+1),(Q+1))
for(p in 0:P) {
for(q in 0:Q) {
table[p+1,q+1] <- arima(data,order=c(p,0,q),xreg=xreg)$aic
}
}
dimnames(table) <- list(paste("<b> AR",0:P, "</b>", sep=""),paste("MA",0:Q,sep=""))
table
}
aic_table(futurests,4,5,xreg=NULL)
## MA0 MA1 MA2 MA3 MA4 MA5
## <b> AR0</b> 635.5478 573.4213 537.2404 508.9052 497.2680 493.5545
## <b> AR1</b> 485.3688 480.3016 477.0233 478.0276 479.4272 481.1988
## <b> AR2</b> 477.1119 476.2021 478.4655 479.5319 481.3525 478.8547
## <b> AR3</b> 477.5905 479.5150 481.5235 481.1550 475.8408 477.3331
## <b> AR4</b> 479.4747 481.4744 480.7033 476.0772 479.9381 477.0906
The above table of AIC values tells us that the ARMA(3,4) would be the best small model.
aic_table(futurests,4,5,xreg=seq(from=2010,to=(2014+11/12),by=1/12))
## MA0 MA1 MA2 MA3 MA4 MA5
## <b> AR0</b> 632.4208 571.6264 536.0914 508.5752 497.6240 494.1989
## <b> AR1</b> 487.2865 482.2932 478.9718 479.9107 481.3797 483.1336
## <b> AR2</b> 479.0657 472.3514 480.3521 481.4484 483.3021 480.8522
## <b> AR3</b> 479.3268 481.2773 481.9943 483.1185 478.1400 479.3188
## <b> AR4</b> 481.2644 483.2280 482.5479 484.4435 478.0594 480.0867
The above table of AIC values tells us that the ARMA(2,1) would be the best small model.
Below we fit the null and the alternative models.
fit0 <- arima(futurests,xreg=NULL,order=c(3,0,4))
acf(resid(fit0))
fit1 <- arima(futurests,xreg=seq(from=2010,to=(2014+11/12),by=1/12),order=c(2,0,1))
acf(resid(fit1))
The first autocorrelation function plot shows that all vertical lines lie within the two horizontal dashed lines. The residuals of this model are uncorrelated, indicating a good fit. On the other hand, in the second autocorrelation function plot one out of the 14 vertical lines fall narrowly outside the horizontal dashed lines (ie. 7% of the lags) which is fine.
And then we perform a Likelihood Ratio Test
log_lik_ratio <- as.numeric(
logLik(fit1) - logLik(fit0)
)
LRT_pval <- 1-pchisq(2*log_lik_ratio,df=1)
LRT_pval
## [1] 1
The likelihood ratio test shows a p-value of 0.008 giving us strong evidence to reject the null hypothesis that there is no systematic trend in coffee futures prices.
Indeed we see a strong systematic upward trend in coffee futures prices at the New York “C” Contract Market. While demand for coffee in the US, supply side constraints (ie. precipitation in Brazil), and USD/BRL exchange rates all contribute to this upward trend, it is the USD/BRL eachange rate that is the primary driver for the increase in coffee futures prices.
As the USD/BRL exchange rate increases, this makes importing coffee from Brazil more expensive. This is reflected in the increasing coffee futures prices.
As USD/BRL rates continue to increase, we can expect to see coffee futures prices rise along with it.
We will use the following function to obtain the AIC values for the various models under consideration below:
aic_table <- function(data,P,Q,xreg=NULL){
table <- matrix(NA,(P+1),(Q+1))
for(p in 0:P) {
for(q in 0:Q) {
table[p+1,q+1] <- arima(data,order=c(p,0,q),xreg=xreg)$aic
}
}
dimnames(table) <- list(paste("<b> AR",0:P, "</b>", sep=""),paste("MA",0:Q,sep=""))
table
}
futures_aic_table <- aic_table(futurests_hp,4,5,xreg=monthlyimportvolts_hp)
require(knitr)
kable(futures_aic_table,digits=2)
MA0 | MA1 | MA2 | MA3 | MA4 | MA5 | |
---|---|---|---|---|---|---|
AR0 | 480.33 | 456.21 | 452.13 | 449.10 | 450.57 | 445.43 |
AR1 | 454.65 | 452.62 | 452.25 | 450.89 | 444.47 | 445.68 |
AR2 | 449.62 | 444.01 | 448.94 | 450.73 | 445.54 | 446.09 |
AR3 | 448.70 | 450.70 | 450.80 | 452.51 | 447.52 | 442.06 |
AR4 | 450.70 | 452.42 | 451.47 | 453.37 | 449.38 | 443.01 |
The table above suggests that the model with ARMA(2,1) errors is the best small model.
We should check the residuals for the fitted model, and look at their sample autocorrelation.
r <- resid(arima(futures,xreg=monthlyimportvolts_hp,order=c(2,0,1)))
plot(r)
acf(r)
There is some evidence that the residuals are increasing in magnitude. However, it does not appear to be too extreme. We also see in the ACF plot that none of the nonzero lags fall outside the two horizontal dashed lines and thus the residuals appear to follow white noise.
futures_aic_table <- aic_table(futurests_hp,4,5,xreg=precipts_hp)
require(knitr)
kable(futures_aic_table,digits=2)
MA0 | MA1 | MA2 | MA3 | MA4 | MA5 | |
---|---|---|---|---|---|---|
AR0 | 475.09 | 453.22 | 450.58 | 448.40 | 449.32 | 445.23 |
AR1 | 452.55 | 450.99 | 451.19 | 450.00 | 443.55 | 445.13 |
AR2 | 448.38 | 443.01 | 448.10 | 449.73 | 444.99 | 445.47 |
AR3 | 447.66 | 449.65 | 449.78 | 451.55 | 446.94 | 442.05 |
AR4 | 449.64 | 451.08 | 450.34 | 452.29 | 448.29 | 443.01 |
The table abov | e suggest | s that th | e model w | ith ARMA( | 2,1) erro | rs is the best small model. |
We should check the residuals for the fitted model, and look at their sample autocorrelation.
r <- resid(arima(futures,xreg=precipts_hp,order=c(2,0,1)))
plot(r)
acf(r)
The residuals appear to be realizations of a random noise process. We also see in the ACF plot that none of the nonzero lags fall outside the two horizontal dashed lines and thus the residuals appear to follow white noise.
futures_aic_table <- aic_table(futurests_hp,4,5,xreg=eratets_hp)
require(knitr)
kable(futures_aic_table,digits=2)
MA0 | MA1 | MA2 | MA3 | MA4 | MA5 | |
---|---|---|---|---|---|---|
AR0 | 474.55 | 451.78 | 446.50 | 446.30 | 447.46 | 441.28 |
AR1 | 450.93 | 448.54 | 447.63 | 448.11 | 440.85 | 441.74 |
AR2 | 444.14 | 438.13 | 442.34 | 443.55 | 440.89 | 442.58 |
AR3 | 440.71 | 442.53 | 444.02 | 445.93 | 442.39 | 440.22 |
AR4 | 442.49 | 444.47 | 443.74 | 442.63 | 443.54 | 440.86 |
The table abov | e suggest | s that th | e model w | ith ARMA( | 2,1) erro | rs is the best small model. |
We should check the residuals for the fitted model, and look at their sample autocorrelation.
r <- resid(arima(futures,xreg=eratets_hp,order=c(2,0,1)))
plot(r)
acf(r)
Again we see that the residuals appear to be realizations of a random noise process. We also see in the ACF plot that none of the nonzero lags fall outside the two horizontal dashed lines and thus the residuals appear to follow white noise.