1 Introduction

Cholera was introduced to Haiti in 2010 the first time of its history likely by UN peacekeepers from Nepal. Started in central Haiti, the cholera outbreak in 2010 rapidly triggered the world’s largest epidemic of the seventh cholera pandemic (Chin et al. 2011; Eppinger et al. 2014). Five years later, people now in Haiti are still suffering from cholera, and more than 9000 people have therefore died since the outbreak in 2010 (Kirpich et al. 2015). Even though many studies gave profound insights into the mechanism initiating the epidemic in 2010 (Rinaldo et al. 2012; Gaudart et al. 2013; Eisenberg et al. 2013), little attention has been paid to the recent epidemics. However, since the cholera in Haiti may have become endemic, the transmission dynamics can be very different from the initial outbreak. Moreover, followed by withdrawal of fund and aid, the situation in Haiti is getting even more difficult. Therefore, in order to develop a timely and sustainable countermeasures agains cholera in Haiti, better understanding for the underlying mechanisms of recent cholera outbreaks is crucial.

There are plenty of potential factors will affect cholera transmission dynamics, such as rainfall, temperature, waning of population immunity, etc (Koelle 2009). I start with the most widely studied one–the relationship between rainfall and cholera incidence. Lots of related studies have been done, and people found out both negative and positive effect of rainfall on cholera incidence through different mechanisms. For example, flooding can lead to contamination of drinking water supplies; whereas, lack of water will increase the chance for people to use contaminated water source. Either way can increase the risk of cholera transmission. In this project, I tried some preliminary time series analysis for cholera incidence and rainfall data from January 2012 to December 2015 in Artibonite, the biggest department in Haiti (Fig 1). I hope to get some basic ideas about the association between rainfall and cholera incidence as well as whether rainfall would be a potential driver for recent cholera epidemic in Haiti.


Fig 1: Map of Artibonite Department, Haiti

Fig 1: Map of Artibonite Department, Haiti


2 Data

Case Data Cholera daily incidence by departments is available on Ministere de la Sante Publique et de la Population (MSPP) website. Report is published everyday in pdf formate online. From 2012 to 2015, 120 dates are missing in the database, and there are data points that appear to be missing at random, with some of the data missing throughout large segments in time, particularly in October 2013 and June 2014. These missing data may bias the outcomes given the non-systematic distribution of the size of the missing data in time. Data crawling and parsing are managed using python 2.7.10.

Rainfall Data Area averaged rainfall estimates from the National Aeronautics and Space Administration (NASA) Tropical Rainfall Measuring Mission (TRMM) are used in the project. Geographical coverage is selected using longitude [-73.120633, -72.226589] and latitude [18.838958, 19.805066] with resolutions of 0.25° x 0.25°.


3 Analysis


Fig 2: daily cholera incidence and rainfall time series

Fig 2: daily cholera incidence and rainfall time series

Fig 3: weekly cholera incidence and rainfall time series

Fig 3: weekly cholera incidence and rainfall time series


(Usually, weekly aggregated data is better for visualization and sesonality analysis, but clustered missing data makes it hard to do sensible aggregation. Therefore, the following analysis is done with daily data, unless specified.)


Fig 5: cross-correlation between rainfall and cholera incidence

Fig 5: cross-correlation between rainfall and cholera incidence


Fig 6: pre-whitening cross-correlation between rainfall and cholera incidence

Fig 6: pre-whitening cross-correlation between rainfall and cholera incidence



Fig 7: residual acf

Fig 7: residual acf


## 
## Call:
## arima(x = dep1_case[1:1406], order = c(4, 0, 2), optim.control = list(maxit = 200))
## 
## Coefficients:
##          ar1     ar2      ar3      ar4      ma1      ma2  intercept
##       0.5493  0.8896  -0.3008  -0.1456  -0.0414  -0.7755    23.1682
## s.e.  0.1416  0.1971   0.0748   0.0327   0.1423   0.1216     5.0019
## 
## sigma^2 estimated as 68.74:  log likelihood = -4550.98,  aic = 9115.96

where \(\{\epsilon_n\}\) is a Gaussian ARMA process. I used an ARMA(4,2) model selected by AIC.

## 
## Call:
## arima(x = dep1_case[1:1406], order = c(4, 0, 2), xreg = dep1_rain[1:1406], optim.control = list(maxit = 200))
## 
## Coefficients:
##          ar1     ar2      ar3      ar4      ma1      ma2  intercept
##       0.5508  0.8879  -0.2999  -0.1463  -0.0436  -0.7742    23.2637
## s.e.  0.1478  0.2061   0.0770   0.0327   0.1488   0.1274     5.0350
##          xreg
##       -0.0244
## s.e.   0.0382
## 
## sigma^2 estimated as 68.71:  log likelihood = -4550.78,  aic = 9117.55


## [1] 0.5235408



Fig 8: model diagnostic

Fig 8: model diagnostic



## 
## Call:
## arima(x = dep1_case[1:1406], order = c(4, 0, 2), xreg = dep1_rain[48:1500][1:1406], 
##     optim.control = list(maxit = 200))
## 
## Coefficients:
##          ar1     ar2      ar3      ar4      ma1      ma2  intercept
##       0.6031  0.8159  -0.2789  -0.1474  -0.0976  -0.7262    22.9428
## s.e.  0.1659  0.2314   0.0791   0.0318   0.1670   0.1440     4.9898
##         xreg
##       0.0826
## s.e.  0.0370
## 
## sigma^2 estimated as 68.47:  log likelihood = -4548.48,  aic = 9112.97



## [1] 0.02552198


Nevertheless, the diagnostic plots still show violation of model assumption (ie. residual should be randomly distributed). Acf plot suggests that the model is not sufficient to capture the association between rainfall and cholera incidence.


Fig 9: model diagnostic

Fig 9: model diagnostic


4 Conclusions


Fig 9: cholera incidence seasonality

Fig 9: cholera incidence seasonality

Fig 10: rainfal seasonality

Fig 10: rainfal seasonality

5 Packages

  1. ggmap (Boettiger 2015)
  2. RCurl (Boettiger 2015)
  3. xts (Boettiger 2015)
  4. TSA (Boettiger 2015)
  5. fields (Boettiger 2015)
  6. akima (Boettiger 2015)

References

Boettiger, C. (2015). Knitcitations: Citations for ’knitr’ markdown files. Retrieved from http://CRAN.R-project.org/package=knitcitations

Chan, K.-S. & Ripley, B. (2012). TSA: Time series analysis. Retrieved from http://CRAN.R-project.org/package=TSA

Chin, C.-S., Sorenson, J., Harris, J.B., Robins, W.P., Charles, R.C., Jean-Charles, R.R., Bullard, J., Webster, D.R., Kasarskis, A., Peluso, P., Paxinos, E.E., Yamaichi, Y., Calderwood, S.B., Mekalanos, J.J., Schadt, E.E. & Waldor, M.K. (2011). The origin of the haitian cholera outbreak strain. New England Journal of Medicine, 364, 33–42. Retrieved from http://dx.doi.org/10.1056/NEJMoa1012928

Eisenberg, M.C., Kujbida, G., Tuite, A.R., Fisman, D.N. & Tien, J.H. (2013). Examining rainfall and cholera dynamics in haiti using statistical and dynamic modeling approaches. Epidemics, 5, 197–207. Retrieved from http://dx.doi.org/10.1016/j.epidem.2013.09.004

Eppinger, M., Pearson, T., Koenig, S.S.K., Pearson, O., Hicks, N., Agrawal, S., Sanjar, F., Galens, K., Daugherty, S., Crabtree, J., Hendriksen, R.S., Price, L.B., Upadhyay, B.P., Shakya, G., Fraser, C.M., Ravel, J. & Keim, P.S. (2014). Genomic epidemiology of the haitian cholera outbreak: A single introduction followed by rapid, extensive, and continued spread characterized the onset of the epidemic. mBio, 5, e01721–14. Retrieved from http://dx.doi.org/10.1128/mbio.01721-14

Gasparrini, A. (2011). Distributed lag linear and non-linear models in R: The package dlnm. Journal of Statistical Software, 43, 1–20. Retrieved from http://www.jstatsoft.org/v43/i08/

Gaudart, J., Rebaudet, S., Barrais, R., Boncy, J., Faucher, B., Piarroux, M., Magloire, R., Thimothe, G. & Piarroux, R. (2013). Spatio-temporal dynamics of cholera during the first year of the epidemic in haiti (H. Carabin, Ed.). PLoS Negl Trop Dis, 7, e2145. Retrieved from http://dx.doi.org/10.1371/journal.pntd.0002145

Kirpich, A., Weppelmann, T.A., Yang, Y., Ali, A., Morris, J.G. & Longini, I.M. (2015). Cholera transmission in ouest department of haiti: Dynamic modeling and the future of the epidemic (C. Munoz-Zanzi, Ed.). PLoS Negl Trop Dis, 9, e0004153. Retrieved from http://dx.doi.org/10.1371/journal.pntd.0004153

Koelle, K. (2009). The impact of climate on the disease dynamics of cholera. Clinical Microbiology and Infection, 15, 29–31. Retrieved from http://dx.doi.org/10.1111/j.1469-0691.2008.02686.x

R Core Team. (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from http://www.R-project.org/

Rinaldo, A., Bertuzzo, E., Mari, L., Righetto, L., Blokesch, M., Gatto, M., Casagrandi, R., Murray, M., Vesenbeckh, S.M. & Rodriguez-Iturbe, I. (2012). Reassessment of the 2010-2011 haiti cholera outbreak and rainfall-driven multiseason projections. Proceedings of the National Academy of Sciences, 109, 6602–6607. Retrieved from http://dx.doi.org/10.1073/pnas.1203333109