1. Introduction

1.1 Dengue Epidemiology

Dengue is a viral infection commonly occur in tropical regions. Symptoms include high fever, headache, vomiting, muscle and joint pains, and a characteristic skin rash.

The viruses are transmitted through the bite of infected mosquitoes (Aedes aegypti and Aedes albopictus) that feed both indoors and outdoors during the daytime. These mosquitoes usually thrive in areas with standing water.

The virus has multiple types. Infection with one type usually gives lifelong immunity to that type, but only short-term immunity to the others. Although Dengue vaccine has been developed and launched in many countries, its usage is still limited. The main approach of disease prevention is limiting exposure to mosquito bites.

1.2 Dengue in Malaysia

Dengue has become a global problem since the Second World War and is common in more than 110 countries. As one of tropical countries, Malaysia has suffered multiple Dengue outbreak in recent years. In 2016, Malaysia reported more than 100,000 dengue fever cases, and the outbreak was ongoing in 2017.

1.3 Project Objective

This project aims to analyse historic Dengue incidence data to understand the epidemiologic dynamic behind the spread of this disease. The foundation of modeling is SEIR pomp model. Since the disease transmission requires the participation of mosquito, mosquito dynamic and its interaction with human Dengue spreading is also studied in this project. This improved model would help better explain Dengue incidence pattern especially when multiple outbreaks are overlapped.

1.4 Dataset

This project leveraged two data sources:

  • Weekly Dengue incidence data from June,2012 to May,2015. The original data is from Malaysia Open Data Portal. Github user @ shikin2015 cleaned and combined the data, the detail can be found here.

  • Precipitation data is from World Bank Climate Change Knowledge Portal. Since only country level monthly data is available, I used Malaysia national monthly rainfall data as a surrogate of district level rainfall data in this analysis. Considering the climate cycle is similar across the country, this assumption is acceptable.

Further data cleaning and combining performed by me can be found in R script submitted along with this report.

2. Data Exploration

Each district in Malaysia has different outbreak progress. For the purpose of this project, I narrowed down the analysis to one district, Petaling. This district was selected because its data is complete, and the dataset managed to captured two full cycle of Dengue outbreak from 2013 to 2015.

##      count            pr        
##  Min.   :  28   Min.   : 94.73  
##  1st Qu.: 742   1st Qu.:179.47  
##  Median :1032   Median :249.37  
##  Mean   :1274   Mean   :250.20  
##  3rd Qu.:1811   3rd Qu.:317.32  
##  Max.   :4154   Max.   :462.98

## <ScaleContinuousPosition>
##  Range:  
##  Limits:    0 --    1
## <ScaleContinuousPosition>
##  Range:  
##  Limits:    0 --    1

Plotting both weekly Dengue incidence and precipitation, we can observe some correlation between both trends. The peak of precipitation precedes the peak of Dengue incidence for ~5 weeks. The pattern is reasonable, since the reproduction of mosquito has been observed to be associated with precipitaiton. As the vector of this infectious disease, larger mosquito population indicates higher possibility of disease spreading.

The incidence cases is generally continous across time axis, however, there is a significant drop during the peak period around week 85, possibly caused by missing data. This outlier may influence our analysis.

3. Basic SEIR modeling for single outbreak

The data captured two outbreak in two consecutive years. We can start our analysis with SEIR model on single outbreak.

Dengue symptoms usually appears 4-7 days after mosquito bites, and then symptoms present for 3-10 days. The patients is only infective (through mosquitoes) during symptom phase. Therefore, a 4-compartment SEIR model is a reasonable choice to start.

\(S = \text{susceptible}\) \(E = \text{exposed and latent}\) \(I = \text{infected and infectious}\) \(R = \text{recovered and/or removed}\)

The moving count from compartment to compartment can be simply modeled as binomial distribution over short time.

The number moving from S to E is: \[\Delta{N_{SE}} \sim Binomial\ (S,1-e^{-\lambda I\Delta{t}})\] The number moving from E to I is: \[\Delta{N_{EI}} \sim Binomial\ (E,1-e^{-\beta\Delta{t}})\]

and the number moving from E to I is: \[\Delta{N_{IR}} \sim Binomial\ (I,1-e^{-\gamma\Delta{t}})\]

After setting up the pomp model, I tried out different sets of parameters and conducted the simulation. It seems that the simulation has quite large variance and the model fitness is not very well: Incidence peak simulated by this model is sharper and narrower than the actual trend. SEIR model might be more suitable for infectious diseases transmitted directly between human. For Dengue fever, the dynamic of mosquitoes may play an important role, rendering the model more complicated.

Log likelihood:

##                        se 
## -1805.202458     3.595296

4. Inspection of Mosquito Dynamic

4.1 The Life Cycle of Mosquito

Mosquito is the vector of Dengue virus transmission. Mosquitoes lay eggs on still water, such as puddles after rain. The eggs take about one week to develop into adult mosquito, which can feed on human. The life span of an adult mosquito is about 3-4 weeks.

4.2 Correlation between Precipitation and Dengue Incidence

As observed earlier, an increase in precipitation does lead the increase of Dengue incidence, which is aligned with theory. We can quantitatively examine this correlation through cross-correlation function (ccf) in R.

A sinusoidal pattern indicates a non-linear relation between precipitation and Dengue incidence. A significant negative lag with positive ACF indicates a strong positive correlation that precipitation leads the incidence of Dengue. However, the association may not be causal, it is possible that a confounder impact both trends.

To better identify the pattern of CCF, we can use prewhiten function to remove the potential common trend. The sinusoidal pattern almost disappear. Precipitation alone may not be adequate to explain the trend of Dengue incidence.

5. An Improved Model Leveraging Precipitation Data

5.1 Model Setup

Since we don’t have direct data on other climatic factors or population size of mosquitoes, we can try to incorporate mosquito dynamic through precipitation data. This incorporation may help us better modeling the Dengue incidence trend across multiple outbreaks.

To simplify the model, we assume the birth of new mosquito eggs is linearly associate with precipitation, ignoring the scale. Number of mosquitos that will interact with human community follows a poisson distribution with mean \(R\), where R is the precipitation(mm) at that time point.

\[M_{0,n} \tilde{} Poisson(R)\]

It takes about one week for mosquito eggs to grow into adult mosquito which are capable of feeding on human. After mosquito acquiring Dengue virus from human patients, it takes about one week for the virus to reproduce in mosquito’s blood stream before it becomes infective to other human. Mosquito carried Dengue virus can spread the disease to other human throughout its entire adult life span, which is about four weeks.

Therefore, the compartmental model of mosquito can be constructed as:

Mosquito Compartment Model

Mosquito Compartment Model

  • Flowing from one compartment to the next, a fixed proportion of mosquito dies. The proportion is set as \(m_{death} = 0.1\).
Human Compartment Model

Human Compartment Model

For human side of the dynamic model, we keep the SEIR model structure, with following improvements:

  • For all susceptible individuals, the chance of remaining in the susceptible population is:

\[p_n = exp\{-\lambda_n\}\]

\[\Delta N_{SE} = (1-p_n) \times S_n\]

  • Since almost all dengue infected patients will show obvious sympotoms one week after infection bites, we can move all latency patients to infected pool at every step:

\[\Delta N_{EI} = E_n\]

  • Every week, a porportion of Dengue infected people die. In Malaysia, the case fatality of Dengue is \(\delta\) = 2.72%

\[N_{death} = I_n \times \delta\]

  • For survived Dengue patients, the number of patients moving from infected to immuned follows poisson distribution with an upper limit \(I_n\):

\[\Delta N_{II_m} \sim Poisson(k_{i\_im} \times(1-\delta) \times I_n)\]

\[\Delta N_{II_m} \leq I_n \]

Parameter \(k_{i\_im}\) is unknown.

  • Dengue has many types, although infected patients acquire lifelong immunity to the specific type, they only acquire short-term immunity to other types, i.e, it is possible that they will get Dengue again. So every week, a fraction of immuned pool moves back to susceptible pool to reflect this dynamic.

\[\Delta N_{ImS} \sim Poisson(k_{im\_s} \times Im_n) \]

\[\Delta N_{ImS} \leq Im_n\] Parameter \(k_{im\_s}\) is unknown.

  • Natural birth and death of studied population is ignored. Total population in Petaling is set as 1.66 Million.

In summary, the change of states at each time step is:

  • \(\Delta S = \Delta N_{ImS} - \Delta N_{SE}\)

  • \(\Delta E = \Delta N_{SE} - \Delta N_{EI}\)

  • \(\Delta I = \Delta N_{EI} - \Delta N_{IIm} - \Delta N_{death}\)

  • \(\Delta Im = \Delta N_{IIm} - \Delta N_{ImS}\)

Virus Transmission Model

Virus Transmission Model

  • The infection force is modeled as related to both the absolute size of adult mosquito population among human community \(M_{2,n} + M_{3,n} + M_{4,n}\) and the relative size of infected Dengue patients \(I_n/P_n\):

\[\bar{\lambda_n}= \frac{(M_{2,n} + M_{3,n} + M_{4,n})\times I_n \times \beta}{P_n}+\psi\]

  • The force of infection has a stochastic perturbation, 1000 was added to adjust the scale of \(\lambda\). \(\sigma_{env}^2\), \(\sigma_{dem}^2\), \(\beta\), \(\phi\) are unknown parameters.

\[\lambda_n = \frac{\bar{\lambda_n} \times \epsilon_n}{1000}\]

\(\epsilon_n\) follows Gamma distribution with mean 1 and variance:

\[\sigma^2=\sigma_{env}^2+\sigma_{dem}^2/\bar{\lambda_n}\]

For initial state, the proportion of latency, infected, and immuned are set to be unknown parameters.

For observations, it is a discretized normal distribution truncated at zero, conditional on the state, with both environmental and Poisson-scale contributions to the variance.

\[Y_n= \max{\mathrm{round}(Z_n),0}, \quad Z_n\sim\mathrm{normal}\left(\rho I_n, \big(\tau I_n\big)^2 + \rho I_n\right).\]

\(\rho\) and \(\tau\) are unknown parameters.

5.4 MLE Simulation

Using the MLE we obtained from global search, we can simulate the dynamic process to inspect the pattern of each state.

As shown in the simulation plots, the incorporation of mosquito dynamic helped us better capture the trend of Dengue incidence across outbreaks. The simulated incidence peaks lagged behind precipitation as expected. The duration of outbreaks becomes longer, which is no longer a sharp peak as shown in simple SEIR mode. Besides, the movement of immuned people back to susceptible pool help us explain the overlap of two consecutive Dengue outbreaks.

5.5 Log Likelihood Convergence Diagnostic

The model has many unknown parameters and different combinations of values to achieve optimal likelihood. Therefore, the convergence of each single paramter is not guaranteed/expected. The bottom line is to achieve convergen on log likelihood. Plotting the top log likelihoods, we can see they all converge as expected.

6. Conclusion

7. Future Analysis

8. Reference

[1] “Dengue and severe dengue Fact sheet N°117”. WHO. May 2015
[2] Ranjit S, Kissoon N (January 2011). “Dengue hemorrhagic fever and shock syndromes”. Pediatric Critical Care Medicine. 12 (1): 90-100
[3] CDC Dengue Epidemiology: https://www.cdc.gov/dengue/epidemiology/index.html
[4] Nature Scitable Dengue Introduction: https://www.nature.com/scitable/topicpage/dengue-transmission-22399758
[5] Stats 531 Winter 2016 Final Projects: http://ionides.github.io/531w16/final_project/
[6] Stats 531 Winter 2018 Lecture Notes: https://ionides.github.io/531w18/