Measles in Contemporary Society / Motivation

Measles is a highly contagious disease with many potential complications. While the fatality rate is generally not very high in developed countries (~0.3%), the serious complications that can occur and the high rate of infection make prevention a priority for any government or public health organization.

Given that there does exist a rather effective measles vaccine with about a 95% rate of immunity for children of 12 months, it may seem like a waste of resources to focus on modelling the disease’s epidemiology. This is not the case. A simple google search reveals that outbreaks in developed countries where vaccines are readily available are not uncommon. Herd immunity, the percentage of a population that must be immune to a disease to protect those who cannot be vaccinated and against outbreaks, of measles is established at around 95%. Given the numerous controversies surrounding vaccination, it’s not unreasonable to imagine that the proportion of the population immune to the disease could dip below that. Coupling this fact with possible connections between socioeconomic status and vaccination, it feels as if establishing herd immunity permanently could be a long way off.

With this in mind, an analysis of the epidemiology of measles should have many benefits. If an outbreak is identified, resources could be appropriated with care. This would result in money, and likely even lives, saved. However, analyses of the spread of measles in post-vaccination society is not widely available, hence a motivation for this project.

I will be analyzing an outbreak of measles in late 80s and early 90s from California. The data was pulled from Project Tycho and is freely available. Much of this analysis will be based on an analysis by Aaron King and He et al.


Data Exploration

To begin, we will look at the data available from Project Tycho. We will view the cases from 1989-1992. Cases were recorded weekly, with some data missing.

plot of chunk unnamed-chunk-1

An initial look reveals more than just the aforementioned missing data, we see that cases recorded during the peak of the epidemic vary greatly, from 100s one week to close to 0 the next. This begs the question: “Could measles really spread in such a varying fashion?” Intuition, and advice from those more experienced in epidemiology than myself, leads to one answer: No. In order to better model the spread of the disease, the wild variations were removed. Specifically, any week with \(\leq 1\) case recorded sandwiched in between two weeks with \(>100\) cases recorded has been removed.

With the specified data removed, we see a somewhat less varying plot of the data below with the removed data in black and the data being used for our analysis in red:

plot of chunk unnamed-chunk-2

Covariates

Prior analysis by Aaron King used two main covariates for modeling measles: birth rates and population sizes. The population data is available from the United States Census website and the birth rates from the California Department of Public Health. Similar to the King analysis, the covarites were obtained yearly, and smoothed over each year to get a weekly value. Below we can see the data as points and the smoothed values as the red line. Birthrate is in number of Births per year and population is for children under 10 years old.

The selection of the population of children under 10 years should be addressed. There were two main motivations here. Firstly, a post-epidemic analysis (Dales et al 1993) shows that over 60% of the cases recorded during the peak epidemic years, 1989 and 1990, were in children under 10 years old. This is markedly higher than earlier years in the decade, where children of that age accounted for ~40% of all cases. Given this, it is reasonable to focus on this group as the main demographic of susceptibles in our analysis. Secondly, using the entirety of California’s population leads to a very small proportion of susceptibles, which makes it more difficult to identify the effect changes in our covariates and parameters have on our model. With this in mind, finding a way to lessen the population, given sufficient reason, is a logical next step.

plot of chunk unnamed-chunk-3

Here we see both smoothed lines do not perfectly capture the yearly estimates, but they do sufficiently well to able to measure the effect that changes may have on our model. It also should be noted that in our analysis, we will be working with lagged birthrate; both at the 6 month and 1 year lag. Reasons for this selection will be specified in the next section.


Model Creation

To model the disease, a slight variation on the standard SEIR compartment model was used.

\(b = \text{births}\)
\(S = \text{susceptibles}\)
\(E = \text{exposed, non-infectious, incubating}\)
\(I = \text{Infectious}\)
\(R = \text{Recovered}\)
(King, A. 2015)

Our model has the earmarks of a susceptible-exposed-infectious-recovered(SEIR) model. Births migrate to our susceptible pool at each time step at a specified rate. The effect of measles on the death rate is negligible, so our death rate at each department is kept equal. The main deviation here from the standard SEIR model is the arrow from \(S \rightarrow R\). This is to model vaccinations, where someone in compartment \(S\) may be vaccinated and go straight to compartment \(R\), where they are modeled as immune.

Analysis was conducted using the POMP framework. This is useful in that it allows for modeling of non-linear processes, which is a trait of epidemiological systems (King, A. 2015). It also will allow much more flexibility in the creation of a model, which is helpful when modeling a complex process such as the spread of a disease, possibly including numerous parameters.

Assuming the reader’s familiarity with non-linear POMP models, we will begin by introducing the process model below. This model borrowed heavily from Aaron King’s implementation of the model put forth by He et al. 2010. Any parameters not specified in this study can be found at either of the two above resources.

rproc <- Csnippet("
                  double beta, br, seas, foi, dw, births, vac;
                  double rate[6], trans[6];
                  
                  // term-time seasonality
                  t = (t-floor(t))*365.25;
                  if ((t>=7&&t<=100) || (t>=115&&t<=199) || (t>=252&&t<=300) || (t>=308&&t<=356))
                  seas = 1.0+amplitude*0.2411/0.7589;
                  else
                  seas = 1.0-amplitude;
                  
                  // transmission rate
                  beta = R0*(gamma+mu)*seas;
                  // expected force of infection
                  foi = beta*pow(I+iota,alpha)/pop;
                  // white noise (extrademographic stochasticity)
                  dw = rgammawn(sigmaSE,dt);
                  
                  rate[0] = foi*dw/dt;  // stochastic force of infection
                  rate[1] = mu;             // natural S death
                  rate[2] = sigma;        // rate of ending of latent stage
                  rate[3] = mu;             // natural E death
                  rate[4] = gamma;        // recovery
                  rate[5] = mu;             // natural I death
                  
                  // Poisson births
                  births = rpois(birthrate_lag05*dt);

                  // Vaccination
                  vac = nearbyint(vr*birthrate_lag1*.95*dt);

                  // transitions between classes
                  reulermultinom(2,S,&rate[0],dt,&trans[0]);
                  reulermultinom(2,E,&rate[2],dt,&trans[2]);
                  reulermultinom(2,I,&rate[4],dt,&trans[4]);

                  if (vac > S - trans[0] - trans[1]){
                    vac = S - tran[0] - tran[1];
                  }

                  S += births   - trans[0] - trans[1] - vac;
                  printf(\" births are: %f,  S is: %f \",births, S);
                  E += trans[0] - trans[2] - trans[3];
                  I += trans[2] - trans[4] - trans[5];
                  R = pop - S - E - I + vac;
                  W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
                  C += trans[4];           // true incidence
                  ")

initlz <- Csnippet("
  double m = pop/(S_0+E_0+I_0+R_0);
                   S = nearbyint(m*S_0);
                   E = nearbyint(m*E_0);
                   I = nearbyint(m*I_0);
                   R = nearbyint(m*R_0);
                   W = 0;
                   C = 0;
                   ")

##  Measurement Model  ##
dmeas <- Csnippet("
  double m = rho*C;
                  double v = m*(1.0-rho+psi*psi*m);
                  double tol = 1.0e-18;
                  if (cases > 0.0) {
                  lik = pnorm(cases+0.5,m,sqrt(v)+tol,1,0)-pnorm(cases-0.5,m,sqrt(v)+tol,1,0)+tol;
                  } else {
                  lik = pnorm(cases+0.5,m,sqrt(v)+tol,1,0)+tol;
                  }
                  ")

rmeas <- Csnippet("
  double m = rho*C;
                  double v = m*(1.0-rho+psi*psi*m);
                  double tol = 1.0e-18;
                  cases = rnorm(m,sqrt(v)+tol);
                  if (cases > 0.0) {
                  cases = nearbyint(cases);
                  } else {
                  cases = 0.0;
                  }
                  ")

To briefly review the process model, we’ll start at the beginning. Major departures from the original model will be made clear and decisions to keep what were there will be explained.


Model Fitting

To investigate a decent fit for our model, we perform two analyses. First, we carry out a local search of model parameters based on prior estimates given in He et al. Our second analysis is based on a global search where our initial parameters are varying and compare them to our local search. We will specify the parameters in question below before moving on to our analysis.

Parameters

Below are the parameters we want to estimate with brief definitions:

  • \(R0\): “the expected number of secondary infections engendered by an infective introduced into a fully susceptible population” (He et al. 2010). This term helps parameterize \(\beta\), the transmission rate, in our rprocess model. A higher \(R0\) would signify a higher transmission rate.
  • \(\gamma\): Our rate of recovery
  • \(\alpha\): A mixing parameter; the closer to one, the more homogeneous mixing is in the population
  • \(\iota\): “the mean number of infectives visiting the population at any given time” (He et al. 2010)
  • \(\rho\): Our reporting rate
  • \(\psi\): Overdispersion parameter in the reporting process, found in the measurement model above.
  • \(\sigma\): Our rate of ending the latent stage
  • \(\sigma_{\text{SE}}\): Extrademographic stochasticity affecting the force of infection
  • \(vr\): Our vaccination rate
  • \(amplitude\): Our seasonality parameter

Remarks and Conclusions

Firstly, it should be noted that because of computational issues, only 20 mifs were ran for both the local and global searches with 200 mif iterations. All conclusions made ought to be taken in this context. Moreover, this could be the reason behind our finding a greater likelihood in our local search than our global search.

Initially we observe a few interesting results from our local and global searches. As mentioned, the local search reached a slightly higher likelihood than our global search. This is unexpected, as a search initializing at many different parameter values would seem as if it should reach a higher maximum. We also see a smaller variance in the likelihood of our global search; this further contradicts general intuition, as one would expect varying initial values would result in a larger range of likelihoods estimates from our mifs.

Looking at our pairs plot for both local and global searches, we see what looks like a trade off between \(\sigma_{\text{SE}}\) and \(\alpha\). We can look more closely at this result below, from our global results:

plot of chunk unnamed-chunk-11

We \(\sigma_{\text{SE}}\) and \(\alpha\) seem to trade off while maintaining likelihood.

Moving on, we can look at our diagnostic plots for our mif2 maximization procedure.

plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12

There are a number of specific parameters that stand out. With the exception of one mif, we see that \(\rho\) converges to a very low value, suggesting a low reporting rate. We see \(\sigma_{\text{SE}}\) jumping to ~0.3 for most mifs. This is notably higher than the parameter estimates obtained by He et al. This may not be surprising, as one could imagine in a diverse and populous state such as California, extra demographic stochasticity may be quite large. We also see estimates for \(\alpha\) generally sink quite low, suggesting heterogeneous mixing in our population. Analysis by Dales et al. seems to corroborate this finding, suggesting that cases of measles may have been concentrated in minority communities and, as previously noted, young children.

Further Investigation

Depending on the interest of the reader or future researcher, any number of analysis may prove helpful. Generally speaking, running more mifs at more iterations would be helpful. A number of computational issues came about during this project which lead to this shortcoming; future analysis improving upon this alone would be useful as it could lead to different results.

Updated Compartment Models

Firstly, given additional computational power and epidemiological knowledge, a different compartment model could be used. It may be useful to separate infants after 6 months into a completely different susceptible pool. Given their larger susceptibility to the disease relative to older susceptibles, a separate compartment may be able to capture some disease dynamics that this compartment model was unable to investigate. The proposed model may look as follows:

Here, \(\text{S1}\) would model young children who are at a much higher level of susceptibility than older children and adults. These young children could move to either an exposed compartment, \(\text{E}\), or, upon successful vaccination, to a recovered compartment, \(\text{R}\). This type of model would need the inclusion of more of the population (i.e., not just children under ten years old), but may be worth investigating in the future.

Likelihood Profiles

Depending on the researchers parameter of interest, creating a likelihood profile may be quite useful. Developing some certainty in the estimate of parameters like \(R0\) would help quantify the force of a disease. It’s quite possible that a post-vaccination measles may have different infective properties than the disease prior to the widespread use of vaccinations. If the force of infection, reflected in a change in \(R0\), were to change, knowledge of this and the degree to which it has changed would help better prepare any public health institution for a future outbreak. Similarly, knowledge of our vaccination rate, \(vr\), would help determine whether or not additional resources may need to be allocated to getting the population better vaccinated. If a likelihood profile for vaccination rates revealed that the rate was significantly lower than desired - that would flag a need for these additional resources.

Additional Parameter Estimation

Many assumptions were made in this report regarding fixed model specifications/rates that could be estimated in a future model. A couple proposals are below:

  • Vaccination Effectiveness: In this study, vaccination effectiveness was held at 95%. It is possible to imagine a virus mutation or a batch of faulty vaccinations that would result in a low vaccination effectiveness. If this were the case, modelling vaccination effectiveness during an outbreak would be useful. Identification of a low effectiveness rate and, consequently, its cause could help prevent future outbreaks.
  • Susceptible Pool Entrance: A 6 month lag into the susceptible pool after birth was based on prior research on maternal immunity (Drucker 2010). A number of factors may contribute to this lag being different: not all mothers breastfeed, some mothers may not have measles antibodies, and maternal immunity length may not be uniform for all people. With this in mind, varying this lag to find a better fitting model may be reasonable.

References

King, A. (2015, June 7). Case study: Measles in large and small towns. Retrieved from http://kingaa.github.io/sbied/measles/measles.html

He, D., Ionides, E. L., & King, A. A. (2010). Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. Journal of the Royal Society Interface, 7(43), 271-283. http://doi.org/10.1098/rsif.2009.0151

Dales, L. G., Kizer, K. W., Rutherford, G. W., Pertowski, C. A., Waterman, S. H., & Woodford, G. (1993). Measles epidemic from failure to immunize. Western Journal of Medicine, 159(4), 455-464.

Drucker, R. (2010). How Long Does Passive Immunity to Measles Last in Infants. NEJM Journal Watch. Retrieved from http://www.jwatch.org/pa201006020000002/2010/06/02/how-long-does-passive-immunity-measles-last

Population Estimates. (n.d.). Retrieved April 26, 2016, from https://www.census.gov/popest/data/historical/index.html

Statistics. (n.d.). Retrieved April 26, 2016, from http://www.cdph.ca.gov/data/statistics/Pages/default.aspx

Ahmad, S., Zahid, S., & Jan, A. (2013). The impact of parental education and socioeconomic status on routine childhood vaccination: An obsevational study. Journal Of Postgraduate Medical Institute (Peshawar - Pakistan), 27(3). Retrieved from http://www.jpmi.org.pk/index.php/jpmi/article/view/1391/1367

Measles vaccine. (n.d.). Retrieved April 26, 2016, from https://en.wikipedia.org/wiki/Measles_vaccine

Measles. (n.d.). Retrieved April 26, 2016, from https://en.wikipedia.org/wiki/Measles