Introduction

Measles is a highly contageous viral disease which, to this day, is a leading cause of death in children worldwide (Centers for Disease Control and Prevention 2015). Infection begins in the respiratory tract, and can lead to complications in up to 40% of the population (Moss and Griffin 2012). Often, these complications are respiratory, but more severe cases can involve the central nervous system (Moss and Griffin 2012). The introduction of a vaccine for measles in 1963 began a slow process of fighting the disease, with the virus being declared eradicated in the Americas in 2016 (Fox 2016).

However, “eradication” does not mean that no new cases of measles have been observed; rather, it means that new cases can be traced back to transmission from other locations (Berman 2015). Indeed, measles outbreaks have, in recent years, made headlines, and recent work has attributed these outbreaks to vaccine refusal (Phadke et al. 2016).

In 1963, a vaccine for measles was introduced (Moss and Griffin 2012). Because of the increase in awareness of measles outbreaks in a growing group of unvaccinated children, understanding epidemiological patterns in measles incidence is increasingly important. To this end, we use historical monthly data on measles cases in New York City to investigate the epidemiology of measles outbreaks in an unvaccinated population.

Data Summary

Data were obtained from the Time Series Data Library and consist of 534 monthly records of the number of reported measles cases in New York City from January, 1928 to June, 1972. (Hyndman). We begin our analysis by examining time plots of the data, first on the monthly scale, and second on an annual scale (by restricting focus to January of each year).

\label{fig:timeplots} Time plots of the number of reported measles cases in New York City from 1928 to 1972. Top: Monthly reported measles cases. Bottom: Reported measles cases in January of each year. Red dashed lines are drawn on January 1, 1964, the start of the first full year in which a measles vaccine was available.

Time plots of the number of reported measles cases in New York City from 1928 to 1972. Top: Monthly reported measles cases. Bottom: Reported measles cases in January of each year. Red dashed lines are drawn on January 1, 1964, the start of the first full year in which a measles vaccine was available.

In the top frame of the above figure, the number of reported measles cases is plotted against time for every month from 1928 to 1972. We notice that the spikes in the plot are severely dampened starting in the late 1960s, which is likely associated with the introduction of a measles vaccine in 1963 (Moss and Griffin 2012). The plot suggests potential seasonality in outbreaks – the regularity of the spikes may indicate that outbreaks occur at a particular time each year. To assess this claim, we examine a reduced version of the time series using only the number of cases reported in January of each year. A relatively flat plot would provide evidence for seasonality; rather, we discover that, upon looking at a less granular time scale, outbreaks seem to occur every 2-3 years, rather than annually. There is no observable trend in the data.

Due to the introduction of the measles vaccine in 1963, we have good reason to believe that the time series starting in 1964, the first full year in which the vaccine was available, arises from a different process than before. That is, the measles vaccine is likely to have led to meaningfully-dampened measles incidence relative to pre-1963 numbers. Furthermore, we propose that the effects of this exogenous “shock” are, for the purposes of this data, long-lasting. Indeed, this is a reasonable assumption, as measles was recently declared eradicated from North and South Americas by the WHO (Fox 2016). In order to investigate the “natural” epidemiology of measles (i.e., in an unvaccinated population), we restrict our analysis to data before 1964.

Frequency Domain Analysis

From the time plots, we see that measles seems to appear in regular outbreaks. Frequency domain analysis can help identify periodic patterns in the data. We plot the spectral density below, using a nonparametric smoother.

The largest peak on the nonparametrically-smoothed periodogram on the left occurs at a frequency of 0.083, which corresponds to a period of 12 months. Notice also that there is a second relatively large peak at a frequency of approximately 0.047, which corresponds to a period of 2 years. Both of these peaks are confirmed by the parametrically-smoothed periodogram.

This is consistent with the monthly time plot showing large spikes approximately every two years with some small upticks in reported cases in between. The finding suggests that measles outbreaks generally occurred biannually, which might suggest an AR(1) model for the annual process: the number of cases is likely negatively correlated with the previous year’s.

Frequencies higher than approximately 0.1 become much less important as the frequency increases. This suggests that outbreaks occur every two years and only last for relatively short periods of time.

Modeling

We wish to develop a reasonable model for measles outbreak patterns so as to investigate the epidemiology of the virus. Based on our previous analysis, we will develop a seasonal ARMA model.

Annual component

The January time series captures the general pattern of the data, but is unable to account for intra-year variation in reported measles incidence and/or multiple outbreaks per year. A plot of the sample autocorrelation of the January time series (below) indicates an AR(1) model might be appropriate for this annual data.

Plot of autocorrelation function (ACF) for NYC measles outbreaks in January, 1928 - 1963. The dashed lines indicate a 'confidence interval' around zero under the null hypothesis that the ACF is zero.

Plot of autocorrelation function (ACF) for NYC measles outbreaks in January, 1928 - 1963. The dashed lines indicate a ‘confidence interval’ around zero under the null hypothesis that the ACF is zero.

The choice of AR(1) is confirmed by comparing the AICs of a variety of ARMA(\(p\), \(q\)) models:

MA0 MA1 MA2 MA3 MA4 MA5
AR0 619.85 616.23 618.21 620.06 621.59 622.40
AR1 616.62 618.29 618.93 620.52 622.29 624.23
AR2 618.24 618.67 620.67 622.58 624.26 625.39
AR3 620.23 620.67 622.56 624.62 625.69 626.98
AR4 622.21 622.47 624.38 624.33 627.49 628.85
AR5 623.66 624.29 625.66 625.71 629.38 630.59

The above table suggests that an AR(1) model is a reasonable fit for the data. All adjacent AICs are within approximately 2 units of each other, which suggests likelihood maximization encountered no major numerical issues or instability.

Monthly component

We now focus on the montly component of the data, adding the annual AR(1) model to account for seasonality. We attempt to fit a variety of SARMA(\(p\),\(q\))\(\times\)(1,0)\(_{12}\) models and tabulate AICs below.

MA0 MA1 MA2 MA3 MA4 MA5
AR0 7614.46 7362.21 7341.13 7327.00 7326.26 7318.67
AR1 7443.47 7334.66 7334.94 7327.57 7328.16 7313.47
AR2 7341.75 7318.92 7313.43 7322.30 7313.44 7315.47
AR3 7338.90 7321.21 7313.81 7314.49 7315.47 7317.49
AR4 7329.50 7322.83 7324.85 7323.64 7316.89 7319.02
AR5 7330.95 7322.33 7315.88 7306.36 7318.57 7320.42

Here, we discover that an ARMA(2,2) model is appropriate for the monthly component.

The fitted SARMA(2,2)\(\times\)(1,0)\(_{12}\) model is \[ \left(1-1.315B + 0.456B^2\right) \left(1 - 0.592B^{12}\right) \left(Y_{n} - 1665.654\right) = \left(1 -0.322B -0.576B^{2}\right) \epsilon_{n}, \] where \(Y_{n}\) is the number of reported measles cases at timepoint \(n\), \(B\) is the backshift operator, and \(\epsilon_{n}\) is normally-distributed white noise associated with the measurement at timepoint \(n\).

Diagnostics

We examine the roots of the AR, MA, and SAR polynomials of the fitted SARMA(2,2)\(\times\)(1,0)\(_{12}\) model. For the AR(2) polynomial, both roots are outside the complex unit circle:

## [1] 1.440614+0.339858i 1.440614-0.339858i

Similarly for the MA(2) polynomial:

## [1] -0.279192+1.287248i -0.279192-1.287248i

The model is thus stationary, causal and invertible. We assess the residuals:

There is some residual autocorrelation at a 24-month lag, which may suggest that the strength of the 2-year period is not sufficiently captured by the model.

Conclusion

Measles, though declared eradicated in the Americas, is potentially experiencing a resurgence due to vaccine refusal. The unvaccinated population of New York City experienced a major outbreak of measles approximately every two years, in which over 1000 cases were reported each month.

We have used this historical data to discover patterns in measles incidence: namely, that outbreaks tend to occur every 2 years and are generally short-lived. This confirms existing research on the epidemiology of measles (Moss and Griffin 2012).

References

Berman, Mark. 2015. “How the U.S. went from eliminating measles to a measles outbreak at Disneyland - The Washington Post.” https://www.washingtonpost.com/news/post-nation/wp/2015/01/23/how-the-u-s-went-from-eliminating-measles-to-a-measles-outbreak-at-disneyland/?utm{\_}term=.decf93357a5f.

Centers for Disease Control and Prevention. 2015. “Measles.” Epidemiol. Prev. Vaccine-Preventable Dis. 13th Ed., 209–29. http://www.cdc.gov/vaccines/pubs/pinkbook/meas.html.

Fox, Maggie. 2016. “Measles Has Been Eliminated in the Americas, WHO Says.” https://www.nbcnews.com/health/health-news/measles-has-been-eliminated-americas-who-says-n655406.

Hyndman, R.J.“Time Series Data Library.” http://data.is/TSDLdemo.

Moss, William J., and Diane E. Griffin. 2012. “Measles.” Lancet 379 (9811): 153–64. doi:10.1016/S0140-6736(10)62352-5.

Phadke, Varun K., Robert A. Bednarczyk, Daniel A. Salmon, and Saad B. Omer. 2016. “Association Between Vaccine Refusal and Vaccine-Preventable Diseases in the United States.” JAMA 315 (11): 1149. doi:10.1001/jama.2016.1353.