Computation time is an unavoidable consideration when working with simulation-based inference, for all but small datasets and simple models.
The pomp package therefore allows you to specify the most computationally intensive steps—usually, simulation of the stochastic dynamic system, and evaluation of the measurement density—as snippets of C code.
Consequently, to use pomp, your R program must have access to a C compiler. In addition, pomp takes advantage of some Fortran code and therefore requires a Fortran compiler.
Installing the necessary compilers should be fairly routine, but it does involve an extra step beyond the usual installation of an R package, unless you are running Linux, where they are usually installed by default. Given how fundamental C and Fortran are to scientific computing, it is unfortunate that Mac and Windows do not provide these compilers by default.
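Before installing pomp, it can help to see which compilers your R installation expects and whether they are on the search path. The following is a minimal sketch, not part of the official installation instructions. It assumes that R CMD config works on your system, which on Windows generally requires Rtools. The variable names are illustrative only.

```r
# Path to the R executable for the current installation
r_bin <- file.path(R.home("bin"), "R")

# Ask R which C and Fortran compilers it was configured to use
cc <- system2(r_bin, c("CMD", "config", "CC"), stdout = TRUE)
fc <- system2(r_bin, c("CMD", "config", "FC"), stdout = TRUE)

# The first word of each answer is the compiler command itself
cc_cmd <- strsplit(cc[1], " ")[[1]][1]
fc_cmd <- strsplit(fc[1], " ")[[1]][1]

# Sys.which() returns "" for a command that is not on the search path
found <- Sys.which(c(cc_cmd, fc_cmd))
print(found)
```

An empty string in the output suggests that the corresponding compiler still needs to be installed.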
Detailed instructions for installing pomp and other software that we will use with it are provided in the following places:
Additional instructions on our course website
Please submit your solutions to Canvas as an Rmarkdown (.Rmd) file which the GSIs will compile into an HTML document. Your Rmd file can read in the Consett measles data from the internet, e.g., by
read.csv(paste0("https://kingaa.github.io/sbied/stochsim/","Measles_Consett_1948.csv"))
Question 6.1. Exploring behavior of a POMP model: simulating an SIR process.
Write a solution to Exercise 2.3 from Chapter 12 (Simulation of stochastic dynamic models). Note the following:
We are working toward formal inference for POMP models. Nevertheless, playing with your model by plotting simulations at various parameter values is a useful exercise for getting to understand how your model behaves. It is not enough to know which parameter value maximizes the likelihood; we also want to understand enough about the model to be able to interpret this MLE. What types of behavior can the model exhibit? How could we describe the behaviors that are consistent with the data?
Your solution will have to build a copy of the measles model so that you can experiment with it. The R script from Chapter 13 may be useful. The script uses Hadley Wickham’s tidyverse and ggplot approach to R. This is a widely used approach, and well worth learning if you have not seen it before, but you may also stick with basic R. To read the script, you will need to know that x %>% myfunc(y) is equivalent to myfunc(x,y), so %>% is simply a convenient way to chain together functions, where the output of one function is piped into the next. Check that you understand this syntax for the code
library(tidyverse)
read_csv(paste0("https://kingaa.github.io/sbied/stochsim/",
                "Measles_Consett_1948.csv")) %>%
  select(week, reports = cases) -> meas
Here, the tidyverse version read_csv is used in place of the basic R function read.csv.
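The pipe syntax can be checked on a toy example that needs no data download. This is a small sketch: myfunc is a hypothetical function defined only for illustration, and the magrittr package (which supplies %>% and is installed with the tidyverse) is assumed to be available. The last step also shows the right-assignment arrow -> used in the script, which stores the result of the whole pipeline in the name on its right.

```r
library(magrittr)  # provides %>%; assumed installed (it comes with the tidyverse)

# myfunc is a hypothetical two-argument function used only for illustration
myfunc <- function(x, y) x - y

a <- 10 %>% myfunc(3)   # pipe form: 10 becomes the first argument
b <- myfunc(10, 3)      # ordinary call
print(identical(a, b))

# Chaining: the output of each step becomes the first argument of the next
chained <- c(1, 4, 9) %>% sqrt() %>% sum()
nested <- sum(sqrt(c(1, 4, 9)))
print(identical(chained, nested))

# Right assignment: the result of the pipeline is stored in roots
c(1, 4, 9) %>% sqrt() -> roots
print(roots)
```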
Worked solutions are linked from the notes in case you get stuck. Ideally, look at them only after attempting the homework independently. Your solution is welcome to discuss the relationship between your investigation of the model and the posted solutions.
Another example of building a pomp model is the Ricker model, originally developed to model fish populations and used in this example to model a bird population.
Various other tutorials and resources are available on the pomp package web site.
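To illustrate what plotting simulations at various parameter values can look like, here is a base-R sketch of a discrete-time stochastic SIR process. It is not the pomp measles model from the notes and is not a solution to the exercise. The function sir_sim, its arguments, and all parameter values are assumptions chosen for illustration. It tracks new infections only, with no measurement model for reported cases.

```r
# Hypothetical base-R sketch of a discrete-time stochastic SIR model.
# Rates are per week; all names and values are illustrative assumptions.
sir_sim <- function(Beta, gamma, N, I0 = 1, weeks = 40, steps_per_week = 7) {
  dt <- 1 / steps_per_week
  S <- N - I0
  I <- I0
  new_cases <- numeric(weeks)
  for (w in seq_len(weeks)) {
    for (k in seq_len(steps_per_week)) {
      # Each susceptible is infected during dt with prob 1 - exp(-Beta * I / N * dt)
      dN_SI <- rbinom(1, size = S, prob = 1 - exp(-Beta * I / N * dt))
      # Each infective recovers during dt with prob 1 - exp(-gamma * dt)
      dN_IR <- rbinom(1, size = I, prob = 1 - exp(-gamma * dt))
      S <- S - dN_SI
      I <- I + dN_SI - dN_IR
      new_cases[w] <- new_cases[w] + dN_SI
    }
  }
  new_cases
}

set.seed(1)
betas <- c(5, 10, 20)  # transmission rates to compare (illustrative)
sims <- sapply(betas, function(b) sir_sim(Beta = b, gamma = 3.5, N = 38000))

# One column per value of Beta, one row per week
matplot(sims, type = "l", lty = 1, col = 1:3,
        xlab = "week", ylab = "new infections")
legend("topright", legend = paste("Beta =", betas), col = 1:3, lty = 1)
```

Repeating the run with different seeds shows how much of the variation between curves comes from the parameters and how much from chance.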
Question 6.2. Modifying a POMP model: Adding a latent period to the SIR model
Write a solution to Exercise 2.4 from Chapter 12 (Simulation of stochastic dynamic models).
Question 6.3. This feedback response is worth credit.
Explain which parts of your responses above made use of a source, meaning anything or anyone you consulted (including your class group, or other classmates, or online solutions to previous courses) to help you write or check your answers. All sources are permitted, but you are expected to explain clearly what is, and is not, your own original contribution, as discussed in the syllabus.
This homework is conceptually quite simple, but involves overcoming various technical hurdles. The hurdles may be overcome quite quickly, or could turn into a longer battle. To make progress on statistical inference for POMP models, we have to solve these underlying computational issues. How long did this homework take? Report on any technical difficulties that arose.
The questions derive from material in a short course on Simulation-based Inference for Epidemiological Dynamics.