This provides a worked example of the more general task of running a pomp analysis on a Linux cluster under Slurm scheduling. That, in turn, is an instance of the still more general task of organizing reproducible, computationally intensive statistical data analysis.
(a) Copy the files to greatlakes. To move the Chapter 16 code to greatlakes, I cloned the 531w21 git repository; scp would also have worked.
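For example (the repository URL and uniqname below are illustrative placeholders; adjust them to your own setup):

git clone https://github.com/ionides/531w21
# or, copying a single directory from your own machine:
scp -r 16 uniqname@greatlakes.arc-ts.umich.edu: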
Check that the code runs, via an interactive session at run level 1:

(b) Edit main.Rnw, setting run_level to 1:

nano main.Rnw

(c) Request an interactive session:

srun --nodes=1 --account=stats531w21_class --ntasks-per-node=2 --pty /bin/bash

(d) Load the R module and start R:

module load R/4.0.3
R

(e) From within R, knit the document:

knitr::knit("main.Rnw")
Quite likely, some packages will be missing, and you will have to install them with install.packages() in your R session.
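For example, at the R prompt (the exact package list depends on what main.Rnw loads; pomp and knitr are needed here, and doParallel and doRNG are commonly used alongside pomp for parallel computation):

install.packages(c("pomp", "knitr", "doParallel", "doRNG"))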
When step (e) works for you, it will produce main.tex (or an md file, if you are knitting an Rmd). It will also produce the rda files that save the output of the computations.
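The rda files typically come from a caching idiom along the following lines. This is a minimal sketch, not the Chapter 16 code itself: it assumes the analysis uses pomp's stew() to save results to an rda file keyed by run level, with computational effort chosen by a switch on run_level; all variable names here are illustrative.

run_level <- 1
Np    <- switch(run_level, 100, 1000, 5000)  # e.g., particle counts per run level
Nreps <- switch(run_level, 10, 20, 100)      # e.g., replications per run level

library(pomp)
# stew() evaluates the expression and saves the objects it creates to the
# named rda file; on later runs it reloads the file instead of recomputing.
stew(file = sprintf("demo-%d.rda", run_level), {
  results <- replicate(Nreps, mean(rnorm(Np)))
})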
Try running the code as a batch job. 531w21/16 has three sbat files corresponding to the three run levels: r-1.sbat, r-2.sbat, and r-3.sbat (a sketch of what such a file might contain is given below). Try the batch job at run level 1 first. Note that if your knitr/rmarkdown file uses caching, you will have to remove any stale cache files. For Chapter 16, the cache goes in a subdirectory called tmp, so we can delete the whole cache with
rm -rf tmp
Then, having set run_level to 1 in main.Rnw, run
sbatch r-1.sbat
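For reference, r-1.sbat might look something like the following sketch. The account name and R module match the interactive session above; the job name, partition, core count, and wall time are illustrative assumptions, not the actual file contents.

#!/bin/bash
#SBATCH --job-name=ch16-r1         # illustrative job name
#SBATCH --account=stats531w21_class
#SBATCH --partition=standard       # assumed default partition
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4          # cores: larger in r-2.sbat and r-3.sbat
#SBATCH --time=01:00:00            # wall time: likewise larger at higher run levels

module load R/4.0.3
Rscript -e 'knitr::knit("main.Rnw")'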
Also, if you need to re-run all the code at run level 1, you can do
rm -rf *1.rda
to remove all the rda files for run level 1.
Once this is debugged, you can move on to run levels 2 and 3. The only differences between r-1.sbat, r-2.sbat, and r-3.sbat are the number of cores requested and the wall time. Requesting no more than you need may lead to shorter queue times, and perhaps fewer billed minutes. Again, you may have to remove cached files (both the knitr/rmarkdown cache and the rda files) if you want the results to be recomputed.
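While a job is queued or running you can watch it with squeue, and after it finishes sacct reports the elapsed time and CPU use, which helps when tuning the core and wall-time requests in the sbat files (JOBID below is a placeholder):

squeue --account=stats531w21_class
sacct -j JOBID --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS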
Once the rda files are computed, you can move them back to wherever you are most comfortable working, for further editing of the text and figures.
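For example, from your own machine (the uniqname and remote path are placeholders; the quotes let the remote side expand the glob):

scp 'uniqname@greatlakes.arc-ts.umich.edu:531w21/16/*.rda' .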