This homework is a small data analysis project designed as preparation for the midterm project. In particular, you are required to write your report using Rmd format, which is required for the midterm and final projects. Consequently you should submit two files, an Rmd file and the HTML document you obtain by compiling it. The Rmd file should be written so that the grader can run it, and therefore the data should be read from the homework website rather than from a local directory. The grader will not necessarily recompile the Rmd, but if they do then it should reproduce the HTML. If technical difficulties arise with learning to use Rmd format, please consult your peers, piazza, the GSI or myself.
Most of you will find that editing the Rmd file in Rstudio may be the simplest solution. Also, the source files for this homework and all the notes are on the class GitHub site: if you see anything in the notes that you’d like to reproduce, you can take advantage of that. Opening the file hw03.Rmd in Rstudio and editing it to produce your solution is one way to proceed. You may also like to browse http://www.stat.cmu.edu/~cshalizi/rmarkdown/.
You will need to know some Latex to write equations in Rmarkdown. Many tutorials exist online, e.g. http://www.latex-tutorial.com/tutorials. One legitimate approach is to identify equations in the notes that you would like to modify, and then dig out the source code from the course github repository. If you look at code from the slides, it will be in Rnw format not Rmd format: both methods combine Latex and R in similar ways, and the main practical difference is the symbols used to separate code chunks from text.
Your report should contain a reference section listing sources. The grader should be able to clearly identify where the sources were used, for example using reference numbers in the text. Anything and anyone consulted while you are working on the homework counts as a source and should be credited. The homework will be graded following the posted rubric.
Question 3.1. Try out some of the ARMA techniques studied in class on the Ann Arbor January Low temperature time series that we saw in Chapter 1 of the notes. Write a report as an Rmd file and submit this file on the class Canvas site. This is an open-ended assignment, but you’re only expected to do what you can in a reasonable amount of time. Some advice follows.
x <- read.table(file="http://ionides.github.io/531w24/01/ann_arbor_weather.csv",header=TRUE)
plot(Low~Year,data=x,type="l")
Your report should involve model equations and hypotheses, and should define the notation used. Please be careful to distinguish between symbols representing random variables (usually using capital letters) and numbers. You are welcome to follow the notation in the course notes, and if you deviate from this notation it is especially necessary to define the notation that you choose.
You are advised to try a few things from the notes, spot something that catches your attention, and try a few more things to investigate it. Write up what you found, and you’re finished!
When writing up your homework report, you must choose which
pieces of R code to include in the HTML document. To tell Rmarkdown not
to include the R code in the HTML document, use the
echo=FALSE
chunk option, e.g.,
```{r chunk_without_code, echo=FALSE}
cat("only the output of this code chunk will be printed\n")
```
You should only display the code in the HTML document if you think that, in your context, the helpfulness of showing the code outweighs the extra burden on the reader, since the reader can work through the Rmd source file if necessary. For your homework report, it is helpful to show more code than you would in a project report. A suitable balance might be similar to the style of the course notes.
When you have got everything you can out of the Ann Arbor January Low temperature time series, consider it in the context of the global mean annual temperature time series on the class github site:
global <- read.table(file="http://ionides.github.io/531w24/hw03/Global_Temperature.txt",header=TRUE)
plot(Annual~Year,data=global,type="l")