This assignment gets your computer ready for the computing environment we will use in STATS/DATASCI 531. There is nothing to turn in. Most students are expected to be familiar with these tools already. If they are unfamiliar, you are welcome to ask for advice or assistance in group meetings or via Piazza.


Internet repositories for collaboration and open-source research: git and GitHub

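To make a local copy of the course repository, run:

git clone https://github.com/ionides/531w21

To bring that copy up to date later, run the following from inside the cloned directory:

git pull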
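The clone and pull commands above can be wrapped in one small helper, so the same call works on first use and for later updates. This is only a sketch: `sync_repo` is a hypothetical function name, not part of git, and the target directory name is an assumption.

```shell
# Clone a repository on first use; afterwards, pull updates into it.
# sync_repo is a hypothetical helper name, not a git command.
sync_repo() {
  url=$1
  dir=$2
  if [ -d "$dir/.git" ]; then
    # Already cloned: fetch and merge any new commits.
    git -C "$dir" pull
  else
    # First time: make a local copy of the repository.
    git clone "$url" "$dir"
  fi
}

# Example call (needs network access), left commented out:
# sync_repo https://github.com/ionides/531w21 531w21
```

Checking for the `.git` directory is a simple way to tell whether the clone already exists. It does not verify that the existing copy came from the same URL.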

R and RStudio

You have probably used R before, and if not, it is time to start! We will make extensive use of R. Please check that R is installed on your laptop; it is available at www.r-project.org.
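One quick way to check the installation from a terminal is to ask the shell whether the R command-line programs are on your PATH. The sketch below assumes a POSIX-style shell (macOS, Linux, or Git Bash on Windows); `check_tool` is a hypothetical helper name.

```shell
# Report whether a command-line program is on the PATH.
# check_tool is a hypothetical helper name used only in this sketch.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found"
  fi
}

check_tool R        # the interactive R interpreter
check_tool Rscript  # runs R code non-interactively
```

If either program is reported as not found, R may still be installed but missing from the PATH, which is common on Windows.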

RStudio is a popular environment for carrying out statistical analysis in R. You can choose whether or not to access R through RStudio for this course, but many people find it a convenient approach. It can be downloaded from www.rstudio.com.

Rmarkdown and knitr

The midterm and final projects will be submitted as reproducible reports written in Rmarkdown or knitr. A reproducible report combines text and source code, generates the results by running the code, and inserts the resulting tables, figures and numbers into the finished document. This approach has several advantages: (i) you can easily modify your report if you want to try doing something differently; (ii) the reader can, if necessary, inspect or run the code that gave the results; (iii) classmates can easily learn effective data analysis techniques from each other. Rmarkdown is a popular approach for doing this; see rmarkdown.rstudio.com. If you have not used Rmarkdown before, you might like to start familiarizing yourself with it. RStudio works well with Rmarkdown (Rmd) files, especially for generating HTML documents. Knitr is similar to Rmarkdown, and provides a better environment for producing PDF documents. The course notes are written using knitr, and you are welcome to inspect the source files in the GitHub repository.
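As a rough illustration of how text and code combine, the sketch below writes a very small Rmarkdown file and renders it if R is available. The file name `report.Rmd` and its contents are assumptions made for this example, and rendering assumes the rmarkdown package and pandoc are installed.

```shell
# Write a minimal R Markdown file, then render it if R is available.
# report.Rmd is a hypothetical file name chosen for this sketch.
fence=$(printf '\140\140\140')   # three backticks: the Rmd chunk delimiter

{
  # YAML header: document title and output format.
  printf '%s\n' '---' 'title: "Minimal reproducible report"' 'output: html_document' '---' ''
  # Ordinary text.
  printf '%s\n' 'The chunk below is run when the report is rendered.' ''
  # An R code chunk; its output is inserted into the finished document.
  printf '%s{r}\n' "$fence"
  printf '%s\n' 'x <- c(2, 4, 6)' 'mean(x)'
  printf '%s\n' "$fence"
} > report.Rmd

# Rendering assumes the rmarkdown package (and pandoc) are installed.
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'rmarkdown::render("report.Rmd")' || echo "render failed; is rmarkdown installed?"
else
  echo "Rscript not found; wrote report.Rmd without rendering"
fi
```

In practice you would write the Rmd file in an editor such as RStudio rather than generate it from the shell. The point here is only to show the three parts of the file: header, text, and code chunk.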