For your homework report this week, you do not need to submit solutions to the swirl lessons. Please report whether you successfully completed them, and discuss any issues that arose in the Sources and Please explain statements. For the data analysis exercise, write a brief report including the output you are asked to compute and the R code you used to generate it. Recall that you are permitted to collaborate, or to use internet resources including the source code for the textbook at http://www.stat.tamu.edu/~sheather/book/, but you must list all sources that make a substantial contribution to your report.
Start by reading Section 1.2.1 of the textbook. This section describes the data and the models we will consider, and then goes on to discuss p-values, which we have not yet introduced in class. We will get to that soon. Look at the data at https://ionides.github.io/401w18/hw/hw03/FieldGoals2003to2006.csv. The data are described in the header of that file.
Read the data into R (as in Homework 1).
data_nfl <- read.csv("https://ionides.github.io/401w18/hw/hw03/FieldGoals2003to2006.csv",header = TRUE,skip=5)
y <- data_nfl$FGt
X <- cbind(data_nfl$FGtM1,rep(1,length(data_nfl$FGtM1)))
b <- c("m","c")
#Confirm that you have defined the correct matrices
head(y); head(X)
## [1] 73.5 93.9 80.0 89.4 82.7 84.3
## [,1] [,2]
## [1,] 90.0 1
## [2,] 73.5 1
## [3,] 93.9 1
## [4,] 80.0 1
## [5,] 88.2 1
## [6,] 82.7 1
b <- solve( t(X) %*% X ) %*% t(X) %*% y
b
## [,1]
## [1,] -0.1509583
## [2,] 94.6097871
plot(x=data_nfl$FGtM1,y=jitter(y),xlab = "Av Field Goals in previous year (t-1)", ylab = "Av Field Goals in current year (t)")
abline(a=b[2],b=b[1],col="red")
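One way to gain confidence in the matrix calculation above is to compare it with R's built-in lm() function. The following is only a sketch on simulated stand-in data, not the field goal data; the names x, y_sim, X_sim, b_sim and fit are made up for this illustration, and the simulated slope and intercept are arbitrary.

```r
# Simulated stand-in data (NOT the field goal data): 76 points around a line.
set.seed(1)
x <- runif(76, min = 60, max = 100)
y_sim <- 95 - 0.15 * x + rnorm(76, sd = 5)

# Design matrix with the slope column first and the intercept column second,
# matching the column order used for X above.
X_sim <- cbind(x, rep(1, length(x)))

# Least squares estimate from the matrix formula.
b_sim <- solve(t(X_sim) %*% X_sim) %*% t(X_sim) %*% y_sim

# The same fit from lm(); coef() lists the intercept first, then the slope,
# so the order is reversed before comparing.
fit <- lm(y_sim ~ x)
all.equal(as.numeric(b_sim), as.numeric(rev(coef(fit))))
```

If the two approaches agree up to numerical tolerance, all.equal() returns TRUE. The same comparison could be run on data_nfl by replacing the simulated vectors with data_nfl$FGtM1 and y.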
rep(), and glue them together with cbind().
Z <- matrix(0,nrow=19*4,ncol=19)
for(k in 1:19){Z[,k]<-c(rep(c(0,0,0,0),k-1),rep(1,4),rep(c(0,0,0,0),19 - k))}
# or
for(k in 1:19){Z[,k][(4*(k-1)+1) : (4*k)] <- rep(1,4)}
X <- cbind(data_nfl$FGtM1,Z)
b <- solve( t(X) %*% X ) %*% t(X) %*% y
m <- b[1]; m
## [1] -0.5037008
There are other succinct ways to construct this matrix, and you can look for them if you wish. Report whether your least squares estimate of \(m\), constructed using the design matrix \({\mathbb{X}}\), matches the value of -0.504 in Figure 1.2 of Sheather.
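As a sketch of what such alternatives might look like, the block below builds the same 76 by 19 indicator matrix in two other ways and checks each against a loop construction. It assumes, as the loop above does, that the rows come in 19 consecutive blocks of 4. The names n_groups, n_per_group, Z_loop, Z_kron, Z_fac and group are made up for this illustration.

```r
n_groups <- 19     # number of indicator columns, as in the loop above
n_per_group <- 4   # rows per group (assumed to be consecutive)

# Loop construction, following the approach above.
Z_loop <- matrix(0, nrow = n_groups * n_per_group, ncol = n_groups)
for (k in 1:n_groups) {
  Z_loop[(n_per_group * (k - 1) + 1):(n_per_group * k), k] <- 1
}

# Kronecker product of a 19 x 19 identity matrix with a column of four ones.
Z_kron <- kronecker(diag(n_groups), matrix(1, nrow = n_per_group, ncol = 1))

# Indicator columns generated from a factor; "- 1" removes the intercept
# so that every group keeps its own column.
group <- factor(rep(1:n_groups, each = n_per_group))
Z_fac <- model.matrix(~ group - 1)

dim(Z_kron)
all(Z_loop == Z_kron)
all(Z_loop == Z_fac)
```

The factor-based version is closest to what lm() does internally when given a categorical explanatory variable, whereas the Kronecker version makes the block structure of the matrix explicit.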
plot(x=data_nfl$FGtM1,y=jitter(y),xlab = "Av Field Goals in previous year (t-1)", ylab = "Av Field Goals in current year (t)")
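# One fitted line per intercept, with common slope b[1]; index rainbow(19) directly,
# since palette(rainbow(19)) returns the previously active palette rather than the new one
for(i in 1:19){abline(a=b[i+1],b=b[1],col=rainbow(19)[i])}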