HW03 Solutions

Q2

(a)

# (a)
## download unemployment data file
download.file(destfile="unemployment.csv",
  url="https://ionides.github.io/401f18/01/unemployment.csv")

## read-in unemployment data into R
data <- read.table(file="unemployment.csv",sep=",",header=TRUE)

To define y, we look at a snipet of the dataset to see how to obtain ‘y’.

##   Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1 1948 4.0 4.7 4.5 4.0 3.4 3.9 3.9 3.6 3.4 2.9 3.3 3.6
## 2 1949 5.0 5.8 5.6 5.4 5.7 6.4 7.0 6.3 5.9 6.1 5.7 6.0
## 3 1950 7.6 7.9 7.1 6.0 5.3 5.6 5.3 4.1 4.0 3.3 3.8 3.9
## 4 1951 4.4 4.2 3.8 3.2 2.9 3.4 3.3 2.9 3.0 2.8 3.2 2.9
## 5 1952 3.7 3.8 3.3 3.0 2.9 3.2 3.3 3.1 2.7 2.4 2.5 2.5
## 6 1953 3.4 3.2 2.9 2.8 2.5 2.7 2.7 2.4 2.6 2.5 3.2 4.2

As we can see, to define ‘y’, we need to take the average of all columns except the first one. This can be done in one of the following two ways:

# (a) ctd
## Defining y
y <- apply(data[,-1],1,mean)
  #or
y <- apply(data[,2:ncol(data)],1,mean)

(b)

Recall from class notes (Notes03,LM3 and Notes03, slide15) that the design matrix for such a model will be \({\mathbb{X}}=[\mathbf{x_1} \mathbf{x_2]}\) where \(\mathbf{x_1}\) is the vector of years (that is, its \(i^{th}\) entry corresponds to the \(i^{th}\) year in the dataset) and \(\mathbf{x_2}\) is a vector of 1s (of length equal to the number of observations in the dataset). Let us now construct this matrix

#(b)
## Extract vector of years
x1 <- data$Year

## Define a vector containing 1s (of appropriate length)
x2 <- rep(1,nrow(data))

## Define X = [x1 x2]
X <- cbind(x1,x2)

(c)

Recall formula for to obtain b from class notes (Notes03, LM4)

# (c)
b <- solve(t(X)%*%X)%*%t(X)%*%y
rownames(b) <- c("b1","b2") #renaming the elements of b, this is not essential, being done here so that the notation                              matches that of the hw
b

##            [,1]
## b1   0.03134917
## b2 -56.29399009

(d)

Here, we have to rename the first column (\(\mathbf{x_1}\)) as \(\mathbf{x}\).

# (d)
## Define x
x <- x1

## Define vector of fitted values
fitted_values <- c(X%*%b)

## Plot y ~ x
plot(x,y, main="Plot of average unemployment per year", ylab="average unemployment", xlab= "year")
lines(x,fitted_values,lty="solid",col="blue")

Clearly, there seems to be an increasing trend, that is, the average annual unemployment rate appears to be increasing every year.

(e)

# (e)
## Fitting the linear model
lm1 <- lm(y~x1, data = data) #remember x is labeled as x1 in the design matrix

## Compare first 5-values
lm1$fitted.values[1:5]

##        1        2        3        4        5 
## 4.774190 4.805539 4.836888 4.868238 4.899587

fitted_values[1:5]

## [1] 4.774190 4.805539 4.836888 4.868238 4.899587

## Check if both vectors are equal by taking their difference
table(round(lm1$fitted.values - fitted_values,4))

## 
##  0 
## 68

Q3

If \({\mathbb{A}},{\mathbb{B}}\) are two invertible square matrices, we have the property that \({\mathbb{(AB)}}^{-1} = {\mathbb{B}}^{-1}{\mathbb{A}}^{-1}\).
However, this formula runs into difficulties if \({\mathbb{A}}\) or \({\mathbb{B}}\) is not invertible. This is the problem in the argument. In fact, \({\mathbb{X}}\) is usually not invertible. One way to see this is that \({\mathbb{X}}\) being invertible implies there is an exact solution to the system of equations \({\mathbb{X}}{\boldsymbol{\mathrm{b}}}={\boldsymbol{\mathrm{y}}}\) given by \({\boldsymbol{\mathrm{b}}}={\mathbb{X}}^{-1}{\boldsymbol{\mathrm{y}}}\). You may know that to determine \(p\) variables in a system of linear equations, we need \(p\) equations. Fewer than \(p\) equations and we won’t have enough information to solve for \(p\) variables. More than \(p\) equations and likely there won’t be a solution. Thus, only a square matrix is invertible. For us, \(n\) is the number of data points and \(p\) is the number of coefficients in the linear model so usually \(p\) is much less than \(n\). Thus, \({\mathbb{X}}\) is not square and cannot be invertible.

If \({\mathbb{X}}\) were invertible then (a) would be true. The following steps (b), (c) and (d) are correct matrix calculations and so this product would in fact equal identity in that case.

Q4

Since \({\mathbb{A}}\) is a square matrix whose determinant is \(6 \neq 0\), its inverse exists. Now, we know that \({\mathbb{A}}{\mathbb{A}}^{-1} = {\mathbb{I}}\) where \(mat{I}\) is a diagonal matrix as well. Also, from above we know that \({\mathbb{A}}{\mathbb{A}}^{-1}\) is the same as multiplying the first columns of \({\mathbb{A}}^{-1}\) by 1,2, and 3 respectively. Hence, the natural guess would be that \[ {\mathbb{A}}^{-1} = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/3 \end{array} \right]. \] and indeed this is correct! Let’s verify using R.

# Q4
## Define A, inv_A 
A <- matrix(c(1,0,0,0,2,0,0,0,3),3)
inv_A <- matrix(c(1,0,0,0,1/2,0,0,0,1/3),3)

## Verify 
A%*%inv_A

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

HW03 Solutions

STATS401 team

9/17/2018

Q2

(a)

(b)

(c)

(d)

(e)

Q3

Q4