Quiz 1, STATS 401 W18

The quiz will test the skills covered in homeworks 1 to 4. You will have 50 minutes, though the quiz may take you much less time, and you can leave lab once you are done. The quiz will be closed book. Any electronic devices in your possession must be turned off and remain in a bag on the floor. Technical skills tested will include matrix multiplication and transposition, inversion of 2x2 matrices, and use of sigma notation for sums. R coding will be tested by multiple choice questions. There will be questions on setting up equations for fitting a linear model, writing these equations in matrix form, and obtaining and interpreting the least squares fit to data.

This document generates different random quizzes each time it is compiled by Rmarkdown. The actual quiz will be a realization generated by this random process, or something very similar.

Matrix exercises

M1. Evaluate ${\mathbb{A}}{\mathbb{B}}$ when \[ {\mathbb{A}}= \begin{bmatrix} -2 & -1 \\ 2 & -1 \\ 3 & -2 \\ \end{bmatrix} , \quad {\mathbb{B}} = \begin{bmatrix} -1 & -2 \\ 1 & -2 \\ \end{bmatrix} \]

M2. For ${\mathbb{A}}$ as above, write down ${\mathbb{A}}^{{\scriptscriptstyle \mathrm{T}}}$.

M3. For ${\mathbb{B}}$ as above, find ${\mathbb{B}}^{-1}$ if it exists. If ${\mathbb{B}}^{-1}$ doesn’t exist, explain how you know this.
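
After working M1 to M3 by hand, one way to check your answers is to enter the matrices in R. The following is a minimal sketch using the matrices from this realization of the quiz; the object names AB, At, detB and Binv are illustrative choices, not part of the quiz.

```r
# Enter the matrices row by row, as they are printed in M1.
A <- matrix(c(-2, -1,
               2, -1,
               3, -2), nrow = 3, byrow = TRUE)
B <- matrix(c(-1, -2,
               1, -2), nrow = 2, byrow = TRUE)

AB <- A %*% B    # M1: matrix product, a 3x2 matrix
At <- t(A)       # M2: transpose, a 2x3 matrix

# M3: a 2x2 matrix is invertible exactly when its determinant is nonzero.
detB <- det(B)
if (abs(detB) > 1e-12) {
  Binv <- solve(B)          # inverse of B
  print(B %*% Binv)         # should be the 2x2 identity, up to rounding
} else {
  print("B is singular, so it has no inverse")
}
```

Because the quiz is randomly generated, the entries of the matrices will differ on the day; only the two matrix() calls would need to change.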

Summation exercises

S1. A basic exercise.

Calculate $\sum_{i=k}^{k+4} (i+3)$, where $k$ is a whole number. Your answer should depend on $k$.

S2. An example involving sums of squares and products.

Show that $\frac{1}{n} \sum_{i=1}^n \big(x_i - \bar x\big)^2 = \Big(\frac{1}{n}\sum_{i=1}^n x_i^2\Big) -\bar x^2$, where $\bar x = \frac{1}{n}\sum_{i=1}^n x_i$.
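
Neither summation exercise needs a computer, but a numerical check can catch algebra slips. In the minimal sketch below, direct_sum is a hypothetical helper that evaluates the S1 sum term by term for a given k, and x is made-up data used to compare the two sides of the S2 identity. Agreement for particular numbers is a sanity check, not a proof.

```r
# S1: evaluate the sum directly for several k and compare with your formula.
direct_sum <- function(k) sum((k:(k + 4)) + 3)
s1_values <- sapply(0:3, direct_sum)
print(s1_values)

# S2: compare both sides of the identity on made-up data.
x <- c(2, 5, 1, 7, 4)
lhs <- mean((x - mean(x))^2)
rhs <- mean(x^2) - mean(x)^2
print(all.equal(lhs, rhs))
```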

R exercises

R1. Using rep() and matrix().

Which of the following lines of code successfully constructs the matrix $\mathbb{A} = \begin{bmatrix}1 & 1\\2 & 2\\3 & 3\end{bmatrix}$?

(a). $\quad$ A <- matrix(c(1,1,2,2,3,3), nrow=3)

(b). $\quad$ A <- cbind(c(1,1),c(2,2),c(3,3))

(c). $\quad$ A <- t(matrix(c(1,1,2,2,3,3), nrow=2))

(d). $\quad$ A <- c(c(1:3),c(1:3))
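
To test options like these yourself, you can run each one and compare the result with the target matrix. The sketch below builds the target with rep() and matrix(); matches_target is a hypothetical helper, and it uses all.equal() rather than identical() so that integer and double entries compare as equal.

```r
# Target: matrix() fills column by column, so rep(1:3, 2) gives two
# identical columns 1, 2, 3.
target <- matrix(rep(1:3, 2), nrow = 3)

# TRUE only if the candidate is a matrix with the same shape and entries.
matches_target <- function(candidate) {
  is.matrix(candidate) &&
    all(dim(candidate) == dim(target)) &&
    isTRUE(all.equal(candidate, target))
}

# Example use with a candidate that is not one of the quiz options.
candidate <- cbind(1:3, 1:3)
print(matches_target(candidate))
```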

R2. Manipulating vectors and matrices in R.

Suppose we define an R vector by y <- c(3,NA,-1,4,NA,-2). What will y[y>0] give you?

(a). A vector of the positive elements and NA values of y.

(b). A vector of the negative elements of y.

(c). A vector of all NAs.

(d). A vector of TRUEs and FALSEs.

(e). A vector of TRUEs and FALSEs and NAs.
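
The behaviour being tested here is how R handles NA in a comparison and in logical indexing. A minimal sketch on a different, made-up vector z, so that the quiz question itself is left for you to answer:

```r
z <- c(NA, 2, -5)

# Comparing NA with a number gives NA, not TRUE or FALSE.
keep <- z > 0
print(keep)

# An NA in a logical index produces an NA in the result.
print(z[keep])

# which() drops the NAs, keeping only positions where the test is TRUE.
print(z[which(z > 0)])
```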

Fitting a linear model by least squares

F1. Recall the dataset uswages containing ten variables on 2000 subjects from the 1988 Current Population Survey.

head(uswages, n=4)

##         wage educ exper race smsa ne mw so we pt
## 6085  771.60   18    18    0    1  1  0  0  0  0
## 23701 617.28   15    20    0    1  0  0  0  1  0
## 16208 957.83   16     9    0    1  0  0  1  0  0
## 2720  617.28   12    24    0    1  1  0  0  0  0

Suppose we want to fit a linear model using wage as response, with years of education and years of experience as predictors. Which of the following lines of code successfully constructs the matrix $\mathbb{X}$ for a representation ${\mathbf{y}}={\mathbb{X}}{\mathbf{b}}+{\mathbf{e}}$?

(a). X <- matrix(uswages$educ, uswages$exper)

(b). X <- matrix(rep(1,nrow(uswages)), uswages$educ, uswages$exper)

(c). X <- cbind(rep(1,nrow(uswages)), uswages$educ, uswages$exper)

(d). X <- cbind(uswages$educ, uswages$exper)
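
For a model with an intercept, $\mathbb{X}$ has one row per subject and one column per coefficient. The sketch below uses a small made-up data frame toy with the same column names as uswages (the values are invented for illustration) and compares a hand-built matrix with the one that model.matrix() produces from a formula.

```r
# Made-up data with the same column names as uswages; values are invented.
toy <- data.frame(wage  = c(700, 600, 950, 620, 800),
                  educ  = c(18, 15, 16, 12, 14),
                  exper = c(18, 20, 9, 24, 30))

# Hand-built design matrix: a column of ones for the intercept,
# then one column per predictor.
X_hand <- cbind(rep(1, nrow(toy)), toy$educ, toy$exper)

# The design matrix R builds from a model formula.
X_formula <- model.matrix(~ educ + exper, data = toy)

print(dim(X_hand))
print(all.equal(X_hand, X_formula, check.attributes = FALSE))
```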

F2. If we want to fit the model using the R function lm(), which of the following calls is correct?

(a). lm(wage ~ ., data = uswages)

(b). lm(y ~ x, data = uswages)

(c). lm(wage = educ + exper, data = uswages)

Explain briefly how you would check whether your proposed solution is correct.
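
One possible check is to compare the coefficients reported by lm() with the least squares solution of the normal equations, ${\mathbb{X}}^{\scriptscriptstyle \mathrm{T}}{\mathbb{X}}{\mathbf{b}} = {\mathbb{X}}^{\scriptscriptstyle \mathrm{T}}{\mathbf{y}}$, computed directly from the design matrix. The sketch below is an illustration under stated assumptions, not the required answer: it uses a made-up data frame toy with invented values in place of uswages, and the formula shown is one way to request an intercept plus the two predictors.

```r
# Made-up data with the same column names as uswages; values are invented.
toy <- data.frame(wage  = c(700, 600, 950, 620, 800),
                  educ  = c(18, 15, 16, 12, 14),
                  exper = c(18, 20, 9, 24, 30))

# Fit with lm(): response on the left of ~, predictors on the right.
fit <- lm(wage ~ educ + exper, data = toy)

# Fit by hand: solve the normal equations t(X) X b = t(X) y.
X <- cbind(1, toy$educ, toy$exper)
y <- toy$wage
b <- solve(t(X) %*% X, t(X) %*% y)

print(coef(fit))
print(as.vector(b))

# The two sets of coefficients should agree up to rounding error.
print(all.equal(unname(coef(fit)), as.vector(b)))
```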

Acknowledgements: Some questions are derived from https://genomicsclass.github.io/book. Some are derived from http://swirlstats.com/.

License: This material is provided under an MIT license.

In lab on 2/1 or 2/2
