Scheduling


Instructions

The midterm exam will test the skills covered in homeworks 1–6, labs 1–6, and material covered in class up to Wednesday 2/14. You will have 80 minutes. The exam will be closed book. Any electronic devices in your possession must be turned off and remain in a bag on the floor. Technical skills tested will include both numerical and algebraic manipulation of variance and covariance, as well as the skills covered in quiz 1. R coding will be tested by multiple-choice questions and by reading given code. The only code you may have to write in the exam is a call to pnorm() or qnorm() to show how to evaluate a normal distribution calculation. There will be questions on explaining R output related to fitting linear models, as well as questions on writing linear models in matrix form and as equations with subscripts.


Formulas


Question categories

All question categories from the quiz will be included except for the basic matrix exercises (M1,M2,M3).


Summation exercises

S1. A basic exercise.

S2. An example involving the summation representation of matrix multiplication.
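As a reminder of what S2 refers to: entry (i, j) of a matrix product AB is the sum over k of A[i, k] times B[k, j]. The sketch below checks this against R's built-in product. The matrices A and B are made up for illustration and are not taken from the homework.

```r
# Two small matrices chosen only for illustration
A <- matrix(1:6, nrow = 2)                   # 2 x 3, filled column by column
B <- matrix(c(1, 0, 2, 1, 3, 1), nrow = 3)   # 3 x 2

# Entry (i, j) of AB is the sum over k of A[i, k] * B[k, j]
C <- matrix(0, nrow = nrow(A), ncol = ncol(B))
for (i in 1:nrow(A)) {
  for (j in 1:ncol(B)) {
    C[i, j] <- sum(A[i, ] * B[, j])
  }
}

# The summation form agrees with R's matrix product
all.equal(C, A %*% B)
```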


R exercises

R1. Using rep() and matrix().

R2. Manipulating vectors and matrices in R.
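A short sketch of the kind of code you might be asked to read for R1 and R2. The vectors and matrices here are invented for illustration; the point is how the times, each, and byrow arguments change the result.

```r
# rep(): 'times' repeats the whole vector, 'each' repeats each element
x <- rep(c(1, 2), times = 3)   # 1 2 1 2 1 2
y <- rep(c(1, 2), each = 3)    # 1 1 1 2 2 2

# matrix() fills column by column unless byrow = TRUE
M1 <- matrix(x, nrow = 2)                 # columns are (1,2), (1,2), (1,2)
M2 <- matrix(y, nrow = 2, byrow = TRUE)   # rows are (1,1,1), (2,2,2)

# Typical manipulations: transpose, indexing, matrix product
t(M1)                # 3 x 2 transpose
M2[2, ]              # second row of M2
dim(M1 %*% t(M2))    # a 2 x 3 matrix times a 3 x 2 matrix is 2 x 2
```

Here M1 and M2 turn out to be the same matrix, even though they were built from different vectors with different fill orders. Working through why is good practice for this category.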


Fitting a linear model by least squares

[This category is similar to, but slightly different from, the F1 and F2 questions in the quiz.]

F1. Write the sample version of a linear model in subscript form given the matrix form.

F2. Write the sample version of a linear model in matrix form given the subscript form.

F3. Write the sample version of a linear model in subscript form given a dataset and verbal description of the model OR write the sample version of a linear model in matrix form given a dataset and verbal description of the model.

F4. Explain how to obtain the least squares values of the coefficients and the fitted values.
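For F4, one way to check your understanding is to compute the least squares coefficients from the matrix formula b = (X^T X)^{-1} X^T y and compare with lm(). The sketch below does this on a small made-up dataset; the variable names x, y, and fitted_by_hand are chosen here for illustration.

```r
# Toy data, invented purely for illustration
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Design matrix: a column of ones (intercept) and the explanatory variable
X <- cbind(1, x)

# Least squares coefficients: b = (X^T X)^{-1} X^T y
b <- solve(t(X) %*% X) %*% t(X) %*% y

# Fitted values are X b
fitted_by_hand <- X %*% b

# lm() should give the same coefficients and fitted values
fit <- lm(y ~ x)
coef(fit)
fitted(fit)
```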


Properties of variance and covariance

V1. A numerical calculation to find the variance of a linear combination using matrix techniques.

V2. An algebraic calculation using basic definitions of variance & covariance, together with the linearity of expectation.
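A V1-style calculation can be sketched as follows: if the vector (Y1, Y2) has variance/covariance matrix V, then the linear combination W = a1 Y1 + a2 Y2 has variance a^T V a. The numbers below are hypothetical and chosen only to keep the arithmetic easy to follow by hand.

```r
# Hypothetical variance/covariance matrix for (Y1, Y2):
# Var(Y1) = 4, Var(Y2) = 9, Cov(Y1, Y2) = 1
V <- matrix(c(4, 1,
              1, 9), nrow = 2, byrow = TRUE)

# Coefficients of the linear combination W = 2*Y1 - Y2
a <- c(2, -1)

# Matrix calculation: Var(W) = a^T V a
var_W <- as.numeric(t(a) %*% V %*% a)

# The same answer from the scalar formula
# Var(a1*Y1 + a2*Y2) = a1^2 Var(Y1) + a2^2 Var(Y2) + 2 a1 a2 Cov(Y1, Y2)
var_W_check <- 2^2 * 4 + (-1)^2 * 9 + 2 * 2 * (-1) * 1

var_W   # 21
```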


Normal probability calculations

N1. A normal approximation to estimate a probability using the mean and variance.

N2. A normal approximation to find a region with a given probability using the mean and variance.
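The pnorm() and qnorm() calls for N1 and N2 might look like the sketch below. The mean and variance are hypothetical. Note that both functions take the standard deviation, not the variance, in their sd argument.

```r
# Hypothetical quantity with mean 100 and variance 225 (so sd = 15)
mu <- 100
sigma <- sqrt(225)

# N1: approximate probability of a value below 120
p_below <- pnorm(120, mean = mu, sd = sigma)

# N1: approximate probability of a value between 85 and 115
p_between <- pnorm(115, mean = mu, sd = sigma) - pnorm(85, mean = mu, sd = sigma)

# N2: central interval containing approximate probability 0.95
interval <- qnorm(c(0.025, 0.975), mean = mu, sd = sigma)
```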


The population version (or probability version) of the linear model

P1. Describe a suitable probability model, in subscript form, to give a population version of a linear model.

P2. Describe a suitable probability model, in matrix form, to give a population version of a linear model.

P3. Explain how R produces standard errors for coefficients in a linear model. Interpret the standard errors using the probability model.
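For P3, a minimal sketch of the calculation, assuming the usual probability model with independent errors of common variance: the variance of the coefficient estimates is sigma^2 (X^T X)^{-1}, and sigma^2 is estimated by the residual sum of squares divided by n - p. The toy data and the names s2 and se_by_hand are invented here for illustration.

```r
# Toy data, invented purely for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.3, 3.8, 6.1, 8.2, 9.7, 12.4)
X <- cbind(1, x)
n <- nrow(X)   # number of observations
p <- ncol(X)   # number of coefficients

# Least squares coefficients and residuals
b <- solve(t(X) %*% X) %*% t(X) %*% y
residuals_by_hand <- y - X %*% b

# Estimate of the error variance: residual sum of squares / (n - p)
s2 <- sum(residuals_by_hand^2) / (n - p)

# Standard errors: square roots of the diagonal of s2 * (X^T X)^{-1}
se_by_hand <- sqrt(diag(s2 * solve(t(X) %*% X)))

# Compare with the "Std. Error" column reported by summary(lm())
fit <- lm(y ~ x)
se_lm <- summary(fit)$coefficients[, "Std. Error"]
```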



Example: patient satisfaction in a hospital

The following survey data on a collection of hospital patients measure self-reported satisfaction, age, case severity, and anxiety. The hospital managers want to see whether satisfaction can be explained by the other variables, and, if so, which variables are important.

patients <- read.table("patients.txt",header=TRUE)
dim(patients)
## [1] 46  4
head(patients)
##   Satisfaction Age Severity Anxiety
## 1           48  50       51     2.3
## 2           57  36       46     2.3
## 3           66  40       48     2.2
## 4           70  41       44     1.8
## 5           89  28       43     1.8
## 6           36  49       54     2.9

(F1,F2,F3). Write the sample version of a linear model to address this question, in subscript form and matrix form.
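One possible way to set this up is sketched below. The notation is an assumption, not necessarily the one used in class: y_i is satisfaction for patient i, and x_{i1}, x_{i2}, x_{i3} are age, severity, and anxiety. The dataset has 46 rows, so i runs from 1 to 46.

```latex
% Subscript form (sample version)
y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + b_3 x_{i3} + e_i,
\qquad i = 1, \dots, 46

% Matrix form (sample version)
\mathbf{y} = \mathbb{X}\,\mathbf{b} + \mathbf{e},
\qquad
\mathbb{X} =
\begin{bmatrix}
1 & x_{1,1} & x_{1,2} & x_{1,3} \\
1 & x_{2,1} & x_{2,2} & x_{2,3} \\
\vdots & \vdots & \vdots & \vdots \\
1 & x_{46,1} & x_{46,2} & x_{46,3}
\end{bmatrix},
\qquad
\mathbf{b} = (b_0, b_1, b_2, b_3)^{\mathsf T}
```

Here y and e are vectors of length 46, and b_0, ..., b_3 are the least squares coefficients.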

(P1,P2). Write a probability model that can be used to assess the chance variation in the coefficients of the sample linear model. What is the source of this chance variation?
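A sketch of one suitable probability model follows, under the standard assumption of independent normal errors with common variance. The symbols are assumptions chosen for illustration: Y_i is the random satisfaction of patient i, and x_{i1}, x_{i2}, x_{i3} are age, severity, and anxiety, treated as fixed.

```latex
% Subscript form (population version)
Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \epsilon_i,
\qquad \epsilon_i \sim \text{iid } N(0, \sigma^2),
\qquad i = 1, \dots, 46

% Matrix form (population version)
\mathbf{Y} = \mathbb{X}\,\boldsymbol{\beta} + \boldsymbol{\epsilon},
\qquad
\boldsymbol{\epsilon} \sim \mathrm{MVN}\!\left(\mathbf{0},\, \sigma^2 \mathbb{I}\right)
```

Under this model the chance variation enters only through the error terms: a different sample of patients, or different unmeasured circumstances for the same patients, would give different satisfaction values and therefore different least squares coefficients.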

(P3). Explain how this probability model is used to obtain standard errors for the coefficient estimates.