Homework 6, STATS 401 W18

Exercises on variance, covariance and the normal distribution
Understanding how lm() obtains standard errors

Write solutions to the exercises in the first half of the homework. For the data analysis, you do not have to report anything for questions 1–4. For questions 5 and 6, report your code together with a brief explanation. Question 7 asks you to carefully write out the probability model you have used for the standard errors. Recall that you are permitted to collaborate, or to use any internet resources, but you must list all sources that make a substantial contribution to your report. As usual, following the syllabus, you are also requested to give some feedback in a “Please explain” statement.

Exercises on variance, covariance and the normal distribution

Suppose \(X\) and \(Y\) are bivariate random variables with means \(\mu_X\) and \(\mu_Y\) respectively. Use the linearity of expectation, together with the formulas \[ {\mathrm{Var}}(X)={\mathrm{E}}[X^2]-({\mathrm{E}}[X])^2, \quad {\mathrm{Cov}}(X,Y)={\mathrm{E}}[XY]-{\mathrm{E}}[X]\, {\mathrm{E}}[Y], \] to show that \[ {\mathrm{Var}}(X+Y)={\mathrm{Var}}(X)+{\mathrm{Var}}(Y) +2{\mathrm{Cov}}(X,Y). \] This is similar to a calculation we did in class, but here it is a little easier since you can start from the above formulas for \({\mathrm{Var}}(X)\) and \({\mathrm{Cov}}(X,Y)\) rather than going all the way back to the basic definition \[ {\mathrm{Var}}(X)={\mathrm{E}}\big[ (X- {\mathrm{E}}[X])^2\big], \quad {\mathrm{Cov}}(X,Y)={\mathrm{E}}\big[ (X-{\mathrm{E}}[X])(Y-{\mathrm{E}}[Y]) \big]. \]
Let \({\boldsymbol{\mathrm{Z}}}=(Z_1,Z_2,Z_3)\) be a vector of independent, unit variance, mean zero random variables. In other words, suppose \({\boldsymbol{\mathrm{Z}}}\) has \({\mathrm{E}}[{\boldsymbol{\mathrm{Z}}}]={\boldsymbol{\mathrm{0}}}\) and variance matrix \[{\mathbb{I}}=\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}.\]
1. Use R to find the variance matrix \({\mathrm{Var}}({\boldsymbol{\mathrm{Y}}})\), when \({\boldsymbol{\mathrm{Y}}}\) is defined by \[ {\boldsymbol{\mathrm{Y}}} = \begin{bmatrix} 6 & 3 & 1 \\ 0 & 5 & 2 \\ 0 & 0 & 4 \\ \end{bmatrix} {\boldsymbol{\mathrm{Z}}} \]
2. Calculating by hand, use your expression for \({\mathrm{Var}}({\boldsymbol{\mathrm{Y}}})\) to find the variance of \(Y_1-Y_2\). You can do this using the formula \({\mathrm{Var}}(X-Y)={\mathrm{Var}}(X)+{\mathrm{Var}}(Y)- 2{\mathrm{Cov}}(X,Y)\) or by writing \[ Y_1 - Y_2 = \begin{bmatrix} 1 & -1 & 0\end{bmatrix} \begin{bmatrix}Y_1 \\ Y_2 \\ Y_3 \end{bmatrix} = {\mathbb{A}} {\boldsymbol{\mathrm{Y}}} \] for \({\mathbb{A}} = \begin{bmatrix} 1 & -1 & 0\end{bmatrix}\), and then using the formula \({\mathrm{Var}}({\mathbb{A}}{\boldsymbol{\mathrm{Y}}})= {\mathbb{A}}{\mathrm{Var}}({\boldsymbol{\mathrm{Y}}}){\mathbb{A}}^{{\scriptscriptstyle \mathrm{T}}}\). (Optionally, you can use R for this, and if you do this then the matrix version is easier to code.)
3. Supposing that \(Y_1-Y_2\) is normally distributed, use pnorm() to find \({\mathrm{P}}[ Y_1-Y_2>3]\).
Let \({\boldsymbol{\mathrm{X}}}=(X_1,\dots,X_n)\) be a random vector with expectation \({\mathrm{E}}[{\boldsymbol{\mathrm{X}}}]={\boldsymbol{\mathrm{\mu}}}\) for \({\boldsymbol{\mathrm{\mu}}}=(\mu_1,\dots,\mu_n)\). Let \({\mathbb{V}}\) be the \(n\times n\) variance matrix of \({\boldsymbol{\mathrm{X}}}\), so \({\mathbb{V}}_{ij}={\mathrm{Cov}}(X_i,X_j)\) for \(i=1,\dots,n\) and \(j=1,\dots,n\).
1. Interpreting \({\boldsymbol{\mathrm{X}}}\) and \({\boldsymbol{\mathrm{\mu}}}\) as column vectors, show that
  
  \([\mathrm{Eq.~1}] \hspace{1.5cm} {\mathbb{V}}= {\mathrm{E}}[{\boldsymbol{\mathrm{X}}}{\boldsymbol{\mathrm{X}}}^{{\scriptscriptstyle \mathrm{T}}}]- {\boldsymbol{\mathrm{\mu}}}{\boldsymbol{\mathrm{\mu}}}^{{\scriptscriptstyle \mathrm{T}}}.\)
2. Now let \({\boldsymbol{\mathrm{Y}}}=(Y_1,\dots,Y_m)\) be a random vector with expectation \({\mathrm{E}}[{\boldsymbol{\mathrm{Y}}}]={\boldsymbol{\mathrm{\nu}}}\) for \({\boldsymbol{\mathrm{\nu}}}=(\nu_1,\dots,\nu_m)\). Let \({\mathbb{A}}=[a_{ij}]_{n\times m}\) be an arbitrary constant \(n\times m\) matrix. Apply [Eq. 1] with \({\boldsymbol{\mathrm{X}}}={\mathbb{A}}{\boldsymbol{\mathrm{Y}}}\) and use the matrix linearity of expectation (i.e., summation and multiplication by a constant matrix can be moved through \({\mathrm{E}}\)) to show that
  
  \([\mathrm{Eq.~2}] \hspace{1.5cm} {\mathrm{Var}}({\mathbb{A}}{\boldsymbol{\mathrm{Y}}}) = {\mathbb{A}}{\mathrm{Var}}({\boldsymbol{\mathrm{Y}}}){\mathbb{A}}^{{\scriptscriptstyle \mathrm{T}}}.\)
  
  You will need the formula we gave earlier in the notes for the transpose of a matrix product, \(\big({\mathbb{A}}{\boldsymbol{\mathrm{Y}}}\big)^{{\scriptscriptstyle \mathrm{T}}}= {\boldsymbol{\mathrm{Y}}}^{{\scriptscriptstyle \mathrm{T}}}{\mathbb{A}}^{{\scriptscriptstyle \mathrm{T}}}\). This problem has given us another way to find the formula [Eq. 2] which we demonstrated earlier in class.

Understanding how `lm()` obtains standard errors

Read the analysis of newspaper circulation data in Section 1.2.2 of Sheather. This example is continued in Section 6.2.2 of Sheather. You are now in a position to read most of this section too, but you are not required to do so at this point.

Make a local copy of the data, available at https://ionides.github.io/401w18/hw/hw06/circulation.txt. The columns in this data file are separated by a tab, represented in R by sep="\t". Since there are spaces within some newspaper names, read.table(....,sep=" ") does not work. Instead, use

circulation <- read.table("circulation.txt",sep="\t",header=T)

Transform the data. Add two new columns to the dataframe called log_Sunday and log_Weekday containing the natural logarithm of the corresponding columns. The R command log() gives this natural logarithm, also known as log to the base \(e\). We’ll discuss later in class how and why to choose a suitable transformation of the data, which is an important decision for data analysis.
Build the model in R. Create a linear model called lm1 by fitting the logarithm of weekday circulation and the binary variable for tabloid competitor as explanatory variables for the logarithm of Sunday circulation. Your code may look something like

lm1 <- lm(log_Sunday~log_Weekday+Tabloid_with_serious_competitor,data=circulation)

Set X to be the design matrix using the model.matrix() command by typing

X <- model.matrix(lm1)

Use the design matrix and the response variable to compute the least squares coefficients and their standard errors by matrix calculations in R. It may be helpful to set

y <- circulation$log_Sunday

Check that these match the output of summary(lm1). Also, check that your calculation of the estimated standard deviation of the measurement error matches the residual standard error offered by summary(lm1). Why do you think summary(lm1) says that this is computed on 86 degrees of freedom?
Write out in mathematical notation the probability model used to contruct these standard errors. Be careful to define the notation you use. Specify a letter for each quantity - you can use words to help define the quantities in your equation, but you should usually avoid words in an equation. You can write your equation either using vectors & matrices or by using subscripts to denote each unit \(i\) and specifying the range of values of \(i\) for which the equation holds. Be explicit about what quantities are random vectors. If you define a measurement error model, be sure to specify all means, variances and covariances for the error random variables.

License: This material is provided under an MIT license
Acknowledgement: The linear model fitting problem develops an example from S. J. Sheather (2009) “A Modern Approach to Regression with R.”

Homework 6, STATS 401 W18

Due in your lab on 2/15 or 2/16

Exercises on variance, covariance and the normal distribution

Understanding how lm() obtains standard errors

Understanding how `lm()` obtains standard errors