
STAT 350: 97-1

Midterm, 19 February 1997

Instructor: Richard Lockhart


Instructions: This is an open book test. You may use notes, text, other books and a calculator. Your presentations of statistical analysis will be marked for clarity of explanation. I expect you to explain what assumptions you are making and to comment if those assumptions seem unreasonable. The exam is out of 25.


1.
Suppose two objects with weights $\alpha_1$ and $\alpha_2$ are weighed separately and then together. The resulting data points $Y_1$, $Y_2$ and $Y_3$ satisfy $Y_1 = \alpha_1 + \epsilon_1$, $Y_2 = \alpha_2 + \epsilon_2$ and $Y_3 = \alpha_1+\alpha_2+\epsilon_3$.

This question is straight from the notes; see Lecture 9.

(a)
What is the design matrix of this linear model? [2 marks]


\begin{displaymath}X=\left[\begin{array}{rr}
1 & 0 \\ 0 & 1 \\ 1 & 1 \end{array}\right]
\end{displaymath}

(b)
If

\begin{displaymath}(X^TX)^{-1}= \left[\begin{array}{rr} \frac{2}{3} & -\frac{1}{3} \\
-\frac{1}{3} & \frac{2}{3}\end{array}\right]\end{displaymath}

what is the hat matrix? [2 marks]


\begin{displaymath}H= X (X^TX)^{-1} X^T = \left[\begin{array}{rrr}
\frac{2}{3} & -\frac{1}{3} & \frac{1}{3} \\
-\frac{1}{3} & \frac{2}{3} & \frac{1}{3} \\
\frac{1}{3} & \frac{1}{3} & \frac{2}{3}
\end{array}\right]
\end{displaymath}
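As a check on the arithmetic, the product $X (X^TX)^{-1} X^T$ can be evaluated exactly with rational numbers. The following is only a sketch using Python's standard library; the helper names `matmul` and `transpose` are introduced here for illustration and are not part of the course notes.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


# Design matrix from part (a) and the inverse given in part (b).
X = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]
XtX_inv = [[Fraction(2, 3), Fraction(-1, 3)],
           [Fraction(-1, 3), Fraction(2, 3)]]

# Hat matrix H = X (X^T X)^{-1} X^T, computed without rounding error.
H = matmul(matmul(X, XtX_inv), transpose(X))
for row in H:
    print([str(h) for h in row])
```

Two quick sanity checks on any hat matrix: it should be idempotent ($HH = H$) and its trace should equal the number of parameters, here 2.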

(c)
Write $\hat\alpha_1$ in the form $a_1Y_1+a_2Y_2+a_3Y_3$ giving specific numerical values for the $a_i$. [2 marks]


\begin{displaymath}\hat\alpha =(X^TX)^{-1} X^TY
= \left[\begin{array}{rr} \frac{2}{3} & -\frac{1}{3} \\
-\frac{1}{3} & \frac{2}{3}\end{array}\right]
\left[\begin{array}{r} Y_1+Y_3 \\ Y_2 + Y_3 \end{array}\right]
\end{displaymath}

so that

\begin{displaymath}\hat\alpha_1 = \frac{2}{3} ( Y_1+Y_3) -\frac{1}{3}( Y_2 + Y_3) = \frac{2}{3} Y_1
-\frac{1}{3} Y_2 + \frac{1}{3} Y_3 \end{displaymath}

Thus $a_1 = 2/3$, $a_2 = -1/3$ and $a_3 = 1/3$.
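The coefficients can also be read off as the first row of $(X^TX)^{-1}X^T$. The sketch below computes that matrix exactly and then applies it to a set of readings; the values in `Y` are hypothetical, invented purely to illustrate the formula, and do not come from the exam.

```python
from fractions import Fraction

# (X^T X)^{-1} as given in part (b), and X^T from part (a).
XtX_inv = [[Fraction(2, 3), Fraction(-1, 3)],
           [Fraction(-1, 3), Fraction(2, 3)]]
Xt = [[1, 0, 1],
      [0, 1, 1]]

# A = (X^T X)^{-1} X^T; row i holds the weights applied to Y for alpha_i hat.
A = [[sum(XtX_inv[i][k] * Xt[k][j] for k in range(2)) for j in range(3)]
     for i in range(2)]
a1, a2, a3 = A[0]
print(a1, a2, a3)

# Hypothetical readings (assumed for illustration only):
# object 1 alone, object 2 alone, both together.
Y = [10, 20, 31]
alpha_hat = [sum(a * y for a, y in zip(row, Y)) for row in A]
print([str(v) for v in alpha_hat])
```

With these made-up readings the combined weighing pulls both estimates slightly above the individual readings, because $Y_3$ exceeds $Y_1 + Y_2$.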

(d)
What is the standard error of $\hat\alpha_1$? [2 marks]

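Since the variance-covariance matrix of $\hat\alpha$ is $\sigma^2 (X^TX)^{-1}$, the standard error is $\sigma \sqrt{(X^TX)^{-1}_{11} } = \sigma \sqrt{\frac{2}{3}}$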
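The same factor can be recovered from the coefficients in part (c): if the errors are uncorrelated with common variance $\sigma^2$ (an assumption of the usual linear model, stated here explicitly), then the variance of $a_1Y_1+a_2Y_2+a_3Y_3$ is $\sigma^2 \sum a_i^2$. A small sketch of that check:

```python
import math
from fractions import Fraction

# Coefficients of alpha_1 hat from part (c).
a = [Fraction(2, 3), Fraction(-1, 3), Fraction(1, 3)]

# Var(alpha_1 hat) / sigma^2 under uncorrelated, equal-variance errors.
var_factor = sum(ai * ai for ai in a)

# Standard error divided by sigma.
se_factor = math.sqrt(var_factor)
print(var_factor, se_factor)
```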

(e)
What is the variance of the residual corresponding to $Y_1$? [2 marks]


\begin{displaymath}\sigma^2 (1-H_{11}) = \frac{1}{3} \sigma^2\end{displaymath}
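The residual vector is $(I-H)Y$, so under uncorrelated errors with common variance $\sigma^2$ the variance of the first residual is $\sigma^2$ times the sum of squares of the first row of $I-H$, which equals $1-H_{11}$ because $I-H$ is symmetric and idempotent. The sketch below rebuilds $H$ from $X$ and $(X^TX)^{-1}$ and confirms both expressions agree; the helper `matmul` is an illustrative name, not something from the notes.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


X = [[1, 0], [0, 1], [1, 1]]
Xt = [list(col) for col in zip(*X)]
XtX_inv = [[Fraction(2, 3), Fraction(-1, 3)],
           [Fraction(-1, 3), Fraction(2, 3)]]

H = matmul(matmul(X, XtX_inv), Xt)

# M = I - H maps Y to the residual vector.
M = [[(1 if i == j else 0) - H[i][j] for j in range(3)] for i in range(3)]

# Var(residual 1) / sigma^2, computed two ways.
from_row = sum(m * m for m in M[0])
from_diag = 1 - H[0][0]
print(from_row, from_diag)
```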

2.
A company measures its annual sales $Y$ in each of 26 regions, along with the values of 4 covariates: $X_1$, the advertising expenditure in the region; $X_2$, the number of active accounts in the region; $X_3$, the number of competing brands; and $X_4$, a measure of the potential for sales in the region. I attach some SAS code and an edited version of the output.

(a)
Is the regression significant? [3 marks]


\begin{displaymath}F= 479.10 \qquad P=0.0001 \end{displaymath}

so the regression is definitely significant.

(b)
Can advertising expenditure and sales potential be dropped from the full model? [3 marks]

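This is an extra sum of squares F-test. The two error sums of squares are 2210.44 for the reduced model (without advertising expenditure and sales potential) and 1937.14 for the full model, on 23 and 21 degrees of freedom respectively. The denominator is the full model mean squared error, 1937.14/21 = 92.24. Thus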

\begin{displaymath}F= \frac{( 2210.44 - 1937.14 ) / 2}{92.24} = 1.48 \, .
\end{displaymath}

Compare this to the critical points $F_{2,21,0.5} = 0.72$ and $F_{2,21,0.9} = 2.57$. The observed value lies between them, so the P-value is between 0.10 and 0.50 (it is around 0.25) and the two covariates may, indeed, be dropped from the model.
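The calculation can be reproduced in a few lines. The sketch below uses only the sums of squares and degrees of freedom quoted above; the function names are illustrative. The P-value uses the closed form $P(F > f) = (1 + 2f/\nu)^{-\nu/2}$, which holds only when the numerator has exactly 2 degrees of freedom, so it should not be reused for other tests.

```python
def extra_ss_f(sse_reduced, df_reduced, sse_full, df_full):
    """Extra sum of squares F statistic for nested linear models."""
    q = df_reduced - df_full            # number of coefficients tested
    mse_full = sse_full / df_full       # full model mean squared error
    return ((sse_reduced - sse_full) / q) / mse_full


def f_upper_tail_2df(f, df_den):
    """P(F > f) for an F distribution with 2 and df_den degrees of freedom.

    Closed form valid only for 2 numerator degrees of freedom.
    """
    return (1.0 + 2.0 * f / df_den) ** (-df_den / 2.0)


f_stat = extra_ss_f(2210.44, 23, 1937.14, 21)
p_value = f_upper_tail_2df(f_stat, 21)
print(round(f_stat, 2), round(p_value, 2))
```

The same tail function applied to 0.72 and 2.57 gives roughly 0.50 and 0.10, in line with the tabled critical points.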

(c)
In a model which includes all 4 covariates test the hypothesis that the advertising expenditure is an unimportant predictor. [3 marks]

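The relevant t statistic is 1.67 with a P-value of 0.1094, which is not significant at the 5% level. We therefore do not reject the hypothesis that the coefficient of advertising expenditure is 0 and conclude that, given the other three covariates, advertising expenditure is an unimportant predictor.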

(d)
What final fitted model seems best? (You will not be able to examine plots or diagnostics). [3 marks]

The covariates $X_2$ and $X_3$ cannot be deleted from the model, but $X_1$ and $X_4$ can, so the best fitted model is

\begin{displaymath}\hat{Y} = 186.7 + 3.41 X_2 - 21.2 X_3 \, .
\end{displaymath}

(e)
Give a 95% confidence interval for the coefficient of $X_3$. [3 marks]

You should get the standard error and estimates from the model with $X_2$ and $X_3$. This gives

\begin{displaymath}\hat\beta_3 \pm t_{0.025,23} \hat\sigma_{\hat\beta_3} = -21.19 \pm 2.069\times 0.80 \, .
\end{displaymath}
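Evaluating the endpoints is simple arithmetic; a short sketch follows, using the estimate, standard error and critical value exactly as quoted above. Since the standard error is reported to only two decimals, the endpoints are approximate.

```python
estimate = -21.19    # coefficient of X3 in the model with X2 and X3
std_error = 0.80     # its estimated standard error (rounded in the output)
t_crit = 2.069       # upper 2.5% point of t on 23 degrees of freedom

half_width = t_crit * std_error
lower = estimate - half_width
upper = estimate + half_width
print(f"({lower:.2f}, {upper:.2f})")
```

The interval runs from about -22.85 to -19.53 and excludes 0, consistent with keeping $X_3$ in the model.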

Complete SAS output.



Richard Lockhart
1999-01-19