STAT 350: 97-1

Midterm, 19 February 1997

Instructor: Richard Lockhart

Instructions: This is an open book test. You may use notes, text, other books and a calculator. Your presentations of statistical analysis will be marked for clarity of explanation. I expect you to explain what assumptions you are making and to comment if those assumptions seem unreasonable. The exam is out of 25.

1.

Suppose two objects with weights $\alpha_1$ and $\alpha_2$ are weighed separately and then together. The resulting data points Y₁, Y₂ and Y₃ satisfy $Y_1 = \alpha_1 + \epsilon_1$, $Y_2 = \alpha_2 + \epsilon_2$ and $Y_3 = \alpha_1+\alpha_2+\epsilon_3$.

This question is straight from the notes. See Lecture 9.

(a)

What is the design matrix of this linear model? [2 marks]

$\begin{displaymath}X=\left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{array}\right] \end{displaymath}$

(b)

Given that

$\begin{displaymath}(X^TX)^{-1}= \left[\begin{array}{rr} \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3}\end{array}\right]\end{displaymath}$

what is the hat matrix? [2 marks]

$\begin{displaymath}H= X (X^TX)^{-1} X^T = \left[\begin{array}{rrr} \frac{2}{3} & -\frac{1}{3} & \frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{2}{3} \end{array}\right] \end{displaymath}$
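The matrix arithmetic in parts (a) and (b) can be checked with exact fractions. The following is a minimal sketch using only the Python standard library; the helper names `matmul`, `transpose` and `inverse_2x2` are illustrative and are not part of the course materials.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def inverse_2x2(M):
    """Invert a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Design matrix from part (a): rows correspond to Y1, Y2, Y3.
X = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]

Xt = transpose(X)
XtX_inv = inverse_2x2(matmul(Xt, X))      # (X^T X)^{-1}
H = matmul(matmul(X, XtX_inv), Xt)        # hat matrix X (X^T X)^{-1} X^T

for row in H:
    print([str(h) for h in row])
```

The three printed rows are the rows of H; its diagonal entries are all 2/3, which is the value of H₁₁ needed in part (e).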

(c)

Write $\hat\alpha_1$ in the form a₁Y₁+a₂Y₂+a₃Y₃, giving specific numerical values for the $a_i$. [2 marks]

$\begin{displaymath}\hat\alpha =(X^TX)^{-1} X^TY = \left[\begin{array}{rr} \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3}\end{array}\right] \left[\begin{array}{r} Y_1+Y_3 \\ Y_2 + Y_3 \end{array}\right] \end{displaymath}$

so that

$\begin{displaymath}\hat\alpha_1 = \frac{2}{3} ( Y_1+Y_3) -\frac{1}{3}( Y_2 + Y_3) = \frac{2}{3} Y_1 -\frac{1}{3} Y_2 + \frac{1}{3} Y_3 \end{displaymath}$

Thus a₁ = 2/3, a₂ = -1/3 and a₃ = 1/3.
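The weights can also be read off as the first row of $(X^TX)^{-1}X^T$. Here is a small sketch of that calculation in exact arithmetic; the function name `alpha1_hat` and the readings used to try it out are made up for illustration.

```python
from fractions import Fraction

# (X^T X)^{-1} as given in part (b), and the design matrix X from part (a).
XtX_inv = [[Fraction(2, 3), Fraction(-1, 3)],
           [Fraction(-1, 3), Fraction(2, 3)]]
X = [[1, 0], [0, 1], [1, 1]]

# Row j of (X^T X)^{-1} X^T holds the weights on Y1, Y2, Y3 for alpha_hat_(j+1).
weights = [[sum(XtX_inv[j][k] * X[i][k] for k in range(2)) for i in range(3)]
           for j in range(2)]

a = weights[0]  # weights a1, a2, a3 for alpha_hat_1


def alpha1_hat(y):
    """Least squares estimate of alpha_1 from the readings y = [Y1, Y2, Y3]."""
    return sum(ai * yi for ai, yi in zip(a, y))


# Hypothetical noise-free readings for objects weighing 3 and 6 units.
print([str(ai) for ai in a], alpha1_hat([3, 6, 9]))
```

With the noise-free readings the estimate recovers the first weight exactly, as it should for an unbiased linear estimator.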

(d)

What is the standard error of $\hat\alpha_1$ ? [2 marks]

$\sigma \sqrt{(X^TX)^{-1}_{11} } = \sigma \sqrt{\frac{2}{3}}$
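As a check on this value, and assuming (as the model implicitly does) that the errors are independent with common variance $\sigma^2$, the variance of $\hat\alpha_1$ can be computed directly from the weights found in part (c):

$\begin{displaymath}{\rm Var}(\hat\alpha_1) = \sigma^2 (a_1^2+a_2^2+a_3^2) = \sigma^2 \left(\frac{4}{9}+\frac{1}{9}+\frac{1}{9}\right) = \frac{2}{3} \sigma^2 \, , \end{displaymath}$

whose square root is the standard error given above.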

(e)

What is the variance of the residual corresponding to Y₁? [2 marks]

$\begin{displaymath}\sigma^2 (1-H_{11}) = \frac{1}{3} \sigma^2\end{displaymath}$
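A brief sketch of where this formula comes from, again assuming independent errors with common variance $\sigma^2$ and writing $\hat\epsilon$ for the residual vector (notation introduced here for illustration): $\hat\epsilon = (I-H)Y$ and $I-H$ is symmetric and idempotent, so

$\begin{displaymath}{\rm Var}(\hat\epsilon) = \sigma^2 (I-H)(I-H)^T = \sigma^2 (I-H) \, , \qquad {\rm Var}(\hat\epsilon_1) = \sigma^2 \left(1-\frac{2}{3}\right) = \frac{1}{3} \sigma^2 \, . \end{displaymath}$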

2.

A company measures its annual sales Y in each of 26 regions, along with the values of 4 covariates, X₁, the advertising expenditure in the region, X₂, the number of active accounts in the region, X₃, the number of competing brands, and X₄, a measure of the potential for sales in the region. I attach some SAS code and an edited version of the output.

(a)

Is the regression significant? [3 marks]

$\begin{displaymath}F= 479.10 \qquad P=0.0001 \end{displaymath}$

so the regression is definitely significant.

(b)

Can advertising expenditure and sales potential be dropped from the full model? [3 marks]

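This is an extra sum of squares F-test. The two error sums of squares are 2210.44 on 23 degrees of freedom (the reduced model, without X₁ and X₄) and 1937.14 on 21 degrees of freedom (the full model). The denominator is the full model mean squared error, 1937.14/21 = 92.24. Thus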

$\begin{displaymath}F= \frac{( 2210.44 - 1937.14 ) / 2}{92.24} = 1.48 \, . \end{displaymath}$

Compare this to the critical points $F_{2,21,0.5} = 0.72$ and $F_{2,21,0.9} = 2.57$: the statistic lies between them, so the P-value is between 0.1 and 0.5 and the two covariates may, indeed, be dropped from the model. (The P-value is around 0.25.)
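The arithmetic of this test can be reproduced in a few lines. The sketch below uses only the sums of squares quoted above; the variable names are illustrative. It relies on the fact that an F distribution with exactly 2 numerator degrees of freedom has a closed-form upper tail, so no statistical library is needed; for other numerator degrees of freedom a table or statistical package would be required.

```python
# Error sums of squares and degrees of freedom quoted in the solution.
sse_reduced, df_reduced = 2210.44, 23   # model without X1 and X4
sse_full, df_full = 1937.14, 21         # model with all four covariates

num_df = df_reduced - df_full           # number of coefficients being tested
mse_full = sse_full / df_full           # full-model mean squared error
f_stat = ((sse_reduced - sse_full) / num_df) / mse_full

# Closed form valid only when the numerator degrees of freedom equal 2:
# P(F > f) = (1 + 2 f / d2) ** (-d2 / 2), with d2 the denominator df.
assert num_df == 2
p_value = (1 + 2 * f_stat / df_full) ** (-df_full / 2)

print(f"F = {f_stat:.2f}, P = {p_value:.2f}")
```

This reproduces F = 1.48 and a P-value of about 0.25.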

(c)

In a model which includes all 4 covariates test the hypothesis that the advertising expenditure is an unimportant predictor. [3 marks]

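The relevant t statistic is 1.67 with a P-value of 0.1094, so at the 5% level we do not reject the hypothesis that the coefficient of advertising expenditure is 0. We would conclude that, once the other three covariates are in the model, advertising expenditure is an unimportant predictor.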

(d)

What final fitted model seems best? (You will not be able to examine plots or diagnostics). [3 marks]

The covariates X₂ and X₃ cannot be deleted from the model, but X₁ and X₄ can, so the best fitted model is

$\begin{displaymath}\hat{Y} = 186.7 + 3.41 X_2 - 21.2 X_3 \, . \end{displaymath}$

(e)

Give a 95% confidence interval for the coefficient of X₃. [3 marks]

You should get the standard error and estimate from the model with only X₂ and X₃, which has 23 error degrees of freedom. This gives

$\begin{displaymath}\hat\beta_3 \pm t_{0.025,23} \hat\sigma_{\hat\beta_3} = -21.19 \pm 2.069\times 0.80 \, . \end{displaymath}$
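Carrying the arithmetic through gives explicit endpoints. A minimal sketch, using only the three numbers in the displayed formula (the variable names are illustrative):

```python
estimate = -21.19    # fitted coefficient of X3 in the model with X2 and X3
std_error = 0.80     # its estimated standard error
t_crit = 2.069       # upper 2.5% point of the t distribution on 23 df

half_width = t_crit * std_error
lower, upper = estimate - half_width, estimate + half_width

print(f"95% CI for the coefficient of X3: ({lower:.2f}, {upper:.2f})")
```

The interval runs from about -22.85 to -19.53. It excludes 0, which is consistent with keeping X₃ in the model.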

Complete SAS output.

Richard Lockhart
1999-01-19