STAT 350: 97-1

Final Exam, 8 April 1997
Instructor: Richard Lockhart


Instructions: This is an open book test. You may use notes, text, other books and a calculator. Your presentations of statistical analysis will be marked for clarity of explanation. I expect you to explain what assumptions you are making and to comment if those assumptions seem unreasonable. The exam is out of 60.


1.
When a weight is hung from a wire, the wire stretches (returning to its original length when the weight is removed). A 1 kilogram weight is hung from a piece of wire and the length stretched is measured. This is repeated, and the two resulting lengths are $L_{1,1}$ and $L_{1,2}$. Then a 2 kilogram weight is tried 3 times, resulting in lengths $L_{2,1}$, $L_{2,2}$ and $L_{2,3}$. To save analysis effort the experimenter averages the two measurements made with the 1 kilogram weight, obtaining $Y_1=(L_{1,1} + L_{1,2})/2$, and the 3 measurements made with the 2 kilogram weight, obtaining $Y_2=(L_{2,1}+L_{2,2}+L_{2,3})/3$. [Total of 20 marks]

(a)
Assume that the individual lengths satisfy, for $i=1,2$ and $j=1,2$ (when $i=1$) or $j=1,2,3$ (when $i=2$),

\begin{displaymath}L_{i,j} = x_{i,j}\beta + \epsilon_{i,j}
\end{displaymath}

where $x_{i,j}=i$ is the weight in kilograms and the errors $\epsilon_{i,j}$ are independent normal variables with mean 0 and variance $\sigma^2$. What is the design matrix for this linear model? [2 marks]

Solution:

\begin{displaymath}X=\left[\begin{array}{r} 1 \\ 1 \\ 2 \\ 2 \\ 2\end{array}\right]
\end{displaymath}

(b)
Give an explicit, simple formula for the least squares estimate of $\beta$; I do not want a general formula such as $(X^TX)^{-1}X^TY$. [4 marks]

Solution:

\begin{displaymath}X^TX = [14] \qquad (X^TX)^{-1} = [1/14]
\end{displaymath}

and

\begin{displaymath}X^TY = L_{1,1}+L_{1,2}+2L_{2,1}+2L_{2,2}+2L_{2,3}
\end{displaymath}

so that

\begin{displaymath}\hat\beta = \frac{ L_{1,1}+L_{1,2}+2L_{2,1}+2L_{2,2}+2L_{2,3}}{14} \, .\end{displaymath}
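As a numerical sanity check (not part of the exam solution), the general least squares formula does reduce to the simple one above; the length values in this sketch are hypothetical.

```python
# Sanity check: with design matrix X = (1,1,2,2,2)^T the general formula
# (X^T X)^{-1} X^T Y reduces to (L11 + L12 + 2 L21 + 2 L22 + 2 L23)/14.
import numpy as np

X = np.array([[1.0], [1.0], [2.0], [2.0], [2.0]])
L = np.array([1.1, 0.9, 2.2, 1.8, 2.0])           # hypothetical lengths

beta_hat = np.linalg.solve(X.T @ X, X.T @ L)[0]   # general formula
by_hand = (L[0] + L[1] + 2 * (L[2] + L[3] + L[4])) / 14
assert np.isclose(beta_hat, by_hand)
```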

(c)
Give the mean and variance of the estimator in (b). [2 marks]

Solution:

\begin{displaymath}Var(\hat\beta) = \frac{1}{14^2} (\sigma^2+\sigma^2+4\sigma^2+4\sigma^2+4\sigma^2) =
\frac{\sigma^2}{14} = \sigma^2 (X^TX)^{-1}
\end{displaymath}

and

\begin{displaymath}E(\hat\beta) = \beta \, .
\end{displaymath}

(d)
The average measurements $Y_i$ also satisfy a linear model

\begin{displaymath}Y_i = x_i \gamma + \epsilon_i
\end{displaymath}

i.
What is $\gamma$ in terms of $\beta$? [1 mark]

Solution:

\begin{displaymath}\gamma=\beta
\end{displaymath}

ii.
What is the joint distribution of $(\epsilon_1,\epsilon_2)$? In particular what are the variances and means of each $\epsilon_i$? [3 marks]

Solution:

\begin{displaymath}\left[\begin{array}{c}\epsilon_1 \\ \epsilon_2\end{array}\right]
= \left[\begin{array}{ccccc}
\frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\
0 & 0 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3}
\end{array}\right]
\left[\begin{array}{c}
\epsilon_{1,1} \\ \epsilon_{1,2} \\ \epsilon_{2,1} \\ \epsilon_{2,2} \\ \epsilon_{2,3}
\end{array}\right]
\end{displaymath}

so that $(\epsilon_1,\epsilon_2)$ has a multivariate normal distribution with mean 0 and variance covariance matrix

\begin{displaymath}\Sigma = \sigma^2\left[\begin{array}{ccccc}
\frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\
0 & 0 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3}
\end{array}\right]
\left[\begin{array}{ccccc}
\frac{1}{2} & \frac{1}{2} & 0 & 0 & 0 \\
0 & 0 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3}
\end{array}\right]^T
= \sigma^2\left[\begin{array}{cc}
\frac{1}{2} & 0 \\
0 & \frac{1}{3}
\end{array}\right]
\end{displaymath}
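The covariance computation can be checked numerically (a sketch, not part of the original solution): applying the averaging matrix $A$ to the five original errors gives, in units of $\sigma^2$, the covariance matrix $AA^T$.

```python
# Check: epsilon_1 and epsilon_2 are the averaging matrix A applied to
# the five original errors, so (in units of sigma^2) their covariance
# matrix is A A^T = diag(1/2, 1/3).
import numpy as np

A = np.array([[1/2, 1/2, 0, 0, 0],
              [0, 0, 1/3, 1/3, 1/3]])
Sigma_over_sigma2 = A @ A.T
assert np.allclose(Sigma_over_sigma2, np.diag([1/2, 1/3]))
```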

iii.
What is the design matrix of this linear model? [1 mark]

Solution:

\begin{displaymath}X_a =\left[\begin{array}{r}
1 \\ 2\end{array}\right] \, .
\end{displaymath}

(e)
Show that the weighted least squares estimate of $\gamma$ is

\begin{displaymath}\hat\gamma = (Y_1+3Y_2)/7
\end{displaymath}

[4 marks]

Solution: The variances of the errors are $\sigma^2/2$ and $\sigma^2/3$ so that the weights are w1=2 and w2=3. Then

\begin{displaymath}X_a^TW X_a = \left[\begin{array}{cc} 1 & 2\end{array}\right]
\left[\begin{array}{cc} 2 & 0 \\ 0 & 3 \end{array}\right]
\left[\begin{array}{c} 1 \\ 2 \end{array}\right]
= \left[\begin{array}{c} 14 \end{array}\right]
\end{displaymath}

and

\begin{displaymath}X_a^TW Y = 2Y_1+6Y_2
\end{displaymath}

so that

\begin{displaymath}\hat\gamma = \frac{2 Y_1+6Y_2}{14} = \frac{\sum L_{i,j}}{14}\, .\end{displaymath}
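The weighted least squares arithmetic can be verified with a short sketch (not part of the original solution; the $Y$ values are hypothetical):

```python
# Check of part (e): weighted least squares with weights w1 = 2, w2 = 3
# on the averages Y1, Y2 gives (2 Y1 + 6 Y2)/14 = (Y1 + 3 Y2)/7.
import numpy as np

Xa = np.array([[1.0], [2.0]])
W = np.diag([2.0, 3.0])           # reciprocals of the relative variances
Y = np.array([1.0, 2.0])          # hypothetical averages Y1, Y2

gamma_hat = np.linalg.solve(Xa.T @ W @ Xa, Xa.T @ W @ Y)[0]
assert np.isclose(gamma_hat, (Y[0] + 3 * Y[1]) / 7)
```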

(f)
What is the distribution of $\hat\gamma$? [2 marks]

Solution: Normal with mean $\gamma=\beta$ and variance $\sigma^2/14$.

(g)
Why would analysis of the original variables $L_{i,j}$ be better than analysis of the $Y_i$? [1 mark]

Solution: We would have 4 degrees of freedom for error rather than 1.

2.
A variable Y (a measurement of oxygen taken up by a system) is regressed on 4 predictors $X_1,\ldots,X_4$. A total of 20 measurements were made and Y was regressed on various subsets of the predictor variables leading to the following table of Error Sums of Squares.

Vars  ESS    Vars    ESS    Vars      ESS    Vars        ESS
X1    154    X1,X2   109    X2,X4     133    X1,X3,X4    139
X2    156    X1,X3   144    X3,X4     175    X2,X3,X4    132
X3    203    X1,X4   146    X1,X2,X3  106    All         104
X4    250    X2,X3   150    X1,X2,X4  107    None        506

(a)
Does adding the variables X3 and X4 to the model containing X1 and X2 significantly improve the fit? [6 marks]

Solution: This compares the model with all four variables to the model with just X1 and X2, so

\begin{displaymath}F = \frac{(109-104)/2}{104/15} = 37.5/104 \approx 0.36
\end{displaymath}

This is much less than 1, so the added variables are not significant.

(b)
Use Backwards selection with a 10% significance level to stay to select a suitable subset of regression variables. [8 marks]

Solution: We begin with all variables. Among the 3 variable models the model containing only X1, X2 and X3 has the smallest error sum of squares so if we delete a variable it must be X4. The F statistic is

\begin{displaymath}F = \frac{(106-104)/1}{104/15}=30/104
\end{displaymath}

so we delete X4. Among the two-variable models built from X1, X2 and X3, the model containing X1 and X2 has the smallest error sum of squares, so we next try to delete X3, getting

\begin{displaymath}F = \frac{(109-106)/1}{106/16}=48/106
\end{displaymath}

which is still far from significant. We delete X3 and look at the one-variable models using either X1 or X2. The smallest error sum of squares is for X1, so we try to delete X2, getting

\begin{displaymath}F = \frac{(154-109)/1}{109/17}= 7.02
\end{displaymath}

We compare this to the F tables with 1 numerator and 17 denominator degrees of freedom and see that $P \approx 0.02$ so that X2 and X1 will be retained.
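The arithmetic of the three deletion tests can be checked with a short script (a sketch, not part of the original solution); each model contains an intercept plus the listed slopes, with $n=20$ observations.

```python
# Backward-selection F statistics from the ESS table above.
n = 20
ess = {("X1", "X2", "X3", "X4"): 104, ("X1", "X2", "X3"): 106,
       ("X1", "X2"): 109, ("X1",): 154}

def drop_F(full, reduced):
    # F statistic for dropping the variables in `full` but not in `reduced`
    q = len(full) - len(reduced)
    df_error = n - len(full) - 1          # intercept + |full| slopes
    return ((ess[reduced] - ess[full]) / q) / (ess[full] / df_error)

F_drop_X4 = drop_F(("X1", "X2", "X3", "X4"), ("X1", "X2", "X3"))  # about 0.29
F_drop_X3 = drop_F(("X1", "X2", "X3"), ("X1", "X2"))              # about 0.45
F_drop_X2 = drop_F(("X1", "X2"), ("X1",))                         # about 7.0
```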

(c)
If the estimated slope associated with X1 in the model including only X1 and X2 as predictors is positive, what is the value of the t statistic for testing the hypothesis that the true coefficient of X1 is 0? [1 mark]

Solution: $t=\sqrt{F} = \sqrt{[(156-109)/1]/[109/17]} = 2.71$.

3.
Five different treatments, A, B, C, D and E, are to be examined for their effect on blood pressure. Fifty patients are randomly split into 5 groups of 10. The initial blood pressure X of each patient is measured, the treatment is applied and then the final blood pressure Y is measured. Let $i=1,\ldots,5$ label the treatment and $j=1,\ldots,10$ label the patient within the treatment group. Three models were fitted:

Model I

\begin{displaymath}Y_{i,j} = \alpha + \beta X_{i,j} + \epsilon_{i,j}
\end{displaymath}

the error sum of squares is 85355 and the estimates are
$\hat\alpha$ $\hat\beta$
37.26 0.65

Model II

\begin{displaymath}Y_{i,j} = \mu_i + \beta X_{i,j} + \epsilon_{i,j}
\end{displaymath}

the error sum of squares is 66115 and the estimates are
$\hat\mu_1$ $\hat\mu_2$ $\hat\mu_3$ $\hat\mu_4$ $\hat\mu_5$ $\hat\beta$
14.2424 67.5325 48.3918 49.6033 68.7786 0.5509
For this model

\begin{displaymath}(X^TX)^{-1} =
\left[\begin{array}{cccccc}
1.026 & 0.995 & 0.924 & \cdots & \cdots & \cdots \\
 & 1.168 & & & & \vdots \\
 & & \ddots & & & \vdots \\
 & & & \ddots & & \vdots \\
 & & & & & -0.00786 \\
 & & & & & 6.63\times10^{-5}
\end{array}\right]
\end{displaymath}

Model III

\begin{displaymath}Y_{i,j} = \mu_i + \beta_i X_{i,j} + \epsilon_{i,j}
\end{displaymath}

the error sum of squares is 62433 and the estimates are
$\hat\mu_1$ $\hat\mu_2$ $\hat\mu_3$ $\hat\mu_4$ $\hat\mu_5$  
52.04954 -68.05918 62.48453 46.66416 112.529  
$\hat\beta_1$ $\hat\beta_2$ $\hat\beta_3$ $\hat\beta_4$ $\hat\beta_5$  
0.2309892 1.619385 0.4313726 0.5757114 0.1818949  

(a)
Of the three models, based on the information available to you, which model provides the best fit to the data? [10 marks]

Solution: Testing Model III vs Model II we get

\begin{displaymath}F = \frac{(66115- 62433)/4}{ 62433/40} = 0.59
\end{displaymath}

which is not significant. Thus Model II is preferred to Model III. Comparing Model II to Model I we have

\begin{displaymath}F = \frac{(85355 -66115)/4}{66115/44} = 3.2
\end{displaymath}

which leads to a P-value around 0.03 so that Model II is preferred to Model I.
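Both F statistics can be reproduced with a short check (a sketch, not part of the original solution); Model I has 2 parameters, Model II has 6, Model III has 10, and $n=50$.

```python
# Check of the two model-comparison F statistics in part (a).
n = 50
ess_I, ess_II, ess_III = 85355, 66115, 62433

F_III_vs_II = ((ess_II - ess_III) / 4) / (ess_III / (n - 10))
F_II_vs_I = ((ess_I - ess_II) / 4) / (ess_II / (n - 6))
assert round(F_III_vs_II, 2) == 0.59 and round(F_II_vs_I, 2) == 3.2
```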

(b)
There are 10 possible comparisons between pairs of treatments. It is desired to give simultaneous 95% confidence intervals for all possible comparisons based on the second model above. I want you to show clearly that you know how to get these ten confidence intervals. Your answer should include a clear description of the parameters for which intervals are needed, written in terms of the notation used above for the second model, and the resulting confidence interval for the difference between treatment A and treatment B with all the numbers filled in. You need not work it out to the point of a numerical value for the lower and upper limit. [5 marks]

Solution: I want confidence intervals for the 10 values of $\mu_i-\mu_j$ with i<j. To get simultaneous 95% confidence intervals you divide $\alpha=0.05$ by 10 and work out ordinary 99.5% confidence intervals. The t multiplier is $t_{0.0025,44} \approx 2.96$. You also need a standard error for $\hat\mu_1-\hat\mu_2$, which is the square root of

\begin{displaymath}\sigma^2 \left[1, -1 , 0,0,0,0\right] (X^TX)^{-1} \left[\begin{array}{r} 1 \\ -1 \\ 0\\ 0\\ 0\\ 0
\end{array}\right] \, .
\end{displaymath}

You estimate $\sigma^2$ using 66115/44 and get

\begin{displaymath}14.2424 - 67.5325 \pm 2.96 \sqrt{66115/44}\sqrt{ 1.026 -2(0.995) +1.168} \,
.
\end{displaymath}
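Numerically (a sketch using the Model II estimates $\hat\mu_1 = 14.2424$ and $\hat\mu_2 = 67.5325$, the quoted entries of $(X^TX)^{-1}$, and the approximate t multiplier 2.96):

```python
# Bonferroni interval for mu_1 - mu_2 under Model II: alpha = 0.05 is
# split over the 10 pairwise comparisons, so each interval uses the
# t quantile for 0.05/20 = 0.0025 in each tail with 44 df (about 2.96).
import math

mu1_hat, mu2_hat = 14.2424, 67.5325     # Model II estimates
sigma2_hat = 66115 / 44                 # error SS / error df for Model II
se = math.sqrt(sigma2_hat * (1.026 - 2 * 0.995 + 1.168))
t_mult = 2.96                           # approximate t_{0.0025, 44}

diff = mu1_hat - mu2_hat
lo_ci, hi_ci = diff - t_mult * se, diff + t_mult * se
```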

(c)
Examine the residual plots attached for the three models. Is there anything wrong with our fit? If so suggest what you might try next. Be quite clear. [5 marks]

Solution: The plots show clear signs of heteroscedasticity; a transformation might be useful. (In fact taking logs is the thing to do.)

(d)
I attach a table of regression diagnostics for the fit to model II above. For each diagnostic review the values and comment on whether or not they show any problems and which cases might warrant further examination. [5 marks]

Solution:

I just wanted people to compare the various statistics to the guidelines in the text. For the externally studentized residuals I was looking for some mention of the Bonferroni adjustment. Cases 15 and 44 stand out as worth looking at again.

Diagnostics for Model II for Question 3
Obs  h_ii  Ext'ly Stud'zed  DFFITS  Cook's    Obs  h_ii  Ext'ly Stud'zed  DFFITS  Cook's
           Residual                 D_i                  Residual                 D_i
1 0.120 -0.777 -0.287 0.014 26 0.100 -0.096 -0.032 0.000
2 0.108 0.407 0.142 0.003 27 0.100 2.018 0.674 0.071
3 0.129 0.047 0.018 0.000 28 0.158 0.768 0.333 0.019
4 0.103 0.868 0.295 0.015 29 0.104 -0.475 -0.162 0.004
5 0.101 0.141 0.047 0.000 30 0.106 -0.997 -0.343 0.020
6 0.124 -0.377 -0.142 0.003 31 0.102 -1.133 -0.383 0.024
7 0.102 0.681 0.229 0.009 32 0.144 -0.139 -0.057 0.001
8 0.150 -0.578 -0.243 0.010 33 0.154 -0.201 -0.086 0.001
9 0.148 -0.180 -0.075 0.001 34 0.103 1.186 0.401 0.027
10 0.127 -0.261 -0.099 0.002 35 0.137 -0.009 -0.004 0.000
11 0.121 1.073 0.398 0.026 36 0.134 0.607 0.238 0.010
12 0.100 -1.076 -0.359 0.021 37 0.114 0.184 0.066 0.001
13 0.102 -0.179 -0.060 0.001 38 0.101 0.069 0.023 0.000
14 0.130 0.329 0.127 0.003 39 0.109 0.372 0.130 0.003
15 0.106 3.436 1.186 0.188 40 0.101 -0.934 -0.312 0.016
16 0.180 -0.613 -0.288 0.014 41 0.115 -2.130 -0.766 0.091
17 0.104 -0.306 -0.104 0.002 42 0.146 -0.732 -0.303 0.015
18 0.100 0.516 0.172 0.005 43 0.126 1.295 0.491 0.040
19 0.110 -1.138 -0.401 0.027 44 0.101 3.038 1.016 0.145
20 0.110 -1.742 -0.611 0.059 45 0.107 -1.635 -0.565 0.051
21 0.117 0.211 0.076 0.001 46 0.148 1.019 0.425 0.030
22 0.130 0.385 0.149 0.004 47 0.115 0.417 0.150 0.004
23 0.152 -0.699 -0.296 0.015 48 0.103 0.105 0.036 0.000
24 0.111 -0.320 -0.113 0.002 49 0.143 -0.911 -0.372 0.023
25 0.104 -0.715 -0.243 0.010 50 0.142 -0.333 -0.135 0.003
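The cutoffs for this scan can be sketched as follows; the leverage and DFFITS guidelines below are the usual textbook rules of thumb (an assumption, since the text's exact guidelines are not quoted here), with $n=50$ cases and $p=6$ fitted parameters.

```python
# Rule-of-thumb flags for the Model II diagnostics: n = 50 cases and
# p = 6 fitted parameters (5 intercepts plus the common slope).
import math

n, p = 50, 6
leverage_flag = 2 * p / n                # flag h_ii above 0.24
dffits_flag = 2 * math.sqrt(p / n)       # flag |DFFITS| above about 0.69
# Cases 15 and 44 (externally studentized residuals 3.436 and 3.038)
# also exceed the DFFITS cutoff and are the ones worth a second look.
```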


Richard Lockhart
1999-03-23