
STAT 350 Lecture 2

Reading: Chapter 5 sections 1-4.

Matrix form of a linear model

Stack $Y_i$, $\mu_i$ and $\epsilon_i$ into vectors:

\begin{displaymath}\begin{array}{ccc}
Y = \left[ \begin{array}{c} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{array} \right]
&
\mu = \left[ \begin{array}{c} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{array} \right]
&
\epsilon = \left[ \begin{array}{c} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{array} \right]
\end{array}\end{displaymath}

Define

\begin{displaymath}\begin{array}{cc}
\beta = \left[ \begin{array}{c} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{array} \right]_{p\times 1}
&
X = \left[ \begin{array}{ccc}
x_{1,1} & \cdots & x_{1,p} \\
x_{2,1} & \cdots & x_{2,p} \\
\vdots & & \vdots \\
x_{n,1} & \cdots & x_{n,p} \end{array} \right]_{n\times p}
\end{array}\end{displaymath}

Note

\begin{displaymath}X\beta = \left[ \begin{array}{c}
x_{1,1} \beta_1 + \cdots + x_{1,p} \beta_p \\
\vdots \\
x_{n,1} \beta_1 + \cdots + x_{n,p} \beta_p
\end{array} \right]
=\mu
\end{displaymath}

so

\begin{displaymath}\mu=X\beta
\end{displaymath}

Finally

\begin{displaymath}Y=X\beta + \epsilon
\end{displaymath}

is our original set of $n$ model equations written in vector-matrix form.
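Here is a minimal numerical sketch of this vector-matrix form; the sample size, design matrix, parameter values, and the use of numpy are invented purely for illustration.

\begin{verbatim}
# Minimal numerical sketch of Y = X beta + epsilon (all numbers invented).
import numpy as np

rng = np.random.default_rng(0)

n, p = 5, 2
X = np.column_stack([np.ones(n), np.arange(1.0, n + 1)])  # example n x p design matrix
beta = np.array([1.0, 0.5])                               # example parameter vector
epsilon = rng.normal(size=n)                              # errors with E(epsilon_i) = 0

mu = X @ beta       # mean vector: mu = X beta
Y = mu + epsilon    # responses:   Y = mu + epsilon
\end{verbatim}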

Assumptions so far:
\begin{align*}& {\rm E}(\epsilon_i) = 0
\\
& Y=\mu+\epsilon
\\
& \mu=X\beta
\end{align*}

Still to come: independence, homoscedasticity, normality.

Examples: the main point to take away is that this framework covers a very large class of models.

1.
One sample problem: $\mu_1 = \cdots = \mu_n = \beta_1$; here $p=1$, $Y_i = \beta_1 + \epsilon_i$, and $X$ is a single column of $n$ ones.

2.
Two sample problem: $n=r+s$

\begin{displaymath}\mu_1 = \cdots = \mu_r = \beta_1 \qquad \mu_{r+1} = \cdots = \mu_{r+s} = \beta_2
\end{displaymath}

For $i\le r$

\begin{displaymath}Y_i = \beta_1 + \epsilon_i \qquad {\rm E}(Y_i) = \beta_1
\end{displaymath}

For $r < i \le r+s$

\begin{displaymath}Y_i = \beta_2 + \epsilon_i \qquad {\rm E}(Y_i) = \beta_2
\end{displaymath}

In matrix form

\begin{displaymath}Y = \left[ \begin{array}{cc}
1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\
0 & 1 \\ \vdots & \vdots \\ 0 & 1
\end{array} \right]
\left[ \begin{array}{c} \beta_1 \\ \beta_2 \end{array}\right] + \epsilon
\end{displaymath}

Sometimes it is convenient to write:

\begin{displaymath}X^T = \left[ \overbrace{
\begin{array}{ccc} 1 & \cdots & 1 \\ 0 & \cdots & 0 \end{array}}^{r \mathrm{\ cols}}
\;
\overbrace{
\begin{array}{ccc} 0 & \cdots & 0 \\ 1 & \cdots & 1 \end{array}}^{s \mathrm{\ cols}}\right]
\end{displaymath}

which is a partitioned matrix; here I have written out the transpose of $X$ rather than $X$ itself.
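A small sketch of building this design matrix; the group sizes $r$ and $s$ below are arbitrary choices made for illustration.

\begin{verbatim}
# Sketch of the two-sample design matrix with arbitrary group sizes.
import numpy as np

r, s = 3, 4
X = np.zeros((r + s, 2))
X[:r, 0] = 1    # column 1 indicates the first sample (rows 1..r)
X[r:, 1] = 1    # column 2 indicates the second sample (rows r+1..r+s)

print(X.T)      # compare with the partitioned form of X^T displayed above
\end{verbatim}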

3.
Simple linear regression:

$Y_i$ = TL reading for sample $i$ (the data of Lecture 1)

$D_i$ = Dose given to sample $i$

The model

\begin{displaymath}Y_i = \beta_1 + \beta_2 D_i + \epsilon_i
\end{displaymath}

gives

\begin{displaymath}\beta=\left[\begin{array}{c} \beta_1 \\ \beta_2 \end{array}\right]
\qquad
X = \left[\begin{array}{cc}
1 & D_1 \\
\vdots & \vdots \\
1 & D_n
\end{array}\right]
\end{displaymath}
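A small sketch of this design matrix, using invented dose values:

\begin{verbatim}
# Sketch of the simple linear regression design matrix with made-up doses.
import numpy as np

D = np.array([0.0, 1.0, 2.0, 4.0, 8.0])    # hypothetical dose values
X = np.column_stack([np.ones_like(D), D])  # columns: intercept, dose
\end{verbatim}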

4.
Polynomial models: ``polynomial regression''. In Lecture 1 we had the quadratic model:

\begin{displaymath}Y_i = \beta_1 + D_i \beta_2 + D_i^2 \beta_3 + \epsilon_i
\end{displaymath}

for which

\begin{displaymath}\beta=\left[\begin{array}{c} \beta_1 \\ \beta_2 \\ \beta_3 \end{array}\right]
\qquad
X^T = \left[\begin{array}{cccc}
1 & 1 & \cdots & 1 \\
D_1 & D_2 & \cdots & D_n \\
D_1^2 & D_2^2 & \cdots & D_n^2 \end{array}\right]
\end{displaymath}

In general we might fit a polynomial of degree p-1 to get

\begin{displaymath}Y_i = \beta_1 + D_i \beta_2 + \cdots + D_i^{p-1} \beta_p + \epsilon_i
\end{displaymath}

for which

\begin{displaymath}\beta=\left[\begin{array}{c} \beta_1 \\ \vdots \\ \beta_p \end{array}\right]
\qquad
X^T = \left[\begin{array}{cccc}
1 & 1 & \cdots & 1 \\
D_1 & D_2 & \cdots & D_n \\
\vdots & \vdots & & \vdots \\
D_1^{p-1} & D_2^{p-1} & \cdots & D_n^{p-1} \end{array}\right]
\end{displaymath}
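A small sketch of this design matrix, again with invented doses, using numpy's Vandermonde helper:

\begin{verbatim}
# Sketch of the degree p-1 polynomial design matrix for made-up doses.
import numpy as np

D = np.array([0.0, 1.0, 2.0, 4.0, 8.0])  # hypothetical dose values
p = 4                                    # number of parameters, so degree p-1 = 3
X = np.vander(D, N=p, increasing=True)   # row i is (1, D_i, D_i^2, ..., D_i^{p-1})
\end{verbatim}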

5.
Analysis of Covariance: fitting two straight lines

Consider the TL data again, but now suppose that samples 1 to r were ``bleached'' (left in the sun for several hours before analysis) and samples r+1 to r+s were ``unbleached''. We combine the two-sample problem with the straight-line problem:
\begin{align*}\mu_i &= \beta_1 + \beta_2 D_i \qquad i=1,\ldots,r
\\
\mu_i &= \beta_3 + \beta_4 D_i \qquad i=r+1,\ldots,r+s
\end{align*}

\begin{displaymath}\beta = \left[\begin{array}{c} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4
\end{array}\right]
\end{displaymath}


\begin{displaymath}X^T = \left[ \begin{array}{cccccc}
1 & \cdots & 1 & 0 & \cdots & 0 \\
D_1 & \cdots & D_r & 0 & \cdots & 0 \\
0 & \cdots & 0 & 1 & \cdots & 1 \\
0 & \cdots & 0 & D_{r+1} & \cdots & D_{r+s}
\end{array} \right]
\end{displaymath}

Special case: ``No interaction'' of Bleach and Dose: the effect of dose is the same for bleached and unbleached samples. That is:

\begin{displaymath}\beta_2 = \beta_4
\end{displaymath}


\begin{displaymath}\beta = \left[\begin{array}{c} \beta_1 \\ \beta_2 \\ \beta_3
\end{array}\right]
\end{displaymath}


\begin{displaymath}X^T = \left[ \begin{array}{cccccc}
1 & \cdots & 1 & 0 & \cdots & 0 \\
D_1 & \cdots & D_r & D_{r+1} & \cdots & D_{r+s} \\
0 & \cdots & 0 & 1 & \cdots & 1
\end{array} \right]
\end{displaymath}

Note: we usually re-order the parameters in this case to get

\begin{displaymath}\beta = \left[\begin{array}{c} \beta_1 \\ \beta_3 \\ \beta_2
\end{array}\right]
\end{displaymath}


\begin{displaymath}X^T = \left[ \begin{array}{cccccc}
1 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & 0 & 1 & \cdots & 1 \\
D_1 & \cdots & D_r & D_{r+1} & \cdots & D_{r+s}
\end{array} \right]
\end{displaymath}
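A small sketch of both versions of the design matrix; the doses and group sizes below are invented for illustration.

\begin{verbatim}
# Sketch of the analysis of covariance design matrices (hypothetical doses).
import numpy as np

D1 = np.array([1.0, 2.0, 3.0])        # doses for the r bleached samples
D2 = np.array([1.5, 2.5, 3.5, 4.5])   # doses for the s unbleached samples
r, s = len(D1), len(D2)

# Two separate lines: beta = (beta1, beta2, beta3, beta4)
X_full = np.zeros((r + s, 4))
X_full[:r, 0] = 1; X_full[:r, 1] = D1   # group 1 rows: (1, D_i, 0, 0)
X_full[r:, 2] = 1; X_full[r:, 3] = D2   # group 2 rows: (0, 0, 1, D_i)

# No interaction (common slope), parameters re-ordered as (beta1, beta3, beta2)
X_common = np.zeros((r + s, 3))
X_common[:r, 0] = 1                        # intercept for group 1
X_common[r:, 1] = 1                        # intercept for group 2
X_common[:, 2] = np.concatenate([D1, D2])  # common dose column
\end{verbatim}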

6.
Weighing designs (a simple example, mostly for illustration). Idea: weigh two objects with (true) weights $\beta_1$ and $\beta_2$.
\begin{align*}Y_1 & = \mbox{measured weight of Object 1} \\
Y_2 & = \mbox{measured weight of Object 2} \\
Y_3 & = \mbox{measured weight of Objects 1 and 2 together}
\end{align*}
Now we have

\begin{displaymath}\mu_1 = \beta_1 \qquad \mu_2 = \beta_2 \qquad \mu_3 = \beta_1+\beta_2
\end{displaymath}

and get

\begin{displaymath}\left[ \begin{array}{c} \mu_1 \\ \mu_2 \\ \mu_3 \end{array} \right]
= \left[ \begin{array}{cc}
1 & 0 \\
0 & 1 \\
1 & 1
\end{array} \right]
\left[ \begin{array}{c}
\beta_1 \\ \beta_2 \end{array} \right]
\end{displaymath}

so that

\begin{displaymath}X = \left[ \begin{array}{cc}
1 & 0
\\
0 & 1
\\
1 & 1
\end{array} \right]
\end{displaymath}

Notice that ${\rm E}(Y_i)= \mbox{ true weight}$ is a physically meaningful and important assumption. This sort of assumption may well be wrong.
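A small sketch checking that $\mu = X\beta$ reproduces the three expected measurements; the true weights below are invented for illustration.

\begin{verbatim}
# Sketch: verify mu = X beta for the weighing design, with invented true weights.
import numpy as np

X = np.array([[1, 0],
              [0, 1],
              [1, 1]])
beta = np.array([2.0, 5.0])  # hypothetical true weights beta1, beta2

mu = X @ beta                # expect (2.0, 5.0, 7.0): object 1, object 2, both together
print(mu)
\end{verbatim}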

7.
One way layout (ANOVA). The example has data $Y_{ij}$, the blood coagulation time for rat number $j$ fed diet number $i$, for $i=1,2,3,4$. There were 4 rats for diet 1, 6 each for diets 2 and 3, and 8 rats fed diet 4. We use $\mu_{ij}$ as notation for ${\rm E}(Y_{ij})$. The idea is that all the rats fed diet 1 have the same mean coagulation time $\beta_1$, so $\mu_{11}=\mu_{12} =\mu_{13} = \mu_{14} = \beta_1$. (In fact it is pretty common notation to use $\mu_1$ for $\beta_1$, but this would conflict, for the time being, with my notation for the mean of the first $Y$.) If we stack up the $Y_{ij}$ we get

\begin{displaymath}Y =
\left[
\begin{array}{c}
Y_{11} \\ Y_{12} \\ Y_{13} \\ Y_{14} \\ Y_{21} \\ \vdots \\ Y_{26} \\ Y_{31} \\ \vdots \\ Y_{36} \\ Y_{41} \\ \vdots \\ Y_{48}
\end{array}
\right]
\qquad
X =
\left[
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 1
\end{array}
\right]
\qquad
\beta =
\left[
\begin{array}{c}
\beta_{1} \\ \beta_{2} \\ \beta_{3} \\ \beta_{4}
\end{array} \right]
\end{displaymath}

Again we have $\mu = X \beta$.

Jargon: X is called a ``design matrix''.

The one way layout as a linear model

The sum of squares decomposition in one example

The data consist of blood coagulation times for 24 animals fed one of 4 different diets. Here are the data with the 4 diets being the 4 columns.

\begin{displaymath}\left[
\begin{array}{rrrr}
62 & 63 & 68 & 56 \\
60 & 67 & 66 & 62 \\
63 & 71 & 71 & 60 \\
59 & 64 & 67 & 61 \\
   & 65 & 68 & 63 \\
   & 66 & 68 & 64 \\
   &    &    & 63 \\
   &    &    & 59
\end{array}\right]
\end{displaymath}

The usual ANOVA model equation is

\begin{displaymath}Y_{ij} = \mu_i +\epsilon_{ij}
\end{displaymath}

which we can write in matrix form by stacking up the observations into a column.

\begin{displaymath}\left[\begin{array}{r}
62 \\
60 \\
63 \\
59 \\
63 \\
67 \\
\vdots \\
63 \\
59
\end{array}\right]
=
\left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1
\end{array}\right]
\left[\begin{array}{c}
\mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4
\end{array}\right]
+
\left[\begin{array}{c}
\epsilon_{11} \\
\epsilon_{12} \\
\epsilon_{13} \\
\epsilon_{14} \\
\epsilon_{21} \\
\epsilon_{22} \\
\vdots \\
\epsilon_{47} \\
\epsilon_{48}
\end{array}\right]
\end{displaymath}

Let X denote the $24\times 4$ design matrix in this formula. Usually we reparametrize the model in the form

\begin{displaymath}Y_{ij}=\mu+\alpha_i +\epsilon_{ij}
\end{displaymath}

which would lead to a design matrix that looks like X above but with an extra column on the left, all of whose entries are equal to 1. The parameter vector $\beta$ would now be

\begin{displaymath}\beta^{T} = \left[\mu \quad \alpha_1 \quad \alpha_2 \quad \alpha_3 \quad \alpha_4 \quad\right]
\end{displaymath}

It will turn out that with this parametrization the parameters are not identifiable, that is, they cannot be separately estimated: making $\mu$ larger by some amount and each $\alpha_i$ smaller by the same amount leaves every mean $\mu+\alpha_i$, and hence the distribution of the data, unchanged. We usually solve this problem by defining

\begin{displaymath}\mu = (n_1\mu_1 + \cdots
+ n_k \mu_k)/(n_1+ \cdots +n_k)
\end{displaymath}

and

\begin{displaymath}\alpha_i = \mu_i-\mu\, .\end{displaymath}
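A quick check, using only the definitions just given, shows which constraint they impose:

\begin{displaymath}\sum_i n_i\alpha_i = \sum_i n_i(\mu_i-\mu) = \sum_i n_i\mu_i - \mu\sum_i n_i = 0 .
\end{displaymath}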

So with this weighted definition it is automatic that $\sum n_i\alpha_i=0$; if instead $\mu$ is taken to be the unweighted average $(\mu_1+\cdots+\mu_k)/k$, the automatic constraint is $\sum\alpha_i=0$. Using the latter constraint we usually eliminate $\alpha_4$ by replacing it in the model equation by the quantity

\begin{displaymath}-(\alpha_1+\alpha_2+\alpha_3).\end{displaymath}

This leads to

\begin{displaymath}\beta^{T} = \left[\mu \quad \alpha_1 \quad \alpha_2 \quad \alpha_3 \quad\right]
\end{displaymath}

and

\begin{displaymath}X =
\left[
\begin{array}{rrrr}
1 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & 0 & 1 \\
1 & -1 & -1 & -1 \\
\vdots & \vdots & \vdots & \vdots \\
1 & -1 & -1 & -1
\end{array}\right]
\end{displaymath}
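A small sketch of building this design matrix for the 4, 6, 6, 8 layout; the diet-4 rows get $-1$'s because $\alpha_4$ has been replaced by $-(\alpha_1+\alpha_2+\alpha_3)$.

\begin{verbatim}
# Sketch: 24 x 4 design matrix for the reparametrized one-way layout
# (intercept column plus indicators, last diet coded as -1's).
import numpy as np

sizes = [4, 6, 6, 8]              # rats per diet
k = len(sizes)
rows = []
for i, n_i in enumerate(sizes):
    row = np.zeros(k)             # columns: (mu, alpha_1, alpha_2, alpha_3)
    row[0] = 1
    if i < k - 1:
        row[i + 1] = 1            # indicator for diet i+1
    else:
        row[1:] = -1              # diet 4: alpha_4 = -(alpha_1+alpha_2+alpha_3)
    rows.extend([row] * n_i)
X = np.vstack(rows)               # 24 x 4, as displayed above
\end{verbatim}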

Further analysis of this data.





Richard Lockhart
1999-01-12