STAT 350: Lecture 25

Interaction effects

Examples

Two way analysis of variance

Analysis of Covariance

This is the name given to the analysis of models in which there are both categorical factors and continuous covariates. In the car example we had the categorical factor VEHICLE and the continuous covariate MILEAGE. Earlier I gave the design matrix for the model in which there are different intercepts for the two cars but one common slope; this model is thus two parallel lines. If we use corner point coding and fit a model in which VEHICLE and MILEAGE interact, then the design matrix for the small data set above is

\begin{displaymath}\left[\begin{array}{rrrr}
1 & 0 & 0 & 0\\
1 & 0 & 1000 & 0\\
\vdots & \vdots & \vdots & \vdots\\
1 & 1 & 0 & 0\\
1 & 1 & 1100 & 1100
\end{array}\right]
\end{displaymath}

where the last column is the product of columns 2 and 3. The design matrix corresponds to a model equation with slope $\beta_1$ for the first vehicle and slope $\beta_1+\gamma_2$ for the second vehicle. That is, the coefficient of the last column of the design matrix is the difference in slopes between the two vehicles. Use of the alternative coding based on an average intercept now leads to the design matrix

\begin{displaymath}\left[\begin{array}{rrrr}
1 & 1 & 0 & 0\\
1 & 1 & 1000 & 1000\\
\vdots & \vdots & \vdots & \vdots\\
1 & -\frac{3}{2} & 0 & 0\\
1 & -\frac{3}{2} & 1100 & -1650
\end{array}\right]
\end{displaymath}

Again the last column is the product of columns 2 and 3. The coefficient of column 3 is the average slope, while the coefficient of the last column is the difference between the slope for vehicle 1 and this average slope.
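The two codings can be generated mechanically from the vehicle labels and mileages. The following is a minimal sketch in Python (not part of the SAS analysis), assuming three observations on vehicle 1 and two on vehicle 2, which is what the entry $-\frac{3}{2}$ suggests. The mileage 2000 is a made-up stand-in for the row not shown in the matrices, and the names `design_matrix`, `corner` and `average` are purely illustrative.

```python
from fractions import Fraction

# Vehicle label and mileage for each observation.
# The mileages 1000 and 1100 appear in the notes; 2000 is a hypothetical
# stand-in for the row that is not displayed in the matrices.
vehicle = [1, 1, 1, 2, 2]
mileage = [0, 1000, 2000, 0, 1100]


def design_matrix(codes):
    """Rows [1, code, mileage, code * mileage] for the interaction model."""
    return [[1, codes[v], m, codes[v] * m] for v, m in zip(vehicle, mileage)]


# Corner point coding: vehicle 1 is the baseline (0), vehicle 2 gets 1.
corner = design_matrix({1: 0, 2: 1})

# Coding centred on the average: vehicle 1 gets 1 and vehicle 2 gets -n1/n2
# (an assumption consistent with the -3/2 in the notes), so that the coded
# column sums to zero over the data set.
n1, n2 = vehicle.count(1), vehicle.count(2)
average = design_matrix({1: Fraction(1), 2: Fraction(-n1, n2)})

for name, X in (("corner point", corner), ("average", average)):
    print(name)
    for row in X:
        print("  ", [str(x) for x in row])
```

In both matrices the fourth column is the elementwise product of the second and third, and under the second coding the vehicle column sums to zero, which is why the coefficient of column 3 becomes a slope averaged over all the observations.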

You saw, in assignment 3, how to test the hypothesis of no interaction in this model.
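One standard way to carry out such a test, which may or may not match the details of assignment 3, is the extra sum of squares F test. In the corner point coding the interaction model can be written as below. Only $\beta_1$ and $\gamma_2$ come from the notes; the intercept symbols $\beta_0$ and $\alpha_2$, the 0/1 vehicle indicator $V_i$ and the mileage $M_i$ are notation assumed here for illustration.

\begin{displaymath}
Y_i = \beta_0 + \alpha_2 V_i + \beta_1 M_i + \gamma_2 V_i M_i + \epsilon_i
\end{displaymath}

The hypothesis of no interaction is $\gamma_2 = 0$, which reduces the model to two parallel lines. Writing $\mbox{ESS}_{R}$ for the error sum of squares of the parallel lines model and $\mbox{ESS}_{F}$ for that of the interaction model, fitted to $n$ observations, the test statistic is

\begin{displaymath}
F = \frac{(\mbox{ESS}_{R} - \mbox{ESS}_{F})/1}{\mbox{ESS}_{F}/(n-4)}
\end{displaymath}

Under the hypothesis and the usual normal error assumptions, $F$ has an $F$ distribution with 1 and $n-4$ degrees of freedom. It is the square of the $t$ statistic for the coefficient of the product column.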

Analysis of Models with Interaction Terms

Examples

Two way ANOVA: influence of SCHOOL, REGION on STAY

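options pagesize=60 linesize=80;
* Read the data, starting at the second line of the file;
data scenic;
 infile 'scenic.dat' firstobs=2;
 input Stay  Age Risk Culture Chest Beds 
          School Region Census Nurses Facil;
* Two way ANOVA of Stay on School, Region and their interaction;
proc glm  data=scenic;
  class school region ;
  model Stay = School | Region / E 
          SOLUTION SS1 SS2 SS3 SS4 XPX INVERSE;
  * Save fitted values and case diagnostics in the data set scout;
  output out=scout P=Fitted PRESS=PRESS H=HAT 
   RSTUDENT =EXTST R=RESID DFFITS=DFFITS COOKD=COOKD;
run ;
* Cell means of Stay for each School by Region combination;
proc means data=scout;
  var stay;
  class school region;
run;
proc print data=scout;
run;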

EDITED SAS OUTPUT (Complete output)

                                 The X'X Matrix
                INTERCEPT     SCHOOL 1     SCHOOL 2     REGION 1     REGION 2
INTERCEPT             113           17           96           28           32
SCHOOL 1               17           17            0            5            7
SCHOOL 2               96            0           96           23           25
REGION 1               28            5           23           28            0
REGION 2               32            7           25            0           32
REGION 3               37            3           34            0            0
REGION 4               16            2           14            0            0
DUMMY001                5            5            0            5            0
DUMMY002                7            7            0            0            7
DUMMY003                3            3            0            0            0
DUMMY004                2            2            0            0            0
DUMMY005               23            0           23           23            0
DUMMY006               25            0           25            0           25
DUMMY007               34            0           34            0            0
DUMMY008               14            0           14            0            0
STAY              1090.26       186.85       903.41       310.49       309.87

\begin{displaymath}\vdots
\end{displaymath}

                          X'X Generalized Inverse (g2)
                INTERCEPT     SCHOOL 1     SCHOOL 2     REGION 1     REGION 2
INTERCEPT    0.0714285714 -0.071428571            0 -0.071428571 -0.071428571
SCHOOL 1     -0.071428571 0.5714285714            0 0.0714285714 0.0714285714
SCHOOL 2                0            0            0            0            0
REGION 1     -0.071428571 0.0714285714            0 0.1149068323 0.0714285714
REGION 2     -0.071428571 0.0714285714            0 0.0714285714 0.1114285714
REGION 3     -0.071428571 0.0714285714            0 0.0714285714 0.0714285714
REGION 4                0            0            0            0            0
DUMMY001     0.0714285714 -0.571428571            0 -0.114906832 -0.071428571
DUMMY002     0.0714285714 -0.571428571            0 -0.071428571 -0.111428571
DUMMY003     0.0714285714 -0.571428571            0 -0.071428571 -0.071428571
DUMMY004                0            0            0            0            0
DUMMY005                0            0            0            0            0
DUMMY006                0            0            0            0            0
DUMMY007                0            0            0            0            0
DUMMY008                0            0            0            0            0
STAY                 7.89         1.79            0 2.9304347826       1.5372

\begin{displaymath}\cdots
\end{displaymath}

Dependent Variable: STAY   
                                     Sum of            Mean
Source                  DF          Squares          Square   F Value     Pr > F
Model                    7     132.06558693     18.86651242      7.15     0.0001
Error                  105     277.14479360      2.63947422
Corrected Total        112     409.21038053
                  R-Square             C.V.        Root MSE            STAY Mean
                  0.322733         16.83864       1.6246459            9.6483186
Source                  DF        Type I SS     Mean Square   F Value     Pr > F
SCHOOL                   1      36.08413010     36.08413010     13.67     0.0003
REGION                   3      95.36410217     31.78803406     12.04     0.0001
SCHOOL*REGION            3       0.61735466      0.20578489      0.08     0.9718
Source                  DF       Type II SS     Mean Square   F Value     Pr > F
SCHOOL                   1      27.89404890     27.89404890     10.57     0.0015
REGION                   3      95.36410217     31.78803406     12.04     0.0001
SCHOOL*REGION            3       0.61735466      0.20578489      0.08     0.9718
Source                  DF      Type III SS     Mean Square   F Value     Pr > F
SCHOOL                   1      26.05955792     26.05955792      9.87     0.0022
REGION                   3      47.01938029     15.67312676      5.94     0.0009
SCHOOL*REGION            3       0.61735466      0.20578489      0.08     0.9718
Source                  DF       Type IV SS     Mean Square   F Value     Pr > F
SCHOOL                   1      26.05955792     26.05955792      9.87     0.0022
REGION                   3      47.01938029     15.67312676      5.94     0.0009
SCHOOL*REGION            3       0.61735466      0.20578489      0.08     0.9718
                                        T for H0:    Pr > |T|   Std Error of
Parameter                  Estimate    Parameter=0                Estimate
INTERCEPT               7.890000000 B        18.17     0.0001     0.43420487
SCHOOL        1         1.790000000 B         1.46     0.1480     1.22811685
              2         0.000000000 B          .        .          .        
REGION        1         2.930434783 B         5.32     0.0001     0.55072100
              2         1.537200000 B         2.83     0.0055     0.54232171
              3         1.180588235 B         2.29     0.0241     0.51591227
              4         0.000000000 B          .        .          .        
SCHOOL*REGION 1 1      -0.286434783 B        -0.20     0.8455     1.46660342
              1 2      -0.618628571 B        -0.44     0.6620     1.41099883
              1 3      -0.300588235 B        -0.19     0.8486     1.57026346
              1 4       0.000000000 B          .        .          .        
SCHOOL*REGION 2 1       0.000000000 B          .        .          .        
              2 2       0.000000000 B          .        .          .        
              2 3       0.000000000 B          .        .          .        
              2 4       0.000000000 B          .        .          .        

NOTE: The X'X matrix has been found to be singular and a generalized inverse 
      was used to solve the normal equations.   Estimates followed by the 
      letter 'B' are biased, and are not unique estimators of the parameters.
      SCHOOL        REGION  N Obs    N          Mean       Std Dev       Minimum
--------------------------------------------------------------------------------
           1             1      5    5    12.3240000     3.3527198     9.7800000
                         2      7    7    10.5985714     1.1317454     8.2800000
                         3      3    3    10.5600000     0.7362744    10.1200000
                         4      2    2     9.6800000     0.6788225     9.2000000
           2             1     23   23    10.8204348     2.5061460     8.0300000
                         2     25   25     9.4272000     1.0978635     7.3900000
                         3     34   34     9.0705882     1.1911516     7.0800000
                         4     14   14     7.8900000     0.8332420     6.7000000
--------------------------------------------------------------------------------
  OBS  STAY  AGE RISK CULTURE CHEST BEDS SCHOOL REGION CENSUS NURSES FACIL

   23  9.78 52.3  5.0   17.6   95.9  270    1      1     240    198   57.1
   25  9.20 52.2  4.0   17.5   71.1  298    1      4     244    236   57.1
   26  8.28 49.5  3.9   12.0  113.1  546    1      2     413    436   57.1
   44 10.12 51.7  5.6   14.9   79.1  362    1      3     313    264   54.3
   46 10.16 54.2  4.6    8.4   51.5  831    1      4     581    629   74.3
   47 19.56 59.9  6.5   17.2  113.7  306    2      1     273    172   51.4
   74 10.05 52.0  4.5   36.7   87.5  184    1      1     144    151   68.6
   90 11.41 50.4  5.8   23.8   73.0  424    1      3     359    335   45.7
  100 10.15 51.9  6.2   16.4   59.2  568    1      3     452    371   62.9
  112 17.94 56.2  5.9   26.4   91.8  835    1      1     791    407   62.9

  OBS  FITTED     PRESS      HAT       EXTST      RESID     DFFITS     COOKD

   23 12.3240   -3.18000   0.20000   -1.76835   -2.54400   -0.88418   0.09578
   25  9.6800   -0.96000   0.50000   -0.41618   -0.48000   -0.41618   0.02182
   26 10.5986   -2.70500   0.14286   -1.55177   -2.31857   -0.63351   0.04950
   44 10.5600   -0.66000   0.33333   -0.33029   -0.44000   -0.23355   0.00688
   46  9.6800    0.96000   0.50000    0.41618    0.48000    0.41618   0.02182
   47 10.8204    9.13682   0.04348    6.48789    8.73957    1.38322   0.17189
   74 12.3240   -2.84250   0.20000   -1.57592   -2.27400   -0.78796   0.07653
   90 10.5600    1.27500   0.33333    0.63897    0.85000    0.45182   0.02566
  100 10.5600   -0.61500   0.33333   -0.30774   -0.41000   -0.21761   0.00597
  112 12.3240    7.02000   0.20000    4.15303    5.61600    2.07652   0.46676
Comments on code and results
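
Several numbers in the output above can be checked by hand, which helps in reading the listing. The sketch below uses Python rather than SAS, purely as a calculator; every input is copied from the output above, and the names (`cell_mean`, `n_cell` and so on) are made up for this illustration. It relies on two standard facts about a two way model with interaction: the fitted value for each observation is its cell mean, and the leverage of an observation is 1 divided by the number of observations in its cell.

```python
# Numbers copied from the PROC GLM and PROC MEANS output above.
mse = 2.63947422                       # error mean square, 105 df
ss_interaction, df_interaction = 0.61735466, 3

# F statistic for SCHOOL*REGION: interaction mean square over error mean square.
f_interaction = (ss_interaction / df_interaction) / mse
print(f"F for SCHOOL*REGION = {f_interaction:.2f}")   # 0.08, as in the table

# Corner point estimates: SCHOOL 2 and REGION 4 are the baseline levels.
intercept = 7.89
school = {1: 1.79, 2: 0.0}
region = {1: 2.930434783, 2: 1.5372, 3: 1.180588235, 4: 0.0}
inter = {(1, 1): -0.286434783, (1, 2): -0.618628571, (1, 3): -0.300588235}


def cell_mean(s, r):
    """Fitted value for the SCHOOL s, REGION r cell."""
    return intercept + school[s] + region[r] + inter.get((s, r), 0.0)


# Cell sizes from the N column of the PROC MEANS table.
n_cell = {(1, 1): 5, (1, 2): 7, (1, 3): 3, (1, 4): 2,
          (2, 1): 23, (2, 2): 25, (2, 3): 34, (2, 4): 14}

for cell, n in n_cell.items():
    print(cell, "mean", round(cell_mean(*cell), 4), "leverage", round(1 / n, 5))

# Observation 23 (SCHOOL 1, REGION 1, STAY 9.78):
# PRESS residual = ordinary residual / (1 - leverage).
resid_23 = 9.78 - cell_mean(1, 1)
press_23 = resid_23 / (1 - 1 / n_cell[(1, 1)])
print("obs 23 residual", round(resid_23, 3), "PRESS", round(press_23, 3))
```

The reconstructed cell means agree with the Mean column of the PROC MEANS table, so the parameter estimates marked B are simply one particular way of encoding the eight cell means. The small leverage of observation 47 (1/23) next to its very large studentized residual shows that it is an outlier in a well populated cell rather than a high leverage point.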

Analysis of covariance example

Here I regress STAY on SCHOOL, REGION and FACILITIES. I begin by putting in all the possible interaction effects.
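
In the SAS MODEL statement, "all the possible interaction effects" is requested with the bar operator: A | B | C is shorthand for the three main effects, the three two way interactions and the three way interaction. The small Python sketch below is only an illustration of that expansion rule, not SAS code, and the function name `expand_bar` is made up here.

```python
def expand_bar(factors):
    """Expand 'A | B | C' into main effects and all their interactions."""
    terms = []
    for factor in factors:
        # Each new factor contributes itself plus its product with every
        # term generated so far.
        terms = terms + [factor] + [f"{term}*{factor}" for term in terms]
    return terms


# Full model used first: School | Region | Facil.
full_model = expand_bar(["School", "Region", "Facil"])
print(full_model)

# Second model: School | Region plus Facil entered on its own.
reduced_model = expand_bar(["School", "Region"]) + ["Facil"]
print(reduced_model)
```

The seven terms of the full model are the seven lines of each sums of squares table in the output, where SAS labels the terms involving the covariate as FACIL*SCHOOL, FACIL*REGION and FACIL*SCHOOL*REGION.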

options pagesize=60 linesize=80;
data scenic;
 infile 'scenic.dat' firstobs=2;
 input Stay  Age Risk Culture Chest Beds School Region Census Nurses Facil;
* Full model: all main effects and interactions of School, Region and Facil;
proc glm  data=scenic;
  class school region ;
  model Stay = School | Region | Facil / SS1 SS2 SS3 ;
  output out=scout P=Fitted PRESS=PRESS H=HAT RSTUDENT =EXTST R=RESID DFFITS=DFFITS COOKD=COOKD;
run ;
proc print data=scout;
run ;
* Reduced model: School, Region, their interaction and a common slope for Facil;
proc glm  data=scenic;
  class school region ;
  model Stay = School | Region  Facil / SS1 SS2 SS3 ;
run ;

EDITED SAS OUTPUT (Complete output)
Dependent Variable: STAY   
                                     Sum of            Mean
Source                  DF          Squares          Square   F Value     Pr > F
Model                   15     173.90201568     11.59346771      4.78     0.0001
Error                   97     235.30836485      2.42585943
Corrected Total        112     409.21038053
                  R-Square             C.V.        Root MSE            STAY Mean
                  0.424970         16.14289       1.5575171            9.6483186
Source                  DF        Type I SS     Mean Square   F Value     Pr > F
SCHOOL                   1      36.08413010     36.08413010     14.87     0.0002
REGION                   3      95.36410217     31.78803406     13.10     0.0001
SCHOOL*REGION            3       0.61735466      0.20578489      0.08     0.9682
FACIL                    1       9.52496125      9.52496125      3.93     0.0504
FACIL*SCHOOL             1       1.32686372      1.32686372      0.55     0.4613
FACIL*REGION             3      21.28634656      7.09544885      2.92     0.0377
FACIL*SCHOOL*REGION      3       9.69825722      3.23275241      1.33     0.2683
Source                  DF       Type II SS     Mean Square   F Value     Pr > F
SCHOOL                   1       4.73069924      4.73069924      1.95     0.1658
REGION                   3       8.16560072      2.72186691      1.12     0.3441
SCHOOL*REGION            3       7.04260265      2.34753422      0.97     0.4113
FACIL                    1       9.52496125      9.52496125      3.93     0.0504
FACIL*SCHOOL             1       3.76491803      3.76491803      1.55     0.2158
FACIL*REGION             3      21.28634656      7.09544885      2.92     0.0377
FACIL*SCHOOL*REGION      3       9.69825722      3.23275241      1.33     0.2683
Source                  DF      Type III SS     Mean Square   F Value     Pr > F
SCHOOL                   1       2.34679006      2.34679006      0.97     0.3278
REGION                   3       2.46002453      0.82000818      0.34     0.7979
SCHOOL*REGION            3       7.04260265      2.34753422      0.97     0.4113
FACIL                    1       0.70390965      0.70390965      0.29     0.5913
FACIL*SCHOOL             1       1.50831325      1.50831325      0.62     0.4323
FACIL*REGION             3       1.92051520      0.64017173      0.26     0.8513
FACIL*SCHOOL*REGION      3       9.69825722      3.23275241      1.33     0.2683
  OBS  STAY  AGE RISK CULTURE CHEST BEDS SCHOOL REGION CENSUS NURSES FACIL
   25  9.20 52.2  4.0   17.5   71.1  298    1      4     244    236   57.1
   46 10.16 54.2  4.6    8.4   51.5  831    1      4     581    629   74.3
   47 19.56 59.9  6.5   17.2  113.7  306    2      1     273    172   51.4
  OBS  FITTED     PRESS      HAT       EXTST      RESID     DFFITS     COOKD
   25  9.2000     .        1.00000     .        -0.00000     .         .     
   46 10.1600     .        1.00000     .         0.00000     .         .     
   47 11.8970    8.29701   0.07641    5.96177    7.66301    1.71483   0.13553
COMMENTS
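
Some features of this output can be verified with a little arithmetic. The sketch below uses Python purely as a calculator, with all inputs copied from the output above and with illustrative variable names. It rests on the observation that the full model gives each of the 8 SCHOOL by REGION cells its own intercept and its own slope on FACIL. The cell SCHOOL 1, REGION 4 contains only observations 25 and 46, so its fitted line passes through both points exactly.

```python
# Observations 25 and 46: the only two hospitals with SCHOOL 1, REGION 4.
facil = [57.1, 74.3]
stay = [9.20, 10.16]

# With two points, least squares reduces to the line through both of them.
slope = (stay[1] - stay[0]) / (facil[1] - facil[0])
intercept = stay[0] - slope * facil[0]
fitted = [intercept + slope * x for x in facil]
residuals = [y - f for y, f in zip(stay, fitted)]
print("fitted", fitted, "residuals", residuals)

# Degrees of freedom: 8 cells, each with its own intercept and slope.
n_obs, n_cells = 113, 8
n_params = 2 * n_cells
model_df = n_params - 1          # 15, as in the ANOVA table
error_df = n_obs - n_params      # 97, as in the ANOVA table
print("model df", model_df, "error df", error_df)

# Type I sums of squares are sequential, so they add up to the Model SS.
type1_ss = [36.08413010, 95.36410217, 0.61735466, 9.52496125,
            1.32686372, 21.28634656, 9.69825722]
print("sum of Type I SS", round(sum(type1_ss), 6))

# F statistic for the three way term: its mean square over the error mean square.
mse = 2.42585943
f_three_way = (9.69825722 / 3) / mse
print(f"F for FACIL*SCHOOL*REGION = {f_three_way:.2f}")
```

This explains the rows for observations 25 and 46 in the diagnostics listing: the residuals are 0 and the leverages are 1, so quantities that divide by 1 minus the leverage (PRESS, the externally studentized residual, DFFITS and Cook's distance) are reported as missing.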



Richard Lockhart
1999-03-10