Design and Analysis, Fourth Edition, by Keppel and Wickens
Chapter 16: The Single-Factor Within-Subject Design

Data From Table 16.3, page 355

The data from Table 16.3 can be set up in two ways. The ‘narrow format’ (table16_3), also called long format below, enters each score on a separate record, with a holding the level of the within-subjects factor, s the subject, and y the score. The ‘wide format’ (table16_3w) enters all of a subject's scores on the within-subjects variable on the same record. SAS’s proc glm can analyze either format, while proc mixed can only handle the narrow format.

data table16_3;
input a s y;
datalines;
1 1 745
2 1 764
3 1 774
1 2 777
2 2 786
3 2 788
1 3 734
2 3 733
3 3 763
1 4 779
2 4 801
3 4 797
1 5 756
2 5 786
3 5 785
1 6 721
2 6 732
3 6 740
;
run;

data table16_3w;
input s a1 a2 a3;
datalines;
1 745 764 774
2 777 786 788
3 734 733 763
4 779 801 797
5 756 786 785
6 721 732 740
;
run;
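
The two layouts hold the same scores, so one can be derived from the other instead of being typed twice. As a minimal sketch (the data set names table16_3_sorted and table16_3w_check are hypothetical and not part of the original example), proc transpose can rebuild the wide layout from the narrow one:

```sas
/* Sort by subject, then by level of a, so that BY-group processing works. */
proc sort data = table16_3 out = table16_3_sorted;
  by s a;
run;

/* One record per subject; ID a with PREFIX = a names the new columns a1, a2, a3. */
proc transpose data = table16_3_sorted
               out = table16_3w_check (drop = _name_)
               prefix = a;
  by s;
  id a;
  var y;
run;
```

The resulting data set should contain the same values as table16_3w entered above.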

Table 16.3, page 355. Summary of the Analysis of Variance for a single-factor within-subject design

This example will be solved four ways.

With the data in long format, proc glm offers two ways of evaluating the effect of factor A.

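The first way approaches the ANOVA as a simple two-factor design, treating subjects ‘s’ as a blocking factor. Because the model omits the a*s interaction, the residual is the a-by-subject interaction, which is the error term for a.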

proc glm data = table16_3;
  class a s;
  model y = a s / ss3;
run; 
quit;

The GLM Procedure

      Class Level Information

Class         Levels    Values
a                  3    1 2 3
s                  6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y
                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F
Model                        7     10122.83333      1446.11905      26.50    <.0001
Error                       10       545.66667        54.56667
Corrected Total             17     10668.50000

R-Square     Coeff Var      Root MSE        y Mean
0.948853      0.966243      7.386925      764.5000

Source                      DF     Type III SS     Mean Square    F Value    Pr > F
a                            2     1575.000000      787.500000      14.43    0.0011
s                            5     8547.833333     1709.566667      31.33    <.0001
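
The F value and probability for a can be checked by hand from the sums of squares in the output above. The following data step is only a sketch of that arithmetic; the variable names are chosen for illustration:

```sas
data _null_;
  ms_a     = 1575.000000 / 2;     /* SS for a divided by its df            */
  ms_error = 545.666667 / 10;     /* error SS divided by its df            */
  f        = ms_a / ms_error;     /* F ratio for a                         */
  p        = 1 - probf(f, 2, 10); /* upper-tail probability, 2 and 10 df   */
  put f= 8.2 p= 8.4;              /* written to the SAS log                */
run;
```

The values written to the log should agree with the F Value and Pr > F columns for a.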

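The second approach models both factors and their interaction (the ‘|’ in the model statement expands a|s to a, s, and a*s). Because this model is saturated, no degrees of freedom remain for error and the default F tests are missing. The test statement is therefore required to name the effect to be tested (h, the numerator) and its error term (e, the denominator).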

proc glm data = table16_3;
  class a s;
  model y = a|s /ss3;
  test h=a e=a*s;
run; 
quit;

The GLM Procedure

      Class Level Information
Class         Levels    Values
a                  3    1 2 3
s                  6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y
                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F
Model                       17     10668.50000       627.55882        .       .
Error                        0         0.00000          .
Corrected Total             17     10668.50000

R-Square     Coeff Var      Root MSE        y Mean
1.000000           .               .      764.5000

Source                      DF     Type III SS     Mean Square    F Value    Pr > F
a                            2     1575.000000      787.500000        .       .
s                            5     8547.833333     1709.566667        .       .
a*s                         10      545.666667       54.566667        .       .

        Tests of Hypotheses Using the Type III MS for a*s as an Error Term
Source                      DF     Type III SS     Mean Square    F Value    Pr > F
a                            2     1575.000000      787.500000      14.43    0.0011
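
As an alternative sketch that is not part of the original example, the random statement of proc glm with its test option asks SAS to work out the error term from the expected mean squares instead of having it named by hand in a test statement. Declaring s and a*s as random is an assumption about the design made here for illustration:

```sas
proc glm data = table16_3;
  class a s;
  model y = a s a*s / ss3;
  /* Treat subjects and the a-by-subject interaction as random effects;   */
  /* the test option requests F tests built from the expected mean squares. */
  random s a*s / test;
run;
quit;
```

Under these assumptions a should be tested against a*s, which would give the same F of 14.43 as the test statement.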

A third method is to use proc glm with the data in wide format. The left side of the model statement lists the dependent variables that form the levels of the within-subjects factor, and the right side is empty because there are no between-subjects factors. The repeated statement indicates that the variables on the left side of the model are to be treated as the levels of a within-subjects factor, here named a with 3 levels.

proc glm data = table16_3w;
  model a1 a2 a3 = / ss3;
  repeated a 3;
run; 
quit;

[a-level output omitted]

The GLM Procedure

Repeated Measures Analysis of Variance

       Repeated Measures Level Information
Dependent Variable          a1       a2       a3
        Level of a           1        2        3

 Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
                        H = Type III SSCP Matrix for a
                             E = Error SSCP Matrix
                               S=1    M=0    N=1
Statistic                        Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda               0.08092840      22.71         2         4    0.0065
Pillai's Trace              0.91907160      22.71         2         4    0.0065
Hotelling-Lawley Trace     11.35660105      22.71         2         4    0.0065
Roy's Greatest Root        11.35660105      22.71         2         4    0.0065

Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
                                                                                    Adj Pr > F
Source                     DF    Type III SS    Mean Square   F Value   Pr > F    G - G    H - F
a                           2    1575.000000     787.500000     14.43   0.0011   0.0029   0.0011
Error(a)                   10     545.666667      54.566667

Greenhouse-Geisser Epsilon    0.8052
Huynh-Feldt Epsilon           1.1302
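
The adjusted probabilities can be reproduced by hand, which shows what the epsilon values do. This sketch assumes the usual convention that both degrees of freedom are multiplied by epsilon and that an epsilon above 1 is treated as 1; the variable names are chosen for illustration:

```sas
data _null_;
  f      = 787.500000 / 54.566667;  /* F ratio for a from the table above */
  gg_eps = 0.8052;                  /* Greenhouse-Geisser epsilon         */
  hf_eps = min(1.1302, 1);          /* Huynh-Feldt epsilon, capped at 1   */
  /* Refer F to a distribution with epsilon-reduced degrees of freedom.   */
  p_gg = 1 - probf(f, gg_eps * 2, gg_eps * 10);
  p_hf = 1 - probf(f, hf_eps * 2, hf_eps * 10);
  put p_gg= 8.4 p_hf= 8.4;          /* written to the SAS log             */
run;
```

Because the Huynh-Feldt epsilon exceeds 1 here, its adjusted probability is the same as the unadjusted one, as the H - F column shows.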

A fourth method is to use proc mixed, which requires the data in long format. The class statement must name both the factor and the subject identifier, but the model statement includes only the non-subject factors. The repeated statement indicates that the data come from a repeated measures (within-subjects) design. The option subject=s indicates that the variable ‘s’ identifies the subjects, and type=cs specifies the structure of the covariance matrix, which in this instance is assumed to be compound symmetry.

proc mixed data = table16_3;
  class a s ;
  model y = a;
  repeated/ subject = s type = cs;
run; 
quit;

The Mixed Procedure

                  Model Information
Data Set                     WORK.TABLE16_3
Dependent Variable           y
Covariance Structure         Compound Symmetry
Subject Effect               s
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Between-Within

             Class Level Information
Class    Levels    Values
a             3    1 2 3
s             6    1 2 3 4 5 6

            Dimensions
Covariance Parameters             2
Columns in X                      4
Columns in Z                      0
Subjects                          6
Max Obs Per Subject               3
Observations Used                18
Observations Not Used             0
Total Observations               18

                     Iteration History
Iteration    Evaluations    -2 Res Log Like       Criterion
        0              1       144.05240866
        1              1       125.15764239      0.00000000
                   Convergence criteria met.

 Covariance Parameter Estimates
Cov Parm     Subject    Estimate
CS           s            551.67
Residual                 54.5667

           Fit Statistics
-2 Res Log Likelihood           125.2
AIC (smaller is better)         129.2
AICC (smaller is better)        130.2
BIC (smaller is better)         128.7

  Null Model Likelihood Ratio Test
    DF    Chi-Square      Pr > ChiSq
     1         18.89          <.0001

        Type 3 Tests of Fixed Effects
              Num     Den
Effect         DF      DF    F Value    Pr > F
a               2      10      14.43    0.0011
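
For balanced data like these, the covariance parameter estimates relate directly to the mean squares from the proc glm analyses above. The following is a minimal sketch of that arithmetic, using the usual expected-mean-square relationships (an assumption here, since the output does not state them); the variable names are chosen for illustration:

```sas
data _null_;
  ms_s     = 1709.566667;           /* mean square for subjects (s)        */
  ms_as    = 54.566667;             /* mean square for a*s                 */
  n_a      = 3;                     /* number of levels of a               */
  residual = ms_as;                 /* residual variance                   */
  cs       = (ms_s - ms_as) / n_a;  /* covariance shared within a subject  */
  put cs= 10.4 residual= 10.4;      /* written to the SAS log              */
run;
```

The two values should agree with the CS and Residual rows of the Covariance Parameter Estimates table.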

Table 16.6, page 360. Testing a within-subject contrast in a single-factor within-subject design

The simplest way to test a within-subject contrast in a single-factor within-subject design with narrow-format data in proc glm is to create a new variable that holds the contrast coefficients for the levels of the within-subject factor. Here the contrast compares level 1 with the average of levels 2 and 3. The new variable is treated as continuous (it is not listed in the class statement) and is crossed with the subject variable in the model statement. (NOTE: The factor that the contrast is defined over is no longer in the model.) The test statement tests the main effect of the contrast variable against the interaction between subject and the contrast variable. The F for c in the Type III table uses the residual mean square instead and is not the test of interest.

data table16_3;
  set table16_3;
  if a=1 then c=-1;
  if (a=2 or a=3) then c=.5;
run;

proc glm data = table16_3;
  class s;
  model y = s|c/ss3;
  test h=c e=c*s;
run;
quit;

The GLM Procedure

      Class Level Information
Class         Levels    Values
s                  6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y
                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F
Model                       11     10126.00000       920.54545      10.18    0.0049
Error                        6       542.50000        90.41667
Corrected Total             17     10668.50000

R-Square     Coeff Var      Root MSE        y Mean
0.949149      1.243789      9.508768      764.5000

Source                      DF     Type III SS     Mean Square    F Value    Pr > F
s                            5     8547.833333     1709.566667      18.91    0.0013
c                            1     1406.250000     1406.250000      15.55    0.0076
c*s                          5      171.916667       34.383333       0.38    0.8460

        Tests of Hypotheses Using the Type III MS for c*s as an Error Term
Source                      DF     Type III SS     Mean Square    F Value    Pr > F
c                            1     1406.250000     1406.250000      40.90    0.0014

With wide-format data, a contrast on the within-subject factor is specified through the manova statement in proc glm. The h option specifies which effects in the preceding model to use as hypothesis matrices, and _ALL_ requests tests for all effects in the model statement. The m option defines the contrast on the dependent variables. Because the model has no between-subjects effects, the hypothesis tested is that the mean of the transformed variable is zero, which the output labels as the intercept effect.

proc glm data = table16_3w;
  model a1 a2 a3  = /ss3;
  repeated a 3;
  manova h = _ALL_ m = (-1 .5 .5);
run;
quit;

[a-level output omitted]

Repeated Measures Analysis of Variance

       Repeated Measures Level Information
Dependent Variable          a1       a2       a3
        Level of a           1        2        3

 Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
                        H = Type III SSCP Matrix for a
                             E = Error SSCP Matrix
                               S=1    M=0    N=1

Statistic                        Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda               0.08092840      22.71         2         4    0.0065
Pillai's Trace              0.91907160      22.71         2         4    0.0065
Hotelling-Lawley Trace     11.35660105      22.71         2         4    0.0065
Roy's Greatest Root        11.35660105      22.71         2         4    0.0065

Univariate Tests of Hypotheses for Within Subject Effects
                                                                                    Adj Pr > F
Source                     DF    Type III SS    Mean Square   F Value   Pr > F    G - G    H - F
a                           2    1575.000000     787.500000     14.43   0.0011   0.0029   0.0011
Error(a)                   10     545.666667      54.566667

Greenhouse-Geisser Epsilon    0.8052
Huynh-Feldt Epsilon           1.1302

         M Matrix Describing Transformed Variables
                     a1                a2                a3
MVAR1                -1               0.5               0.5

Multivariate Analysis of Variance

Characteristic Roots and Vectors of: E Inverse * H, where
          H = Type III SSCP Matrix for Intercept
                  E = Error SSCP Matrix
     Variables have been transformed by the M Matrix

Characteristic               Characteristic Vector  V'EV=1
          Root    Percent           MVAR1
    8.17983519     100.00      0.06227237

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
                   on the Variables Defined by the M Matrix Transformation
                           H = Type III SSCP Matrix for Intercept
                                    E = Error SSCP Matrix
                                   S=1    M=-0.5    N=1.5
Statistic                        Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda               0.10893442      40.90         1         5    0.0014
Pillai's Trace              0.89106558      40.90         1         5    0.0014
Hotelling-Lawley Trace      8.17983519      40.90         1         5    0.0014
Roy's Greatest Root         8.17983519      40.90         1         5    0.0014
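
Because the contrast reduces each subject's three scores to a single value, the same test can be sketched as a one-sample t test on that value. The data set name contrast_scores and the variable psi are hypothetical names introduced for illustration:

```sas
/* Compute the contrast score for each subject from the wide-format data. */
data contrast_scores;
  set table16_3w;
  psi = -1 * a1 + 0.5 * a2 + 0.5 * a3;
run;

/* Test whether the mean contrast score differs from zero. */
proc means data = contrast_scores n mean stderr t probt;
  var psi;
run;
```

The squared t value should equal the F of 40.90 reported above, with 5 degrees of freedom for error.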