Data From Table 16.3, page 355
The data from Table 16.3 can be set up in two ways. The ‘narrow format’
(table16_3) enters each score on a separate record, while the ‘wide format’
(table16_3w) enters all of a subject's scores on the within-subjects variable
on the same record. SAS’s proc glm can analyze either format, while proc mixed
can only handle the narrow format.
data table16_3;
  input a s y;
  datalines;
1 1 745
2 1 764
3 1 774
1 2 777
2 2 786
3 2 788
1 3 734
2 3 733
3 3 763
1 4 779
2 4 801
3 4 797
1 5 756
2 5 786
3 5 785
1 6 721
2 6 732
3 6 740
;
run;

data table16_3w;
  input s a1 a2 a3;
  datalines;
1 745 764 774
2 777 786 788
3 734 733 763
4 779 801 797
5 756 786 785
6 721 732 740
;
run;
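The two layouts hold the same scores, so one can be derived from the other. A minimal sketch of going from narrow to wide with proc sort and proc transpose, assuming table16_3 has been created as above; the data set names table16_3_sorted and table16_3w_check are hypothetical and chosen here only to avoid overwriting the originals:

```sas
/* Sort by subject so that proc transpose can process one subject at a time */
proc sort data = table16_3 out = table16_3_sorted;
  by s a;
run;

/* One output record per subject; the values of a (1, 2, 3) become a1, a2, a3 */
proc transpose data = table16_3_sorted
               out = table16_3w_check (drop = _name_)
               prefix = a;
  by s;
  id a;
  var y;
run;

proc print data = table16_3w_check;
run;
```

Printing the result should show the same six records that were typed in for table16_3w.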
Table 16.3, page 355. Summary of the Analysis of Variance for a single-factor within-subject design
This example will be solved four ways.
Through proc glm there are two ways of evaluating the effect of factor A with the data in the narrow (long) format.
The first way approaches the ANOVA as a simple two-factor design treating subjects ‘s’ as a blocking factor.
proc glm data = table16_3;
  class a s;
  model y = a s / ss3;
run;
quit;
The GLM Procedure

Class Level Information
Class    Levels    Values
a             3    1 2 3
s             6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y

                                   Sum of
Source             DF             Squares     Mean Square    F Value    Pr > F
Model               7         10122.83333      1446.11905      26.50    <.0001
Error              10           545.66667        54.56667
Corrected Total    17         10668.50000

R-Square    Coeff Var    Root MSE      y Mean
0.948853     0.966243    7.386925    764.5000

Source    DF     Type III SS    Mean Square    F Value    Pr > F
a          2     1575.000000     787.500000      14.43    0.0011
s          5     8547.833333    1709.566667      31.33    <.0001
The second approach models the factors and their interaction (the ‘|’ in the model statement expands a|s into a, s and a*s) and explicitly designates, through the test statement, the numerator effect h and the error term (denominator) e.
proc glm data = table16_3;
  class a s;
  model y = a|s / ss3;
  test h = a e = a*s;
run;
quit;
The GLM Procedure

Class Level Information
Class    Levels    Values
a             3    1 2 3
s             6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y

                                   Sum of
Source             DF             Squares     Mean Square    F Value    Pr > F
Model              17         10668.50000       627.55882          .         .
Error               0             0.00000               .
Corrected Total    17         10668.50000

R-Square    Coeff Var    Root MSE      y Mean
1.000000            .           .    764.5000

Source    DF     Type III SS    Mean Square    F Value    Pr > F
a          2     1575.000000     787.500000          .         .
s          5     8547.833333    1709.566667          .         .
a*s       10      545.666667      54.566667          .         .

Tests of Hypotheses Using the Type III MS for a*s as an Error Term
Source    DF     Type III SS    Mean Square    F Value    Pr > F
a          2     1575.000000     787.500000      14.43    0.0011
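Both narrow-format runs give the same F because each subject contributes exactly one score per level of a: when the interaction is left out of the model, the residual is the a*s interaction. Using the sums of squares and degrees of freedom printed in the two listings, the arithmetic can be written out as follows (the symbols SS_A, SS_S and SS_{A x S} are notation introduced here, not labels from the SAS output):

```latex
\begin{align*}
SS_{\text{total}} &= SS_A + SS_S + SS_{A \times S}
                   = 1575.0000 + 8547.8333 + 545.6667 = 10668.5000 \\
df_{\text{total}} &= 2 + 5 + 10 = 17 \\
F_A &= \frac{MS_A}{MS_{A \times S}}
     = \frac{1575.0000 / 2}{545.6667 / 10}
     = \frac{787.5000}{54.5667} \approx 14.43
\end{align*}
```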
A third method is to use proc glm with the data in wide format. This requires that the left side of the model statement list the dependent variables that form the levels of the within-subjects factor. The repeated statement indicates that the variables on the left side of the model are to be treated as a within-subjects factor; here, repeated a 3 names that factor a and gives it three levels.
proc glm data = table16_3w;
  model a1 a2 a3 = / ss3;
  repeated a 3;
run;
quit;
[a-level output omitted]
The GLM Procedure
Repeated Measures Analysis of Variance

Repeated Measures Level Information
Dependent Variable    a1    a2    a3
Level of a             1     2     3

Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
H = Type III SSCP Matrix for a
E = Error SSCP Matrix
S=1    M=0    N=1

Statistic                        Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda               0.08092840      22.71         2         4    0.0065
Pillai's Trace              0.91907160      22.71         2         4    0.0065
Hotelling-Lawley Trace     11.35660105      22.71         2         4    0.0065
Roy's Greatest Root        11.35660105      22.71         2         4    0.0065

Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

                                                                       Adj Pr > F
Source      DF    Type III SS    Mean Square    F Value    Pr > F    G - G     H - F
a            2    1575.000000     787.500000      14.43    0.0011    0.0029    0.0011
Error(a)    10     545.666667      54.566667

Greenhouse-Geisser Epsilon    0.8052
Huynh-Feldt Epsilon           1.1302
A fourth method is to use proc mixed, which takes the data in the narrow (long) format. The class statement must define identifiers for both the factors and the subjects, but the model statement includes only the non-subject factors. The repeated statement indicates that the data come from a repeated measures (within-subjects) design. The subject=s option indicates that the variable ‘s’ identifies the different subjects, and type=cs specifies the type of covariance matrix; in this instance it is assumed to have the structure of compound symmetry.
proc mixed data = table16_3;
  class a s;
  model y = a;
  repeated / subject = s type = cs;
run;
quit;
The Mixed Procedure

Model Information
Data Set                     WORK.TABLE16_3
Dependent Variable           y
Covariance Structure         Compound Symmetry
Subject Effect               s
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Between-Within

Class Level Information
Class    Levels    Values
a             3    1 2 3
s             6    1 2 3 4 5 6

Dimensions
Covariance Parameters     2
Columns in X              4
Columns in Z              0
Subjects                  6
Max Obs Per Subject       3
Observations Used        18
Observations Not Used     0
Total Observations       18

Iteration History
Iteration    Evaluations    -2 Res Log Like     Criterion
        0              1       144.05240866
        1              1       125.15764239    0.00000000

Convergence criteria met.

Covariance Parameter Estimates
Cov Parm    Subject    Estimate
CS          s            551.67
Residual                54.5667

Fit Statistics
-2 Res Log Likelihood       125.2
AIC (smaller is better)     129.2
AICC (smaller is better)    130.2
BIC (smaller is better)     128.7

Null Model Likelihood Ratio Test
DF    Chi-Square    Pr > ChiSq
 1         18.89        <.0001

Type 3 Tests of Fixed Effects
          Num    Den
Effect     DF     DF    F Value    Pr > F
a           2     10      14.43    0.0011
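For balanced data such as these, the two covariance parameter estimates can be tied back to the mean squares from the proc glm listings. The worked arithmetic below is a sketch that assumes the REML estimate of CS is not held at a boundary; the sigma notation and the symbol k (the number of levels of a, here 3) are introduced for illustration and do not appear in the SAS output:

```latex
\begin{align*}
\hat{\sigma}^2_{\text{Residual}} &= MS_{A \times S} = 54.5667 \\
\hat{\sigma}^2_{\text{CS}} &= \frac{MS_S - MS_{A \times S}}{k}
  = \frac{1709.5667 - 54.5667}{3}
  = \frac{1655.0000}{3} \approx 551.67
\end{align*}
```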
Table 16.6, page 360. Testing a within-subject contrast in a single-factor within-subject design
The simplest way to perform a within-subject contrast in a single-factor within-subject design on a narrow formatted data set through proc glm is to create a new variable that defines the contrast over the within-subject factor. Because it is left out of the class statement, this variable is treated as continuous, and it is interacted with the subject variable in the model statement (NOTE: the factor over which the contrast is defined is no longer in the model). The main effect of the contrast variable is tested against the interaction between subject and the contrast variable in the test statement.
data table16_3;
  set table16_3;
  if a = 1 then c = -1;
  if (a = 2 or a = 3) then c = .5;
run;

proc glm data = table16_3;
  class s;
  model y = s|c / ss3;
  test h = c e = c*s;
run;
quit;
The GLM Procedure

Class Level Information
Class    Levels    Values
s             6    1 2 3 4 5 6

Number of observations    18

Dependent Variable: y

                                   Sum of
Source             DF             Squares     Mean Square    F Value    Pr > F
Model              11         10126.00000       920.54545      10.18    0.0049
Error               6           542.50000        90.41667
Corrected Total    17         10668.50000

R-Square    Coeff Var    Root MSE      y Mean
0.949149     1.243789    9.508768    764.5000

Source    DF     Type III SS    Mean Square    F Value    Pr > F
s          5     8547.833333    1709.566667      18.91    0.0013
c          1     1406.250000    1406.250000      15.55    0.0076
c*s        5      171.916667      34.383333       0.38    0.8460

Tests of Hypotheses Using the Type III MS for c*s as an Error Term
Source    DF     Type III SS    Mean Square    F Value    Pr > F
c          1     1406.250000    1406.250000      40.90    0.0014
The contrast on the within-subject factor for a wide formatted data set is specified through the manova statement in proc glm. The h option specifies the effects in the preceding model to use as hypothesis matrices, and _ALL_ provides tests for all effects listed in the model statement (here only the intercept, since the right side of the model is empty). The m option establishes the contrast on the dependent variables.
proc glm data = table16_3w;
  model a1 a2 a3 = / ss3;
  repeated a 3;
  manova h = _ALL_ m = (-1 .5 .5);
run;
quit;
[a-level output omitted]
Repeated Measures Analysis of Variance

Repeated Measures Level Information
Dependent Variable    a1    a2    a3
Level of a             1     2     3

Manova Test Criteria and Exact F Statistics for the Hypothesis of no a Effect
H = Type III SSCP Matrix for a
E = Error SSCP Matrix
S=1    M=0    N=1

Statistic                        Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda               0.08092840      22.71         2         4    0.0065
Pillai's Trace              0.91907160      22.71         2         4    0.0065
Hotelling-Lawley Trace     11.35660105      22.71         2         4    0.0065
Roy's Greatest Root        11.35660105      22.71         2         4    0.0065

Univariate Tests of Hypotheses for Within Subject Effects

                                                                       Adj Pr > F
Source      DF    Type III SS    Mean Square    F Value    Pr > F    G - G     H - F
a            2    1575.000000     787.500000      14.43    0.0011    0.0029    0.0011
Error(a)    10     545.666667      54.566667

Greenhouse-Geisser Epsilon    0.8052
Huynh-Feldt Epsilon           1.1302

M Matrix Describing Transformed Variables
           a1     a2     a3
MVAR1      -1    0.5    0.5

Multivariate Analysis of Variance

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
Variables have been transformed by the M Matrix

Characteristic                Characteristic Vector V'EV=1
          Root    Percent          MVAR1
    8.17983519     100.00     0.06227237

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
on the Variables Defined by the M Matrix Transformation
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1    M=-0.5    N=1.5

Statistic                       Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda              0.10893442      40.90         1         5    0.0014
Pillai's Trace             0.89106558      40.90         1         5    0.0014
Hotelling-Lawley Trace     8.17983519      40.90         1         5    0.0014
Roy's Greatest Root        8.17983519      40.90         1         5    0.0014
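Because the M matrix reduces each subject's three scores to a single contrast score, the test above amounts to asking whether the mean of that score is zero. A minimal sketch of that equivalent view, assuming the table16_3w data set created at the top of the page is still available; the data set name table16_3psi and the variable name psi are hypothetical names chosen here:

```sas
/* Apply the contrast weights (-1, .5, .5) to each subject's wide record */
data table16_3psi;
  set table16_3w;
  psi = -1*a1 + 0.5*a2 + 0.5*a3;
run;

/* One-sample t test of the hypothesis that the mean contrast score is 0 */
proc ttest data = table16_3psi h0 = 0;
  var psi;
run;
```

The t statistic has 5 degrees of freedom (six subjects minus one), and its square should agree with the F of 40.90 reported by both the test statement and the manova statement, which puts t at about 6.40.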