Institute for Digital Research and Education

This page is archived and no longer maintained.

Regression with Stata Chapter 6: More on interactions of categorical variables (draft version)

This is a draft version of this chapter.  Comments and suggestions to improve this draft are welcome.

Chapter outline
    6.1. Analysis with two categorical variables
    6.2. Simple effects
      6.2.1 Analyzing simple effects using xi3 and regress
      6.2.2 Coding of simple effects
    6.3. Simple comparisons
      6.3.1 Analyzing simple comparisons using xi3 and regress
      6.3.2 Coding of simple comparisons
    6.4. Partial interaction
      6.4.1 Analyzing partial interactions using xi3 and regress
      6.4.2 Coding of partial interactions
    6.5. Interaction contrasts
      6.5.1 Analyzing interaction contrasts using xi3 and regress
      6.5.2 Coding of interaction contrasts
    6.6. Computing adjusted means
      6.6.1 Computing adjusted means via anova
      6.6.2 Computing adjusted means via regress
    6.7. More details on meaning of coefficients
    6.8. Simple effects via dummy coding versus effect coding
      6.8.1 Example 1. Simple effects of yr_rnd at levels of mealcat
      6.8.2 Example 2. Simple effects of mealcat at levels of yr_rnd

For this chapter we will use the elemapi2 data file that we have been using in prior chapters. We will focus on the variables mealcat and collcat as they relate to the outcome variable api00 (performance on the api in the year 2000). The variable mealcat is the variable meals broken up into three categories, and the variable collcat is the variable some_col broken into three categories. We can think of mealcat as the percentage of students receiving free meals, broken up into low, middle and high. The variable collcat can be thought of as the number of parents with some college education, broken up into low, medium and high.

For our analysis, we think that both mealcat and collcat may be related to api00, but it is also possible that the impact of mealcat depends on the level of collcat. In other words, we think that there might be an interaction of these two categorical variables. In this chapter we will look at how these two categorical variables are related to api performance in the school, and at their interaction. We will see that there is an interaction of these categorical variables, and will focus on different ways of exploring it further.

We will first use the elemapi2 data file.

use https://stats.idre.ucla.edu/stat/stata/webbooks/reg/elemapi2

We drop the label for mealcat because this can get in the way of some of the points we will be demonstrating.

label drop mealcat
label values mealcat

6.1. Analysis with two categorical variables

One traditional way to analyze this would be to perform a 3 by 3 factorial analysis of variance using the anova command, as shown below. The results show a main effect of collcat (F=4.50, p=0.0117), a main effect of mealcat (F=509.04, p=0.0000) and an interaction of collcat by mealcat (F=6.63, p=0.0000).

anova api00 collcat mealcat collcat*mealcat 
                               Number of obs =     400     R-squared     =  0.7733
                               Root MSE      =  68.412     Adj R-squared =  0.7687
    
                      Source |  Partial SS    df       MS           F     Prob > F
             ----------------+----------------------------------------------------
                       Model |  6243714.81     8  780464.351     166.76     0.0000
                             |
                     collcat |  42140.5662     2  21070.2831       4.50     0.0117
                     mealcat |  4764843.56     2  2382421.78     509.04     0.0000
             collcat*mealcat |  124167.809     4  31041.9522       6.63     0.0000
                             |
                    Residual |  1829957.19   391  4680.19741   
             ----------------+----------------------------------------------------
                       Total |  8073672.00   399  20234.7669   
    

We can use the adjust command to show the adjusted means broken down by collcat and mealcat.

adjust , by(collcat mealcat)
    ----------------------------------------------------------
         Dependent variable: api00     Command: anova
    ----------------------------------------------------------
    
    -------------------------------------
              |Percentage free meals in 3
              |        categories        
      collcat |       1        2        3
    ----------+--------------------------
            1 | 816.914   589.35  493.919
            2 | 825.651  636.605  508.833
            3 | 782.151  655.638  541.733
    -------------------------------------
         Key:  Linear Prediction
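
The adjust command and the * interaction syntax shown above come from older releases of Stata. As a hedged sketch, assuming Stata 11 or later (an assumption, since this chapter does not say which release it targets), the same factorial model can be written with factor-variable notation and the cell means requested with margins. Because the model includes every collcat by mealcat cell, the margins should match the adjusted means in the table above.

```stata
* Sketch only: assumes Stata 11 or later and that elemapi2 is already
* in memory, as loaded at the top of this chapter.

* collcat##mealcat expands to both main effects plus their interaction
anova api00 collcat##mealcat

* predicted mean of api00 for each collcat by mealcat cell
margins collcat#mealcat
```

The margins command leaves the anova results active, so the predict command that follows still works on the same model.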

We can show a graph of the adjusted means as shown below. We use the separate command to make three variables corresponding to the three levels of collcat (i.e., yhat1 corresponds to the predicted value when collcat is low). We can then show the graph with the three levels of collcat represented as three separate lines.

predict yhat
separate yhat, by(collcat)
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    yhat1           float  %9.0g                  yhat, collcat == 1
    yhat2           float  %9.0g                  yhat, collcat == 2
    yhat3           float  %9.0g                  yhat, collcat == 3
twoway line yhat1 yhat2 yhat3 mealcat, xlabel(1 2 3) sort
Image statar5-1

Now we drop the variables yhat yhat1 yhat2 yhat3 in case we wish to use these variable names later.

drop yhat yhat1 yhat2 yhat3

We can do these same analyses using the regress command. Below we use the regress command with xi3 to look at the effect of collcat, mealcat and the interaction of these two variables.

xi3: regress api00 g.collcat*g.mealcat
    s.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_1 omitted)
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_1 omitted)
    s.col~t*s.mea~t   _IcolXmea_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Icollcat_2 |   23.63531   9.105331     2.60   0.010     5.733782    41.53685
     _Icollcat_3 |   26.44625   9.995129     2.65   0.008     6.795332    46.09717
     _Imealcat_2 |  -181.0414   9.077126   -19.94   0.000    -198.8874   -163.1953
     _Imealcat_3 |  -293.4103   9.449459   -31.05   0.000    -311.9884   -274.8322
       _Ico2Xme2 |   38.51777   24.19532     1.59   0.112    -9.051421    86.08697
       _Ico2Xme3 |   6.177537   20.08262     0.31   0.759     -33.3059    45.66097
       _Ico3Xme2 |    101.051   22.88808     4.42   0.000     56.05192    146.0501
       _Ico3Xme3 |   82.57776   24.43941     3.38   0.001     34.52867    130.6268
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------

We use the test command to test the two terms associated with collcat to get the main effect of collcat.

test  _Icollcat_2 _Icollcat_3
     ( 1)  _Icollcat_2 = 0.0
     ( 2)  _Icollcat_3 = 0.0
    
           F(  2,   391) =    4.50
                Prob > F =    0.0117

Likewise we use the test command to get the overall test of mealcat.

test  _Imealcat_2 _Imealcat_3
     ( 1)  _Imealcat_2 = 0.0
     ( 2)  _Imealcat_3 = 0.0
    
           F(  2,   391) =  509.04
                Prob > F =    0.0000

Finally, we use the test command to test the interaction of collcat by mealcat.

test  _Ico2Xme2 _Ico2Xme3 _Ico3Xme2 _Ico3Xme3
     ( 1)  _IcolXmea_2_2 = 0.0
     ( 2)  _IcolXmea_2_3 = 0.0
     ( 3)  _IcolXmea_3_2 = 0.0
     ( 4)  _IcolXmea_3_3 = 0.0
    
           F(  4,   391) =    6.63
                Prob > F =    0.0000

First, note that the results of the test commands correspond to those from the anova command above. This is because collcat and mealcat were coded using simple effect coding, a coding scheme where the contrasts sum to 0. We indicated that we wanted simple effect coding by using g.collcat and g.mealcat on the regress command with xi3 (see Chapter 5 for more information about coding schemes available via the xi3 command). If this had been coded using dummy coding, e.g., i.collcat, then the results of the test commands for collcat and mealcat from the regress command would not have corresponded to the anova results. In addition to simple effect coding, we could have used e., h., r., a., b., or o. and the results of the test commands would have matched the anova command, although the meaning of the individual tests would have been different. This point will be explored in more detail later in this chapter.

We can obtain the adjusted means by using the predict command to get the predicted values, calling them pred, and then looking at the mean of pred broken down by collcat and mealcat. (We edited the table produced by tabulate to make it contain just the means.)

predict pred
tabulate collcat mealcat, summarize(pred)
            Means, Standard Deviations and Frequencies of Fitted values
    
               |  Percentage free meals in 3
               |          categories
       collcat |         1          2          3 |     Total
    -----------+---------------------------------+----------
             1 | 816.91431  589.34998  493.91891 | 596.34884
             2 | 825.65118  636.60468  508.83334 | 651.50002
             3 | 782.15094   655.6377  541.73334 |  692.1095
    -----------+---------------------------------+----------
         Total | 805.71757  639.39395  504.37956 | 647.62251
    

We can show a graph of cell means as shown below. We use the same strategy as we did in making the graph above.

separate pred, by(collcat)
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    pred1           float  %9.0g                  pred, collcat == 1
    pred2           float  %9.0g                  pred, collcat == 2
    pred3           float  %9.0g                  pred, collcat == 3
twoway line pred1 pred2 pred3 mealcat, xlabel(1 2 3) sort
Image statar6-1

Now we drop the variables pred pred1 pred2 pred3 in case we wish to use these variable names later.

drop pred pred1 pred2 pred3

Note that we could have produced the same graph and table of predicted values using the postgr3 command. You can download postgr3 from within Stata by typing search postgr3 (see How can I use the search command to search for programs and get additional help? for more information about using search).

postgr3 mealcat, by(collcat) table2

Variables left asis: _Imealcat_2 _Imealcat_3 _Icollcat_2 _Icollcat_3 _IcolXmea_2_2 
  _IcolXmea_2_3 _IcolXmea_3_2 _IcolXmea_3_3
(option xb assumed; fitted values)
                          Means of Fitted values

           |  Percentage free meals in 3
           |          categories
   collcat |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 | 816.91431  589.34998  493.91891 | 596.34884
         2 | 825.65118  636.60468  508.83334 | 651.50002
         3 | 782.15094   655.6377  541.73334 |  692.1095
-----------+---------------------------------+----------
     Total | 805.71757  639.39395  504.37956 | 647.62251

Image statar7-1

The graph of the cell means illustrates the interaction between collcat and mealcat. The graph shows the three  levels of collcat as three different lines, and the three levels of mealcat as the three values on the x-axis of the graph. We can see that the effect of collcat differs based on the level of mealcat. For example, when mealcat is low, schools where collcat is 3 have the lowest api00 scores, as compared to schools that are medium or high on mealcat, where schools with collcat of 3 have the highest api00 scores.

Let’s investigate this interaction further by looking at the simple effects of collcat at each level of mealcat.

6.2. Simple effects

We found that the main effect of collcat was significant, but because we have an interaction the effect of collcat depends on the level of mealcat. We might want to ask whether the effect of collcat is significant at each level of mealcat.

6.2.1 Analyzing simple effects using xi3 and regress

In order to look at the simple effects of collcat at the different levels of mealcat, we will use the @ symbol instead of * to indicate that we want the interaction terms to reflect the simple effects of collcat at each level of mealcat. We will use helmert coding for collcat, which will be discussed further later.

xi3: regress api00 h.collcat@g.mealcat
    h.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_3 omitted)
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_1 omitted)
    h.col~t@s.mea~t   _IcolWmea_#_#       (simple effects of collcat at mealcat)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Imealcat_2 |  -181.0414   9.077126   -19.94   0.000    -198.8874   -163.1953
     _Imealcat_3 |  -293.4103   9.449459   -31.05   0.000    -311.9884   -274.8322
       _Ico1Wme1 |   13.01323     13.528     0.96   0.337    -13.58349    39.60995
       _Ico1Wme2 |  -56.77117   16.67866    -3.40   0.001    -89.56223    -23.9801
       _Ico1Wme3 |  -31.36441   12.86955    -2.44   0.015    -56.66658   -6.062247
       _Ico2Wme1 |   43.50022   14.04092     3.10   0.002     15.89508    71.10536
       _Ico2Wme2 |  -19.03303   13.29175    -1.43   0.153    -45.16528    7.099219
       _Ico2Wme3 |      -32.9   20.23653    -1.63   0.105    -72.68603    6.886028
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------

We can obtain the simple effect of collcat when mealcat is low (i.e., 1) via the test command below. This shows that the effect of collcat when mealcat is low is significant.

test  _Ico1Wme1 _Ico2Wme1
     ( 1)  _Ico1Wme1 = 0
     ( 2)  _Ico2Wme1 = 0
    
           F(  2,   391) =    5.44
                Prob > F =    0.0047

We use the describe command below to see the meaning of these terms and see that these two terms represent the two comparisons on collcat when mealcat is 1. For example, in the term _IcolWmea_2_1, the _2 means that this is the second comparison on collcat and the _1 means that it is when mealcat is 1.

describe  _Ico1Wme1 _Ico2Wme1
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    _IcolWmea_1_1   double %10.0g                 collcat(1 vs. 2+) @ mealcat==1
    _IcolWmea_2_1   double %10.0g                 collcat(2 vs. 3) @ mealcat==1

We can test the simple effect of collcat when mealcat is 2 via the test command below. This shows that collcat is significant when mealcat is 2.

test  _Ico1Wme2 _Ico2Wme2
     ( 1)  _Ico1Wme2 = 0
     ( 2)  _Ico2Wme2 = 0
    
           F(  2,   391) =    7.33
                Prob > F =    0.0007

We can also test the simple effect of collcat when mealcat is 3 via the test command below. This shows that collcat is significant when mealcat is 3, if we use an alpha level of 0.05. Because we are performing a number of additional tests, you might want to consider a post hoc correction, such as a Bonferroni correction, to guard against Type I errors.

test  _Ico1Wme3 _Ico2Wme3
     ( 1)  _Ico1Wme3 = 0
     ( 2)  _Ico2Wme3 = 0
    
           F(  2,   391) =    3.20
                Prob > F =    0.0417

In summary, the simple effects of collcat were significant at all three levels of mealcat. However, the effect of collcat when mealcat was 3 might not be significant if we used a post hoc criterion for evaluating its significance.
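
To make the post hoc point concrete, here is a minimal sketch of a Bonferroni adjustment for the three simple-effects tests. It assumes a familywise alpha of 0.05 split evenly over three tests; the scalar name alpha_adj is illustrative, and the p-values are copied from the test output above.

```stata
* Bonferroni-adjusted threshold: 0.05 divided by the number of tests
scalar alpha_adj = 0.05/3
display %6.4f alpha_adj

* Each line displays 1 if the simple effect stays significant, 0 if not
* collcat at mealcat == 1
display (0.0047 < alpha_adj)
* collcat at mealcat == 2
display (0.0007 < alpha_adj)
* collcat at mealcat == 3
display (0.0417 < alpha_adj)
```

Under this criterion the adjusted threshold is about 0.0167, so the simple effects at mealcat 1 and 2 remain significant while the one at mealcat 3 (p=0.0417) does not.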

6.2.2 Coding of simple effects

While xi3 creates the coding for you, it is useful to see the coding it creates for making these simple effects. The coding for mealcat used simple coding, and its coding is just as we saw in chapter 5. Below we use the tablist command to show the coding for mealcat. You can download tablist from within Stata by typing search tablist (see How can I use the search command to search for programs and get additional help? for more information about using search).

We see that the coding of mealcat is just as we would expect from chapter 5.

tablist mealcat  _Imealcat_2 _Imealcat_3, s(v)
               mealcat  _Imealca~2  _Imealca~3   Freq
                     1  -.33333333  -.33333333    131
                     2   .66666667  -.33333333    132
                     3  -.33333333   .66666667    137

We requested helmert coding for collcat, and we can look at the coding of collcat to see that the terms _Icollcat_1 _Icollcat_2 are indeed coded using helmert coding. We should note that these terms are not used in the analysis, but are used by xi3 for creating the simple effects shown in the next section.

tablist collcat  _Icollcat_1 _Icollcat_2, s(v)
      collcat  _Icollca~1  _Icollca~2   Freq
            1   .66666667           0    129
            2  -.33333333          .5    134
            3  -.33333333         -.5    137

Now that we have seen the helmert coding for collcat, we can see how this is used to create the simple effects of collcat at each level of mealcat. First, we look at the two comparisons of collcat at mealcat of 1. Note that the coding is the same as we saw above, but only when mealcat is 1, otherwise these variables are coded 0.

tablist  mealcat collcat _Ico1Wme1 _Ico2Wme1, s(v)
               mealcat    collcat   _Ico1Wme1   _Ico2Wme1   Freq
                     1          1   .66666667           0     35
                     1          2  -.33333333          .5     43
                     1          3  -.33333333         -.5     53
                     2          1           0           0     20
                     2          2           0           0     43
                     2          3           0           0     69
                     3          1           0           0     74
                     3          2           0           0     48
                     3          3           0           0     15

Likewise, we look at the terms that form the effects of collcat when mealcat is 2, and we see that the variables are coded the same way when mealcat is 2, and otherwise 0.

tablist  mealcat collcat _Ico1Wme2 _Ico2Wme2, s(v)
               mealcat    collcat   _Ico1Wme2   _Ico2Wme2   Freq
                     1          1           0           0     35
                     1          2           0           0     43
                     1          3           0           0     53
                     2          1   .66666667           0     20
                     2          2  -.33333333          .5     43
                     2          3  -.33333333         -.5     69
                     3          1           0           0     74
                     3          2           0           0     48
                     3          3           0           0     15

Finally, we see the same pattern for the terms that form the effect of collcat when mealcat is 3.

tablist  mealcat collcat _IcolWmea_1_3 _IcolWmea_2_3, s(v)
               mealcat    collcat  _IcolW~1_3  _IcolW~2_3   Freq
                     1          1           0           0     35
                     1          2           0           0     43
                     1          3           0           0     53
                     2          1           0           0     20
                     2          2           0           0     43
                     2          3           0           0     69
                     3          1   .66666667           0     74
                     3          2  -.33333333          .5     48
                     3          3  -.33333333         -.5     15

This illustrates how xi3 codes the variables to allow the simple effects analysis. If you wished, you could manually create variables according to this strategy to perform a simple effects analysis.
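
As a sketch of that manual strategy, the code below builds the same codes by hand from the tablist tables above. It assumes the elemapi2 data are in memory; the names h1, h2, s2, s3 and h1_m1 through h2_m3 are illustrative and are not the names xi3 uses.

```stata
* Helmert coding for collcat: 1 vs. 2+ (h1) and 2 vs. 3 (h2)
generate double h1 = cond(collcat == 1, 2/3, -1/3) if !missing(collcat)
generate double h2 = cond(collcat == 1, 0, cond(collcat == 2, 0.5, -0.5)) if !missing(collcat)

* Simple coding for mealcat with group 1 as the reference
generate double s2 = (mealcat == 2) - 1/3 if !missing(mealcat)
generate double s3 = (mealcat == 3) - 1/3 if !missing(mealcat)

* Simple effects: the Helmert codes apply within one level of mealcat
* and are 0 at the other levels
forvalues m = 1/3 {
    generate double h1_m`m' = h1 * (mealcat == `m')
    generate double h2_m`m' = h2 * (mealcat == `m')
}

regress api00 s2 s3 h1_m1 h2_m1 h1_m2 h2_m2 h1_m3 h2_m3

* Simple effect of collcat when mealcat is 1
test h1_m1 h2_m1
```

If the hand-built codes match those created by xi3, the coefficients and the F test should reproduce the results in section 6.2.1. Note that this regress call replaces the active estimation results, so rerun the xi3 command before redisplaying its results.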

6.3. Simple comparisons

In the analyses above we looked at the simple effect of collcat at each level of mealcat. For example, we looked at the overall effect of collcat when mealcat was 1. This is the simple effect of collcat at mealcat=1. Because collcat has more than two levels, we may wish to make further comparisons among the three levels of collcat within mealcat=1. Simple comparisons allow us to make such comparisons.

6.3.1 Analyzing simple comparisons using xi3 and regress

In the analyses above we used helmert coding for collcat. We chose this coding so we could compare group 1 with groups 2 and 3 and then compare groups 2 and 3. For example, if we wanted to compare collcat 1 versus 2 and 3 when mealcat is 1, we would look at the effect _IcolWmea_1_1, and if we wanted to compare collcat groups 2 and 3 when mealcat is 1, we would look at the effect _IcolWmea_2_1. Because xi3 creates labels for each term that it creates, we can use the describe command to verify that we are using the correct terms. Indeed, we see that these terms are as we expected.

describe  _IcolWmea_1_1 _IcolWmea_2_1
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    _IcolWmea_1_1   double %10.0g                 collcat(1 vs. 2+) @ mealcat==1
    _IcolWmea_2_1   double %10.0g                 collcat(2 vs. 3) @ mealcat==1

We can use the regress command with no arguments to redisplay the results of the model above and see the effects for these terms.

regress
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Imealcat_2 |  -181.0414   9.077126   -19.94   0.000    -198.8874   -163.1953
     _Imealcat_3 |  -293.4103   9.449459   -31.05   0.000    -311.9884   -274.8322
    _IcolWme~1_1 |   13.01323     13.528     0.96   0.337    -13.58349    39.60995
    _IcolWme~2_1 |   43.50022   14.04092     3.10   0.002     15.89508    71.10536
    _IcolWme~1_2 |  -56.77117   16.67866    -3.40   0.001    -89.56223    -23.9801
    _IcolWme~2_2 |  -19.03303   13.29175    -1.43   0.153    -45.16528    7.099219
    _IcolWme~1_3 |  -31.36441   12.86955    -2.44   0.015    -56.66658   -6.062247
    _IcolWme~2_3 |      -32.9   20.23653    -1.63   0.105    -72.68603    6.886028
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------

We see that collcat 1 is not significantly different from collcat 2 and 3 at mealcat 1 (t=0.96, p=0.337), but collcat 2 is significantly different from collcat 3 at mealcat 1 (t=3.10, p=0.002).
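
These two coefficients can be checked by hand against the table of adjusted means in section 6.1. The short sketch below uses the cell means for mealcat 1 (816.914, 825.651 and 782.151 for collcat 1, 2 and 3) and computes the two helmert comparisons directly.

```stata
* collcat 1 versus the average of collcat 2 and 3, at mealcat 1
display %9.3f 816.914 - (825.651 + 782.151)/2

* collcat 2 versus collcat 3, at mealcat 1
display %9.3f 825.651 - 782.151
```

The results, 13.013 and 43.500, agree with the coefficients for _IcolWmea_1_1 and _IcolWmea_2_1 up to rounding of the cell means.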

6.3.2 Coding of simple comparisons

We can see that the coding of simple comparisons is the same as the coding of simple effects. For example, we can see that _Icollcat_1 and _Icollcat_2 are coded using helmert coding.

tablist collcat  _Icollcat_1 _Icollcat_2, s(v)
      collcat  _Icollca~1  _Icollca~2   Freq
            1   .66666667           0    129
            2  -.33333333          .5    134
            3  -.33333333         -.5    137

Then the term _IcolWmea_1_1 represents the comparison of collcat 1 versus collcat 2 and 3 when mealcat is 1. Hence, its coding is the same as the coding for _Icollcat_1 when mealcat is 1, and 0 otherwise, as shown below.

tablist  mealcat collcat   _IcolWmea_1_1 , s(v)
               mealcat    collcat  _IcolWme~1   Freq
                     1          1   .66666667     35
                     1          2  -.33333333     43
                     1          3  -.33333333     53
                     2          1           0     20
                     2          2           0     43
                     2          3           0     69
                     3          1           0     74
                     3          2           0     48
                     3          3           0     15

6.4. Partial interaction

A partial interaction allows you to apply contrasts to one of the effects in an interaction term. For example, we can lay out the interaction of collcat by mealcat as shown below.

                 | Collcat Low | Collcat Med | Collcat High
    -------------+-------------+-------------+-------------
    Mealcat Low  |             |             |
    Mealcat Med  |             |             |
    Mealcat High |             |             |

Say that we wanted to compare, in the context of this interaction, group 1 for collcat versus groups 2 and 3. The table of this partial interaction would look like this.  The contrast coefficients of -2 1 1 applied to collcat indicate the comparison of  group 1 for collcat versus groups 2 and 3. 

                 |     -2      |      1      |      1
                 | Collcat Low | Collcat Med | Collcat High
    -------------+-------------+-------------+-------------
    Mealcat Low  |             |             |
    Mealcat Med  |             |             |
    Mealcat High |             |             |

Likewise, we also might want to compare groups 2 and 3 of collcat by mealcat, and the table of this interaction would look like this.

                 |      0      |     -1      |      1
                 | Collcat Low | Collcat Med | Collcat High
    -------------+-------------+-------------+-------------
    Mealcat Low  |             |             |
    Mealcat Med  |             |             |
    Mealcat High |             |             |

These are called partial interactions because contrast coefficients are applied to one of the terms involved in the interaction.

6.4.1 Analyzing partial interactions using xi3 and regress

As shown above, we wish to compare groups 1 versus 2 and 3 on collcat, and then compare groups 2 and 3 on collcat. This implies helmert coding on collcat, as shown below. The coding for mealcat is chosen as forward difference coding (for the purposes of later analyses) but could have been any form of effect coding.

xi3: regress api00 h.collcat*f.mealcat
    h.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_3 omitted)
    f.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    h.col~t*f.mea~t   _IcolXmea_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Icollcat_1 |  -25.04078   8.345388    -3.00   0.003    -41.44823   -8.633335
     _Icollcat_2 |  -2.810937   9.329377    -0.30   0.763    -21.15295    15.53108
     _Imealcat_1 |   181.0414   9.077126    19.94   0.000     163.1953    198.8874
     _Imealcat_2 |   112.3689   9.907594    11.34   0.000     92.89009    131.8477
    _IcolXme~1_1 |    69.7844    21.4752     3.25   0.001     27.56308    112.0057
    _IcolXme~1_2 |  -25.40675   21.06663    -1.21   0.229    -66.82479    16.01128
    _IcolXme~2_1 |   62.53325   19.33438     3.23   0.001      24.5209    100.5456
    _IcolXme~2_2 |   13.86697   24.21132     0.57   0.567    -33.73369    61.46763
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------

Let’s look at all of the terms created by the xi3 command using the describe command.

describe _I*
                  storage  display     value
    variable name   type   format      label   variable label
    ----------------------------------------------------------------------------
    _Icollcat_1     double %10.0g              collcat(1 vs. 2+)
    _Icollcat_2     double %10.0g              collcat(2 vs. 3)
    _Imealcat_1     double %10.0g              mealcat(1 vs. 2)
    _Imealcat_2     double %10.0g              mealcat(2 vs. 3)
    _IcolXmea_1_1   double %10.0g              collcat(1 vs. 2+) & mealcat(1 vs. 2)
    _IcolXmea_1_2   double %10.0g              collcat(1 vs. 2+) & mealcat(2 vs. 3)
    _IcolXmea_2_1   double %10.0g              collcat(2 vs. 3) & mealcat(1 vs. 2)
    _IcolXmea_2_2   double %10.0g              collcat(2 vs. 3) & mealcat(2 vs. 3)

The partial interaction of collcat comparing groups 1 versus 2 and 3 by mealcat is composed of the interaction terms _IcolXmea_1_1 and _IcolXmea_1_2, because these are the terms from the interaction that compare groups 1 versus 2 and 3 on collcat. Below we use the test command to test this partial interaction. We find that this interaction is significant.

test  _IcolXmea_1_1 _IcolXmea_1_2
     ( 1)  _IcolXmea_1_1 = 0.0
     ( 2)  _IcolXmea_1_2 = 0.0
    
           F(  2,   391) =    5.78
                Prob > F =    0.0033

Likewise, to compare groups 2 and 3 on collcat by mealcat, we test the two terms of the interaction that involve the comparison of groups 2 and 3 on collcat. We find that this partial interaction is also significant.

test  _IcolXmea_2_1 _IcolXmea_2_2
     ( 1)  _IcolXmea_2_1 = 0.0
     ( 2)  _IcolXmea_2_2 = 0.0
    
           F(  2,   391) =    7.11
                Prob > F =    0.0009

6.4.2 Coding of partial interactions

The terms _IcolXmea_1_1 and _IcolXmea_1_2 are just the product of their respective main effects. The coding for mealcat is really irrelevant, as long as some form of coding is used that sums to 0. Below you can see that _IcolXmea_1_1 is just _Icollcat_1 * _Imealcat_1.

tablist collcat mealcat  _Icollcat_1 _Imealcat_1 _IcolXmea_1_1 , s(v)
      collcat             mealcat  _Icollca~1  _Imealca~1  _IcolXme~1   Freq
            1                   1   .66666667   .66666667   .44444444     35
            1                   2   .66666667  -.33333333  -.22222222     20
            1                   3   .66666667  -.33333333  -.22222222     74
            2                   1  -.33333333   .66666667  -.22222222     43
            2                   2  -.33333333  -.33333333   .11111111     43
            2                   3  -.33333333  -.33333333   .11111111     48
            3                   1  -.33333333   .66666667  -.22222222     53
            3                   2  -.33333333  -.33333333   .11111111     69
            3                   3  -.33333333  -.33333333   .11111111     15

And you can see that _IcolXmea_1_2 is just _Icollcat_1 * _Imealcat_2.

tablist collcat mealcat  _Icollcat_1 _Imealcat_2 _IcolXmea_1_2 , s(v)
      collcat             mealcat  _Icollca~1  _Imealca~2  _IcolXme~2   Freq
            1                   1   .66666667   .33333333   .22222222     35
            1                   2   .66666667   .33333333   .22222222     20
            1                   3   .66666667  -.66666667  -.44444444     74
            2                   1  -.33333333   .33333333  -.11111111     43
            2                   2  -.33333333   .33333333  -.11111111     43
            2                   3  -.33333333  -.66666667   .22222222     48
            3                   1  -.33333333   .33333333  -.11111111     53
            3                   2  -.33333333   .33333333  -.11111111     69
            3                   3  -.33333333  -.66666667   .22222222     15
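The product relationship can also be checked by hand outside of Stata. Below is a minimal Python sketch (not part of the original analysis) that takes the contrast codes as they appear in the tablist output above and multiplies them cell by cell. The function name interaction is hypothetical and used only for illustration.

```python
from fractions import Fraction

# Contrast codes read off the tablist output above, for levels 1, 2, 3.
collcat_1 = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]  # collcat(1 vs. 2+)
mealcat_1 = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]  # mealcat(1 vs. 2)
mealcat_2 = [Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)]   # mealcat(2 vs. 3)


def interaction(code_a, code_b):
    """Cell-by-cell product of two main-effect codes (rows: a, columns: b)."""
    return [[a * b for b in code_b] for a in code_a]


colXmea_1_1 = interaction(collcat_1, mealcat_1)
colXmea_1_2 = interaction(collcat_1, mealcat_2)

# Print the products as decimals, one row per level of collcat.
for name, table in (("_IcolXmea_1_1", colXmea_1_1), ("_IcolXmea_1_2", colXmea_1_2)):
    print(name)
    for row in table:
        print("   ", [f"{float(v): .8f}" for v in row])
```

Each printed row corresponds to one level of collcat and each column to one level of mealcat, so the values can be compared directly against the _IcolXme~1 and _IcolXme~2 columns shown by tablist.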

6.5. Interaction contrasts

Above we saw that a partial interaction allows you to apply contrast coefficients to one of the terms in a two-way interaction. An interaction contrast allows you to apply contrast coefficients to both of the terms in a two-way interaction.

For example, say that with respect to collcat we wish to compare groups 2 and 3, and with respect to mealcat we wish to compare groups 1 and 2. The contrast coefficients for this comparison are laid out in the table below.

                            Collcat Low   Collcat Med   Collcat High
                                 0            -1              1
        -1  Mealcat Low
         1  Mealcat Med
         0  Mealcat High

We would also like to form a second interaction contrast that again compares groups 2 and 3 with respect to collcat, but this time compares groups 2 and 3 on mealcat. A table of this comparison is shown below.

                            Collcat Low   Collcat Med   Collcat High
                                 0            -1              1
         0  Mealcat Low
        -1  Mealcat Med
         1  Mealcat High

If we look at the graph of the predicted values (repeated below) we constructed before, it compares the dashed and dotted lines (collcat 2 versus 3) by mealcat 1 versus 2, and then again by mealcat 2 versus 3.

Image statar3-1

6.5.1 Analyzing interaction contrasts using xi3 and regress

Because we would like to compare groups 1 versus 2, and then groups 2 versus 3, on mealcat, we use forward difference coding for mealcat (which compares 1 versus 2, then 2 versus 3). For collcat we wish to compare groups 2 and 3, so we can use Helmert coding as we did above (since this compares 1 versus 2 and 3, then 2 versus 3).

xi3: regress api00 h.collcat*f.mealcat
    h.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_3 omitted)
    f.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    h.col~t*f.mea~t   _IcolXmea_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Icollcat_1 |  -25.04078   8.345388    -3.00   0.003    -41.44823   -8.633335
     _Icollcat_2 |  -2.810937   9.329377    -0.30   0.763    -21.15295    15.53108
     _Imealcat_1 |   181.0414   9.077126    19.94   0.000     163.1953    198.8874
     _Imealcat_2 |   112.3689   9.907594    11.34   0.000     92.89009    131.8477
    _IcolXme~1_1 |    69.7844    21.4752     3.25   0.001     27.56308    112.0057
    _IcolXme~1_2 |  -25.40675   21.06663    -1.21   0.229    -66.82479    16.01128
    _IcolXme~2_1 |   62.53325   19.33438     3.23   0.001      24.5209    100.5456
    _IcolXme~2_2 |   13.86697   24.21132     0.57   0.567    -33.73369    61.46763
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------
    

If we are not sure what term we want to use, we can use the describe command to show the labels for the interaction terms.

describe  _IcolXmea*
                  storage  display   value
    variable name   type   format    label   variable label
    -------------------------------------------------------------------------------
    _IcolXmea_1_1   double %10.0g            collcat(1 vs. 2+) & mealcat(1 vs. 2)
    _IcolXmea_1_2   double %10.0g            collcat(1 vs. 2+) & mealcat(2 vs. 3)
    _IcolXmea_2_1   double %10.0g            collcat(2 vs. 3) & mealcat(1 vs. 2)
    _IcolXmea_2_2   double %10.0g            collcat(2 vs. 3) & mealcat(2 vs. 3)

The first interaction comparison of interest is tested by _IcolXmea_2_1 , and this term is significant. As we expect, the red and green lines are not parallel when we compare mealcat 1 and 2.

The second interaction comparison of interest is tested by _IcolXmea_2_2 , and this term is not significant. Looking at the graph, we can see that the red and green lines are mostly parallel between mealcat 2 and 3.

6.5.2 Coding of interaction contrasts

The term _IcolXmea_1_1 is just the product of the respective main effects, as shown below.

tablist collcat mealcat  _Icollcat_1 _Imealcat_1 _IcolXmea_1_1 , s(v)
      collcat             mealcat  _Icollca~1  _Imealca~1  _IcolXme~1   Freq
            1                   1   .66666667   .66666667   .44444444     35
            1                   2   .66666667  -.33333333  -.22222222     20
            1                   3   .66666667  -.33333333  -.22222222     74
            2                   1  -.33333333   .66666667  -.22222222     43
            2                   2  -.33333333  -.33333333   .11111111     43
            2                   3  -.33333333  -.33333333   .11111111     48
            3                   1  -.33333333   .66666667  -.22222222     53
            3                   2  -.33333333  -.33333333   .11111111     69
            3                   3  -.33333333  -.33333333   .11111111     15

6.6 Computing adjusted means

6.6.1 Computing adjusted means via anova

First, we show how you can compute adjusted means using the anova command. We use the same model that we have been using, with collcat, mealcat and the interaction of these two variables, and we add emer as a continuous covariate.

anova api00 collcat mealcat collcat*mealcat emer, contin(emer)
                               Number of obs =     400     R-squared     =  0.7930
                               Root MSE      = 65.4617     Adj R-squared =  0.7882
    
                      Source |  Partial SS    df       MS           F     Prob > F
             ----------------+----------------------------------------------------
                       Model |  6402428.26     9  711380.918     166.01     0.0000
                             |
                     collcat |  34730.0899     2  17365.0449       4.05     0.0181
                     mealcat |  3017331.85     2  1508665.92     352.06     0.0000
             collcat*mealcat |  96789.1156     4  24197.2789       5.65     0.0002
                        emer |  158713.455     1  158713.455      37.04     0.0000
                             |
                    Residual |  1671243.73   390  4285.24034   
             ----------------+----------------------------------------------------
                       Total |  8073672.00   399  20234.7669   

After performing the anova, we can use the adjust command to get adjusted means broken down by collcat and mealcat. These adjusted means are the means that would be expected if every school in the sample were at the mean of the variable emer. Note that it is possible to compute adjusted means with emer held at a value other than its mean; for example, if we had specified emer=50, the adjusted means would have been computed as though every school had a value of 50 on emer.

adjust emer , by(collcat mealcat)
    --------------------------------------------------------------------------
         Dependent variable: api00     Command: anova
      Covariate set to mean: emer = 12.6575
    --------------------------------------------------------------------------
    
    -------------------------------------
              |Percentage free meals in 3
              |        categories        
      collcat |       1        2        3
    ----------+--------------------------
            1 |  797.56  596.973  509.872
            2 |  812.55  636.405  523.885
            3 | 767.935  652.976  550.462
    -------------------------------------
         Key:  Linear Prediction

6.6.2 Computing adjusted means via regress

Now we illustrate how to get the same adjusted means if you were to do the analysis via the regress command. First, we perform the regression analysis that is equivalent to the anova command above.

xi3: regress api00 s.collcat*s.mealcat emer
    s.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_1 omitted)
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_1 omitted)
    s.col~t*s.mea~t   _IcolXmea_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  9,   390) =  166.01
           Model |  6402428.26     9  711380.918           Prob > F      =  0.0000
        Residual |  1671243.73   390  4285.24034           R-squared     =  0.7930
    -------------+------------------------------           Adj R-squared =  0.7882
           Total |  8073672.00   399  20234.7669           Root MSE      =  65.462
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Icollcat_2 |   22.81146   8.713721     2.62   0.009     5.679712     39.9432
     _Icollcat_3 |   22.32251   9.588069     2.33   0.020     3.471742    41.17328
     _Imealcat_2 |  -163.8973   9.131088   -17.95   0.000    -181.8497    -145.945
     _Imealcat_3 |  -264.6091   10.20556   -25.93   0.000    -284.6739   -244.5443
    _IcolXme~2_2 |   24.44231   23.26715     1.05   0.294    -21.30242    70.18704
    _IcolXme~2_3 |  -.9774027    19.2525    -0.05   0.960    -38.82908    36.87428
    _IcolXme~3_2 |   85.62852   22.04718     3.88   0.000     42.28233    128.9747
    _IcolXme~3_3 |   70.21457   23.47354     2.99   0.003     24.06406    116.3651
            emer |   -2.00997   .3302709    -6.09   0.000    -2.659304   -1.360636
           _cons |   675.2877    5.55622   121.54   0.000     664.3638    686.2116
    ------------------------------------------------------------------------------

To create the adjusted means, we wish to assume that all of the schools are at the average on the variable emer. We do this by assigning the average of emer to the variable emer, but first we rename the original emer to temer so that we don’t destroy the contents of this variable.

rename emer temer
egen emer = mean(temer)

Now we create yhat as the predicted value. Since the value of emer is set to the mean of emer, this will be the predicted value assuming that all schools are at the average for emer.

predict yhat

Now, we can look at the average of yhat broken down by collcat and mealcat, which you can see corresponds to the adjusted means that we found with the adjust command following the anova command above. 

tabulate collcat mealcat, sum(yhat) nostandard nofreq
            Means of Fitted values
    
               |  Percentage free meals in 3
               |          categories
       collcat |         1          2          3 |     Total
    -----------+---------------------------------+----------
             1 | 797.56042  596.97284  509.87225 | 601.43115
             2 | 812.55023  636.40497  523.88464 | 652.62341
             3 | 767.93524  652.97614  550.46161 | 686.22515
    -----------+---------------------------------+----------
         Total | 790.49498   639.0926  519.22579 |  647.6225
    

We then drop the variables emer and yhat since we no longer need them, and rename temer back to emer so that the emer variable is back to the way it was before this process.

drop yhat emer
rename temer emer
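To see what predict is doing in the steps above, the adjusted means can also be rebuilt by hand from the regression coefficients. The Python sketch below is only an illustration, not part of the original analysis. It assumes that the simple coding created by xi3 assigns 2/3 to the level being compared and -1/3 to the other two levels (an assumption about xi3, not taken from its documentation), and the helper names simple_code and adjusted_mean are hypothetical.

```python
# Coefficients copied from the regress output above.
CONS = 675.2877
B_EMER = -2.00997
B_COLL = {2: 22.81146, 3: 22.32251}      # _Icollcat_2, _Icollcat_3
B_MEAL = {2: -163.8973, 3: -264.6091}    # _Imealcat_2, _Imealcat_3
B_INT = {(2, 2): 24.44231, (2, 3): -0.9774027,
         (3, 2): 85.62852, (3, 3): 70.21457}   # _IcolXmea_#_#
EMER_MEAN = 12.6575                      # mean of emer reported by adjust


def simple_code(level, contrast_level, k=3):
    """Assumed simple coding: (k-1)/k for the contrasted level, -1/k otherwise."""
    return (k - 1) / k if level == contrast_level else -1 / k


def adjusted_mean(collcat, mealcat, emer=EMER_MEAN):
    """Predicted api00 for one cell with the covariate emer held at a fixed value."""
    yhat = CONS + B_EMER * emer
    for j, b in B_COLL.items():
        yhat += b * simple_code(collcat, j)
    for j, b in B_MEAL.items():
        yhat += b * simple_code(mealcat, j)
    for (i, j), b in B_INT.items():
        # Interaction variables are products of the main-effect codes.
        yhat += b * simple_code(collcat, i) * simple_code(mealcat, j)
    return yhat


# One row per level of collcat, one column per level of mealcat.
for c in (1, 2, 3):
    print(c, [round(adjusted_mean(c, m), 3) for m in (1, 2, 3)])
```

Under these assumptions the printed cells should agree with the tabulate output above up to rounding of the coefficients. Calling adjusted_mean(1, 1, emer=50) would give the corresponding adjusted mean with emer held at 50 instead of at its mean.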

6.6.3 Computing adjusted means via postgr3

The postgr3 command can be used to simplify the process of computing adjusted means (i.e., predicted values when holding other variables constant). Let’s assume that you have run the same regression as shown above.

. xi3: regress api00 s.collcat*s.mealcat emer 
<output omitted to save space>

You can then show the graph of adjusted means and the table of adjusted means using postgr3 as shown below. Below we show just the table of adjusted means, and you can see that they correspond to those computed above. We should stress that it is important to use the xi3 command (rather than xi) before using postgr3, because then postgr3 knows which variables should be held constant (in this example, emer) and which variables should not be held constant (in this example, _Imealcat_2 through _IcolXmea_3_3).

. postgr3 mealcat, by(collcat) connect(ll[_]l[.])  table2
Variables left asis: _Imealcat_2 _Imealcat_3 _Icollcat_2 _Icollcat_3 _IcolXmea_2_2 
  _IcolXmea_2_3 _IcolXmea_3_2 _IcolXmea_3_3
Holding emer constant at 12.6575

(option xb assumed; fitted values)

                          Means of Fitted values

           |  Percentage free meals in 3
           |          categories
   collcat |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 | 797.56042  596.97284  509.87225 | 601.43115
         2 | 812.55023  636.40497  523.88464 | 652.62341
         3 | 767.93524  652.97614  550.46161 | 686.22515
-----------+---------------------------------+----------
     Total | 790.49498   639.0926  519.22579 |  647.6225

6.7 More details on meaning of coefficients

So far we have discussed a variety of techniques that you can use to help interpret interactions of categorical variables in regression, but we have not gone into great detail about the meaning of the coefficients in these analyses. Let’s consider this further. Consider the analysis below using collcat and mealcat, using simple contrasts on both of these variables.

xi3: regress api00 s.collcat*s.mealcat
    s.collcat         _Icollcat_1-3       (naturally coded; _Icollcat_1 omitted)
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_1 omitted)
    s.col~t*s.mea~t   _IcolXmea_#_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  8,   391) =  166.76
           Model |  6243714.81     8  780464.351           Prob > F      =  0.0000
        Residual |  1829957.19   391  4680.19741           R-squared     =  0.7733
    -------------+------------------------------           Adj R-squared =  0.7687
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.412
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Icollcat_2 |   23.63531   9.105331     2.60   0.010     5.733782    41.53685
     _Icollcat_3 |   26.44625   9.995129     2.65   0.008     6.795332    46.09717
     _Imealcat_2 |  -181.0414   9.077126   -19.94   0.000    -198.8874   -163.1953
     _Imealcat_3 |  -293.4103   9.449459   -31.05   0.000    -311.9884   -274.8322
    _IcolXme~2_2 |   38.51777   24.19532     1.59   0.112    -9.051421    86.08697
    _IcolXme~2_3 |   6.177537   20.08262     0.31   0.759     -33.3059    45.66097
    _IcolXme~3_2 |    101.051   22.88808     4.42   0.000     56.05192    146.0501
    _IcolXme~3_3 |   82.57776   24.43941     3.38   0.001     34.52867    130.6268
           _cons |   650.0883   3.871885   167.90   0.000     642.4759    657.7006
    ------------------------------------------------------------------------------

We can produce the adjusted means as shown below. These will be useful for interpreting the meaning of the coefficients.

predict yhat
tabulate collcat mealcat, sum(yhat) nofreq nostandard
                              Means of Fitted values
    
               |  Percentage free meals in 3
               |          categories
       collcat |         1          2          3 |     Total
    -----------+---------------------------------+----------
             1 | 816.91431  589.34998  493.91891 | 596.34884
             2 | 825.65118  636.60468  508.83334 | 651.50002
             3 | 782.15094   655.6377  541.73334 |  692.1095
    -----------+---------------------------------+----------
         Total | 805.71757  639.39395  504.37956 | 647.62251

We drop the variable yhat since we no longer need it, which also frees the name in case we wish to use it again.

drop yhat 

Let’s consider the meaning of the coefficient for _Icollcat_2. The coding for this variable compares group 2 versus group 1; hence, this coefficient corresponds to mean(collcat2) – mean(collcat1). Note that these are the unweighted means, so we compute the mean for collcat2 as the mean of the three cells corresponding to collcat2, i.e., (825.651+636.605+508.833)/3 . If we compare the result below to the coefficient for _Icollcat_2 we see that they are the same.

display (825.651+636.605+508.833)/3 - (816.914+589.35+493.919)/3
    23.635333

Likewise, the coefficient for _Icollcat_3 is mean(collcat3) – mean(collcat1), computed below. The value below corresponds to the coefficient for _Icollcat_3.

display (782.151+655.638+541.733)/3 - (816.914+589.35+493.919)/3
    26.446333

Likewise, the coefficient for _Imealcat_2 works out to be mean(mealcat2) – mean(mealcat1), see below.

display (589.35+636.605+655.638)/3 - (816.914+825.651+782.151)/3
    -181.041

And the coefficient for _Imealcat_3 is mean(mealcat3) – mean(mealcat1), see below.

display (493.919+508.833+541.733)/3 - (816.914+825.651+782.151)/3
    -293.41033
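The four display commands above all follow the same pattern, which can be summarized in a short Python sketch (an illustration added here, not part of the original analysis). It uses the cell means and the Total column from the tabulate output above, and it also shows why the unweighted means matter: the Total column holds means weighted by cell size, and their difference does not match the coefficient.

```python
# Cell means from the tabulate output above.
# Rows are collcat 1-3, columns are mealcat 1-3.
cell_means = [
    [816.91431, 589.34998, 493.91891],
    [825.65118, 636.60468, 508.83334],
    [782.15094, 655.63770, 541.73334],
]


def unweighted_mean(values):
    """Plain average of the cell means, ignoring how many schools are in each cell."""
    return sum(values) / len(values)


collcat_means = [unweighted_mean(row) for row in cell_means]
mealcat_means = [unweighted_mean(col) for col in zip(*cell_means)]

# Each main-effect coefficient is a difference of unweighted means from group 1.
icollcat_2 = collcat_means[1] - collcat_means[0]
icollcat_3 = collcat_means[2] - collcat_means[0]
imealcat_2 = mealcat_means[1] - mealcat_means[0]
imealcat_3 = mealcat_means[2] - mealcat_means[0]
print(round(icollcat_2, 3), round(icollcat_3, 3))
print(round(imealcat_2, 3), round(imealcat_3, 3))

# For contrast: the Total column of the table holds weighted means, and the
# difference between collcat 2 and collcat 1 there is not the coefficient.
weighted_diff = 651.50002 - 596.34884
print(round(weighted_diff, 3))
```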

To get the meaning of the coefficients for the interaction terms, we need to multiply the contrast coding of the main effects that created the interaction terms. For example, the term _IcolXme~2_2 is the product of _Icollcat_2 and _Imealcat_2. We can form a 3 by 3 table showing the coding for _Icollcat_2 along the top and _Imealcat_2 on the left, and then multiply these terms together and place the products in the cells of the table, as shown below.

                            Collcat Low   Collcat Med   Collcat High
                                -1             1              0
        -1  Mealcat Low          1            -1              0
         1  Mealcat Med         -1             1              0
         0  Mealcat High         0             0              0

We can then multiply these terms in the cells by the means of the cells, and we get the value of the coefficient for _IcolXme~2_2. In other words, we see that this coefficient corresponds to the sum of the means of cells (1,1) and (2,2) minus the sum of the means of cells (1,2) and (2,1).

display ( 816.914 - 589.35 -  825.651 +  636.605 )
    38.518

We can go through the same process to verify the meaning of the coefficients for the other three interaction terms. We verify that _IcolXme~2_3 is 6.177.

display ( 816.914 - 493.919 -  825.651 + 508.833)
    6.177

We also verify that _IcolXme~3_2 is 101.051.

display ( 816.914 - 589.35 -  782.151 +  655.638 )
    101.051

And we verify that _IcolXme~3_3 is 82.577.

display ( 816.914 - 493.919 -  782.151 + 541.733 )
    82.577
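The same multiply-and-sum recipe applies to all four interaction terms, so it can be written once. The Python sketch below is an illustration only (the helper names pairwise and interaction_contrast are hypothetical). It builds the -1/1/0 contrast weights comparing a group against group 1, multiplies the collcat and mealcat weights cell by cell, and sums the weighted cell means.

```python
# Cell means from the tabulate output above.
# Rows are collcat 1-3, columns are mealcat 1-3.
cell_means = [
    [816.91431, 589.34998, 493.91891],
    [825.65118, 636.60468, 508.83334],
    [782.15094, 655.63770, 541.73334],
]


def pairwise(ref, level):
    """Weights over levels 1..3: +1 for `level`, -1 for `ref`, 0 otherwise."""
    return [1 if k == level else -1 if k == ref else 0 for k in (1, 2, 3)]


def interaction_contrast(coll_level, meal_level):
    """Sum of cell means weighted by the product of the two contrasts."""
    w_coll = pairwise(1, coll_level)
    w_meal = pairwise(1, meal_level)
    return sum(w_coll[i] * w_meal[j] * cell_means[i][j]
               for i in range(3) for j in range(3))


# Compare each value with the coefficient for _IcolXmea_<coll>_<meal>.
for coll in (2, 3):
    for meal in (2, 3):
        print(coll, meal, round(interaction_contrast(coll, meal), 3))
```

Only the four cells with nonzero weights contribute to each sum, which is why each display command above involves exactly four cell means.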

6.8 Simple effects via dummy coding versus effect coding

You may wonder why we have gone to the effort of using xi3 for creating and testing these effects instead of just using dummy coding like we would get with the xi command. Let’s compare how to get simple effects using the xi3 command via effect coding to how we would get simple effects using xi with dummy coding. We hope to show that it is much easier to use effect coding via xi3 and that the interpretation of the coefficients is much more intuitive.

6.8.1 Example 1. Simple effects of yr_rnd at levels of mealcat

Let’s use an example from Chapter 3 (section 3.5). In that example we looked at an analysis using mealcat and yr_rnd and the interaction of these two variables. First, we look at how to do a simple effects analysis looking at the simple effects of yr_rnd at each level of mealcat using the xi3 command with effect coding. To make our results correspond to those from Chapter 3, we will make group 3 of mealcat the reference category.

char mealcat[omit] 3
xi3 : regress api00 s.yr_rnd@s.mealcat
    s.yr_rnd          _Iyr_rnd_0-1        (naturally coded; _Iyr_rnd_0 omitted)
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    s.yr_~d@s.mea~t   _Iyr_Wmea_#_#       (simple effects of yr_rnd at mealcat)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  5,   394) =  261.61
           Model |  6204727.82     5  1240945.56           Prob > F      =  0.0000
        Residual |  1868944.18   394  4743.51314           R-squared     =  0.7685
    -------------+------------------------------           Adj R-squared =  0.7656
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.873
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Imealcat_1 |   267.8108   14.61559    18.32   0.000     239.0765    296.5451
     _Imealcat_2 |   114.6572   11.12812    10.30   0.000     92.77923    136.5351
    _Iyr_Wmea_~1 |  -74.25691   26.75629    -2.78   0.006    -126.8599   -21.65397
    _Iyr_Wmea_~2 |  -51.74017   18.88854    -2.74   0.006    -88.87511   -14.60524
    _Iyr_Wmea_~3 |  -33.49254   11.77129    -2.85   0.005    -56.63492   -10.35015
           _cons |   632.2356   5.800477   109.00   0.000     620.8318    643.6393
    ------------------------------------------------------------------------------

Now we can obtain the simple effect of yr_rnd at mealcat=1 by inspecting the coefficient for _Iyr_Wmea_1_1, the simple effect of yr_rnd at mealcat=2 by inspecting the coefficient for _Iyr_Wmea_1_2 and the simple effect of yr_rnd at mealcat=3 by inspecting the coefficient for _Iyr_Wmea_1_3.

Now let’s perform the same analysis using xi with dummy coding. Again, we will explicitly make the third group for mealcat to be the omitted category.

char mealcat[omit] 3
xi : regress api00 i.mealcat*yr_rnd
    i.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    i.meal~t*yr_rnd   _ImeaXyr_rn_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  5,   394) =  261.61
           Model |  6204727.82     5  1240945.56           Prob > F      =  0.0000
        Residual |  1868944.18   394  4743.51314           R-squared     =  0.7685
    -------------+------------------------------           Adj R-squared =  0.7656
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.873
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Imealcat_1 |   288.1929   10.44284    27.60   0.000     267.6623    308.7236
     _Imealcat_2 |    123.781   10.55185    11.73   0.000      103.036    144.5259
          yr_rnd |  -33.49254   11.77129    -2.85   0.005    -56.63492   -10.35015
    _ImeaXyr_r~1 |  -40.76438   29.23118    -1.39   0.164    -98.23297    16.70422
    _ImeaXyr_r~2 |  -18.24763   22.25624    -0.82   0.413    -62.00347     25.5082
           _cons |   521.4925   8.414197    61.98   0.000     504.9502    538.0349
    ------------------------------------------------------------------------------

In order to form a test of simple main effects we need to make a table like the one shown below that relates the means of the cells to the coefficients in the regression. Please see Chapter 3, section 3.5 for information on how this table was constructed.

            mealcat=1           mealcat=2         mealcat=3
            -------------------------------------------------
  yr_rnd=0  _cons               _cons             _cons    
            +BImealcat1         +BImealcat2 
            -------------------------------------------------
  yr_rnd=1  _cons               _cons             _cons    
            +Byr_rnd            +Byr_rnd          +Byr_rnd
            +BImealcat1         +BImealcat2           
            +B_ImeaXyr_rn_1     +B_ImeaXyr_rn_2 

Let’s start by looking at how to get the simple effect of yr_rnd when mealcat is 3. Looking at the table above, we can see that we would want to compare _cons with _cons + Byr_rnd. We can do this with the lincom command as shown below.

lincom _cons - (_cons + yr_rnd)
     ( 1) - yr_rnd = 0.0
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |   33.49254   11.77129     2.85   0.005     10.35015    56.63492
    ------------------------------------------------------------------------------

We see that _cons drops out, leaving just the negative of the coefficient for yr_rnd. Equivalently, we can use the test command to test whether the coefficient for yr_rnd is 0. Note that this result corresponds to the result we found with the xi3 command, which also tested the simple effect of yr_rnd when mealcat is 3.

test yr_rnd=0
     ( 1)  yr_rnd = 0.0
    
           F(  1,   394) =    8.10
                Prob > F =    0.0047

Note that the coefficient for yr_rnd corresponds to the test of the effect of yr_rnd when all other variables are set to 0 (the reference category), in other words, when mealcat is set to the reference category. You may be tempted to interpret the coefficient for yr_rnd as the overall difference between year round schools and non-year round schools, but in this example we see that it really corresponds to the simple effect of yr_rnd. When using dummy coding people commonly misinterpret the lower order effects to refer to overall effects rather than simple effects.

Now let’s look at the simple effect of yr_rnd when mealcat=1. Looking at the table above we see that this involves the comparison of the coefficients for yr_rnd=1 versus yr_rnd=0 when mealcat=1, i.e., comparing _cons + yr_rnd + _Imealcat_1 + _ImeaXyr_rn_1 versus _cons + _Imealcat_1. Removing the terms that drop out we can do the test command below.

test yr_rnd + _ImeaXyr_rn_1=0
     ( 1)  yr_rnd + _ImeaXyr_rn_1 = 0.0
    
           F(  1,   394) =    7.70
                Prob > F =    0.0058

We can likewise obtain the effect of yr_rnd when mealcat is 2, as shown below.

test yr_rnd + _ImeaXyr_rn_2=0
     ( 1)  yr_rnd + _ImeaXyr_rn_2 = 0.0
    
           F(  1,   394) =    7.50
                Prob > F =    0.0064
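As a cross-check on the three tests above, the following Python sketch (an illustration, not part of the original analysis) adds up the dummy-coded coefficients and compares the sums with the simple effects that xi3 reported directly. It also uses the fact that an F test with 1 numerator degree of freedom equals the square of the corresponding t statistic, so the F values from the test command can be recovered from the xi3 coefficients and standard errors.

```python
# Coefficients from the dummy-coded model (xi : regress api00 i.mealcat*yr_rnd).
b_yr_rnd = -33.49254
b_int = {1: -40.76438, 2: -18.24763}   # _ImeaXyr_rn_1, _ImeaXyr_rn_2

# Simple effects of yr_rnd from the xi3 model: (coefficient, standard error).
xi3_simple = {
    1: (-74.25691, 26.75629),
    2: (-51.74017, 18.88854),
    3: (-33.49254, 11.77129),
}

# Simple effect at mealcat=3 (the omitted category) is yr_rnd itself;
# at mealcat=1 and mealcat=2 the matching interaction coefficient is added.
dummy_simple = {3: b_yr_rnd}
for level, b in b_int.items():
    dummy_simple[level] = b_yr_rnd + b

# F for a 1 degree of freedom test is the squared t statistic.
f_stats = {level: (coef / se) ** 2 for level, (coef, se) in xi3_simple.items()}

for level in (1, 2, 3):
    print(level, round(dummy_simple[level], 4),
          xi3_simple[level][0], round(f_stats[level], 2))
```

Both routes lead to the same simple effects; the difference is that with dummy coding we have to work out which coefficients to combine, whereas xi3 reports each simple effect as a single coefficient.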

These examples illustrate that it is more complicated to form simple effects when using dummy coding, and also that the interpretation of lower order effects when using dummy coding may not have the meaning that you would expect.

6.8.2 Example 2. Simple effects of mealcat at levels of yr_rnd

Example 1 looked at simple effects for yr_rnd, a variable with only two levels. In this example, let’s consider the simple effects of mealcat at each level of yr_rnd. Because mealcat has more than two levels, we can see what is required for doing tests of simple effects for variables with more than two levels.

First, let’s show how to get these simple effects using the xi3 command with effect coding.

xi3 : regress api00 s.mealcat@s.yr_rnd
    s.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    s.yr_rnd          _Iyr_rnd_0-1        (naturally coded; _Iyr_rnd_0 omitted)
    s.mea~t@s.yr_~d   _ImeaWyr__#_#       (simple effects of mealcat at yr_rnd)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  5,   394) =  261.61
           Model |  6204727.82     5  1240945.56           Prob > F      =  0.0000
        Residual |  1868944.18   394  4743.51314           R-squared     =  0.7685
    -------------+------------------------------           Adj R-squared =  0.7656
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.873
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      _Iyr_rnd_1 |  -53.16321   11.60095    -4.58   0.000    -75.97072    -30.3557
    _ImeaWyr~1_1 |   288.1929   10.44284    27.60   0.000     267.6623    308.7236
    _ImeaWyr~2_1 |    123.781   10.55185    11.73   0.000      103.036    144.5259
    _ImeaWyr~1_2 |   247.4286   27.30218     9.06   0.000     193.7524    301.1047
    _ImeaWyr~2_2 |   105.5333   19.59588     5.39   0.000     67.00776    144.0589
           _cons |   632.2356   5.800477   109.00   0.000     620.8318    643.6393
    ------------------------------------------------------------------------------

We can get the simple effect of mealcat at yr_rnd = 0 just as we did earlier in this chapter.

test  _ImeaWyr__1_1 _ImeaWyr__2_1
     ( 1)  _ImeaWyr__1_1 = 0.0
     ( 2)  _ImeaWyr__2_1 = 0.0
    
           F(  2,   394) =  411.46
                Prob > F =    0.0000

And we likewise get the simple effect of mealcat at yr_rnd = 1 as shown below.

test  _ImeaWyr__1_2 _ImeaWyr__2_2
     ( 1)  _ImeaWyr__1_2 = 0.0
     ( 2)  _ImeaWyr__2_2 = 0.0
    
           F(  2,   394) =   50.19
                Prob > F =    0.0000

We can now test the simple effects of mealcat at each level of yr_rnd via dummy coding.

xi : regress api00 i.mealcat*yr_rnd
    i.mealcat         _Imealcat_1-3       (naturally coded; _Imealcat_3 omitted)
    i.meal~t*yr_rnd   _ImeaXyr_rn_#       (coded as above)
    
          Source |       SS       df       MS              Number of obs =     400
    -------------+------------------------------           F(  5,   394) =  261.61
           Model |  6204727.82     5  1240945.56           Prob > F      =  0.0000
        Residual |  1868944.18   394  4743.51314           R-squared     =  0.7685
    -------------+------------------------------           Adj R-squared =  0.7656
           Total |  8073672.00   399  20234.7669           Root MSE      =  68.873
    
    ------------------------------------------------------------------------------
           api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     _Imealcat_1 |   288.1929   10.44284    27.60   0.000     267.6623    308.7236
     _Imealcat_2 |    123.781   10.55185    11.73   0.000      103.036    144.5259
          yr_rnd |  -33.49254   11.77129    -2.85   0.005    -56.63492   -10.35015
    _ImeaXyr_r~1 |  -40.76438   29.23118    -1.39   0.164    -98.23297    16.70422
    _ImeaXyr_r~2 |  -18.24763   22.25624    -0.82   0.413    -62.00347     25.5082
           _cons |   521.4925   8.414197    61.98   0.000     504.9502    538.0349
    ------------------------------------------------------------------------------

The simple effect of mealcat when yr_rnd is 0 is a 2 degree of freedom test, so it requires two test commands. We can do this by testing mean(mealcat1) = mean(mealcat2) and also testing mean(mealcat2) = mean(mealcat3), both at yr_rnd = 0. Looking at the table above, the difference mean(mealcat1) - mean(mealcat2) is _Imealcat_1 - _Imealcat_2 (_cons drops out), and the difference mean(mealcat2) - mean(mealcat3) is _Imealcat_2 (again _cons drops out). So, we can perform this test using the two test commands below.

test  _Imealcat_1- _Imealcat_2=0
     ( 1)  _Imealcat_1 - _Imealcat_2 = 0.0
    
           F(  1,   394) =  343.05
                Prob > F =    0.0000
test  _Imealcat_2, accum
     ( 1)  _Imealcat_1 - _Imealcat_2 = 0.0
     ( 2)  _Imealcat_2 = 0.0
    
           F(  2,   394) =  411.46
                Prob > F =    0.0000

Note that the effects _Imealcat_1 and _Imealcat_2 do not correspond to overall effects of the variable mealcat but are the simple effects when yr_rnd is set to 0, the reference level. Again we see that the terms that we might be tempted to call main effects and think of as overall effects really are simple effects when dummy coding is used.

The second test command uses the accum option to accumulate the tests to get the 2 degree of freedom test that corresponds to the simple effect of mealcat when yr_rnd is 0.
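
To see where these two comparisons come from, it may help to write out the predicted cell means at yr_rnd = 0. The symbols below are shorthand introduced here for illustration, not Stata names: b_0 stands for _cons, b_1 for _Imealcat_1, b_2 for _Imealcat_2, and m_k for the predicted mean of api00 when mealcat = k and yr_rnd = 0.

```latex
\begin{aligned}
m_1 &= b_0 + b_1, \qquad m_2 = b_0 + b_2, \qquad m_3 = b_0 \\
m_1 - m_2 &= b_1 - b_2 \\
m_2 - m_3 &= b_2
\end{aligned}
```

Both differences are zero exactly when the three means are equal, which is why accumulating the two tests gives the 2 degree of freedom simple effect.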

Likewise, we can look at the table above to form the comparisons needed to obtain the simple effects of mealcat when yr_rnd is 1.

test  _Imealcat_1+ _ImeaXyr_rn_1- _Imealcat_2- _ImeaXyr_rn_2=0
     ( 1)  _Imealcat_1 - _Imealcat_2 + _ImeaXyr_rn_1 - _ImeaXyr_rn_2 = 0.0
    
           F(  1,   394) =   20.26
                Prob > F =    0.0000
test  _Imealcat_2+ _ImeaXyr_rn_2=0, accum
     ( 1)  _Imealcat_1 - _Imealcat_2 + _ImeaXyr_rn_1 - _ImeaXyr_rn_2 = 0.0
     ( 2)  _Imealcat_2 + _ImeaXyr_rn_2 = 0.0
    
           F(  2,   394) =   50.19
                Prob > F =    0.0000
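
The same bookkeeping, sketched here for yr_rnd = 1, shows why the interaction terms enter the two test commands above. As shorthand introduced for illustration (not Stata names), let b_0 stand for _cons, b_1 for _Imealcat_1, b_2 for _Imealcat_2, b_3 for yr_rnd, b_4 for _ImeaXyr_rn_1, b_5 for _ImeaXyr_rn_2, and let m_k be the predicted mean of api00 when mealcat = k and yr_rnd = 1.

```latex
\begin{aligned}
m_1 &= b_0 + b_1 + b_3 + b_4 \\
m_2 &= b_0 + b_2 + b_3 + b_5 \\
m_3 &= b_0 + b_3 \\
m_1 - m_2 &= b_1 + b_4 - b_2 - b_5 \\
m_2 - m_3 &= b_2 + b_5
\end{aligned}
```

Here both _cons and yr_rnd drop out of each difference, leaving only the mealcat and interaction coefficients.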

With this example we hoped to illustrate that testing simple effects for a variable with more than two levels under dummy coding can be quite tricky, because it requires one test command for every degree of freedom in the simple effect. As you can see, constructing these terms is error prone, and without a method for double checking results it is easy to make a mistake and form the wrong comparison. By comparison, when using effect coding with xi3, forming the comparisons can be much easier and the interpretation of the lower order effects is much more intuitive. The lower order effects do correspond to the overall effects of the variable. For example, the effect of yr_rnd, when using effect coding, compares the overall unweighted mean for the year round schools to that of the non-year round schools.
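
One possible way to double check hand-built comparisons like these is the lincom command, which reports the estimate of a linear combination of coefficients. The sketch below assumes the dummy-coded xi : regress model shown above is still the active estimation and uses the same variable names as the test commands above.

```stata
* Sketch only: run right after the dummy-coded model
*   xi : regress api00 i.mealcat*yr_rnd
* so that its coefficients are the active estimation results.

* mealcat 1 versus mealcat 3 at yr_rnd = 1
lincom _Imealcat_1 + _ImeaXyr_rn_1

* mealcat 2 versus mealcat 3 at yr_rnd = 1
lincom _Imealcat_2 + _ImeaXyr_rn_2
```

If the terms are built correctly, the two estimates should agree with the coefficients _ImeaWyr~1_2 and _ImeaWyr~2_2 from the xi3 model at the start of this example. From the tables above, 288.1929 - 40.76438 is about 247.43 and 123.781 - 18.24763 is about 105.53.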
