## Method

This page shows one method for estimating effect sizes for mixed models in Stata. Specifically, we will estimate Cohen’s \(f^2\) effect size measure using the method described by Selya et al. (2012; see References at the bottom).

Here is the formula we will use to estimate the (fixed) effect size for predictor \(b\), \(f^2_b\), in a mixed model:

\[f^2_b = \frac{R^2_{ab}-R^2_a}{1-R^2_{ab}} \]

\(R^2_{ab} \) represents the proportion of variance of the outcome explained by all the predictors in a full model, including predictor \(b\). \(1-R^2_{ab}\) in the denominator thus represents the proportion of variance of the outcome not explained by the full model.

\(R^2_a\) represents the proportion of variance of the outcome explained by the predictors in a reduced model *with all fixed effects from the full model except for the effect of \(b\), and random effects constrained to be the same as those from the full model*. \(R^2_{ab} - R^2_a\) in the numerator is thus the additional proportion of variance of the outcome solely attributable to \(b\).

Unlike in linear regression models, \(R^2\) is not readily available in the output of mixed models, whereas residual variances typically are, so we will calculate \(R^2\) from residual variances:

\[R^2 = \frac{V_{null} - V_{model}}{V_{null}} \]

where \(V_{null}\) is the residual variance of a null model with only the intercept and random effects, and \(V_{model}\) is the residual variance of the model that includes both fixed and random effects. We can thus interpret \(R^2\) from a mixed model as the additional variance explained by the predictors’ effects over the random effects (and intercept).
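As a quick numeric illustration of this formula (using made-up residual variances, not the ones estimated later on this page):

```python
def r_squared(v_null, v_model):
    """Proportion of outcome variance explained by the predictors,
    computed from the residual variances of the null and full models."""
    return (v_null - v_model) / v_null

# Hypothetical values: the predictors reduce the unexplained residual
# variance from 10 to 6, so they explain 40% of it.
print(r_squared(10, 6))  # 0.4
```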

We can substitute the residual variances into the formula for \(f^2_b\):

\[f^2_b = \frac{\frac{V_{null}-V_{ab}}{V_{null}} - \frac{V_{null}-V_{a}}{V_{null}}}{1 - \frac{V_{null}-V_{ab}}{V_{null}}} \]

We thus need the residual variances \(V_{null}\), \(V_{ab}\) and \(V_{a}\) to calculate our effect size \(f^2_b\).
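As a side note, the \(V_{null}\) terms in the expression above cancel algebraically, so the effect size can equivalently be written directly in terms of the reduced- and full-model residual variances:

\[f^2_b = \frac{\frac{V_{a}-V_{ab}}{V_{null}}}{\frac{V_{ab}}{V_{null}}} = \frac{V_{a}-V_{ab}}{V_{ab}} \]

We nevertheless estimate \(V_{null}\) below, because the intermediate \(R^2\) values are informative in their own right.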

## Use `meglm` instead of `mixed`

Because of the constraint that random effects in the reduced and null models be the same as those from the full model, we use the `meglm` command rather than `mixed`, because `meglm` allows `constraints()` whereas `mixed` does not. By default, without any further specification of `family()` or `link()`, `meglm` runs linear mixed models.

Residual variances of `meglm` models are “stored results” in Stata, so they can be accessed through the `ereturn` suite of commands.

## Example

For our example, we will use the hsbdemo data set. We are interested in estimating the effect size of predictor **female** on the outcome **write** in a full model that includes the covariate **read** and random intercepts by classroom, **cid**.

```
use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear
```

### Full model

We first run the full model with both female and read as fixed effects, and random intercepts by cid (some output omitted throughout this page):

```
meglm write female read || cid:

Mixed-effects GLM                               Number of obs     =        200
Family:                Gaussian
Link:                  identity
Group variable:             cid                 Number of groups  =         20

                                                Obs per group:
                                                              min =          7
                                                              avg =       10.0
                                                              max =         12

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(2)      =      42.60
Log likelihood = -627.40316                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   3.903393   .6995899     5.58   0.000     2.532222    5.274564
        read |  -.1843351   .0692818    -2.66   0.008    -.3201249   -.0485452
       _cons |   60.41431   4.247365    14.22   0.000     52.08963    68.73899
-------------+----------------------------------------------------------------
cid          |
   var(_cons)|    86.0646   29.97979                       43.4827    170.3463
-------------+----------------------------------------------------------------
var(e.write) |   21.43537   2.278679                      17.40381    26.40084
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 95.62   Prob >= chibar2 = 0.0000
```

As you can see, **female** is statistically significant.

In the output above, we see that the residual variance, **var(e.write)**, is the fifth coefficient. Coefficients are typically stored in matrix **e(b)**. We store these results in our own matrix **ab**, which we then view with `matrix list`:

```
matrix ab = e(b)
matrix list ab

ab[1,5]
          write:       write:       write:  var(_cons~):  var(e.wri~):
         female         read        _cons         _cons         _cons
y1    3.9033927   -.18433505     60.41431     86.064602     21.435374
```

We capture the residual error variance of the full model in global macro **Vab**:

```
global Vab = ab[1,5]
```

We also need to capture the random intercept variance, because in this method the reduced model is constrained to have the same random effects as the full model, so that the only effect that differs between the two models is the predictor that has been removed (whose effect size we are estimating). We see in the output table and the matrix listing for **e(b)** that the random intercept variance is the fourth coefficient. Here, we set up a `constraint`, labeled constraint 1, that will fix the random intercept variance in the reduced model to be equal to this random intercept variance. We will use this constraint for the reduced and null models:

```
constraint 1 _b[var(_cons[cid]):_cons] = ab[1,4]
```

*Note:* In order to get the name of the random intercept variance coefficient to use in `constraint`, run the `meglm` model with the option `coeflegend`:

```
meglm write female read || cid:, coeflegend
```

### Reduced model with constrained random intercept variance

Next we run a model without the effect of interest, **female**, but with the random intercept variance constrained (using constraint 1 defined above) to be the same as in the full model.

```
meglm write read || cid:, constraints(1)

Mixed-effects GLM                               Number of obs     =        200
Family:                Gaussian
Link:                  identity
Group variable:             cid                 Number of groups  =         20

                                                Obs per group:
                                                              min =          7
                                                              avg =       10.0
                                                              max =         12

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(1)      =       9.52
Log likelihood = -641.9216                      Prob > chi2       =     0.0020
 ( 1)  [var(_cons[cid])]_cons = 86.0646
------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |  -.2123955   .0688434    -3.09   0.002    -.3473261   -.0774649
       _cons |   64.03051   4.170172    15.35   0.000     55.85712     72.2039
-------------+----------------------------------------------------------------
cid          |
   var(_cons)|    86.0646  (constrained)
-------------+----------------------------------------------------------------
var(e.write) |   24.99122   2.656517                      20.29114    30.77999
------------------------------------------------------------------------------
```

Notice how the random intercept variance has been constrained to be the same as in the full model above.

In this case the residual variance is the *fourth* coefficient (since we no longer have a coefficient for female).

```
matrix a = e(b)
matrix list a

a[1,4]
          write:       write:  var(_cons~):  var(e.wri~):
           read        _cons         _cons         _cons
y1   -.21239548    64.030511       86.0646     24.991221
```

We capture the residual variance of the reduced model in global macro **Va**:

```
global Va = a[1,4]
```

### Null model

Finally, we remove all predictors from the model and retain only the random intercepts. We still constrain the variance of the random intercepts to be the same as the full model:

```
meglm write || cid:, constraints(1)

Mixed-effects GLM                               Number of obs     =        200
Family:                Gaussian
Link:                  identity
Group variable:             cid                 Number of groups  =         20

                                                Obs per group:
                                                              min =          7
                                                              avg =       10.0
                                                              max =         12

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(0)      =          .
Log likelihood = -646.4063                      Prob > chi2       =          .
 ( 1)  [var(_cons[cid])]_cons = 86.0646
------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   52.92335   2.107457    25.11   0.000     48.79281    57.05389
-------------+----------------------------------------------------------------
cid          |
   var(_cons)|    86.0646  (constrained)
-------------+----------------------------------------------------------------
var(e.write) |   27.27164   2.871684                      22.18607    33.52294
------------------------------------------------------------------------------
```

Now the residual variance is the third coefficient:

```
matrix null = e(b)
matrix list null

null[1,3]
          write:  var(_cons~):  var(e.wri~):
          _cons         _cons         _cons
y1    52.923354       86.0646     27.271639
```

We capture the residual variance of the null model in global macro **Vnull**:

```
global Vnull = null[1,3]
```

## Calculation of effect size and \(R^2\) values

We now have the residual variances, \(V_{ab}\), \(V_{a}\), and \(V_{null}\), necessary to calculate the effect size of predictor female, \(f^2_b\).

Because they are interesting quantities themselves, we first calculate \(R^2_{ab}\) and \(R^2_a\) and display their values.

```
global R2ab = ($Vnull - $Vab)/$Vnull
global R2a = ($Vnull - $Va)/$Vnull

display "Proportion explained full model = $R2ab"
Proportion explained full model = .2140049273598291

display "Proportion explained reduced model = $R2a"
Proportion explained reduced model = .0836186727647698
```

Finally, we compute the effect size and display its value:

```
global f2b = ($R2ab - $R2a)/(1-$R2ab)

display "Effect size = $f2b"
Effect size = .1658868600245669
```
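As a sanity check (a standalone sketch, separate from the Stata workflow), we can reproduce these quantities in Python from the three residual variances reported in the model outputs above:

```python
# Residual variances, var(e.write), taken from the meglm outputs above
v_null = 27.271639  # null model: intercept and random effects only
v_a    = 24.991221  # reduced model: read only
v_ab   = 21.435374  # full model: female and read

r2_ab = (v_null - v_ab) / v_null   # proportion explained, full model
r2_a  = (v_null - v_a) / v_null    # proportion explained, reduced model
f2_b  = (r2_ab - r2_a) / (1 - r2_ab)  # Cohen's f^2 for female

print(round(r2_ab, 4))  # 0.214
print(round(r2_a, 4))   # 0.0836
print(round(f2_b, 4))   # 0.1659
```

The results match the Stata values to the precision of the stored variances.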

## References

Selya AS, Rose JS, Dierker LC, Hedeker D, Mermelstein RJ. A practical guide to calculating Cohen’s *f*², a measure of local effect size, from PROC MIXED. *Frontiers in Psychology*. 2012.