Mediator variables are variables that sit between the independent variable (IV) and the dependent variable (DV) and mediate the effect of the IV on the DV. A model with two mediators is shown in the figure below.

Now, what if **MV1** and **DV** were binary variables while **MV2** and **IV** were continuous? In that case the calculation of the indirect effects would require a combination of OLS regression along with either logit or probit models. This web page presents a Stata program, **binary_mediation**, that can be used with multiple mediator variables in any combination of binary or continuous, along with either a binary or continuous response variable. You can download **binary_mediation** by typing **search binary_mediation** in Stata’s command window and following the instructions.

Different researchers compute indirect effects using different approaches. We will compute indirect effects using the product of coefficients approach. This is fairly straightforward when all the variables are continuous. Having a combination of continuous and binary variables makes things a bit trickier.
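For reference, the product of coefficients quantities for a model with two mediators can be written as below. The path labels (a1, a2, b1, b2, c') follow the labels used in the program's output later on this page; the notation itself is a sketch introduced here, not taken from the program's documentation.

$$
\text{indirect}_1 = a_1 b_1, \qquad
\text{indirect}_2 = a_2 b_2, \qquad
\text{total indirect} = a_1 b_1 + a_2 b_2, \qquad
\text{total effect} = c' + a_1 b_1 + a_2 b_2
$$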

David A. Kenny, in a paper available from his website (Mediation with Dichotomous Outcomes), recommends rescaling (standardizing) coefficients before computing indirect effects. The reasoning behind this is that in OLS regression the residual variance for the model changes as variables are entered into or removed from the regression equation. In logistic or probit regression, on the other hand, the residual variance is fixed. Since the residual variance is fixed, the scaling of the coefficients varies from model to model instead. Computing indirect effects involves multiple models, each with different variables. In order to compare coefficients from one model to another, Kenny recommends standardizing the coefficients. Coefficients from OLS models are rescaled using the standard deviations of the observed variables. For logit or probit models the rescaling involves the standard deviation of the underlying latent variable for the binary variable. Once the coefficients are rescaled (standardized), the indirect effects can be computed as the product of coefficients. Nathaniel Herr has a very nice diagram on his webpage that illustrates the different scaling that occurs when both the mediator and response variables are binary.
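One common way to write this rescaling is sketched below. The symbols are assumptions introduced here for illustration: $b$ is an unstandardized coefficient, $X$ is its predictor, $Y$ is an observed continuous outcome, and $\hat{\eta}$ is the fitted linear predictor of a logit or probit model. The latent residual variance is fixed at $\pi^{2}/3$ for logit and at 1 for probit. Whether **binary_mediation** uses exactly these expressions internally is not stated on this page.

$$
b^{\text{std}}_{\text{OLS}} = b\,\frac{\operatorname{SD}(X)}{\operatorname{SD}(Y)}, \qquad
b^{\text{std}}_{\text{logit}} = b\,\frac{\operatorname{SD}(X)}{\sqrt{\operatorname{Var}(\hat{\eta}) + \pi^{2}/3}}, \qquad
b^{\text{std}}_{\text{probit}} = b\,\frac{\operatorname{SD}(X)}{\sqrt{\operatorname{Var}(\hat{\eta}) + 1}}
$$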

The user-written command, **binary_mediation**, can be used
to compute indirect effects using the product of coefficients approach. The program
standardizes all the coefficients for OLS, logit, and probit models. The results
using logit or probit, once standardized, are very similar.

Please note: **binary_mediation** does not compute standard errors or confidence intervals
directly. You will need to use **binary_mediation** with the **bootstrap** command
to obtain standard errors and confidence intervals.

## Example

For this series of examples we will use the **hsbdemo** dataset. We will create
a binary mediator **hiread** by dichotomizing **read**. We do not recommend dichotomizing
continuous variables; we just want to demonstrate the process with one binary mediator.
Along with **hiread** we will use **science** as a continuous mediator, **ses** as a
continuous predictor, and **honors** as a binary response variable.

The **binary_mediation** program will detect which variables are continuous and which are
binary.
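The rule the program uses to classify variables is not described here, so it can be worth confirming for yourself which variables are coded as two-valued. The loop below is a hypothetical pre-check, not part of **binary_mediation**; it simply counts the distinct values of each model variable.

```stata
* Hypothetical pre-check (not part of binary_mediation):
* count the distinct values of each variable used in the model.
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
generate hiread = read >= 50            // binary mediator, as in the example

foreach v of varlist ses hiread science honors {
    quietly tabulate `v'
    * r(r) is the number of rows (distinct values) in the one-way table
    display "`v': " r(r) " distinct values"
}
```

A variable reporting two distinct values is a candidate binary variable. Note that **ses** only takes values from 1 to 3 but is treated as continuous on this page.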

    use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
    generate hiread=read>=50  /* create binary mediator */
    summarize ses hiread science honors  /* descriptive statistics */

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             ses |       200       2.055    .7242914          1          3
          hiread |       200        .585    .4939585          0          1
         science |       200       51.85    9.900891         26         74
          honors |       200        .265    .4424407          0          1

    binary_mediation, dv(honors) mv(hiread science) iv(ses)

    Logit: hiread on iv (a1 path)

    Logistic regression                               Number of obs   =        200
                                                      LR chi2(1)      =      12.40
                                                      Prob > chi2     =     0.0004
    Log likelihood = -129.52516                       Pseudo R2       =     0.0457

    ------------------------------------------------------------------------------
          hiread |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             ses |   .7204026   .2109932     3.41   0.001     .3068636    1.133942
           _cons |  -1.115341   .4465912    -2.50   0.013    -1.990643   -.2400381
    ------------------------------------------------------------------------------

    OLS regression: science on iv (a2 path)

    ------------------------------------------------------------------------------
         science |      Coef.   Std. Err.      t    P>|t|                     Beta
    -------------+----------------------------------------------------------------
             ses |   3.866564   .9317955     4.15   0.000                 .2828553
           _cons |   43.90421   2.029732    21.63   0.000                        .
    ------------------------------------------------------------------------------

    Logit: dv on iv (c path)

    Logistic regression                               Number of obs   =        200
                                                      LR chi2(1)      =       7.34
                                                      Prob > chi2     =     0.0068
    Log likelihood = -111.97593                       Pseudo R2       =     0.0317

    ------------------------------------------------------------------------------
          honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             ses |   .6185825   .2344357     2.64   0.008      .159097    1.078068
           _cons |  -2.337778   .5417028    -4.32   0.000    -3.399496    -1.27606
    ------------------------------------------------------------------------------

    Logit: dv on mv & iv (b & c' paths)

    Logistic regression                               Number of obs   =        200
                                                      LR chi2(3)      =      51.61
                                                      Prob > chi2     =     0.0000
    Log likelihood = -89.83923                        Pseudo R2       =     0.2231

    ------------------------------------------------------------------------------
          honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hiread |   1.597298   .5332837     3.00   0.003      .552081    2.642515
         science |   .0901672   .0253211     3.56   0.000     .0405389    .1397956
             ses |   .2516925    .266301     0.95   0.345    -.2702479    .7736328
           _cons |  -7.658069   1.456197    -5.26   0.000    -10.51216   -4.803975
    ------------------------------------------------------------------------------

    Indirect effects with binary response variable honors

            indir_1 = .09141282   (hiread, binary)
            indir_2 = .10582395   (science, continuous)
     total indirect = .19723677
      direct effect = .07639769
       total effect = .27363446
             c_path = .23980637

     proportion of total effect mediated = .72080384
      ratio of indirect to direct effect = 2.5817112

     Binary models use logit regression

By default, **binary_mediation** displays each of the models used in computing the indirect
effects.
The coefficients in this part of the output are not standardized. Following the raw output is
a summary of the direct and indirect effects. For this example,
the total indirect effect seems fairly substantial, being approximately two and a
half times as large as the direct effect (a ratio of about 2.58). The proportion of the total effect that is mediated is
about 0.72, which is also substantial.
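To make the link between the raw coefficients and the summary concrete, the sketch below rebuilds the **science** pathway and the direct effect by hand. It assumes the usual rescaling: OLS coefficients are multiplied by SD(X)/SD(Y), and logit coefficients by SD(X) divided by the latent standard deviation, taken as the square root of Var(linear predictor) + pi^2/3. The scalar and variable names (`sd_ses`, `xb_full`, and so on) are made up for this illustration, and the program's internal steps may differ.

```stata
* Sketch: reproduce indir_2 (science) and the direct effect by hand.
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
generate hiread = read >= 50

quietly summarize ses
scalar sd_ses = r(sd)
quietly summarize science
scalar sd_sci = r(sd)

* a2 path (OLS): standardized coefficient = b * SD(x) / SD(y)
quietly regress science ses
scalar a2_std = _b[ses]*sd_ses/sd_sci

* b and c' paths (logit): latent SD = sqrt(Var(xb) + pi^2/3)
quietly logit honors hiread science ses
predict double xb_full, xb
quietly summarize xb_full
scalar sd_ystar = sqrt(r(Var) + _pi^2/3)
scalar b2_std = _b[science]*sd_sci/sd_ystar
scalar cp_std = _b[ses]*sd_ses/sd_ystar

display "indirect effect via science = " a2_std*b2_std
display "direct effect               = " cp_std
```

If the program follows this rescaling, the two displayed values should land close to the `indir_2` and `direct effect` lines in the summary above.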

The **binary_mediation** program does not produce any standard errors or confidence intervals
on its own.
We will use the **bootstrap** command to obtain standard errors for the direct and indirect
effects along with 95% percentile and bias-corrected confidence intervals. We will demonstrate the process using
500 bootstrap replications, but you can set the number to anything you prefer.
We recommend the percentile or
bias-corrected confidence intervals over
normal-based confidence intervals. You can bootstrap any of the effects found in the **return list**.
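To see which returned results are available, one option is to run the program once and then list its stored results. This is a minimal sketch; the only result names confirmed on this page are the five used in the bootstrap command that follows.

```stata
* Run the program quietly, then list the r() results it leaves behind
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
generate hiread = read >= 50

quietly binary_mediation, dv(honors) mv(hiread science) iv(ses)
return list
```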

    quietly bootstrap r(indir_1) r(indir_2) r(tot_ind) r(dir_eff) r(tot_eff), ///
        reps(500): binary_mediation, dv(honors) iv(ses) mv(hiread science)

    estat bootstrap, percentile bc

    Bootstrap results                               Number of obs      =       200
                                                    Replications       =       499

          command:  binary_mediation, dv(honors) iv(ses) mv(hiread science)
            _bs_1:  r(indir_1)
            _bs_2:  r(indir_2)
            _bs_3:  r(tot_ind)
            _bs_4:  r(dir_eff)
            _bs_5:  r(tot_eff)

    ------------------------------------------------------------------------------
                 |    Observed               Bootstrap
                 |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _bs_1 |   .09141282  -.0000552   .03717104    .0299178   .1781988   (P)
                 |                                       .0333105   .1959342  (BC)
           _bs_2 |   .10582395    .001447   .03999136    .0421071   .1912641   (P)
                 |                                       .0443143   .1973525  (BC)
           _bs_3 |   .19723677   .0013918   .05159597     .098798   .3049328   (P)
                 |                                        .107806   .3141167  (BC)
           _bs_4 |   .07639769  -.0046966   .07954484   -.0831474   .2288187   (P)
                 |                                      -.0747334   .2309053  (BC)
           _bs_5 |   .27363446  -.0033048   .09258509    .0802001   .4406839   (P)
                 |                                       .0739526   .4394651  (BC)
    ------------------------------------------------------------------------------
    (P)    percentile confidence interval
    (BC)   bias-corrected confidence interval
    Note: one or more parameters could not be estimated in 1 bootstrap replicate;
          standard-error estimates include only complete replications.

The **bootstrap** command encountered one replicate in which it could not estimate the model.
We don’t know exactly what happened, but during the resampling process samples with
perfect prediction or complete separation can occur. In these cases the coefficients cannot be
computed. Since it occurred in only one out of 500 replications, we are not worried.
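If you want to check how many replicates failed in your own run, one approach is to fix the random-number seed and save the replicates to a file. The sketch below assumes that failed replicates are stored as missing values and that the saved variables carry the `_bs_#` names shown in the output; the seed value and the file name `bsreps` are arbitrary choices made for this illustration.

```stata
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
generate hiread = read >= 50

* seed() makes the resampling reproducible; saving() keeps every replicate
quietly bootstrap r(indir_1) r(indir_2) r(tot_ind) r(dir_eff) r(tot_eff), ///
    reps(500) seed(12345) saving(bsreps, replace): ///
    binary_mediation, dv(honors) iv(ses) mv(hiread science)

* Count replicates in which at least one effect could not be computed
use bsreps, clear
count if missing(_bs_1, _bs_2, _bs_3, _bs_4, _bs_5)
```

With a different seed the number of failed replicates will generally differ from the single failure reported above.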

In looking at the **bootstrap** results, we see that both of the indirect effects appear to
be significant (confidence interval does not contain zero) along with the total indirect effect.
The direct effect, however, is not statistically significant.

For comparison purposes we will rerun **binary_mediation** using the **probit** option along
with the **diagram** option.

    binary_mediation, dv(honors) mv(hiread science) iv(ses) probit diagram

    Probit: hiread on iv (a1 path)

    Probit regression                                 Number of obs   =        200
                                                      LR chi2(1)      =      12.37
                                                      Prob > chi2     =     0.0004
    Log likelihood = -129.54145                       Pseudo R2       =     0.0456

    ------------------------------------------------------------------------------
          hiread |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             ses |   .4437993   .1279512     3.47   0.001     .1930196    .6945791
           _cons |   -.687353   .2744143    -2.50   0.012    -1.225195   -.1495109
    ------------------------------------------------------------------------------

    OLS regression: science on iv (a2 path)

    ------------------------------------------------------------------------------
         science |      Coef.   Std. Err.      t    P>|t|                     Beta
    -------------+----------------------------------------------------------------
             ses |   3.866564   .9317955     4.15   0.000                 .2828553
           _cons |   43.90421   2.029732    21.63   0.000                        .
    ------------------------------------------------------------------------------

    Probit: dv on iv (c path)

    Probit regression                                 Number of obs   =        200
                                                      LR chi2(1)      =       7.05
                                                      Prob > chi2     =     0.0079
    Log likelihood = -112.12049                       Pseudo R2       =     0.0305

    ------------------------------------------------------------------------------
          honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             ses |   .3500684   .1332785     2.63   0.009     .0888474    .6112894
           _cons |   -1.36609   .3001514    -4.55   0.000    -1.954376   -.7778043
    ------------------------------------------------------------------------------

    Probit: dv on mv & iv (b * c' paths)

    Probit regression                                 Number of obs   =        200
                                                      LR chi2(3)      =      50.99
                                                      Prob > chi2     =     0.0000
    Log likelihood = -90.149337                       Pseudo R2       =     0.2205

    ------------------------------------------------------------------------------
          honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hiread |   .8613714   .2822841     3.05   0.002     .3081048    1.414638
         science |   .0510501   .0142893     3.57   0.000     .0230435    .0790566
             ses |   .1156009   .1543426     0.75   0.454     -.186905    .4181068
           _cons |  -4.250488   .7704867    -5.52   0.000    -5.760614   -2.740362
    ------------------------------------------------------------------------------

    Indirect effects with binary response variable honors

            indir_1 = .09911685   (hiread, binary)
            indir_2 = .10883112   (science, continuous)
     total indirect = .20794797
      direct effect = .06373715
       total effect = .27168512
             c_path = .24577434

     proportion of total effect mediated = .76540067
      ratio of indirect to direct effect = 3.2625868

     Binary models use probit regression

    Reference Mediation Diagram

    IV --- coef c --- DV

              MV1
            /     \
     coef a1       coef b1
        /             \
    IV --- coef c' --- DV
        \             /
     coef a2       coef b2
            \     /
              MV2

The ratio of indirect to direct effect is larger for this probit example (about 3.26 versus 2.58), but most of the other values are very similar to the logit results from the first example. Please note that the reference diagram always shows the example of two mediators. The diagram does not change with the number of mediators in the command itself.
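The similarity between the two link functions can be seen in the simplest case, the c path with a single predictor. The sketch below standardizes the **ses** coefficient by hand under the assumption that the latent residual variance is pi^2/3 for logit and 1 for probit; with one predictor, the variance of the linear predictor is just the squared product of the coefficient and the standard deviation of **ses**. The scalar names are made up for this illustration.

```stata
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
quietly summarize ses
scalar sd_ses = r(sd)

* logit: latent residual variance fixed at pi^2/3
quietly logit honors ses
scalar c_logit = _b[ses]*sd_ses/sqrt((_b[ses]*sd_ses)^2 + _pi^2/3)

* probit: latent residual variance fixed at 1
quietly probit honors ses
scalar c_probit = _b[ses]*sd_ses/sqrt((_b[ses]*sd_ses)^2 + 1)

display "standardized c path, logit  = " c_logit
display "standardized c path, probit = " c_probit
```

Plugging in the coefficients and the standard deviation of **ses** printed on this page gives about .2398 for logit and .2458 for probit, in line with the `c_path` values in the two summaries.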

## References

Kenny, D. A. (2008). Mediation with Dichotomous Outcomes. Retrieved April 23, 2010, from http://davidakenny.net/doc/dichmed.pdf

Kenny, D. A. (2009). Mediation. Retrieved April 23, 2010, from http://davidakenny.net/cm/mediate.htm

Herr, N. A. (undated). Mediation with Dichotomous Outcomes. Retrieved April 18, 2011, from http://www.nrhpsych.com/mediation/logmed.html