The **margins** command, introduced in Stata 11,
is a very popular post-estimation command. However, it can be tricky to use in
conjunction with multiple imputation and survey data.

Let’s begin by looking at the data.

    use http://www.ats.ucla.edu/stat/data/hsbmar, clear
    sum honors female prog read math science socst, sep(0)

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
          honors |       200        .265    .4424407          0          1
          female |       185    .5459459    .4992356          0          1
            prog |       200       2.025    .6904772          1          3
            read |       185    51.61622    10.19104         28         76
            math |       190    52.17895    9.246168         33         75
         science |       193    51.57513     9.86396         26         74
           socst |       188    51.59043    10.44862         26         71

As you can see from the table above, all of
the variables except for **honors** and **prog** have missing values.

**honors** is the binary response variable, while **female** (two-level categorical)
and **prog** (three-level categorical) are the
research variables of interest, with **read**, **math**, **science** and **socst**
serving as control variables. Our primary interest is in the **female**-by-**prog**
interaction. We will want to compute the predicted probabilities for each of the
six cells of the 2-by-3 interaction.

## So, what’s the big deal?

Why not just impute the data and then run the **margins** command?
Well, we can impute the data, but we
need a way to run both **svy logit** and **margins** on each imputed dataset and
then combine the **margins** results into a single output. The issue
is that **margins** does not work with **mi estimate**.
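
For reference, the combining step that **mi estimate** performs follows Rubin's rules. The notation below is introduced here for illustration and is not part of the original example: $\hat{Q}_m$ is an estimate (for instance, one predicted cell probability) from imputed dataset $m$, $W_m$ is its estimated variance, and $M$ is the number of imputations (10 in this example).

$$
\bar{Q} = \frac{1}{M}\sum_{m=1}^{M} \hat{Q}_m, \qquad
\bar{W} = \frac{1}{M}\sum_{m=1}^{M} W_m, \qquad
B = \frac{1}{M-1}\sum_{m=1}^{M} \left(\hat{Q}_m - \bar{Q}\right)^2, \qquad
T = \bar{W} + \left(1 + \frac{1}{M}\right) B
$$

The pooled point estimate is $\bar{Q}$ and its standard error is $\sqrt{T}$. This is why the results of **margins** have to be available from every imputed dataset before they can be combined.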

We can accomplish this by writing a wrapper program called **mimargins** and saving it
in a file called **mimargins.ado**.
It contains both the **svy logit** and **margins** commands. By setting the option
**properties** to **mi**, **mimargins** can be used with **mi estimate**.
We also need to declare **mimargins** to be an **eclass** program.

Here is what the **mimargins** program looks like.

    program mimargins, eclass properties(mi)
      version 12
      svy: logit honors i.female##i.prog read math science socst
      margins female#prog, atmeans asbalanced post
    end

Here is how you use **mimargins** in the calling program.

    mi estimate, cmdok: mimargins 1

The **cmdok** option is needed because **mimargins** is not one of the estimation
commands that **mi estimate** officially supports.

Next, we need to note that our data are not truly survey data. We are going to fake this
by declaring that the values of **write** are the pweights and that **ses** is the
stratification variable. Since this is part of a multiple imputation we need to run the
survey set command as **mi svyset**. Here is the code for performing the multiple
imputation using chained equations creating 10 imputed datasets. Note, the value 10 for
the number of imputed datasets was selected for demonstration purposes and does not
represent a recommendation.

    set seed 1234543
    mi set mlong
    mi register imputed female math read science socst
    mi svyset [pw=write], strata(ses)
    mi impute chain (logit) female (regress) math read science socst = ///
        ses write awards, add(10)

    Conditional models:
               science: regress science math socst i.female read ses write awards
                  math: regress math science socst i.female read ses write awards
                 socst: regress socst science math i.female read ses write award
                female: logit female science math socst read ses write awards
                  read: regress read science math socst i.female ses write awards

    Performing chained iterations ...

    Multivariate imputation                     Imputations =       10
    Chained equations                                 added =       10
    Imputed: m=1 through m=10                       updated =        0

    Initialization: monotone                     Iterations =      100
                                                    burn-in =       10

                female: logistic regression
                  math: linear regression
                  read: linear regression
               science: linear regression
                 socst: linear regression

    ------------------------------------------------------------------
                       |               Observations per m
                       |----------------------------------------------
              Variable |   Complete   Incomplete   Imputed |     Total
    -------------------+-----------------------------------+----------
                female |        185           15        15 |       200
                  math |        190           10        10 |       200
                  read |        185           15        15 |       200
               science |        193            7         7 |       200
                 socst |        188           12        12 |       200
    ------------------------------------------------------------------
    (complete + incomplete = total; imputed is the minimum across m
     of the number of filled-in observations.)
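
Before fitting any models, it can be worth confirming that the imputation did what was intended. The following is a minimal sketch, assuming the imputed data created by the commands above are still in memory; it only uses standard **mi** reporting commands and does not change the data.

```stata
* Assumes the mi set, imputed data from the steps above are in memory

* Report the mi style and the number of imputations (M should be 10 here)
mi query

* Summarize registered variables and imputations
mi describe

* Tabulate missing values in the original data (m = 0) for the imputed variables
mi misstable summarize female math read science socst
```

If the number of imputations reported is not 10, the **mi impute chain** step did not complete as shown above.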

Next, we can run our survey logit model and check the interaction. Please note the order
of the commands: the **mi estimate:** prefix comes first, followed by the **svy:** prefix,
which, in turn, is followed by the **logit** command itself.

    mi estimate: svy: logit honors i.female##i.prog read math science socst

    Multiple-imputation estimates                     Imputations     =         10
    Survey: Logistic regression                       Number of obs   =        200

    Number of strata   =         3                    Population size =      10555
    Number of PSUs     =       200
                                                      Average RVI     =     0.1343
                                                      Largest FMI     =     0.4661
                                                      Complete DF     =        197
    DF adjustment:   Small sample                     DF:     min     =      32.60
                                                              avg     =     140.25
                                                              max     =     190.51
    Model F test:       Equal FMI                     F(   9,  186.0) =       4.92
    Within VCE type:   Linearized                     Prob > F        =     0.0000

    ------------------------------------------------------------------------------
          honors |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        1.female |   2.011976   1.145094     1.76   0.081    -.2513263    4.275278
                 |
            prog |
              2  |   .8609826   1.052203     0.82   0.414    -1.214482    2.936447
              3  |  -.3461753   1.119538    -0.31   0.758    -2.554972    1.862622
                 |
     female#prog |
            1 2  |  -.8082329   1.207095    -0.67   0.504    -3.189971    1.573505
            1 3  |   .9091944   1.361134     0.67   0.505    -1.777474    3.595863
                 |
            read |   .0718714   .0447126     1.61   0.118     -.019139    .1628818
            math |   .1122547   .0379181     2.96   0.004     .0372987    .1872107
         science |   .0673994   .0418923     1.61   0.110    -.0155836    .1503824
           socst |  -.0013026   .0342759    -0.04   0.970    -.0694318    .0668267
           _cons |  -16.23994   2.727678    -5.95   0.000     -21.6285   -10.85138
    ------------------------------------------------------------------------------

    mi test 1.female#2.prog 1.female#3.prog

    note: assuming equal fractions of missing information

     ( 1)  [honors]1.female#2.prog = 0
     ( 2)  [honors]1.female#3.prog = 0

           F(  2, 190.2) =    1.22
                Prob > F =    0.2983

Unfortunately, our interaction was not statistically significant (F(2, 190.2) = 1.22, p = 0.2983). However, we will push ahead and compute the predicted cell probabilities for the 2×3 interaction just to show how it can be done.

    mi estimate, cmdok: mimargins 1

    Multiple-imputation estimates                     Imputations     =         10
    Adjusted predictions                              Number of obs   =        200
                                                      Average RVI     =     0.0414
                                                      Largest FMI     =     0.0977
    DF adjustment:   Large sample                     DF:     min     =     979.08
                                                              avg     =   35230.16
    Within VCE type: Delta-method                             max     =   83462.61

    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     female#prog |
            0 1  |   .0840548   .0682084     1.23   0.218    -.0496921    .2178018
            0 2  |   .1351131   .0607053     2.23   0.026     .0159857    .2542405
            0 3  |   .0575222   .0361655     1.59   0.112     -.013363    .1284074
            1 1  |    .331319   .1053479     3.14   0.002     .1247258    .5379122
            1 2  |   .2856044   .0804723     3.55   0.000     .1278792    .4433295
            1 3  |   .4312711   .1482311     2.91   0.004     .1407392     .721803
    ------------------------------------------------------------------------------

And that is how you can compute adjusted predictions for multiply imputed survey data.
This approach will generalize to other estimation commands as well as other **margins**
commands.
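
As an illustration of that generalization, here is a minimal sketch of a second wrapper that requests adjusted predictions for **female** alone rather than for the six **female**-by-**prog** cells. The program name **mimargins_female** is hypothetical and chosen only for this example; the sketch assumes the same mi set, **mi svyset**, imputed data as above are in memory.

```stata
* Hypothetical wrapper: same pattern as mimargins, different margins request
capture program drop mimargins_female
program mimargins_female, eclass properties(mi)
    version 12
    * Same survey logit model as in mimargins
    svy: logit honors i.female##i.prog read math science socst
    * Adjusted predictions for each level of female;
    * post stores the margins results in e() so mi estimate can combine them
    margins female, atmeans asbalanced post
end

* Combine the per-imputation margins results across the 10 imputed datasets
mi estimate, cmdok: mimargins_female
```

The only lines that change from one application to the next are the estimation command and the **margins** request; the **eclass**, **properties(mi)** and **post** pieces are what allow **mi estimate** to pool the results.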