This page shows an example of a multinomial logistic regression analysis with footnotes explaining the output. The data were collected on 200 high school
students and are scores on various tests, including science, math, reading and social studies. The outcome measure in this analysis is
socio-economic status (**ses**), with levels low, middle and high, and we are going to see what relationships exist with science test scores (**science**),
social science test scores (**socst**) and gender (**female**). Our response variable, **ses**, is going to be treated as
categorical under the assumption that the levels of **ses** have *no* natural ordering,
and we are going to allow Stata to choose the referent group, middle **ses**. The first half of this page
interprets the coefficients in terms of multinomial log-odds (logits) and the second half interprets the coefficients in terms of
relative risk ratios.

    use http://www.ats.ucla.edu/stat/data/hsb2, clear
    mlogit ses science socst female

    Iteration 0:   log likelihood = -210.58254
    Iteration 1:   log likelihood = -194.75041
    Iteration 2:   log likelihood = -194.03782
    Iteration 3:   log likelihood = -194.03485
    Iteration 4:   log likelihood = -194.03485

    Multinomial logistic regression                   Number of obs   =        200
                                                      LR chi2(6)      =      33.10
                                                      Prob > chi2     =     0.0000
    Log likelihood = -194.03485                       Pseudo R2       =     0.0786

    ------------------------------------------------------------------------------
             ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    low          |
         science |  -.0235647   .0209747    -1.12   0.261    -.0646744     .017545
           socst |  -.0389243   .0195165    -1.99   0.046    -.0771759   -.0006726
          female |   .8166202   .3909813     2.09   0.037      .050311    1.582929
           _cons |   1.912256   1.127256     1.70   0.090    -.2971258    4.121638
    -------------+----------------------------------------------------------------
    high         |
         science |    .022922   .0208718     1.10   0.272    -.0179861    .0638301
           socst |   .0430036   .0198894     2.16   0.031     .0040211     .081986
          female |   -.032862   .3500153    -0.09   0.925    -.7188793    .6531553
           _cons |  -4.057323   1.222939    -3.32   0.001     -6.45424   -1.660407
    ------------------------------------------------------------------------------
    (ses==middle is the base outcome)

## Iteration Log^{a}

    Iteration 0:   log likelihood = -210.58254
    Iteration 1:   log likelihood = -194.75041
    Iteration 2:   log likelihood = -194.03782
    Iteration 3:   log likelihood = -194.03485
    Iteration 4:   log likelihood = -194.03485

a. This is a listing of the log likelihoods at each iteration. Remember that multinomial logistic regression, like binary and ordered logistic regression, uses maximum likelihood estimation, which is an iterative procedure. The first iteration (called iteration 0) is the log likelihood of the "null" or "empty" model; that is, a model with no predictors. At the next iteration, the predictor(s) are included in the model. At each iteration, the log likelihood increases (here from -210.58 to -194.03) because the goal is to maximize the log likelihood. When the difference between successive iterations is very small, the model is said to have "converged", the iterating stops, and the results are displayed. For more information on this process for binary outcomes, see *Regression Models for Categorical and Limited Dependent Variables* by J. Scott Long (pages 52-61).
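The convergence pattern in the iteration log can be illustrated with a small Python sketch. It only inspects the log likelihoods printed above; the tolerance is a hypothetical value picked for this illustration and is not Stata's actual stopping rule, which uses its own criteria.

```python
# Log likelihoods copied from the iteration log above.
log_likelihoods = [-210.58254, -194.75041, -194.03782, -194.03485, -194.03485]

# Change in log likelihood from each iteration to the next.
changes = [b - a for a, b in zip(log_likelihoods, log_likelihoods[1:])]

for i, change in enumerate(changes, start=1):
    print(f"Iteration {i}: change in log likelihood = {change:.5f}")

# Hypothetical tolerance, chosen only for this illustration.
tolerance = 1e-5
converged = abs(changes[-1]) < tolerance
print("converged:", converged)
```

Every change is non-negative, which is what maximizing the log likelihood looks like, and the last change is small enough that iterating stops.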

## Model Summary

    Multinomial logistic regression                   Number of obs^{c}   =        200
                                                      LR chi2(6)^{d}      =      33.10
                                                      Prob > chi2^{e}     =     0.0000
    Log likelihood = -194.03485^{b}                   Pseudo R2^{f}       =     0.0786

b. **Log Likelihood** – This is the log likelihood of the fitted model. It is used in the Likelihood Ratio Chi-Square test of whether all predictors’
regression coefficients in the model are simultaneously zero and in tests of nested models.

c. **Number of obs** – This is the number of observations used in the
multinomial logistic regression.
It may be less than the number of cases in the dataset if there are missing
values for some variables in the equation. By default, Stata does a listwise
deletion of incomplete cases.

d. **LR chi2(6)** – This is the Likelihood Ratio (LR) Chi-Square test that,
across both equations (low **ses** relative to middle **ses** and high **ses**
relative to middle **ses**), at
least one of the predictors’ regression coefficients is not equal to zero.
The number in the parentheses indicates the degrees of freedom of the Chi-Square distribution
used to test the LR Chi-Square statistic and is defined by the number of models
estimated (2) times the number of predictors in the model (3).
The LR Chi-Square statistic can be calculated by
-2*(L(null model) – L(fitted model)) = -2*((-210.583) – (-194.035)) = 33.096, where L(null model)
is the log likelihood with just the response variable in the model (Iteration 0)
and L(fitted model) is the log likelihood from the final iteration (assuming the model converged) with all the parameters.
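The calculation above can be reproduced with a few lines of Python. This is a minimal sketch that uses the unrounded log likelihoods from the iteration log; the variable names are illustrative.

```python
# Log likelihoods from the iteration log.
ll_null = -210.58254    # Iteration 0: model with no predictors
ll_fitted = -194.03485  # final iteration: model with all predictors

# Degrees of freedom: number of equations times number of predictors.
n_equations = 2   # low vs. middle, high vs. middle
n_predictors = 3  # science, socst, female
df = n_equations * n_predictors

lr_chi2 = -2 * (ll_null - ll_fitted)
print(f"LR chi2({df}) = {lr_chi2:.2f}")
```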

e. **Prob > chi2** – This is the probability of getting an LR test statistic as extreme as, or more extreme than, the observed statistic under the null
hypothesis; the null hypothesis is that all of the regression coefficients
across both models are simultaneously equal to zero. In other words, this is the probability of obtaining a
chi-square statistic of at least 33.10 if there is in fact no effect of the predictor variables. This p-value is compared to a specified alpha level, our willingness
to accept a type I error, which is typically set at 0.05 or 0.01. The small p-value from the LR test, <0.0001, would lead us to conclude that at least
one of the regression coefficients in the model is not equal to zero. The parameter of the Chi-Square distribution used to test the null hypothesis is defined
by the degrees of freedom in the prior line, **chi2(6)**.
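As a rough check on the displayed `0.0000`, the upper-tail probability can be computed in Python. The sketch below relies on the closed form of the chi-square tail that holds only when the degrees of freedom are even (as with 6 here); the helper name `chi2_upper_tail` is made up for this example and is not a Stata or library function.

```python
import math

def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = (half**k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)

p_value = chi2_upper_tail(33.10, 6)
print(f"Prob > chi2 = {p_value:.2e}")
```

The result is far below 0.05 or 0.01, and it rounds to 0.0000 at four decimal places, as in the output.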

f. **Pseudo R2** – This is McFadden’s pseudo R-squared. Logistic regression does not have an equivalent to the R-squared that is found in OLS regression;
however, many people have tried to come up with one. There are a wide variety of pseudo-R-square statistics. Because this statistic does not mean what
R-square means in OLS regression (the proportion of variance for the response variable explained by the predictors), we suggest interpreting this statistic with great
caution.
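McFadden's pseudo R-squared is one minus the ratio of the fitted log likelihood to the null log likelihood. A minimal Python sketch, using the two log likelihoods from the iteration log, reproduces the reported value:

```python
ll_null = -210.58254    # Iteration 0
ll_fitted = -194.03485  # final iteration

# McFadden's pseudo R-squared.
pseudo_r2 = 1 - ll_fitted / ll_null
print(f"Pseudo R2 = {pseudo_r2:.4f}")
```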

## Parameter Estimates

    ------------------------------------------------------------------------------
         ses^{g} |  Coef.^{h}  Std. Err.^{j}   z^{k}   P>|z|^{k}   [95% Conf. Interval]^{l}
    -------------+----------------------------------------------------------------
    low          |
         science |  -.0235647   .0209747    -1.12   0.261    -.0646744     .017545
           socst |  -.0389243   .0195165    -1.99   0.046    -.0771759   -.0006726
          female |   .8166202   .3909813     2.09   0.037      .050311    1.582929
           _cons |   1.912256   1.127256     1.70   0.090    -.2971258    4.121638
    -------------+----------------------------------------------------------------
    high         |
         science |    .022922   .0208718     1.10   0.272    -.0179861    .0638301
           socst |   .0430036   .0198894     2.16   0.031     .0040211     .081986
          female |   -.032862   .3500153    -0.09   0.925    -.7188793    .6531553
           _cons |  -4.057323   1.222939    -3.32   0.001     -6.45424   -1.660407
    ------------------------------------------------------------------------------
    (ses==middle is the base outcome)^{i}

g. **ses** – This is the response variable in the multinomial logistic regression. Underneath **ses** are two
replicates of the predictor variables, representing the two models that
are estimated: low **ses** relative to middle **ses** and high **ses**
relative to
middle **ses**.

h and i. **Coef.** and **referent group** – These are the estimated
multinomial logistic regression coefficients and the referent level,
respectively, for the model. An important feature of the multinomial logit model
is that it estimates *k-1* models, where *k* is the number of levels
of the dependent variable. In this instance, Stata, by default, set middle **ses** as the
referent group and therefore estimated a model for low **ses** relative to middle
**ses** and a model for high **ses** relative to middle **ses**. Because
the parameter estimates are relative to the referent group, the standard
interpretation of the multinomial logit is that for a unit change in the
predictor variable, the logit of outcome *m* relative to the referent group
is expected to change by its respective parameter estimate, given that the other variables
in the model are held constant.
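Written out, the two estimated equations have the following form. The β symbols are notation introduced here for illustration (they do not appear in the Stata output); each β corresponds to one row of the coefficient table for outcome *m*.

```latex
\ln\!\left(\frac{P(ses = m)}{P(ses = \text{middle})}\right)
  = \beta_{m0}
  + \beta_{m1}\,\mathit{science}
  + \beta_{m2}\,\mathit{socst}
  + \beta_{m3}\,\mathit{female},
\qquad m \in \{\text{low}, \text{high}\}
```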

**low ses relative to middle ses**

**science** – This is the multinomial logit estimate
for a one unit increase in **science** test score for low **ses** relative
to middle **ses** given the other variables in the model are held constant.
If a subject were to increase their **science** test score by one point, the
multinomial log-odds for low **ses** relative to middle **ses** would be
expected to decrease by 0.024 units while holding all other variables in the
model constant.

**socst** – This is the multinomial logit estimate
for a one unit increase in **socst** test score for low **ses** relative
to middle **ses** given the other variables in the model are held constant.
If a subject were to increase their **socst** test score by one point, the
multinomial log-odds for low **ses** relative to middle **ses** would be
expected to decrease by 0.039 units while holding all other variables in the
model constant.

**female** – This is the multinomial logit estimate
comparing females to males for low **ses** relative
to middle **ses** given the other variables in the model are held constant.
The multinomial logit for females relative to males is 0.817 units higher for
being in low **ses** relative to middle **ses** given all other predictor variables in the
model are held constant.

**_cons** – This is the multinomial logit estimate for
low **ses** relative to middle **ses** when the predictor variables in the model
are evaluated at zero. For males (the variable **female** evaluated at zero)
with zero **science** and **socst** test scores, the logit for being in
low **ses** versus middle **ses** is 1.912. Note that evaluating **science** and **socst**
at zero is outside the range of plausible test scores. If the test scores were
mean-centered, the intercept would have a natural interpretation: the log odds of
being in low **ses** versus middle **ses** for a male with average **science** and **socst** test scores.

**high ses relative to middle ses**

**science** – This is the multinomial logit estimate
for a one unit increase in **science** test score for high **ses** relative
to middle **ses** given the other variables in the model are held constant.
If a subject were to increase their **science** test score by one point, the
multinomial log-odds for high **ses** relative to middle **ses** would be
expected to increase by 0.023 units while holding all other variables in the
model constant.

**socst** – This is the multinomial logit estimate
for a one unit increase in **socst** test score for high **ses** relative
to middle **ses** given the other variables in the model are held constant.
If a subject were to increase their **socst** test score by one point, the
multinomial log-odds for high **ses** relative to middle **ses** would be
expected to increase by 0.043 units while holding all other variables in the
model constant.

**female** – This is the multinomial logit estimate
comparing females to males for high **ses** relative
to middle **ses** given the other variables in the model are held constant.
The multinomial logit for females relative to males is 0.033 units lower for
being in high **ses** relative to middle **ses** given all other predictor variables in the
model are held constant.

**_cons** – This is the multinomial logit estimate for
high **ses** relative to middle **ses** when the predictor variables in the model
are evaluated at zero. For males (the variable **female** evaluated at
zero) with zero **science** and **socst** test scores, the logit for being in
high **ses** relative to middle **ses** is -4.057.
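To see how the two sets of coefficients combine, the logits can be converted into predicted probabilities for all three **ses** levels. The Python sketch below does this for one covariate profile; the profile (a female with scores of 50 on both tests) is a hypothetical example and the function name is illustrative. The referent group, middle **ses**, has a logit of zero by construction.

```python
import math

# Coefficients copied from the output above (middle ses is the base outcome).
coefs = {
    "low":  {"science": -0.0235647, "socst": -0.0389243,
             "female": 0.8166202, "_cons": 1.912256},
    "high": {"science": 0.022922, "socst": 0.0430036,
             "female": -0.032862, "_cons": -4.057323},
}

def predicted_probabilities(science, socst, female):
    """Return P(low), P(middle), P(high) for one covariate profile."""
    logits = {"middle": 0.0}  # base outcome: logit fixed at zero
    for outcome, b in coefs.items():
        logits[outcome] = (b["_cons"] + b["science"] * science
                           + b["socst"] * socst + b["female"] * female)
    denom = sum(math.exp(v) for v in logits.values())
    return {outcome: math.exp(v) / denom for outcome, v in logits.items()}

# Hypothetical profile, chosen only for illustration.
probs = predicted_probabilities(science=50, socst=50, female=1)
for outcome in ("low", "middle", "high"):
    print(f"P(ses = {outcome}) = {probs[outcome]:.3f}")
```

The log of `probs["low"] / probs["middle"]` recovers the low-versus-middle logit for that profile, which is the quantity the coefficients describe.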

j. **Std. Err.** – These are the standard errors of the individual
regression coefficients for the two respective models estimated. They are used
in both the calculation of the **z** test
statistic, superscript k, and the confidence interval of the regression coefficient, superscript
l.

k. **z** and **P>|z|** – These are the test statistic and p-value, respectively,
for the null hypothesis that, within a given model, an individual predictor’s regression
coefficient is zero given that the rest of the predictors are in the model. The test statistic **z** is the ratio of the **Coef.** to the
**Std. Err.** of the respective predictor. The **z** value follows a standard normal distribution, which is used to test against a two-sided
alternative hypothesis that the **Coef.** is not equal to zero. The probability that a particular **z** test statistic is as extreme as, or more
extreme than, what has been observed under the null hypothesis is given by **P>|z|**.
The tests are interpreted below only for the
first equation, low **ses** relative to middle **ses**. The interpretation
for the second model, high **ses** relative to middle **ses**, follows the same pattern as the first
equation’s interpretation.

For low **ses** relative to middle **ses**, the **z** test statistic for the predictor **science** (-0.024/0.021) is
-1.12 with an associated p-value of 0.261. If we set our
alpha level to 0.05, we would fail to reject the null hypothesis and conclude that
for low **ses** relative to middle **ses**, the regression coefficient for **science**
has not been found to be statistically different from zero given **socst** and **female** are in the model.

For low **ses** relative to middle **ses**, the **z** test statistic for the predictor **socst** (-0.039/0.020) is
-1.99 with an associated p-value
of 0.046. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the regression coefficient for **socst** has
been found to be statistically different from zero for low **ses** relative
to middle **ses** given
that **science** and **female** are in the model.

For low **ses** relative to middle **ses**, the **z** test statistic for the predictor
**female** (0.817/0.391) is 2.09 with an associated p-value
of 0.037. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the
difference between males and females has been found to be statistically
different from zero for low **ses** relative to middle **ses** given
that **science** and **socst** are in the model.

For low **ses** relative to middle **ses**, the **z** test statistic for the
intercept, **_cons** (1.912/1.127) is 1.70 with an associated p-value
of 0.090. With an alpha level of 0.05, we would fail to reject the
null hypothesis and conclude either a) that the multinomial logit for low **ses**
relative to middle **ses** for males (the variable
**female** evaluated at zero) with zero **science** and **socst**
test scores has not been found to be
statistically different from zero; or b) that for males with zero **science** and
**socst** test scores, we are statistically uncertain whether they are more
likely to be classified as low **ses** or middle **ses**. We can make the
second interpretation when we view the **_cons** as a specific covariate
profile (males with zero **science** and **socst** test scores). Based on the direction and significance of
the coefficient, the **_cons** tells whether the profile would have a greater
propensity to fall in one of the levels of the dependent variable.
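The **z** statistics and two-sided p-values for the first equation can be recomputed from the coefficients and standard errors. This Python sketch uses the standard normal identity that the two-sided p-value equals erfc(|z|/√2); the dictionary and function names are illustrative.

```python
import math

# (Coef., Std. Err.) for low ses relative to middle ses, from the output above.
low_vs_middle = {
    "science": (-0.0235647, 0.0209747),
    "socst":   (-0.0389243, 0.0195165),
    "female":  (0.8166202, 0.3909813),
    "_cons":   (1.912256, 1.127256),
}

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

results = {}
for name, (coef, se) in low_vs_middle.items():
    z = coef / se
    results[name] = (z, two_sided_p(z))
    print(f"{name:>8}: z = {z:6.2f}, P>|z| = {results[name][1]:.3f}")
```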

l. **[95% Conf. Interval]** – This is the Confidence Interval (CI) for an individual
multinomial logit regression coefficient given the other predictors are in the model
for outcome *m* relative to the referent group.
For a given predictor with a level of 95% confidence, we’d say that we are 95% confident that the "true" population
multinomial logit regression coefficient lies
between the lower and upper limit of the interval for outcome *m*
relative to the referent group. It is calculated as the **Coef.** ± (z_{α/2})*(**Std.Err.**), where z_{α/2}
is a critical value on the standard normal distribution. The CI is equivalent to the **z** test statistic: if the CI includes zero, we’d fail to
reject the null hypothesis that a particular regression coefficient is zero given the other predictors are in the model.
An advantage of a CI is that it is illustrative; it provides a range where the "true" parameter may lie.
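The interval formula can be checked for a single coefficient. The Python sketch below uses **socst** in the low **ses** equation and takes the critical value from the standard library's `statistics.NormalDist`; it is an illustration of the formula, not of how Stata computes the interval internally.

```python
from statistics import NormalDist

# socst in the low ses equation, from the output above.
coef, se = -0.0389243, 0.0195165

# Critical value z_{alpha/2} for a 95% interval (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

lower = coef - z_crit * se
upper = coef + z_crit * se
print(f"95% CI: [{lower:.7f}, {upper:.7f}]")

# The interval excludes zero exactly when the z test rejects at alpha = 0.05.
excludes_zero = lower > 0 or upper < 0
print("excludes zero:", excludes_zero)
```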

## Relative Risk Ratio Interpretation

The following is the interpretation of the multinomial logistic regression in terms of
relative risk ratios and can be obtained by
**mlogit, rrr** after running the multinomial logit model or by specifying the **rrr** option
when the full model is specified. This part of the interpretation applies to the output below.

    mlogit ses science socst female, rrr

    Iteration 0:   log likelihood = -210.58254
    Iteration 1:   log likelihood = -194.75041
    Iteration 2:   log likelihood = -194.03782
    Iteration 3:   log likelihood = -194.03485
    Iteration 4:   log likelihood = -194.03485

    Multinomial logistic regression                   Number of obs   =        200
                                                      LR chi2(6)      =      33.10
                                                      Prob > chi2     =     0.0000
    Log likelihood = -194.03485                       Pseudo R2       =     0.0786

    ------------------------------------------------------------------------------
             ses |    RRR^{a}   Std. Err.      z    P>|z|     [95% Conf. Interval]^{b}
    -------------+----------------------------------------------------------------
    low          |
         science |   .9767108   .0204862    -1.12   0.261     .9373726      1.0177
           socst |   .9618236   .0187714    -1.99   0.046      .925727    .9993276
          female |   2.262839   .8847276     2.09   0.037     1.051598    4.869199
    -------------+----------------------------------------------------------------
    high         |
         science |   1.023187   .0213558     1.10   0.272     .9821747    1.065911
           socst |   1.043942   .0207633     2.16   0.031     1.004029    1.085441
          female |   .9676721      .3387    -0.09   0.925     .4872981    1.921595
    ------------------------------------------------------------------------------
    (ses==middle is the base outcome)

a. **Relative Risk Ratio** – These are the relative risk ratios for the
multinomial logit model shown earlier. They can be obtained by exponentiating
the multinomial logit coefficients, e^{coef.}, or by specifying the **rrr** option.
Recall that the multinomial logit model estimates k-1 models, each of which is relative to the referent group. Suppose the model
is written out in exponentiated form with the predictor of interest
evaluated at x + δ and at x for outcome m relative to the
referent group, where
δ is the change in the predictor we are interested
in (δ is traditionally set to one) and the other variables in the
model are held constant. Each of the two expressions is a ratio
of two probabilities (the probability of outcome m over the probability of the referent outcome), the relative risk, and the ratio of the two expressions reduces to the exponentiated coefficient. In this sense, the exponentiated
multinomial logit coefficient provides an estimate of the relative risk ratio. However,
the exponentiated coefficients are commonly interpreted as odds
ratios. The standard interpretation of
the relative risk ratios is that for a unit change in the predictor variable, the
relative risk of
outcome m relative to the referent group is expected to change by a
factor of the respective relative risk ratio, given that the other variables in
the model are held constant.
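The **RRR** column can be reproduced by exponentiating the logit coefficients. The Python sketch below also recovers the reported standard errors with the delta-method approximation, se(RRR) = RRR × se(coef); the dictionary names are illustrative.

```python
import math

# (Coef., Std. Err.) copied from the logit-scale output earlier on this page.
logit_estimates = {
    ("low", "science"):  (-0.0235647, 0.0209747),
    ("low", "socst"):    (-0.0389243, 0.0195165),
    ("low", "female"):   (0.8166202, 0.3909813),
    ("high", "science"): (0.022922, 0.0208718),
    ("high", "socst"):   (0.0430036, 0.0198894),
    ("high", "female"):  (-0.032862, 0.3500153),
}

rrr_table = {}
for (outcome, predictor), (coef, se) in logit_estimates.items():
    rrr = math.exp(coef)  # relative risk ratio
    rrr_se = rrr * se     # delta-method standard error
    rrr_table[(outcome, predictor)] = (rrr, rrr_se)
    print(f"{outcome:>4} {predictor:>8}: RRR = {rrr:.7f}, Std. Err. = {rrr_se:.7f}")
```

Note that the **z** statistics and p-values in the **rrr** output are unchanged from the logit-scale output, since they still test whether the underlying coefficient is zero (equivalently, whether the ratio is one).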

**low ses relative to middle ses**

**science** – This is the relative risk ratio for a one unit
increase in **science** score for low **ses** relative to middle **ses**
given that the other variables in the model are held constant. If a
subject were to increase their **science** test score by one unit, the
relative risk for low **ses** relative to middle **ses** would be expected to
be multiplied by a factor
of 0.977 (a decrease of about 2.3%) given the other variables in the model are held constant. So, given a
one unit increase in **science**, the relative risk of being in the low
**ses** group would be 0.977 times what it was before, when the other variables in the
model are held constant. More generally, we can say that if a subject were to
increase their **science** test score, they would be
more likely to fall into middle **ses** as compared to low **ses**.
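Because the model is linear on the log-odds scale, the relative risk ratio for a change of δ units is exp(δ × coef), which equals the one-unit ratio raised to the power δ. A small Python sketch, using a 10-point increase in **science** as a hypothetical choice of δ:

```python
import math

coef_science_low = -0.0235647  # science, low ses relative to middle ses
delta = 10                     # hypothetical change in science score

rrr_one_unit = math.exp(coef_science_low)
rrr_delta = math.exp(delta * coef_science_low)

print(f"RRR for a 1-point increase:  {rrr_one_unit:.3f}")
print(f"RRR for a {delta}-point increase: {rrr_delta:.3f}")
```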

**socst** – This is the relative risk ratio for a one
unit increase in **socst** score for low **ses** relative to middle **ses**
given that the other variables in the model are held constant. If a
subject were to increase their **socst** test score by one unit, the
relative risk for low **ses** relative to middle **ses** would be expected to be multiplied by a factor
of 0.962 (a decrease of about 3.8%) given the other variables in the model are held constant.

**female** – This is the relative risk ratio comparing
females to males for low **ses** relative to middle **ses**
given that the other variables in the model are held constant. For females
relative to males, the relative risk for low **ses** relative to middle **ses** would be expected to increase by a factor
of 2.263 given the other variables in the model are held constant.

**high ses relative to middle ses**

**science** – This is the relative risk ratio for a one unit
increase in **science** score for high **ses** relative to middle **ses**
given that the other variables in the model are held constant. If a
subject were to increase their **science** test score by one unit, the
relative risk for high **ses** relative to middle **ses** would be expected to increase by a factor
of 1.023 given the other variables in the model are held constant.

**socst** – This is the relative risk ratio for a one
unit increase in **socst** score for high **ses** relative to middle **ses**
given that the other variables in the model are held constant. If a
subject were to increase their **socst** test score by one unit, the
relative risk for high **ses** relative to middle **ses** would be expected to increase by a factor
of 1.044 given the other variables in the model are held constant.

**female** – This is the relative risk ratio comparing
females to males for high **ses** relative to middle **ses**
given that the other variables in the model are held constant. For females
relative to males, the relative risk for high **ses** relative to middle **ses** would be expected to
be multiplied by a factor
of 0.968 (a decrease of about 3.2%) given the other variables in the model are held constant.

b. **[95% Conf. Interval]** – This is the CI for the relative risk ratio
given the other predictors are in the model. For a given predictor with a level
of 95% confidence, we’d say that we are 95% confident that the "true" population
relative risk ratio comparing outcome *m* to the referent group lies
between the lower and upper limit of the interval. An advantage of a CI is that it is
illustrative; it provides a range where the "true" relative risk ratio may lie.
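The limits in the **rrr** output agree with exponentiating the limits of the logit-scale confidence interval, which also explains why the interval is not symmetric around the relative risk ratio. A minimal Python sketch for **female** in the low **ses** equation:

```python
import math

# Logit-scale 95% CI for female, low ses relative to middle ses.
logit_lower, logit_upper = 0.050311, 1.582929

# Exponentiate each limit to move to the relative risk ratio scale.
rrr_lower = math.exp(logit_lower)
rrr_upper = math.exp(logit_upper)
print(f"95% CI for RRR: [{rrr_lower:.6f}, {rrr_upper:.6f}]")

# On this scale the null value is 1, not 0.
excludes_one = rrr_lower > 1 or rrr_upper < 1
print("excludes one:", excludes_one)
```

An interval for a relative risk ratio that excludes one corresponds to a logit-scale interval that excludes zero.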