Multinomial Models for Discrete Outcomes
/* This file compares procedures for multinomial logit and ordered probit for data that are naturally ordered. A
common mistake is to estimate naturally ordered data with MNL or MNP. First, we estimate a multinomial logit (MNL) for the Spector and Mazzeo data with a new dependent variable, LETTERS. LETTERS is coded 0=C, 1=B, 2=A.
Then we reestimate the model as an ordered probit to compare the results.
Note of caution: Strictly speaking, choice models assume that n individuals make the choices, and that the choices are independent. However, in this example it is the instructor making the choices, and the choices are obviously not independent. In order to consider this example as a "choice" model, we need to think of the instructor as making n independent decisions based on the students' past grades, achievement tests, and whether each student received a treatment, PSI. This may not be substantively correct, since the instructor should merely evaluate performance based on course materials, rather than paying attention to the student's attributes. However, the objective is to illustrate how to estimate these models, not to specify an appropriate substantive model.*/
/*Using this file requires installation of dmlogit2 and the Hausman and Small-Hsiao IIA modules. It also uses SPOST which you should have already installed. Run a net search from within STATA on the following commands and install the proper files:
dmlogit2
iia
smhsiao
*/
/*Let's load the data*/
set more off
use "c:\users\B. Dan Wood\My Documents\My Teaching\Maximum Likelihood\Data\letters.dta", clear
summarize
/*Multinomial Logit */
mlogit letters gpa tuce psi, base(0)
/*The option "base(0)" sets outcome 0 (a grade of C) as the comparison group */
/* Note that MNL produces J-1 sets of coefficient
estimates for J choices. Typically, these are then used to compute probabilities
and marginal effects. CAUTION: the coefficients from MNL models can be
misleading. Their signs can actually be in the opposite direction from the
marginal effects, because the coefficients from all J-1 equations enter into
the calculations of both the marginal effects and probabilities.*/
/*Stata does not
automatically report the predicted probabilities. However, we can retrieve them in the
following manner*/
predict p0 p1 p2, p                 /*Calculate p(y=j) for each outcome j = 0, 1, 2 */
predict z0, xb outcome(0)           /*Calculate the index z for outcome 0 (zero for the base outcome) */
predict z1, xb outcome(1)           /*Calculate the index z for outcome 1 */
predict z2, xb outcome(2)           /*Calculate the index z for outcome 2 */
list letters z0 p0 z1 p1 z2 p2      /*List y, the predicted probabilities, and z */
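/* A minimal sketch, not part of the original program: the MNL probabilities can be rebuilt by hand from
the indexes, p(y=j) = exp(zj)/[exp(z0)+exp(z1)+exp(z2)]. Every equation enters the denominator, which is
why a coefficient's sign need not match the sign of the marginal effect. The names denom, p1_check, and
p1_diff are illustrative only. */
generate double denom = exp(z0) + exp(z1) + exp(z2)
generate double p1_check = exp(z1)/denom
generate double p1_diff = abs(p1 - p1_check)
summarize p1_diff                   /*Should be zero up to rounding error */
drop denom p1_check p1_diff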
/*We can also obtain other
helpful statistics*/
fitstat                             /*Obtain various fit statistics on the logit */
listcoef, help                      /*List the coefficients and standardized coefficients*/
prvalue, x(psi=0) rest(mean)        /*Predicted probability of each grade when psi=0, and rest at mean */
prvalue, x(psi=1) rest(mean)        /*Predicted probability of each grade when psi=1, and rest at mean */
prchange, help                      /*Compute various first differences */
dmlogit2 letters gpa tuce psi, base(0)  /*The model in derivatives with all variables at means*/
/* Reestimate the model for an alternative procedure to calculate derivatives */
mlogit letters gpa tuce psi, base(0)
mfx compute, predict(outcome(1))    /*Marginal effects on p(y=1) at the means; for the dummy psi, the discrete change from zero to one*/
mfx compute, predict(outcome(2))    /*The same for p(y=2)*/
/*Test for Independence of Irrelevant Alternatives in the MNL model with SPOST and mlogtest. By the way, this
procedure does a variety of tests, including whether categories can be combined or not. You can ask for the
tests individually, but the option "all" gives you everything. Use "search mlogtest" to get a listing of the tests.*/
mlogit letters gpa tuce psi, base(0)
mlogtest, all
/*Alternatively, there are other user-written procedures for doing the Hausman and Small-Hsiao tests. */
mlogit letters gpa tuce psi, base(0)
iia
smhsiao letters gpa tuce psi, elim(0)
/*Now let's drop the predicted probabilities*/
drop z0 p0 z1 p1 z2 p2
/* STATA also offers a version of multinomial probit. However, it assumes no correlation between errors
and seems to give short shrift to the required restrictions. The following
iterates for a long time and should probably be stopped after sufficient
iterations.*/
mprobit letters gpa tuce psi
/* MNL is inefficient when the dependent variable is naturally ordered. Grade assignment is a classic
example of ordered data. So, let's reestimate the relationship with ordered probit. */
oprobit letters gpa tuce psi
/* Note that with ordered probit there is only one set of coefficient estimates. This is more appealing, but
CAUTION is still required in interpreting the coefficients. As with MNL above, ordered probit coefficients
can have signs opposite to the marginal effects. Increasing X, while holding the coefficient and threshold
estimates constant, shifts the distribution of the latent variable (to the right when the coefficient is
positive). The probability of the highest outcome then rises and that of the lowest falls, but the probability
of an intermediate outcome may move in either direction. See Greene, pp. 736-40 for an example. */
/*We can also obtain
predicted probabilities similar to the MNL above*/
predict p0 p1 p2, p                 /*Calculate p(y=j) for each outcome j = 0, 1, 2 */
predict z, xb                       /*Calculate the single index z */
list letters z p0 p1 p2             /*List y, z, and the predicted probabilities */
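/* A minimal sketch, not part of the original program: the ordered probit probabilities can be rebuilt by
hand from z and the two estimated thresholds, which shows how a shift in z moves probability mass across
the outcomes. ASSUMPTION: a Stata version (9 or later) in which the thresholds can be read as _b[/cut1]
and _b[/cut2] and normal() is the standard normal CDF; earlier versions are assumed to store them as
_b[_cut1] and _b[_cut2] and to use norm(). The names c1, c2, and p*_check are illustrative only. */
scalar c1 = _b[/cut1]
scalar c2 = _b[/cut2]
generate double p0_check = normal(scalar(c1) - z)
generate double p1_check = normal(scalar(c2) - z) - normal(scalar(c1) - z)
generate double p2_check = 1 - normal(scalar(c2) - z)
list letters p0 p0_check p1 p1_check p2 p2_check
drop p0_check p1_check p2_check
scalar drop c1 c2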
fitstat                             /*Obtain various fit statistics on the ordered probit */
listcoef, help                      /*List the coefficients and standardized coefficients*/
prvalue, x(psi=0) rest(mean)        /*Predicted probability of each grade when psi=0, and rest at mean */
prvalue, x(psi=1) rest(mean)        /*Predicted probability of each grade when psi=1, and rest at mean */
prchange, help                      /*Compute various first differences */
mfx compute, predict(outcome(1))    /*Marginal effects on p(y=1) at the means; for the dummy psi, the discrete change from zero to one*/
mfx compute, predict(outcome(2))    /*The same for p(y=2)*/
/*Again, let's drop the
predicted probabilities*/
drop z p0 p1 p2
/* It is also possible to obtain ordered logit estimates. These are rarely seen in the literature.
However, the following illustrates the ordered logit
procedure. */
ologit letters gpa tuce psi
predict p0 p1 p2, p                 /*Calculate p(y=j) for each outcome j = 0, 1, 2 */
predict z, xb                       /*Calculate the single index z */
list letters z p0 p1 p2             /*List y, z, and the predicted probabilities */
fitstat                             /*Obtain various fit statistics on the ordered logit */
listcoef, help                      /*List the coefficients and standardized coefficients*/
prvalue, x(psi=0) rest(mean)        /*Predicted probability of each grade when psi=0, and rest at mean */
prvalue, x(psi=1) rest(mean)        /*Predicted probability of each grade when psi=1, and rest at mean */
prchange, help                      /*Compute various first differences */
mfx compute, predict(outcome(1))    /*Marginal effects on p(y=1) at the means; for the dummy psi, the discrete change from zero to one*/
mfx compute, predict(outcome(2))    /*The same for p(y=2)*/
/*Again, let's drop the
predicted probabilities*/
drop z p0 p1 p2
/*Now let's take a look at conditional logit. To do this,
we will need to use a new data set*/
use "c:\users\B. Dan Wood\My Documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear
set more off
/* This file illustrates an
application of STATA's DISCRETE CHOICE procedure,
conditional logit. The model is one of choice of mode
for transportation, for a sample of individuals who travel between Sydney and
Note the special set-up for the data for DISCRETE
CHOICE models. The data consist of observations on choices, rather than on
individuals. For each individual there are j possible choices, so that there
are j*n observations in the data set. Basically, the data setup is similar to
that for panel data. See the STATA help system or manual for more on setting up
the data for DISCRETE CHOICE models.
Conditional Logit is
commonly used when the decisionmaker chooses primarily
on the basis of attributes of the choices, rather than attributes of the
individual. This approach is illustrated in the program below.*/
/* Below we estimate a conditional logit model for choice of mode of transport using choice
specific constants and a generalized measure of perceived cost of the method of
transportation, household income, and terminal waiting time. Note that unlike
LIMDEP we must specify the group ourselves, in this case "group(id)"*/
clogit mode aasc tasc basc gc ttme hinca, group(id)
predict prob, p                     /*Calculate the probability of each alternative within each individual's choice set */
predict zi, xb                      /*Calculate the index z for each alternative */
list mode prob zi                   /*List y, the predicted probabilities, and z */
fitstat                             /*Obtain various fit statistics on the conditional logit */
listcoef, help                      /*List the coefficients and standardized coefficients*/
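/* A minimal sketch, not part of the original program: within each individual's choice set the conditional
logit probability of an alternative is exp(z) divided by the sum of exp(z) over that individual's
alternatives. The names ez, sumez, and prob_check are illustrative only; egen's total() function assumes
Stata 9 or later. Note that bysort reorders the data by id. */
generate double ez = exp(zi)
bysort id: egen double sumez = total(ez)
generate double prob_check = ez/sumez
list id mode prob prob_check
drop ez sumez prob_check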
/* With conditional logit as opposed to MNL, we get a single set of coefficient
estimates. This is an appealing attribute of the model. The preceding estimates
say simply that people choose their transportation mode based on perceived cost
and a set of unmeasured attributes associated with the choices. Note that
perceived cost is an attribute of the choice, rather than an attribute of the
individual.
However, it might be reasonable to assert that the decisionmaker's income is also relevant to the choice. The decisionmaker's income does not vary across the
alternatives. As a result, the discrete choice probabilities are homogeneous of
degree zero in the parameters. That bit of jargon means that if there are any
attributes that are the same for all outcomes for every individual, they drop
out of the probability model.
Thus, for example, any individual characteristic
such as age, or income, will cause this problem. The only way such variables
can be brought into the conditional logit model is by
interacting them with the choice specific constants. This is similar to
analysis of covariance in regression.*/
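/* A minimal sketch, not part of the original program, of bringing an individual characteristic into the
model through an interaction with a choice-specific constant. ASSUMPTION: the data set holds household
income in a variable named hinc; that name and the new variable hinc_train are hypothetical, so the block
is skipped when hinc is absent. */
capture confirm variable hinc
if _rc == 0 {
    generate double hinc_train = hinc*tasc
    clogit mode aasc tasc basc gc ttme hinca hinc_train, group(id)
    drop hinc_train
}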
/*Test for IIA using conditional logit
setup */
clogit mode aasc tasc basc gc ttme hinca, group(id)
est store all
clogit mode aasc tasc basc gc ttme hinca if choice ~=1, group(id)
est store partial
hausman partial all, alleqs constant
/*Now let's do a Nested Logit, which relaxes the IIA assumption across nests and allows modeling a "tree-like"
probability model where choices are conditioned by prior choices.*/
/*Read in the data again to rid ourselves of the
preceding definitions. */
use "c:\users\B. Dan Wood\My Documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear
set more off
sort id
by id: gen travel=_n-1
nlogitgen type=travel(fly: 0, land: 1|2|3)
nlogittree travel type
nlogit mode (travel=aasc tasc basc gc ttme) (type=hinca), group(id)