Multinomial Models for Discrete Outcomes
This session shows how to use LIMDEP's procedures for Multinomial Logit (MNL) and Multinomial Probit (MNP). We also look at Conditional Logit and at Ordered Logit and Probit.
First, MNL and Ordered Probit
/* This file compares procedures for multinomial logit and ordered probit for
data that are naturally ordered. A common mistake is to estimate naturally
ordered data with MNL or MNP. First, we estimate a multinomial logit (MNL) for
the Spector and Mazzeo data with a new dependent variable, LETTERS. LETTERS is
coded 0=C, 1=B, 2=A. Then we reestimate the model as an ordered probit to
compare the results.
Note of caution:
Strictly speaking, choice models assume that n individuals make the choices,
and that the choices are independent. In this example, however, it is the
instructor who makes the choices, and the choices are obviously not independent.
To treat this example as a "choice" model, we need to think of the instructor
as making n independent decisions based on each student's past grades,
achievement test score, and whether the student received the treatment, PSI.
This may not be substantively correct, since the instructor should merely
evaluate performance based on course materials, rather than paying attention
to the student's attributes. However, the objective is to illustrate how to
estimate these models, not to specify an appropriate substantive model. */
Reset $
/* Read the data.
*/
Read ; Nobs=32 ;
Nvar=6 ; Names=OBS,GPA,TUCE,PSI,GRADE,LETTERS $
1 2.66 20 0 0 0
2 2.89 22 0 0 1
3 3.28 24 0 0 1
4 2.92 12 0 0 1
5 4 21 0 1 2
6 2.86 17 0 0 1
7 2.76 17 0 0 1
8 2.87 21 0 0 1
9 3.03 25 0 0 2
10 3.92 29 0 1 2
11 2.63 20 0 0 0
12 3.32 23 0 0 1
13 3.57 23 0 0 1
14 3.26 25 0 1 2
15 3.53 26 0 0 1
16 2.74 19 0 0 1
17 2.75 25 0 0 0
18 2.83 19 0 0 0
19 3.12 23 1 0 1
20 3.16 25 1 1 2
21 2.06 22 1 0 2
22 3.62 28 1 1 2
23 2.89 14 1 0 2
24 3.51 26 1 0 1
25 3.54 24 1 1 2
26 2.83 27 1 1 2
27 3.39 17 1 1 2
28 2.67 24 1 0 1
29 3.65 21 1 1 2
30 4 23 1 1 2
31 3.1 21 1 0 2
32 2.39 19 1 1 2
/* Create a namelist that contains the right-hand-side variables */
NAMELIST ; X = one,gpa,tuce,psi $
/* Estimate a multinomial logit with 3 categories. LIMDEP recognizes that you
want MNL from the coding of the dependent variable. The coding must always be
0,1,2,...,J, beginning with 0. Note that we also request marginal effects
based on the stratification variable, psi. */
LOGIT ; LHS = letters ; Rhs = X ; Margin=psi $
/* Note that MNL produces J-1 sets of coefficient estimates for J choices (here,
2 sets for the 3 grade categories). Typically, these are then used to compute
probabilities and marginal effects. CAUTION: the coefficients from MNL models
can be misleading. The sign of a coefficient can be opposite to the sign of the
corresponding marginal effect, because the coefficients from all J-1 equations
enter into the calculation of both the marginal effects and the probabilities.*/
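To see why a coefficient and its marginal effect can differ in sign, it may help to write out the 3-category case with category 0 as the base, so that its coefficient vector is normalized to zero. What follows is a sketch of the standard MNL formulas in notation chosen for this note (the symbols are not LIMDEP names):

```latex
P_j = \frac{\exp(\beta_j' x)}{1 + \exp(\beta_1' x) + \exp(\beta_2' x)},
\qquad j = 0, 1, 2, \qquad \beta_0 = 0

\frac{\partial P_j}{\partial x} = P_j \left( \beta_j - \bar{\beta} \right),
\qquad \bar{\beta} = \sum_{k=0}^{2} P_k \, \beta_k
```

Because \bar{\beta} is a probability-weighted average of all the coefficient vectors, an element of \beta_j - \bar{\beta} can be negative even when the same element of \beta_j is positive.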
/* LIMDEP's MNL
program does not retain the predicted probabilities. While this is a bit
inconvenient, it is easy to calculate them. Below is a short program that
calculates the probabilities from the 3 category MNL model estimated above. You
would need to modify this when there are more categories. See LIMDEP's help
system for an example with 4 categories. However, note the error in LIMDEP's
help file (i.e., X'B versus B'X). */
CALC ; list ; K = Col(X) ; K1 = K+1 ; TwoK = 2*K ; K21 = TwoK+1 $
MATRIX ; list ; B1 = Part(B,1,K) ; B2 = Part(B,K1,TwoK) $
CREATE ; p0 = 1/(1+exp(B1'X)+exp(B2'X))
; p1 = exp(B1'X)*p0
; p2 = exp(B2'X)*p0 $
List ; p0,p1,p2 $
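For readers who want to check the arithmetic outside LIMDEP, the same calculation can be sketched in Python for a single observation. The coefficient values below are hypothetical placeholders, not the estimates LIMDEP reports, and the function name `mnl_probs` is invented for this illustration.

```python
import math

def mnl_probs(x, b1, b2):
    """Probabilities for a 3-category MNL with category 0 as the base."""
    # Linear indexes b1'x and b2'x for categories 1 and 2.
    e1 = math.exp(sum(b * v for b, v in zip(b1, x)))
    e2 = math.exp(sum(b * v for b, v in zip(b2, x)))
    p0 = 1.0 / (1.0 + e1 + e2)
    return p0, e1 * p0, e2 * p0

# Hypothetical coefficient vectors (NOT estimates from the model above),
# ordered as in the namelist X: one, gpa, tuce, psi.
b1 = [-2.0, 0.5, 0.01, 0.3]
b2 = [-6.0, 1.5, 0.02, 1.0]
x = [1.0, 3.0, 22.0, 1.0]

p0, p1, p2 = mnl_probs(x, b1, b2)
print(p0, p1, p2)
```

With the real estimates, B1 and B2 from the MATRIX command would take the place of `b1` and `b2`, and the three probabilities should sum to one for every observation.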
/* MNL is inefficient when the dependent variable is naturally ordered, because
it ignores the ordering. Grade assignment is a classic example of ordered data.
So, let's reestimate the relationship with ordered probit. */
ORDERED ; LHS=letters ; Rhs=X ; Margin=psi ; List $
/* Note that with ordered probit there is only one set of coefficient estimates.
This is more appealing, but CAUTION is still required in interpreting the
coefficients. As with MNL above, an ordered probit coefficient can have the
opposite sign from the marginal effect on a particular outcome.
This is because increasing X, while holding the coefficient and threshold
estimates constant, shifts the distribution to the right (given a positive
coefficient). The shift lowers the probability of the lowest outcome and raises
the probability of the highest outcome, but the probability of an intermediate
outcome may rise or fall. See Greene, pp. 928-29 for an example. */
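The sign ambiguity for the middle outcome can be illustrated with a small Python sketch. It assumes the usual three-outcome ordered probit with the first threshold normalized to 0 and a second threshold `mu`; the index values and `mu = 1.0` are hypothetical, not estimates from the model above.

```python
import math

def norm_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, mu):
    """P(y=0), P(y=1), P(y=2) with thresholds 0 and mu (mu > 0)."""
    p0 = norm_cdf(-xb)
    p1 = norm_cdf(mu - xb) - norm_cdf(-xb)
    p2 = 1.0 - norm_cdf(mu - xb)
    return p0, p1, p2

mu = 1.0  # hypothetical threshold
low = ordered_probit_probs(0.0, mu)   # small index x'b
mid = ordered_probit_probs(0.5, mu)   # larger index
high = ordered_probit_probs(1.5, mu)  # larger still

for label, probs in (("low", low), ("mid", mid), ("high", high)):
    print(label, [round(p, 4) for p in probs])
```

As the index grows, P(y=0) falls and P(y=2) rises throughout, while P(y=1) first rises and then falls. So a positive coefficient does not pin down the sign of the effect on the middle category.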
/* It is also
possible to obtain ordered logit estimates. These are rarely seen in the
literature. However, the following illustrates the ordered logit procedure. */
ORDERED ; Logit ;
LHS=letters ; Rhs=X ; Margin=psi ; List $
/* Compare the ordered probability models to MNL above. Note especially that the
ordered models correctly classified 22 of 32 outcomes, while MNL correctly
classified only 17 of the 32. MNL performs especially poorly, given that we
could have correctly classified 15 simply by predicting the modal category, 2,
for every observation. Also, note the large number of errors for MNL in the
classification of B's. */
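The modal-category benchmark can be checked directly from the LETTERS column of the data listed earlier. The Python sketch below counts the categories and compares the naive hit rate with the two hit counts quoted in the comment (17 for MNL, 22 for the ordered models). Those two counts are copied from the text, not recomputed here.

```python
from collections import Counter

# LETTERS column for observations 1-32, transcribed from the data listing.
letters = [0, 1, 1, 1, 2, 1, 1, 1, 2, 2, 0, 1, 1, 2, 1, 1,
           0, 0, 1, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2]

counts = Counter(letters)
modal_category, modal_hits = counts.most_common(1)[0]
n = len(letters)

print("counts by category:", dict(counts))
print("modal category:", modal_category)
print("naive hit rate:", modal_hits / n)
print("MNL hit rate (from text):", 17 / n)
print("ordered hit rate (from text):", 22 / n)
```

Any classification table is easier to judge against this naive rate than against zero.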
delete ; * $
Conditional Logit is commonly used when the decisionmaker chooses primarily on the basis of attributes of the choices, rather than attributes of the individual. This approach is illustrated in the program below.
/* This file illustrates an application of LIMDEP's DISCRETE CHOICE procedure,
conditional logit. The model is one of choice of mode for transportation, for a
sample of individuals who travel between Sydney and Melbourne, Australia. The
four choices are Air, Train, Bus, and Car.
Note the special
set-up for the data for DISCRETE CHOICE models. The data consist of
observations on choices, rather than on individuals. For each individual there
are j possible choices, so that there are j*n observations in the data set.
Basically, the data setup is similar to that for panel data. See the LIMDEP
help system or manual for more on setting up the data for DISCRETE CHOICE
models.
The data set
contains the following:
Mode = 0/1 indicator for the chosen alternative; the four alternatives are 1=Air, 2=Train, 3=Bus, 4=Car
Ttme = terminal waiting time
Invc = Invehicle cost for all stages
Invt = Invehicle time for all stages
Gc = Generalized cost measure = Invc + Invt * value of time
Chair = Dummy variable for chosen mode is air
Hinc = Household income in thousands
Psize = Travelling party size.
Transformed variables include
Indj = Indicator to select mode given not air
Indi = Indicator to select mode Air/Not air
Aasc = Choice specific dummy for Air
Tasc = Choice specific dummy for Train
Basc = Choice specific dummy for Bus
Casc = Choice specific dummy for Car
Psizea = Psize * Aasc
Z = Tasc + Basc + Casc = Dummy variable for Not Air
Nij = Number of choices in branch: 1 if Aasc=1, 3 if Aasc=0
Ni = 2 = number of branches in tree.
*/
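The stacked layout described above, one row per individual per alternative, can be sketched in Python as follows. The helper `to_long` and all the numbers passed to it are hypothetical and serve only to show the shape of the data. With four alternatives, the 840 rows read below correspond to 840/4 = 210 individuals.

```python
ALTS = ["Air", "Train", "Bus", "Car"]

def to_long(person_id, chosen, gc_by_alt, hinc):
    """Expand one individual's record into one row per alternative."""
    rows = []
    for alt in ALTS:
        rows.append({
            "id": person_id,
            "alt": alt,
            "mode": 1 if alt == chosen else 0,  # 0/1 choice indicator
            "gc": gc_by_alt[alt],               # varies across alternatives
            "hinc": hinc,                       # constant within individual
        })
    return rows

# Hypothetical individual who chose Car; the values are made up.
rows = to_long(1, "Car", {"Air": 70, "Train": 71, "Bus": 70, "Car": 30}, 35)
for r in rows:
    print(r)

print("individuals implied by 840 rows:", 840 // len(ALTS))
```

Exactly one row in each block of four has mode = 1, and individual-level variables such as hinc repeat down the block.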
Read ;
File=clogit.dat ; nobs=840 ; nvar = 19 ; names=Mode, Ttme,Invc, Invt, Gc,
ChAir, Hinc, Psize, Indj, Indi, Aasc, Tasc, Basc, Casc, Hinca, Psizea, Nalt,
Nij, Ni $
Sample ; 1 - 840 $
/* Below we
estimate a conditional logit model for choice of mode of transport using choice
specific constants and a generalized measure of perceived cost of the method of
transportation.
Note that LIMDEP
creates a set of choice specific dummy variables automatically when we include
the ONE option on the RHS= statement. */
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc $
/* We can also
specify the conditional logit model through a set of utility functions for each
of the J choices. For example, consider the following that produces the same
result as the preceding. */
Nlogit ; lhs=mode
; choices=Air,Train,Bus,Car
; model:
U(Air) = BA + BZ*gc/
U(Train) = BT + BZ*gc/
U(Bus) = BB + BZ*gc/
U(Car) = BZ*gc $
/* The advantage of the utility function approach is that you can specify
different utility functions for each choice. */
/* With conditional
logit as opposed to MNL, we get a single set of coefficient estimates. This is
an appealing attribute of the model. The preceding estimates say simply that
people choose their transportation mode based on perceived cost and a set of
unmeasured attributes associated with the choices. Note that perceived cost is
an attribute of the choice, rather than an attribute of the individual.
However, it might be reasonable to assert that the decisionmaker's income is
also relevant to the choice. The decisionmaker's income does not vary across
the alternatives, so with a single coefficient it adds the same amount to the
utility of every alternative and cancels out of the choice probabilities. More
generally, any attribute that is the same for all outcomes for a given
individual drops out of the probability model.
Thus any individual characteristic, such as age or income, will cause this
problem. The only way such variables can be brought into the conditional logit
model is by interacting them with the choice specific constants. This is
similar to analysis of covariance in regression.
The specification
below includes interaction terms to capture the effect of income on the choice
decision. LIMDEP adds these automatically when the Rh2= option is present. */
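A short numerical sketch in Python may make the cancellation concrete. The utilities, the income value, and the coefficients below are all hypothetical. The point is only that a term common to every alternative leaves the logit probabilities unchanged, while an alternative-specific coefficient on income does not.

```python
import math

def choice_probs(utilities):
    """Conditional logit probabilities from a list of utilities."""
    m = max(utilities)  # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

base = [0.4, -0.1, -0.3, 0.0]  # hypothetical utilities: Air, Train, Bus, Car
hinc = 35.0                    # hypothetical household income

# Same income coefficient for every alternative: the term cancels.
shifted = [u + 0.02 * hinc for u in base]

# Income interacted with choice specific constants (Car as the base):
# a different coefficient per alternative, so it no longer cancels.
gammas = [0.02, -0.01, -0.03, 0.0]
interacted = [u + g * hinc for u, g in zip(base, gammas)]

print(choice_probs(base))
print(choice_probs(shifted))
print(choice_probs(interacted))
```

The first two lines of output agree up to rounding error, and the third differs.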
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc $
/* Model
interpretation is again not easy with conditional logit. We need to be
especially cautious about interpreting the coefficients. We can also look at
probabilities, partial derivatives, and elasticities. Below is a program that
illustrates some of the various outputs from the DISCRETE CHOICE procedure.*/
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc
; describe
; crosstab
; list
; effects: gc[*] $
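/* We can also test for violations of the IIA (independence of irrelevant
alternatives) assumption as follows: estimate the model on the full choice
set, save the estimates, then reestimate with one alternative (here, air)
excluded. In practice, the test often fails because the difference matrix is
not positive definite. This can be seen in the example below. */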
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc $
MATRIX ; Bu = B ; Vu = VARB $
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc
; ias= air $
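The comparison that this pair of estimations sets up is usually presented as the Hausman and McFadden statistic. As a sketch in notation chosen for this note (not LIMDEP output): let b_u and V_u be the estimates and covariance matrix from the full choice set (saved above as Bu and Vu), keeping only the parameters that are still identified once air is dropped, and let b_r and V_r be the estimates and covariance matrix from the restricted choice set.

```latex
H = (b_r - b_u)' \left[ V_r - V_u \right]^{-1} (b_r - b_u)
\;\overset{a}{\sim}\; \chi^2(K),
\qquad K = \text{number of coefficients compared}
```

When V_r - V_u is not positive definite in a given sample, the quadratic form can be negative or cannot be computed, which is the failure described in the comment above.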
delete ; * $