Multinomial Models for Discrete Outcomes

This session shows how to use LIMDEP's procedures for multinomial logit (MNL) and multinomial probit (MNP). We also look at conditional logit and at ordered logit and probit.

First, MNL and Ordered Probit

/* This file compares procedures for multinomial logit and ordered probit for data that are naturally ordered. A common mistake is to estimate naturally ordered data with MNL or MNP. First, we estimate a multinomial logit (MNL) for the Spector and Mazzeo data with a new dependent variable, LETTERS. LETTERS is coded 0=C, 1=B, 2=A. Then we reestimate the model as an ordered probit to compare the results.

Note of caution: Strictly speaking, choice models assume that n individuals make the choices, and that the choices are independent. However, in this example it is the instructor making the choices, and the choices are obviously not independent. In order to consider this example as a "choice" model, we need to think of the instructor as making n independent decisions based on each student's past grades, achievement test score, and whether he or she received the treatment, PSI. This may not be substantively correct, since the instructor should merely evaluate performance based on course materials, rather than paying attention to the student's attributes. However, the objective is to illustrate how to estimate these models, not to specify an appropriate substantive model. */

Reset $

/* Read the data. */

Read ; Nobs=32 ; Nvar=6 ; Names=OBS,GPA,TUCE,PSI,GRADE,LETTERS $

1 2.66 20 0 0 0
2 2.89 22 0 0 1
3 3.28 24 0 0 1
4 2.92 12 0 0 1
5 4 21 0 1 2
6 2.86 17 0 0 1
7 2.76 17 0 0 1
8 2.87 21 0 0 1
9 3.03 25 0 0 2
10 3.92 29 0 1 2
11 2.63 20 0 0 0
12 3.32 23 0 0 1
13 3.57 23 0 0 1
14 3.26 25 0 1 2
15 3.53 26 0 0 1
16 2.74 19 0 0 1
17 2.75 25 0 0 0
18 2.83 19 0 0 0
19 3.12 23 1 0 1
20 3.16 25 1 1 2
21 2.06 22 1 0 2
22 3.62 28 1 1 2
23 2.89 14 1 0 2
24 3.51 26 1 0 1
25 3.54 24 1 1 2
26 2.83 27 1 1 2
27 3.39 17 1 1 2
28 2.67 24 1 0 1
29 3.65 21 1 1 2
30 4 23 1 1 2
31 3.1 21 1 0 2
32 2.39 19 1 1 2

/* Create a namelist matrix that contains the right side variables */

NAMELIST ; X = one,gpa,tuce,psi $

/* Estimate a multinomial logit with 3 categories. LIMDEP automatically recognizes that you want MNL by the coding of the dependent variable. With J outcomes, the coding must always be 0,1,2,...,J-1, beginning with 0. Note that we are also calling for marginal effects based on the stratification variable, psi. */

LOGIT ; LHS = letters ; Rhs = X ; Margin=psi $

/* Note that MNL produces J-1 sets of coefficient estimates for J choices. Typically, these are then used to compute probabilities and marginal effects. CAUTION: the coefficients from MNL models can be misleading. Their signs can actually be in the opposite direction from the marginal effects, because the coefficients from all J-1 equations enter into the calculations of both the marginal effects and probabilities.*/
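The sign caution above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not LIMDEP output: it assumes a single regressor, three outcomes (0, 1, 2) with outcome 0 as the base, and two made-up coefficients. It uses the standard MNL result that the marginal effect for outcome j is P_j times (beta_j minus the probability-weighted average of all the betas).

```python
import math

def mnl_probs(x, betas):
    """Probabilities for outcomes 0, 1, 2, ... with one regressor.
    Outcome 0 is the base category, so its coefficient is fixed at 0."""
    scores = [0.0] + [b * x for b in betas]
    denom = sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores]

def mnl_marginal_effects(x, betas):
    """dP_j/dx = P_j * (beta_j - sum_k P_k * beta_k), with beta_0 = 0."""
    full = [0.0] + list(betas)
    p = mnl_probs(x, betas)
    beta_bar = sum(pk * bk for pk, bk in zip(p, full))
    return [pj * (bj - beta_bar) for pj, bj in zip(p, full)]

# Hypothetical coefficients for outcomes 1 and 2 (both positive).
betas = [0.5, 2.0]
x0 = 1.0

probs = mnl_probs(x0, betas)
effects = mnl_marginal_effects(x0, betas)
for j, (pj, mj) in enumerate(zip(probs, effects)):
    print(f"outcome {j}: P = {pj:.3f}, dP/dx = {mj:+.3f}")
```

With these illustrative numbers the coefficient for outcome 1 is positive, yet its marginal effect is negative, because outcome 2 gains probability faster as x increases.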

/* LIMDEP's MNL program does not retain the predicted probabilities. While this is a bit inconvenient, it is easy to calculate them. Below is a short program that calculates the probabilities from the 3 category MNL model estimated above. You would need to modify this when there are more categories. See LIMDEP's help system for an example with 4 categories. However, note the error in LIMDEP's help file (i.e., X'B versus B'X). */

CALC ; list ; K = Col(X) ; K1 = K+1
     ; TwoK = 2*K ; K21 = TwoK+1 $
MATRIX ; list ; B1 = Part(B,1,K) ; B2 = Part(B,K1,TwoK) $
CREATE ; p0 = 1/(1+exp(B1'X)+exp(B2'X))
       ; p1 = exp(B1'X)*p0
       ; p2 = exp(B2'X)*p0 $
LIST ; p0,p1,p2 $
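For readers who want to verify the arithmetic outside LIMDEP, here is a rough Python equivalent of the probability program for a single observation. It assumes the coefficient vector is stacked as the first K entries for outcome 1 followed by K entries for outcome 2, mirroring the Part(B,...) calls. The coefficient values are invented for illustration and are not estimates from this data set.

```python
import math

def mnl_probs_from_stacked(b, x):
    """Split a stacked coefficient vector into B1 and B2 and return
    (p0, p1, p2) for one observation x, with outcome 0 as the base."""
    k = len(x)
    b1, b2 = b[:k], b[k:2 * k]
    xb1 = sum(c * v for c, v in zip(b1, x))
    xb2 = sum(c * v for c, v in zip(b2, x))
    p0 = 1.0 / (1.0 + math.exp(xb1) + math.exp(xb2))
    return p0, math.exp(xb1) * p0, math.exp(xb2) * p0

# Hypothetical coefficients (NOT LIMDEP output), ordered one,gpa,tuce,psi
# for outcome 1 and then again for outcome 2.
b = [-2.0, 0.8, 0.01, 0.5,
     -9.0, 2.5, 0.05, 1.8]

# First observation in the data above: one, GPA=2.66, TUCE=20, PSI=0.
x = [1.0, 2.66, 20.0, 0.0]

p0, p1, p2 = mnl_probs_from_stacked(b, x)
print(f"p0 = {p0:.3f}, p1 = {p1:.3f}, p2 = {p2:.3f}")
```

The three probabilities sum to one by construction, which is a quick check on any hand-written version of this calculation.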

/* MNL is inefficient when the dependent variable is naturally ordered. Grade assignment is a classic example of ordered data, so we reestimate the relationship with ordered probit. */

ORDERED ; LHS=letters ; Rhs=X ; Margin=psi ; List $

/* Note that with ordered probit there is only one set of coefficient estimates. This is more appealing, but CAUTION is still required in interpreting the coefficients. As with MNL above, ordered probit coefficients can also have opposite signs from the marginal effects.

This is because increasing X, while holding the coefficient and threshold estimates constant, shifts the whole distribution (to the right when the coefficient is positive). The probability of the lowest outcome then falls and that of the highest rises, but the probability of an intermediate outcome can move in either direction. See Greene, pp. 928-29 for an example. */
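The shift argument can be illustrated numerically. The sketch below is Python, not LIMDEP, and assumes a three-outcome ordered probit with the first threshold normalized to zero, one free threshold mu, and a made-up positive coefficient. It evaluates the outcome probabilities at three values of x.

```python
import math

def norm_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, mu):
    """P(y=0), P(y=1), P(y=2) with thresholds 0 and mu."""
    p0 = norm_cdf(-xb)
    p1 = norm_cdf(mu - xb) - norm_cdf(-xb)
    p2 = 1.0 - norm_cdf(mu - xb)
    return p0, p1, p2

# Hypothetical values: a positive coefficient and one free threshold.
beta, mu = 1.0, 1.0

low = ordered_probit_probs(beta * -1.0, mu)
mid = ordered_probit_probs(beta * 0.0, mu)
high = ordered_probit_probs(beta * 2.0, mu)

for label, p in (("x=-1", low), ("x= 0", mid), ("x= 2", high)):
    print(label, " ".join(f"{v:.3f}" for v in p))
```

Although the coefficient is positive, the probability of the middle outcome rises between the first two points and falls between the last two, while the lowest outcome falls and the highest rises throughout.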

/* It is also possible to obtain ordered logit estimates. These are rarely seen in the literature. However, the following illustrates the ordered logit procedure. */

ORDERED ; Logit ; LHS=letters ; Rhs=X ; Margin=psi ; List $

/* Compare the ordered probability models to MNL above. Note especially that the ordered models correctly classified 22 of 32 outcomes, while MNL only got 17 of the 32. MNL is especially bad, given that we could have correctly classified 15 simply by looking at the modal category, 2. Also, note the large number of errors for MNL in classification of B's. */
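The modal-category benchmark mentioned above is easy to reproduce from the LETTERS column of the data. A short Python sketch follows. The counts come from the 32 observations listed earlier, and the hit counts for the ordered models (22) and MNL (17) are the figures quoted in the comment, not recomputed here.

```python
from collections import Counter

# LETTERS column for observations 1-32, in order (0=C, 1=B, 2=A).
letters = [0, 1, 1, 1, 2, 1, 1, 1, 2, 2, 0, 1, 1, 2, 1, 1,
           0, 0, 1, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2]

counts = Counter(letters)
modal_category, modal_hits = counts.most_common(1)[0]
n = len(letters)

print("counts by category:", dict(sorted(counts.items())))
print(f"modal category {modal_category}: {modal_hits} of {n} correct")

# Hit counts reported in the text for the fitted models.
reported_hits = {"ordered probit/logit": 22, "MNL": 17}
for model, hits in reported_hits.items():
    gain = hits - modal_hits
    print(f"{model}: {hits}/{n} = {hits / n:.3f} (gain over modal rule: {gain})")
```

Comparing each model's hit count with the naive modal rule gives a fairer sense of fit than the raw hit rate alone.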

delete ; * $

Conditional Logit is commonly used when the decisionmaker chooses primarily on the basis of attributes of the choices, rather than attributes of the individual. This approach is illustrated in the program below.


/* This file illustrates an application of LIMDEP's DISCRETE CHOICE procedure, conditional logit. The model is one of choice of mode for transportation, for a sample of individuals who travel between Sydney and Melbourne, Australia. The four choices are Air, Train, Bus, and Car.

Note the special set-up for the data for DISCRETE CHOICE models. The data consist of observations on choices, rather than on individuals. For each of the n individuals there are J possible choices, so that there are J*n observations in the data set. Basically, the data setup is similar to that for panel data. See the LIMDEP help system or manual for more on setting up the data for DISCRETE CHOICE models.

The data set contains the following:

Mode = 0/1 indicator for the chosen alternative; the four rows for each individual are ordered 1=Air, 2=Train, 3=Bus, 4=Car
Ttme = terminal waiting time
Invc = Invehicle cost for all stages
Invt = Invehicle time for all stages
Gc = Generalized cost measure = Invc + Invt * value of time
Chair = Dummy variable for chosen mode is air
Hinc = Household income in thousands
Psize = Travelling party size.
Transformed variables include
Indj = Indicator to select mode given not air
Indi = Indicator to select mode Air/Not air
Aasc = Choice specific dummy for Air
Tasc = Choice specific dummy for Train
Basc = Choice specific dummy for Bus
Casc = Choice specific dummy for Car
Psizea = Psize * Aasc
Z = Tasc + Basc + Casc = Dummy variable for Not Air
Nij = number of choices in branch = 1 if Aasc=1, 3 if Aasc=0
Ni = 2 = number of branches in tree.

*/
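To make the one-row-per-alternative layout concrete, here is a small Python sketch that expands two hypothetical travellers into stacked rows. The field names echo the variable list above, but the values are invented and are not taken from clogit.dat.

```python
ALTERNATIVES = ["Air", "Train", "Bus", "Car"]

def to_long_format(individuals):
    """Expand one record per individual into one row per alternative."""
    rows = []
    for person_id, person in enumerate(individuals, start=1):
        for alt in ALTERNATIVES:
            rows.append({
                "id": person_id,
                "alt": alt,
                # 0/1 indicator: 1 only on the row of the chosen mode
                "mode": 1 if person["chosen"] == alt else 0,
                # attribute of the choice: varies across rows
                "gc": person["gc"][alt],
                # attribute of the individual: repeated on every row
                "hinc": person["hinc"],
                # choice-specific dummy for Air
                "aasc": 1 if alt == "Air" else 0,
            })
    return rows

# Two hypothetical travellers (illustrative values only).
individuals = [
    {"chosen": "Car", "hinc": 35,
     "gc": {"Air": 70, "Train": 71, "Bus": 70, "Car": 30}},
    {"chosen": "Air", "hinc": 60,
     "gc": {"Air": 68, "Train": 84, "Bus": 85, "Car": 50}},
]

rows = to_long_format(individuals)
for r in rows:
    print(r["id"], r["alt"], r["mode"], r["gc"], r["hinc"], r["aasc"])
```

With 2 individuals and 4 alternatives this produces 8 rows, in the same way that 840 rows in the file correspond to 210 travellers.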

Read ; File=clogit.dat ; nobs=840 ; nvar = 19 ; names=Mode, Ttme,Invc, Invt, Gc, ChAir, Hinc, Psize, Indj, Indi, Aasc, Tasc, Basc, Casc, Hinca, Psizea, Nalt, Nij, Ni $

Sample ; 1 - 840 $

/* Below we estimate a conditional logit model for choice of mode of transport using choice specific constants and a generalized measure of perceived cost of the method of transportation.

Note that LIMDEP creates a set of choice specific dummy variables automatically when we include the ONE option on the RHS= statement. */

Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc $

/* We can also specify the conditional logit model through a set of utility functions for each of the J choices. For example, consider the following that produces the same result as the preceding. */

Nlogit ; lhs=mode
; choices=Air,Train,Bus,Car
; model:
U(Air)   = BA + BZ*gc/
U(Train) = BT + BZ*gc/
U(Bus)   = BB + BZ*gc/
U(Car)   = BZ*gc $

/* The advantage of the utility function approach is that you can specify different utility functions for each choice. */
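The utility functions above translate directly into choice probabilities: each probability is exp(U) for that alternative divided by the sum of exp(U) over all four. A minimal Python sketch follows, using invented values for BA, BT, BB and BZ (they are placeholders, not estimates) and invented generalized costs.

```python
import math

def clogit_probs(gc, coef):
    """Conditional logit probabilities from the four utility functions.
    Car has no constant, so it serves as the reference alternative."""
    utilities = {
        "Air": coef["BA"] + coef["BZ"] * gc["Air"],
        "Train": coef["BT"] + coef["BZ"] * gc["Train"],
        "Bus": coef["BB"] + coef["BZ"] * gc["Bus"],
        "Car": coef["BZ"] * gc["Car"],
    }
    denom = sum(math.exp(u) for u in utilities.values())
    return {alt: math.exp(u) / denom for alt, u in utilities.items()}

# Hypothetical coefficient values, for illustration only.
coef = {"BA": 1.0, "BT": 0.8, "BB": 0.2, "BZ": -0.02}

# Hypothetical generalized costs for one traveller.
gc = {"Air": 70.0, "Train": 71.0, "Bus": 70.0, "Car": 50.0}

probs = clogit_probs(gc, coef)

# Lower the cost of Car by 20 and recompute.
cheaper_car = dict(gc, Car=gc["Car"] - 20.0)
probs_cheaper = clogit_probs(cheaper_car, coef)

for alt in probs:
    print(f"{alt:5s} {probs[alt]:.3f} -> {probs_cheaper[alt]:.3f}")
```

With a negative cost coefficient, making Car cheaper raises its probability and lowers all the others. Giving one alternative its own cost coefficient would only require changing that alternative's line in the dictionary.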

/* With conditional logit as opposed to MNL, we get a single set of coefficient estimates. This is an appealing attribute of the model. The preceding estimates say simply that people choose their transportation mode based on perceived cost and a set of unmeasured attributes associated with the choices. Note that perceived cost is an attribute of the choice, rather than an attribute of the individual.

However, it might be reasonable to assert that the decisionmaker's income is also relevant to the choice. The difficulty is that the decisionmaker's income does not vary across the alternatives, and the discrete choice probabilities are homogeneous of degree zero in the parameters. That bit of jargon means that the probabilities depend only on differences between the alternatives, so any attributes that are the same for all outcomes for every individual drop out of the probability model.

Thus any individual characteristic, such as age or income, will cause this problem. The only way such variables can be brought into the conditional logit model is by interacting them with the choice specific constants. This is similar to analysis of covariance in regression.

The specification below includes interaction terms to capture the effect of income on the choice decision. LIMDEP adds these automatically when the Rh2= option is present. */
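The drop-out property is easy to confirm numerically. In the Python sketch below, all utilities and coefficients are invented for illustration. Adding the same income term to every alternative leaves the probabilities unchanged, while giving each alternative its own income coefficient (with Car normalized to zero, in the spirit of interacting income with the choice specific constants) does change them.

```python
import math

def softmax(utilities):
    """Logit probabilities; subtracting the max is for numerical stability."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical utilities for Air, Train, Bus, Car before income enters.
base_utils = [0.4, 0.1, -0.3, 0.0]
hinc = 45.0

baseline = softmax(base_utils)

# Case 1: one income coefficient shared by all alternatives.
gamma = 0.03
common = softmax([u + gamma * hinc for u in base_utils])

# Case 2: alternative-specific income coefficients, Car fixed at 0.
gammas = [0.02, -0.04, -0.01, 0.0]
interacted = softmax([u + g * hinc for u, g in zip(base_utils, gammas)])

print("baseline  ", [round(p, 3) for p in baseline])
print("common    ", [round(p, 3) for p in common])
print("interacted", [round(p, 3) for p in interacted])
```

The first two rows of output are identical, which is why a common income coefficient cannot be estimated.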

Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc $

/* Model interpretation is again not easy with conditional logit. We need to be especially cautious about interpreting the coefficients. We can also look at probabilities, partial derivatives, and elasticities. Below is a program that illustrates some of the various outputs from the DISCRETE CHOICE procedure.*/

Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc
; describe
; crosstab
; list
; effects: gc[*] $
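For intuition about what cost elasticities look like in this model, the following Python sketch computes them from the standard conditional logit formulas: the own elasticity of P_j with respect to the cost of j is beta*gc_j*(1 - P_j), and the cross elasticity with respect to the cost of another alternative k is -beta*gc_k*P_k. The constants, the cost coefficient and the costs are all hypothetical, and the layout of LIMDEP's own output may differ.

```python
import math

def probs(gc, asc, beta):
    """Conditional logit probabilities with constants asc and a common
    coefficient beta on generalized cost."""
    expu = [math.exp(a + beta * g) for a, g in zip(asc, gc)]
    total = sum(expu)
    return [e / total for e in expu]

def elasticity_matrix(gc, asc, beta):
    """E[j][k] = elasticity of P_j with respect to the cost of alternative k."""
    p = probs(gc, asc, beta)
    n = len(gc)
    return [[beta * gc[k] * ((1.0 if j == k else 0.0) - p[k])
             for k in range(n)]
            for j in range(n)]

# Hypothetical values for Air, Train, Bus, Car.
asc = [1.0, 0.8, 0.2, 0.0]
beta = -0.02
gc = [70.0, 71.0, 70.0, 50.0]

E = elasticity_matrix(gc, asc, beta)
names = ["Air", "Train", "Bus", "Car"]
for name, row in zip(names, E):
    print(f"{name:5s}", " ".join(f"{v:+.3f}" for v in row))
```

Within each column, every off-diagonal entry is the same: a change in one mode's cost draws proportionally from all other modes, which is the IIA property at work.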

/* We can also test for violations of the IIA assumption as follows. In practice, the test often fails because the difference matrix is not positive definite. This can be seen by the example below. */

Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc $
MATRIX ; Bu = B ; Vu = VARB $
Nlogit ; lhs=mode
; choices=air,train,bus,car
; rhs=one,gc
; rh2=hinc
; ias= air $
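The failure mode described above can be seen in a stripped-down version of the Hausman calculation. The Python sketch below assumes just two coefficients that are estimated in both the full and the restricted model, with made-up estimates and covariance matrices. The statistic is q'(Vr - Vu)^(-1)q, where q is the difference between the restricted and unrestricted estimates, and the function returns None when Vr - Vu is not positive definite. This illustrates the logic only and is not a reproduction of LIMDEP's internal computation.

```python
def hausman_2x2(b_r, b_u, v_r, v_u):
    """Hausman statistic for two coefficients, or None when the
    difference of the covariance matrices is not positive definite."""
    q = [b_r[0] - b_u[0], b_r[1] - b_u[1]]
    d = [[v_r[i][j] - v_u[i][j] for j in range(2)] for i in range(2)]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    # A symmetric 2x2 matrix is positive definite iff d11 > 0 and det > 0.
    if d[0][0] <= 0.0 or det <= 0.0:
        return None
    inv = [[d[1][1] / det, -d[0][1] / det],
           [-d[1][0] / det, d[0][0] / det]]
    return sum(q[i] * inv[i][j] * q[j] for i in range(2) for j in range(2))

# Hypothetical estimates and covariance matrices (not LIMDEP output).
b_u = [-0.015, 0.020]                      # full choice set
v_u = [[4.0e-6, 0.0], [0.0, 1.0e-4]]
b_r = [-0.020, 0.030]                      # one alternative removed
v_r = [[9.0e-6, 0.0], [0.0, 2.5e-4]]

h = hausman_2x2(b_r, b_u, v_r, v_u)
print("Hausman statistic:", h)

# If a restricted variance comes out smaller than the unrestricted one,
# the difference matrix is not positive definite and the test breaks down.
v_r_bad = [[3.0e-6, 0.0], [0.0, 2.5e-4]]
print("With a non-positive-definite difference:",
      hausman_2x2(b_r, b_u, v_r_bad, v_u))
```

When the statistic can be computed, it is compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients compared.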

delete ; * $