Dichotomous Logit and Probit
The purpose of this session is to show you how to use LIMDEP's "canned"
procedures for doing dichotomous Logit and Probit analysis. This includes
obtaining predicted probabilities, predictions of the dependent variable,
coefficients and marginal effects for the variables, model diagnostics,
hypothesis tests, and the heteroskedastic Probit model. In addition, we
provide programs that obtain each of these outputs the "hard" way for
illustrative purposes only.
/* This program illustrates some of the "canned" procedures in LIMDEP for
doing Logit and Probit. The data are a sample of 32 observations from a study
by Spector and Mazzeo on pre- and post-tests for economic literacy. The
variables are:
GPA = grade point average
TUCE = score on the Test of Understanding in College Economics
PSI = program participation variable (school district)
GRADE = response: 1 if grade on post-test is higher than on pre-test,
0 otherwise. */
/* First read the data. Notice that this time we are putting the data
directly into the file. */
Reset $
Read; Nobs=32; Nvar=4;
Names=GPA,TUCE,PSI,GRADE $
2.66 20 0 0
2.89 22 0 0
3.28 24 0 0
2.92 12 0 0
4.00 21 0 1
2.86 17 0 0
2.76 17 0 0
2.87 21 0 0
3.03 25 0 0
3.92 29 0 1
2.63 20 0 0
3.32 23 0 0
3.57 23 0 0
3.26 25 0 1
3.53 26 0 0
2.74 19 0 0
2.75 25 0 0
2.83 19 0 0
3.12 23 1 0
3.16 25 1 1
2.06 22 1 0
3.62 28 1 1
2.89 14 1 0
3.51 26 1 0
3.54 24 1 1
2.83 27 1 1
3.39 17 1 1
2.67 24 1 0
3.65 21 1 1
4.00 23 1 1
3.10 21 1 0
2.39 19 1 1
Sample ; 1 - 32 $
/* Binary choice models: Since the interest is in the PSI variable (this is
the treatment effect of the program), we will test the hypothesis that its
coefficient is equal to 0.0. These commands illustrate several ways to
test hypotheses using LIMDEP. */
Probit ; Lhs = Grade ; Rhs = One,GPA,TUCE,PSI $
/* The t-ratio on the coefficient reported for PSI provides a
direct test. We'll repeat the test a couple of different ways. First, we can
extract the t-statistic directly from the model results if we wish. */
Calc ; List ; TPSI = b(4) / Sqr(Varb(4,4)) $
/* We can also ask LIMDEP to carry out a Wald test. For a single coefficient
the chi-squared is going to be the same as the square of the t-ratio. This
command uses the Last Model setup, which allows specification of Wald tests
using a predefined set of parameter names and uses the results of the
previous model for the coefficients and covariance matrix. */
Wald
; Fn1 = b_psi - 0 $
/* LIMDEP will also carry out a test of a set of linear restrictions as part
of the estimation procedure. */
Probit ; Lhs = Grade ; Rhs = One,GPA,TUCE,PSI ; wald: b(4) = 0 $
/* For more involved problems, a likelihood ratio test or an LM test might be
preferable. To carry out the likelihood ratio test, we require
chi-squared = 2 * (LOGL_with PSI - LOGL_without PSI)
We have the first one from the last estimate. To get the second, we just
leave PSI out of the model and retrieve LogL. If the chi-squared value
exceeds the critical value 3.84 (one degree of freedom, 5% level), we reject
the hypothesis. Alternatively, if the probability to the right of our
statistic is less than, say, .05, we reject the hypothesis. */
Calc ; YESPSI = LogL $
Probit ; Lhs = Grade ; Rhs = One,GPA,TUCE $
Calc ; NOPSI = LogL $
Calc ; List ; Chisq = 2*(YesPSI - NoPSI)
; Cprob = 1 - Chi(chisq,1) $
/* Lastly, to compute an LM statistic, we need the chi-squared statistic
CLM = g0' V g0
where g0 is the gradient of the log-likelihood function for the probit model
evaluated at the maximum likelihood estimates with the hypothesis imposed.
These are the estimates from the last model above, which computes the MLE
with the coefficient on PSI fixed at 0 (by dropping PSI). V is the inverse
Hessian computed at this estimate. LIMDEP automates this; you just provide
the restricted estimates and specify zero iterations. */
Probit; Lhs = Grade
; Rhs = One,GPA,TUCE,PSI
; Start = B,0 ; Maxit = 0 $
Calc ; List ; LMSTAT $
/* Now let's look at the available options on Logit/Probit procedures
; List - gives fitted values
; Keep= - creates a variable containing the predictions
; Res= - creates a variable containing the residuals
; Alg= - specifies the estimation algorithm.
; Tlf= ; Tlb= ; Tlg= - overrides the convergence criterion
; Maxit= - gives the maximum number of iterations. If 0, then a Lagrange
Multiplier statistic is given.
; Var= - list of variances to extract from the variance matrix of b.
; Start= - gives a vector of starting values. This is useful for complex
optimizations.
; Margin - computes marginal effects (partial derivatives) for each variable,
evaluated at the means.
; Margin = variable - computes marginal effects (partial derivatives) with all
variables at their means except a stratification variable.
; Het - estimates a heteroskedastic version of probit when ....
; Rh2 - a vector of variables producing the heteroskedasticity is given.
; Rst= - imposes restrictions on the listed coefficients to values given in
the vector of starting values. */
/* Below we estimate the homoskedastic model for both logit and probit and
request all output, including marginal effects for each variable. */
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI
; List ; Margin=PSI $
Logit ; Lhs=Grade ; Rhs=One,GPA,TUCE,PSI
; List ; Margin=PSI $
/* Now let's do the heteroskedastic probit with heteroskedasticity a function
of logged GPA and compute a Wald statistic. */
Create ; LOGGPA=log(GPA) $
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het ; Wald: b(5)=0 $
/* The t statistic on LOGGPA is the square root of the Wald statistic for no
heteroskedasticity. In more complex specifications we could use the Wald
option for joint tests of multiple coefficients in the variance vector, as
was done above. We could also compute a Lagrange Multiplier test for no
heteroskedasticity, as is done below. */
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI $
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het ; Start=B,0 ; Maxit=0 $
Calc ; List ; LMSTAT $
/* Finally, we could do a likelihood ratio test for no heteroskedasticity*/
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI $
Calc ; LNL_Homo=LogL $
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het $
Calc ; LNL_Het=LogL $
Calc ; List ; LRHet=2*(LNL_Het - LNL_Homo) $
/* Suppose for some reason you want to do restricted estimation. Let's impose
the restriction that the coefficient on GPA equals 3. */
Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI
; Start=-7,3,.05,1.43 ; Rst=b1,(),b3,b4 $
/* Here the restriction is contained in the coefficient start values, and the
coefficient to be restricted is marked by (). */
/* Delete all variables, and get ready for another Program */
Delete
; * $
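The likelihood ratio arithmetic in the program above (Chisq = 2*(YesPSI - NoPSI) and Cprob = 1 - Chi(chisq,1)) can be checked outside LIMDEP. The following is a minimal Python sketch, not LIMDEP code; the function names are my own, and the log-likelihood values in the usage lines are made-up placeholders, not results from the Spector and Mazzeo data. It relies on the fact that a chi-squared variable with one degree of freedom is the square of a standard normal, so its upper-tail probability can be written with the complementary error function.

```python
import math

def chi2_1df_upper_tail(stat):
    """P(X > stat) for X ~ chi-squared with 1 degree of freedom.

    With 1 df, X = Z**2 for a standard normal Z, so
    P(X > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    return math.erfc(math.sqrt(stat / 2.0))

def lr_test(logl_unrestricted, logl_restricted):
    """Likelihood ratio statistic and its p-value for one restriction."""
    chisq = 2.0 * (logl_unrestricted - logl_restricted)
    return chisq, chi2_1df_upper_tail(chisq)

def wald_from_t(t_ratio):
    """For a single coefficient the Wald chi-squared is the squared t-ratio."""
    return t_ratio ** 2

if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    chisq, pvalue = lr_test(logl_unrestricted=-12.8, logl_restricted=-15.5)
    print("LR chi-squared:", chisq)
    print("p-value:", pvalue)
    print("reject at 5%:", chisq > 3.84)
```

Comparing the statistic with 3.84 and comparing the upper-tail probability with .05 are the same decision rule stated two ways.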
The first program above makes use of LIMDEP's built-in procedures. However,
the curious among you may not be satisfied with this, given your earlier
experience with the Maximize command. Here is a program that illustrates how
to do Logit/Probit using your own log-likelihood function. I also show how to
supply analytical derivatives to the Maximize procedure.
/* This file shows how to estimate Logit and Probit models using the Maximize
procedure. There is no need, practically, to do this, because LIMDEP has a
very nice set of "canned" procedures for Logit/Probit. However, it is still a
useful exercise for learning. Also, the last procedure shows how to supply
analytical derivatives to the Maximize procedure. */
Reset $
/* First, let's create some simulated data to use in this demonstration. x is
drawn from a normal distribution with mean zero and standard deviation one.
From x we create the binary outcome y and the sign variable q = 2y - 1 used
in the log-likelihood function. */
Sample ; 1-200 $
Create ; x=rnn(0,1) ; y=x+rnn(0,1)
; y=y>0 ; q=2*y-1 $
/* Now, let's compute some descriptive statistics on the simulated data. Note
the normality plots in particular in relation to the dependent variable */
Dstats ; Rhs=x,y ; Quantiles ; Plot $
/* Now let's use the Maximize procedure to maximize the log-likelihood with
respect to the parameters. See Greene, p. 882, note 6 for the
parameterization below of the log-likelihood functions. */
/* Let's do Logit First */
Maximize
; Labels=b0,b1
; Start=0,1
; Fcn=log(lgp(q*(b0+b1*x))) $
/* Now do Probit */
Maximize
; Labels=b0,b1
; Start=0,1
; Fcn=log(phi(q*(b0+b1*x))) $
/* Now let's use the Maximize procedure with some subfunctions, supplying
analytical derivatives to the procedure. This can speed up estimation when
you have a large data set. Note that the | indicates a subfunction. The |_
indicates analytical first partial derivatives. We'll use probit for this
example. */
Maximize
; Labels=b0,b1
; Start=0,1
; Fcn=
bx=q*(b0+b1*x)
|d=q*n01(bx)/phi(bx)
|_b0=d
|_b1=d*x
|log(phi(bx)) $
/* Delete the variables and prepare for another run */
Delete ; * $
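To see what the Maximize function definitions above compute, here is a minimal Python sketch (not LIMDEP code) of the probit log-likelihood written with q = 2y - 1, together with the analytical first derivatives used in the last Maximize command (d = q*n01(bx)/phi(bx), with derivatives d and d*x). The function names and the tiny data set in the usage lines are my own illustrative assumptions.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function (LIMDEP's Phi)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_pdf(z):
    """Standard normal density (LIMDEP's N01)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_loglik_and_grad(b0, b1, xs, ys):
    """Return (log-likelihood, d/db0, d/db1) for a one-regressor probit."""
    loglik, grad0, grad1 = 0.0, 0.0, 0.0
    for x, y in zip(xs, ys):
        q = 2 * y - 1                 # +1 when y = 1, -1 when y = 0
        bx = q * (b0 + b1 * x)        # signed index, as in the Fcn= definition
        loglik += math.log(normal_cdf(bx))
        d = q * normal_pdf(bx) / normal_cdf(bx)
        grad0 += d                    # derivative with respect to b0
        grad1 += d * x                # derivative with respect to b1
    return loglik, grad0, grad1

if __name__ == "__main__":
    # Made-up observations, for illustration only.
    xs = [-1.0, 0.5, 2.0, -0.3]
    ys = [0, 1, 1, 0]
    print(probit_loglik_and_grad(0.0, 1.0, xs, ys))
```

Supplying the gradient spares the optimizer from approximating it numerically at every iteration, which is where the speed gain on large samples would come from.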
Some of you may also be curious about how LIMDEP's "canned" procedure obtains
probabilities, predictions, and marginal effects. You may also want to see
more details on the heteroskedastic probit model. Here is a program that does
both.
/* This program illustrates some of the matrix and calculation procedures of
LIMDEP in the context of comparing dichotomous Logit and Probit. In
particular, we show how to obtain the marginal effects, probabilities, and
predicted values the hard way. Also, the heteroskedastic probit model is
estimated using the Maximize procedure for illustrative purposes. */
/* Let's use the Spector and Mazzeo data again for this example. */
Reset $
Read; Nobs=32 ; Nvar=4 ; Names=GPA,TUCE,PSI,GRADE $
2.66 20 0 0
2.89 22 0 0
3.28 24 0 0
2.92 12 0 0
4.00 21 0 1
2.86 17 0 0
2.76 17 0 0
2.87 21 0 0
3.03 25 0 0
3.92 29 0 1
2.63 20 0 0
3.32 23 0 0
3.57 23 0 0
3.26 25 0 1
3.53 26 0 0
2.74 19 0 0
2.75 25 0 0
2.83 19 0 0
3.12 23 1 0
3.16 25 1 1
2.06 22 1 0
3.62 28 1 1
2.89 14 1 0
3.51 26 1 0
3.54 24 1 1
2.83 27 1 1
3.39 17 1 1
2.67 24 1 0
3.65 21 1 1
4.00 23 1 1
3.10 21 1 0
2.39 19 1 1
Sample ; 1 - 32 $
/* Are the probit and logit models really different? The coefficients are,
but are the slopes and the predicted probabilities? Let's find out below. */
? 1. List of regressors
Namelist ; X = One,GPA,TUCE,PSI $
? 2. Vector of means
Matrix ; Xbar = Mean(X) $
? 3. Probit model. Keep predictions of Grade (0s and 1s)
Probit ; Lhs = Grade ; Rhs = X ; keep = GFP ; list $
? 4. Derivatives are a scale times the coefficients
Calc ; Scale = N01(b'Xbar) $
? 5. Predicted probit probabilities
Create ; PP = Phi(b'x) $
? 6. Probit coefficients and derivatives of conditional mean
Matrix ; BP = B
; DP = scale * B $
? Now, repeat for the logit model
Logit ; Lhs = Grade ; Rhs = X ; keep = GFL $
? 1. Derivatives for logit are also a scale times coefficients
Calc ; Scale = lgd(b'Xbar) $
? 2. Predicted logit probabilities
Create ; PL = lgp(b'x) $
? Logit coefficients and derivatives of the conditional mean
Matrix ; BL = B
; DL = scale * B $
? Compare the two models. First, coefficients, side by side
Matrix ; List ; [BP,BL] ; Pause $
? Now, derivatives, side by side
Matrix ; List ; [DP,DL] ; Pause $
? Predictions and fitted probabilities
List ; PP,PL,GFP,GFL $
/* Now let's do the heteroskedastic probit using the Maximize procedure. In
the "canned" example, we used logged GPA as the possible source of
heteroskedasticity. This time let's see whether group effects cause
heteroskedasticity by using PSI. We'll obtain starting values by first
estimating the homoskedastic model with the "canned" procedure. Note that we
could also have gotten these from the "hard" procedure above. */
Create ; q=2*Grade-1 $
Probit ; Lhs=Grade ; Rhs=One,GPA,Tuce,PSI $
Maximize ; Labels=b0,b1,b2,b3,gamma
; Start=B,1
; Fcn=
Xb=b0+b1*GPA+b2*Tuce+b3*PSI
|Sig=exp(gamma*PSI)
|Xbh=Xb/Sig
|log(phi(q*(Xbh)))
$
/* The t statistic on gamma in the preceding is the square root of a Wald
test for the null of no heteroskedasticity. We could also do Lagrange
Multiplier and Likelihood Ratio tests. I leave it to you to replicate the
analysis on p. 891 of Greene. (By the way, note the math error in Greene's
Wald test on p. 891.) */
Delete
; * $
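The "hard way" marginal effects in the program above multiply the coefficient vector by a scale factor evaluated at the variable means: the normal density N01(b'Xbar) for probit and the logistic density lgd(b'Xbar) for logit. A minimal Python sketch of that calculation follows (not LIMDEP code); the function names and the coefficient and mean values in the usage lines are hypothetical, chosen only to show the mechanics.

```python
import math

def normal_pdf(z):
    """Standard normal density, the probit scale factor."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logistic_pdf(z):
    """Logistic density p*(1-p), the logit scale factor."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)

def marginal_effects_at_means(b, xbar, model):
    """Scale the coefficients by the density at the index b'xbar.

    b and xbar are equal-length lists; the first element is usually the
    constant (mean 1), whose scaled value is not a meaningful slope.
    """
    index = sum(bk * xk for bk, xk in zip(b, xbar))
    if model == "probit":
        scale = normal_pdf(index)
    elif model == "logit":
        scale = logistic_pdf(index)
    else:
        raise ValueError("model must be 'probit' or 'logit'")
    return [scale * bk for bk in b]

if __name__ == "__main__":
    # Hypothetical coefficients and means, for illustration only.
    b = [-1.0, 0.5, 0.2]
    xbar = [1.0, 1.5, 0.4]
    print(marginal_effects_at_means(b, xbar, "probit"))
    print(marginal_effects_at_means(b, xbar, "logit"))
```

Because the two scale factors differ (at an index of zero they are about 0.399 and 0.25), probit and logit coefficients can look quite different while the implied slopes remain close. For a dummy variable such as PSI, the derivative is only an approximation to the change in probability from switching the variable between 0 and 1.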