Binary Logit and Probit
The purpose of this week’s lesson is to illustrate some of LIMDEP’s procedures for logit and probit analysis. The data are a sample of 32 observations from a study by Spector and Mazzeo on pre- and post-tests for economic literacy; Greene uses the same data in his examples. The variables are:
GPA = grade point average
TUCE = score on the Test of Understanding in College Economics
PSI = program participation variable (school district)
GRADE = response: 1 if the post-test grade is higher than the pre-test grade, 0 otherwise.
/* First read the data. */
Reset $
Delete ; * $
Read; Nobs=32; Nvar=4;
Names=GPA,TUCE,PSI,GRADE $
2.66 20 0 0
2.89 22 0 0
3.28 24 0 0
2.92 12 0 0
4.00 21 0 1
2.86 17 0 0
2.76 17 0 0
2.87 21 0 0
3.03 25 0 0
3.92 29 0 1
2.63 20 0 0
3.32 23 0 0
3.57 23 0 0
3.26 25 0 1
3.53 26 0 0
2.74 19 0 0
2.75 25 0 0
2.83 19 0 0
3.12 23 1 0
3.16 25 1 1
2.06 22 1 0
3.62 28 1 1
2.89 14 1 0
3.51 26 1 0
3.54 24 1 1
2.83 27 1 1
3.39 17 1 1
2.67 24 1 0
3.65 21 1 1
4.00 23 1 1
3.10 21 1 0
2.39 19 1 1
Sample ; 1 - 32 $
/* Binary choice models: Since the interest is in the PSI variable (the
treatment effect of the program), we will test the hypothesis that its
coefficient is equal to 0.0. These commands illustrate several ways to
test hypotheses using LIMDEP. */
Probit ; Lhs = Grade ; Rhs =
One,GPA,TUCE,PSI $
/* The t-ratio reported for the coefficient on PSI provides a direct test.
We'll repeat the test a couple of different ways. First, we can extract
the t-statistic directly from the model results if we wish. */
Calc ; list ; T for PSI = b(4) /
Sqr(Varb(4,4)) $
/* We can also ask LIMDEP to carry out a Wald test. For a single
coefficient, the chi-squared statistic is the same as the square of the
t-ratio. This command uses the Last Model setup, which allows
specification of Wald tests using a predefined set of parameter names and
uses the results of the previous model for the coefficients and
covariance matrix. */
Wald ; Fn1 = b_psi - 0 $
/* LIMDEP will also carry out a test of a set of linear restrictions as
part of the estimation procedure. */
Probit ; Lhs = Grade ; Rhs =
One,GPA,TUCE,PSI ; wald: b(4) = 0 $
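The equivalence between the t-ratio and the single-coefficient Wald statistic can be checked outside LIMDEP. The following is a minimal Python sketch, not LIMDEP code. It assumes a plain Newton-Raphson probit fit started at zero, and the names `fit_probit`, `probit_pieces`, and `invert` are hypothetical helpers written for this illustration. The data rows are copied from the Read command above. Python's `b[3]` corresponds to LIMDEP's `b(4)`.

```python
import math

# (GPA, TUCE, PSI, GRADE) rows copied from the Read command above.
DATA = [
    (2.66, 20, 0, 0), (2.89, 22, 0, 0), (3.28, 24, 0, 0), (2.92, 12, 0, 0),
    (4.00, 21, 0, 1), (2.86, 17, 0, 0), (2.76, 17, 0, 0), (2.87, 21, 0, 0),
    (3.03, 25, 0, 0), (3.92, 29, 0, 1), (2.63, 20, 0, 0), (3.32, 23, 0, 0),
    (3.57, 23, 0, 0), (3.26, 25, 0, 1), (3.53, 26, 0, 0), (2.74, 19, 0, 0),
    (2.75, 25, 0, 0), (2.83, 19, 0, 0), (3.12, 23, 1, 0), (3.16, 25, 1, 1),
    (2.06, 22, 1, 0), (3.62, 28, 1, 1), (2.89, 14, 1, 0), (3.51, 26, 1, 0),
    (3.54, 24, 1, 1), (2.83, 27, 1, 1), (3.39, 17, 1, 1), (2.67, 24, 1, 0),
    (3.65, 21, 1, 1), (4.00, 23, 1, 1), (3.10, 21, 1, 0), (2.39, 19, 1, 1),
]

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def probit_pieces(b, X, y):
    """Return the log-likelihood, gradient, and negative Hessian at b."""
    k = len(b)
    logl, g = 0.0, [0.0] * k
    neg_h = [[0.0] * k for _ in range(k)]
    for x, yi in zip(X, y):
        q = 2 * yi - 1                      # +1 for GRADE=1, -1 for GRADE=0
        xb = sum(bj * xj for bj, xj in zip(b, x))
        cdf = norm_cdf(q * xb)
        lam = q * norm_pdf(xb) / cdf        # generalized residual
        w = lam * (lam + xb)                # weight in the negative Hessian
        logl += math.log(cdf)
        for i in range(k):
            g[i] += lam * x[i]
            for j in range(k):
                neg_h[i][j] += w * x[i] * x[j]
    return logl, g, neg_h

def fit_probit(X, y, max_iter=50, tol=1e-10):
    """Newton-Raphson probit MLE; returns (b, covariance matrix, log-likelihood)."""
    b = [0.0] * len(X[0])
    for _ in range(max_iter):
        _, g, neg_h = probit_pieces(b, X, y)
        V = invert(neg_h)
        step = [sum(vij * gj for vij, gj in zip(row, g)) for row in V]
        b = [bi + si for bi, si in zip(b, step)]
        if max(abs(s) for s in step) < tol:
            break
    logl, _, neg_h = probit_pieces(b, X, y)
    return b, invert(neg_h), logl

X = [(1.0, gpa, tuce, psi) for gpa, tuce, psi, _ in DATA]
y = [row[3] for row in DATA]
b, V, logl = fit_probit(X, y)
t_psi = b[3] / math.sqrt(V[3][3])           # t-ratio on PSI
wald_psi = (b[3] - 0.0) ** 2 / V[3][3]      # Wald chi-squared for b_psi = 0
print(t_psi, wald_psi, logl)
```

The covariance matrix here is the inverse of the negative Hessian. LIMDEP may use a different estimator by default, so standard errors can differ slightly even when the coefficients agree.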
/* For more involved problems, a likelihood ratio test or an LM test
might be preferable. To carry out the likelihood ratio test, we require
   chi-squared = 2 * (LogL with PSI - LogL without PSI)
We have the first one from the last estimate. To get the second, we just
leave PSI out of the model and retrieve LogL. If the chi-squared value
exceeds the critical value 3.84 (5% level, 1 degree of freedom), we reject
the hypothesis. Alternatively, if the probability to the right of our
statistic (the p-value) is less than, say, .05, we reject the hypothesis. */
Calc ; YESPSI = LogL $
Probit; Lhs = Grade ; Rhs =
One,GPA,TUCE $
Calc ; NOPSI = LogL
; List ; Chisq = 2*(YesPSI -
NoPSI)
; Cprob = 1 - Chi(chisq,1) $
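The tail probability computed by `1 - Chi(chisq,1)` can be reproduced by hand. The following is a minimal Python sketch, assuming one degree of freedom, where the chi-squared upper tail equals the two-sided standard normal tail. The function names and the log-likelihood values in the usage line are hypothetical and are not results from this data set.

```python
import math

def chi2_upper_tail_1df(stat):
    """P(chi-squared with 1 d.f. > stat) = P(|Z| > sqrt(stat))."""
    return math.erfc(math.sqrt(stat / 2.0))

def lr_test(logl_unrestricted, logl_restricted):
    """Likelihood ratio statistic and its p-value for one restriction."""
    stat = 2.0 * (logl_unrestricted - logl_restricted)
    return stat, chi2_upper_tail_1df(stat)

# Hypothetical log-likelihoods, for illustration only.
stat, p_value = lr_test(-10.0, -12.0)
print(stat, p_value)
```

Rejecting when the statistic exceeds 3.84 and rejecting when the p-value falls below .05 are the same rule, since the upper tail probability at 3.84 is .05.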
/* Lastly, to compute an LM statistic, we need the chi-squared statistic
   CLM = g0' V g0
where g0 is the gradient of the log-likelihood function for the probit
model evaluated at the maximum likelihood estimates with the hypothesis
imposed. These are the estimates from the last model estimated above,
which computes the MLE with the coefficient on PSI fixed at 0 (by dropping
PSI). V is the inverse of the negative Hessian computed at this estimate.
LIMDEP automates this; you just provide the restricted estimates as
starting values and specify zero iterations. */
Probit; Lhs = Grade ; Rhs =
One,GPA,TUCE,PSI
; Start = B,0 ; Maxit = 0 $
Calc ; List ; LMSTAT $
/* Now let's look at the available options on the Logit/Probit procedures:
; List   - gives fitted values.
; Keep=  - creates a variable containing the predictions.
; Res=   - creates a variable containing the residuals.
; Alg=   - specifies the estimation algorithm.
; Tlf= ; Tlb= ; Tlg= - override the convergence criteria.
; Maxit= - gives the maximum number of iterations. If 0, then a Lagrange
           multiplier statistic is given.
; Var=   - list of variances to extract from the variance matrix of b.
; Start= - gives a vector of starting values. This is useful for complex
           optimizations.
; Margin - computes marginal effects (partial derivatives) for each
           variable at the means.
; Margin = variable - computes marginal effects (partial derivatives)
           with all variables at their means except a stratification
           variable.
; Het    - estimates a heteroskedastic version of probit when ....
; Rh2=   - gives a vector of variables producing the heteroskedasticity.
; Rst=   - imposes restrictions on the listed coefficients to values given
           in the vector of starting values. */
/* Below we estimate the homoskedastic model for both logit and probit and
list all output, including marginal effects for each variable. */
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI
; List ; Margin=PSI $
Logit ; Lhs=Grade ;
Rhs=One,GPA,TUCE,PSI
; List ; Margin=PSI $
/* Now let's do the heteroskedastic probit with heteroskedasticity a
function of logged GPA and compute a Wald statistic. */
Create ; LOGGPA=log(GPA) $
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het ; Wald: b(5)=0
$
/* The t statistic on LOGGPA is the square root of the Wald statistic for
no heteroskedasticity. In more complex specifications we could use the
Wald option, as was done above, for joint tests of multiple coefficients
in the variance vector. We could also compute a Lagrange multiplier test
for no heteroskedasticity, as is done below. */
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI $
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het ; Start=B,0 ;
Maxit=0 $
Calc ; List ; LMSTAT $
/* Finally, we could do a likelihood
ratio test for no heteroskedasticity. */
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI $
Calc ; LNL_Homo=LogL $
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI
; Rh2=LOGGPA ; Het $
Calc ; LNL_Het=LogL $
Calc ; List ; LRHet=2*(LNL_Het -
LNL_Homo) $
/* Suppose for some reason you want to do restricted estimation. Let's
impose the restriction that the coefficient on GPA equals 3. */
Probit ; Lhs=grade ;
Rhs=One,GPA,TUCE,PSI
; Start=-7,3,.05,1.43 ;
Rst=b1,(),b3,b4 $
/* Here the restricted value is contained in the coefficient start values
and the coefficient to be restricted is marked by (). */
/* The following program illustrates some of the matrix and calculation
procedures of LIMDEP in the context of comparing dichotomous logit and
probit. In particular, we show how to obtain the marginal effects,
probabilities, and predicted values the hard way. Also, the
heteroskedastic probit model is estimated using the Maximize procedure for
illustrative purposes. */
/* Are the probit and logit models really different? The coefficients are,
but are the slopes and the predicted probabilities? Let's find out below. */
? 1. List of regressors
Namelist ; X = One,GPA,TUCE,PSI $
? 2. Vector of means
Matrix ; Xbar = Mean(X) $
? 3. Probit model. Keep predictions of Grade (0s and 1s)
Probit ; Lhs = Grade ; Rhs = X ;
keep = GFP ; list $
? 4. Derivatives are a scale times the coefficients
Calc ; Scale = N01(b'Xbar) $
? 5. Predicted probit probabilities
Create ; PP = Phi(b'x) $
? 6. Probit coefficients and derivatives of conditional mean
Matrix ; BP = B ; DP = scale * B $
? Now, repeat for the logit model
Logit ; Lhs = Grade ; Rhs = X ;
keep = GFL $
? 1. Derivatives for logit are also a scale times coefficients
Calc ; Scale = lgd(b'Xbar) $
? 2. Predicted logit probabilities
Create ; PL = lgp(b'x) $
? Logit coefficients and derivatives of the conditional mean
Matrix ; BL = B ; DL = scale * B $
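The "scale times coefficients" rule used for both models can be written out explicitly. The following is a minimal Python sketch, not LIMDEP code. The scale is the standard normal density for probit and p(1 - p) for logit, each evaluated at the index b'Xbar. The function names and the coefficient and mean values in the usage lines are hypothetical.

```python
import math

def probit_scale(index):
    """Standard normal density at the index."""
    return math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)

def logit_scale(index):
    """Logistic density at the index: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-index))
    return p * (1.0 - p)

def marginal_effects(coefs, means, scale_fn):
    """Scale every coefficient by the density evaluated at b'xbar."""
    index = sum(b * x for b, x in zip(coefs, means))
    scale = scale_fn(index)
    return [scale * b for b in coefs]

# Hypothetical coefficients and means chosen so that the index is zero.
print(marginal_effects([1.0, 2.0], [1.0, -0.5], logit_scale))
print(probit_scale(0.0) / logit_scale(0.0))
```

At an index of zero the probit scale is about 0.399 and the logit scale is 0.25, a ratio of roughly 1.6. This is why logit coefficients tend to be larger than probit coefficients while the slopes come out similar. For a dummy variable such as PSI, the derivative is only an approximation to the change in probability.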
? Compare the two models. First, coefficients, side by side
Matrix ; List ; [BP,BL] ; Pause $
? Now, derivatives, side by side
Matrix ; List ; [DP,DL] ; Pause $
? Predictions and fitted probabilities
List ; PP,PL,GFP,GFL $
/* Now let's do the heteroskedastic probit using the Maximize procedure.
In the example using the "canned" procedure, we used logged GPA as the
possible source of heteroskedasticity. This time let's see if there are
group effects causing heteroskedasticity by using PSI. We'll obtain
starting values by estimating the "canned" homoskedastic procedure first.
Note that we could also have gotten these from the "hard" procedure above. */
Create ; q=2*Grade-1 $
Probit ; Lhs=Grade ;
Rhs=One,GPA,Tuce,PSI $
Maximize ;
Labels=b0,b1,b2,b3,gamma
; Start=B,1
; Fcn=
Xb=b0+b1*GPA+b2*Tuce+b3*PSI
|Sig=exp(gamma*PSI)
|Xbh=Xb/Sig
|log(phi(q*(Xbh)))
$
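The function passed to Maximize is the log-likelihood contribution of a single observation. The following is a minimal Python sketch of the same expression summed over a sample, with the parameters ordered as in the Labels list. The function name `het_probit_loglik` is hypothetical, and the four rows are a small subset of the data above, used only to exercise the function.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def het_probit_loglik(params, rows):
    """Sum of log Phi(q * xb / sig), with sig = exp(gamma * PSI)."""
    b0, b1, b2, b3, gamma = params
    total = 0.0
    for gpa, tuce, psi, grade in rows:
        q = 2 * grade - 1                   # same transformation as Create ; q=2*Grade-1
        xb = b0 + b1 * gpa + b2 * tuce + b3 * psi
        sig = math.exp(gamma * psi)         # equals 1 whenever PSI = 0
        total += math.log(norm_cdf(q * xb / sig))
    return total

# Four rows (GPA, TUCE, PSI, GRADE) taken from the data above.
ROWS = [(2.66, 20, 0, 0), (4.00, 21, 0, 1), (3.12, 23, 1, 0), (3.16, 25, 1, 1)]
print(het_probit_loglik((0.0, 0.0, 0.0, 0.0, 0.0), ROWS))
```

With gamma equal to zero, sig is 1 for every observation and the function reduces to the ordinary probit log-likelihood. That is the restriction the tests for no heteroskedasticity examine.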
/* The t statistic on gamma in the preceding is the square root of a Wald
statistic for the null of no heteroskedasticity. We could also do Lagrange
multiplier and likelihood ratio tests. I leave it to you to replicate the
analysis on p. 830 of Greene. */
Delete ; * $