Binary Logit and Probit

 

This week's lesson illustrates some of LIMDEP's procedures for logit and probit analysis. The data are a sample of 32 observations from a study by Spector and Mazzeo on pre- and post-tests for economic literacy; Greene also uses them in his examples. The variables are:

 

GPA = grade point average

TUCE = score on the Test of Understanding in College Economics

PSI = program participation variable (Personalized System of Instruction)

GRADE = response: 1 if the grade on the post-test is higher than on the pre-test,

0 otherwise.

 

/* First read the data. */

Reset $

Delete ; * $

Read; Nobs=32; Nvar=4; Names=GPA,TUCE,PSI,GRADE $

2.66 20 0 0

2.89 22 0 0

3.28 24 0 0

2.92 12 0 0

4.00 21 0 1

2.86 17 0 0

2.76 17 0 0

2.87 21 0 0

3.03 25 0 0

3.92 29 0 1

2.63 20 0 0

3.32 23 0 0

3.57 23 0 0

3.26 25 0 1

3.53 26 0 0

2.74 19 0 0

2.75 25 0 0

2.83 19 0 0

3.12 23 1 0

3.16 25 1 1

2.06 22 1 0

3.62 28 1 1

2.89 14 1 0

3.51 26 1 0

3.54 24 1 1

2.83 27 1 1

3.39 17 1 1

2.67 24 1 0

3.65 21 1 1

4.00 23 1 1

3.10 21 1 0

2.39 19 1 1

Sample ; 1 - 32 $

 

/* Binary choice models: Since the interest is in the PSI variable - this is

the treatment effect of the program - we will test the hypothesis

that the coefficient is equal to 0.0. These commands will illustrate

several ways to test hypotheses using LIMDEP. */

 

Probit ; Lhs = Grade ; Rhs = One,GPA,TUCE,PSI $

 

/* The t-ratio on the coefficient reported for PSI provides a direct test.

We'll repeat the test a couple of different ways. First, we can extract the

t-statistic directly from the model results if we wish. */

 

Calc ; list ; T for PSI = b(4) / Sqr(Varb(4,4)) $

 

/* We can also ask LIMDEP to carry out a Wald test. For a single coefficient

the chi-squared is going to be the same as the square of the t-ratio.

This command uses the Last Model setup, which allows specification of

Wald tests using a predefined set of parameter names and uses the results

of the previous model for the coefficients and covariance matrix */

 

Wald ; Fn1 = b_psi - 0 $
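
As a rough cross-check on the claim that the single-coefficient Wald chi-squared equals the square of the t-ratio, here is a minimal Python sketch. Python is not part of the LIMDEP session, the function name wald_single is made up for this illustration, and the coefficient and standard error are hypothetical values, not estimates from the data above.

```python
import math

def wald_single(b, se, null_value=0.0):
    """Return (t, wald, p) for H0: coefficient == null_value."""
    t = (b - null_value) / se
    wald = t * t  # chi-squared with 1 degree of freedom under H0
    # Upper-tail chi-squared(1) probability; identical to the
    # two-sided standard normal p-value for |t|.
    p = math.erfc(math.sqrt(wald / 2.0))
    return t, wald, p

# Hypothetical inputs, NOT results from the probit fitted above.
t, wald, p = wald_single(b=1.2, se=0.5)
print(f"t = {t:.3f}, Wald = {wald:.3f}, p-value = {p:.4f}")
```

Because the chi-squared(1) tail probability is the same as the two-sided normal probability, the t-test and the Wald test always lead to the same decision for a single restriction.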

 

/* LIMDEP will also carry out a test of a set of linear restrictions as

part of the estimation procedure */

 

Probit ; Lhs = Grade ; Rhs = One,GPA,TUCE,PSI ; wald: b(4) = 0 $

 

/* For more involved problems, a likelihood ratio test or an LM test might

be preferable. To carry out the likelihood ratio test, we require

 

chi-squared = 2 * (LOGL_with_PSI - LOGL_without_PSI)

 

We have the first one from the last estimate. To get the second, we just

leave PSI out of the model and retrieve logl. If the chi-squared value

exceeds 3.84, the 5% critical value for chi-squared with one degree of freedom, we reject the hypothesis. Alternatively,

if the probability to the right of our statistic is less than, say, .05,

we reject the hypothesis. */

 

Calc ; YESPSI = LogL $

Probit; Lhs = Grade ; Rhs = One,GPA,TUCE $

Calc ; NOPSI = LogL

; List ; Chisq = 2*(YesPSI - NoPSI)

; Cprob = 1 - Chi(chisq,1) $
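
The arithmetic behind the Calc command above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name lr_test_1df is invented here, the log-likelihood values are hypothetical rather than taken from the models above, and the p-value formula is specific to one degree of freedom.

```python
import math

def lr_test_1df(logl_unrestricted, logl_restricted):
    """Likelihood ratio test of a single restriction.

    Returns the chi-squared statistic and its upper-tail probability,
    which plays the role of Cprob = 1 - Chi(chisq,1).
    """
    chisq = 2.0 * (logl_unrestricted - logl_restricted)
    # P(chi-squared(1) > chisq); max() guards against tiny negative rounding.
    p_upper = math.erfc(math.sqrt(max(chisq, 0.0) / 2.0))
    return chisq, p_upper

# Hypothetical log-likelihoods, NOT the values LIMDEP reports for these data.
chisq, cprob = lr_test_1df(logl_unrestricted=-12.0, logl_restricted=-15.0)
print(f"Chisq = {chisq:.3f}, Cprob = {cprob:.4f}")
print("reject H0 at 5%" if cprob < 0.05 else "do not reject H0 at 5%")
```

Comparing the statistic with 3.84 and comparing the upper-tail probability with .05 are the same decision rule.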

 

/* Lastly, to compute an LM statistic, we need the chi-squared statistic

 

CLM = g0' V g0

 

where g0 is the gradient of the log-likelihood function for the probit model

evaluated at the maximum likelihood estimates with the hypothesis assumed.

This is the last model estimated above, which computes the MLE with the

coefficient on PSI fixed at 0 (by dropping PSI). V is the negative inverse Hessian

computed at this estimate. LIMDEP automates this; you just provide the

restricted estimates and specify zero iterations. */

 

Probit; Lhs = Grade ; Rhs = One,GPA,TUCE,PSI

; Start = B,0 ; Maxit = 0 $

Calc ; List ; LMSTAT $
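
To make the quadratic form g0' V g0 concrete, here is a minimal Python sketch that builds the probit score and the negative Hessian at a restricted estimate and then forms the statistic. It is an illustration, not a description of LIMDEP's internals: the function names are invented, the four observations are toy data rather than the Spector-Mazzeo sample, and the restricted estimate (constant 0, slope fixed at 0) is the constant-only MLE for these toy data because half the responses are 1.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def solve(A, rhs):
    """Solve A s = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (M[r][n] - sum(M[r][k] * s[k] for k in range(r + 1, n))) / M[r][r]
    return s

def probit_lm(X, y, b0):
    """LM = g0' V g0, with g0 the score and V = (-Hessian)^(-1), both at b0."""
    k = len(b0)
    g = [0.0] * k
    neg_h = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        q = 2 * yi - 1
        xb = sum(bj * xj for bj, xj in zip(b0, xi))
        lam = q * norm_pdf(q * xb) / norm_cdf(q * xb)  # generalized residual
        w = lam * (lam + xb)                           # weight in -Hessian
        for i in range(k):
            g[i] += lam * xi[i]
            for j in range(k):
                neg_h[i][j] += w * xi[i] * xi[j]
    s = solve(neg_h, g)  # s = V g0 without forming the inverse explicitly
    return sum(gi * si for gi, si in zip(g, s))

# Toy data (hypothetical): a constant and one regressor.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 1, 0, 1]
b_restricted = [0.0, 0.0]  # constant-only MLE with the slope fixed at 0
print(f"LM = {probit_lm(X, y, b_restricted):.4f}")
```

Only the restricted estimates are needed, which is why the LIMDEP command supplies them as starting values and requests zero iterations.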

 

/* Now let's look at the available options on Logit/Probit procedures

; List - gives fitted values

; Keep= -creates a variable containing the predictions

; Res= -creates a variable containing the residuals

; Alg= -specifies the estimation algorithm.

; Tlf= ; Tlb= ; Tlg= - overrides the convergence criterion

; Maxit= -gives the maximum number of iterations. If 0, then a Lagrange Multiplier

statistic is given.

; Var= -list of variances to extract from the variance matrix of b.

; Start= -gives a vector of starting values. This is useful for complex optimizations.

; Margin - computes marginal effects (partial derivatives) for each variable at the means.

; Margin = variable - computes marginal effects (partial derivatives) with all variables

at their means except a stratification variable.

; Het - estimates a heteroskedastic version of probit when ....

; Rh2= -lists the variables producing the heteroskedasticity.

; Rst= -restricts the listed coefficients to the values given in the

vector of starting values. */

 

 

/* Below we estimate the homoskedastic model by both logit and probit and request all output,

including marginal effects for each variable. */

 

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI

; List ; Margin=PSI $

 

Logit ; Lhs=Grade ; Rhs=One,GPA,TUCE,PSI

; List ; Margin=PSI $

 

/* Now let's do the heteroskedastic probit with heteroskedasticity a function

of logged GPA and compute a Wald statistic. */

 

Create ; LOGGPA=log(GPA) $

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI

; Rh2=LOGGPA ; Het ; Wald: b(5)=0 $

 

/* The t statistic on LOGGPA is the square root of the Wald statistic for no

heteroskedasticity. In more complex specifications we could use the Wald

option for joint tests of multiple coefficients in the variance vector as was

done above. We could also compute a Lagrange Multiplier test for no

heteroskedasticity as is done below. */

 

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI $

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI

; Rh2=LOGGPA ; Het ; Start=B,0 ; Maxit=0 $

Calc ; List ; LMSTAT $

 

/* Finally, we could do a likelihood ratio test for no heteroskedasticity. */

 

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI $

Calc ; LNL_Homo=LogL $

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI

; Rh2=LOGGPA ; Het $

Calc ; LNL_Het=LogL $

Calc ; List ; LRHet=2*(LNL_Het - LNL_Homo) $

 

/* Suppose for some reason you want to do restricted estimation. Let's impose

the restriction that the coefficient on GPA equals 3. */

 

Probit ; Lhs=grade ; Rhs=One,GPA,TUCE,PSI

; Start=-7,3,.05,1.43 ; Rst=b1,(),b3,b4 $

 

/* Here the restriction is contained in the coefficient start values and the

coefficient to be restricted is marked by (). */

 

/* The following program illustrates some of the matrix and calculation procedures

of LIMDEP in the context of comparing dichotomous Logit and Probit. In particular,

we show how to obtain the marginal effects, probabilities, and predicted values the

hard way. Also, the heteroskedastic probit model is estimated using the Maximize

procedure for illustrative purposes. */

 

/* Are the probit and logit models really different? The coefficients are, but are the

slopes and the predicted probabilities? Let's find out below. */

 

? 1. List of regressors

Namelist ; X = One,GPA,TUCE,PSI $

? 2. Vector of means

Matrix ; Xbar = Mean(X) $

? 3. Probit model. Keep predictions of Grade (0s and 1s)

Probit ; Lhs = Grade ; Rhs = X ; keep = GFP ; list $

? 4. Derivatives are a scale times the coefficients

Calc ; Scale = N01(b'Xbar) $

? 5. Predicted probit probabilities

Create ; PP = Phi(b'x) $

? 6. Probit coefficients and derivatives of conditional mean

Matrix ; BP = B ; DP = scale * B $

? Now, repeat for the logit model

Logit ; Lhs = Grade ; Rhs = X ; keep = GFL $

? 1. Derivatives for logit are also a scale times coefficients

Calc ; Scale = lgd(b'Xbar) $

? 2. Predicted logit probabilities

Create ; PL = lgp(b'x) $

? 3. Logit coefficients and derivatives of the conditional mean

Matrix ; BL = B ; DL = scale * B $

? Compare the two models. First, coefficients, side by side

Matrix ; List ; [BP,BL] ; Pause $

? Now, derivatives, side by side

Matrix ; List ; [DP,DL] ; Pause $

? Predictions and fitted probabilities

List ; PP,PL,GFP,GFL $
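
The "scale times coefficients" steps above can be sketched in Python to show why probit and logit coefficients differ while the slopes tend to be close. This is a hedged illustration: the function names are invented, the coefficient vector and means are hypothetical rather than LIMDEP output, and the 0.5 cutoff for turning a probability into a 0/1 prediction is an assumption about how the kept predictions are formed.

```python
import math

def norm_pdf(z):   # counterpart of N01(.)
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):   # counterpart of Phi(.)
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def logistic_cdf(z):   # counterpart of lgp(.)
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):   # counterpart of lgd(.)
    p = logistic_cdf(z)
    return p * (1.0 - p)

def marginal_effects(b, xbar, density):
    """Derivatives of the conditional mean at xbar: density(b'xbar) * b."""
    index = sum(bj * xj for bj, xj in zip(b, xbar))
    scale = density(index)
    return [scale * bj for bj in b]

def predict(b, x, cdf, cutoff=0.5):
    """Fitted probability and 0/1 prediction (cutoff is an assumption)."""
    prob = cdf(sum(bj * xj for bj, xj in zip(b, x)))
    return prob, 1 if prob > cutoff else 0

# At an index of zero the two scale factors differ by a factor of about 1.6,
# which is roughly how much larger logit coefficients tend to be.
print(norm_pdf(0.0), logistic_pdf(0.0), norm_pdf(0.0) / logistic_pdf(0.0))

# Hypothetical coefficients and means (constant, one regressor).
b_probit, b_logit, xbar = [-1.0, 0.5], [-1.6, 0.8], [1.0, 2.0]
print(marginal_effects(b_probit, xbar, norm_pdf))
print(marginal_effects(b_logit, xbar, logistic_pdf))
print(predict(b_probit, [1.0, 3.0], norm_cdf))
```

The entry for the constant term has no interpretation as a slope, and for a dummy variable such as PSI the derivative is only an approximation to the change in probability.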

 

/* Now let's do the heteroskedastic probit using the Maximize procedure. In the

example using the "canned" procedure, we used logged GPA as the possible source of heteroskedasticity.

This time let's see if there are group effects causing heteroskedasticity by using PSI.

We'll obtain starting values by estimating the "canned" homoskedastic procedure first.

Note that we could also have gotten this from the "hard" procedure above. */

 

Create ; q=2*Grade-1 $

Probit ; Lhs=Grade ; Rhs=One,GPA,Tuce,PSI $

Maximize ; Labels=b0,b1,b2,b3,gamma

; Start=B,1

; Fcn=

Xb=b0+b1*GPA+b2*Tuce+b3*PSI

|Sig=exp(gamma*PSI)

|Xbh=Xb/Sig

|log(phi(q*(Xbh)))

$
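
For readers who want to see the function handed to Maximize written out, here is a minimal Python sketch of the same log-likelihood. It is an illustration only: the function names are invented, the parameter values are hypothetical starting guesses rather than estimates, and only four rows of the data listed above are used to keep the example short.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def het_probit_loglik(params, data):
    """Sum of log Phi(q * x'b / sigma) with sigma = exp(gamma * PSI).

    params = (b0, b1, b2, b3, gamma); data rows are (GPA, TUCE, PSI, GRADE).
    """
    b0, b1, b2, b3, gamma = params
    total = 0.0
    for gpa, tuce, psi, grade in data:
        q = 2 * grade - 1                    # +1 if GRADE = 1, -1 if GRADE = 0
        xb = b0 + b1 * gpa + b2 * tuce + b3 * psi
        sig = math.exp(gamma * psi)          # equals 1 whenever PSI = 0
        total += math.log(norm_cdf(q * xb / sig))
    return total

# Four rows copied from the data listing above.
rows = [(2.66, 20, 0, 0), (4.00, 21, 0, 1), (3.12, 23, 1, 0), (3.16, 25, 1, 1)]

# Hypothetical parameter values, NOT estimates.
homo = het_probit_loglik((-7.0, 2.0, 0.05, 1.0, 0.0), rows)
het = het_probit_loglik((-7.0, 2.0, 0.05, 1.0, 0.5), rows)
print(f"gamma = 0.0: {homo:.4f}   gamma = 0.5: {het:.4f}")
```

Setting gamma to 0 reproduces the homoskedastic probit log-likelihood, so the null of no heteroskedasticity is the single restriction gamma = 0.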

 

/* The t statistic on gamma in the preceding model is the square root of a Wald statistic for the

null of no heteroskedasticity. We could also do Lagrange Multiplier and

Likelihood Ratio tests. I leave it to you to replicate the analysis on p. 830 of Greene. */

 

Delete ; * $