Censoring and Truncation

 

The purpose of this session is to show you how to use LIMDEP's procedures for censored and truncated regression. We also estimate Heckman's two-stage procedure for samples with selection bias, which is a form of incidental truncation.

 

/* This file demonstrates some of LIMDEP's procedures for censored and truncated regression. In particular, we estimate a lower-limit censored regression (i.e., Tobit), Cragg's model, which assumes a heterogeneous censoring process, Heckman's incidental truncation model for dealing with sample selection bias, and truncated regression. */

 

Reset $

 

/* We will use some of the Mroz data on female labor force participation and income for these examples. The first 428 observations of the Mroz data are women who worked in 1975. The remaining 325 observations are women who did not work. We will use only the first 50 observations from each of these subsets of the data. */

 

READ ; File="c:\documents and settings\b. dan wood\my documents\my teaching\maximum likelihood\data\tobit.dat" ; Nrec = 100 ; Nvar = 19

; Names = LFP,WHRS,KL6,K618,WA,WE,WW,RPWG,HHRS,HA,HE,HW,

FAMINC,MTR,WMED,WFED,UN,CIT,AX $

 

Sample ; 1-100 $

Skip $

 

/*Let's create a Namelist to be used below.*/

 

Namelist ; X=One,KL6,K618,WA,WE $

 

/* Now let's estimate a Tobit model and also save the log likelihood for later testing. The dependent variable (WHRS) is the wife's hours worked in 1975. The independent variables are a constant, number of children less than 6 years old (KL6), number of children between 6 and 18 (K618), wife's age (WA), and wife's education (WE). The Parameters option saves the full parameter set for later calculations.

*/

 

Tobit ; Lhs=Whrs ; Rhs=X ; Margin ; List ; Parameters $

 

Calc ; Ltobit=Logl $

 

/* In the Tobit model above, the censoring limit is set automatically to a lower limit of zero. However, you can change this to an upper limit, a lower limit other than zero, or both if your data warrant it. */
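
/* The log likelihood that a lower-limit Tobit maximizes can be sketched outside LIMDEP. What follows is a minimal Python illustration, not LIMDEP code and not LIMDEP's internal routine. The function names (norm_pdf, norm_cdf, tobit_loglik), the limit argument, and the four data rows are all made up for the example.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, X, beta, sigma, limit=0.0):
    """Log likelihood for a regression censored from below at `limit`."""
    total = 0.0
    for yi, xi in zip(y, X):
        xb = sum(b * x for b, x in zip(beta, xi))
        if yi <= limit:
            # Censored observation: contributes Prob(y* <= limit).
            total += math.log(norm_cdf((limit - xb) / sigma))
        else:
            # Uncensored observation: contributes the normal density of y.
            total += math.log(norm_pdf((yi - xb) / sigma) / sigma)
    return total

# Made-up data: a constant and one regressor; two of four rows are censored at zero.
y = [0.0, 0.0, 1.5, 3.0]
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
print(tobit_loglik(y, X, beta=[0.5, 1.0], sigma=1.0))
```

Changing the hypothetical limit argument moves the lower censoring point; an upper limit would use the upper-tail probability for the censored rows instead. */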

 

/* McDonald and Moffitt suggest a useful decomposition of the marginal effects associated with the censored regression model. They show that a change in a right-hand-side variable changes the conditional mean through two channels:

 

1) It affects the conditional mean in the uncensored part of the distribution

 

2) It affects the conditional mean by also affecting the probability that an observation will lie in the uncensored part of the distribution.

 

Below we calculate McDonald and Moffitt's decomposition using both matrices and the Wald procedure. P, P1, and P2 are the quantities of interest. P is the probability that an observation will lie in the uncensored part of the distribution, given the model. P1 and P2 are the scale factors that multiply the coefficient vector, B, to obtain the two separate effects; they sum to P. */

 

Matrix ; Xb=Mean(X) ; Beta=Part(B,1,5) $

Calc ; List

; BXoverS=Beta'Xb/S ; Mu=N01(BXoverS)/Phi(BXoverS)

; P=Phi(BXoverS)

; P1=P*(1-BXoverS*Mu-Mu^2)

; P2=N01(BXoverS)*BXoverS+N01(BXoverS)*Mu $

Wald ; Labels=b0,b1,b2,b3,b4,V ; Start=B ; Var=Varb

; Fn1=b0'Xb/V

; Fn2=Phi(Fn1)

; Fn3=N01(Fn1)/Fn2

; Fn4=Fn2*(1-Fn3*(Fn1+Fn3))

; Fn5=Fn2*Fn3*(Fn1+Fn3) $
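
/* The Calc formulas above can be checked by hand outside LIMDEP. The Python sketch below mirrors the expressions for P, P1, and P2 and confirms that P1 + P2 = P. It is only an illustration: the function name mcdonald_moffitt and the index value 0.4 are made up, and z stands for BXoverS.

```python
import math

def norm_pdf(z):
    """Standard normal density (N01 in the Calc command)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function (Phi in the Calc command)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mcdonald_moffitt(z):
    """Return (P, P1, P2) evaluated at z = beta'x / sigma."""
    p = norm_cdf(z)                      # probability of being uncensored
    mu = norm_pdf(z) / p                 # inverse Mills ratio
    p1 = p * (1.0 - z * mu - mu ** 2)    # effect through the uncensored mean
    p2 = norm_pdf(z) * (z + mu)          # effect through the probability
    return p, p1, p2

p, p1, p2 = mcdonald_moffitt(0.4)  # 0.4 is an arbitrary illustrative index value
print(p, p1, p2, p1 + p2)
```

Multiplying a coefficient by P1 and by P2 gives the two pieces of its marginal effect, and multiplying by P gives the total. */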

 

/* Cragg has suggested that Tobit's assumption that the same parameters govern both the probability of being above the censoring limit and the values of the uncensored observations is often incorrect. He suggests a two-equation system in which the first equation estimates the probability of being above the censoring limit and the second is a truncated regression on the uncensored observations. Below we estimate Cragg's model using Probit and LIMDEP's Truncated regression procedure. We also do a likelihood ratio test of whether Cragg's model is significantly different from the Tobit model. */
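
/* To see why Tobit is the restricted model in this comparison, it may help to write both log likelihoods side by side. The Python sketch below is an illustration only, not LIMDEP code: the function names, the data, and the parameter values are made up. It shows that the probit-plus-truncated-regression log likelihood collapses to the Tobit log likelihood when the probit coefficients equal beta/sigma.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def tobit_loglik(y, X, beta, sigma):
    """Tobit log likelihood with a lower censoring limit of zero."""
    total = 0.0
    for yi, xi in zip(y, X):
        xb = dot(beta, xi)
        if yi <= 0.0:
            total += math.log(norm_cdf(-xb / sigma))
        else:
            total += math.log(norm_pdf((yi - xb) / sigma) / sigma)
    return total

def cragg_loglik(y, X, gamma, beta, sigma):
    """Probit for crossing zero plus a truncated regression for y > 0."""
    total = 0.0
    for yi, xi in zip(y, X):
        xg, xb = dot(gamma, xi), dot(beta, xi)
        if yi <= 0.0:
            total += math.log(norm_cdf(-xg))       # probit: at the limit
        else:
            total += math.log(norm_cdf(xg))        # probit: above the limit
            # Normal density of y rescaled by Prob(y > 0): truncated regression.
            total += math.log(norm_pdf((yi - xb) / sigma) / sigma)
            total -= math.log(norm_cdf(xb / sigma))
    return total

# Made-up data and parameters for the comparison.
y = [0.0, 0.0, 1.5, 3.0]
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta, sigma = [0.5, 1.0], 2.0
gamma = [b / sigma for b in beta]   # the restriction that Tobit imposes
print(tobit_loglik(y, X, beta, sigma), cragg_loglik(y, X, gamma, beta, sigma))
```

Because the two values coincide under the restriction, freeing the probit coefficients can only raise the maximized log likelihood, which is what the likelihood ratio test measures. */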

 

Probit ; Lhs=Lfp

; Rhs=X $

Calc ; List ; LProbit=Logl $

Sample ; 1-50 $

Truncate ; Lhs=Whrs ; Rhs=X $

Calc ; List ; LTrunc=Logl

; LRtest=2*((LProbit+LTrunc)-LTobit) $

 

/* The restricted model is Tobit. The unrestricted model is the two models estimated separately. The test statistic is 14.4, which is chi-squared with 5 degrees of freedom, the number of additional parameters being estimated (5 probit coefficients plus 6 truncated regression parameters, versus 6 for Tobit). The critical value at the 0.05 level is 11.07, so we reject the null that the restricted model is true. The two-equation approach is therefore more appropriate than Tobit. */
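
/* The critical value and the test result can be checked with a short calculation outside LIMDEP. The Python sketch below is an illustration only: the function name chi2_sf_5df is made up, and the closed form it uses applies only to 5 degrees of freedom.

```python
import math

def chi2_sf_5df(x):
    """Upper-tail probability of a chi-squared variate with 5 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0))

lr_stat = 14.4          # LR statistic reported in the comment above
critical_5pct = 11.07   # 5 percent critical value quoted in the comment above

print(chi2_sf_5df(critical_5pct))   # close to 0.05
print(chi2_sf_5df(lr_stat))         # p-value of the LR test
```

The second number is the p-value for the test; it falls below 0.05, which agrees with the rejection of the Tobit restriction. */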

 

/* We can also estimate Tobit with heteroskedasticity by including a vector of variables in a second link function, Rh2. Below we estimate a heteroskedastic Tobit with the log of family income posited as the source of the nonconstant variance. This model has difficulty converging, so we specify a convergence criterion (Tlf) different from the default. */

 

Sample ; 1-100 $

 

Create ; LFINC=Log(Faminc) $

Tobit ; Lhs=Whrs ; Rhs=X ; Rh2=LFinc ; Het

; Margin ; Tlf=.0001 $

 

/* The t statistic on the heteroskedasticity term is not statistically significant, so we find no evidence for the heteroskedastic specification and retain the homoskedastic Tobit. */

 

 

/* Now let's turn to estimating a model with sample selection bias. In these cases the truncation is incidental: the dependent variable is observed only for cases selected on another variable, and that selection is correlated with the dependent variable. As discussed in class, the standard model is Heckman's two-stage procedure. Here is an example. */

 

Probit ; Lhs=LFP

; Rhs=One,CIT,KL6

; Hold $

Select ; Lhs=Whrs

; Rhs=X

; List

; Margin $

 

/* The probit equation estimates an index from which the inverse Mills ratio is computed. This ratio attempts to measure the omitted variable in the second equation, the one for the incidentally truncated variable. The Hold option saves the probit results, and the ratio is inserted as an additional variable in the Select equation. */
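
/* The inverse Mills ratio itself is a simple function of the probit index. The Python sketch below is an illustration only, not what LIMDEP runs internally: the function name inverse_mills and the three index values are made up, and the formula shown is the one for observations that are selected into the sample.

```python
import math

def inverse_mills(index):
    """Return phi(index) / Phi(index) for a selected observation."""
    pdf = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    return pdf / cdf

# Hypothetical probit index values for three selected observations.
for index in (-1.0, 0.0, 1.0):
    print(index, inverse_mills(index))
```

The ratio is largest for observations that the probit regards as unlikely to be selected, so those cases receive the largest correction in the second-stage equation. */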

 

Delete ; * $