Censoring and Truncation
The purpose of this session is to show you how to use LIMDEP's
procedures for censored and truncated regression. We also estimate
Heckman's two-stage procedure for samples with selection bias, which is a form
of incidental truncation.
/*
This file demonstrates some of LIMDEP's
procedures for censored and truncated regression. In particular, we
estimate a lower-limit censored regression (i.e., Tobit),
Cragg's model, which assumes a heterogeneous
censoring process, Heckman's incidental truncation
model for dealing with sample selection bias, and truncated regression.*/
Reset $
/*We will use some of the Mroz data
on female labor force participation and income for these examples. The first
428 observations of the Mroz data are women who
worked in 1975. The remaining 325 observations
are women who did not work. We will use only the first 50 observations
from each of these subsets of the data.*/
READ ; File="c:\documents and settings\b. dan wood\my documents\my teaching\maximum likelihood\data\tobit.dat" ; Nrec = 100 ; Nvar = 19
     ; Names = LFP,WHRS,KL6,K618,WA,WE,WW,RPWG,HHRS,HA,HE,HW,
               FAMINC,MTR,WMED,WFED,UN,CIT,AX $
Sample ; 1-100 $
Skip $
/*Let's create a Namelist to be used
below.*/
Namelist ; X=One,KL6,K618,WA,WE $
/*
Now let's estimate a Tobit model and also save the
log likelihood for later testing. The dependent variable (WHRS) is the wife's hours worked in 1975. The independent
variables are a constant, number of children less than 6 years old (KL6),
number of children between 6 and 18 (K618), wife's age (WA), and wife's education (WE). The 'Parameters' option saves the full
parameter set for later calculations.
*/
Tobit ; Lhs=Whrs ; Rhs=X ; Margin ; List ; Parameters $
Calc ; Ltobit=Logl $
/*In
the Tobit model above, the censoring limit is set
automatically to a lower limit of zero. However, you can change this to an upper limit, a
lower limit other than zero, or both if your data warrant it. */
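/*
To make the role of the censoring limit concrete, here is a minimal sketch in
Python (not LIMDEP code) of the log likelihood for a regression censored from
below. It assumes normal errors with a constant standard deviation. The names
tobit_loglik and lower are illustrative assumptions, not LIMDEP's implementation.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log likelihood of a regression censored from below at `lower`.

    y     : observed dependent variable (censored values equal `lower`)
    xb    : fitted index beta'x for each observation
    sigma : error standard deviation (assumed constant)
    lower : hypothetical argument for the censoring limit; 0 mimics the default
    """
    total = 0.0
    for yi, xbi in zip(y, xb):
        if yi <= lower:
            # Censored observation: probability that the latent value
            # falls at or below the limit.
            total += math.log(norm_cdf((lower - xbi) / sigma))
        else:
            # Uncensored observation: ordinary normal density.
            total += math.log(norm_pdf((yi - xbi) / sigma) / sigma)
    return total
```

Changing lower moves the point at which an observation switches from the
density term to the probability term, which is all that a different lower
limit amounts to in this sketch.
*/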
/*
McDonald and Moffitt suggest a useful decomposition of
the marginal effects associated with the censored regression model. They show
that a change in the conditional mean due to a right side variable derives from
two sources:
1) It affects the conditional mean in the uncensored part of the distribution.
2) It affects the probability that an
observation will lie in the uncensored part of the distribution.
Below
we calculate McDonald and Moffitt's decomposition
using both matrices and the Wald procedure. It is P,
P1, and P2 that are of interest. P is the probability that an observation will
lie in the uncensored part of the distribution, given the model. P1 and P2
are the scale factors that would be multiplied by the
coefficient vector, B, to obtain the decomposition into the two separate
effects; they sum to P. */
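/*
As a cross-check on the algebra, the same three quantities can be sketched in
Python (not LIMDEP code). The input z stands for Beta'Xb/S evaluated at the
means; the function name mcdonald_moffitt is an illustrative assumption.

```python
import math


def norm_pdf(z):
    """Standard normal density (N01 in the LIMDEP commands)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function (Phi in the LIMDEP commands)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def mcdonald_moffitt(z):
    """Return (P, P1, P2) for the standardized index z = beta'x / sigma.

    P  : probability of being uncensored
    P1 : scale factor for the effect on the mean of uncensored observations
    P2 : scale factor for the effect on the probability of being uncensored
    """
    p = norm_cdf(z)
    mu = norm_pdf(z) / p                      # inverse Mills ratio
    p1 = p * (1.0 - z * mu - mu ** 2)
    p2 = norm_pdf(z) * z + norm_pdf(z) * mu
    return p, p1, p2
```

For any z the two pieces add back up to P, so multiplying B by P1 and by P2
splits the total marginal effect P*B into its two sources.
*/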
Matrix ; Xb=Mean(X) ; Beta=Part(B,1,5) $
Calc ; List
     ; BXoverS=Beta'Xb/S ; Mu=N01(BXoverS)/Phi(BXoverS)
     ; P=Phi(BXoverS)
     ; P1=P*(1-BXoverS*Mu-Mu^2)
     ; P2=N01(BXoverS)*BXoverS+N01(BXoverS)*Mu $
Wald ; Labels=b0,b1,b2,b3,b4,V
     ; Start=B ; Var=Varb
     ; Fn1=b0'Xb/V
     ; Fn2=Phi(Fn1)
     ; Fn3=N01(Fn1)/Fn2
     ; Fn4=Fn2*(1-Fn3*(Fn1+Fn3))
     ; Fn5=Fn2*Fn3*(Fn1+Fn3) $
/*
Cragg has suggested that it is often incorrect to assume that the censoring
process depends on the same distribution as the uncensored observations.
He suggests a two-equation system in which the first equation
estimates the probability of being above the censoring limit and the second is
a truncated regression on the uncensored observations. Below we estimate Cragg's model using Probit and LIMDEP's Truncated regression
procedure. We also do a likelihood ratio test of whether Cragg's
model is significantly different from the Tobit
model.
*/
Probit ; Lhs=Lfp ; Rhs=X $
Calc ; List ; LProbit=Logl $
Sample ; 1-50 $
Truncate ; Lhs=Whrs ; Rhs=X $
Calc ; List ; LTrunc=Logl
     ; LRtest=2*((LProbit+LTrunc)-LTobit) $
/*
The restricted model is Tobit.
The unrestricted model is the two models estimated separately. The test
statistic is 14.4, which is chi-squared with 5 degrees of freedom, the
number of additional parameters being estimated. The 5 percent critical value is 11.07,
so we reject the null that the restricted model is true. The two-equation
approach is therefore more appropriate than Tobit. */
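/*
The comparison with the critical value can also be expressed as a p-value. The
following is a minimal sketch in Python (not LIMDEP code). The function names
are illustrative assumptions, and the tail probability uses the closed form
that holds for a chi-squared variate with exactly 5 degrees of freedom.

```python
import math


def lr_statistic(logl_unrestricted, logl_restricted):
    """Likelihood ratio statistic: twice the gain in log likelihood."""
    return 2.0 * (logl_unrestricted - logl_restricted)


def chi2_sf_5df(x):
    """Upper-tail probability of a chi-squared variate with 5 degrees of freedom.

    Uses the closed form available for odd degrees of freedom; it is valid
    only for 5 degrees of freedom and is not a general routine.
    """
    if x <= 0.0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0))


# Statistic reported in the text above.
p_value = chi2_sf_5df(14.4)
reject_at_5_percent = p_value < 0.05
```

In the LIMDEP run the unrestricted log likelihood is LProbit+LTrunc and the
restricted one is LTobit, so lr_statistic mirrors the LRtest calculation.
*/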
/*
We can also estimate Tobit with heteroskedasticity
by including a vector of variables in a second link function, Rh2. Below we
estimate a heteroskedastic Tobit
with the log of family income posited as the source of the nonconstant
variance. This model has difficulty converging, so we specify a different
convergence criterion than the default. */
Sample ; 1-100 $
Create ; LFINC=Log(Faminc) $
Tobit ; Lhs=Whrs
      ; Rhs=X ; Rh2=LFinc ; Het
      ; Margin ; Tlf=.0001 $
/*
The t statistic on the heteroskedasticity
term is not significant, so we find no support for the heteroskedastic
specification. */
/*
Now let's turn to estimating a model with sample selection bias. In these cases
the truncation is incidental: it arises from sample selection on another variable that
is correlated with the dependent variable. As discussed in
class, the standard model is Heckman's two-stage procedure. Here is an example.
*/
Probit ; Lhs=LFP
       ; Rhs=One,CIT,KL6
       ; Hold $
Select ; Lhs=Whrs
       ; Rhs=X
       ; List
       ; Margin $
/*
The probit equation
estimates an index, the inverse Mills ratio, that attempts to measure the
omitted variable in the equation for the incidentally
truncated variable in the second equation. The 'Hold' option saves this
variable, and it is inserted as an
additional variable in the Select equation. */
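/*
For readers who want to see the index itself, here is a minimal sketch in
Python (not LIMDEP code) of the inverse Mills ratio evaluated at a fitted
probit index. The function name inverse_mills and the example index values are
illustrative assumptions; LIMDEP computes this term internally.

```python
import math


def inverse_mills(z):
    """Inverse Mills ratio: standard normal density over distribution function."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return pdf / cdf


# Hypothetical fitted probit indexes for three selected observations.
probit_index = [-1.0, 0.0, 1.0]

# The extra regressor that the second-stage equation would receive.
selection_term = [inverse_mills(z) for z in probit_index]
```

The ratio falls as the probit index rises, so observations that were very
likely to be selected receive only a small correction.
*/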
Delete ; * $