Maximum Likelihood and Limited Dependent Variables

2008 Essex Summer School in Social Science Data Analysis and Collection

Instructor: B. Dan Wood, Texas A&M University

 

This course covers the underlying theory of maximum likelihood (ML) procedures and their application to social science research. There will be strong emphasis on the statistical theory of maximum likelihood, particularly during the first week, when we develop principles of specification, estimation, inference, measures of fit, and properties of the ML model. Throughout the course we emphasize that good social science requires an appropriate fit between substantive theory and the statistical model of uncertainty chosen to represent that theory. Maximum likelihood offers a range of possible models of uncertainty. Among the specific models to be discussed are the normal general linear model, models for non-normal disturbances (such as with logged data or rare events), logit and probit models for binary and ordinal dependent variables, discrete choice models for multiple alternatives (such as vote choice in any system with more than two parties), event count models for dependent variables that count the number of times an event occurs in some period of time (such as wars in a decade, coups in a year, court appointments in a presidential term, or incumbents defeated in an election), and models for non-random selection (as when you observe the preferences of voters but not non-voters). The applications are almost endless.

The background required for the course is a good introduction to probability and statistical inference and at least one good regression course (something covering multiple regression, preferably with some emphasis on the matrix perspective). Some familiarity with linear algebra is assumed, though we will try to present explanations in ordinary algebra alongside the linear algebra. Similarly, some familiarity with calculus and function optimization would be helpful. Attendance at the early-morning mathematics for social scientists lectures is highly recommended if you lack either of these tools.

Classes will meet each day for about two hours of lecture and two hours in the computer lab. In the computer lab, we will work with LIMDEP, STATA, or R (your choice). Daily computer exercises will emphasize application of statistical theory. No prior experience with LIMDEP, STATA, or R is required, but some familiarity with the PC environment would be helpful.

Readings for each of the topics covered will be assigned from the following. (Note that the Greene book below is fairly expensive. Relevant sections are in your course pack, available in the summer school office. Also, course materials may be downloaded in PDF format by clicking here.) Obtain data to replicate the analyses in the Long book by clicking here. Obtain data for the project assignments by clicking here. Obtain the lecture notes in Adobe Acrobat format by clicking here.

Eliason, Scott R. 1993. Maximum Likelihood Estimation: Logic and Practice. Newbury Park: Sage. (This is one of the fairly inexpensive green Sage publications.)

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Newbury Park: Sage. (This provides an introduction to the theory of likelihood, as well as nice discussions of interpretation and applications of various methods.)

Evans, Merran, Nicholas Hastings, and Brian Peacock. 2000. Statistical Distributions, Third Edition. New York: John Wiley. (This is a fairly inexpensive reference on probability distributions that should be in everyone's personal library.)

Greene, William H. 2008. Econometric Analysis, 6th Edition. New York: Prentice Hall.

Course Outline

The following topics will be covered in the order specified. Students should complete the assigned reading prior to the class in which it will be discussed.

  1. Introduction to Probability Models and Likelihood

    Read Eliason, chapter 1; Long, chapter 1

    DO: Fundamentals of LIMDEP. Click here for LIMDEP Assignment 1.  Fundamentals of STATA.  Click here for STATA Assignment 1. Fundamentals of R. Click here for R Assignment 1.

 

  2. Review of Probability Distributions and Likelihood (continued)

    Read Evans, Hastings, and Peacock, chapters 1-3; browse Evans, Hastings, and Peacock, chapters 4-45; Greene, chapter 16, pp. 482-496 (look over Greene, Appendix B)

    DO: Probability distributions and estimating a mean and variance using MLE. Click here for LIMDEP Assignment 2.  Click here for STATA Assignment 2. Click here for R Assignment 2.

 

  3. Maximum Likelihood Estimation: The Normal General Linear Model

    Read Eliason, chapters 1-3; Long, chapters 2 and 4; Greene, remainder of chapter 16 (look over Greene, Appendix E)
     
    DO: Estimating a linear regression using MLE. Click here for LIMDEP Assignment 3.  Click here for STATA Assignment 3. Click here for R Assignment 3.

 

  4. Maximum Likelihood Estimation: The Heteroskedastic and Autocorrelated General Linear Models

    Read Eliason, chapter 2; Greene, pp. 517-529

    DO: Estimating the heteroskedastic/autocorrelated linear regression using MLE. Click here for LIMDEP Assignment 4.  Click here for STATA Assignment 4. Click here for R Assignment 4.

 

  5. Continuous Distributions with Truncation: Gamma, Exponential, Weibull, Log Normal, Beta, and Truncated Normal Distributions

    Read Eliason, chapters 4-6; Greene, pp. 996-998, 71-72, 119; Evans, Hastings, and Peacock, chapters 5, 14, 19, 26, 42

    DO: Models with non-normal disturbances. Click here for LIMDEP Assignment 5.  Click here for STATA Assignment 5. Click here for R Assignment 5.

 

  6. Models for Binary Choice: Logit and Probit

    Read Long, chapter 3; Greene, pp. 770-796
    Click here for Scott Long’s XPOST Excel Interpretation Tools
    Click here for instructions on installing Scott Long’s SPOST interpretation tools for STATA
    Click here for instructions on installing Gary King’s Clarify interpretation tools for STATA
    Click here for the Zelig website, which contains full documentation for interpretational tools in Zelig and R.

    DO: Binary logit/probit. Click here for LIMDEP Assignment 6. Click here to download examples of interpreting Probit and Logit using XPOST. Click here for STATA Assignment 6. Click here for examples of interpretation of Probit and Logit using Clarify. Click here for R Assignment 6, which includes interpretational tools using Zelig.

 

  7. Models with Multiple Choices: Multinomial Logit, Probit, and Ordered Probit

    Read Long, chapters 5 and 6; Greene, pp. 826-859

    DO: Multinomial Models for Discrete Outcomes. Click here for LIMDEP Assignment 7. Click here for STATA Assignment 7. Click here for examples of interpretation of Multinomial Logit, Ordered Probit, and Ordered Logit using Clarify. Click here for R Assignment 7, which includes interpretation in Zelig.

 

  8. Models for Count Data: Poisson and Negative Binomial Estimators

    Read Long, chapter 8; Greene, pp. 906-931

    DO: Models for count data. Click here for LIMDEP Assignment 8. Click here for STATA Assignment 8. Click here for examples of interpretation of Poisson and Negative Binomial regression using Clarify. Click here for R Assignment 8.

 

  9. Limited Dependent Variables: Censoring and Truncation

    Read Long, chapter 7; Greene, pp. 863-903

    DO: Censoring and Truncation. Click here for LIMDEP Assignment 9. Click here for STATA Assignment 9. Click here for R Assignment 9.

 

  10. Parametric Duration Models

    Read Greene, pp. 931-942

    DO: Duration Models. Click here for LIMDEP Assignment 10. Click here for STATA Assignment 10. Click here for R Assignment 10.