The Linear Regression Model

The purpose of this week's computer assignment is to introduce you to the matrix treatment of the linear regression model. We will first use LIMDEP's canned regression procedures, and then replicate all of the results using matrix algebra. Again, the point is not that you would ever run a regression this way in practice, but that working through the matrix calculations deepens your understanding of what the canned procedure is doing.

 RESET $

/* Read the data. The data are from Kmenta, 1971, p. 456 and relate to corporate investment behavior. The data are for General Motors and include 1) It = investment at time t, 2) Ftlag1 = outstanding shares lagged one period, and 3) Ktlag1 = capital stock lagged one period. */

READ ; FORMAT=wks ; NVAR=5 ; FILE=H:\teaching\POLS603\KMENT102.wks ; Names $

/* Set the sample length and use dstats to take a look at the data. */

SAMPLE ; 1-19 $
dstats ; rhs=* ; output=3 ; quantiles ; plot ; all ; $

/* Now let's do a regression the easy way. The following performs a regression of It on a constant and on Ft and Kt, each lagged one period. We also plot the residuals against the observation number. The res and keep options store the residuals and predictions as variables (ehat and ypred) for later use. */

Regress ; lhs=It ; rhs=one,FTLAG1,KTLAG1 ; list ; plot ; res=ehat ; keep=ypred $

/* You can also plot the residuals against any other variable, or against the predicted values (ypred) created above, for diagnostic purposes. You can also plot the standardized residuals, as we shall see later, which helps in checking for outliers. */

PLOT ; LHS=YPRED ; RHS=EHAT $
PLOT ; LHS=FTLAG1 ; RHS=EHAT $
PLOT ; LHS=KTLAG1 ; RHS=EHAT $

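/* Now, let's illustrate how to do all of this the hard way using matrices. Define a matrix of independent variables, X, and the dependent variable, y. The constant is LIMDEP's built-in variable one, as in the regression above. */

namelist ; X=one,FTLAG1,KTLAG1 ; y=It $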

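/* Create a scalar K for the number of regressors (the number of columns of X, including the constant). Note that N is a system constant already defined as the current sample size. However, we could also get N by taking i'i, where i is an N x 1 column of ones. */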

CALC ; K=COL(X) $

/* Now let's create some matrices and results that will be useful for later calculations. Create M0, the matrix that converts to deviation form. */

MATRIX ; LIST ; I=IDEN(N) $
MATRIX ; LIST ; ONES=INIT(N,1,1) $
MATRIX ; LIST ; ONEOVERN={1/N}*ONES*ONES' $
MATRIX ; LIST ; M0=I-ONEOVERN $
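
To see what M0 does, here is a minimal sketch in Python with NumPy rather than LIMDEP. The four numbers in y are made up for illustration and are not the investment data. It builds M0 the same way as the MATRIX commands above and applies it to a vector.

```python
import numpy as np

# Made-up numbers, not the investment data used in this assignment.
y = np.array([3.0, 5.0, 4.0, 8.0])
n = y.size

ones = np.ones((n, 1))                 # the column vector i
M0 = np.eye(n) - ones @ ones.T / n     # M0 = I - (1/n) i i'

n_from_ones = (ones.T @ ones).item()   # i'i recovers the sample size
dev = M0 @ y                           # y in deviation-from-mean form
sst = y @ M0 @ y                       # total sum of squares, y'M0 y
```

Multiplying by M0 subtracts the mean from every element, so y'M0 y is the sum of squared deviations from the mean. M0 is also idempotent, so applying it a second time changes nothing.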

/* Create the projection matrix P and the residual-maker matrix M = I - P. P projects onto the column space of X; M projects onto its orthogonal complement. */

MATRIX ; LIST ; P=X*<X'X>*X' $
MATRIX ; LIST ; M=I-P $
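
The properties that make P and M useful can be checked numerically. The following is a sketch in Python with NumPy, not LIMDEP, and the regressors are randomly generated stand-ins rather than the Kmenta data.

```python
import numpy as np

# Randomly generated regressors for illustration only (not the Kmenta data).
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])

P = X @ np.linalg.inv(X.T @ X) @ X.T   # P = X (X'X)^-1 X'
M = np.eye(n) - P                      # M = I - P

p_idempotent = np.allclose(P @ P, P)   # PP = P
m_idempotent = np.allclose(M @ M, M)   # MM = M
m_kills_x = np.allclose(M @ X, 0.0)    # MX = 0
trace_p = np.trace(P)                  # equals the number of columns of X
```

Because MX = 0, the residuals M*Y are orthogonal to every column of X. The trace of P equals K, a fact that reappears when we look at leverage.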

/*Calculate the mean of Y */

MATRIX ; list ; Ymean=1/N*y'1 $

/*Now, let's calculate the set of regression coefficients. */

MATRIX ; list ; Beta=<X'X>*X'Y $
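
As a cross-check on the formula, here is a sketch in Python with NumPy, not LIMDEP, on simulated data whose coefficients are arbitrary assumptions. It compares the textbook inverse formula with two numerically preferable routes to the same coefficients.

```python
import numpy as np

# Simulated data for illustration only; the "true" coefficients are arbitrary.
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=n)

beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y          # (X'X)^-1 X'y, as in the MATRIX command
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)       # solves X'X b = X'y without forming the inverse
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # least squares computed directly from X and y
```

All three agree to rounding error here. Statistical software generally avoids the explicit inverse because it loses accuracy when the regressors are nearly collinear.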

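/* Now, let's get the vector of predicted values using two different methods: X times the coefficient vector, and the projection matrix P applied to Y. The second command overwrites the first, so compare the two listings to confirm that they agree. */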

MATRIX ; list ; YHAT=X*Beta $
MATRIX ; LIST ; YHAT=P*Y $

/* Now, let's get the vector of residuals using two different methods: Y minus the predicted values, and the matrix M applied to Y. */

MATRIX ; list ; E=Y-YHAT $
MATRIX ; LIST ; E=M*Y $

/*Now, let's calculate the total sum of squares for Y using two different methods.*/

MATRIX ; list ; SST=Y'Y-YMEAN^2*N $
MATRIX ; LIST ; SST=Y'M0*Y $

/* Now, let's calculate the sum of squared errors (residuals) using two different methods. */

MATRIX ; list ; SSE=Y'Y-Beta'*X'Y $
MATRIX ; LIST ; SSE=E'E $

/*Now, let's calculate the sum of squares due to regression using two different methods. */

MATRIX ; list ; SSR=Beta'*X'Y-YMEAN^2*N $
MATRIX ; LIST ; SSR=Beta'*X'M0*X*Beta $
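
The three sums of squares are tied together by the identity SST = SSR + SSE, which holds whenever the regression includes a constant. The sketch below uses Python with NumPy rather than LIMDEP, on simulated data that are not the assignment's data, and computes each quantity with the first of the two formulas used above.

```python
import numpy as np

# Simulated data for illustration only; the "true" coefficients are arbitrary.
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta                      # residuals
ybar = y.mean()

sst = y @ y - n * ybar**2             # Y'Y - N*Ybar^2
sse = y @ y - beta @ X.T @ y          # Y'Y - b'X'Y
ssr = beta @ X.T @ y - n * ybar**2    # b'X'Y - N*Ybar^2
```

Adding the second and third lines term by term gives the first, which is the decomposition. The value of sse also matches e @ e, the second method used above.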

/*Now, let's get the three mean squares implied by an analysis of variance table. */

CALC ; list ; MST=SST/(N-1) $
CALC ; list ; MSE=SSE/(N-K) $
CALC ; list ; MSR=SSR/(K-1) $

/* Now, let's calculate the estimated variance of the regression disturbances. Note that this is the same quantity as MSE above. */

CALC ; list ; SIGMA2=SSE/(N-K) $

/* Now, let's calculate the standard error of the estimate (the standard error of the regression), which is the square root of SIGMA2. */

CALC ; list ; SEE=SQR(SIGMA2) $

/*Now, let's get the variance covariance matrix of coefficients using the variance of the regression calculated above. */

MATRIX ; list ; VARCOVB=SIGMA2*<X'X> $

/*Now, let's pick off the variances of the individual coefficients from the variance covariance matrix above. */

MATRIX ; list ; VARBeta=VECD(VARCOVB) $

/*The standard errors of the coefficients are just the square root of the diagonal of the variance covariance matrix of coefficients. */

MATRIX ; list ; STDERRS=ESQR(VARBeta) $

/*Using the coefficients we can compute t statistics for the null hypothesis that the coefficient equals zero just by dividing the coefficients by the standard errors. */

MATRIX ; LIST ; ONEOVSB=./STDERRS $
MATRIX ; list ; TSTAT=DIRP(BETA,ONEOVSB) $
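
The chain from residuals to t statistics can be sketched in a few lines. This is Python with NumPy rather than LIMDEP, and the data are simulated, so the numbers will not match the regression output above.

```python
import numpy as np

# Simulated data for illustration only; the "true" coefficients are arbitrary.
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=n)
k = X.shape[1]

xtx_inv = np.linalg.inv(X.T @ X)
beta = xtx_inv @ X.T @ y
e = y - X @ beta

sigma2 = e @ e / (n - k)              # SSE/(N-K)
varcov = sigma2 * xtx_inv             # variance covariance matrix of the coefficients
stderrs = np.sqrt(np.diag(varcov))    # square roots of the diagonal
tstat = beta / stderrs                # element-by-element division
```

NumPy divides arrays element by element, so the intermediate vector of reciprocals built in the MATRIX commands above is not needed here.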

/* Now, let's convert the variance covariance matrix of the coefficients into a correlation matrix by pre- and post-multiplying it by a diagonal matrix of reciprocal standard errors. */

MATRIX ; LIST ; DIAGSB=DIAG(ONEOVSB) $
MATRIX ; list ; CORRMAT=DIAGSB*VARCOVB*DIAGSB' $
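
A small worked case makes the scaling concrete. The sketch below uses Python with NumPy rather than LIMDEP, and the 2 x 2 covariance matrix is hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import numpy as np

# Hypothetical 2 x 2 coefficient covariance matrix, chosen for easy arithmetic.
varcov = np.array([[4.0, 2.0],
                   [2.0, 9.0]])

stderrs = np.sqrt(np.diag(varcov))   # standard errors: 2 and 3
D = np.diag(1.0 / stderrs)           # diagonal matrix of reciprocal standard errors
corrmat = D @ varcov @ D             # D is symmetric, so D' = D
```

Each element of corrmat is the covariance divided by the product of the two standard errors. The diagonal is therefore exactly 1 and the off-diagonal element is 2/(2*3) = 1/3.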

/*Now let's calculate R squared using a couple of different ways. */

CALC ; list ; RSQR=1-SSE/SST $
CALC ; LIST ; RSQR=SSR/SST $

/*Now, let's calculate adjusted R squared a couple of different ways. */

CALC ; list ; RSQRADJ=1-MSE/MST $
CALC ; LIST ; RSQRADJ=1-((N-1)/(N-K))*(1-RSQR) $
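
The two adjusted R squared formulas are algebraically identical, which a quick calculation confirms. In this plain Python sketch, N and K match the assignment (19 observations, 3 regressors), but the sums of squares are hypothetical round numbers, not results from the data.

```python
# Hypothetical sums of squares; only N and K are taken from this assignment.
sst, sse = 100.0, 20.0
n, k = 19, 3

rsqr = 1 - sse / sst                               # 0.8

mse = sse / (n - k)
mst = sst / (n - 1)
rsqradj_a = 1 - mse / mst                          # first method
rsqradj_b = 1 - ((n - 1) / (n - k)) * (1 - rsqr)   # second method
```

Both methods give 0.775, which is below the unadjusted 0.8 because the adjustment penalizes the degrees of freedom used up by the regressors.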

/* Now, let's calculate an F statistic for the overall significance of the regression a couple of different ways. */

CALC ; LIST ; FSTAT=MSR/MSE $
CALC ; LIST ; FSTAT=(RSQR/(K-1))/((1-RSQR)/(N-K)) $
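
The same kind of check works for the F statistic. In this plain Python sketch, N and K match the assignment, while the sums of squares are hypothetical round numbers rather than results from the data.

```python
# Hypothetical sums of squares; only N and K are taken from this assignment.
ssr, sse = 80.0, 20.0
n, k = 19, 3

msr = ssr / (k - 1)
mse = sse / (n - k)
fstat_a = msr / mse                                    # ratio of mean squares

rsqr = ssr / (ssr + sse)
fstat_b = (rsqr / (k - 1)) / ((1 - rsqr) / (n - k))    # same statistic from R squared
```

Both expressions give 32 (up to floating-point rounding), because dividing the numerator and denominator of the first by SST turns it into the second.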

/* Now, let's calculate the standardized residuals discussed on pp. 60-61 of Greene using both the regression procedure and a matrix approach. A standardized (sometimes called studentized) residual larger than 2 in absolute value is suggestive of an outlier according to Belsley, Kuh, and Welsch. The regression procedure includes the standardized residuals in the list or plot output when the word standard is also given as an option. */

Regress ; lhs=It ; rhs=one,FTLAG1,KTLAG1 ; list; standard $

MATRIX ; HAT=P $
MATRIX ; Hii=VECD(HAT) $
MATRIX ; ONEMNHii=1-Hii $
MATRIX ; SQRTHii=ESQR(ONEMNHii) $
MATRIX ; DENOM=SEE.*SQRTHii $
MATRIX ; LIST ; UI=DIRP(E,./DENOM) $
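
The matrix steps above amount to dividing each residual by SEE times the square root of one minus its leverage. Here is a sketch of that calculation in Python with NumPy rather than LIMDEP, on simulated data, so the flagged observations mean nothing for the investment data.

```python
import numpy as np

# Simulated data for illustration only; the "true" coefficients are arbitrary.
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=n)
k = X.shape[1]

P = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(P)                        # leverage values h_ii
e = y - P @ y                         # residuals
see = np.sqrt(e @ e / (n - k))        # standard error of the estimate

u = e / (see * np.sqrt(1.0 - h))      # standardized residuals
outlier = np.abs(u) > 2               # rule of thumb from the text
```

The divisor shrinks as leverage rises, so a residual of a given size counts for more at a high-leverage observation.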

/* Belsley, Kuh, and Welsch also suggest looking at the leverage that each observation exerts on the regression. Leverage is just the corresponding diagonal element of the Hat matrix. A leverage value larger than 2K/N suggests a high-leverage observation that deserves a closer look. The critical value would then be: */

CALC ; list ; HatCrit=2*K/N $


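/* Compute each observation's leverage directly as the quadratic form x_i'(X'X)^-1 x_i, which avoids forming the full N x N Hat matrix, then flag and list the observations whose leverage exceeds the critical value. */

MATRIX ; XXI=<X'X> $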
CREATE ; Hatvalue=QFR(X,XXI) $
CREATE ; IF(Hatvalue > HatCrit)Lookatme=1 $
LIST ; Hatvalue,Lookatme $
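
The row-by-row quadratic form gives the same numbers as the diagonal of the Hat matrix. The sketch below checks this in Python with NumPy rather than LIMDEP. The regressors are randomly generated stand-ins for the data, and the variable names are illustrative.

```python
import numpy as np

# Randomly generated regressors for illustration only (not the Kmenta data).
rng = np.random.default_rng(0)
n = 19
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
k = X.shape[1]

xtx_inv = np.linalg.inv(X.T @ X)

# x_i'(X'X)^-1 x_i for every row i, without building the N x N Hat matrix.
hat_q = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
hat_d = np.diag(X @ xtx_inv @ X.T)            # diagonal of the full Hat matrix

hat_crit = 2 * k / n
look_at_me = (hat_q > hat_crit).astype(int)   # 1 flags a high-leverage observation
```

The leverages sum to K, so their average is K/N. The cutoff of 2K/N therefore flags observations with more than twice the average leverage.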

/* That's all folks. Delete all variables to be ready for a new analysis. */

Delete ; * $