[Gretl-users] Heckit goodness of fit

Allin Cottrell cottrell at wfu.edu
Sat Apr 11 11:03:20 EDT 2009


On Sat, 11 Apr 2009, Riccardo (Jack) Lucchetti wrote:

> On Sat, 11 Apr 2009, Allin Cottrell wrote:

> > Yes, I like it.  But note that at present it produces a horrid
> > mess for two-step heckit since the covariance matrix is stuffed
> > with NAs/nans.  I guess we should be able to fix that up without
> > too much difficulty.
>
> Not really, if the coefficients we test for 0 are only those for
> the main equation, and leave the selection equation alone (as I
> think we should:  the selection equation may be interesting in
> its own right, but the model you care about is the main
> equation); $vcv is block diagonal, so we should be ok.

OK, granted; you just have to be careful to limit the test to the
main equation.

I do have one reservation.  As you put it, one typically wants the
R^2 as a quick check on whether "a particular model contains any
explanatory variables worth keeping".  Yes, there's that, but one
also wants a simple measure of "goodness of fit", and the two can
diverge.  Here's a silly tsls example:

<script>
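open data4-10
ols ENROLL 0 2 3
tsls ENROLL 0 2 3 ; 3 4 5 6
# Wald statistic for all coefficients except the constant being zero
matrix b = $coeff[2:]
matrix V = $vcv[2:,2:]
W = qform(b', invpd(V))
# Wald-based R^2
R2_wald = W / (W + ($T - $ncoeff))
# correlation-based R^2 (the one currently printed)
R2_corr = corr(ENROLL, $yhat)^2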
</script>

The correlation-based R^2 that we print currently (and which is
reproduced at the end) is just slightly lower for the tsls model
than the OLS.  And in one sense that seems right -- the _fit_ is
only slightly reduced by instrumenting variable 2.  On the other
hand, no coefficient is significant in the tsls variant, and the
Wald-based R^2 is much lower.
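
For reference, here is a sketch of why the Wald-based formula in the
script coincides with the ordinary R^2 under OLS, which is what makes
the divergence under tsls worth noting.  It assumes a model with a
constant and the classical covariance estimator; the symbols q (number
of slope coefficients), k (= $ncoeff) and F (the overall F statistic)
are notation introduced here, not taken from the script.

```latex
\begin{align*}
W &= b' V^{-1} b = qF, \qquad q = k - 1, \\
F &= \frac{R^2 / q}{(1 - R^2)/(T - k)}
   \;\Longrightarrow\;
   W = \frac{R^2\,(T - k)}{1 - R^2}, \\
\frac{W}{W + (T - k)}
  &= \frac{R^2/(1 - R^2)}{R^2/(1 - R^2) + 1} = R^2 .
\end{align*}
```

No such identity ties W to the squared correlation between ENROLL and
the fitted values once the model is estimated by tsls, so the two
measures are free to move apart as they do in this example.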

Allin.

