[Gretl-devel] time for 1.9.8? (long, I'm afraid)

Riccardo (Jack) Lucchetti r.lucchetti at univpm.it
Fri Feb 24 05:26:51 EST 2012


On Thu, 23 Feb 2012, Allin Cottrell wrote:

> I know we discussed gretl 2.0 issues a while back, and I know I need to 
> re-read what was said then, so that previous work is not just discarded. But 
> in the meantime here are a few thoughts off the top of my head.
>
> One general question: do we want to make a big deal of gretl 2.0, or do we 
> want to "do a Linus"?

[ ... ]

I don't think we are in a position to gallantly ignore the psychological 
and image implications of a major version number change, as Linus did. 
Compared to Linux, I'd venture to say that gretl is a slightly less 
recognisable brand.

> In gretl's case it would be quite natural to roll over from 1.9.9 to 2.0.0. 
> As Jack says, we could instead roll to 1.10.0 (or 1.9.10) but multi-digit 
> minor numbers are perhaps a little unsightly and liable to cause confusion in 
> some contexts.

IMHO, there would be more confusion after "Linussing"; casual users would 
simply go "WOW! Oh, wait... what...?"

> OK, so much for doing a Linus: what's the case for making 2.0 a real 
> milestone? (And deferring that milestone until something special is ready.) I 
> can see a few possibilities (there may be more). Version 2.0 should contain 
> one or more of:
>
> 1) Major new functionality
>
> 2) Major changes in the GUI
>
> 3) A major backward-incompatible clean-up of hansl
>
> 4) A major change in the libgretl API, to make it easier for third
>  parties to use
>
> 5) A major purge of bugs and update/completion of documentation
>
> My current thinking (sorry if this is disappointing!) is that number 5 
> provides the strongest case for a "2.0 milestone". Let's go through the list.
>
> * Major new functionality: Well, if we're talking C code, then at present 
> that means stuff that Jack and I will produce. I put my view on this at the 
> 2011 gretl conference: I think we now have a good enough baseline that people 
> ought to be able to add functionality to gretl in the form of function 
> packages and "addons". I certainly stand ready to fix bugs and tweak the C 
> code (including the GUI code and the "gretl server" infrastructure) to make 
> that easier. But right now I myself have no plans to add major econometric 
> functionality in C form. Jack has been working on substantial new stuff, but 
> in the form of (brilliant) hansl code rather than C.

Thanks for the kind words, but in my experience hansl is absolutely and by 
far the best language to work in from an applied econometrician's 
viewpoint, so producing nice hansl code for doing even hard stuff is 
surprisingly easy.

You may think that mine is a slightly biased opinion, since I contributed 
to the creation of hansl itself: it's a bit like Linus Torvalds saying he 
likes Linux or Larry Wall saying he prefers Perl to Python (cue for Sven to 
start the flamefest). You'd probably be right, but let me add one element: 
unlike LT and LW, coding is not my job and I never even had any formal 
training in it. My job is being an academic economist (and a 
bass player, but I digress). I think I can claim I have a much better 
understanding of the needs of the end user than LT or LW can have, because 
I'm one of them!

And I think we can take as a fact that most applied economists learn how 
to use _one_ econometrics package or perhaps two, depending on their field 
of activity; some (surely not the majority) learn how to program in _one_ 
language; a few become proficient in that language; only a tiny minority 
can claim to be at ease with more than one language. Older time-series 
people like Gauss or Ox, younger fellows go with Matlab, the micro 
community worships Stata, the eccentric brag about R, corporate 
practitioners are not even aware that anything but EViews exists, 
some pockets of resistance still exist in the TSP mountains, etcetera. I 
have some familiarity with all of these, and I'm ready to defend the point 
that NOTHING BEATS HANSL.

There are only three areas in which I see the necessity of low-level code 
work (but maybe I'm missing something):

a) Massive parallelisation is definitely the future of scientific 
computation. So far, we have cautiously explored some possibilities, but 
the time will come when properly parallelising the internals of gretl will 
become unavoidable. But that's not for 2.0; I see it more as a 3.0 thing.

b) Both hansl and gretl (heh) may strongly benefit from setting up an 
infrastructure for managing data sets like Stata does. That is, do the 
things that, ideally, you'd use an RDBMS for, but you can't ask an applied 
economist to study SQL, can you? I'm talking about dataset merging, 
splitting, sorting, keeping/dropping variables and cases, etcetera. Anybody 
who's ever worked with large micro databases knows exactly what I'm 
talking about. Stata is, to my knowledge, the only econometrics package 
that attempts to do this and does it, in my opinion, badly.

c) There may be a case for extending the way data are stored in gretl 
from a double-only representation to a more general one. This would enable 
us to have string and int variables. Allin and I talked a little about 
this in Toruń, but this is HUGE. The project currently contains about 
400,000 lines of C code, and my guesstimate is that at least half of this 
would have to be thoroughly revised, if not rewritten. Allin has already 
done some rationalisation work in libgretl which makes this a little 
easier, but it's a loooooooong way away.
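
To make point (c) a bit more concrete, here is a minimal sketch of what a
typed column could look like in C. It is NOT taken from libgretl: the names
col_type, column and column_mean are hypothetical, invented purely for
illustration. The point it tries to show is that once a series is no longer
guaranteed to be an array of doubles, even a trivial routine such as a mean
has to check the type first, which hints at why so much existing code would
need revisiting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, for illustration only; none of these names
   come from libgretl. */
typedef enum { COL_DOUBLE, COL_INT, COL_STRING } col_type;

typedef struct {
    col_type type;          /* says which member of 'data' is valid */
    int n;                  /* number of observations */
    union {
        double *d;
        int *i;
        const char **s;
    } data;
} column;

/* Compute the mean of a column. Only numeric columns are supported:
   returns 0 on success, 1 if the column is empty or not numeric. */
int column_mean(const column *c, double *mean)
{
    double sum = 0.0;
    int t;

    assert(c != NULL && mean != NULL);

    if (c->n <= 0) {
        return 1;
    }

    switch (c->type) {
    case COL_DOUBLE:
        for (t = 0; t < c->n; t++) sum += c->data.d[t];
        break;
    case COL_INT:
        for (t = 0; t < c->n; t++) sum += c->data.i[t];
        break;
    default:
        /* strings have no mean */
        return 1;
    }

    *mean = sum / c->n;
    return 0;
}
```

With a double-only representation the switch above is unnecessary, so any
function written under that assumption would need a similar check (or an
explicit conversion step) under a scheme like this one.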

> * Major changes in the GUI: That's up to me alone, and I have no plans in 
> that area. Nor do I expect to have time to implement truly big ideas that 
> others may come up with, though I'm always ready to consider incremental 
> improvements and bug fixes.

100% agree. (Notice my elegant silence on the issue of decimal 
separators.)

> * Major backward-incompatible clean-up of hansl: consistency and cleanliness 
> are good, but so is continuing backward compatibility. I can surely see a 
> case for scrapping some archaisms. But I seem to recall some folk wisdom from 
> computer science: the production of a backward-incompatible "cleaned up" 
> version 2 of language L often results in fragmentation of the user base and 
> decline of L.

IMO, it's a matter of common sense. We may scrap a few things that really 
are obsolete, but that's it.

> * Clean-up of the libgretl API: A good idea. But this can be done without 
> much (if any) change that is visible to users of gretl itself, so it's 
> probably not very pertinent to the "2.0" question.
>
> * Purge of bugs and update/completion of documentation: Here I can really get 
> on board. One conception of gretl 2.0 is that it has achieved a degree of 
> maturity where we have squashed as many bugs as we can find on an extended 
> period of testing, and have documented in a reasonably comprehensible and 
> cross-referenced form all that the program can do.

Agree.


Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche

r.lucchetti at univpm.it
http://www.econ.univpm.it/lucchetti

