[Gretl-users] how to match observations with 'join' and gdt files

Allin Cottrell cottrell at wfu.edu
Thu Nov 19 15:16:35 EST 2015


On Thu, 19 Nov 2015, Sven Schreiber wrote:

> Am 19.11.2015 um 15:33 schrieb Riccardo (Jack) Lucchetti:
>> On Thu, 19 Nov 2015, Sven Schreiber wrote:
>>
>>> The sample and/or workfile ranges did not coincide, stopping the data
>>> import.
>>
>> What do you mean exactly? Non-overlapping samples? Different
>> periodicities? Both? Something else altogether?
>
> Here are some self-contained examples:
>
> <hansl>
> nulldata 100	# same example data to be joined
> setobs 4 2000:1
> series x = normal()
> store file1.gdt
>
> nulldata 40	# shorter sample
> setobs 4 2000:1
> join file1.gdt x	# doesn't work

A workaround is to add a suitable tkey column to the join source file:

<hansl>
nulldata 100
setobs 4 2000:1
series yq = $obsmajor + $obsminor/10
series x = normal()
store file1.gdt

nulldata 40     # shorter sample
setobs 4 2000:1
join file1.gdt x --tkey=yq,"%Y.%q"
</hansl>

However, pretty clearly this should not be required. We have an
implicit time-key on the left, and when joining from a gdt file with a 
time-series structure there should also be an implicit time-key on the 
right. This is just something that's not implemented, and whose 
absence hasn't been noticed till now.

> nulldata 100
> setobs 4 1990:1 # shifted but overlapping sample
> join file1.gdt x	# works but is WRONG
> </hansl>
>
> With respect to the last example I know you are going to say it's a
> feature not a bug, and the responsibility of the user.

Well, no, I wouldn't say that ;-) Once again, with two native 
time-series datasets the matching should be implicitly by time and 
should "just work" (despite what the doc says about joining from 
non-native plain text data files). Or at least, it should "just work" 
(with no options required) if the two datasets are of the same 
frequency.
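
Until that is implemented, the tkey workaround shown earlier should 
presumably also cover the shifted-sample case. A minimal sketch, 
assuming file1.gdt was stored with the extra yq series as in my script 
above; I'd expect x to be filled only where the two ranges overlap 
(2000:1 onward) and to be NA before that:

<hansl>
nulldata 100
setobs 4 1990:1   # shifted but overlapping sample
# assumes file1.gdt contains the yq key series created earlier
join file1.gdt x --tkey=yq,"%Y.%q"
print x --byobs   # inspect the alignment by eye
</hansl>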

Allin

