[Gretl-users] correlation matrix

Allin Cottrell cottrell at wfu.edu
Fri Apr 20 16:38:33 EDT 2007


On Mon, 16 Apr 2007, ab.news wrote:

> In the correlation matrix window, in the presence of missing
> values, does the 5% critical value make any sense since the
> number of observations "n" is not the same for all pairwise
> correlations? I guess that, when some values are missing,
> we'll need either a 5% critical value for each correlation
> coefficient or a correlation matrix computed on a listwise
> basis, meaning that the same observations are used throughout
> the dataset.

This is quite a tricky issue.  Here's what I have come up with:

* We keep track of how many observations, n_ij, are used in 
calculating each correlation coefficient in the matrix.

* We then calculate the proportional difference between the 
maximum and minimum n_ij values.  If this is less than 0.1, we 
report the 5 percent critical value, using min(n_ij) to be on the 
conservative side.  If the difference is greater than 0.1, we 
don't show a critical value.

* The above is the default behaviour.  But at the command line you 
can give the "--uniform" flag (uniform sample size) to the corr 
command.  In this case gretl will determine the maximum sample for 
which all variables are observed, and calculate all the 
coefficients using that sample.  A critical value will be 
reported.  (Obviously, this option makes a difference to the 
output only if there are missing values, and they're not perfectly 
aligned across the chosen variables.)

* Note also that you can always get a critical value for a 
specific correlation by calling "corr" with only 2 arguments.
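
For readers who want to see the rule in the points above spelled out, here is 
a rough Python sketch.  It is an illustration only, not gretl's actual code. 
All function names are hypothetical, missing values are represented by None, 
and the "proportional difference" is assumed to be (max - min)/max, which the 
description above does not spell out.  The t critical value is supplied by 
the caller (e.g. from tables); the conversion to a critical correlation uses 
the standard relation between t and r with n - 2 degrees of freedom.

```python
from itertools import combinations
from math import sqrt


def pairwise_counts(columns):
    """Return {(i, j): n_ij}: rows on which columns i and j are both observed."""
    counts = {}
    for i, j in combinations(range(len(columns)), 2):
        counts[(i, j)] = sum(
            1
            for a, b in zip(columns[i], columns[j])
            if a is not None and b is not None
        )
    return counts


def sample_size_for_critical_value(counts, tolerance=0.1):
    """Return min(n_ij) if the n_ij are close enough to each other, else None.

    Assumption: the proportional difference is measured relative to the
    largest n_ij.  None means "do not report a critical value".
    """
    n_min = min(counts.values())
    n_max = max(counts.values())
    if n_max == 0:
        return None
    if (n_max - n_min) / n_max < tolerance:
        return n_min  # the conservative choice
    return None


def uniform_rows(columns):
    """Indices of rows on which every column is observed (listwise sample)."""
    return [
        t for t, row in enumerate(zip(*columns))
        if all(v is not None for v in row)
    ]


def critical_r(t_crit, n):
    """Critical |r| implied by a two-tailed t critical value with n - 2 df."""
    return t_crit / sqrt(t_crit ** 2 + n - 2)


if __name__ == "__main__":
    x = [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
    y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0, 12.0, 11.0]
    z = [3.0, 1.0, 2.0, 6.0, 4.0, 5.0, 9.0, 7.0, 8.0, 12.0, 10.0, None]

    counts = pairwise_counts([x, y, z])
    print("n_ij:", counts)
    print("n used for critical value:", sample_size_for_critical_value(counts))
    print("rows in uniform sample:", uniform_rows([x, y, z]))
```

In this made-up example the pairwise counts are 11, 10 and 11, so the 
proportional difference is 1/11 and a critical value based on n = 10 would be 
shown.  The uniform sample drops the two rows on which any variable is 
missing, leaving 10 observations for every coefficient.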

This is in CVS and the current Windows snapshot.

Allin.

