[Gretl-users] data mining with rolling regression and restricted coefficients

Charles Ward cwrward at gmail.com
Sun Sep 2 11:23:33 EDT 2012

One trick for this type of problem is to transform the regression.
Suppose we want to regress y on x1, x2 and x3 and constrain the slope
parameters to sum to 1.
Subtract x1 from y and from each of the other regressors, giving y-x1,
x2-x1 and x3-x1.
Then regress y-x1 on the two transformed regressors (no constraints needed
unless you also want all the parameters to be positive):
y-x1 = A + B(x2-x1) + C(x3-x1) + e
Rearranging the result gives
y = A + (1-B-C)x1 + Bx2 + Cx3 + e
The coefficients on x1, x2 and x3 are (1-B-C), B and C, so they sum to 1.
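
A minimal hansl sketch of the transformation, run on artificial data. The
series names (x1, x2, x3, y, ys, d2, d3), the sample size and the simulated
coefficients are assumptions made for illustration only.

<hansl>
# Sketch only: artificial data with hypothetical series names
nulldata 200
set seed 20120902
series x1 = normal()
series x2 = normal()
series x3 = normal()
# simulated slopes 0.2, 0.3 and 0.5 sum to one
series y = 0.5 + 0.2*x1 + 0.3*x2 + 0.5*x3 + 0.1*normal()

# subtract x1 from the dependent variable and from the other regressors
series ys = y - x1
series d2 = x2 - x1
series d3 = x3 - x1

# plain OLS on the transformed variables, no restrict block
ols ys const d2 d3 --quiet
scalar b2 = $coeff[2]
scalar b3 = $coeff[3]
# the coefficient on x1 is recovered from the sum-to-one restriction
scalar b1 = 1 - b2 - b3
printf "x1: %g  x2: %g  x3: %g  sum: %g\n", b1, b2, b3, b1 + b2 + b3
</hansl>

The sum printed at the end is 1 by construction, since b1 is defined as
1 - b2 - b3.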
The one complication in your case is that this is easiest to program when
one variable (x1 above) is always present on the RHS of the regression.
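
For the rolling case, a sketch along the same lines, again on artificial
data and assuming x1 stays on the RHS in every window. The window length of
36 follows the script quoted below; the matrix name Coef, the list name D and
the simulated series are hypothetical. It does not attempt the omit --auto
step, so every window yields the same number of columns.

<hansl>
# Sketch only: artificial data with hypothetical series names
nulldata 60
set seed 20120902
series x1 = normal()
series x2 = normal()
series x3 = normal()
series y = 0.5 + 0.2*x1 + 0.3*x2 + 0.5*x3 + 0.1*normal()

# transformed variables: x1 is subtracted from y and from the other regressors
series ys = y - x1
series d2 = x2 - x1
series d3 = x3 - x1
list D = d2 d3

scalar k = nelem(D)
# number of 36-observation windows in the full sample
scalar nwin = $nobs - 36 + 1
# one row per window; columns: const, x1, x2, x3
matrix Coef = zeros(nwin, k + 2)

smpl 1 36
loop i=1..nwin
    ols ys const D --quiet
    matrix b = $coeff
    # coefficient on x1 implied by the sum-to-one restriction
    scalar b1 = 1 - sumc(b[2:k+1])
    Coef[i,] = b[1] ~ b1 ~ b[2:k+1]'
    # move the window forward, except after the last one
    if i < nwin
        smpl +1 +1
    endif
endloop
smpl full
print Coef
</hansl>

Preallocating Coef with a fixed number of columns avoids the conformability
problem that arises when rows of different lengths are stacked.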

Charles Ward


On 29 August 2012 14:38, Jan Tille <Jan.Tille at absolut-research.de> wrote:

> Dear gretl users,
> first of all, thank you for the solutions you have already provided on
> other topics. Unfortunately, I need your help again.
> The problem I am now trying to solve is the following.
> Basically, I want to set up a rolling regression with parameter
> restrictions (all parameters except the constant shall sum to one) and
> store the coefficient estimates. So far this poses no problem:
> <hansl>
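> # coefficient vectors are stacked row by row in C
> matrix C = {}
> list indep = indep1..indep10
> smpl 1 36
> loop i=1..360
>         ols dep const indep
>         # slopes b[2]..b[11] (the 10 regressors after the constant) sum to one
>         restrict
>                 b[2]+b[3]+b[4]+b[5]+b[6]+b[7]+b[8]+b[9]+b[10]+b[11]=1
>         end restrict
>         matrix c = $coeff'
>         C = C|c
>         # shift the 36-observation window forward by one
>         smpl +1 +1
> endloop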
> </hansl>
> But as you can see, I have lots of regressors and not all might be
> significant, or the significance may change depending on the time window.
> I know that I can use the omit --auto command to keep only the significant
> regressors, but this is where the problems start:
> 1.) Assume that during the first window there are 3 significant
> coefficients, so that the matrix will have 3 columns. If during the next
> time window there are, say, 4 significant coefficients, then the script
> breaks down (the matrices do not conform). Therefore, I guess I
> have to reshape the matrix somehow to allow for the new column.
> 2.) Assume that during the first window there are 3 significant
> coefficients (2, 3, 4) and during the next time window there are 3
> different significant coefficients (6, 7, 8). Then the dimension of the
> matrix would be correct, but interpreting the matrix of coefficients
> afterwards in a time series context would not make much sense.
> To summarize 1.) and 2.), I need a matrix with 10 columns in which each
> entry is the coefficient if it is significant and "NA" otherwise, so that
> one can obtain the time series of significant regressors.
> Date    indep1    indep2    ...    indep10
> 1       0,8       0,1              0,1
> 2       0,6       NA               NA
> 3       0,75      NA               0,05
> The third issue arises with the parameter restriction. After omitting
> insignificant variables, the restriction that the coefficients sum to one
> should still apply.
> Unfortunately there seems to be no simple shortcut for the restriction
> (for example restrict sum(coeff(2..n)), with n being the last
> significant coefficient).
> As I don't know ex ante which parameters will be significant, I somehow
> have to readjust the restriction dynamically. Is there a way to do
> this?
> Thanks in advance for your time answering my questions.
> Kind regards,
> Jan