weight adaptation implementation

tiziano
Posts: 18
Joined: 19 Oct 2008, 10:22

Post by tiziano » 29 Nov 2008, 12:34

I have already created a topic about weight adaptation in the Signal Processing section of this forum, but I think this question is more closely related to the software implementation.

I should say up front that this is a fairly specific question about how weight adaptation was implemented in version 1.4 of BCI2000.
In particular, the piece of code I want to discuss is this:

Code:

                predicted= 0;

                for( i=0;i<nh;i++)       // apply filter
                {
                        predicted+=  wt_buf[chan][i] * elements[i];
                }

                predicted-= sig_mean[chan];         // include mean in model

                err= (float)target - predicted;

                for(i=0;i<nh;i++)         // update weights
                {
                        wt_buf[chan][i]+= elements[i] * err * rate;
                }

Since I am re-implementing this feature in BCI2000 2.0, I want to be sure I have understood how it worked in the past.
From past discussions, I gathered that the weights were updated using a gradient descent algorithm, specifically the delta rule.
From reading the old BCI2000 manual and looking at the source code, this is what I have understood:

- you compute the value of the linear equation at the current step

Code:

                for( i=0;i<nh;i++)       // apply filter
                {
                        predicted+=  wt_buf[chan][i] * elements[i];
                }
- you get the desired target value from the application (target)

- you compute the error as the difference between the target and the predicted value

Code:

                err= (float)target - predicted;
- you compute the new weights using the delta rule:

Code:

                for(i=0;i<nh;i++)         // update weights
                {
                        wt_buf[chan][i]+= elements[i] * err * rate;
                }
From the manual, I understand that "target" is expressed as an integer value with zero mean. So with two targets (left and right) we could have target values of -1 for left and 1 for right, but -3 and 3 would work as well.

Now, since the error is only meaningful if both values are on a comparable scale, "predicted" should be in the same range as "target".
My idea is that "predicted" should be normalized using the current gain and offset values from the normalizer. This way "predicted" is transformed into a signal with zero mean and unit variance.

In the code I pasted above, there is a normalization with respect to the mean at this line:

Code:

predicted-= sig_mean[chan];         // include mean in model
But I can't find any normalization with respect to the variance. Is there a reason for this?

And what about the feature value? Is the value of elements[i] the raw spectral amplitude coming from the source?

I'm asking because, when I debug my code, the feature takes values around 200-300, and it seems strange that a value of this size (even after being multiplied by the error and a small learning rate) is added to weights that are very small in absolute value.
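
To make the scale concern concrete, here is a short derivation of what one update does to the error on the same sample. It is a sketch under assumptions: $x_i$ stands for elements[i], $w_i$ for wt_buf[chan][i], $m$ for sig_mean[chan], $t$ for target, $e$ for err and $\eta$ for rate, and $m$ is held fixed during the step. The value 250 is just the middle of the 200-300 range mentioned above, with a single feature.

```latex
\begin{align*}
\hat{y}  &= \sum_i w_i x_i - m, \qquad e = t - \hat{y} \\
w_i'     &= w_i + \eta\, e\, x_i \\
\hat{y}' &= \sum_i w_i' x_i - m = \hat{y} + \eta\, e \sum_i x_i^2 \\
e'       &= t - \hat{y}' = e \left( 1 - \eta \sum_i x_i^2 \right)
\end{align*}
% The error on this sample shrinks only if 0 < \eta \sum_i x_i^2 < 2.
% Illustrative case, one feature with x = 250:
%   \sum_i x_i^2 = 62500, so \eta < 2 / 62500 = 3.2 \times 10^{-5}.
```

So with unscaled features of this size the learning rate has to be very small for a single step not to overshoot. This only describes one step re-evaluated on the same sample; it says nothing about behaviour over a whole stream of trials.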

I'm not sure I have explained my doubts clearly, or that you can answer such detailed questions about the code, but any suggestion would be much appreciated.

Thanks a lot for your help.

Tiziano

gschalk
Posts: 615
Joined: 28 Jan 2003, 12:37

weight adaptation ...

Post by gschalk » 02 Dec 2008, 08:11

Tiziano,

As I mentioned before, I did not write the adaptation code in V1.4. Thus, take my comments with a grain of salt.

There is not really a "normalization." The code just subtracts the signal mean to get rid of the bias in the linear model.

elements, I think, referred to the weighted features in the linear classifier. This is why you get feature values of 200-300, and not -1..1. In other words, you need to run something like linear regression on the features first to come up with an approximate initial weighting of features. Thus, the weighted sum of all features will put you in the neighborhood of whatever your target was, e.g., -1..1. The weights are subsequently changed according to the delta rule.

I hope this helps.

Gerv
