I have been using the weight update function from Russell and Norvig's AI book:
W_j ← W_j + α × Err × g'(in) × x_j

where g' is the derivative of the activation function (I think it can sometimes be ignored, e.g. for a hard threshold unit?), α is the learning rate that scales the size of each change, x_j is the j-th input, and Err is the difference between the expected output and the output produced by the current weights.
This is for updating a single node connected to all the inputs, producing one output. It converges to correct weights if the function it is learning is linearly separable.
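For concreteness, here is a minimal sketch of that update in Python, assuming a sigmoid activation for g (so g'(in) = g(in)(1 − g(in))); the function and variable names are just illustrative:

    import math

    def g(x):
        """Sigmoid activation (an assumption; the book's rule works for any differentiable g)."""
        return 1.0 / (1.0 + math.exp(-x))

    def update_weights(weights, inputs, expected, alpha=0.1):
        """One step of W_j <- W_j + alpha * Err * g'(in) * x_j."""
        in_ = sum(w * x for w, x in zip(weights, inputs))  # weighted sum of inputs
        out = g(in_)
        err = expected - out                 # Err = expected output - actual output
        g_prime = out * (1.0 - out)          # derivative of the sigmoid at in_
        return [w + alpha * err * g_prime * x
                for w, x in zip(weights, inputs)]
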
If I have many feed-forward nodes, how do I update the weights for all of them?