CS 540 Lecture Notes, Fall 1996
For a linear threshold (step function) unit, the activation is

a = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i \ge t \\ 0, & \text{otherwise} \end{cases}
For a sigmoid unit, the activation is

a = \frac{1}{1 + e^{-\sum_{i=1}^{n} w_i x_i}}
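As a concrete illustration (this sketch is not part of the original notes, and the function names are mine), here are both unit types in Python:

```python
import math

def step_unit(weights, inputs, t):
    # Linear threshold unit: fires (outputs 1) when the weighted
    # sum of its inputs reaches the threshold t.
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= t else 0

def sigmoid_unit(weights, inputs):
    # Sigmoid unit: squashes the weighted sum smoothly into (0, 1).
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

print(step_unit([0.1, 0.5], [1, 1], 0.8))          # 0.6 < 0.8, so 0
print(round(sigmoid_unit([0.1, 0.5], [1, 1]), 3))  # 1/(1+e^-0.6), about 0.646
```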
So, from here on, "learning the weights" in a neural net will mean learning both the weights and the threshold values; the threshold can be treated as just another weight attached to an extra input that is fixed at -1.
Note: Each pass through all of the training examples is called one epoch.
After each example is presented, each weight is updated by the Perceptron Learning Rule:

w_i \leftarrow w_i + \alpha (T - O) x_i

where x_i is the input associated with the ith input unit, T is the teacher (desired) output, O is the actual output of the unit, and alpha is a constant between 0 and 1 called the learning rate.
Notes about this update formula:

- If the output is correct (T = O), no weights are changed.
- If O = 0 but T = 1, the weights on the active (non-zero) inputs are increased and the threshold is decreased, making the unit more likely to fire.
- If O = 1 but T = 0, the opposite adjustments are made.
The result of executing the learning algorithm for 3 epochs on the two-input OR function, with learning rate alpha = 0.2 and initial weights w1 = .1, w2 = .5, t = .8:
x1 | x2 | T | O | delta_w1 | w1 | delta_w2 | w2 | delta_w3 | w3 (=t) |
---|---|---|---|---|---|---|---|---|---|
- | - | - | - | - | .1 | - | .5 | - | .8 |
0 | 0 | 0 | 0 | 0 | .1 | 0 | .5 | 0 | .8 |
0 | 1 | 1 | 0 | 0 | .1 | .2 | .7 | -.2 | .6 |
1 | 0 | 1 | 0 | .2 | .3 | 0 | .7 | -.2 | .4 |
1 | 1 | 1 | 1 | 0 | .3 | 0 | .7 | 0 | .4 |
0 | 0 | 0 | 0 | 0 | .3 | 0 | .7 | 0 | .4 |
0 | 1 | 1 | 1 | 0 | .3 | 0 | .7 | 0 | .4 |
1 | 0 | 1 | 0 | .2 | .5 | 0 | .7 | -.2 | .2 |
1 | 1 | 1 | 1 | 0 | .5 | 0 | .7 | 0 | .2 |
0 | 0 | 0 | 0 | 0 | .5 | 0 | .7 | 0 | .2 |
0 | 1 | 1 | 1 | 0 | .5 | 0 | .7 | 0 | .2 |
1 | 0 | 1 | 1 | 0 | .5 | 0 | .7 | 0 | .2 |
1 | 1 | 1 | 1 | 0 | .5 | 0 | .7 | 0 | .2 |
So, the final learned network computes OR using weights w1 = .5 and w2 = .7 with threshold t = .2.
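For concreteness, here is a short Python sketch (mine, not part of the original notes) that reproduces the trace above, treating the threshold t as a third weight on a constant input of -1 and using the learning rate alpha = 0.2 implied by the table:

```python
# Perceptron learning of the OR function, reproducing the table above.
ALPHA = 0.2
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, t = 0.1, 0.5, 0.8   # initial weights and threshold

for epoch in range(3):
    for (x1, x2), T in examples:
        O = 1 if w1 * x1 + w2 * x2 >= t else 0   # current output
        w1 += ALPHA * (T - O) * x1
        w2 += ALPHA * (T - O) * x2
        t  -= ALPHA * (T - O)    # the threshold's input is fixed at -1
        print(x1, x2, T, O, round(w1, 1), round(w2, 1), round(t, 1))

# Final values: w1 = 0.5, w2 = 0.7, t = 0.2 (the last row of the table).
```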
The mean squared error of the network over the training set is

E = \frac{1}{m} \sum_{i=1}^{m} (T_i - O_i)^2

where T_i is the teacher value for the ith example, O_i is the network output value for the ith example, and there are m examples in the training set.
Now consider the (n+1)-dimensional space where n dimensions are the weights and the last dimension is the error, E. The error is non-negative and defines a surface over this weight space.
So, the goal is to search for the point in weight space with (global) minimum mean squared error E. Gradient descent does this by computing, at the current point in weight space, the partial derivative of E with respect to each weight, and then changing the ith weight by

\Delta w_i = -\alpha \frac{\partial E}{\partial w_i}

so that each weight moves downhill on the error surface.
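The following Python sketch (my illustration; E here stands for any function computing the error from a weight vector) shows one gradient descent step using a finite-difference estimate of each partial derivative:

```python
def gradient_descent_step(E, w, alpha=0.2, h=1e-5):
    # Estimate dE/dw_i by a finite difference, then move each weight
    # a small step downhill: w_i <- w_i - alpha * dE/dw_i.
    E0 = E(w)
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_plus[i] += h
        grad.append((E(w_plus) - E0) / h)
    return [wi - alpha * gi for wi, gi in zip(w, grad)]

# Example: E(w) = (w0 - 1)^2 + (w1 + 2)^2 has its minimum at (1, -2).
w = [0.0, 0.0]
for _ in range(100):
    w = gradient_descent_step(lambda v: (v[0] - 1)**2 + (v[1] + 2)**2, w)
print([round(wi, 3) for wi in w])   # close to [1.0, -2.0]
```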
For a sigmoid output unit, the weight update (delta) rule is

\Delta w_{ji} = \alpha \, a_j \, (T_i - O_i) \, g'(in_i)

where weight w_{ji} connects hidden unit j to output unit i, alpha is the learning rate parameter, T_i is the teacher output associated with output unit i, O_i is the actual output of output unit i, a_j is the output of hidden unit j, in_i is the weighted-sum input to output unit i, and g' is the derivative of the sigmoid activation function, which is known to be g' = g(1-g).
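As a sketch (mine, not from the original notes), the update for the weights into one sigmoid output unit can be written in Python as:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def update_output_weights(w, a, T, alpha):
    # w[j]: weight from hidden unit j to this output unit
    # a[j]: output of hidden unit j; T: teacher value for this unit
    in_i = sum(wj * aj for wj, aj in zip(w, a))   # weighted-sum input
    O = sigmoid(in_i)                             # actual output O_i
    g_prime = O * (1.0 - O)                       # g' = g(1 - g)
    return [wj + alpha * aj * (T - O) * g_prime for wj, aj in zip(w, a)]
```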
To solve both of these problems, ALVINN takes each input image and computes other views of the road by performing various perspective transformations (shift, rotate, and fill in missing pixels) so as to simulate what the vehicle would see if its position and orientation on the road were not correct. For each of these synthesized views of the road, a "correct" steering direction is approximated. The real and the synthesized images are then used for training the network.
To avoid overfitting to just the most recently captured images, ALVINN maintains a buffer pool of 200 images (both real and synthetic). When a new image is obtained, it replaces one of the images in the buffer pool, chosen so that the average steering direction over all 200 examples remains straight ahead. In this way, the buffer pool always retains images associated with many different steering directions, as sketched below.
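The notes do not give the replacement rule itself; the following Python sketch is one plausible reading (entirely my assumption): evict whichever buffered example leaves the pool's average steering direction closest to straight ahead (encoded as 0) once the new example is added.

```python
def replace_in_pool(pool, new_example):
    # pool: list of 200 (image, steering) pairs; steering 0.0 = straight ahead.
    # Assumed policy: after adding new_example, evict the old example whose
    # removal leaves the pool's total (hence average) steering closest to 0.
    total = sum(steer for _, steer in pool) + new_example[1]
    k = min(range(len(pool)), key=lambda i: abs(total - pool[i][1]))
    pool[k] = new_example
    return pool
```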
Initially, a human driver controls the vehicle for about 5 minutes while the network learns its weights, starting from random initial values. After that, one epoch of training using the 200 examples in the buffer pool is performed approximately every 2 seconds.
Last modified December 5, 1996
Copyright © 1996 by Charles R. Dyer. All rights reserved.