Lecture 10 CSCI 784 - Back Propagation
Feb 12
- Background:
- Correlation Matrix Computation Example
- On the HW problem you were given orthonormal vectors
- If the vectors are not orthonormal, recall from the correlation matrix
will contain some error
- Gram-Schmidt can be used to convert a set of linearly independent vectors into an orthonormal set
- normalize by dividing by the norm of the vector
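The effect of orthonormal versus non-orthonormal vectors can be sketched numerically. The block below is a minimal illustration, not the homework solution: the function names `gram_schmidt` and `correlation_memory`, the sample vectors, and the outer-product form of the correlation matrix (sum of y x^T over stored pairs) are assumptions chosen for the example. Note that after Gram-Schmidt the memory stores the orthonormalized vectors, which differ from the originals.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent row vectors (illustrative helper)."""
    basis = []
    for v in vectors:
        w = v.astype(float)
        for q in basis:
            # Remove the component of w that lies along each earlier basis vector.
            w = w - np.dot(q, w) * q
        # Normalize by dividing by the norm of the vector.
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

def correlation_memory(keys, targets):
    """Assumed form: M = sum over pairs of outer(target, key)."""
    return sum(np.outer(y, x) for x, y in zip(keys, targets))

# Hypothetical key vectors that are NOT orthonormal, and their stored responses.
keys = np.array([[1.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0]])
targets = np.array([[1.0, 0.0],
                    [0.0, 1.0]])

# Recall with the raw keys is contaminated by the other stored pair.
M_raw = correlation_memory(keys, targets)
recall_raw = M_raw @ keys[0]

# Recall with orthonormalized keys reproduces the stored response.
ortho_keys = gram_schmidt(keys)
M_ortho = correlation_memory(ortho_keys, targets)
recall_ortho = M_ortho @ ortho_keys[0]

print("raw recall:  ", recall_raw)
print("ortho recall:", recall_ortho)
```

With the raw vectors the recalled response differs from the stored one; with the orthonormalized vectors it matches.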
- Chain rule
- Example: Logistic function; phi(v) = 1/(1 + exp(-av))
- simplify to show phi'(v) = a(1-phi(v))phi(v)
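The derivative identity for the logistic function can be checked numerically. This is a small sketch: the helper names `phi`, `phi_prime`, and `numeric_derivative`, the slope value a = 2, and the sample points are all chosen for illustration only.

```python
import math

def phi(v, a=1.0):
    """Logistic function phi(v) = 1 / (1 + exp(-a v))."""
    return 1.0 / (1.0 + math.exp(-a * v))

def phi_prime(v, a=1.0):
    """Closed form from the notes: phi'(v) = a (1 - phi(v)) phi(v)."""
    p = phi(v, a)
    return a * (1.0 - p) * p

def numeric_derivative(f, v, h=1e-6):
    """Central-difference approximation to f'(v)."""
    return (f(v + h) - f(v - h)) / (2.0 * h)

a = 2.0  # illustrative slope parameter
for v in (-1.5, 0.0, 0.7):
    analytic = phi_prime(v, a)
    numeric = numeric_derivative(lambda x: phi(x, a), v)
    print(f"v={v:5.2f}  analytic={analytic:.8f}  numeric={numeric:.8f}")
```

The two columns should agree to several decimal places, which is a quick sanity check on the chain-rule simplification.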
- Multilayer Perceptrons
- Notational conventions (page 142)
- neuron i -> neuron j -> neuron k
- n = the nth training pattern
- ej(n) = dj(n) - yj(n) if neuron j is in
the output layer
- E(n) - instantaneous sum of squared errors = ...
- Eav - average sum of squared errors = ...
- yi = phi(vi)
- vi = ...
Note that yi appears where you would normally expect xi:
the input to the layer containing neuron j is the output of the layer
containing neuron i.
- Recall
- wj0 = thetaj
- xj0 = -1
- phi is the same throughout the network
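The conventions above can be sketched as a forward pass through a single layer. The notes leave the formula for the net input elided, so the weighted-sum form used here (net input = sum over i of wji * yi, with the threshold stored as weight 0 on a fixed input of -1) is an assumption, as are the names `logistic` and `layer_forward` and all numeric values.

```python
import numpy as np

def logistic(v, a=1.0):
    """Logistic activation, the same phi for every neuron."""
    return 1.0 / (1.0 + np.exp(-a * v))

def layer_forward(W, y_prev, a=1.0):
    """Forward pass for one layer.

    W[j, 0] holds the threshold theta_j; it multiplies a fixed input of -1.
    W[j, 1:] holds the weights w_ji on the previous layer's outputs y_i.
    """
    y_aug = np.concatenate(([-1.0], y_prev))  # prepend the fixed input -1
    v = W @ y_aug                             # assumed net input: weighted sum
    return v, logistic(v, a)

# Hypothetical weights for 2 neurons fed by 2 outputs of the previous layer.
W = np.array([[0.5, 1.0, -1.0],
              [-0.5, 0.0, 2.0]])
y_i = np.array([1.0, 0.5])   # outputs of the previous layer (neuron i)
d_j = np.array([1.0, 0.0])   # desired responses, available for output neurons

v_j, y_j = layer_forward(W, y_i)
e_j = d_j - y_j              # error signal ej(n) = dj(n) - yj(n)

print("v_j =", v_j)
print("y_j =", y_j)
print("e_j =", e_j)
```

Folding the threshold into the weight matrix means the update rule can treat it like any other weight.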
- Delta Rule
- delta-wji = - eta * (gradient of E(n) wrt wji)
- Use the chain rule to calculate the gradient of the instantaneous error sum E(n) wrt the weights
- local gradient: deltaj
- Updating weights
- Case 1 Neuron j is in the output layer
- use definition to calculate ej(n)
- simple to update
- Case 2 Neuron j is a hidden neuron
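- ej(n) cannot be computed directly because a hidden neuron
has no desired response dj(n)
- Calculate the local gradient from the local gradients of the neurons k in the next layer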
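The two cases can be sketched for a network with one hidden layer. The notes do not write out the formulas, so this block assumes the usual textbook forms: E(n) = 1/2 * sum of ej(n)^2, output local gradient deltaj = ej * phi'(vj), hidden local gradient deltaj = phi'(vj) * sum over k of deltak * wkj, and update delta-wji = eta * deltaj * yi. The function names, the logistic slope a = 1, the weights, and the training pattern are hypothetical.

```python
import numpy as np

def logistic(v):
    """Logistic activation with slope a = 1 (assumed)."""
    return 1.0 / (1.0 + np.exp(-v))

def augment(y):
    """Prepend the fixed input -1 that multiplies the threshold weight."""
    return np.concatenate(([-1.0], y))

def forward(W_hid, W_out, x):
    y_in = augment(x)
    y_hid = logistic(W_hid @ y_in)
    y_hid_aug = augment(y_hid)
    y_out = logistic(W_out @ y_hid_aug)
    return y_in, y_hid, y_hid_aug, y_out

def backprop_step(W_hid, W_out, x, d, eta=0.5):
    """One delta-rule update for a single training pattern."""
    y_in, y_hid, y_hid_aug, y_out = forward(W_hid, W_out, x)

    # Case 1: output neurons, where the error follows from the definition.
    e = d - y_out
    delta_out = e * y_out * (1.0 - y_out)

    # Case 2: hidden neurons, where the local gradient is back-propagated
    # through the outgoing weights (threshold column excluded).
    delta_hid = y_hid * (1.0 - y_hid) * (W_out[:, 1:].T @ delta_out)

    new_W_out = W_out + eta * np.outer(delta_out, y_hid_aug)
    new_W_hid = W_hid + eta * np.outer(delta_hid, y_in)
    return new_W_hid, new_W_out, 0.5 * np.sum(e ** 2)

# Hypothetical 2-input, 2-hidden, 1-output network and one training pattern.
W_hid = np.array([[0.1, 0.2, -0.1],
                  [-0.2, 0.1, 0.3]])
W_out = np.array([[0.05, -0.1, 0.2]])
x = np.array([1.0, 0.0])
d = np.array([1.0])
eta = 0.5

Wh, Wo = W_hid.copy(), W_out.copy()
errors = []
for _ in range(200):
    Wh, Wo, err = backprop_step(Wh, Wo, x, d, eta)
    errors.append(err)

print("first error:", errors[0])
print("last error: ", errors[-1])
```

Repeating the update on the same pattern should drive the instantaneous error down, which gives a simple check that the signs in the two cases are consistent.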
[Handouts:]
[Readings:] 6.1-6.4
[Assignment:] HW5 on the web
URL = http://www.cs.sc.edu/~matthews/Courses/784/Lectures/lec10.html