# Summary
📗 Office hours: 5:30 to 8:30 Wednesdays (Dune) and Thursdays (Zoom Link)
📗 Personal meeting room: always open, Zoom Link
📗 Quiz (log in with your wisc ID, without "@wisc.edu"): Socrative Link. Regrade request form: Google Form (select Q2).
📗 Math Homework: M1, M2
📗 Programming Homework: P1
📗 Examples, Quizzes, Discussions: Q2
# Lectures
📗 Slides (before lecture, usually updated on Saturday):
Blank Slides: Part 1, Part 2
Blank Slides (with blank pages for quiz questions): Part 1, Part 2
📗 Slides (after lecture, usually updated on Tuesday):
Blank Slides with Quiz Questions: Part 1, Part 2
Annotated Slides: Part 1, Part 2
📗 My handwriting is really bad, so you should take your own notes from the lecture videos instead of using these.
📗 Notes: Image by Vishal Arora via Medium
# Other Materials
📗 Pre-recorded Videos from 2020
Part 1 (Neural Network): Link
Part 2 (Backpropagation): Link
Part 3 (Multi-Layer Network): Link
Part 4 (Stochastic Gradient): Link
Part 5 (Multi-Class Classification): Link
Part 6 (Regularization): Link
📗 Relevant websites
Neural Network: Link
Another Neural Network Demo: Link
Neural Network Videos by Grant Sanderson: Playlist
MNIST Neural Network Visualization: Link
Neural Network Simulator: Link
Overfitting: Link
Neural Network Snake: Video
Neural Network Car: Video
Neural Network Flappy Bird: Video
Neural Network Mario: Video
MyScript: algorithm Link, demo Link
Maple Calculator: Link
📗 YouTube videos from 2019 to 2021
How to construct an XOR network? Link
How to derive the two-layer neural network gradient descent step? Link
How to derive the multi-layer neural network gradient descent induction step? Link
Comparison between L1 and L2 regularization: Link
Example (Quiz): Cross validation accuracy: Link
# Keywords and Notations
📗 Neural Network:
Neural network classifier for a two layer network with logistic activation:
$a_{ij}^{(1)} = g\left(\sum_{i'=1}^{m} w_{i'j}^{(1)} x_{ii'} + b_{j}^{(1)}\right)$, where $g(z) = \dfrac{1}{1 + e^{-z}}$ is the logistic activation, $m$ is the number of features (or input units), $w_{i'j}^{(1)}$ is the layer 1 weight from input unit $i'$ to hidden layer unit $j$, $b_{j}^{(1)}$ is the bias for hidden layer unit $j$, $a_{ij}^{(1)}$ is the layer 1 activation of instance $i$ hidden unit $j$.
$a_{i}^{(2)} = g\left(\sum_{j=1}^{h} w_{j}^{(2)} a_{ij}^{(1)} + b^{(2)}\right)$, where $h$ is the number of hidden units, $w_{j}^{(2)}$ is the layer 2 weight from hidden layer unit $j$, $b^{(2)}$ is the bias for the output unit, $a_{i}^{(2)}$ is the layer 2 activation of instance $i$.
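As a quick sanity check, the forward pass can be written in a few lines of NumPy. This is a minimal sketch, assuming a single instance, one logistic output unit, and made-up toy shapes and numbers; it is not the course's own code.
```python
import numpy as np

def g(z):
    # logistic (sigmoid) activation
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, w2, b2):
    # x:  (m,) features of one instance
    # W1: (m, h) layer 1 weights, b1: (h,) layer 1 biases
    # w2: (h,) layer 2 weights, b2: scalar layer 2 bias
    a1 = g(x @ W1 + b1)   # hidden layer activations a^(1)
    a2 = g(a1 @ w2 + b2)  # output activation a^(2)
    return a1, a2

# toy example with m = 3 features and h = 2 hidden units
rng = np.random.default_rng(0)
x = rng.random(3)
W1, b1 = rng.random((3, 2)), rng.random(2)
w2, b2 = rng.random(2), rng.random()
print(forward(x, W1, b1, w2, b2))
```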
Stochastic gradient descent step for a two layer network with squared loss $C_{i} = \dfrac{1}{2}\left(a_{i}^{(2)} - y_{i}\right)^{2}$ and logistic activation, with learning rate $\alpha$ and label $y_{i}$ of instance $i$:
$w_{i'j}^{(1)} \leftarrow w_{i'j}^{(1)} - \alpha \left(a_{i}^{(2)} - y_{i}\right) a_{i}^{(2)} \left(1 - a_{i}^{(2)}\right) w_{j}^{(2)} a_{ij}^{(1)} \left(1 - a_{ij}^{(1)}\right) x_{ii'}$.
$b_{j}^{(1)} \leftarrow b_{j}^{(1)} - \alpha \left(a_{i}^{(2)} - y_{i}\right) a_{i}^{(2)} \left(1 - a_{i}^{(2)}\right) w_{j}^{(2)} a_{ij}^{(1)} \left(1 - a_{ij}^{(1)}\right)$.
$w_{j}^{(2)} \leftarrow w_{j}^{(2)} - \alpha \left(a_{i}^{(2)} - y_{i}\right) a_{i}^{(2)} \left(1 - a_{i}^{(2)}\right) a_{ij}^{(1)}$.
$b^{(2)} \leftarrow b^{(2)} - \alpha \left(a_{i}^{(2)} - y_{i}\right) a_{i}^{(2)} \left(1 - a_{i}^{(2)}\right)$.
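The four updates can be implemented directly. The sketch below mirrors the notation above under the same assumptions as the forward-pass sketch (one instance, one output unit); it is an illustration, not official homework code.
```python
import numpy as np

def g(z):
    # logistic activation
    return 1.0 / (1.0 + np.exp(-z))

def sgd_step(x, y, W1, b1, w2, b2, alpha=0.1):
    # one SGD step on a single instance (x, y) for squared loss
    # C = 0.5 * (a2 - y)^2 with logistic activations in both layers
    a1 = g(x @ W1 + b1)                # hidden activations, shape (h,)
    a2 = g(a1 @ w2 + b2)               # output activation, scalar
    d2 = (a2 - y) * a2 * (1 - a2)      # shared output error term
    d1 = d2 * w2 * a1 * (1 - a1)       # hidden error terms, shape (h,)
    W1 = W1 - alpha * np.outer(x, d1)  # matches the w^(1) update above
    b1 = b1 - alpha * d1               # matches the b^(1) update
    w2 = w2 - alpha * d2 * a1          # matches the w^(2) update
    b2 = b2 - alpha * d2               # matches the b^(2) update
    return W1, b1, w2, b2
```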
📗 Multiple Classes:
Softmax activation for one layer networks: $a_{ik} = \dfrac{e^{w_{k}^{\top} x_{i} + b_{k}}}{\sum_{k'=1}^{K} e^{w_{k'}^{\top} x_{i} + b_{k'}}}$, where $K$ is the number of classes (number of possible labels), $a_{ik}$ is the activation of the output unit $k$ for instance $i$, $y_{ik}$ is component $k$ of the one-hot encoding of the label for instance $i$.
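A minimal sketch of the softmax computation; subtracting the maximum score is a standard numerical-stability trick and does not change the result.
```python
import numpy as np

def softmax(z):
    # a_k = exp(z_k) / sum_{k'} exp(z_{k'})
    # shifting by max(z) avoids overflow, leaves the output unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])  # toy scores for K = 3 classes
print(softmax(z))              # activations are positive and sum to 1
```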
📗 Regularization:
L1 regularization (squared loss): $C = \dfrac{1}{2} \sum_{i=1}^{n} \left(a_{i} - y_{i}\right)^{2} + \lambda \sum_{j} \left|w_{j}\right|$, where $\lambda$ is the regularization parameter.
L2 regularization (squared loss): $C = \dfrac{1}{2} \sum_{i=1}^{n} \left(a_{i} - y_{i}\right)^{2} + \dfrac{\lambda}{2} \sum_{j} w_{j}^{2}$.
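A short sketch of the two penalty terms added to a squared loss; the linear one layer activation and the function name `regularized_cost` are assumptions made to keep the example small.
```python
import numpy as np

def regularized_cost(w, b, X, y, lam, reg="l2"):
    # squared loss with an L1 or L2 penalty on the weights;
    # lam is the regularization parameter (lambda above)
    a = X @ w + b                                # linear activations (assumption)
    loss = 0.5 * np.sum((a - y) ** 2)
    if reg == "l1":
        return loss + lam * np.sum(np.abs(w))    # lambda * sum |w|
    return loss + 0.5 * lam * np.sum(w ** 2)     # (lambda / 2) * sum w^2
```
In practice, L1 tends to drive some weights exactly to zero while L2 only shrinks them toward zero; see the comparison video linked above.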
Last Updated: April 09, 2025 at 11:28 PM