
# Summary

📗 Monday lecture: 5:30 to 8:30, Zoom Link
📗 Office hours: 5:30 to 8:30 Wednesdays (Dune) and Thursdays (Zoom Link)
📗 Personal meeting room: always open, Zoom Link
📗 Quiz (log in with your wisc ID, without "@wisc.edu"): Socrative Link; Regrade request form: Google Form (select Q4).
📗 Math Homework:
M4,
📗 Programming Homework:
P2,
📗 Examples, Quizzes, Discussions:
Q4,

# Lectures

📗 Slides (before lecture, usually updated on Saturday):
Blank Slides: Part 1, Part 2,
Blank Slides (with blank pages for quiz questions): Part 1, Part 2,
📗 Slides (after lecture, usually updated on Tuesday):
Blank Slides with Quiz Questions: Part 1, Part 2,
Annotated Slides: Part 1, Part 2,
📗 My handwriting is really bad; you should copy down your notes from the lecture videos instead of relying on these.

📗 Notes
[Cat image via me.me]
N/A

# Other Materials

📗 Pre-recorded Videos from 2020
Part 1 (Generative Models): Link
Part 2 (Natural Language): Link
Part 3 (Sampling): Link
Part 4 (Probability Distribution): Link
Part 5 (Bayesian Network): Link
Part 6 (Network Structure): Link
Part 7 (Naive Bayes): Link

📗 Relevant websites
Zipf's Law: Link
Markov Chain: Link
Google N-Gram: Link

Simple Bayes Net: Link, Link 2
ABNMS: Link, Pathfinder: Link


📗 YouTube videos from 2019 to 2021
How to find the HOG features? Link
How to count the number of weights for training for a convolutional neural network (LeNet)? Link
Example (Quiz): How to find the 2D convolution between two matrices? Link
Example (Homework): How to find a discrete approximate Gaussian filter? Link



# Keywords and Notations

📗 K-Nearest Neighbor:
Distance: (Euclidean) $\rho(x, x') = \|x - x'\|_2 = \sqrt{\sum_{j=1}^{m} \left(x_j - x'_j\right)^2}$, (Manhattan) $\rho(x, x') = \|x - x'\|_1 = \sum_{j=1}^{m} \left|x_j - x'_j\right|$, where $x, x'$ are two instances.
K-Nearest Neighbor classifier: $\hat{y}_i = \text{mode}\left\{y_{(1)}, y_{(2)}, ..., y_{(k)}\right\}$, where mode is the majority label and $y_{(t)}$ is the label of the $t$-th closest instance to instance $i$ from the training set (see the code sketch below).
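
📗 The snippet below is a minimal sketch of the two distance functions and the k-nearest-neighbor prediction rule above, assuming plain Python lists/tuples as instances; the function names (`euclidean`, `manhattan`, `knn_predict`) and the toy data are illustrative only, not the homework code.
```python
import math
from collections import Counter

def euclidean(x, xp):
    # rho(x, x') = sqrt(sum_j (x_j - x'_j)^2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xp)))

def manhattan(x, xp):
    # rho(x, x') = sum_j |x_j - x'_j|
    return sum(abs(a - b) for a, b in zip(x, xp))

def knn_predict(train_x, train_y, x_new, k, dist=euclidean):
    # indices of the k training instances closest to x_new
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], x_new))[:k]
    # mode = majority label among the k nearest neighbors
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# toy usage: two of the three nearest neighbors to (4, 4) have label 1
train_x = [(0, 0), (1, 1), (5, 5), (6, 5)]
train_y = [0, 0, 1, 1]
print(knn_predict(train_x, train_y, (4, 4), k=3))  # prints 1
```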

📗 Natural Language Processing:
Unigram model: $P\left\{z_1, z_2, ..., z_d\right\} = \prod_{t=1}^{d} P\left\{z_t\right\}$, where $z_t$ is the $t$-th token in a training item and $d$ is the total number of tokens in the item.
Maximum likelihood estimator (unigram): $\hat{P}\left\{z_t\right\} = \dfrac{c_{z_t}}{\sum_{z=1}^{m} c_z}$, where $c_z$ is the number of times the token $z$ appears in the training set and $m$ is the vocabulary size (number of unique tokens).
Maximum likelihood estimator (unigram, with Laplace smoothing): $\hat{P}\left\{z_t\right\} = \dfrac{c_{z_t} + 1}{\left(\sum_{z=1}^{m} c_z\right) + m}$.
Bigram model: $P\left\{z_1, z_2, ..., z_d\right\} = P\left\{z_1\right\} \prod_{t=2}^{d} P\left\{z_t | z_{t-1}\right\}$.
Maximum likelihood estimator (bigram): $\hat{P}\left\{z_t | z_{t-1}\right\} = \dfrac{c_{z_{t-1}, z_t}}{c_{z_{t-1}}}$.
Maximum likelihood estimator (bigram, with Laplace smoothing): $\hat{P}\left\{z_t | z_{t-1}\right\} = \dfrac{c_{z_{t-1}, z_t} + 1}{c_{z_{t-1}} + m}$ (see the code sketch below).
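
📗 The snippet below is a minimal sketch of the unigram and bigram maximum likelihood estimators, with and without Laplace smoothing, as defined above; the tokenization, function names, and toy sentence are illustrative assumptions, not the graded implementation.
```python
from collections import Counter

def unigram_probs(tokens, laplace=False):
    counts = Counter(tokens)              # c_z for every token z
    total = sum(counts.values())          # sum_z c_z
    m = len(counts)                       # vocabulary size (unique tokens)
    if laplace:
        return {z: (c + 1) / (total + m) for z, c in counts.items()}
    return {z: c / total for z, c in counts.items()}

def bigram_probs(tokens, laplace=False):
    uni = Counter(tokens)                 # c_{z'}: unigram counts
    bi = Counter(zip(tokens, tokens[1:])) # c_{z', z}: counts of z following z'
    m = len(uni)                          # vocabulary size
    if laplace:
        return {(zp, z): (c + 1) / (uni[zp] + m) for (zp, z), c in bi.items()}
    return {(zp, z): c / uni[zp] for (zp, z), c in bi.items()}

# toy usage
tokens = "the cat sat on the mat".split()
print(unigram_probs(tokens)["the"])                        # 2/6
print(bigram_probs(tokens, laplace=True)[("the", "cat")])  # (1 + 1)/(2 + 5)
```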







Last Updated: April 07, 2025 at 1:54 AM