# Summary
📗 Office hours: 5:30 to 8:30 Wednesdays (Dune) and Thursdays (Zoom Link)
📗 Personal meeting room: always open, Zoom Link
📗 Quiz (use your wisc ID to log in, without "@wisc.edu"): Socrative Link; Regrade request form: Google Form (select Q4).
📗 Math Homework: M4
📗 Programming Homework: P2
📗 Examples, Quizzes, Discussions: Q4
# Lectures
📗 Slides (before lecture, usually updated on Saturday):
Blank Slides: Part 1, Part 2
Blank Slides (with blank pages for quiz questions): Part 1, Part 2
📗 Slides (after lecture, usually updated on Tuesday):
Blank Slides with Quiz Questions: Part 1, Part 2
Annotated Slides: Part 1, Part 2
📗 My handwriting is really bad, so you should copy down your own notes from the lecture videos instead of relying on these.
📗 Notes
# Other Materials
📗 Pre-recorded Videos from 2020
Part 1 (Generative Models): Link
Part 2 (Natural Language): Link
Part 3 (Sampling): Link
Part 4 (Probability Distribution): Link
Part 5 (Bayesian Network): Link
Part 6 (Network Structure): Link
Part 7 (Naive Bayes): Link
📗 Relevant websites
Zipf's Law: Link
Markov Chain: Link
Google N-Gram: Link
Simple Bayes Net: Link, Link 2
ABNMS: Link
Pathfinder: Link
📗 YouTube videos from 2019 to 2021
How to find the HOG features? Link
How to count the number of weights to train for a convolutional neural network (LeNet)? Link
Example (Quiz): How to find the 2D convolution between two matrices? Link
Example (Homework): How to find a discrete approximate Gaussian filter? Link
# Keywords and Notations
📗 K-Nearest Neighbor:
Distance: (Euclidean) \(\rho\left(x, x'\right) = \left\|x - x'\right\|_{2} = \sqrt{\displaystyle\sum_{j=1}^{m} \left(x_{j} - x'_{j}\right)^{2}}\), (Manhattan) \(\rho\left(x, x'\right) = \left\|x - x'\right\|_{1} = \displaystyle\sum_{j=1}^{m} \left| x_{j} - x'_{j} \right|\), where \(x, x'\) are two instances.
K-Nearest Neighbor classifier: \(\hat{y}_{i}\) = mode \(\left\{y_{\left(1\right)}, y_{\left(2\right)}, ..., y_{\left(k\right)}\right\}\), where mode is the majority label and \(y_{\left(t\right)}\) is the label of the \(t\)-th closest instance to instance \(i\) from the training set.
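A minimal NumPy sketch of the distance functions and the mode classifier above (the helper names, default \(k\), and toy data are illustrative assumptions, not course-provided code):
```python
import numpy as np
from collections import Counter

def euclidean(x, xp):
    # rho(x, x') = ||x - x'||_2 = sqrt(sum_j (x_j - x'_j)^2)
    return np.sqrt(np.sum((x - xp) ** 2))

def manhattan(x, xp):
    # rho(x, x') = ||x - x'||_1 = sum_j |x_j - x'_j|
    return np.sum(np.abs(x - xp))

def knn_predict(x_new, X_train, y_train, k=3, dist=euclidean):
    # Distance from the new instance to every training instance.
    distances = [dist(x_new, x) for x in X_train]
    # Indices of the k closest training instances.
    nearest = np.argsort(distances)[:k]
    # Predicted label = mode (majority label) among the k nearest neighbors.
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Toy data: two instances per class.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(np.array([0.9, 1.2]), X_train, y_train, k=3))  # 0
```
Swapping `dist=manhattan` for `dist=euclidean` changes only which of the two distances above ranks the neighbors.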
📗 Natural Language Processing:
Unigram model: \(\mathbb{P}\left\{z_{1}, z_{2}, ..., z_{d}\right\} = \displaystyle\prod_{t=1}^{d} \mathbb{P}\left\{z_{t}\right\}\) where \(z_{t}\) is the \(t\)-th token in a training item, and \(d\) is the total number of tokens in the item.
Maximum likelihood estimator (unigram): \(\hat{\mathbb{P}}\left\{z_{t}\right\} = \dfrac{c_{z_{t}}}{\displaystyle\sum_{z=1}^{m} c_{z}}\), where \(c_{z}\) is the number of times the token \(z\) appears in the training set and \(m\) is the vocabulary size (number of unique tokens).
Maximum likelihood estimator (unigram, with Laplace smoothing): \(\hat{\mathbb{P}}\left\{z_{t}\right\} = \dfrac{c_{z_{t}} + 1}{\left(\displaystyle\sum_{z=1}^{m} c_{z}\right) + m}\).
Bigram model: \(\mathbb{P}\left\{z_{1}, z_{2}, ..., z_{d}\right\} = \mathbb{P}\left\{z_{1}\right\} \displaystyle\prod_{t=2}^{d} \mathbb{P}\left\{z_{t} | z_{t-1}\right\}\).
Maximum likelihood estimator (bigram): \(\hat{\mathbb{P}}\left\{z_{t} | z_{t-1}\right\} = \dfrac{c_{z_{t-1}, z_{t}}}{c_{z_{t-1}}}\).
Maximum likelihood estimator (bigram, with Laplace smoothing): \(\hat{\mathbb{P}}\left\{z_{t} | z_{t-1}\right\} = \dfrac{c_{z_{t-1}, z_{t}} + 1}{c_{z_{t-1}} + m}\).
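A minimal pure-Python sketch of the unigram and bigram estimators with Laplace smoothing (the function names and the toy corpus are illustrative assumptions; dropping the "+ 1" and "+ m" terms recovers the plain maximum likelihood estimators):
```python
from collections import Counter

def unigram_laplace(tokens, vocab):
    # c_z = number of times token z appears in the training set; m = vocabulary size.
    c, m, total = Counter(tokens), len(vocab), len(tokens)
    # P_hat{z} = (c_z + 1) / (sum_z c_z + m)
    return {z: (c[z] + 1) / (total + m) for z in vocab}

def bigram_laplace(tokens, vocab):
    # c_{z', z} = number of times token z immediately follows token z'.
    pair, c, m = Counter(zip(tokens[:-1], tokens[1:])), Counter(tokens), len(vocab)
    # P_hat{z | z'} = (c_{z', z} + 1) / (c_{z'} + m)
    return {(zp, z): (pair[(zp, z)] + 1) / (c[zp] + m)
            for zp in vocab for z in vocab}

# Toy corpus: 9 tokens, 6 unique tokens.
tokens = "the cat saw the dog and the cat ran".split()
vocab = sorted(set(tokens))
print(unigram_laplace(tokens, vocab)["the"])          # (3 + 1) / (9 + 6)
print(bigram_laplace(tokens, vocab)[("the", "cat")])  # (2 + 1) / (3 + 6)
```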
Last Updated: November 18, 2024 at 11:43 PM