
# Lecture Notes

📗 Slides
Lecture 1: Slides, With Quiz
Lecture 2: Slides, With Quiz
Annotated Lecture 1 Section 1: Slides
Annotated Lecture 2 Section 1: Slides
Annotated Week 1 Section 2: Part I, Part II

📗 Websites
Which face is real? Link
Guess two-thirds of the average? Link
Gradient Descent. Link
Eigenvalue in Endgame. Link

📗 YouTube videos
Why does the (batch) perceptron algorithm work? Link
Why can't linear regression be used for binary classification? Link
Why does gradient descent work? Link
How to derive logistic regression gradient descent step formula? Link
Example (Quiz): Perceptron update formula Link
Example (Quiz): Gradient descent for logistic activation with squared error Link
Example (Quiz): Computation of Hessian of quadratic form Link
Example (Quiz): Computation of eigenvalues Link
Example (Homework): Gradient descent for linear regression Link
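As a quick reference for the update rules covered in the videos above, here is a minimal sketch in Python (my own illustration, not the course's code), assuming perceptron labels in {-1, +1} and logistic regression labels in {0, 1}:

```python
import math

def perceptron_update(w, b, x, y, lr=1.0):
    """One perceptron update: change w and b only on a mistake.
    Labels y are assumed to be in {-1, +1}."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    prediction = 1 if activation >= 0 else -1
    if prediction != y:  # mistake: move the boundary toward (or away from) x
        w = [wi + lr * y * xi for wi, xi in zip(w, x)]
        b = b + lr * y
    return w, b

def logistic_gradient_step(w, b, x, y, lr=0.1):
    """One stochastic gradient descent step for logistic regression with
    cross-entropy loss; label y is assumed to be in {0, 1}.
    The gradient works out to (a - y) * x, where a = sigmoid(w . x + b)."""
    a = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
    w = [wi - lr * (a - y) * xi for wi, xi in zip(w, x)]
    b = b - lr * (a - y)
    return w, b
```

Note the sign difference: the perceptron adds `y * x` on a mistake, while the logistic step subtracts the gradient `(a - y) * x` on every example.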

📗 Math and Statistics Review
Checklist: Link, "math crib sheet" under "10/11"
Multivariate Calculus: Textbook, Chapter 16 and/or (Economics) Tutorials, Chapters 2 and 3.
Linear Algebra: Textbook, Chapters on Determinant and Eigenvalue.
Probability and Statistics: Textbook, Chapters 3, 4, 5.
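For the linear algebra review, eigenvalues of a 2 by 2 matrix can be checked by hand from the characteristic polynomial; a small sketch (my own illustration, not from the textbook), assuming real eigenvalues:

```python
import math

def eig_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial
    lambda^2 - (a + d) * lambda + (a * d - b * c) = 0.
    Assumes the discriminant is nonnegative (real eigenvalues)."""
    trace = a + d
    det = a * d - b * c
    disc = math.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2
```

For example, [[2, 1], [1, 2]] has eigenvalues 3 and 1.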

# Written (Math) Problems

Submit on Canvas: PDF
In addition, please submit a file named "comments.txt" containing, on its first line, a numerical grade of 1, 1.5, or 2 for your homework as a whole (not for individual questions).
Question 5 is missing a "not"; sorry for the mistake. You can plot the function to check its convexity here: Link.
Example solutions are posted on Canvas.

# Programming Problem

📗 Short Instruction:
(1) Download the MNIST data from MNIST or CSV Files (easier), or obtain the same dataset in another format elsewhere.
(2) Extract the training set data based on your wisc ID: enter your ID on the course page to see your two assigned digits and your test sets.
(3) Train a logistic regression to classify the two digits.
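The training step can be sketched as plain batch gradient descent; this is my own minimal Python (not the TAs' solution), assuming the features are pixel values scaled to [0, 1] and the two digits are relabeled as 0 and 1:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=1.0, iters=1000):
    """Batch gradient descent for logistic regression.
    X: list of feature lists (e.g. 784 pixel values scaled to [0, 1]),
    y: list of 0/1 labels (0 = first digit, 1 = second digit).
    Returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):  # accumulate the full-batch gradient
            a = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for j in range(d):
                gw[j] += (a - yi) * xi[j]
            gb += a - yi
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b = b - lr * gb / n
    return w, b
```

To classify a test image, compute `sigmoid(w . x + b)` and predict the second digit when it is at least 0.5.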

📗 Files to submit
(1) weights.txt contains the weights of your logistic regression: the bias on the first line, then one weight per line.
(2) output.txt or outputs.txt contains the classifications of the digits from the two test sets. For example, if your digits are 1 and 9, the file should contain only 1s and 9s (one digit per line), not 0s and 1s; if your classifications are perfect, it should contain 200 1s followed by 200 9s.
(3) code, not zipped, with no data files; please specify in the comments which data files are required and which files to compile or run to generate the outputs.
(4) comments.txt contains information on how to run your program, in particular the names of the required data files.
(5) Please do NOT submit data files!
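The file formats above can be produced with a few lines of Python; a sketch (the filenames and the 1-vs-9 example come from the instructions, the helper names are my own):

```python
def write_weights(path, bias, weights):
    """weights.txt format: the bias on the first line, then one weight per line."""
    with open(path, "w") as f:
        f.write(f"{bias}\n")
        for wj in weights:
            f.write(f"{wj}\n")

def write_outputs(path, predictions, digit0, digit1):
    """output.txt format: one classified digit per line, using the actual
    digit labels (e.g. 1 and 9), not 0s and 1s."""
    with open(path, "w") as f:
        for p in predictions:  # p is 0 or 1 from the classifier
            f.write(f"{digit1 if p == 1 else digit0}\n")
```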

📗 Things to try
(1) Experiment with different learning rates and the number of iterations (or stopping criterion).
(2) (Not required) Experiment with different regularizer parameters (covered in Week 2).
(3) For the images that are classified incorrectly, try to plot the image to see what happened.
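For a quick look at a misclassified image without a plotting library, a flat row of pixel values can be rendered as text; a sketch of my own, assuming 28 by 28 grayscale images with values 0 to 255:

```python
def ascii_image(pixels, width=28, threshold=128):
    """Render a flat list of width * width grayscale pixel values (0-255)
    as rows of '#' (bright) and '.' (dark) for a quick terminal view."""
    rows = []
    for r in range(0, len(pixels), width):
        row = pixels[r:r + width]
        rows.append("".join("#" if p >= threshold else "." for p in row))
    return "\n".join(rows)
```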

📗 Longer Instruction
More (nonessential) details and hints: PDF.
For students who use Python, R, or Matlab and find the process too slow: (1) vectorize your code (see Link); (2) use stochastic gradient descent (Week 2).
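Vectorizing means replacing the per-example Python loops with whole-matrix operations; a hedged numpy sketch (my own, assuming labels in {0, 1}) of one batch gradient descent step:

```python
import numpy as np

def grad_step_vectorized(w, b, X, y, lr=0.1):
    """One vectorized batch gradient descent step for logistic regression.
    X: (n, d) array of features, y: (n,) array of 0/1 labels."""
    a = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # (n,) activations, all at once
    err = a - y                             # (n,) residuals
    w = w - lr * (X.T @ err) / len(y)       # (d,) gradient in one matmul
    b = b - lr * err.mean()
    return w, b
```

The same arithmetic as the looped version, but each iteration is a couple of matrix operations instead of n * d Python-level multiplications, which is typically orders of magnitude faster on MNIST-sized data.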

📗 TAs' Solution
(1) Java: Link written by Tan
(2) Python: Link written by Dandi
Important note: you are not allowed to copy any code from the solutions. MOSS will be used to check for code similarity: changing just variable names, spacing, etc. is still considered cheating. You may read the solutions to learn what they are doing, but you MUST write all code yourself. The deadline for resubmission without the 50 percent penalty is June 16.

Last Updated: November 09, 2021 at 12:05 AM