Young Wu's Homepage

Prev: L1, Next: L2

Zoom: Link, Piazza: Link, Google Form: Link.

Wisc ID for in-class quiz: (if your wisc email is "test@wisc.edu", please enter "test")
Token: (will be given during the lectures)

# Welcome to CS540

📗 Major changes to the assignments.

➩ Old assignments can be solved by uploading the instructions to large language models (LLMs).

➩ Writing code is no longer an important part of the course. Being able to give natural language instructions to LLMs on what to do is more important for this course, and possibly for work and research.

➩ Reviewing and verifying the correctness of LLM-generated code or text is an important skill to develop in this course.

📗 One of the two, exams or in-class quizzes, is optional.

➩ In-class: (i) discussions: any relevant answer is okay; (ii) quizzes: can copy what I do during the lecture or try it yourself after the lecture.

➩ Exams: questions similar to the in-class quizzes, generated based on your ID. You cannot work in a group, but LLM use is allowed.

# Group Work

📗 Almost all assignments require group work to achieve a high score.

➩ Technically, the whole class can work as a single group against the course staff.

➩ Expect some students not to participate and submit random stuff.

➩ Expect some students not to follow the group plan when the plan is not stable (we will talk about Nash equilibrium in game theory).

➩ Optional: add yourself to our Slack channel.

# Discussion Sessions

📗 Discussion sessions are optional but highly recommended: ask the course staff for suggestions and talk to other students about strategies.

➩ Every weekday 3:00 to 4:00: Monday for the group 1 and 4 meeting, Tuesday for the group 2 and 5 meeting, Wednesday for the group 3 and 6 meeting, Thursday for general discussions of the non-competition parts, Friday for general discussions of the competition.

📗 Interviews and presentations are also optional. They are only for students who get 2, 3, or 4 points out of 5 on a competition and believe they did something creative or technically challenging but did not get a high score.

➩ Every weekday 11:00 to 12:00: reserved for interviews and presentations, and TA office hours.

# Tools

📗 Useful tools (for this course and other CS courses):

➩ (University subscription) Copilot Chat: Link and VSCode with GitHub Copilot: Link.

➩ (University subscription) Gemini: Link.

📗 AI Policy for CS559 (similar recommendations for this course): Link.

➩ For CS540 this summer, it is recommended, but not required, that all code and text be generated by a large language model of your choice.

In-class Discussion

📗 Why are you taking the course?

A. To learn how to use large language models? [Not covered in class, but can practice in assignments]
B. To learn how to build and train machine learning models? [Main focus of the course]
C. To learn the math for machine learning? [No longer the main goal, but will be covered in lectures and exams]
D. Easy course to get an A? [Ad: take CS559 Computer Graphics with me in Fall]
E. Other courses are not offered in summer? Which course do you want offered in summer?
[Note]

➩ For in-class quizzes, including discussions, your answers will be visible to other students once they submit. (They can also hack my JavaScript to see other students' answers without submitting, but please don't: wrong but relevant answers can get full points too.) Your notes will not be visible to other students. For some questions, you might not get the in-class quiz points if you only submit the answer without notes explaining it.

[Q1] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

Submit your answer to see other students' answers (click the submit button to refresh):

# Machine Learning

📗 A machine learning data set usually contains features (text, images, ... converted to numerical vectors) and labels (categories, converted to integers).

➩ Features: \(X = \left(x_{1}, x_{2}, ..., x_{n}\right)\), where \(x_{i} = \left(x_{i1}, x_{i2}, ..., x_{im}\right)\), and \(x_{ij}\) is called feature (or attribute) \(j\) of instance (or item) \(i\).

➩ Labels: \(Y = \left(y_{1}, y_{2}, ..., y_{n}\right)\), where \(y_{i}\) is the label of item \(i\).
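
The notation above can be made concrete with a small sketch. The toy values below are hypothetical (they are not from the course data), and plain Python lists are used only to keep the example dependency-free.

```python
# Hypothetical toy data set: n = 4 items, m = 2 features each.
X = [
    [1.0, 2.0],   # x_1 = (x_11, x_12)
    [0.5, -1.0],  # x_2
    [3.0, 0.0],   # x_3
    [-2.0, 1.5],  # x_4
]
Y = [1, 0, 1, 0]  # y_i is the label of item i

n = len(X)     # number of items (instances)
m = len(X[0])  # number of features (attributes) per item

# Python indexes from 0, so x_{ij} in the notes is X[i - 1][j - 1].
i, j = 3, 1
x_ij = X[i - 1][j - 1]
print(n, m, x_ij, Y[i - 1])
```

Here `x_ij` is feature 1 of item 3, and `Y[i - 1]` is the label of the same item.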

📗 Supervised learning: given training set \(\left(X, Y\right)\), estimate a prediction function \(y \approx \hat{f}\left(x\right)\) to predict \(y' = \hat{f}\left(x'\right)\) based on a new item \(x'\).

📗 Unsupervised learning: given training set \(\left(X\right)\), put points into groups (discrete groups \(\left\{1, 2, ..., k\right\}\) or "continuous" lower dimensional representations).

📗 Reinforcement learning: given an environment with states \(x\) and reward \(R\left(x_{t}, y_{t}\right)\) when action \(y_{t}\) is performed in state \(x_{t}\), estimate the optimal policy \(y' = f\left(x'\right)\) that selects the best action in state \(x'\) that maximizes the total reward.

# Linear Classifier

📗 A simple classifier is a linear classifier: \(\hat{y}_{i} = 1\) if \(w_{1} x_{i 1} + w_{2} x_{i 2} + ... + w_{m} x_{i m} + b \geq 0\) and \(\hat{y}_{i} = 0\) otherwise. This classifier is called an LTU (Linear Threshold Unit) perceptron: Wikipedia.
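
A minimal sketch of the LTU prediction rule, assuming the weights and the item are given as Python lists of equal length. The function name `predict` and the numbers are hypothetical, chosen only for illustration.

```python
def predict(w, b, x):
    """Return 1 if w_1 x_1 + ... + w_m x_m + b >= 0, and 0 otherwise."""
    activation = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return 1 if activation >= 0 else 0

# Hypothetical weights, bias, and items.
w = [1.0, -2.0]
b = 0.5
print(predict(w, b, [1.0, 0.5]))  # activation = 1.0 - 1.0 + 0.5 = 0.5
print(predict(w, b, [0.0, 1.0]))  # activation = 0.0 - 2.0 + 0.5 = -1.5
```

An item that lies exactly on the boundary (activation equal to 0) is classified as 1, matching the \(\geq 0\) in the definition.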

📗 Given a training set, the weights \(w_{1}, w_{2}, ..., w_{m}\) and bias \(b\) can be estimated based on the data \(\left(x_{1}, y_{1}\right), \left(x_{2}, y_{2}\right), ..., \left(x_{n}, y_{n}\right)\). One algorithm is called the Perceptron Algorithm.

➩ Initialize random weights and bias.

➩ For each item \(x_{i}\), compute the prediction \(\hat{y}_{i}\).

➩ If the prediction is \(\hat{y}_{i} = 0\) and the actual label is \(y_{i} = 1\), increase the weights by \(w \leftarrow w + \alpha x_{i}, b \leftarrow b + \alpha\), where \(\alpha\) is a constant called the learning rate.

➩ If the prediction is \(\hat{y}_{i} = 1\) and the actual label is \(y_{i} = 0\), decrease the weights by \(w \leftarrow w - \alpha x_{i}, b \leftarrow b - \alpha\).

➩ Repeat the process until convergence (weights are no longer changing).
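
The steps above can be sketched as a short training loop. This is one possible reading under stated assumptions, not the course's reference implementation: the weights start at zero instead of random values so the run is reproducible, a hypothetical `max_epochs` cap is added because the loop never stops on data that is not linearly separable, and the toy data (logical AND) is made up for illustration.

```python
def predict(w, b, x):
    """LTU prediction: 1 if w . x + b >= 0, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0

def train_perceptron(X, Y, alpha=1.0, max_epochs=100):
    m = len(X[0])
    w = [0.0] * m  # assumption: zeros instead of random initial weights
    b = 0.0
    for _ in range(max_epochs):  # assumption: cap on passes over the data
        changed = False
        for x, y in zip(X, Y):
            y_hat = predict(w, b, x)
            if y_hat == 0 and y == 1:
                # Predicted 0 but the label is 1: increase the weights.
                w = [wj + alpha * xj for wj, xj in zip(w, x)]
                b += alpha
                changed = True
            elif y_hat == 1 and y == 0:
                # Predicted 1 but the label is 0: decrease the weights.
                w = [wj - alpha * xj for wj, xj in zip(w, x)]
                b -= alpha
                changed = True
        if not changed:  # a full pass with no update: converged
            break
    return w, b

# Hypothetical linearly separable data: logical AND of two binary features.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 0, 0, 1]
w, b = train_perceptron(X, Y)
print(w, b)
print([predict(w, b, x) for x in X])
```

On this toy data the loop stops after a few passes and the final predictions match the labels.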

In-class Discussion

ID:

📗 [3 points] Move the sliders below to change the green plane normal so that the largest number of the blue points are above the plane and the largest number of the red points are below the plane.

The current number of mistakes is ???.

📗 Answers:

\(w_{1}\) = 0
\(w_{2}\) = 0
\(w_{3}\) = 1
\(b\) = 0
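
The mistake count for a candidate plane can be computed with a sketch like the one below. It assumes that blue points have label 1, red points have label 0, and "above the plane" means \(w_{1} x_{1} + w_{2} x_{2} + w_{3} x_{3} + b \geq 0\). The points are hypothetical, not the ones in the interactive diagram.

```python
def count_mistakes(w, b, points, labels):
    """Count the points whose LTU prediction differs from the label."""
    mistakes = 0
    for x, y in zip(points, labels):
        y_hat = 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0
        if y_hat != y:
            mistakes += 1
    return mistakes

# Hypothetical 3D points: label 1 = blue, label 0 = red.
points = [
    [0.2, 0.1, 0.9],
    [-0.5, 0.3, 0.4],
    [0.1, -0.2, -0.7],
    [0.6, 0.6, 0.2],
    [0.3, 0.3, -0.1],
]
labels = [1, 1, 0, 0, 0]

print(count_mistakes([0, 0, 1], 0, points, labels))    # horizontal plane
print(count_mistakes([-1, -1, 1], 0, points, labels))  # tilted plane
```

On these made-up points, the horizontal plane misclassifies one red point, while the tilted plane separates the two colors.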

[Note] Question: how would you find the weights with more points and in higher dimensional spaces?

[Q2] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

Submit your answer to see other students' answers (click the submit button to refresh):

# Perceptron Algorithm

📗 The perceptron algorithm update can be summarized as \(w \leftarrow w - \alpha \left(\hat{y}_{i} - y_{i}\right) x_{i}\) (or for \(j = 1, 2, ..., m\), \(w_{j} \leftarrow w_{j} - \alpha \left(\hat{y}_{i} - y_{i}\right) x_{ij}\)) and \(b \leftarrow b - \alpha \left(\hat{y}_{i} - y_{i}\right)\), where \(\hat{y}_{i} = 1\) if \(w_{1} x_{i 1} + w_{2} x_{i 2} + ... + w_{m} x_{i m} + b \geq 0\) and \(\hat{y}_{i} = 0\) if \(w_{1} x_{i 1} + w_{2} x_{i 2} + ... + w_{m} x_{i m} + b < 0\): Wikipedia.
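
A single update with this combined formula might look like the sketch below. The function name `perceptron_step` and all numbers are hypothetical. The error term \(\hat{y}_{i} - y_{i}\) is \(-1\), \(0\), or \(+1\), so a correct prediction leaves the weights unchanged.

```python
def perceptron_step(w, b, x, y, alpha):
    """One perceptron update: w <- w - alpha (y_hat - y) x, b <- b - alpha (y_hat - y)."""
    activation = sum(wj * xj for wj, xj in zip(w, x)) + b
    y_hat = 1 if activation >= 0 else 0
    error = y_hat - y  # -1, 0, or +1
    w_new = [wj - alpha * error * xj for wj, xj in zip(w, x)]
    b_new = b - alpha * error
    return w_new, b_new

# Hypothetical worked example.
w, b = [1.0, -1.0], 0.0
x, y = [2.0, 1.0], 0
alpha = 0.5

# activation = 1*2 + (-1)*1 + 0 = 1 >= 0, so y_hat = 1 and error = 1.
# w' = (1 - 0.5*2, -1 - 0.5*1) = (0, -1.5) and b' = 0 - 0.5 = -0.5.
w_new, b_new = perceptron_step(w, b, x, y, alpha)
print(w_new, b_new)
```

Applying the step again to the same item changes nothing, because the updated weights already classify it correctly.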

📗 The learning rate \(\alpha\) controls how fast the weights are updated.

➩ \(\alpha\) can be constant (usually 1).

➩ \(\alpha\) can be a function of the iteration (usually decreasing), for example, \(\alpha_{t} = \dfrac{1}{\sqrt{t}}\).
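
The two choices of learning rate can be compared with a small sketch. The function names are hypothetical, and the iteration counter is assumed to start at \(t = 1\) so that \(\dfrac{1}{\sqrt{t}}\) is defined.

```python
import math

def constant_rate(t, alpha=1.0):
    """Constant learning rate: the same alpha at every iteration."""
    return alpha

def decaying_rate(t):
    """Decreasing learning rate alpha_t = 1 / sqrt(t), for t = 1, 2, ..."""
    return 1.0 / math.sqrt(t)

for t in (1, 4, 25, 100):
    print(t, constant_rate(t), decaying_rate(t))
```

With the decreasing schedule, later updates move the weights less: the step size at iteration 100 is one tenth of the step size at iteration 1.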

In-class Discussion

ID:

📗 [3 points] Find the Perceptron weights by using the Perceptron algorithm: select a point on the diagram and click anywhere else to run one iteration of the Perceptron algorithm.

📗 You can set the learning rate here: .

📗 Answer: 0,0.1,0

[Note] Question: how would you set the learning rate?

[Q3] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

In-class Quiz

ID:

📗 [4 points] Consider a Linear Threshold Unit (LTU) perceptron with initial weights \(w\) = and bias \(b\) = trained using the Perceptron Algorithm. Given a new input \(x\) = and \(y\) = . Let the learning rate be \(\alpha\) = , compute the updated weights, \(w', b'\) = :

📗 Answer (comma separated vector): .

[Note] Use the space to explain the steps or just take notes:

[Q4] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

# Questions?

📗 If you have questions, please use (i) Zoom chat, (ii) Piazza: Link, (iii) Office hours and discussion sessions. Please do NOT use Canvas mail. Use email only for grading issues, and send it to the course instructor (not the TAs).

Additional In-class Discussion

📗 Sometimes a question not in the notes will be asked during the lecture. You can submit your answer here:

Notes (not visible to other students):
[Q5] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

Submit your answer to see other students' answers (click the submit button to refresh):

Additional In-class Quiz

📗 Sometimes a question not in the notes will be asked during the lecture. You can submit your answer here:

A.
B.
C.
D.
E.
Notes (not visible to other students):
[Q6] Please check the box to confirm submission (submissions for questions not discussed during the lectures will result in in-class quiz point deduction).

Submit your answer to see other students' answers (click the submit button to refresh):

# In-class Quiz Instructions

📗 To get full points on the in-class quizzes for a lecture:

➩ Submit relevant answers to the questions discussed during the lecture: incorrect answers are okay.

➩ Some questions require [notes] to earn the point.

➩ Some questions require a special ID (given during the lecture) to earn the point.

➩ Do not submit answers to questions that are not discussed during the lectures. Each such submission will result in a deduction of one point.

➩ Submissions after the lecture are accepted until the midterm (for the first 14 lectures) and until the final exam (for the last 14 lectures). After the exams, no in-class quiz submissions will be accepted.

➩ The grade on Canvas Assignment Q1 is computed as the number of points divided by the number of questions asked (out of 1) and is updated on Canvas every weekend.

📗 If there are any issues with submission on the website, please use this Google form: Link.

📗 Bonus point opportunities during a few lectures (added to in-class quiz above 20 points).

📗 Notes and code adapted from the course taught by Professors Jerry Zhu, Blerina Gkotse, Yudong Chen, Yingyu Liang, Charles Dyer. Some content is generated using Copilot.

Last Updated: July 16, 2026 at 12:17 PM