Young Wu's Homepage

# M4 Written (Math) Problems

📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click (or hit the "Enter" key)

📗 You can also load from your saved file and click .

📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.

📗 The official deadline is July 3. Late submissions within a week will be accepted without penalty, but please submit a regrade request form: Link.

📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page: , solve the problems, then enter all your answers at the end.

📗 Please do not refresh the page: your answers will not be saved.

📗 Please report any bugs on Piazza: Link

# Warning: please enter your ID before you start!

📗 [2 points] Given a weight vector \(w\) = , consider the line (plane) defined by \(w^\top x = c\) = . Along this line (on the plane), there is a point that is the closest to the origin. How far is that point from the origin in Euclidean distance?

📗 Note: the distance between the point and the plane is the length of the red line in the diagram. The length of the blue line is \(\dfrac{c}{w_{z}}\), which is not the distance between the point and the plane.

Hint

See Fall 2011 Midterm Q7. Use a similar approach to the derivation of the margin: suppose the vector from the origin to the point on the line is \(\lambda w\), then \(w^\top \lambda w = c\), which implies that \(\lambda = \dfrac{c}{w^\top w} = \dfrac{c}{\left\|w\right\|^{2}}\). The length of the vector \(\lambda w\) is the distance from the origin to the point on the line: \(\left\|\lambda w\right\| = \left|\lambda\right| \left\|w\right\| = \dfrac{\left|c\right|}{\left\|w\right\|^{2}} \left\|w\right\| = \dfrac{\left|c\right|}{\left\|w\right\|}\), which is \(\dfrac{c}{\left\|w\right\|}\) when \(c \geq 0\).
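The derivation in the hint can be checked numerically. The following is a minimal sketch; the function names and the sample values of \(w\) and \(c\) are hypothetical and are not the values generated for any ID.

```python
import math

def distance_origin_to_plane(w, c):
    """Distance from the origin to the set {x : w.x = c}."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    if norm_w == 0:
        raise ValueError("w must be nonzero")
    # abs() covers a negative c, where lambda = c / ||w||^2 is negative.
    return abs(c) / norm_w

def closest_point(w, c):
    """The closest point itself: lambda * w with lambda = c / ||w||^2."""
    lam = c / sum(wi * wi for wi in w)
    return [lam * wi for wi in w]

# Hypothetical example values.
w, c = [3.0, 4.0], 25.0
print(distance_origin_to_plane(w, c))  # 25 / 5 = 5.0
print(closest_point(w, c))             # [3.0, 4.0]
```

The closest point lies on the plane (its dot product with \(w\) equals \(c\)) and its length matches the returned distance.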

📗 Answer: .

📗 [3 points] A hard margin SVM (Support Vector Machine) is trained on the following dataset. Suppose we restrict \(b\) = , what is the value of \(w\)? Enter a single number, i.e. do not include \(b\). Assume the SVM classifier is \(1_{\left\{w x + b \geq 0\right\}}\) (this means it predicts 1 if \(w x + b \geq 0\) and 0 otherwise).

\(x_{i}\)
\(y_{i}\)

Hint

See Fall 2019 Final Q11, Fall 2006 Final Q15, Fall 2005 Final Q15. To maximize the margin, the SVM decision boundary should be at the center between the two classes: find the midpoint \(p\) between the rightmost point of one class and the leftmost point of the other class. The SVM should then be in the form \(1_{\left\{x \geq p\right\}}\) or \(1_{\left\{x \leq p\right\}}\). Rewrite the expression to get \(w\) and \(b\).
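The midpoint approach can be sketched in code. This is an illustration under assumptions: the function name `svm_1d_weight` and the sample data are hypothetical, and it relies on the boundary condition \(w p + b = 0\), so that \(w = -b / p\) once \(b\) is fixed.

```python
def svm_1d_weight(xs, ys, b):
    """Return w so that 1{w*x + b >= 0} has its boundary at the midpoint p."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    if max(neg) < min(pos):      # class 1 is on the right: rule is 1{x >= p}
        p = (max(neg) + min(pos)) / 2
        want_positive_w = True
    elif max(pos) < min(neg):    # class 1 is on the left: rule is 1{x <= p}
        p = (max(pos) + min(neg)) / 2
        want_positive_w = False
    else:
        raise ValueError("classes are not separable on the line")
    if p == 0:
        raise ValueError("boundary at 0 forces b = 0; w is not determined by b")
    # The boundary satisfies w * p + b = 0, so w = -b / p.
    w = -b / p
    if (w > 0) != want_positive_w:
        raise ValueError("this b has the wrong sign for the data")
    return w

# Hypothetical data: class 0 at 1, 2, 3 and class 1 at 5, 6, 7, with b = -8.
print(svm_1d_weight([1, 2, 3, 5, 6, 7], [0, 0, 0, 1, 1, 1], -8))  # 2.0
```

Here the midpoint is \(p = 4\), so \(2x - 8 \geq 0\) is the same rule as \(x \geq 4\).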

📗 Answer: .

📗 [6 points] A linear SVM (Support Vector Machine) with weights \(w_{1}, w_{2}, b\) is trained on the following data set: \(x_{1}\) = , \(y_{1}\) = and \(x_{2}\) = , \(y_{2}\) = . The attributes (i.e. features) are two dimensional \(\left(x_{i1}, x_{i2}\right)\) and the label \(y_{i}\) is binary. The classification rule is \(\hat{y}_{i} = 1_{\left\{w_{1} x_{i1} + w_{2} x_{i2} + b \geq 0\right\}}\). Assuming \(b\) = , what is \(\left(w_{1}, w_{2}\right)\) ? The drawing is not graded.

Hint

See Fall 2019 Final Q11, Fall 2006 Final Q15, Fall 2005 Final Q15. Draw the line (decision boundary). One way to figure out the equation of the line is by noting that it passes through the midpoint between \(x_{1}\) and \(x_{2}\), and that it is perpendicular to the line from \(x_{1}\) to \(x_{2}\). Make sure you have the correct signs: the class-1 point should satisfy \(w_{1} x_{i1} + w_{2} x_{i2} + b \geq 0\) and the class-0 point should satisfy \(w_{1} x_{i1} + w_{2} x_{i2} + b < 0\).
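The geometric argument above can be turned into a short computation. This is a sketch under assumptions: the function name and sample points are hypothetical, and it takes \(w\) to be a positive multiple \(k\) of the vector from the class-0 point to the class-1 point, with \(k\) chosen so that the midpoint lies on the boundary for the given \(b\).

```python
def svm_two_points(x_pos, x_neg, b):
    """Weights (w1, w2) for a two-point training set and a fixed b."""
    # Normal direction: perpendicular to the boundary, pointing toward class 1.
    d = [p - n for p, n in zip(x_pos, x_neg)]
    # The boundary passes through the midpoint of the two points.
    mid = [(p + n) / 2 for p, n in zip(x_pos, x_neg)]
    d_dot_mid = sum(di * mi for di, mi in zip(d, mid))
    if d_dot_mid == 0:
        raise ValueError("boundary passes through the origin; b must be 0")
    # Solve (k * d) . mid + b = 0 for the scale k.
    k = -b / d_dot_mid
    if k <= 0:
        raise ValueError("this b has the wrong sign for the data")
    return [k * di for di in d]

# Hypothetical data: class-1 point (3, 1), class-0 point (1, 1), b = -4.
w = svm_two_points((3, 1), (1, 1), -4)
print(w)  # [2.0, 0.0]
```

With these values the class-1 point gives \(2 \cdot 3 - 4 = 2 \geq 0\) and the class-0 point gives \(2 \cdot 1 - 4 = -2 < 0\), so the signs are correct.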

📗 Answer (comma separated vector): .

📗 [4 points] Given the following training set, add one instance \(\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}\) with \(y\) = so that all instances are support vectors for the Hard Margin SVM (Support Vector Machine) trained on the new training set.

\(x_{1}\)	\(x_{2}\)	\(y\)
		0
		0
		0
		1
		1
		1

📗 Note: in the diagram, currently, the two support vectors are connected by the grey line and the black line represents the SVM classification boundary. After adding one point, you should be able to make all seven points support vectors with the classification boundary given by the green line.

Hint

See Fall 2019 Final Q7 Q8. One example: say the negative points are \(\left(-1, -1\right), \left(-2, -1\right), \left(-3, -1\right), ...\) and the positive points are \(\left(1, 1\right), \left(2, 1\right), \left(3, 1\right), ...\). Draw the points and draw the SVM decision boundary. Note that there are 2 support vectors. Suppose you add a positive point \(\left(-1, 1\right)\). Draw the new point and draw the new SVM decision boundary. Note that now all points are support vectors.
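The example in the hint can be verified by measuring each point's distance to the boundary. The sketch below does not solve the SVM; it assumes the boundaries \(x_{1} + x_{2} = 0\) (before) and \(x_{2} = 0\) (after), assumes the added point \((-1, 1)\) is labeled positive, and uses hypothetical helper names.

```python
import math

def distances_to_boundary(points, w, b):
    """Euclidean distance from each point to the line w.x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return [abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w for x in points]

def count_support_vectors(points, w, b):
    """Count the points that attain the minimum distance (the margin)."""
    dists = distances_to_boundary(points, w, b)
    margin = min(dists)
    return sum(1 for d in dists if math.isclose(d, margin))

negatives = [(-1, -1), (-2, -1), (-3, -1)]
positives = [(1, 1), (2, 1), (3, 1)]

# Original data: the boundary is x1 + x2 = 0, i.e. w = (1, 1), b = 0.
print(count_support_vectors(negatives + positives, (1, 1), 0))  # 2

# Assumption: the added point (-1, 1) is labeled positive.
# The boundary becomes x2 = 0, i.e. w = (0, 1), b = 0.
print(count_support_vectors(negatives + positives + [(-1, 1)], (0, 1), 0))  # 7
```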

📗 Answer (comma separated vector): .

📗 [4 points] Consider the linear SVM (Support Vector Machine) problem without slack variables or kernels: this is known as the hard margin SVM. If you give it a linearly separable training data set where \(\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} \in \mathbb{R}^{2}\) and \(y \in \left\{0, 1\right\}\), it will learn a line in \(\mathbb{R}^{2}\). Tom did something to your data set, and hard margin SVM no longer works on the modified data set (it is no longer linearly separable): \(\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} \leftarrow \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} + \begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix} = M \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix} + b\). Suppose \(b\) = , give an example of \(M\).

📗 Note: you can test your transformation using : the original points are on the left and the new points after the transformation are on the right.

Hint

See Fall 2014 Final Q17. When the columns or rows of \(M\) are linearly dependent, all of the points are mapped onto the same line (or onto a single point), so points from different classes can overlap and the resulting dataset may no longer be linearly separable.
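As one concrete illustration of the hint, the zero matrix is the most extreme singular choice: it sends every point to \(b\), so the two classes coincide. The sketch below uses hypothetical points and a hypothetical shift, not the values generated for any ID.

```python
def transform(points, M, b):
    """Apply x <- M x + b to each 2D point."""
    return [
        (M[0][0] * x1 + M[0][1] * x2 + b[0],
         M[1][0] * x1 + M[1][1] * x2 + b[1])
        for x1, x2 in points
    ]

# Hypothetical linearly separable data.
class0 = [(0, 0), (1, 0)]
class1 = [(3, 3), (4, 3)]

M_zero = [[0, 0], [0, 0]]  # both rows (and columns) are linearly dependent
shift = (1, 2)             # hypothetical b

new0 = transform(class0, M_zero, shift)
new1 = transform(class1, M_zero, shift)

# Points from both classes land on the same location, so no line separates them.
overlap = set(new0) & set(new1)
print(len(overlap) > 0)  # True
```

A rank-1 matrix that is not zero only collapses the data onto a line, which does not always destroy separability, so it is worth testing a candidate \(M\) on the actual points.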

📗 Answer (matrix with multiple lines, each line is a comma separated vector): .

📗 [2 points] Let \(w\) = and \(b\) = . For the point \(x\) = , \(y\) = , what is the smallest slack value \(\xi\) for it to satisfy the margin constraint?

Hint

See Fall 2011 Midterm Q8, Fall 2009 Final Q1. There are two inequality constraints for the slack variable: (1) \(\left(2 y - 1\right)\left(w^\top x + b\right) \geq 1 - \xi\) and (2) \(\xi \geq 0\). Combine the two inequalities to get the smallest slack variable value.
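Combining the two constraints gives \(\xi = \max\left(0, 1 - \left(2 y - 1\right)\left(w^\top x + b\right)\right)\). A minimal sketch of that computation follows; the function name and sample values are hypothetical.

```python
def smallest_slack(w, b, x, y):
    """Smallest xi with (2y - 1)(w.x + b) >= 1 - xi and xi >= 0, for y in {0, 1}."""
    # (2y - 1) converts the 0/1 label into a -1/+1 label.
    signed_score = (2 * y - 1) * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return max(0.0, 1.0 - signed_score)

# Hypothetical example: a point exactly on the decision boundary needs slack 1.
print(smallest_slack([1, 2], -1, [1, 0], 1))  # 1.0

# A point on the correct side with margin at least 1 needs no slack.
print(smallest_slack([1, 2], -1, [0, 0], 0))  # 0.0
```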

📗 Answer: .

📗 [4 points] Consider a kernel \(K\left(x_{i_{1}}, x_{i_{2}}\right)\) = + + , where both \(x_{i_{1}}\) and \(x_{i_{2}}\) are 1D positive real numbers. What is the feature vector \(\varphi\left(x_{i}\right)\) induced by this kernel evaluated at \(x_{i}\) = ?

Hint

See Fall 2009 Final Q2. Write \(K\left(x, y\right)\) as the dot product of \(\varphi\left(x\right)\) and \(\varphi\left(y\right)\) (guess and check). Then substitute the value of \(x\) into the vector.
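The guess-and-check step can be automated for a candidate feature map. The kernel below is a hypothetical example, \(K\left(x, x'\right) = 1 + 4 x x' + 9 x^{2} x'^{2}\), not the one generated on this page: each term of the form \(c \, x^{k} x'^{k}\) suggests a feature component \(\sqrt{c} \, x^{k}\).

```python
import math

def K(x, xp):
    """Hypothetical kernel used only for illustration."""
    return 1 + 4 * x * xp + 9 * x**2 * xp**2

def phi(x):
    """Candidate feature map: sqrt of each coefficient times the matching power."""
    return [1, 2 * x, 3 * x**2]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Check the guess on a few pairs of positive inputs.
for x, xp in [(1, 1), (2, 3), (0.5, 4)]:
    assert math.isclose(K(x, xp), dot(phi(x), phi(xp)))

print(phi(2))  # [1, 4, 12]
```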

📗 Answer (comma separated vector): .

📗 [4 points] If \(K\left(x, x'\right)\) is a kernel with induced feature representation \(\varphi\left(x_{0}\right)\) = , and \(G\left(x, x'\right)\) is another kernel with induced feature representation \(\theta\left(x_{0}\right)\) = , then it is known that \(H\left(x, x'\right) = a K\left(x, x'\right) + b G\left(x, x'\right)\), \(a\) = , \(b\) = is also a kernel. What is the induced feature representation of \(H\) for this \(x_{0}\)?

Hint

See Fall 2014 Midterm Q15, Fall 2013 Final Q7, Fall 2011 Midterm Q9. This requires guess and check: suppose the feature representation is \(\begin{bmatrix} \sqrt{a} \varphi\left(x\right) \\ \sqrt{b} \theta\left(x\right) \end{bmatrix}\), then \(H\left(x, x'\right) = \begin{bmatrix} \sqrt{a} \varphi^\top\left(x\right) & \sqrt{b} \theta^\top\left(x\right) \end{bmatrix} \begin{bmatrix} \sqrt{a} \varphi\left(x'\right) \\ \sqrt{b} \theta\left(x'\right) \end{bmatrix} = \sqrt{a} \sqrt{a} \varphi^\top\left(x\right) \varphi\left(x'\right) + \sqrt{b} \sqrt{b} \theta^\top\left(x\right) \theta\left(x'\right) = a K\left(x, x'\right) + b G\left(x, x'\right)\).
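The stacked representation from the hint can be built and checked directly. This sketch assumes \(a \geq 0\) and \(b \geq 0\); the function name and the sample feature vectors are hypothetical.

```python
import math

def combined_features(phi_x, theta_x, a, b):
    """Stack sqrt(a) * phi(x) on top of sqrt(b) * theta(x)."""
    return [math.sqrt(a) * v for v in phi_x] + [math.sqrt(b) * v for v in theta_x]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical feature vectors for two points x and x'.
phi_x, theta_x = [1, 2], [3]
phi_xp, theta_xp = [0, 1], [2]
a, b = 4, 9

psi_x = combined_features(phi_x, theta_x, a, b)
psi_xp = combined_features(phi_xp, theta_xp, a, b)
print(psi_x)  # [2.0, 4.0, 9.0]

# The dot product of the stacked features equals a*K + b*G.
lhs = dot(psi_x, psi_xp)
rhs = a * dot(phi_x, phi_xp) + b * dot(theta_x, theta_xp)
assert math.isclose(lhs, rhs)
```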

📗 Answer (comma separated vector): .

📗 [3 points] Statistically, December 18 is the cloudiest day of the year in Madison, Wisconsin. Your professor (not me, this is Professor Jerry Zhu's question) is not making this up. On that day, the sky is overcast, mostly cloudy, or partly cloudy of the time (C = 0), and clear or mostly clear of the time (C = 1). What is the entropy of the binary random variable C? Reminder: the base-2 logarithm of x can be computed as log(x) / log(2).

Hint

See Fall 2014 Midterm Q10, Fall 2006 Final Q11, Fall 2005 Final Q11. The entropy formula is \(H = -p_{1} \log_{2}\left(p_{1}\right) - p_{2} \log_{2}\left(p_{2}\right)\).
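The two-outcome formula can be evaluated with a few lines of code. This is a minimal sketch; the function name and the probability 0.25 are hypothetical, not the value generated on this page.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary variable with P(C = 1) = p."""
    if p in (0.0, 1.0):
        return 0.0  # by convention 0 * log2(0) = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))   # 1.0, the maximum for a binary variable
print(binary_entropy(0.25))  # about 0.811
```

Because the formula is symmetric in \(p_{1}\) and \(p_{2}\), it does not matter which outcome is labeled C = 1.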

📗 Answer: .

📗 [3 points] A bag contains \(n\) = different colored balls. Randomly draw a ball from the bag with equal probability. What is the entropy of the outcome? Reminder: the base-2 logarithm of x can be computed as log(x) / log(2) or log2(x).

Hint

See Fall 2014 Midterm Q10. The entropy formula is \(H = -\displaystyle\sum_{i=1}^{n} p_{i} \log_{2}\left(p_{i}\right)\). Here, since the probability of drawing each of the \(n\) balls is the same, \(p_{i} = \dfrac{1}{n}\) for each \(i\).
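Substituting \(p_{i} = \dfrac{1}{n}\) into the sum gives \(H = -n \cdot \dfrac{1}{n} \log_{2}\left(\dfrac{1}{n}\right) = \log_{2}\left(n\right)\). The sketch below checks this simplification against the general formula; the function names and \(n = 8\) are hypothetical.

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uniform_entropy(n):
    """Closed form for n equally likely outcomes."""
    return math.log2(n)

n = 8  # hypothetical number of balls
print(entropy([1 / n] * n))  # 3.0
print(uniform_entropy(n))    # 3.0
```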

📗 Answer: .

📗 [1 point] Please enter any comments and suggestions including possible mistakes and bugs with the questions and the auto-grading, and materials relevant to solving the questions that you think are not covered well during the lectures. If you have no comments, please enter "None": do not leave it blank.

📗 Answer: .

# Grade

* * * * *

* * * * *

# Submission

📗 Please do not modify the content in the above text field: use the "Grade" button to update.

📗 Please wait for the message "Successful submission." to appear after the "Submit" button. If there is an error message or no message appears after 10 seconds, please save the text in the above text box to a file using the button or copy and paste it into a file yourself and submit it to Canvas Assignment M4. You could submit multiple times (but please do not submit too often): only the latest submission will be counted.

📗 You could load your answers from the text (or txt file) in the text box below using the button . The first two lines should be "##m: 4" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.

# Solutions

📗 Some of the past exams referenced in the Hints can be found on Professor Zhu, Professor Liang and Professor Dyer's websites: Link, and Link.

📗 Some of the questions are from last year, and I recorded videos going through them. The links are at the bottom of the Week 1 to Week 8 pages, for example: W4 and W8.

Last Updated: July 01, 2025 at 1:48 AM