📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click (or hit enter key).
📗 The official deadline is July 4, but you can submit or resubmit without penalty until July 18.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page, solve the problems, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 [3 points] A linear SVM (Support Vector Machine) has \(w\) = and \(b\) = . Which of the following points is predicted positive (label 1)?
Hint
See Fall 2014 Midterm Q12, Fall 2013 Final Q4, Spring 2017 Final Q2. The positively labeled points are the \(x\) such that \(w^\top x + b \geq 0\).
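A minimal sketch of this rule in Python (not the graded calculator; the weight vector, bias, and candidate points below are made-up placeholders, since the actual values are generated from your ID):
```python
import numpy as np

w = np.array([1.0, -2.0])   # placeholder weight vector
b = 0.5                     # placeholder bias
candidates = [np.array([1.0, 1.0]), np.array([-1.0, 2.0]), np.array([2.0, 0.0])]

# predict label 1 exactly when w^T x + b >= 0, otherwise label 0
for x in candidates:
    print(x, "->", 1 if w @ x + b >= 0 else 0)
```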
📗 Choices:
None of the above
📗 Calculator: .
📗 [2 points] Suppose an SVM (Support Vector Machine) has \(w\) = and \(b\) = . What is the actual distance between the two planes defined by \(w^\top x + b = -1\) and \(w^\top x + b = 1\)?
📗 Note: the distance between the two planes is the length of the red line in the diagram; the blue line does not represent the distance between the planes. You may have to rotate the diagram to see it.
Hint
See Fall 2014 Midterm Q14. The distance between the two planes is \(\dfrac{2}{\sqrt{w^\top w}}\).
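A small sanity check of this formula in Python (the weight vector is a placeholder; the real one is generated per ID):
```python
import numpy as np

w = np.array([3.0, 4.0])          # placeholder weight vector
distance = 2.0 / np.sqrt(w @ w)   # 2 / ||w||, the gap between w^T x + b = -1 and w^T x + b = 1
print(distance)                   # 2 / 5 = 0.4 for this placeholder w
```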
📗 Answer: .
📗 [4 points] If \(K\left(x, x'\right)\) is a kernel with induced feature representation \(\varphi\left(x_{0}\right)\) = , and \(G\left(x, x'\right)\) is another kernel with induced feature representation \(\theta\left(x_{0}\right)\) = , then it is known that \(H\left(x, x'\right) = a K\left(x, x'\right) + b G\left(x, x'\right)\), \(a\) = , \(b\) = is also a kernel. What is the induced feature representation of \(H\) for this \(x_{0}\)?
Hint
See Fall 2014 Midterm Q15, Fall 2013 Final Q7, Fall 2011 Midterm Q9. This requires guess and check: suppose the feature representation is \(\begin{bmatrix} \sqrt{a} \varphi\left(x\right) \\ \sqrt{b} \theta\left(x\right) \end{bmatrix}\), then \(H\left(x, x'\right) = \begin{bmatrix} \sqrt{a} \varphi\left(x\right) & \sqrt{b} \theta\left(x\right) \end{bmatrix} \begin{bmatrix} \sqrt{a} \varphi\left(x'\right) \\ \sqrt{b} \theta\left(x'\right) \end{bmatrix} = \sqrt{a} \sqrt{a} \varphi^\top\left(x\right) \varphi\left(x'\right) + \sqrt{b} \sqrt{b} \theta^\top\left(x\right) \theta\left(x'\right) = a K\left(x, x'\right) + b G\left(x, x'\right)\).
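A numerical illustration of this guess-and-check argument (the feature maps, coefficients, and inputs below are my own placeholders, not the quiz's values):
```python
import numpy as np

a, b = 2.0, 3.0                           # placeholder coefficients
phi = lambda x: np.array([x, x ** 2])     # placeholder feature map for K
theta = lambda x: np.array([1.0, x])      # placeholder feature map for G

def h_feature(x):
    # stack sqrt(a) * phi(x) on top of sqrt(b) * theta(x)
    return np.concatenate([np.sqrt(a) * phi(x), np.sqrt(b) * theta(x)])

x, xp = 1.5, -0.5
lhs = h_feature(x) @ h_feature(xp)                            # H(x, x')
rhs = a * (phi(x) @ phi(xp)) + b * (theta(x) @ theta(xp))     # a K(x, x') + b G(x, x')
print(np.isclose(lhs, rhs))                                   # True
```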
📗 Answer (comma separated vector): .
📗 [3 points] Recall a linear SVM (Support Vector Machine) with slack variables has the objective function \(\dfrac{1}{2} w^\top w + C \displaystyle\sum_{i=1}^{n} \varepsilon_{i}\). What is the optimal \(w\) when the trade-off parameter \(C\) is 0? The training data contains only points with label 0 and with label 1. Only enter the weights, no bias.
Hint
See Fall 2014 Midterm Q13, Fall 2012 Final Q7. The minimization problem is \(\displaystyle\min_{w} \dfrac{1}{2} w^\top w\), the answer is always \(w = 0\) independent of the training set.
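A quick numerical check of this reasoning, minimizing the soft-margin objective with \(C = 0\) on made-up placeholder data (scipy's general-purpose optimizer is used purely for illustration):
```python
import numpy as np
from scipy.optimize import minimize

# placeholder training data; labels are in {0, 1} as in the question
X = np.array([[1.0, 2.0], [-1.0, 0.5], [2.0, -1.0]])
y = np.array([1, 0, 0])
C = 0.0

def objective(params):
    w, b = params[:2], params[2]
    margins = (2 * y - 1) * (X @ w + b)      # (2y - 1) maps {0, 1} labels to {-1, +1}
    slack = np.maximum(0.0, 1.0 - margins)   # smallest feasible slack values
    return 0.5 * w @ w + C * slack.sum()

result = minimize(objective, x0=np.ones(3))
print(np.round(result.x[:2], 6))             # w is (approximately) [0, 0]
```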
📗 Answer (comma separated vector): .
📗 [2 points] Consider a small dataset with \(n\) points, where each point is in a dimensional space. For which values of \(n\) does there exist a dataset such that, no matter what binary labels we give to the points, a linear SVM (Support Vector Machine) can perfectly classify the resulting dataset?
Hint
See Fall 2019 Final Q6, Fall 2010 Final Q14. The largest such \(n\) is called the Vapnik-Chervonenkis (VC) dimension. The VC dimension for linear classifiers (for example SVM) is the dimension of the space plus 1. For example, in 2D with 3 non-collinear points, no matter what binary label we give to each point, a line can always separate the two classes; this is not the case with 4 points (remember the XOR example).
For this question, you can select all values less than or equal to the dimension of the space plus 1.
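A brute-force check of this shattering argument (an illustration only; the point coordinates are my own and scikit-learn is assumed to be available):
```python
import itertools
import numpy as np
from sklearn.svm import SVC

def shatterable(points):
    """Return True if every binary labeling of `points` is linearly separable."""
    for labels in itertools.product([0, 1], repeat=len(points)):
        y = np.array(labels)
        if len(set(labels)) < 2:
            continue  # a single-class labeling is trivially separable
        clf = SVC(kernel="linear", C=1e6)   # large C approximates a hard margin
        clf.fit(points, y)
        if clf.score(points, y) < 1.0:      # some labeling cannot be separated
            return False
    return True

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])         # 3 non-collinear points
xor = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # the XOR configuration

print(shatterable(triangle))  # True: 3 points in 2D can be shattered
print(shatterable(xor))       # False: the XOR labeling is not separable
```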
📗 Choices:
None of the above
📗 [2 points] Given a weight vector \(w\) = , consider the line (plane) defined by \(w^\top x = c\) = . Along this line (on the plane), there is a point that is closest to the origin. How far is that point from the origin in Euclidean distance?
📗 Note: the distance between the origin and the plane is the length of the red line in the diagram; the length of the blue line is \(\dfrac{c}{w_{z}}\), which is not that distance.
Hint
See Fall 2011 Midterm Q7. Use a similar approach to the derivation of the margin: suppose the vector from the origin to the point on the line is \(\lambda w\), then \(w^\top \lambda w = c\), which implies that \(\lambda = \dfrac{c}{w^\top w} = \dfrac{c}{\left\|w\right\|^{2}}\). The length of the vector \(\lambda w\) is the distance from the origin to the point on the line: \(\left\|\lambda w\right\| = \lambda \left\|w\right\| = \dfrac{c}{\left\|w\right\|^{2}} \left\|w\right\| = \dfrac{c}{\left\|w\right\|}\).
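A small numeric check of this derivation (the values of \(w\) and \(c\) below are placeholders):
```python
import numpy as np

w = np.array([1.0, 2.0, 2.0])        # placeholder weight vector
c = 6.0                              # placeholder constant
lam = c / (w @ w)                    # lambda = c / ||w||^2
closest = lam * w                    # the point on the plane closest to the origin
print(np.isclose(w @ closest, c))    # True: the point really lies on the plane
print(np.linalg.norm(closest))       # 2.0, which equals c / ||w|| = 6 / 3
```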
📗 Answer: .
📗 [2 points] Let \(w\) = and \(b\) = . For the point \(x\) = , \(y\) = , what is the smallest slack value \(\xi\) for it to satisfy the margin constraint?
Hint
See Fall 2011 Midterm Q8, Fall 2009 Final Q1. There are two inequality constraints for the slack variable: (1) \(\left(2 y - 1\right)\left(w^\top x + b\right) \geq 1 - \xi\) and (2) \(\xi \geq 0\). Combine the two inequalities to get the smallest slack variable value.
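Combining the two constraints gives \(\xi = \max\left(0, 1 - \left(2 y - 1\right)\left(w^\top x + b\right)\right)\); a short sketch with placeholder values (the real \(w\), \(b\), \(x\), \(y\) are generated per ID):
```python
import numpy as np

w = np.array([1.0, -1.0])            # placeholder weight vector
b = 0.5                              # placeholder bias
x = np.array([0.5, 1.0])             # placeholder point
y = 1                                # placeholder label in {0, 1}

margin = (2 * y - 1) * (w @ x + b)   # (2y - 1) maps the {0, 1} label to {-1, +1}
xi = max(0.0, 1.0 - margin)          # smallest slack satisfying both constraints
print(xi)
```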
📗 Answer: .
📗 [6 points] A linear SVM (Support Vector Machine) with weights \(w_{1}, w_{2}, b\) is trained on the following data set: \(x_{1}\) = , \(y_{1}\) = and \(x_{2}\) = , \(y_{2}\) = . The attributes (i.e. features) are two dimensional \(\left(x_{i1}, x_{i2}\right)\) and the label \(y_{i}\) is binary. The classification rule is \(\hat{y}_{i} = 1_{\left\{w_{1} x_{i1} + w_{2} x_{i2} + b \geq 0\right\}}\). Assuming \(b\) = , what is \(\left(w_{1}, w_{2}\right)\)? The drawing is not graded.
Hint
See Fall 2019 Final Q11, Fall 2006 Final Q15, Fall 2005 Final Q15. Draw the line (decision boundary). One way to figure out the equation of the line is by noting that the line passes through the midpoint between \(x_{1}\) and \(x_{2}\), and the line (slope) is perpendicular to the line from \(x_{1}\) to \(x_{2}\). Make sure you have the correct signs: the class-1 point should satisfy \(w_{1} x_{i1} + w_{2} x_{i2} + b \geq 0\) and the class-0 point should satisfy \(w_{1} x_{i1} + w_{2} x_{i2} + b < 0\).
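A sketch of this geometric recipe with made-up placeholder points, assuming (as the hint describes) that the decision boundary is the perpendicular bisector of the two points and that the given \(b\) is consistent with that boundary:
```python
import numpy as np

x_pos = np.array([3.0, 3.0])   # placeholder class-1 point
x_neg = np.array([1.0, 1.0])   # placeholder class-0 point
b = -2.0                       # placeholder given bias

d = x_pos - x_neg              # w points from the class-0 point toward the class-1 point
m = (x_pos + x_neg) / 2.0      # the boundary passes through the midpoint
w = (-b / (d @ m)) * d         # scale w so that w^T m + b = 0 (assumes d @ m != 0)
print(w)                       # (0.5, 0.5) for these placeholder points

# sanity check on the signs: class 1 on the >= 0 side, class 0 on the < 0 side
print(w @ x_pos + b >= 0, w @ x_neg + b >= 0)   # True False
```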
📗 Answer (comma separated vector): .
📗 [4 points] A linear SVM (Support Vector Machine) perfectly classifies a set of training data containing positive examples and negative examples with 2 support vectors. After adding one more positively labeled training example and retraining the SVM, what is the maximum possible number of support vectors in the new SVM?
Hint
See Fall 2019 Final Q7 Q8. Try to come up with an example such that all instances are support vectors: say the negative points are \((-1, -1), \left(-2, -1\right), \left(-3, -1\right), ...\) and the positive points are \(\left(1, 1\right), \left(2, 1\right), \left(3, 1\right), ...\). Draw the points and draw the SVM decision boundary. Note that there are 2 support vectors. Suppose you add a point \((-1, 1)\). Draw the new point and draw the new SVM decision boundary. Note that now all points are support vectors.
📗 Answer: .
📗 [1 point] Please enter any comments and suggestions, including possible mistakes and bugs with the questions and the auto-grading, and any materials relevant to solving the questions that you think are not covered well during the lectures. If you have no comments, please enter "None": do not leave it blank.
📗 Please do not modify the content in the above text field: use the "Grade" button to update.
📗 Please wait for the message "Successful submission." to appear after the "Submit" button. If there is an error message or no message appears after 10 seconds, please save the text in the above text box to a file using the button or copy and paste it into a file yourself and submit it to Canvas Assignment M4. You could submit multiple times (but please do not submit too often): only the latest submission will be counted.
📗 You could load your answers from the text (or txt file) in the text box below using the button . The first two lines should be "##m: 4" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.
📗 Some of the past exams referenced in the Hints can be found on Professor Zhu's and Professor Dyer's websites: Link and Link.
📗 Some of the questions are from last year, and I recorded videos going through them; the links are at the bottom of the Week 1 to Week 8 pages, for example: W4 and W8.
📗 The links to the solutions the students volunteered to share on Piazza will be collected in this post around the official deadline: Link.