# M8 Written (Math) Problems

📗 Enter your ID (the wisc email ID without @wisc.edu) here and click the button (or hit the "Enter" key).
📗 You can also load your answers from a previously saved file.
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The official deadline is July 24; late submissions within a week will be accepted without penalty, but please submit a regrade request form: Link.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page, solve the problems on paper, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 Please report any bugs on Piazza: Link

# Warning: please enter your ID before you start!


# Question 1

📗 [4 points] Consider the four points: \(x_{1}\) = , \(x_{2}\) = , \(x_{3}\) = , \(x_{4}\) = . Let there be two initial cluster centers \(c_{1}\) = , \(c_{2}\) = . Use Euclidean distance. Break ties in distances by putting the point in the cluster with the smaller index (i.e. favor cluster 1). If a cluster contains no points, do not move the cluster center (it stays at the initial position). Write down the cluster centers after one iteration of k-means: the first cluster center (comma separated vector) on the first line and the second cluster center (comma separated vector) on the second line.

📗 Note: the red points are the cluster centers and the other points are the training items.
Hint: See Fall 2019 Midterm Q22, Spring 2018 Midterm Q7, Fall 2017 Final Q22, Spring 2017 Midterm Q5, Fall 2014 Final Q20, Fall 2013 Final Q14, Fall 2006 Final Q14, Fall 2005 Final Q14. Find the cluster each \(x_{i}\) belongs to (call it \(k_{i}\)): it is the cluster whose center is closest to the point. Then compute the new cluster centers \(c'_{1}, c'_{2}\) as \(c'_{k} = \dfrac{1}{\displaystyle\sum_{k_{i} = k} 1} \displaystyle\sum_{k_{i} = k} x_{i}\) (see the sketch below).
📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
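📗 The hint's two steps (assign each point to the nearest center, then average each cluster) can be checked numerically. Below is a minimal Python sketch, assuming made-up 2D points and centers in place of the values generated for your ID.

```python
import numpy as np

# Hypothetical values; substitute the x_i and c_k generated for your ID.
x = np.array([[0.0, 1.0], [1.0, 1.0], [3.0, 2.0], [4.0, 4.0]])  # x1..x4
c = np.array([[0.0, 0.0], [4.0, 3.0]])                          # c1, c2

# Assign each point to the closest center; np.argmin breaks ties toward the
# smaller index, which matches "favor cluster 1".
d = np.linalg.norm(x[:, None, :] - c[None, :, :], axis=2)       # shape (4, 2)
k = np.argmin(d, axis=1)

# Move each center to the mean of its points; an empty cluster keeps its center.
c_new = np.array([x[k == j].mean(axis=0) if np.any(k == j) else c[j]
                  for j in range(len(c))])
print(c_new)   # row 1 = new c1, row 2 = new c2
```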
# Question 2

📗 [3 points] Perform k-means clustering on six points: \(x_{1}\) = , \(x_{2}\) = , \(x_{3}\) = , \(x_{4}\) = , \(x_{5}\) = , \(x_{6}\) = . Initially the cluster centers are at \(c_{1}\) = , \(c_{2}\) = . Run k-means for one iteration (assign the points, update the centers once, then reassign the points once). Break ties in distances by putting the point in the cluster with the smaller index (i.e. favor cluster 1). What is the reduction in total distortion? Use Euclidean distance and calculate the total distortion by summing the squares of the individual distances to the assigned center.

📗 Note: the red points are the cluster centers and the other points are the training items.
Hint: See Spring 2018 Midterm Q7, Fall 2016 Final Q9, Fall 2014 Midterm Q5, Fall 2012 Final Q3. (1) Find the cluster each \(x_{i}\) belongs to (call it \(k_{i}\)): it is the cluster whose center is closest to the point. (2) Compute the total distortion as \(\displaystyle\sum_{i=1}^{6} \left(x_{i} - c_{k_{i}}\right)^{2}\). (3) Compute the new cluster centers \(c'_{1}, c'_{2}\) as \(c'_{k} = \dfrac{1}{\displaystyle\sum_{k_{i} = k} 1} \displaystyle\sum_{k_{i} = k} x_{i}\). Then repeat (1) and (2) and take the difference between the two distortions (see the sketch below).
📗 Answer: .
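📗 A minimal Python sketch of the hint's steps (assign, compute the distortion, update the centers, reassign, recompute), assuming made-up one-dimensional points and centers in place of the generated values.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 6.0, 7.0, 9.0])   # hypothetical x1..x6
c = np.array([0.0, 10.0])                       # hypothetical c1, c2

def assign(x, c):
    # nearest center for each point; argmin ties favor cluster 1
    return np.argmin(np.abs(x[:, None] - c[None, :]), axis=1)

def distortion(x, c, k):
    # sum of squared distances to the assigned centers
    return np.sum((x - c[k]) ** 2)

k = assign(x, c)
before = distortion(x, c, k)

# update the centers, then reassign and recompute the distortion
c_new = np.array([x[k == j].mean() if np.any(k == j) else c[j] for j in range(2)])
k_new = assign(x, c_new)
after = distortion(x, c_new, k_new)

print(before - after)   # reduction in total distortion
```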
# Question 3

📗 [4 points] Suppose k-means with \(K = 2\) is used to cluster the data set, and the initial cluster centers are \(c_{1}\) = and \(c_{2}\) = \(x\). What is the smallest value of \(x\) such that cluster 1 has \(n\) = points initially (before the cluster centers are updated)? Break ties by assigning the point to cluster 2.

📗 Note: the red points are the cluster centers and the other points are the training items.
Hint: The \(n\) points on the left (or right, depending on the question) should be assigned to cluster 1. The \(\left(n + 1\right)\)-th point from that side (call it \(x_{n + 1}\)) can be exactly equidistant from the two cluster centers, because when the distances are equal the point is assigned to cluster 2 by the tie-breaking rule. Therefore \(x_{n + 1} = \dfrac{1}{2} \left(c_{1} + c_{2}\right)\), which can be solved for \(c_{2} = 2 x_{n + 1} - c_{1}\) (see the sketch below).
📗 Answer: .
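📗 A minimal Python sketch of the hint's boundary condition, assuming made-up one-dimensional points with \(c_{1}\) to the left of the data so that the \(n\) leftmost points join cluster 1; the points, \(c_{1}\), and \(n\) are placeholders for the generated values.

```python
# Hypothetical values; the points are sorted along the line.
points = sorted([1.0, 2.0, 4.0, 6.0, 8.0])
c1, n = 0.0, 3

# The (n+1)-th point can sit exactly halfway between the centers, because a tie
# sends it to cluster 2: x_{n+1} = (c1 + c2) / 2, so c2 = 2 * x_{n+1} - c1.
x_boundary = points[n]           # the (n+1)-th point from the left (0-indexed)
c2 = 2 * x_boundary - c1
print(c2)
```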
# Question 4

📗 [3 points] Let \(x\) = and \(v\) = . The projection of \(x\) onto \(v\) is the point \(y\) in the direction of \(v\) such that the line connecting \(x\) and \(y\) is perpendicular to \(v\). Compute \(y\).
Hint: See Fall 2018 Midterm Q14. To compute the projection: if \(v\) is a unit vector (\(\left\|v\right\| = 1\)), use the simplified formula \(\left(v^\top x\right) v\); otherwise, use the formula \(\dfrac{v^\top x}{v^\top v} v\) (see the sketch below).
📗 Answer (comma separated vector): .
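📗 A minimal Python sketch of the projection formula in the hint, assuming made-up vectors in place of the generated \(x\) and \(v\).

```python
import numpy as np

x = np.array([2.0, 3.0])    # hypothetical x
v = np.array([1.0, 1.0])    # hypothetical v

# y = ((v . x) / (v . v)) v; when v is a unit vector the denominator is 1.
y = (v @ x) / (v @ v) * v
print(y)
```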
# Question 5

📗 [2 points] You performed PCA (Principal Component Analysis) in \(\mathbb{R}^{3}\). Suppose the first principal component is \(v_{1}\) = and the second principal component is \(v_{2}\) = . What are the new 2D coordinates (the new features created by PCA) for the point \(x\) = ?

Hint: See Fall 2018 Midterm Q13, Fall 2017 Final Q10. Coordinate \(i\) is given by the projection of \(x\) onto the principal component \(v_{i}\). If the principal component is a unit vector \(u_{i}\), use the simplified formula \(u_{i}^\top x\); otherwise, use the formula \(\dfrac{v_{i}^\top x}{v_{i}^\top v_{i}}\) (see the sketch below).
📗 Answer (comma separated vector): .
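📗 A minimal Python sketch of the hint's coordinate formula, assuming made-up principal components and a made-up point.

```python
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])   # hypothetical first principal component
v2 = np.array([0.0, 1.0, 0.0])   # hypothetical second principal component
x  = np.array([2.0, -1.0, 3.0])  # hypothetical point

def coordinate(v, x):
    # projection coefficient; reduces to v . x when v is a unit vector
    return (v @ x) / (v @ v)

print(coordinate(v1, x), coordinate(v2, x))   # the new 2D coordinates
```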
# Question 6

📗 [3 points] Given the variance matrix \(\hat{\Sigma}\) = and one original data point \(x\) = , what is the reconstructed vector using only the first principal components?
Hint: First find the principal components, call them \(u_{1}, u_{2}, u_{3}\): the first principal component is the eigenvector corresponding to the largest eigenvalue (for diagonal matrices, the eigenvalues are just the diagonal entries), and the \(i\)-th principal component is the eigenvector corresponding to the \(i\)-th largest eigenvalue. Here, the eigenvector corresponding to the first eigenvalue is \(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), the eigenvector corresponding to the second eigenvalue is \(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\), and the eigenvector corresponding to the third eigenvalue is \(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\). Then find the new feature vector in the \(K\)-dimensional space: the \(i\)-th component of the new feature vector is \(u_{i}^\top x\), meaning the new feature vector here (with \(K = 2\)) is \(\begin{bmatrix} v_{1} \\ v_{2} \end{bmatrix}\) = \(\begin{bmatrix} u_{1}^\top x \\ u_{2}^\top x \end{bmatrix}\). In the end, the reconstructed vector is \(v_{1} u_{1} + v_{2} u_{2}\) = \(u_{1}^\top x u_{1} + u_{2}^\top x u_{2}\) (see the sketch below).
📗 Answer (comma separated vector):
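📗 A minimal Python sketch of the hint's three steps (find the principal components, project, reconstruct), assuming a made-up diagonal variance matrix, a made-up point, and \(K = 2\).

```python
import numpy as np

Sigma = np.diag([4.0, 9.0, 1.0])   # hypothetical diagonal variance matrix
x = np.array([2.0, -1.0, 3.0])     # hypothetical data point
K = 2                               # hypothetical number of components kept

# Principal components = unit eigenvectors sorted by decreasing eigenvalue.
vals, vecs = np.linalg.eigh(Sigma)          # eigenvalues ascending, eigenvectors as columns
U = vecs[:, np.argsort(vals)[::-1][:K]]     # first K principal components

# New features v_i = u_i^T x, reconstruction = sum_i v_i u_i.
reconstruction = U @ (U.T @ x)
print(reconstruction)
```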
# Question 7

📗 [4 points] Consider the following Markov Decision Process. It has two states \(s\): A and B. It has two actions \(a\): move and stay. The state transition is deterministic: "move" moves to the other state, while "stay" stays at the current state. The reward \(r\) is for "move" and for "stay". The agent starts at state A. In case of a tie, choose "move". Suppose the learning rate is \(\alpha\) = \(1\) and the discount rate is \(\gamma\) = .

Find the Q table \(Q_{i}\) for \(i\) = (\(i = 0\) initializes all values to \(0\)), in the format described by the following table. Enter a two-by-two matrix.
| State \ Action | stay | move |
| --- | --- | --- |
| A | ? | ? |
| B | ? | ? |

Hint: The Bellman equation is \(Q_{i+1}\left(s_{t}, a_{t}\right)\) = \(Q_{i}\left(s_{t}, a_{t}\right) + \alpha \left(r + \gamma \displaystyle\max_{a'} Q_{i}\left(s_{t+1}, a'\right) - Q_{i}\left(s_{t}, a_{t}\right)\right)\), which simplifies to \(r + \gamma \displaystyle\max_{a'} Q_{i}\left(s_{t+1}, a'\right)\) for this question because \(\alpha = 1\) (see the sketch below).
📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
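📗 A minimal Python sketch of the hint's update \(Q_{i+1}\left(s, a\right) = r + \gamma \displaystyle\max_{a'} Q_{i}\left(s', a'\right)\), assuming it is applied to every (state, action) pair at each iteration; the rewards, discount rate, and number of iterations are made up.

```python
states, actions = ["A", "B"], ["stay", "move"]
reward = {"stay": 0.0, "move": 1.0}   # hypothetical rewards for the two actions
gamma, iterations = 0.5, 3            # hypothetical gamma and i

def next_state(s, a):
    # "move" goes to the other state, "stay" remains in the current state
    return s if a == "stay" else ("B" if s == "A" else "A")

Q = {(s, a): 0.0 for s in states for a in actions}   # Q_0: all zeros
for _ in range(iterations):
    # synchronous update: the right-hand side uses the previous table Q_i
    Q = {(s, a): reward[a] + gamma * max(Q[(next_state(s, a), a2)] for a2 in actions)
         for s in states for a in actions}

for s in states:
    print(s, [Q[(s, a)] for a in actions])   # columns: stay, move
```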
# Question 8

📗 [3 points] Consider the Grid World with terminal states "RED" and "GREEN" and 7 other states, shown in the table below.
RED 1 2
3 4 5
6 7 GREEN

There are four actions UP, DOWN, LEFT, RIGHT describing the movement between the states on the grid. The grid does not wrap around, i.e. using the action UP in state 1 results in state 1, not state 7.
Suppose the reward on all transitions (from actions UP, DOWN, LEFT, RIGHT) is \(R_{t}\) = , and the discount factor is \(\gamma\) = . The current policy \(\pi\) (the probabilities of actions UP, DOWN, LEFT, RIGHT in each state) is given in the following table.
| State | UP | DOWN | LEFT | RIGHT |
| --- | --- | --- | --- | --- |
| 1 |  |  |  |  |
| 2 |  |  |  |  |
| 3 |  |  |  |  |
| 4 |  |  |  |  |
| 5 |  |  |  |  |
| 6 |  |  |  |  |
| 7 |  |  |  |  |

The current value function \(V_{k}\) is given in a table laid out like the grid above; the terminal states have value \(0\) and the remaining entries are generated with your ID.

Find the value of state in the next step of value iteration (i.e. \(V_{k+1}\) for state ). Enter one number.
Hint: The value iteration formula is \(V'\left(s\right) = \displaystyle\sum_{a} \mathbb{P}\left\{a\right\} \left(r + \gamma \cdot V\left(s'\right)\right)\); in subscript notation, \(V_{k+1}\left(s\right) = \displaystyle\sum_{a} \mathbb{P}\left\{a\right\} \left(r + \gamma \cdot V_{k}\left(s'\right)\right)\), where \(s'\) is the new state after action \(a\) is performed in state \(s\), \(\mathbb{P}\left\{a\right\}\) is the probability that action \(a\) is used under the policy, and \(r\) (or \(R_{t}\)) is the reward from performing action \(a\) in state \(s\) (see the sketch below).
📗 Answer: .
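📗 A minimal Python sketch of the hint's formula for a single state, assuming a uniform policy row and made-up reward, discount, and current values; the real numbers come from the tables generated for your ID.

```python
grid = [["RED", 1, 2],
        [3,     4, 5],
        [6,     7, "GREEN"]]
pos = {grid[r][c]: (r, c) for r in range(3) for c in range(3)}
moves = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}

def next_state(s, a):
    # the grid does not wrap around: moving off the grid keeps the agent in place
    r, c = pos[s]
    nr, nc = r + moves[a][0], c + moves[a][1]
    return grid[nr][nc] if 0 <= nr < 3 and 0 <= nc < 3 else s

# Hypothetical inputs for state s = 1.
V = {"RED": 0.0, "GREEN": 0.0, 1: 2.0, 2: 1.0, 3: 0.0, 4: 1.0, 5: 3.0, 6: 0.0, 7: 2.0}
policy_row = {"UP": 0.25, "DOWN": 0.25, "LEFT": 0.25, "RIGHT": 0.25}
reward, gamma, s = 1.0, 0.9, 1

V_next = sum(p * (reward + gamma * V[next_state(s, a)]) for a, p in policy_row.items())
print(V_next)   # V_{k+1}(s)
```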
# Question 9

📗 [4 points] There are 3 states \(s_{0}, s_{1}, s_{2}\) and 3 actions \(a_{0}, a_{1}, a_{2}\). We start from , choose , get the reward , and then move to , choose . What are the updated Q values for (, ), based on the current Q table and the movement above, using SARSA and using Q-learning (enter two numbers, comma separated)? The reward decay (discount rate) is \(\gamma\) = , and the step size (learning rate) is \(\alpha\) = .
| State \ Action | \(a_{0}\) | \(a_{1}\) | \(a_{2}\) |
| --- | --- | --- | --- |
| \(s_{0}\) |  |  |  |
| \(s_{1}\) |  |  |  |
| \(s_{2}\) |  |  |  |

Hint: The Q-learning formula is \(Q'\left(s, a\right) = Q\left(s, a\right) + \alpha \left(r + \gamma \displaystyle\max_{a'} Q\left(s', a'\right) - Q\left(s, a\right)\right)\), and the SARSA formula is \(Q'\left(s, a\right) = Q\left(s, a\right) + \alpha \left(r + \gamma Q\left(s', a'\right) - Q\left(s, a\right)\right)\). The difference is whether the \(a'\) chosen by the policy is used or the best \(a'\) among all actions (i.e. the \(a'\) that leads to the highest Q value) is used (see the sketch below).
📗 Answer (comma separated vector): .
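📗 A minimal Python sketch of both update formulas from the hint on one observed transition \(\left(s, a, r, s', a'\right)\); the Q table, indices, \(\gamma\), and \(\alpha\) are made up.

```python
import numpy as np

Q = np.array([[1.0, 0.0, 2.0],    # rows s0, s1, s2; columns a0, a1, a2 (hypothetical)
              [0.0, 1.0, 0.0],
              [2.0, 0.0, 1.0]])
s, a, r, s_next, a_next = 0, 1, 1.0, 2, 0    # hypothetical observed transition
gamma, alpha = 0.9, 0.5                       # hypothetical discount and learning rate

sarsa      = Q[s, a] + alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])   # uses the action taken
q_learning = Q[s, a] + alpha * (r + gamma * Q[s_next].max()   - Q[s, a])   # uses the best action
print(sarsa, q_learning)
```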
# Question 10

📗 [3 points] Consider the state space \(S = \left\{s_{1}, s_{2}\right\}\) and action space \(A\) = {left, right}. In \(s_{1}\) the action "right" sends the agent to \(s_{2}\) and collects reward \(r = 1\). In \(s_{2}\) the action "left" sends the agent to \(s_{1}\) but with zero reward. All other state-action pairs stay in that state with zero reward. With discount factor \(\gamma\) = , what is the value \(v\left(s_{2}\right)\) under the optimal policy?

Hint: See Fall 2017 Final Q4. The value is \(v = r_{0} + \gamma r_{1} + \gamma^{2} r_{2} + ...\), where \(r_{0}, r_{1}, ...\) are the rewards collected by following the optimal policy. Guess the optimal policy: it should be one of "stay, stay", "move, move", "stay, move", or "move, stay"; the values from all four can be computed and compared, and the largest one is the V value. Use the formula \(1 + \gamma + \gamma^{2} + ... = \dfrac{1}{1 - \gamma}\) (see the sketch below).
📗 Answer: .
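📗 A minimal Python sketch that finds the optimal values by iterating the Bellman optimality update on the two-state dynamics described in the question; the discount factor is made up.

```python
gamma = 0.9   # hypothetical discount factor

# transitions[state][action] = (next state, reward), as described in the question
transitions = {
    "s1": {"left": ("s1", 0.0), "right": ("s2", 1.0)},
    "s2": {"left": ("s1", 0.0), "right": ("s2", 0.0)},
}

V = {"s1": 0.0, "s2": 0.0}
for _ in range(1000):   # iterate V(s) = max_a (r + gamma * V(s')) until it settles
    V = {s: max(r + gamma * V[ns] for ns, r in transitions[s].values()) for s in V}

print(V["s2"])   # with gamma = 0.9 this approaches 0.9 / (1 - 0.9**2) ≈ 4.7368
```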
# Question 11

📗 [1 point] Please enter any comments and suggestions, including possible mistakes and bugs with the questions and the auto-grading, and any materials relevant to solving the questions that you think were not covered well during the lectures. If you have no comments, please enter "None": do not leave it blank.
📗 Answer: .

# Grade



# Submission


📗 Please do not modify the content of the grade text field: use the "Grade" button to update it.


📗 Please wait for the message "Successful submission." to appear after you click the "Submit" button. If there is an error message or no message appears after 10 seconds, please save the text in the text box above to a file using the button, or copy and paste it into a file yourself, and submit it to Canvas Assignment M8. You could submit multiple times (but please do not submit too often): only the latest submission will be counted.
📗 You could load your answers from the text (or txt file) in the text box below using the button. The first two lines should be "##m: 8" and "##id: your id", and each of the remaining lines should be "##1: your answer to question 1", "##2: your answer to question 2", etc., one answer per line. Please make sure that your answers are loaded correctly before submitting them.




# Solutions

📗 Some of the past exams referenced in the Hints can be found on Professor Zhu, Professor Liang and Professor Dyer's websites: Link, and Link.
📗 Some of the questions are from last year, and I recorded videos going through them; the links are at the bottom of the Week 1 to Week 8 pages, for example W4 and W8.





Last Updated: April 29, 2024 at 1:11 AM