📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click (or hit the enter key).
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page, solve the problems, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 [3 points] What is the distance between clusters \(C_{1}\) = {} and \(C_{2}\) = {} using linkage?
📗 Answer: .
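📗 Note: a minimal sketch of how the common linkage distances could be computed for the question above, assuming 1D example clusters (the actual points and linkage type are generated from your ID):
```python
# Sketch: single, complete, and average linkage distances between two clusters.
# The 1D clusters below are hypothetical placeholders for the generated values.
def linkage_distances(c1, c2):
    pair_dists = [abs(a - b) for a in c1 for b in c2]   # all pairwise distances
    return {
        "single": min(pair_dists),                      # closest pair
        "complete": max(pair_dists),                    # farthest pair
        "average": sum(pair_dists) / len(pair_dists),   # mean over all pairs
    }

print(linkage_distances([1, 3], [6, 10]))
```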
📗 [4 points] You are given the distance table. Consider the next iteration of hierarchical agglomerative clustering (another name for the hierarchical clustering method we covered in the lectures) using linkage. What will the new values be in the resulting distance table corresponding to the new clusters? If you merge two columns (rows), put the new distances in the column (row) with the smaller index. For example, if you merge columns 2 and 4, the new column 2 should contain the new distances and column 4 should be removed, i.e. the columns and rows should be in the order (1), (2 and 4), (3), (5).
\(d\) =
📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
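📗 Note: a sketch of one agglomerative step on a distance table, assuming single linkage and a hypothetical 4 by 4 table; the closest pair is merged, the new distances go into the smaller-index row and column, and the larger-index row and column are removed:
```python
import numpy as np

# Hypothetical symmetric distance table; substitute the generated one.
d = np.array([[0., 2., 5., 9.],
              [2., 0., 4., 7.],
              [5., 4., 0., 3.],
              [9., 7., 3., 0.]])

# Find the closest pair of clusters (ignore the zero diagonal).
masked = d + np.diag([np.inf] * len(d))
i, j = np.unravel_index(np.argmin(masked), masked.shape)
i, j = min(i, j), max(i, j)

# Single linkage: the merged cluster's distance to every other cluster is the
# minimum of the two old distances (use np.maximum for complete linkage).
merged = np.minimum(d[i], d[j])
d[i, :], d[:, i] = merged, merged
d[i, i] = 0.
d = np.delete(np.delete(d, j, axis=0), j, axis=1)  # drop the larger-index row/column
print(d)
```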
📗 [3 points] Given three clusters, \(A\) = {, }, \(B\) = {\(x\)}, \(C\) = {, }. Find a value of \(x\) so that \(A\) and \(B\) will be merged in the next iteration of single linkage hierarchical clustering, and \(B\) and \(C\) will be merged in the next iteration of complete linkage hierarchical clustering. Break ties by merging with the cluster with the smaller index (i.e. \(A\), then \(B\), then \(C\)).
📗 Note: there can be multiple answers, including non-integer answers, enter one of them. If there are none, enter 0.
📗 Answer: .
📗 [3 points] Perform k-means clustering on six points: \(x_{1}\) = , \(x_{2}\) = , \(x_{3}\) = , \(x_{4}\) = , \(x_{5}\) = , \(x_{6}\) = . Initially the cluster centers are at \(c_{1}\) = , \(c_{2}\) = . Run k-means for one iteration (assign the points, update the centers once, and reassign the points once). Break ties in distances by putting the point in the cluster with the smaller index (i.e. favor cluster 1). What is the reduction in total distortion? Use Euclidean distance and calculate the total distortion by summing the squares of the individual distances to the center.
📗 Note: the red points are the cluster centers and the other points are the training items.
📗 Answer: .
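📗 Note: a sketch of one k-means iteration and the resulting drop in total distortion, assuming hypothetical 1D points and centers (the generated values differ); ties go to the smaller cluster index and an empty cluster keeps its center:
```python
import numpy as np

# Hypothetical 1D training points and initial centers; substitute the generated values.
x = np.array([1., 2., 4., 5., 7., 8.]).reshape(-1, 1)
c = np.array([0., 10.]).reshape(-1, 1)

def assign(x, c):
    d2 = ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)  # squared distances
    return d2.argmin(axis=1)  # argmin breaks ties toward the smaller cluster index

def distortion(x, c, labels):
    return float(((x - c[labels]) ** 2).sum())  # sum of squared distances to own center

labels = assign(x, c)
before = distortion(x, c, labels)
# Update each center to the mean of its assigned points (an empty cluster keeps its center).
c_new = np.array([x[labels == k].mean(axis=0) if (labels == k).any() else c[k]
                  for k in range(len(c))])
labels_new = assign(x, c_new)
after = distortion(x, c_new, labels_new)
print(before - after)  # reduction in total distortion
```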
📗 [4 points] Consider the four points: \(x_{1}\) = , \(x_{2}\) = , \(x_{3}\) = , \(x_{4}\) = . Let there be two initial cluster centers \(c_{1}\) = , \(c_{2}\) = . Use Euclidean distance. Break ties in distances by putting the point in the cluster with the smaller index (i.e. favor cluster 1). If a cluster contains no points, do not move the cluster center (it stays at the initial position). Write down the cluster centers after one iteration of k-means, the first cluster center (comma separated vector) on the first line and the second cluster center (comma separated vector) on the second line.
📗 Note: the red points are the cluster centers and the other points are the training items.
📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
📗 [3 points] Consider the 1D data set: \(x_{i} = i\) for \(i\) = to . To select good initial centers for k-means where \(k\) = , let's set \(c_{1}\) = . Then select \(c_{j}\) from the unused points in the data set, so that it is farthest from any already-selected centers \(c_{1}, ..., c_{j-1}\) (i.e. \(c_{j} = \mathop{\mathrm{argmax}}_{x_{i}} \displaystyle\min\left\{d\left(c_{1}, x_{i}\right), d\left(c_{2}, x_{i}\right), ..., d\left(c_{j-1}, x_{i}\right)\right\}\)). Enter the initial centers (including \(c_{1}\)) in increasing order (from the smallest to the largest). In case of ties, select the smaller number.
📗 Answer (comma separated vector): .
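📗 Note: a sketch of the farthest-first initialization described above, assuming a hypothetical range, \(k\), and \(c_{1}\):
```python
# Sketch of farthest-first center selection for 1D data x_i = i.
# The range, k, and c_1 below are hypothetical; substitute the generated values.
data = list(range(1, 11))   # x_i = i for i = 1 to 10
centers = [1]               # c_1
k = 3

while len(centers) < k:
    # Pick the unused point whose distance to the nearest chosen center is largest;
    # iterating candidates in increasing order keeps the smaller number on ties.
    candidates = [x for x in data if x not in centers]
    nxt = max(candidates, key=lambda x: min(abs(x - c) for c in centers))
    centers.append(nxt)

print(sorted(centers))
```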
📗 [4 points] Suppose k-means with \(K = 2\) is used to cluster the data set, and the initial cluster centers are \(c_{1}\) = and \(c_{2}\) = \(x\). What is the value of \(x\) if cluster 1 has \(n\) = points initially (before updating the cluster centers)? Break ties by assigning the point to cluster 2.
📗 Answer: .
📗 [4 points] Given the dataset , the cluster centers are computed by the k-means clustering algorithm with \(k = 2\). The first cluster center is \(x\) and the second cluster center is . What is the imum value of \(x\) such that the second cluster is empty (contains 0 instances)? In case of a tie in distance, the point belongs to cluster 1.
📗 Answer: .
📗 [3 points] You have a dataset with unique data points which you want to use k-means clustering on. You set up the experiment as follows: you apply k-means with different values of \(k\): \(k\) = . Which \(k\) value will minimize the total distortion? Enter -1 if the answer depends on the data points.
📗 Answer: .
📗 [3 points] Given data and initial k-means cluster centers \(c_{1}\) = and \(c_{2}\) = , what is the initial total distortion (do not take the square root)? Use Euclidean distance. Break ties by assigning points to the first cluster.
📗 Answer: .
📗 [2 points] You performed PCA (Principal Component Analysis) in \(\mathbb{R}^{3}\). Suppose the first principal component is \(u_{1}\) = \(\approx\) and the second principal component is \(u_{2}\) = \(\approx\) . What are the new 2D coordinates (new features created by PCA) for the point \(x\) = ?
📗 In the diagram, the black axes are the original axes, the green axes are the PCA axes, the red vector is \(x\), the red point is the reconstruction \(\hat{x}\) using the PCA axes.
📗 Answer (comma separated vector): .
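📗 Note: a sketch of computing the new 2D features, assuming the data needs no further centering so each new coordinate is just the dot product with a (unit-length) principal component; the vectors below are hypothetical:
```python
import numpy as np

# Hypothetical unit principal components and point; substitute the generated values.
u1 = np.array([1., 1., 0.]) / np.sqrt(2)
u2 = np.array([1., -1., 0.]) / np.sqrt(2)
x = np.array([2., 4., 3.])

new_coords = np.array([u1 @ x, u2 @ x])  # dot products with the unit components
print(new_coords)
```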
📗 [3 points] Let \(x\) = and \(v\) = . The projection of \(x\) onto \(v\) is the point \(y\) in the direction of \(v\) such that the line connecting \(x\) and \(y\) is perpendicular to \(v\). Compute \(y\).
📗 Answer (comma separated vector): .
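📗 Note: for reference, the standard projection identity (not specific to the generated numbers) is \(y = \dfrac{x^{\top} v}{v^{\top} v} v\), which makes \(\left(x - y\right)^{\top} v = 0\).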
📗 [4 points] What is the projected variance of and onto the principal component ? Use the MLE (Maximum Likelihood Estimate) formula for the variance: \(\sigma^{2} = \dfrac{1}{n} \displaystyle\sum_{i=1}^{n} \left(x_{i} - \mu\right)^{2}\) with \(\mu = \dfrac{1}{n} \displaystyle\sum_{i=1}^{n} x_{i}\).
📗 Answer: .
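📗 Note: a sketch of the projected variance computation with the MLE (divide-by-\(n\)) formula, using hypothetical points and a hypothetical unit component:
```python
import numpy as np

# Hypothetical points and unit-length principal component; substitute the generated values.
points = np.array([[1., 2.], [3., 4.], [5., 0.]])
u = np.array([1., 1.]) / np.sqrt(2)

proj = points @ u                          # 1D projections x_i^T u
var = ((proj - proj.mean()) ** 2).mean()   # MLE variance: divide by n, not n - 1
print(var)
```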
📗 [3 points] Given the variance matrix \(\hat{\Sigma}\) = , what is the first principal component?
📗 Answer (comma separated vector):
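📗 Note: a sketch of extracting the first principal component as the unit eigenvector of the variance matrix with the largest eigenvalue, using a hypothetical matrix:
```python
import numpy as np

# Hypothetical variance (covariance) matrix; substitute the generated one.
sigma = np.array([[3., 1.],
                  [1., 2.]])

vals, vecs = np.linalg.eigh(sigma)   # eigh returns eigenvalues in ascending order
u1 = vecs[:, -1]                     # eigenvector for the largest eigenvalue
print(u1 / np.linalg.norm(u1))       # report as a unit vector
```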
📗 [3 points] Given the variance matrix \(\hat{\Sigma}\) = . Suppose one original data point is \(x\) = . What is the reconstructed vector using only the first principal components?
📗 Answer (comma separated vector):
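📗 Note: a sketch of the reconstruction using only the first principal component, assuming no centering is required (consistent with the projection in the previous questions); the matrix and point are hypothetical:
```python
import numpy as np

# Hypothetical variance matrix and data point; substitute the generated values.
sigma = np.array([[3., 1.],
                  [1., 2.]])
x = np.array([2., 1.])

vals, vecs = np.linalg.eigh(sigma)
u1 = vecs[:, -1]          # first principal component (largest eigenvalue)
x_hat = (x @ u1) * u1     # project onto u1, then map back to the original space
print(x_hat)
```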
📗 [3 points] Given the variance matrix \(\hat{\Sigma}\) is a diagonal matrix, what is the smallest value of \(K\) so that the Manhattan distance between the vector \(\begin{bmatrix} 1 \\ 1 \\ ... \\ 1 \end{bmatrix}\) with ones (\(1\)'s) and its reconstruction using the first \(K\) principal components is less than or equal to ?
📗 Answer: .
📗 [4 points] There are 3 states \(s_{0}, s_{1}, s_{2}\) and 3 actions \(a_{0}, a_{1}, a_{2}\). We start from , choose , get the reward and then move to , where we choose . Update the Q value for (, ) based on the current Q table and the transition above, using SARSA and Q-learning (enter two numbers, comma separated). The reward decay (discount rate) is \(\gamma\) = , and the step size (learning rate) is \(\alpha\) = .
State \ Action | \(a_{0}\) | \(a_{1}\) | \(a_{2}\)
\(s_{0}\) | | |
\(s_{1}\) | | |
\(s_{2}\) | | |
📗 Answer (comma separated vector): .
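📗 Note: a sketch of the two update rules for a single \(\left(s, a\right)\) entry; the Q table, transition, reward, \(\gamma\), and \(\alpha\) below are hypothetical placeholders. SARSA bootstraps with the action actually chosen next, Q-learning with the greedy action:
```python
# Hypothetical current Q table and observed transition; substitute the generated values.
Q = {
    ("s0", "a0"): 0.0, ("s0", "a1"): 1.0, ("s0", "a2"): 2.0,
    ("s1", "a0"): 3.0, ("s1", "a1"): 4.0, ("s1", "a2"): 5.0,
    ("s2", "a0"): 6.0, ("s2", "a1"): 7.0, ("s2", "a2"): 8.0,
}
s, a, r, s_next, a_next = "s0", "a1", 1.0, "s1", "a2"
gamma, alpha = 0.9, 0.5

# SARSA uses the action actually chosen in the next state.
sarsa = Q[(s, a)] + alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
# Q-learning uses the maximum Q value over all actions in the next state.
qlearn = Q[(s, a)] + alpha * (r + gamma * max(Q[(s_next, b)] for b in ("a0", "a1", "a2")) - Q[(s, a)])
print(sarsa, qlearn)
```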
📗 [3 points] Consider the Grid World with terminal states "RED" and "GREEN" and 7 other states shown in the table below.
RED | 1 | 2
3 | 4 | 5
6 | 7 | GREEN
There are four actions UP, DOWN, LEFT, RIGHT describing the movement between the states on the grid. The grid does not wrap around, i.e. using the action UP in state 1 results in state 1, not state 7.
Suppose the reward on all transitions (from actions UP, DOWN, LEFT, RIGHT) is \(R_{t}\) = , and the discount factor is \(\gamma\) = . The current policy \(\pi\) (probabilities of actions UP, DOWN, LEFT, RIGHT when in each state) is given in the following table.
State | UP | DOWN | LEFT | RIGHT
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
6 | | | |
7 | | | |
The current value function \(V_{k}\) is given in the table below.
\(0\) | |
 | |
 | | \(0\)
Find the value of state in the next step of value iteration (i.e. \(V_{k+1}\) for state ). Enter one number.
📗 Answer: .
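📗 Note: a sketch of one backup of a state's value under the given policy (the step the question calls value iteration), assuming the 3 by 3 layout above, deterministic moves that stay in place at the grid edges, and hypothetical rewards, discount, policy, and current values:
```python
# Hypothetical grid layout, value function, reward, discount, and policy row;
# substitute the generated values for the state being updated.
grid = [["RED", 1, 2],
        [3, 4, 5],
        [6, 7, "GREEN"]]
V = {"RED": 0.0, "GREEN": 0.0, 1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0, 6: 6.0, 7: 7.0}
R, gamma = -1.0, 0.9
pi = {"UP": 0.25, "DOWN": 0.25, "LEFT": 0.25, "RIGHT": 0.25}  # policy row for the state

def locate(s):
    for r in range(3):
        for c in range(3):
            if grid[r][c] == s:
                return r, c

def next_state(s, action):
    dr, dc = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}[action]
    r, c = locate(s)
    r2, c2 = r + dr, c + dc
    return grid[r2][c2] if 0 <= r2 < 3 and 0 <= c2 < 3 else s  # no wrap-around

s = 4  # state to back up
V_next = sum(p * (R + gamma * V[next_state(s, a)]) for a, p in pi.items())
print(V_next)
```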
📗 [4 points] Consider the following Markov Decision Process. It has two states \(s\): A and B. It has two actions \(a\): move and stay. The state transition is deterministic: "move" moves to the other state, while "stay" stays at the current state. The reward \(r\) is for move (from A and B) and for stay (in A and B). Suppose the discount rate is \(\beta\) = .
Find the Q table \(Q_{i}\) after \(i\) = updates of every entry using Q value iteration (\(i = 0\) initializes all values to \(0\)) in the format described by the following table. Enter a two by two matrix.
State \ Action | stay | move
A | ? | ?
B | ? | ?
📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
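📗 Note: a sketch of Q value iteration for this two-state MDP, assuming hypothetical rewards, discount, and number of updates \(i\):
```python
# Hypothetical rewards, discount, and number of updates; substitute the generated values.
r_move, r_stay, beta = 1.0, 0.0, 0.5
states, actions = ["A", "B"], ["stay", "move"]

def next_state(s, a):
    return s if a == "stay" else ("B" if s == "A" else "A")

def reward(a):
    return r_stay if a == "stay" else r_move

Q = {(s, a): 0.0 for s in states for a in actions}   # i = 0: all entries start at 0
for _ in range(2):                                   # i full updates of every entry
    Q = {(s, a): reward(a) + beta * max(Q[(next_state(s, a), b)] for b in actions)
         for s in states for a in actions}

for s in states:
    print(s, [round(Q[(s, a)], 4) for a in actions])
```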
📗 You could save the text in the above text box to a file using the button, or copy and paste it into a file yourself.
📗 You could load your answers from the text (or txt file) in the text box below using the button. The first two lines should be "##x: 5" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.