📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click (or hit the Enter key)
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page: , solve the problems, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 [4 points] Given the following training data, what is the fold cross validation accuracy (i.e. LOOCV, Leave One Out Cross Validation) if NN (Nearest Neighbor) classifier with Manhattan distance is used. Break the tie (in distance) by using the instance with the smaller index. Enter a number between 0 and 1.
Index | 1 | 2 | 3 | 4 | 5 | 6
\(x_{i}\)
\(y_{i}\)
📗 Answer: .
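📗 A minimal sketch of the leave-one-out procedure described above, assuming one-dimensional features. The data `xs` and `ys` and the function name `loocv_1nn_accuracy` are hypothetical and are not the values generated for this question.

```python
def loocv_1nn_accuracy(xs, ys):
    """Leave-one-out accuracy of 1NN on 1D points with Manhattan distance.

    Ties in distance go to the instance with the smaller index.
    """
    n = len(xs)
    correct = 0
    for i in range(n):
        best_j, best_d = None, None
        for j in range(n):
            if j == i:
                continue  # the held-out instance cannot be its own neighbor
            d = abs(xs[i] - xs[j])  # Manhattan distance in 1D
            # Strict "<" keeps the earlier (smaller) index when distances tie.
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        if ys[best_j] == ys[i]:
            correct += 1
    return correct / n


# Hypothetical data, for illustration only.
xs = [1, 2, 4, 7, 8, 10]
ys = [0, 0, 1, 1, 0, 0]
print(loocv_1nn_accuracy(xs, ys))  # 0.5
```

With this data, instances 1, 2, and 6 are predicted correctly and instances 3, 4, and 5 are not, which gives 3/6.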
📗 [4 points] Given the following training data, what is the fold cross validation accuracy (i.e. LOOCV, Leave One Out Cross Validation) if NN (Nearest Neighbor) classifier with Manhattan distance is used. Break the tie (in distance) by using the instance with the smaller index. Enter a number between 0 and 1.
Index | 1 | 2 | 3 | 4 | 5
\(x_{i}\)
\(y_{i}\)
📗 Answer: .
📗 [4 points] Given the following training data, what is the fold cross validation accuracy if NN (Nearest Neighbor) classifier with Manhattan distance is used. The first fold is the first instances, the second fold is the next instances, etc. Break the tie (in distance) by using the instance with the smaller index. Enter a number between 0 and 1.
\(x_{i}\)
\(y_{i}\)
📗 Answer: .
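📗 A minimal sketch of cross validation with consecutive folds, assuming one-dimensional features and a fixed fold size. The data, the fold size of 2, and the function name `kfold_1nn_accuracy` are hypothetical.

```python
def kfold_1nn_accuracy(xs, ys, fold_size):
    """Cross-validation accuracy of 1NN (Manhattan distance) with consecutive folds.

    The first fold is the first `fold_size` instances, the second fold is the
    next `fold_size` instances, and so on. Distance ties go to the smaller index.
    """
    n = len(xs)
    correct = 0
    for start in range(0, n, fold_size):
        stop = min(start + fold_size, n)
        train_idx = [j for j in range(n) if j < start or j >= stop]
        for i in range(start, stop):
            # Sorting key (distance, index) breaks ties by the smaller index.
            nearest = min(train_idx, key=lambda j: (abs(xs[i] - xs[j]), j))
            if ys[nearest] == ys[i]:
                correct += 1
    return correct / n


# Hypothetical data, for illustration only.
xs = [1, 2, 4, 7, 8, 10]
ys = [0, 1, 1, 1, 0, 1]
print(kfold_1nn_accuracy(xs, ys, fold_size=2))  # 0.5
```

Unlike leave-one-out, every instance in the held-out fold is excluded from the training set, so a close neighbor in the same fold cannot be used.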
📗 [3 points] Consider points in 2D and binary labels. Given the training data in the table and using Manhattan distance with 1NN (Nearest Neighbor), which of the following points in 2D are classified as 1? Answer the question by first drawing the decision boundaries. The drawing is not graded.
index | \(x_{1}\) | \(x_{2}\) | label
1 | -1 | -1 |
2 | -1 | 1 |
3 | 1 | -1 |
4 | 1 | 1 |
📗 Choices:
None of the above
📗 [4 points] You are given a training set of six points and their 2-class classifications (+ or -): (, +), (, +), (, +), (, -), (, -), (, -). What is the decision boundary associated with this training set using 3NN (3 Nearest Neighbor)? Note: there is one more point compared to the question from the homework.
📗 Answer: .
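📗 A minimal sketch for locating a 3NN decision boundary, assuming the training points are one-dimensional. The set of 3 nearest neighbors of a query can only change at a midpoint between two training points, so it is enough to compare the prediction just to the left and just to the right of each midpoint. The points below and the function names are hypothetical.

```python
from itertools import combinations


def knn_predict_1d(q, points, k=3):
    """Majority label among the k training points closest to q."""
    nearest = sorted(points, key=lambda p: abs(p[0] - q))[:k]
    plus = sum(1 for _, label in nearest if label == "+")
    return "+" if 2 * plus > k else "-"


def decision_boundaries_1d(points, k=3, eps=1e-6):
    """Midpoints at which the kNN prediction flips.

    Assumes eps is smaller than the gap between distinct midpoints,
    which holds for the integer-valued example below.
    """
    candidates = sorted({(a[0] + b[0]) / 2 for a, b in combinations(points, 2)})
    return [
        c
        for c in candidates
        if knn_predict_1d(c - eps, points, k) != knn_predict_1d(c + eps, points, k)
    ]


# Hypothetical training set, for illustration only.
points = [(1, "+"), (2, "+"), (4, "+"), (5, "-"), (7, "-"), (8, "-")]
print(decision_boundaries_1d(points))  # [4.5]
```

Here the boundary is the midpoint of 2 and 7: near the boundary, 4 and 5 are always among the 3 nearest neighbors, so the vote is decided by whichever of 2 or 7 is closer.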
📗 [4 points] You are given a training set of five points and their 2-class classifications (+ or -): (, +), (, +), (, +), (, -), (, -). What is the decision boundary associated with this training set using 3NN (3 Nearest Neighbor)?
📗 Answer: .
📗 [4 points] You are given a training set of five points and their 2-class classifications (+ or -): (, +), (, +), (, -), (, -), (, -). What is the decision boundary associated with this training set using 3NN (3 Nearest Neighbor)?
📗 Answer: .
📗 [3 points] Consider binary classification in 2D where the intended label of a point \(x = \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}\) is positive (1) if \(x_{1} > x_{2}\) and negative (0) otherwise. Let the training set be all points of the form \(x\) = where \(a, b\) are integers. Each training item has the correct label that follows the rule above. With a 1NN (Nearest Neighbor) classifier (Euclidean distance), which of the following points are labeled positive? The drawing is not graded.
📗 Choices:
None of the above
📗 [3 points] Let a dataset consist of \(n\) = points in \(\mathbb{R}\), specifically, the first \(n - 1\) points are and the last point \(x_{n}\) is unknown. What is the smallest value of \(x_{n}\) above which \(x_{n-1}\) is among \(x_{n}\)'s 3-nearest neighbors, but \(x_{n}\) is NOT among \(x_{n-1}\)'s 3-nearest neighbors? Note that the 3-nearest neighbors of a point in the training set include the point itself.
📗 Answer: .
📗 [4 points] Say we have a training set consisting of positive examples and negative examples where each example is a point in a two-dimensional, real-valued feature space. What will the classification accuracy be on the training set with NN (Nearest Neighbor)?
📗 Answer: .
📗 [4 points] List English letters from A to Z: ABCDEFGHIJKLMNOPQRSTUVWXYZ. Define the distance between two letters in the natural way, that is \(d\left(A, A\right) = 0\), \(d\left(A, B\right) = 1\), \(d\left(A, C\right) = 2\) and so on. Each letter has a label, are labeled 0, and the others are labeled 1. This is your training data. Now classify each letter using kNN (k Nearest Neighbor) for odd \(k = 1, 3, 5, 7, ...\). What is the smallest \(k\) where all letters are classified the same (same label, i.e. either all labels are 0s or all labels are 1s). Break ties by preferring the earlier letters in the alphabet. Hint: the nearest neighbor of a letter is the letter itself.
📗 Answer: .
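📗 A minimal sketch of the search over odd \(k\), assuming a hypothetical choice of which letters are labeled 0 (here A, B, C; the letters generated for this question are not shown). Each letter counts as its own nearest neighbor, and distance ties prefer the earlier letter.

```python
import string

LETTERS = string.ascii_uppercase


def knn_label(i, labels, k):
    """Majority label among the k letters nearest to letter index i.

    Sorting by (distance, index) puts the letter itself first (distance 0)
    and breaks distance ties in favor of the earlier letter.
    """
    order = sorted(range(len(labels)), key=lambda j: (abs(i - j), j))
    ones = sum(labels[j] for j in order[:k])
    return 1 if 2 * ones > k else 0


def smallest_uniform_k(zero_letters):
    """Smallest odd k for which every letter receives the same kNN label."""
    labels = [0 if c in zero_letters else 1 for c in LETTERS]
    for k in range(1, len(LETTERS) + 1, 2):
        predictions = {knn_label(i, labels, k) for i in range(len(labels))}
        if len(predictions) == 1:
            return k
    return None  # no odd k up to 25 gives a uniform labeling


# Hypothetical labeling: A, B, C are labeled 0, the rest 1.
print(smallest_uniform_k("ABC"))  # 7
```

With A, B, C labeled 0, the letter A is still classified 0 at \(k = 5\) (neighbors A to E contain three 0s), and at \(k = 7\) every neighborhood contains at least four 1s.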
📗 [3 points] Find the Nearest Neighbor label for using distance.
\(x_{1}\)
\(x_{2}\)
\(y\)
📗 Answer: .
📗 [4 points] You have a data set with positive items and negative items. You perform a "leave-one-out" procedure: for each item i, learn a separate kNN (k Nearest Neighbor) classifier on all items except item i, and compute that kNN's accuracy in predicting item i. The leave-one-out accuracy is defined to be the average of the accuracy for each item. What is the leave-one-out accuracy when k = ?
📗 Answer: .
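📗 A minimal sketch for one special case, assuming \(k\) equals the number of remaining items (\(k = n - 1\)), so that every remaining item votes and the positions of the points do not matter. The counts below and the function name are hypothetical, and a tied vote is counted as an error by assumption.

```python
def loo_accuracy_all_vote(n_pos, n_neg):
    """Leave-one-out accuracy when k = n_pos + n_neg - 1.

    With k this large, the prediction for a held-out item is simply the
    majority label of all remaining items. Assumption: a tied vote counts
    as an incorrect prediction.
    """
    correct = 0
    # Holding out a positive item leaves n_pos - 1 positives and n_neg negatives.
    if n_pos - 1 > n_neg:
        correct += n_pos
    # Holding out a negative item leaves n_pos positives and n_neg - 1 negatives.
    if n_neg - 1 > n_pos:
        correct += n_neg
    return correct / (n_pos + n_neg)


# Hypothetical counts, for illustration only.
print(loo_accuracy_all_vote(10, 10))  # 0.0
print(loo_accuracy_all_vote(15, 5))   # 0.75
```

In the balanced case, removing an item always makes its own class the minority, so every prediction is wrong.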
📗 [3 points] What is the city-block distance (also known as L1 distance or Manhattan distance) between two points and ?
📗 Answer: .
📗 [3 points] What is the city-block distance (also known as L1 distance or Manhattan distance) between two points and ?
📗 Note: the Manhattan distance is the sum of the lengths of the red lines, not the length of the blue line: that is the L2 or Euclidean distance.
📗 Answer: .
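📗 A minimal sketch contrasting the two distances mentioned in the note above. The points `p` and `q` are hypothetical, not the ones generated for this question.

```python
import math


def manhattan(p, q):
    """L1 (city-block) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """L2 distance: length of the straight line between the points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical points, for illustration only.
p, q = (1, 2), (4, 6)
print(manhattan(p, q))  # 7   (|1 - 4| + |2 - 6| = 3 + 4)
print(euclidean(p, q))  # 5.0 (sqrt(3^2 + 4^2))
```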
📗 [2 points] You have a dataset with unique data points (half of which are labeled 0 and the other half labeled 1) which you want to use to train a kNN (k Nearest Neighbor) classifier. You set up the experiment as follows: you train kNN classifiers: \(k\) = using all the data points. Then you randomly select data points from the training set, and classify them using each of the classifiers. Which classifier (enter the \(k\) value) will have the highest accuracy? Your answer should not depend on which random subset is selected.
📗 You could save the text in the above text box to a file using the button or copy and paste it into a file yourself.
📗 You could load your answers from the text (or txt file) in the text box below using the button . The first two lines should be "##m: 9" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.
📗 You can find videos going through the questions on Link.