# M2A Midterm Part 2

📗 Enter your ID (your wisc email ID without @wisc.edu) here, then click the button (or press the Enter key).
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You can print the page, solve the problems, then enter all your answers at the end.
📗 If the questions are not generated correctly, try the following: (1) refresh the page, (2) clear the browser cache (Ctrl+F5, Ctrl+Shift+R, or Shift+Command+R), (3) switch to incognito/private browsing mode, (4) switch to another browser, (5) use a different ID. If none of these work, message me on Zoom.
📗 Join Zoom if you have questions: Zoom Link
📗 Please do not refresh the page (after you start): your answers will not be saved.

# Warning: please enter your ID before you start!


# Questions

📗 [3 points] Suppose the vocabulary is the alphabet plus space (26 letters + 1 space character). What is the (maximum likelihood) estimated trigram probability \(\hat{\mathbb{P}}\left\{a | x, y\right\}\) with Laplace smoothing (add-1 smoothing) if the sequence \(x, y\) never appeared in the training set? The training set has tokens in total. Enter -1 if more information is required to estimate this probability.
📗 Answer: .
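The add-1 estimate can be sketched as follows. The function name `laplace_trigram` and its parameters are illustrative, not part of the exam page; the only inputs taken from the question are the vocabulary size of 27 and the fact that the context \(x, y\) was never observed, so both counts are zero.

```python
from fractions import Fraction

def laplace_trigram(count_xya: int, count_xy: int, vocab_size: int) -> Fraction:
    """Add-1 smoothed estimate of P(a | x, y)."""
    return Fraction(count_xya + 1, count_xy + vocab_size)

# The context "x, y" never appeared, so both counts are zero
# and the total number of training tokens does not enter the formula.
estimate = laplace_trigram(count_xya=0, count_xy=0, vocab_size=27)
print(estimate)  # 1/27
```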
📗 [3 points] Suppose the cumulative distribution function (CDF) of a discrete random variable \(X \in \left\{0, 1, 2, ...\right\}\) is given in the following table. What is the probability that is observed?
| \(\mathbb{P}\left\{X < 0\right\}\) | \(\mathbb{P}\left\{X \leq 0\right\}\) | \(\mathbb{P}\left\{X \leq 1\right\}\) | \(\mathbb{P}\left\{X \leq 2\right\}\) | \(\mathbb{P}\left\{X \leq 3\right\}\) | \(\mathbb{P}\left\{X \leq 4\right\}\) |
|---|---|---|---|---|---|
| \(0\) | | | | | |

📗 Answer: .
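For a discrete variable on \(\left\{0, 1, 2, ...\right\}\), each point probability is the difference of two adjacent CDF entries. A minimal sketch follows; the CDF values in `hypothetical_cdf` are made up for illustration because the generated table values are not shown here.

```python
def pmf_from_cdf(cdf):
    """cdf[k] = P(X <= k) for k = 0, 1, 2, ...; returns the list of P(X = k)."""
    pmf = []
    previous = 0.0  # P(X < 0) = 0, as in the first table column
    for value in cdf:
        pmf.append(value - previous)
        previous = value
    return pmf

# Hypothetical CDF values (assumption, not from the exam).
hypothetical_cdf = [0.1, 0.3, 0.6, 0.8, 1.0]
pmf = pmf_from_cdf(hypothetical_cdf)
print(pmf[2])  # P(X = 2) = P(X <= 2) - P(X <= 1)
```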
📗 [3 points] Consider an infinite state sequence in which the pattern "" is repeated an infinite number of times. What is the (maximum likelihood) estimated transition probability from state to (without smoothing)?
📗 Answer: .
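One way to compute such an estimate is to count transitions over a single period of the pattern, treating it as cyclic, since in the infinite sequence the last symbol of the pattern is always followed by the first. The pattern `"AABC"` below is a hypothetical example, and `transition_mle` is an illustrative name.

```python
from collections import Counter

def transition_mle(pattern: str, src: str, dst: str) -> float:
    """MLE of P(next = dst | current = src) when `pattern` repeats forever."""
    # Pair each symbol with its successor; the last symbol wraps to the first.
    pairs = Counter(zip(pattern, pattern[1:] + pattern[0]))
    from_src = sum(n for (a, _), n in pairs.items() if a == src)
    # Assumes src occurs in the pattern; otherwise the estimate is undefined.
    return pairs[(src, dst)] / from_src

# Hypothetical pattern (assumption): transitions A->A, A->B, B->C, C->A.
print(transition_mle("AABC", "A", "B"))  # 0.5
```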
📗 [3 points] A tweet is ratioed if at least one reply gets more likes than the tweet. Suppose a tweet has replies, and each of these replies gets more likes than the tweet with probability if the tweet is bad, and probability if the tweet is good. Given that a tweet is ratioed, what is the probability that it is a bad tweet? The prior probability of a bad tweet is .
📗 Answer: .
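The Bayes rule computation can be sketched as below, under the assumption that the replies are independent of each other given the tweet type (the question does not state this explicitly). All numeric inputs in the usage line are hypothetical.

```python
def posterior_bad(n_replies, p_bad, p_good, prior_bad):
    """P(bad | ratioed), assuming replies are conditionally independent."""
    # "Ratioed" is the complement of "no reply beats the tweet".
    ratio_given_bad = 1 - (1 - p_bad) ** n_replies
    ratio_given_good = 1 - (1 - p_good) ** n_replies
    numerator = ratio_given_bad * prior_bad
    return numerator / (numerator + ratio_given_good * (1 - prior_bad))

# Hypothetical inputs (assumption): 1 reply, 0.5 vs 0.25, prior 0.5.
print(posterior_bad(n_replies=1, p_bad=0.5, p_good=0.25, prior_bad=0.5))
```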
📗 [4 points] Consider the following Markov Decision Process. It has two states \(s\), A and B. It has two actions \(a\): move and stay. The state transition is deterministic: "move" moves to the other state, while "stay" stays at the current state. The reward \(r\) is for move, for stay. Suppose the discount rate is \(\beta\) = .
Find the Q table \(Q_{i}\) after \(i\) = updates of every entry using Q value iteration (\(i = 0\) initializes all values to \(0\)) in the format described by the following table. Enter a two by two matrix.
| State \ Action | stay | move |
|---|---|---|
| A | ? | ? |
| B | ? | ? |

📗 Answer (matrix with multiple lines, each line is a comma separated vector): .
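A minimal sketch of Q value iteration for this two-state MDP follows. It assumes synchronous updates (every entry of \(Q_{i}\) is computed from \(Q_{i-1}\)) and that the reward depends only on the action. The reward and discount values in the usage line are hypothetical.

```python
def q_value_iteration(r_stay, r_move, beta, iterations):
    """Returns [[Q(A, stay), Q(A, move)], [Q(B, stay), Q(B, move)]]."""
    states = ["A", "B"]
    other = {"A": "B", "B": "A"}
    q = {s: {"stay": 0.0, "move": 0.0} for s in states}  # Q_0 is all zeros
    for _ in range(iterations):
        # State values from the previous table (synchronous update).
        v = {s: max(q[s].values()) for s in states}
        q = {
            s: {
                "stay": r_stay + beta * v[s],
                "move": r_move + beta * v[other[s]],
            }
            for s in states
        }
    return [[q[s]["stay"], q[s]["move"]] for s in states]

# Hypothetical inputs (assumption): reward 1 for stay, 0 for move, beta = 0.5.
print(q_value_iteration(r_stay=1, r_move=0, beta=0.5, iterations=2))
```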
📗 [4 points] There are minions. They have either one eye or two eyes. They are either short or tall. Complete the two cells (enter ? then ??) in the following table so that the mutual information (i.e. information gain) between "Eyes" and "Height" is :
| Number of minions | Eyes | Height | Examples |
|---|---|---|---|
| | One | Short | Bob, Carl, Stuart |
| ? | One | Tall | John |
| ?? | Two | Short | Dave, Jerry, Jorge |
| | Two | Tall | Kevin, Tim, Tom |

📗 Answer (comma separated vector): .
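To check a candidate pair of counts, the mutual information of the resulting 2 by 2 table can be computed directly. The sketch below assumes information is measured in bits (log base 2); the count tables in the usage lines are hypothetical and are not the exam's values.

```python
from math import log2

def mutual_information(counts):
    """counts[i][j]: items with row-variable value i and column-variable value j."""
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                mi += (n / total) * log2(n * total / (row_tot[i] * col_tot[j]))
    return mi

# Hypothetical tables (assumption): rows = Eyes (one, two), columns = Height (short, tall).
print(mutual_information([[3, 3], [3, 3]]))  # independent attributes
print(mutual_information([[2, 0], [0, 2]]))  # perfectly dependent attributes
```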
📗 [4 points] What is the gradient magnitude of the center element (pixel) of the image . Use the x gradient filter: \(\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}\), and the y gradient filter: \(\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}\). Remember to flip the filters.
📗 Answer: .
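The computation at the center pixel can be sketched as follows: flip each filter in both directions, take the elementwise product with the 3 by 3 image, and combine the two responses as \(\sqrt{g_{x}^{2} + g_{y}^{2}}\). The image used below is a hypothetical example because the generated image is not shown here.

```python
from math import hypot

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve_center(image, kernel):
    """Convolution response at the center of a 3x3 image (kernel is flipped)."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return sum(image[i][j] * flipped[i][j] for i in range(3) for j in range(3))

def gradient_magnitude(image):
    gx = convolve_center(image, SOBEL_X)
    gy = convolve_center(image, SOBEL_Y)
    return hypot(gx, gy)

# Hypothetical image (assumption): a vertical edge on the right.
image = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
print(gradient_magnitude(image))  # 4.0
```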
📗 [4 points] A convolutional neural network has an input image of size x that is connected to a convolutional layer that uses a x filter, zero padding of the image, and a stride of 1. There are activation maps. (Here, zero padding implies that these activation maps have the same size as the input image.) The convolutional layer is then connected to a pooling layer that uses x max pooling with a stride of (non-overlapping, no padding). The pooling layer is then fully connected to an output layer that contains output units. There are no hidden layers between the pooling layer and the output layer. How many different weights must be learned in this whole network, not including any biases?
📗 Answer: .
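The weight count can be sketched as the sum of the shared filter weights and the fully connected weights. The sketch assumes a single-channel input, one shared filter per activation map, a pooling stride equal to the pooling size, and an image size divisible by the pooling size; the pooling layer itself has no weights. The sizes in the usage line are hypothetical.

```python
def cnn_weight_count(image_size, filter_size, num_maps, pool_size, num_outputs,
                     in_channels=1):
    """Number of learned weights (no biases) in conv -> max pool -> fully connected."""
    # Each activation map shares one filter across all image positions.
    conv_weights = filter_size * filter_size * in_channels * num_maps
    # "Same" padding keeps the map size; non-overlapping pooling divides it.
    pooled_size = image_size // pool_size
    fc_weights = pooled_size * pooled_size * num_maps * num_outputs
    return conv_weights + fc_weights

# Hypothetical sizes (assumption): 8x8 image, 3x3 filter, 2 maps, 2x2 pooling, 10 outputs.
print(cnn_weight_count(8, 3, 2, 2, 10))  # 18 + 320 = 338
```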
📗 [4 points] Say we use Naive Bayes in an application where there are features represented by variables, each having possible values, and there are classes. How many probabilities must be stored in the CPTs (Conditional Probability Tables) in the Bayesian network for this problem? Do not include probabilities that can be computed from other probabilities.
📗 Answer: .
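A short sketch of the count: the class prior needs one fewer entry than the number of classes, and each feature needs one fewer entry than its number of values for every class, because the last probability in each distribution follows from the others. The inputs in the usage line are hypothetical.

```python
def naive_bayes_cpt_size(num_features, values_per_feature, num_classes):
    """Free parameters in a Naive Bayes network with identical feature arities."""
    prior = num_classes - 1
    likelihood = num_features * (values_per_feature - 1) * num_classes
    return prior + likelihood

# Hypothetical inputs (assumption): 3 binary features, 2 classes.
print(naive_bayes_cpt_size(3, 2, 2))  # 1 + 6 = 7
```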
📗 [3 points] What is the minimum zero-one cost of a binary (y is either 0 or 1) linear (threshold) classifier (for example, an LTU (Linear Threshold Unit) perceptron) on the following data set?
| \(x_{i}\) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| \(y_{i}\) | | | | | | |

📗 Answer: .
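In one dimension, a linear threshold classifier is a cut point plus a choice of which side is labeled 1, so the minimum zero-one cost can be found by trying every cut and both orientations. The labels in the usage line are hypothetical because the generated \(y_{i}\) row is not shown here.

```python
def min_zero_one_cost(labels):
    """labels[i] is y for the i-th smallest x; returns the fewest possible mistakes."""
    n = len(labels)
    best = n
    for cut in range(n + 1):
        # Predict 0 for the first `cut` points and 1 for the rest.
        mistakes = sum(labels[:cut]) + (n - cut) - sum(labels[cut:])
        # The opposite orientation errs on exactly the remaining points.
        best = min(best, mistakes, n - mistakes)
    return best

# Hypothetical labels (assumption) for x = 1, ..., 6.
print(min_zero_one_cost([0, 1, 0, 1, 1, 1]))  # 1
```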
📗 [4 points] Consider a kernel \(K\left(x_{i_{1}}, x_{i_{2}}\right)\) = + , where both \(x_{i_{1}}\) and \(x_{i_{2}}\) are 1D positive real numbers. What is the feature vector \(\varphi\left(x_{i}\right)\) induced by this kernel evaluated at \(x_{i}\) = ?
📗 Answer (comma separated vector): .
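When a kernel is a sum of terms of the form \(g_{k}\left(x_{i_{1}}\right) g_{k}\left(x_{i_{2}}\right)\), one induced feature vector is \(\varphi\left(x\right) = \left(g_{1}\left(x\right), g_{2}\left(x\right)\right)\). The sketch below checks this for a hypothetical kernel \(K\left(a, b\right) = a b + \left(a b\right)^{2}\), chosen only for illustration; it is not the kernel generated for this question.

```python
def feature_vector(x, components):
    """Evaluates each component function g_k at x."""
    return [g(x) for g in components]

def kernel_from_features(a, b, components):
    """Dot product of the two feature vectors."""
    fa = feature_vector(a, components)
    fb = feature_vector(b, components)
    return sum(u * v for u, v in zip(fa, fb))

# Hypothetical kernel (assumption): K(a, b) = a*b + (a*b)**2, so g_1(x) = x, g_2(x) = x**2.
components = [lambda x: x, lambda x: x ** 2]
print(feature_vector(2, components))           # [2, 4]
print(kernel_from_features(2, 3, components))  # 6 + 36 = 42
```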
📗 [4 points] You are given a training set of five points and their 2-class classifications (+ or -): (, +), (, +), (, +), (, -), (, -). What is the decision boundary associated with this training set using 3NN (3 Nearest Neighbor)?
📗 Answer: .
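For 1D training points, the set of 3 nearest neighbors can only change at a midpoint between two training points, so the decision boundary can be found by checking the predicted label just to the left and right of each midpoint. The sketch assumes the points are 1D and uses hypothetical coordinates, since the generated ones are not shown here.

```python
def knn_predict(query, points, k=3):
    """Majority vote among the k nearest labeled 1D points (k odd)."""
    nearest = sorted(points, key=lambda p: abs(p[0] - query))[:k]
    votes = sum(1 if label == "+" else -1 for _, label in nearest)
    return "+" if votes > 0 else "-"

def decision_boundaries(points, k=3, eps=1e-9):
    """Midpoints at which the kNN prediction flips."""
    xs = sorted(x for x, _ in points)
    mids = sorted({(a + b) / 2 for i, a in enumerate(xs) for b in xs[i + 1:]})
    return [m for m in mids
            if knn_predict(m - eps, points, k) != knn_predict(m + eps, points, k)]

# Hypothetical training set (assumption).
points = [(1, "+"), (2, "+"), (3, "+"), (4, "-"), (5, "-")]
print(decision_boundaries(points))  # [3.5]
```

In this hypothetical set the boundary is the midpoint of 2 and 5: a query is classified as - only once both negative points and exactly one positive point are among its 3 nearest neighbors.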
📗 [3 points] Consider the following directed graphical model over binary variables: \(A \leftarrow B \to C\). Given the CPTs (Conditional Probability Tables):

| Variable | Probability | Variable | Probability |
|---|---|---|---|
| \(\mathbb{P}\left\{B = 1\right\}\) | | | |
| \(\mathbb{P}\left\{C = 1 \mid B = 1\right\}\) | | \(\mathbb{P}\left\{C = 1 \mid B = 0\right\}\) | |
| \(\mathbb{P}\left\{A = 1 \mid B = 1\right\}\) | | \(\mathbb{P}\left\{A = 1 \mid B = 0\right\}\) | |

What is the probability \(\mathbb{P}\){ \(A\) = , \(B\) = , \(C\) = }?
📗 Answer: .
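For the network \(A \leftarrow B \to C\), the joint probability factors as \(\mathbb{P}\left\{B\right\} \mathbb{P}\left\{A | B\right\} \mathbb{P}\left\{C | B\right\}\). A minimal sketch follows, with hypothetical CPT values because the generated ones are not shown here.

```python
def bernoulli(p_one, value):
    """P(V = value) for a binary variable with P(V = 1) = p_one."""
    return p_one if value == 1 else 1 - p_one

def joint(a, b, c, p_b, p_a_given_b, p_c_given_b):
    """P(A = a, B = b, C = c); the conditional tables are keyed by the value of B."""
    return (bernoulli(p_b, b)
            * bernoulli(p_a_given_b[b], a)
            * bernoulli(p_c_given_b[b], c))

# Hypothetical CPTs (assumption).
p_b = 0.5
p_a_given_b = {1: 0.5, 0: 0.25}
p_c_given_b = {1: 0.25, 0: 0.75}
print(joint(1, 0, 0, p_b, p_a_given_b, p_c_given_b))  # 0.5 * 0.25 * 0.25
```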
📗 [4 points] Consider the problem of detecting whether an email message is spam. Say we use four random variables to model this problem: a binary class variable \(S\) that indicates whether the message is spam, and three binary feature variables \(C, F, N\) that indicate whether the message contains "Cash", "Free", "Now". We use a Naive Bayes classifier with associated CPTs (Conditional Probability Tables):
Prior \(\mathbb{P}\left\{S = 1\right\}\) = - -
Hams \(\mathbb{P}\left\{C = 1 | S = 0\right\}\) = \(\mathbb{P}\left\{F = 1 | S = 0\right\}\) = \(\mathbb{P}\left\{N = 1 | S = 0\right\}\) =
Spams \(\mathbb{P}\left\{C = 1 | S = 1\right\}\) = \(\mathbb{P}\left\{F = 1 | S = 1\right\}\) = \(\mathbb{P}\left\{N = 1 | S = 1\right\}\) =

Compute \(\mathbb{P}\){\(C\) = , \(F\) = , \(N\) = }.
📗 Answer: .
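The requested quantity is a marginal over the class variable: sum, over \(S = 0\) and \(S = 1\), the class prior times the product of the three feature likelihoods. The sketch below uses hypothetical CPT values because the generated ones are not shown here.

```python
def evidence_probability(features, prior_spam, p_given_ham, p_given_spam):
    """P(C, F, N) under Naive Bayes; the p_given_* lists hold P(feature = 1 | class)."""
    total = 0.0
    for p_class, cond in ((1 - prior_spam, p_given_ham), (prior_spam, p_given_spam)):
        likelihood = 1.0
        for value, p_one in zip(features, cond):
            likelihood *= p_one if value == 1 else 1 - p_one
        total += p_class * likelihood
    return total

# Hypothetical inputs (assumption): query C = 1, F = 0, N = 1.
print(evidence_probability((1, 0, 1), 0.5, (0.25, 0.5, 0.5), (0.5, 0.5, 0.75)))
```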
📗 [1 point] Please enter any comments, including possible mistakes and bugs in the questions or your answers. If you have no comments, please enter "None": do not leave it blank.
📗 Answer: .

# Grade



# Submission


📗 Please do not modify the content in the above text field: use the "Grade" button to update.


📗 Please wait for the message "Successful submission." to appear after the "Submit" button. Please also save the text in the above text box to a file, either using the button or by copying and pasting it into a file yourself, and submit it through email. You can submit multiple times (but please do not submit too often): only the latest submission will be counted.
📗 You can load your answers from the text (or txt file) in the text box below using the button . The first two lines should be "##x: 2" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.







Last Updated: November 30, 2024 at 4:34 AM