📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click the button (or hit the enter key).
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page, solve the problems, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 [3 points] Assume the tokenization rule uses whitespace between words as the separator, and one sentence \(s_{1}\) is input into the decoder stack during training time. Write down the attention mask of the self-attention block in the decoder, where \(1\) = attended, \(0\) = masked.
Sentence: \(s_{1}\) = "". (Note: "< s >" is one token, not three).
📗 Answer (matrix with multiple lines, each line is a comma separated vector):
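📗 (Illustration only, not part of the graded answer.) A minimal NumPy sketch of a causal decoder self-attention mask, assuming each position may attend to itself and all earlier positions; the 4-token length is hypothetical and should be replaced by the number of whitespace-separated tokens in your \(s_{1}\), counting "< s >" as one token.

    import numpy as np

    def causal_mask(n):
        # Lower-triangular matrix: row i has 1s in columns 0..i (attended)
        # and 0s in columns i+1..n-1 (masked).
        return np.tril(np.ones((n, n), dtype=int))

    print(causal_mask(4))
    # [[1 0 0 0]
    #  [1 1 0 0]
    #  [1 1 1 0]
    #  [1 1 1 1]]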
📗 [3 points] Assume the tokenization rule uses whitespace between words as the separator, and the following sentences are batched in the original order into a matrix as input to the encoder stack during training time. Write down the attention mask, where \(1\) = attended, \(0\) = masked.
📗 Answer (matrix with multiple lines, each line is a comma separated vector):
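📗 (Illustration only.) One common convention for the encoder mask when sentences of unequal length are batched: every position may attend to the real tokens of its own sentence but not to the padding positions. The sentence lengths below are hypothetical, and the exact matrix format expected by the quiz may differ.

    import numpy as np

    def padding_mask(length, max_len):
        # max_len x max_len mask for one padded sentence: columns beyond the
        # real sentence length are padding positions and are masked (0).
        m = np.zeros((max_len, max_len), dtype=int)
        m[:, :length] = 1
        return m

    # Hypothetical batch: sentences of 3 and 2 tokens, padded to length 3.
    print(padding_mask(3, 3))
    print(padding_mask(2, 3))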
📗 [2 points] Given the attention weight from \(q\) to \(k_{1}\), \(w_{1}\) = , the attention weight from \(q\) to \(k_{2}\), \(w_{2}\) = , and the values \(v_{1}\) = , \(v_{2}\) = , calculate the output vector.
📗 Answer (comma separated vector): .
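📗 (Illustration only.) The output vector is the weighted sum of the value vectors, \(w_{1} v_{1} + w_{2} v_{2}\). A sketch with hypothetical numbers; substitute the weights and values generated for your ID.

    import numpy as np

    w1, w2 = 0.25, 0.75              # hypothetical attention weights
    v1 = np.array([1.0, 2.0])        # hypothetical value vectors
    v2 = np.array([3.0, 4.0])

    output = w1 * v1 + w2 * v2       # weighted sum of the values
    print(output)                    # [2.5 3.5]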
📗 [2 points] Suppose the scaled dot-product attention function is used. Given the query vector \(q\) = and key vectors \(k_{1}\) = , \(k_{2}\) = , calculate the attention weights of \(q\) to \(k_{1}\) and of \(q\) to \(k_{2}\), separated by a comma in the answer.
📗 Answer (comma separated vector): .
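📗 (Illustration only.) With scaled dot-product attention, the score of \(q\) to \(k_{i}\) is \(\frac{q^{\top} k_{i}}{\sqrt{d}}\), where \(d\) is the vector length, and the weights are the softmax of the scores. A sketch with hypothetical vectors; substitute the ones generated for your ID.

    import numpy as np

    q = np.array([1.0, 0.0])         # hypothetical query
    k1 = np.array([1.0, 1.0])        # hypothetical keys
    k2 = np.array([0.0, 1.0])

    d = len(q)
    scores = np.array([q @ k1, q @ k2]) / np.sqrt(d)  # scaled dot products
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the keys
    print(weights)                   # approximately [0.67 0.33]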
📗 [2 points] Suppose the scaled dot-product attention function is used. Given two vectors \(q\) = and \(k\) = , calculate the attention score of \(q\) to \(k\).
📗 Answer: .
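📗 (Reference only.) The attention score before the softmax is just the scaled dot product \(\frac{q^{\top} k}{\sqrt{d}}\), where \(d\) is the length of \(q\); for example, with hypothetical \(q = (1, 0)\) and \(k = (1, 1)\), the score is \((1 \cdot 1 + 0 \cdot 1) / \sqrt{2} \approx 0.707\).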
📗 [3 points] What are the components of an -block in a transformer model?
📗 Choices:
None of the above
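📗 (Background only; the specific choices above are generated per ID.) A standard transformer encoder block contains multi-head self-attention and a position-wise feed-forward network, each followed by a residual connection and layer normalization. A minimal PyTorch sketch of the post-norm variant, with hypothetical dimensions:

    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        def __init__(self, d_model=64, n_heads=4, d_ff=256):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm1 = nn.LayerNorm(d_model)
            self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                    nn.Linear(d_ff, d_model))
            self.norm2 = nn.LayerNorm(d_model)

        def forward(self, x):
            attn_out, _ = self.attn(x, x, x)   # multi-head self-attention
            x = self.norm1(x + attn_out)       # residual connection + layer norm
            x = self.norm2(x + self.ff(x))     # feed-forward + residual + layer norm
            return x

    # Hypothetical input: batch of 2 sequences, 5 tokens, 64-dimensional embeddings.
    print(EncoderBlock()(torch.randn(2, 5, 64)).shape)  # torch.Size([2, 5, 64])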
📗 [4 points] For the following models, what are their basic structures? Select from the options.
📗 Answer:
is
is
is
is
📗 [4 points] From the following options, write down which vectors are used in the computation of each of the following matrices.
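📗 (Background only; the actual matrices and options are generated per ID.) In the standard attention formulation, the query, key, and value matrices are each computed as the input token vectors times a learned weight matrix, and the output is the attention-weight matrix times the values. A NumPy sketch with hypothetical dimensions:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))      # 5 hypothetical token vectors of dimension 8
    W_Q = rng.normal(size=(8, 8))    # illustrative learned projection weights
    W_K = rng.normal(size=(8, 8))
    W_V = rng.normal(size=(8, 8))

    Q, K, V = X @ W_Q, X @ W_K, X @ W_V   # queries, keys, values from the same inputs
    A = np.exp(Q @ K.T / np.sqrt(8))
    A = A / A.sum(axis=1, keepdims=True)  # attention weights (softmax of scaled scores)
    output = A @ V                        # each output row is a weighted sum of values
    print(output.shape)                   # (5, 8)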
📗 You could save the text in the above text box to a file using the button, or copy and paste it into a file yourself.
📗 You could load your answers from the text (or .txt file) in the text box below using the button. The first two lines should be "##m: 21" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.
📗 You can find videos going through the questions on Link.