📗 Enter your ID (the wisc email ID without @wisc.edu) here: and click the button (or hit the enter key).
📗 If the questions are not generated correctly, try refreshing the page using the button at the top left corner.
📗 The same ID should generate the same set of questions. Your answers are not saved when you close the browser. You could print the page, solve the problems, then enter all your answers at the end.
📗 Please do not refresh the page: your answers will not be saved.
📗 [3 points] Assume the tokenization rule uses whitespace between words as the separator, and one sentence \(s_{1}\) is input into the decoder stack during training time. Write down the attention mask of the self-attention block in the decoder, where \(1\) = attended, \(0\) = masked.
Sentence: \(s_{1}\) = "". (Note: "< s >" is one token, not three).
📗 Answer (matrix with multiple lines, each line is a comma separated vector):
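📗 (Illustration only, not part of the graded answer.) A minimal NumPy sketch of a causal decoder self-attention mask, assuming each position may attend to itself and all earlier positions; the 4-token length is hypothetical and should be replaced by the number of whitespace-separated tokens in your \(s_{1}\), counting "< s >" as one token.

    import numpy as np

    def causal_mask(n):
        # Lower-triangular matrix: row i has 1s in columns 0..i (attended)
        # and 0s in columns i+1..n-1 (masked).
        return np.tril(np.ones((n, n), dtype=int))

    print(causal_mask(4))
    # [[1 0 0 0]
    #  [1 1 0 0]
    #  [1 1 1 0]
    #  [1 1 1 1]]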
📗 [3 points] Assume the tokenization rule uses whitespace between words as the separator, and the following sentences are batched in the original order into a matrix as input to the encoder stack during training time. Write down the attention mask, where \(1\) = attended, \(0\) = masked.
📗 Answer (matrix with multiple lines, each line is a comma separated vector):
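📗 (Illustration only.) One common convention for the encoder mask when sentences of unequal length are batched: every position may attend to the real tokens of its own sentence but not to the padding positions. The sentence lengths below are hypothetical, and the exact matrix format expected by the quiz may differ.

    import numpy as np

    def padding_mask(length, max_len):
        # max_len x max_len mask for one padded sentence: columns beyond the
        # real sentence length are padding positions and are masked (0).
        m = np.zeros((max_len, max_len), dtype=int)
        m[:, :length] = 1
        return m

    # Hypothetical batch: sentences of 3 and 2 tokens, padded to length 3.
    print(padding_mask(3, 3))
    print(padding_mask(2, 3))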
📗 [2 points] Given the attention weight from \(q\) to \(k_{1}\), \(w_{1}\) = , the attention weight from \(q\) to \(k_{2}\), \(w_{2}\) = , and the values \(v_{1}\) = , \(v_{2}\) = , calculate the output vector.
📗 Answer (comma separated vector): .
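📗 (Illustration only.) The output vector is the weighted sum of the value vectors, \(w_{1} v_{1} + w_{2} v_{2}\). A sketch with hypothetical numbers; substitute the weights and values generated for your ID.

    import numpy as np

    w1, w2 = 0.25, 0.75              # hypothetical attention weights
    v1 = np.array([1.0, 2.0])        # hypothetical value vectors
    v2 = np.array([3.0, 4.0])

    output = w1 * v1 + w2 * v2       # weighted sum of the values
    print(output)                    # [2.5 3.5]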
📗 [2 points] Suppose the scaled dot-product attention function is used. Given the query vector \(q\) = and key vectors \(k_{1}\) = , \(k_{2}\) = , calculate the attention weights of \(q\) to \(k_{1}\) and of \(q\) to \(k_{2}\), separated by a comma in the answer.
📗 Answer (comma separated vector): .
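📗 (Illustration only.) With scaled dot-product attention, the score of \(q\) to \(k_{i}\) is \(\frac{q^{\top} k_{i}}{\sqrt{d}}\), where \(d\) is the vector length, and the weights are the softmax of the scores. A sketch with hypothetical vectors; substitute the ones generated for your ID.

    import numpy as np

    q = np.array([1.0, 0.0])         # hypothetical query
    k1 = np.array([1.0, 1.0])        # hypothetical keys
    k2 = np.array([0.0, 1.0])

    d = len(q)
    scores = np.array([q @ k1, q @ k2]) / np.sqrt(d)  # scaled dot products
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the keys
    print(weights)                   # approximately [0.67 0.33]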
📗 [2 points] Suppose the scaled dot-product attention function is used. Given two vectors \(q\) = and \(k\) = , calculate the attention score of \(q\) to \(k\).
📗 Answer: .
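📗 (Reference only.) The attention score before the softmax is just the scaled dot product \(\frac{q^{\top} k}{\sqrt{d}}\), where \(d\) is the length of \(q\); for example, with hypothetical \(q = (1, 0)\) and \(k = (1, 1)\), the score is \((1 \cdot 1 + 0 \cdot 1) / \sqrt{2} \approx 0.707\).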
📗 [3 points] What are the components of an -block in a transformer model?
📗 Choices:
None of the above
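📗 (Background only; the specific choices above are generated per ID.) A standard transformer encoder block contains multi-head self-attention and a position-wise feed-forward network, each followed by a residual connection and layer normalization. A minimal PyTorch sketch of the post-norm variant, with hypothetical dimensions:

    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        def __init__(self, d_model=64, n_heads=4, d_ff=256):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm1 = nn.LayerNorm(d_model)
            self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                    nn.Linear(d_ff, d_model))
            self.norm2 = nn.LayerNorm(d_model)

        def forward(self, x):
            attn_out, _ = self.attn(x, x, x)   # multi-head self-attention
            x = self.norm1(x + attn_out)       # residual connection + layer norm
            x = self.norm2(x + self.ff(x))     # feed-forward + residual + layer norm
            return x

    # Hypothetical input: batch of 2 sequences, 5 tokens, 64-dimensional embeddings.
    print(EncoderBlock()(torch.randn(2, 5, 64)).shape)  # torch.Size([2, 5, 64])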
📗 [4 points] For the following models, what are their basic structures? Select from the options.
📗 Answer:
is
is
is
is
📗 [4 points] From the following options, write down which vectors are used in the computation of each of the following matrices.
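📗 (Background only; the actual matrices and options are generated per ID.) In the standard attention formulation, the query, key, and value matrices are each computed as the input token vectors times a learned weight matrix, and the output is the attention-weight matrix times the values. A NumPy sketch with hypothetical dimensions:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))      # 5 hypothetical token vectors of dimension 8
    W_Q = rng.normal(size=(8, 8))    # illustrative learned projection weights
    W_K = rng.normal(size=(8, 8))
    W_V = rng.normal(size=(8, 8))

    Q, K, V = X @ W_Q, X @ W_K, X @ W_V   # queries, keys, values from the same inputs
    A = np.exp(Q @ K.T / np.sqrt(8))
    A = A / A.sum(axis=1, keepdims=True)  # attention weights (softmax of scaled scores)
    output = A @ V                        # each output row is a weighted sum of values
    print(output.shape)                   # (5, 8)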
📗 You could save the text in the above text box to a file using the button, or copy and paste it into a file yourself.
📗 You could load your answers from the text (or .txt file) in the text box below using the button. The first two lines should be "##m: 21" and "##id: your id", and the format of the remaining lines should be "##1: your answer to question 1" newline "##2: your answer to question 2", etc. Please make sure that your answers are loaded correctly before submitting them.
📗 You can find videos going through the questions on Link.