
# Lecture 13 Examples

📗 My handwriting is really bad; you should copy down your notes from the lecture videos instead of using these.
Lecture 13 Zoom Annotated (2021): Link
Lecture 13 Pre-recorded Annotated (from 2020, please use with caution): Link

# Q13 Quiz Instruction

📗 The quiz is canceled. Everyone gets 0.5 points.

# Multi-Armed Bandit Demo

This is a demo of the multi-armed bandit problem. The boxes have different mean rewards between 0 and 1. Click on one of them to collect the reward. For TopHat quiz questions, enter the code and press the button.




The number of arm pulls per round and the remaining number of pulls are displayed in the demo.
📗 Bandit Settings:
(1) Number of arms
(2) Mean reward
(3) Reward distribution, with standard deviation (or half range)
📗 Algorithm Settings:
(1) Number of arm pulls
(2) Algorithm, \(\varepsilon\) (for Greedy and EXP), and c (for UCB)
📗 Output:
(1) Data
(2) Means
(3) Counts
(4) Total reward
(5) Total regret
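
Below is a minimal sketch of the kind of simulation the demo runs, assuming Gaussian reward noise and standard epsilon-greedy and UCB1 update rules; the function name `simulate` and its parameters are illustrative, and the demo's exact implementation may differ.

```python
# Sketch of a multi-armed bandit simulation (assumptions noted above).
import math
import random

def simulate(means, n_pulls, algorithm="greedy", eps=0.1, c=2.0, std=0.1, seed=0):
    """Pull a K-armed bandit n_pulls times and report totals.

    means     : true mean reward of each arm (between 0 and 1)
    algorithm : "greedy" (epsilon-greedy) or "ucb"
    eps       : exploration probability for epsilon-greedy
    c         : exploration constant for UCB
    std       : standard deviation of the Gaussian reward noise
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k          # (3) Counts
    estimates = [0.0] * k     # (2) Means (empirical)
    total_reward = 0.0        # (4) Total reward

    for t in range(1, n_pulls + 1):
        if algorithm == "greedy":
            if rng.random() < eps:
                arm = rng.randrange(k)                           # explore
            else:
                arm = max(range(k), key=lambda i: estimates[i])  # exploit
        else:  # UCB1: pull each arm once, then maximize mean + exploration bonus
            if 0 in counts:
                arm = counts.index(0)
            else:
                arm = max(range(k),
                          key=lambda i: estimates[i]
                          + c * math.sqrt(math.log(t) / counts[i]))

        reward = rng.gauss(means[arm], std)                      # (1) Data
        counts[arm] += 1
        # incremental update of the empirical mean of the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    # (5) Total regret: best achievable expected reward minus reward collected
    total_regret = n_pulls * max(means) - total_reward
    return estimates, counts, total_reward, total_regret

if __name__ == "__main__":
    est, cnt, rew, reg = simulate([0.2, 0.5, 0.8], n_pulls=1000, algorithm="ucb")
    print("means:", [round(m, 2) for m in est])
    print("counts:", cnt, "reward:", round(rew, 1), "regret:", round(reg, 1))
```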






Last Updated: April 29, 2024 at 1:11 AM