
# Lecture 13 Examples

📗 My handwriting is really bad; you should copy down your notes from the lecture videos instead of using these.
Lecture 13 Zoom Annotated (2021): Link
Lecture 13 Pre-recorded Annotated (from 2020, please use with caution): Link

# Q13 Quiz Instruction

📗 The quiz is canceled. Everyone gets 0.5 points.

# Multi-Armed Bandit Demo

This is a demo of the multi-armed bandit problem. The boxes have different mean rewards between 0 and 1. Click on one of them to collect the reward. For TopHat quiz questions, enter the code and press the button.




The number of arm pulls per round and the remaining number of pulls are displayed in the demo.
📗 Bandit Settings:
(1) Number of arms
(2) Mean reward
(3) Reward distribution, with standard deviation (or half range)
📗 Algorithm Settings:
(1) Number of arm pulls
(2) Algorithm, \(\varepsilon\) (for Greedy and EXP), and c (for UCB)
📗 Output:
(1) Data
(2) Means
(3) Counts
(4) Total reward
(5) Total regret
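
Below is a minimal sketch of the kind of simulation the demo runs, assuming Gaussian reward noise and standard epsilon-greedy and UCB1 update rules; the function name `simulate` and its parameters are illustrative, and the demo's exact implementation may differ.

```python
# Sketch of a multi-armed bandit simulation (assumptions noted above).
import math
import random

def simulate(means, n_pulls, algorithm="greedy", eps=0.1, c=2.0, std=0.1, seed=0):
    """Pull a K-armed bandit n_pulls times and report totals.

    means     : true mean reward of each arm (between 0 and 1)
    algorithm : "greedy" (epsilon-greedy) or "ucb"
    eps       : exploration probability for epsilon-greedy
    c         : exploration constant for UCB
    std       : standard deviation of the Gaussian reward noise
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k          # (3) Counts
    estimates = [0.0] * k     # (2) Means (empirical)
    total_reward = 0.0        # (4) Total reward

    for t in range(1, n_pulls + 1):
        if algorithm == "greedy":
            if rng.random() < eps:
                arm = rng.randrange(k)                           # explore
            else:
                arm = max(range(k), key=lambda i: estimates[i])  # exploit
        else:  # UCB1: pull each arm once, then maximize mean + exploration bonus
            if 0 in counts:
                arm = counts.index(0)
            else:
                arm = max(range(k),
                          key=lambda i: estimates[i]
                          + c * math.sqrt(math.log(t) / counts[i]))

        reward = rng.gauss(means[arm], std)                      # (1) Data
        counts[arm] += 1
        # incremental update of the empirical mean of the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    # (5) Total regret: best achievable expected reward minus reward collected
    total_regret = n_pulls * max(means) - total_reward
    return estimates, counts, total_reward, total_regret

if __name__ == "__main__":
    est, cnt, rew, reg = simulate([0.2, 0.5, 0.8], n_pulls=1000, algorithm="ucb")
    print("means:", [round(m, 2) for m in est])
    print("counts:", cnt, "reward:", round(rew, 1), "regret:", round(reg, 1))
```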






Last Updated: April 29, 2024 at 1:11 AM