# Bandit Configuration
T (number of rounds): ___ to ___ (number of trials: ___)
N (number of repetitions):
Reward means: ___ or default: None, Rock Paper Scissors, Volunteer Dilemma, Tragedy of Commons
(Optional) number of players: ___, number of actions: ___
Reward lower bound: ___, upper bound: ___
Reward distribution: Constant, Normal, Uniform
Method: EXP3P, Explore-first, Epsilon-greedy, Successive Elimination, UCB1, EXP3, Bayesian (Beta-Bernoulli), Bayesian (Normal-Normal)
Parameter 1:
Parameter 2:
Parameter 3:
Compute
Regret:
Expected regret: ___, variance: ___
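The page does not say how the regret figures are computed. As a rough sketch only, the following assumes Bernoulli rewards, the UCB1 method, and that "expected regret" and "variance" are the mean and population variance of cumulative pseudo-regret over N independent repetitions of T rounds. The names `ucb1_regret` and `regret_summary` are hypothetical and are not taken from the page.

```python
import math
import random
import statistics


def ucb1_regret(means, T, rng):
    """Run UCB1 for T rounds on Bernoulli arms (an assumption) and
    return the cumulative pseudo-regret against the best arm."""
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, T + 1):
        if t <= k:
            arm = t - 1   # initialisation: pull every arm once
        else:
            # pick the arm with the largest upper confidence bound
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret


def regret_summary(means, T, N, seed=0):
    """Mean and population variance of the regret over N repetitions."""
    rng = random.Random(seed)
    regrets = [ucb1_regret(means, T, rng) for _ in range(N)]
    return statistics.mean(regrets), statistics.pvariance(regrets)
```

Calling `regret_summary([0.2, 0.8], T=1000, N=50)` would return the two numbers in the same order as the fields above. The other methods in the list would replace only the arm-selection step.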
# Attack Configuration
Target:
Alpha:
Epsilon:
Reward at t = ___:
Compute
Cost:
Expected cost: ___, variance: ___
Last Updated: December 14, 2022 at 1:43 AM