# Finite Horizon Stochastic Game
📗 Number of periods (H):
📗 Number of states (|S|):
📗 Number of actions (|A_1|, |A_2|, ...):
📗 Range of reward (min, max):
📗 Constraints: bounds, worst case, zero sum
Uniform transition
📗 Mean rewards (R):
📗 Transition probabilities (T):
📗 Initial state (Mu):
📗 Number of episodes (K):
(Uniformly distributed actions)
📗 Policy (P0):
📗 Variance of reward (Gaussian):
Coverage
📗 Simulated data (E0, based on H, S, A, R, T, Mu, P0):
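The page does not show how E0 is generated. The following is a minimal sketch of one plausible reading, assuming two players, a uniformly random behavior policy P0, Gaussian reward noise, and NumPy arrays with the shapes given in the docstring. The function name `simulate_episodes`, the array layout, and the tuple format of an episode are all hypothetical.

```python
import numpy as np

def simulate_episodes(R, T, mu, K, variance=1.0, seed=0):
    """Sample K episodes from a finite-horizon two-player stochastic game.

    R:  mean rewards, shape (H, S, A1, A2, 2); last axis is one entry per player
    T:  transition probabilities, shape (H, S, A1, A2, S)
    mu: initial state distribution, shape (S,)
    Returns a list of episodes; each episode is a list of
    (h, s, a1, a2, r, s_next) tuples, with r an array of length 2.
    """
    rng = np.random.default_rng(seed)
    H, S, A1, A2, _ = R.shape
    sigma = np.sqrt(variance)
    episodes = []
    for _ in range(K):
        s = int(rng.choice(S, p=mu))
        episode = []
        for h in range(H):
            # Assumed behavior policy P0: each player acts uniformly at random.
            a1 = int(rng.integers(A1))
            a2 = int(rng.integers(A2))
            # Gaussian noise around the mean reward of each player.
            r = rng.normal(R[h, s, a1, a2], sigma)
            s_next = int(rng.choice(S, p=T[h, s, a1, a2]))
            episode.append((h, s, a1, a2, r, s_next))
            s = s_next
        episodes.append(episode)
    return episodes
```

With K episodes of H steps each, the result holds K * H transitions in total.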
# Estimated Game
📗 Estimated mean rewards (R0, based on E0):
📗 Estimated transition probabilities (T0, based on E0):
📗 Estimated initial state (Mu0, based on E0):
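The estimates R0, T0, and Mu0 could be computed from E0 as plain empirical averages. The sketch below shows that approach under stated assumptions: two players, and episodes stored as lists of `(h, s, a1, a2, r, s_next)` tuples. The function name `estimate_game` and the data layout are hypothetical, and the page may use a different estimator.

```python
import numpy as np

def estimate_game(episodes, H, S, A1, A2):
    """Empirical estimates (R0, T0, Mu0) plus visit counts from episodes.

    Each episode is a list of (h, s, a1, a2, r, s_next) tuples,
    where r holds one reward per player (two players assumed).
    """
    counts = np.zeros((H, S, A1, A2))
    reward_sum = np.zeros((H, S, A1, A2, 2))
    next_counts = np.zeros((H, S, A1, A2, S))
    init_counts = np.zeros(S)
    for episode in episodes:
        init_counts[episode[0][1]] += 1  # state at the first step
        for h, s, a1, a2, r, s_next in episode:
            counts[h, s, a1, a2] += 1
            reward_sum[h, s, a1, a2] += r
            next_counts[h, s, a1, a2, s_next] += 1
    # Entries that were never visited have no data and stay NaN.
    safe = np.where(counts > 0, counts, np.nan)[..., None]
    R0 = reward_sum / safe
    T0 = next_counts / safe
    Mu0 = init_counts / len(episodes)
    return R0, T0, Mu0, counts
```

The returned `counts` array shows which (period, state, joint action) entries the data covers; unvisited entries come back as NaN rather than as a guessed value.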
# Poison Attack
Zero-Target
(Uniform random deterministic actions)
📗 Target Policy (P1):
📗 Epsilon:
Quadratic, Dominant, Nash, Ignore Off-Path
📗 Poisoned data (E1, based on H, S, A, R0, T0, Mu0, P1):
📗 Total cost:
Dominant, Rationalizability, Markov Perfect, Nash, Feasibility, Mean Feasibility
📗 List of costs:
# Estimated Poisoned Game
📗 Estimated mean rewards (R1, based on E1):
📗 Estimated transition probabilities (T1, based on E1):
📗 Estimated initial state (Mu1, based on E1):
📗 Q function without attack (Q0, based on H, S, A, R0, T0, Mu0, P1):
📗 Q function after attack (Q1, based on H, S, A, R1, T1, Mu1, P1):
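For a fixed deterministic joint policy such as P1, a Q function of a finite-horizon game can be obtained by backward induction. The sketch below is an illustration under stated assumptions, not the page's own computation: it assumes two players, no discounting, zero value after the last period, and the array shapes given in the docstring. The function name `q_function` is hypothetical.

```python
import numpy as np

def q_function(R, T, policy):
    """Q values of a fixed deterministic joint policy by backward induction.

    R:      mean rewards, shape (H, S, A1, A2, 2)
    T:      transition probabilities, shape (H, S, A1, A2, S)
    policy: integer array, shape (H, S, 2); policy[h, s] = (a1, a2)
    Returns Q with shape (H, S, A1, A2, 2).
    """
    H, S, A1, A2, n_players = R.shape
    Q = np.zeros(R.shape, dtype=float)
    V_next = np.zeros((S, n_players))  # value after the final period is zero
    for h in reversed(range(H)):
        # Immediate mean reward plus expected continuation value.
        Q[h] = R[h] + T[h] @ V_next
        # Value of each state when both players follow the policy at period h.
        Qh = Q[h]
        V_next = Qh[np.arange(S), policy[h, :, 0], policy[h, :, 1]]
    return Q
```

Calling it once with (R0, T0) and once with (R1, T1) would give two arrays that can be compared entry by entry, in the spirit of Q0 and Q1.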
Last Updated: December 14, 2022 at 1:43 AM