📗 Genetic algorithm uses a genetic representation of a policy and a fitness function for the reward (usually the value function).
📗 The genetic representation need to work for crossover and mutation.
➩ Crossover between \(\pi_{1}\) and \(\pi_{2}\) combines the policies.
➩ Mutation of \(\pi\) randomly change the policy.
📗 \(N\) policies are initialized randomly and the crossover based on the probabilities \(\dfrac{V^{\pi}}{\displaystyle\sum_{n} V^{\pi_{n}}}\) and mutated randomly.