# Schedule
This is a directed study project in Computer Science. In Summer 2024, the topic is deep reinforcement learning. The original plan was to implement an autonomous driving simulation environment similar to Link and to train the vehicles with Q-network algorithms instead of genetic algorithms.
We ended up implementing DQN and Policy Gradient to solve the Flappy Bird problem, similar to Link. We noticed that random initialization of the networks leads to very slow convergence, and that initializing the networks by pre-training on human behavioral policies speeds up convergence.
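
The pre-training idea can be illustrated with a minimal behavior-cloning sketch. This is not the project's code: it assumes a two-feature state (`dy`, the bird's height minus the gap center, and `vy`, its vertical velocity), a logistic policy in place of a deep network, and a hypothetical `human_action` rule standing in for recorded human play. It fits the policy to the demonstrations by gradient descent on the cross-entropy loss and compares how well the random and pre-trained weights agree with the demonstrated actions.

```python
import math
import random

def human_action(dy, vy):
    # Hypothetical stand-in for recorded human play (an assumption, not real data):
    # flap (1) when the bird is below the gap center or about to drop below it.
    return 1 if dy + 0.5 * vy < 0 else 0

def flap_probability(weights, state):
    # Logistic policy: P(flap | state) = sigmoid(w . [dy, vy, 1]).
    z = sum(w * x for w, x in zip(weights, state))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def pretrain(weights, demos, epochs=50, lr=0.5):
    # Behavior cloning: per-sample gradient descent on the cross-entropy
    # between the policy's flap probability and the demonstrated action.
    weights = list(weights)
    for _ in range(epochs):
        for state, action in demos:
            error = flap_probability(weights, state) - action
            for i, x in enumerate(state):
                weights[i] -= lr * error * x
    return weights

def accuracy(weights, demos):
    # Fraction of demonstrations where the greedy policy matches the human action.
    hits = sum((flap_probability(weights, s) > 0.5) == bool(a) for s, a in demos)
    return hits / len(demos)

rng = random.Random(0)
demos = []
for _ in range(500):
    dy, vy = rng.uniform(-1, 1), rng.uniform(-1, 1)
    demos.append(((dy, vy, 1.0), human_action(dy, vy)))  # trailing 1.0 is the bias input

random_weights = [rng.gauss(0, 0.1) for _ in range(3)]
pretrained_weights = pretrain(random_weights, demos)

print("random init accuracy:", accuracy(random_weights, demos))
print("pretrained accuracy: ", accuracy(pretrained_weights, demos))
```

In a setup like the one described above, the pre-trained weights would then be used as the starting point for DQN or Policy Gradient training, so that early episodes are spent refining a sensible policy rather than exploring from scratch.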

| Week | Date | Topic | Notes |
|------|------|-------|-------|
| 1 | May 20 | Markov Decision Process | W1 |
| 2 | May 27 | Q Learning | W2 |
| 3 | Jun 3 | Neural Networks | W3 |
| 4 | Jun 10 | Gradient Methods | W4 |
| 5 | Jun 17 | Genetic Algorithm | W5 |
| 6 | Jun 24 | Deep Q Network | W6 |
| 7 | Jul 1 | Policy Gradient | W7 |
| 8 | Jul 8 | Project | W8 |
| 9 | Jul 15 | Project | W9 |
| 10 | Jul 22 | Project | W10 |
| 11 | Jul 29 | Project | W11 |
| 12 | Aug 5 | Project | W12 |
| 13 | Aug 12 | Project | W13 |
| 14 | Aug 19 | Project | W14 |

Textbooks: Reinforcement Learning: Link (more theory) and Multi-Agent Reinforcement Learning: Link (more applied).