CS 839: Advanced Topics in Reinforcement Learning

CS 839, Fall 2025, Section 004
Department of Computer Sciences
University of Wisconsin–Madison


Project Page

A major component of your course grade is based upon the completion of a final project. This project must include an implementation component and scientific evaluation of the implementation.

You have two choices for the type of project you do:

  1. Application Project: Choose a domain of interest, define an RL problem in that domain, and apply an RL algorithm to solve the problem.
  2. Algorithm Project: Propose a modification to an RL algorithm that we read about in class or you read about elsewhere. Define performance metrics and demonstrate that the modification improves these metrics in comparison to the original algorithm.
In either case, the scope of the project's goals should be comparable to that of an RL research conference paper, and successful projects may form the basis of future research publications. By default, projects should be completed in groups of two students. If you would like to form a team of three, please discuss it with me beforehand. The final project report will need to clearly describe the contribution of each team member.

Project Proposal

Your first task is to write a project proposal describing what you will do for your project, what your goals are, and how you will evaluate success.

Submit the proposal on Gradescope. Since the projects are to be done in teams, each team member should submit a copy of the proposal to receive credit.

You will receive full credit if the following questions are answered:

  • What is the goal of your project?
  • Why is the goal interesting and non-trivial? If you are successful, why will anyone care?
  • How will you evaluate success? Be as concrete as possible.
  • What (if any) code do you have to build upon?
  • (If application project) Formalize the application as an MDP. What are the states, actions, rewards, and discount?
  • (If algorithm project) Describe the RL environments (i.e., MDPs) you will use to evaluate your proposed modification.
  • Team members.
The more precisely you write your proposal, the more helpful my feedback can be. I'm also happy to discuss projects before the proposal is due.
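
For application projects, it can help to write the MDP down as code before writing it up in prose. The following is a minimal sketch, not a required format: the `MDP` container, the `make_corridor` toy problem, and the `discounted_return` helper are all hypothetical names invented for illustration, and the transitions are assumed to be deterministic to keep the example short. Your own application will likely need stochastic transitions and a richer state representation.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


@dataclass(frozen=True)
class MDP:
    """Hypothetical container for the pieces the proposal asks you to name."""
    states: Sequence[int]
    actions: Sequence[str]
    transition: Callable[[int, str], int]      # deterministic: (s, a) -> s'
    reward: Callable[[int, str, int], float]   # (s, a, s') -> r
    discount: float
    start_state: int
    terminal_states: FrozenSet[int]


def make_corridor(length: int = 5, discount: float = 0.9) -> MDP:
    """Toy problem (an assumption for illustration): walk right to reach the goal."""
    goal = length - 1

    def transition(state: int, action: str) -> int:
        step = 1 if action == "right" else -1
        # Clamp so the agent cannot leave the corridor.
        return min(max(state + step, 0), goal)

    def reward(state: int, action: str, next_state: int) -> float:
        # +1 only on the step that first enters the goal state.
        return 1.0 if next_state == goal and state != goal else 0.0

    return MDP(
        states=tuple(range(length)),
        actions=("left", "right"),
        transition=transition,
        reward=reward,
        discount=discount,
        start_state=0,
        terminal_states=frozenset({goal}),
    )


def discounted_return(mdp: MDP, policy: Callable[[int], str], max_steps: int = 100) -> float:
    """Roll out a deterministic policy and sum the discounted rewards."""
    state, total, scale = mdp.start_state, 0.0, 1.0
    for _ in range(max_steps):
        if state in mdp.terminal_states:
            break
        action = policy(state)
        next_state = mdp.transition(state, action)
        total += scale * mdp.reward(state, action, next_state)
        scale *= mdp.discount
        state = next_state
    return total
```

Calling `discounted_return(make_corridor(), lambda s: "right")` rolls out the always-right policy. Writing the problem this way forces you to decide what terminates an episode and when reward is given, which are the details a proposal most often leaves vague.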

Literature Survey

Your second task is to write a literature survey that covers the relevant research literature on your project topic. This survey will have three sections:
  1. Summarize the plan for your project and any changes to it since the proposal was submitted. If nothing has changed since the proposal, simply state that.
  2. (Most important) The actual literature survey.
    • The survey should be roughly 1 page in a two column format.
    • The survey must include at least 10 surveyed references for each team member (default is 20 references). More is fine too.
    • For each surveyed reference discussed, say 1) what was done, 2) how it relates to your project (what is similar), 3) how your proposed project will be novel relative to this work (what is different).
      • Example: Smith et al. applied policy gradient RL to learn a walking controller for a quadrupedal robot [2020]. While this project also considers learning of walk controllers, we focus on the more challenging task of bipedal locomotion and will use an actor-critic algorithm.
    • You may include other references in the survey without completing these 3 steps. However, you must complete the 3 steps for at least the required number of references (10 per team member; 20 by default).
  3. A bibliography with full citations for each included reference.
Note that the survey will become a component in the final project report. As with the project proposal, each team member should submit the survey on Gradescope.

Literature Survey Guidelines:

  1. Your review must be written in paragraph form, not just as a list of references.
  2. Use of LaTeX is highly encouraged for typesetting your survey and final report. Overleaf.com is a great resource for quickly creating LaTeX projects in a browser.
  3. Citations go at the end of sentences.
  4. Citations are not nouns. Don't say "(Smith et al. 2020) applied policy gradient RL." Say "Smith et al. applied policy gradient RL (2020)."

Final Project Implementation and Report

Implementation:

  • Submit code and executable.
  • Submit documentation of AI usage.
  • README with directions for setting up and running your code. MAKE SURE IT IS ABSOLUTELY CLEAR HOW TO RUN YOUR CODE.

Empirical Analysis:

  • Identify at least one question that will be answered empirically.
  • Form and state at least one hypothesis as an answer to the question.
  • Run an experiment that allows you to either accept or reject the hypothesis.
  • Run an ablation experiment that shows the impact of different components of your implementation.
  • Run a sensitivity experiment that shows how sensitive your implementation is to one or more hyper-parameters.
Your analysis should be conducted following best practices in Empirical Design in Reinforcement Learning.
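
One way to turn a hypothesis such as "the modified algorithm reaches higher final performance than the baseline" into an accept-or-reject decision is to run both variants over several random seeds and report a confidence interval on the difference in mean performance. The sketch below uses a percentile bootstrap, which is one reasonable choice and not the only one. The function `run_trial` is a hypothetical placeholder that returns synthetic numbers so the example runs on its own; replace it with a real training run. The seed count, resample count, and confidence level are likewise assumptions for illustration.

```python
import random
import statistics
from typing import List, Tuple


def run_trial(variant: str, seed: int) -> float:
    """PLACEHOLDER: returns a synthetic score, not a real result.

    Replace with code that trains `variant` using `seed` and returns
    a final-performance number (e.g., average evaluation return).
    """
    offset = {"baseline": 0, "modified": 1}[variant]
    rng = random.Random(1000 * offset + seed)
    mean = 1.0 if variant == "baseline" else 1.2  # made-up values
    return rng.gauss(mean, 0.3)


def bootstrap_diff_ci(
    baseline: List[float],
    modified: List[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap interval for mean(modified) - mean(baseline)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping group sizes fixed.
        resampled_base = rng.choices(baseline, k=len(baseline))
        resampled_mod = rng.choices(modified, k=len(modified))
        diffs.append(statistics.fmean(resampled_mod) - statistics.fmean(resampled_base))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


seeds = range(10)  # assumed seed count; more seeds give tighter intervals
baseline_scores = [run_trial("baseline", s) for s in seeds]
modified_scores = [run_trial("modified", s) for s in seeds]
ci_low, ci_high = bootstrap_diff_ci(baseline_scores, modified_scores)
print(f"95% CI for mean difference: [{ci_low:.3f}, {ci_high:.3f}]")
```

If the interval excludes zero, the data support a difference between the variants at the chosen confidence level; if it includes zero, the experiment does not distinguish them. The same loop extends to ablation and sensitivity experiments by treating each ablated component or hyper-parameter setting as its own variant.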

Report:

  • Approximately 5 two-column pages with 1-inch margins.
  • Includes an introduction that motivates the application or algorithmic modifications. Your project proposal provides a good starting point for the introduction.
  • Includes your literature review in paragraph form (not bullet points!)
  • Description of your approach or application and its implementation.
  • Evaluate good and bad aspects of your implementation.
  • Empirical analysis as specified above.
  • Proofread and spell check.
  • If your implementation requires setting hyper-parameters (e.g., neural network architectures, step sizes), make sure you specify both what values you used and how you arrived at them.
  • AI Usage: You are welcome to use AI to refine your writing as discussed in the class AI Policy. However, you should carefully review generated text to ensure it accurately captures what you intend to say. Generative AI tools often produce nice-sounding but low-information sentences ("AI Fluff"). Such sentences will lead to point deductions.