CS 839 Fall 2025, Section 004

Project Page

A major component of your course grade is based upon the completion of a final project. This project must include an implementation component and scientific evaluation of the implementation.

You have two choices for the type of project you do:

Application Project: Choose a domain of interest, define an RL problem in that domain, and apply an RL algorithm to solve the problem.
Algorithm Project: Propose a modification to an RL algorithm that we read about in class or you read about elsewhere. Define performance metrics and demonstrate that the modification improves these metrics in comparison to the original algorithm.

In either case, the scope of the project's goals should be comparable to those of an RL research conference paper, and successful projects may form the basis of future research publications. By default, projects should be completed in groups of two students. If you would like to form a team of up to 3 people, please discuss it with me beforehand. The final project report will need to clearly describe the contribution of each team member.

Project Proposal

Your first task is to write a project proposal describing what you will do for your project, what your goals are, and how you will evaluate success.

Submit the proposal on Gradescope. Since the projects are to be done in teams, each team member should submit a copy of the proposal to receive credit.

You will receive full credit if the following questions are answered:

What is the goal of your project?
Why is the goal interesting and non-trivial? If you are successful, why will anyone care?
How will you evaluate success? Be as concrete as possible.
What (if any) code do you have to build upon?
(If application project) Formalize the application as an MDP. What are the states, actions, rewards, and discount?
(If algorithms project) Describe the RL environments (i.e., MDPs) you will use to evaluate your proposed modification.
Team members.

The more precisely you write your proposal, the more helpful my feedback can be. I'm also happy to discuss projects before the proposal is due.
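For the MDP question above, it can help to write the formalization down as a data structure before describing it in prose. The following is a minimal sketch, not a required format: the `TabularMDP` class, its field names, and the two-state thermostat example are all hypothetical and chosen only for illustration. It assumes a small finite MDP; a continuous or very large problem would need a different representation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = str
Action = str


@dataclass
class TabularMDP:
    """A finite MDP: states, actions, transitions, rewards, and discount."""

    states: List[State]
    actions: List[Action]
    # transitions[(s, a)] maps each possible next state to its probability.
    transitions: Dict[Tuple[State, Action], Dict[State, float]]
    # rewards[(s, a)] is the expected immediate reward for taking a in s.
    rewards: Dict[Tuple[State, Action], float]
    discount: float

    def validate(self) -> None:
        """Raise ValueError if the specification is incomplete or inconsistent."""
        if not 0.0 <= self.discount <= 1.0:
            raise ValueError("discount must be in [0, 1]")
        for s in self.states:
            for a in self.actions:
                key = (s, a)
                if key not in self.transitions or key not in self.rewards:
                    raise ValueError(f"missing transition or reward for {key}")
                probs = self.transitions[key]
                if any(next_s not in self.states for next_s in probs):
                    raise ValueError(f"unknown next state in {key}")
                if any(p < 0.0 for p in probs.values()):
                    raise ValueError(f"negative probability in {key}")
                if abs(sum(probs.values()) - 1.0) > 1e-9:
                    raise ValueError(f"probabilities for {key} do not sum to 1")


# Hypothetical toy example (illustrative numbers only): a thermostat that is
# rewarded for a comfortable room and pays a small energy cost for heating.
thermostat = TabularMDP(
    states=["cold", "comfortable"],
    actions=["heat", "idle"],
    transitions={
        ("cold", "heat"): {"comfortable": 0.9, "cold": 0.1},
        ("cold", "idle"): {"cold": 1.0},
        ("comfortable", "heat"): {"comfortable": 1.0},
        ("comfortable", "idle"): {"comfortable": 0.7, "cold": 0.3},
    },
    rewards={
        ("cold", "heat"): -0.2,
        ("cold", "idle"): 0.0,
        ("comfortable", "heat"): 0.8,
        ("comfortable", "idle"): 1.0,
    },
    discount=0.95,
)
thermostat.validate()
```

Writing the specification this way forces every state-action pair to have an explicit transition distribution and reward, which is usually where an informal description leaves gaps.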

Literature Survey

Your second task is to write a literature survey that covers the relevant research literature on your project topic. This survey will have three sections:

Summarize the plan for your project and any changes to it since the proposal was submitted. If nothing has changed since the proposal, simply state that.
(Most important) The actual literature survey.
- The survey should be roughly 1 page in a two column format.
- The survey must include at least 10 surveyed references for each team member (default is 20 references). More is fine too.
- For each surveyed reference discussed, say 1) what was done, 2) how it relates to your project (what is similar), 3) how your proposed project will be novel relative to this work (what is different).
  - Example: Smith et al. applied policy gradient RL to learn a walking controller for a quadrupedal robot [2020]. While this project also considers learning of walk controllers, we focus on the more challenging task of bipedal locomotion and will use an actor-critic algorithm.
- You may include other references in the survey without completing these 3 steps. However, you must complete the 3 steps for at least 20 references.
A bibliography with full citations for each included reference.

Note that the survey will become a component in the final project report. As with the project proposal, each team member should submit the survey on Gradescope.

Literature Survey Guidelines:

Your review must be written in paragraph form---not just a list of references.
Use of LaTeX is highly encouraged for typesetting your survey and final report. Overleaf.com is a great resource for quickly creating LaTeX projects in a browser.
Citations go at the end of sentences.
Citations are not nouns. Don't say "(Smith et al. 2020) applied policy gradient RL." Say "Smith et al. applied policy gradient RL (2020)."

Final Project Implementation and Report

Implementation:

Submit code and executable.
Submit documentation of AI usage.
README with directions for setting up and running your code. MAKE SURE IT IS ABSOLUTELY CLEAR HOW TO RUN YOUR CODE.

Empirical Analysis:

Identify at least one question that will be answered empirically.
Form and state at least one hypothesis as an answer to the question.
Run an experiment that allows you to either accept or reject the hypothesis.
Run an ablation experiment that shows the impact of different components of your implementation.
Run a sensitivity experiment that shows how sensitive your implementation is to one or more hyper-parameters.

Your analysis should be conducted following best practices in Empirical Design in Reinforcement Learning.
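As one illustration of the hypothesis experiment listed above, suppose the hypothesis is that a modified algorithm achieves a higher mean return than a baseline, and each algorithm has been run with several independent random seeds. The sketch below computes a percentile-bootstrap confidence interval for the difference in mean returns. It is only one possible analysis, not necessarily the one expected for this course; the function name `bootstrap_mean_diff_ci` and its parameters are hypothetical, and bootstrap intervals can be unreliable when the number of seeds is very small.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def bootstrap_mean_diff_ci(
    returns_a: Sequence[float],
    returns_b: Sequence[float],
    n_resamples: int = 10_000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap CI for mean(returns_a) - mean(returns_b).

    Each input holds one final-performance number per independent run (seed).
    """
    rng = random.Random(seed)
    diffs: List[float] = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        sample_a = rng.choices(returns_a, k=len(returns_a))
        sample_b = rng.choices(returns_b, k=len(returns_b))
        diffs.append(statistics.fmean(sample_a) - statistics.fmean(sample_b))
    diffs.sort()
    alpha = 1.0 - confidence
    # Clamp the percentile indices so they always stay inside the list.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return diffs[lo_idx], diffs[hi_idx]
```

If the resulting interval excludes zero, the data are consistent with a real difference between the two algorithms at the chosen confidence level; if it includes zero, the experiment does not support the hypothesis. Either way, report the number of seeds and the interval itself, not just the verdict.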

Report:

Approximately 5 two-column pages with 1-inch margins. If your literature review is longer than 1 page, then your final report will likely be longer.
Includes an introduction that motivates the application or algorithmic modifications. Your project proposal provides a good starting point for the introduction.
Includes your literature review in paragraph form (not bullet points!)
Description of your approach or application and implementation. Your project proposal likely also contains details that you can use here.
Evaluate good and bad aspects of your implementation.
Empirical analysis as specified above.
Proofread and spell check.
If your implementation requires setting hyper-parameters (e.g., neural network architectures, step sizes), make sure you specify both what values you used and how you arrived at them.
AI Usage: You are welcome to use AI to refine your writing as discussed in the class AI Policy. However, you should carefully review generated text to ensure it accurately captures what you intend to say. Generative AI tools often produce nice sounding but low information sentences ("AI Fluff"). Such sentences will lead to point deductions.
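For the hyper-parameter item above, and for the sensitivity experiment in the empirical analysis, a simple sweep that records every value tried makes it easy to report both the values used and how they were chosen. The sketch below is illustrative only: `sensitivity_sweep` is a hypothetical helper, and `toy_trial` is a stand-in that returns a synthetic score rather than running any real training. In a real project, `run_trial` would train your agent with the given step size and seed and return a final performance number.

```python
import json
import random
import statistics
from typing import Callable, Dict, Sequence


def sensitivity_sweep(
    run_trial: Callable[[float, int], float],
    step_sizes: Sequence[float],
    seeds: Sequence[int],
) -> Dict[float, Dict[str, float]]:
    """Run every (step_size, seed) pair and summarize the scores per step size."""
    summary: Dict[float, Dict[str, float]] = {}
    for step_size in step_sizes:
        scores = [run_trial(step_size, seed) for seed in seeds]
        summary[step_size] = {
            "mean": statistics.fmean(scores),
            # Sample standard deviation needs at least two runs.
            "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
            "n_seeds": len(scores),
        }
    return summary


def toy_trial(step_size: float, seed: int) -> float:
    """Synthetic stand-in for a training run: a noisy score peaking near 0.1."""
    rng = random.Random(seed)
    return -((step_size - 0.1) ** 2) + rng.gauss(0.0, 0.001)


if __name__ == "__main__":
    results = sensitivity_sweep(toy_trial, step_sizes=[0.01, 0.1, 1.0], seeds=range(5))
    # Saving or printing the full table documents every value that was tried.
    print(json.dumps({str(k): v for k, v in results.items()}, indent=2))
```

Reporting the whole table, rather than only the best setting, shows readers how much performance depends on the hyper-parameter and how the final value was selected.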

Lightning Talk
On the last day of class, every group will give a quick "lightning" talk about their project. Given the number of groups we have, each group will have only 3 minutes to present. Guidelines are as follows:

There is a strict time limit of 3 minutes.
Content: focus on what your goal was, how you went about achieving that goal, and state your results as of presentation time. Some points to consider covering:
- How did you define the RL problem (MDP) that your project aimed to solve?
- What RL algorithm did you choose and why?
- What challenges did you face with the choice you made?
Submit slides using Google Slides. I will copy them into a single presentation to minimize transition time.
Note that this talk will happen before the project is due. Hopefully you will have some successful results to report in the presentation, but it is fine if you don't, as you will have a few more days before the final report is due.