CS 839: Advanced Topics in Reinforcement Learning
CS 839, Fall 2025, Section 004
Department of Computer Sciences
University of Wisconsin–Madison
Course Description: Reinforcement learning is the branch of machine learning that studies how an agent can learn from taking actions and receiving feedback in an unknown environment. Reinforcement learning algorithms have demonstrated impressive empirical successes ranging from beating a human world champion at the board game Go to allowing robots to learn difficult manipulation and locomotion skills. This course aims to introduce students to reinforcement learning and topics at the forefront of reinforcement learning research. The first half of the course provides an introduction to RL fundamentals such as temporal difference learning, model-based learning, and policy gradients. The second half covers advanced topics in the RL research literature such as hierarchical and multi-agent reinforcement learning. The course will assume familiarity with probability, statistics, and topics covered in an introductory machine learning class. Familiarity with the Python programming language is recommended.
Number of credits associated with the course: 3
How credit hours are met by the course: This class meets for two 75-minute class periods each week over the semester and carries the expectation that students will work on course learning activities (reading, writing, problem sets, studying, etc.) for about 3 hours outside the classroom for every class period.
Prerequisite: None, but the course will assume familiarity with probability, statistics, linear algebra, and topics covered in an introductory machine learning class (linear regression, neural networks, etc.). Familiarity with the Python programming language is recommended.
Textbook: Reinforcement Learning: An Introduction (2nd edition). Rich Sutton and Andy Barto. MIT Press, 2018. (Available for free online: http://incompleteideas.net/book/RLbook2020.pdf)
In the regular lecture time (Tuesday and Thursday 9:30am-10:45am), we will have class in Morgridge Hall 2538. Class periods will involve instructor lecture, student presentations, class discussion and Q&A.
Each week will have assigned readings and the expectation is that all students will complete readings before lectures. The instructor will answer questions from the readings in class and lecture on the week's topic.
Each class period will also include up to two 15-minute student-led presentations on a topic that complements the week's topic. The aim of these presentations is to stimulate discussion or introduce the entire class to a research paper or application that relates to the lecture topic.
We will use Piazza for course announcements and Q&A outside lectures. You are responsible for any announcements posted on Piazza. If you have a question, your best option is to post a message on Piazza, where other students will be able to help answer it. The instructor will also periodically check for unanswered questions and respond on Piazza. Please follow these rules:
Homework assignments are posted on Gradescope and homework must be submitted there. Specific instructions are posted for each homework.
Grades will be posted on Canvas.
The following weights are used:
McBurney Center students should contact the instructors to specify any special requests for the exams or homework assignments together with the supporting documentation provided by the McBurney Center. We will do our best to accommodate the requests.
All assignments are due when specified by the instructor. Late assignments will have 10% deducted for each 24 hours past the due date. This penalty is capped at 50%, after which no credit is received except for weekly reading responses. Weekly reading responses may be turned in up to the final class day with a penalty of up to 50% off. In the event of illness or emergency that prevents on-time completion, please contact the instructor prior to the deadline.
You are encouraged to discuss ideas, approaches, and techniques broadly with your peers or the instructor. However, all programming assignments and the final project must be written up individually. For example, code for programming assignments must not be developed in groups, nor should code be shared. Make sure you work through all problems yourself, and that your solution is your own. If you feel your peer discussions are too deep for comfort, declare it in the homework solution: “I discussed with X, Y, Z the following specific ideas: A, B, C; therefore our solutions may have similarities on D, E, F…”.
You may use books or legitimate online resources to help solve homework problems, but you must always credit all such sources in your submission and you must never copy material verbatim.
Cheating and plagiarism will be dealt with in accordance with University procedures (see the UW-Madison Academic Misconduct Rules and Procedures).
The class policy is generally to allow and encourage use of AI as a tool that can help you go farther with your final projects and for additional study.
The one exception to this permissive policy is the weekly pre-class reading responses. As the pre-class reading is critical to effective class participation, use of AI to bypass the reading and response writing is not allowed. If AI use is suspected (due to the response itself or a lack of preparedness for the class period), the response will receive a zero.
Otherwise, you are welcome (but not required) to use AI tools to complete programming assignments, refine writing, etc. However, all submitted content will be treated as your own, including for the purpose of determining academic dishonesty.
Please document your use of AI and include your documentation with your assignment submission. Documentation could include specific prompts used, process for iterating on prompt design, etc.