We are a team of researchers and students at UW–Madison participating in the international
RoboCup competition. Our mission is to advance autonomous robotics through competition-driven
innovation, focusing on multi-agent coordination, real-time decision-making, and reinforcement
learning.
About RoboCup
RoboCup is an international research initiative that uses robot soccer competitions as a
challenge for participating teams to program autonomous robots capable of competing in dynamic,
uncertain, and multi-agent environments. As a multidisciplinary competition, RoboCup advances
the state of the art in robotics and artificial intelligence by fostering innovation in areas
like real-time computer vision, multi-agent coordination, machine learning, and real-time
decision-making and control. Additionally, students gain hands-on experience developing complete
AI and robotics systems while contributing to cutting-edge research and building essential
skills in teamwork, problem-solving, and system-level engineering.
RoboCup at Wisconsin
The University of Wisconsin–Madison has participated in RoboCup competitions for three years
and has done remarkably well in that short time. Competing first in the lower division of the
Standard Platform League (in which all teams use the same robot), the UW–Madison team finished
3rd in its first year. Since 2024, UW–Madison has teamed up with the University of Texas at
Austin. In 2024, the joint team won the SPL Challenge Shield Division. In 2025, the joint team
moved to the upper division and finished 3rd overall. The key to the team’s success has been its
expertise in integrating cutting-edge AI methods into the robots’ decision-making, and the team
has published research on its approach at premier robotics conferences. In 2026 and beyond, the
team plans to move toward more advanced humanoid robots to continue pushing the state of the
art in AI and robotics.
Getting Involved
- Interested in sponsoring? We're currently looking for industry sponsors to advance
our mission of conducting leading AI and robotics research and training students for careers
in these fields. If your company is interested in sponsorship, please reach out to Josiah
Hanna (jphanna [at] cs.wisc.edu).
- Interested in joining the team? We're always looking for motivated and
hardworking students to join the team. If you're a current UW–Madison undergraduate or
graduate student and are interested in learning more about joining the team, please contact
Josiah Hanna (jphanna [at] cs.wisc.edu) or Will Cong (wycong [at] wisc.edu). Please include
your CV and a brief statement of relevant background experience in robotics, software
engineering, machine learning, or computer vision. Successful students are proactive, able
to work independently, and able to learn new skills as they go.
- Interested in AI and robotics but not sure you're ready to contribute? UW–Madison's
CS department offers a wide variety of courses in AI, robotics, and computer
vision. You might consider: CS 540 (Intro to AI), 566 (Computer Vision), and 580
(Intelligent Robotics). The engineering department also offers robotics and
machine learning courses.
Simultaneous Deep Model-Based Reinforcement Learning and State Inference
under Partial Observability.
William Cong,
Josiah P. Hanna.
Proceedings of the IEEE International Conference on Robotics and Automation
(ICRA). May 2026.
Abstract
Model-based reinforcement learning (MBRL) is a promising approach to enabling
robots to learn directly from a limited number of real-world interactions. MBRL
is notoriously difficult in settings without full state observability because
algorithms must simultaneously infer state from incomplete observations and use
these inferences to learn environment dynamics. Toward the use of MBRL for
autonomous robots, we introduce EMBRL, an expectation-maximization framework
that combines classical Bayesian state estimation with deep MBRL to jointly
infer states and learn neural network state transition models. This framework
takes advantage of the rich theory and practice of state estimation from the
field of robotics, while enabling behavior learning without a priori known robot
dynamics. Though conceptually straightforward, our instantiation of this
framework for deep MBRL reveals several key challenges when using a learned
transition model both for state inference and policy learning. We introduce a
practical implementation of EMBRL using both particle and extended Kalman
filters and smoothers and discuss key design choices necessary for effective
implementation. Finally, we evaluate different instantiations of the EMBRL
framework on both simulated and real-robot tasks and show that our methods learn
higher performing policies compared to strong MBRL baselines using recurrent
neural networks.
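The alternation between state inference and model learning described in this abstract can be illustrated with a toy example. The sketch below is not the paper's EMBRL implementation: it assumes a scalar linear-Gaussian system with known noise variances in place of a neural network transition model, and every name in it (`kalman_smoother`, `m_step`, `fit_transition`) is hypothetical. The E-step runs a Kalman filter and smoother under the current transition estimate, and the M-step refits the transition coefficient from the smoothed statistics.

```python
import random


def kalman_smoother(ys, a, q, r, m0=0.0, p0=1.0):
    """E-step: scalar Kalman filter + RTS smoother for x' = a*x + w, y = x + v."""
    n = len(ys)
    mp, pp = [0.0] * n, [0.0] * n  # predicted mean / variance
    mf, pf = [0.0] * n, [0.0] * n  # filtered mean / variance
    for t, y in enumerate(ys):
        if t == 0:
            mp[t], pp[t] = m0, p0
        else:
            mp[t] = a * mf[t - 1]
            pp[t] = a * a * pf[t - 1] + q
        k = pp[t] / (pp[t] + r)  # Kalman gain
        mf[t] = mp[t] + k * (y - mp[t])
        pf[t] = (1.0 - k) * pp[t]
    ms, ps = mf[:], pf[:]  # smoothed mean / variance
    gains = [0.0] * n      # smoother gains, reused for lag-one covariances
    for t in range(n - 2, -1, -1):
        gains[t] = pf[t] * a / pp[t + 1]
        ms[t] = mf[t] + gains[t] * (ms[t + 1] - mp[t + 1])
        ps[t] = pf[t] + gains[t] ** 2 * (ps[t + 1] - pp[t + 1])
    return ms, ps, gains


def m_step(ms, ps, gains):
    """M-step: closed-form update of the transition coefficient a."""
    steps = range(len(ms) - 1)
    # E[x_t * x_{t+1}] = ms[t]*ms[t+1] + gains[t]*ps[t+1]
    num = sum(ms[t] * ms[t + 1] + gains[t] * ps[t + 1] for t in steps)
    # E[x_t^2] = ms[t]^2 + ps[t]
    den = sum(ms[t] ** 2 + ps[t] for t in steps)
    return num / den


def fit_transition(ys, a_init, q, r, iterations=50):
    """Alternate state inference (E-step) and model fitting (M-step)."""
    a = a_init
    for _ in range(iterations):
        ms, ps, gains = kalman_smoother(ys, a, q, r)
        a = m_step(ms, ps, gains)
    return a


# Toy data from an assumed ground-truth system (illustrative values only).
rng = random.Random(0)
true_a, q, r = 0.9, 0.1, 0.1
x, ys = 1.0, []
for _ in range(500):
    ys.append(x + rng.gauss(0.0, r ** 0.5))
    x = true_a * x + rng.gauss(0.0, q ** 0.5)

a_hat = fit_transition(ys, a_init=0.3, q=q, r=r)
print(f"estimated transition coefficient: {a_hat:.3f}")
```

In the setting the abstract describes, the closed-form M-step would be replaced by gradient-based training of a neural network, and the filter by a particle or extended Kalman filter; the toy version only shows the structure of the loop.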
Multi-Robot Collaboration through Reinforcement Learning and Abstract Simulation.
Adam Labiosa,
Josiah P. Hanna.
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2025.
Abstract
Teams of people coordinate to perform complex tasks by forming abstract mental models of world
and agent dynamics. The use of abstract models contrasts with much recent work in robot learning
that uses a high-fidelity simulator and reinforcement learning (RL) to obtain policies for
physical robots. Motivated by this difference, we investigate the extent to which so-called
abstract simulators can be used for multi-agent reinforcement learning
(MARL) and the resulting policies successfully deployed on teams of physical robots. An abstract
simulator models the robot's target task at a high level of abstraction and discards many
details of the world that could impact optimal decision-making. Policies are trained in an
abstract simulator and then transferred to the physical robot by making use of separately
obtained low-level perception and motion control modules. We identify three key categories of
modifications to the abstract simulator that enable policy transfer to physical robots:
simulation fidelity enhancements, training optimizations, and simulation stochasticity. We then
run an empirical study with extensive ablations to determine the value of each modification
category for enabling policy transfer in cooperative robot soccer tasks. We also compare the
performance of policies produced by our methodology with a well-tuned non-learning-based
behavior architecture from the annual RoboCup competition and find that our approach leads to a
similar level of performance. Broadly, we show that MARL can be used to train cooperative
physical robot behaviors using highly abstract models of the world.
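To give a sense of what "abstract" means here, the following is a minimal sketch of a simulator in this spirit. It is not the simulator used in the paper: the class `AbstractSoccerSim`, its parameters, and the kick rule are all assumptions made for illustration. The robot and ball are points on a plane, there are no joints, contacts, or camera images, and a single noise parameter stands in for simulation stochasticity.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class AbstractSoccerSim:
    """Hypothetical point-robot, point-ball world with noisy motion."""
    action_noise: float = 0.05  # assumed stochasticity knob (std. dev., m/s)
    kick_range: float = 0.2     # assumed distance at which the ball is pushed
    seed: int = 0

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.reset()

    def reset(self):
        self.robot = [0.0, 0.0]
        self.ball = [1.0, 0.0]
        self.goal = (3.0, 0.0)
        return self.observe()

    def observe(self):
        # Ground-truth positions; a real robot would supply these from perception.
        return (*self.robot, *self.ball)

    def step(self, vx, vy, dt=0.1):
        # Apply the commanded velocity with additive Gaussian noise.
        self.robot[0] += (vx + self.rng.gauss(0.0, self.action_noise)) * dt
        self.robot[1] += (vy + self.rng.gauss(0.0, self.action_noise)) * dt
        # Abstract "kick": when close enough, push the ball toward the goal.
        if math.dist(self.robot, self.ball) < self.kick_range:
            gx, gy = self.goal[0] - self.ball[0], self.goal[1] - self.ball[1]
            norm = math.hypot(gx, gy) or 1.0
            push = min(0.5, norm)
            self.ball[0] += push * gx / norm
            self.ball[1] += push * gy / norm
        dist = math.dist(self.ball, self.goal)
        return self.observe(), -dist, dist < 0.1  # observation, reward, done


# A scripted "walk toward the ball" controller stands in for a learned policy.
sim = AbstractSoccerSim()
obs = sim.reset()
done = False
for _ in range(200):
    rx, ry, bx, by = obs
    dx, dy = bx - rx, by - ry
    norm = math.hypot(dx, dy) or 1.0
    obs, reward, done = sim.step(dx / norm, dy / norm)
    if done:
        break
```

A policy trained against an interface like `step` outputs only high-level velocity commands, which is why separately obtained perception and motion modules are needed on the physical robot.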
Reinforcement Learning Within the Classical Robotics Stack: A Case Study in Robot Soccer.
Adam Labiosa, Zhihan Wang, Siddhant Agarwal, William Cong, Geethika Hemkumar, Abhinav Narayan
Harish, Benjamin Hong, Josh Kelle, Chen Li, Yuhao Li, Zisen Shao, Peter Stone,
Josiah P. Hanna.
Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2025.
A short version of this work appeared at the
Roboletics 2.0 Workshop at ICRA 2025
and received the Best RoboCup-Themed Paper Award.
Abstract
Robot decision-making in partially observable, real-time, dynamic, and multi-agent environments
remains a difficult and unsolved challenge. Model-free reinforcement learning (RL) is a
promising approach to learning decision-making in such domains; however, end-to-end RL in
complex environments is often intractable. To address this challenge in the RoboCup Standard
Platform League (SPL) domain, we developed a novel architecture integrating RL within a
classical robotics stack, while employing a multi-fidelity sim2real approach and decomposing
behavior into learned sub-behaviors with heuristic selection. Our architecture led to victory in
the 2024 RoboCup SPL Challenge Shield Division. In this work, we fully describe our system's
architecture and empirically analyze key design decisions that contributed to its success. Our
approach demonstrates how RL-based behaviors can be integrated into complete robot behavior
architectures.
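The idea of decomposing behavior into learned sub-behaviors with heuristic selection can be sketched as follows. This is an illustrative outline only, not the team's architecture: the `WorldState` fields, the three sub-behaviors, and the selection thresholds are all hypothetical, and the policy functions are simple stand-ins for trained RL policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Action = Tuple[float, float, float]  # assumed (forward, lateral, turn) walk command


@dataclass
class WorldState:
    """Assumed output of classical perception and localization modules."""
    ball_distance: float  # meters
    ball_visible: bool


# Stand-ins for separately trained RL policies; real ones would be networks.
def search_policy(state: WorldState) -> Action:
    return (0.0, 0.0, 0.5)  # turn in place to look for the ball


def approach_policy(state: WorldState) -> Action:
    return (min(1.0, state.ball_distance), 0.0, 0.0)  # walk toward the ball


def dribble_policy(state: WorldState) -> Action:
    return (0.3, 0.0, 0.0)  # move slowly while keeping the ball close


SUB_BEHAVIORS: Dict[str, Callable[[WorldState], Action]] = {
    "search": search_policy,
    "approach": approach_policy,
    "dribble": dribble_policy,
}


def select_behavior(state: WorldState) -> str:
    """Hand-written heuristic that decides which learned sub-behavior runs."""
    if not state.ball_visible:
        return "search"
    if state.ball_distance > 0.5:  # hypothetical threshold in meters
        return "approach"
    return "dribble"


def act(state: WorldState) -> Tuple[str, Action]:
    """One control step: heuristic selection, then the chosen policy's action."""
    name = select_behavior(state)
    return name, SUB_BEHAVIORS[name](state)


print(act(WorldState(ball_distance=2.0, ball_visible=True)))
```

Keeping the selector hand-written means each sub-behavior can be trained and tested on its own, with the classical stack supplying the state estimate each policy consumes.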
WeRef: An Open-source and Extensible Dataset for Referee Gesture Recognition in RoboCup.
Zisen Shao,
Josiah P. Hanna.
RoboCup-2025: Robot Soccer World Cup XXVIII. July 2025.
Oral Presentation
Abstract
Visual recognition of referee gestures is essential for fully autonomous robot soccer, yet
progress in deep learning approaches has been hindered by the absence of a public, standardized
dataset and baseline. In this paper, we present WeRef, an open-source synthetic data generation
pipeline for RoboCup referee gestures built on the Webots simulator, along with a large-scale
synthetic dataset. Our pipeline automatically randomizes human models, backgrounds, lighting
conditions, obstacle presence, and camera viewpoints to produce diverse data without manual
labeling. We evaluate WeRef on real competition data using a 2D CNN with GRU classifier. We show
that synthetic samples generated by WeRef effectively augment limited real data, substantially
reducing the need for costly data collection and improving recognition accuracy when training on
a combination of synthetic and real data. By releasing both the generation software and the
resulting dataset, we provide a scalable, open-source framework to facilitate the development of
referee gesture recognition in the RoboCup Standard Platform League (SPL). The WeRef pipeline
and dataset are available at
https://github.com/ZisenShao/WeRef.
Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning.
Nicholas Corrado, Yuxiao Qu, John U. Balis, Adam Labiosa,
Josiah P. Hanna.
Proceedings of the Reinforcement Learning Conference (RLC). August 2024.
Abstract
In offline reinforcement learning (RL), RL agents learn to solve a task using only a fixed
dataset of previously collected data. While offline RL has proven to be a viable method for
learning real-world robot control policies, it typically requires large amounts of
expert-quality data to learn effective policies that generalize to out-of-distribution states.
Unfortunately, such data is often difficult and expensive to acquire in real-world tasks.
Several recent works have leveraged data augmentation (DA) to inexpensively generate additional
data, but most DA works apply augmentations in a random fashion and ultimately produce highly
suboptimal augmented data. In this work, we propose Guided Data Augmentation (GuDA),
a human-guided DA framework that generates expert-quality augmented data. The key insight behind
GuDA is that while it may be difficult to demonstrate the sequence of actions required to
produce expert data, a user can often easily characterize when an augmented trajectory segment
represents progress toward task completion. Thus, a user can restrict the space of possible
augmentations to automatically reject suboptimal augmented data. To extract a policy from GuDA,
we use off-the-shelf offline reinforcement learning and behavior cloning algorithms. We evaluate
GuDA on a physical robot soccer task as well as simulated D4RL navigation tasks, a simulated
autonomous driving task, and a simulated soccer task. Empirically, GuDA enables learning given a
small initial dataset of potentially suboptimal experience and outperforms a random DA strategy
as well as a model-based DA strategy.
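The key insight of the abstract, that a user can filter augmentations by whether they represent progress, can be illustrated on a toy 2D navigation task. The sketch below is not the GuDA implementation: it assumes trajectory segments are lists of (x, y) positions, that augmentations are random rotations and translations, and that the user's guidance is a hypothetical `makes_progress` rule accepting a segment only if it ends closer to the goal than it starts.

```python
import math
import random

GOAL = (0.0, 0.0)  # assumed goal location for this toy task


def distance_to_goal(point):
    return math.hypot(point[0] - GOAL[0], point[1] - GOAL[1])


def makes_progress(segment):
    """Hypothetical user-supplied rule: the segment must end closer to the goal."""
    return distance_to_goal(segment[-1]) < distance_to_goal(segment[0])


def random_augment(segment, rng):
    """Rotate the segment about its first point, then translate it randomly."""
    angle = rng.uniform(-math.pi, math.pi)
    dx, dy = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    x0, y0 = segment[0]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    augmented = []
    for x, y in segment:
        rx, ry = x - x0, y - y0
        augmented.append((x0 + dx + cos_a * rx - sin_a * ry,
                          y0 + dy + sin_a * rx + cos_a * ry))
    return augmented


def guided_augment(segment, num_samples, rng):
    """Sample random augmentations and keep only those the guidance rule accepts."""
    candidates = (random_augment(segment, rng) for _ in range(num_samples))
    return [c for c in candidates if makes_progress(c)]


rng = random.Random(0)
original = [(2.0, 0.0), (1.8, 0.0), (1.6, 0.0)]  # a segment moving toward the goal
augmented = guided_augment(original, num_samples=200, rng=rng)
print(f"kept {len(augmented)} of 200 candidate augmentations")
```

The retained segments would then be added to the offline dataset before running an off-the-shelf offline RL or behavior cloning algorithm; a purely random DA strategy corresponds to skipping the `makes_progress` filter.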