FREDERIC SALA
fredsala@cs.wisc.edu
I lead the Sprocket Lab at UW-Madison, where we study the fundamentals of data-driven systems and machine learning. We are especially interested in data- and compute-efficient systems.
Much of our current work focuses on data-centric AI, foundation models, and automated machine learning.
I have a research leadership role at Snorkel AI, where we are building a data-first approach to AI.
Previously, I was a postdoc in Stanford CS, associated with the Hazy group. I completed my Ph.D. in electrical engineering at UCLA, where I worked with the LORIS and StarAI groups.
Lab | Research | Bio | CV | Scholar | X | Bluesky | Publications | Awards | News
Research
My group's work has most recently focused on the following areas, in brief; more detail on each follows below:
- Data-centric AI, including:
- Data mixtures. How do we get optimal data for model training? Ex: R&B, Grad-Mimic, Skill-it!
- Weak supervision & efficient annotation. How do we generate, label, or annotate training data---with little or no ground truth? Ex: Data-centric W2S, The ALCHEmist, UniversalWS, TBAL, Colander
- Evaluation & Benchmarks. How do we validate models, techniques, and algorithms? Ex: TPBench, BoxWRENCH, AutoWS-Bench-101
- Foundation Models, Architectures & AutoML. How do we design, build, and automate development for foundation models and foundation model-powered systems? Ex: Manticore, COSMOS
- Model Steering & Efficient Adaptation. How do we get models to behave the way we want, without extra training or data? Ex: RoboShot, OTTER, Chameleon, AlignEZ
- AI for Science. How do we use AI/ML to accelerate science? Ex: TheoreticalPhysicsBench, Multi-SALAD
In more detail:
- Data-centric AI, including:
- Data mixtures. How can we develop data that accelerates model training and induces desired behaviors in our models? How do we understand the effects of data choices on training, inference, and downstream performance? (A minimal mixture-update sketch appears after this list.)
- R&B (new!) Learns data domains and computes optimal mixtures with nearly zero overhead.
- Grad-Mimic (Dataworld '25 Oral) Data selection via gradient guidance from pretrained models.
- Skill-it! (NeurIPS '23 Spotlight) Generate optimal LLM training data mixtures by tracking skills.
- Weak supervision & efficient annotation. How can we obtain high-quality data for training, fine-tuning, RLHF, and more, without painful manual creation and annotation? We study techniques like weak-to-strong generalization, weak supervision, semi-supervised learning, and self-training (see the label-aggregation and auto-labeling sketches after this list).
- Data-centric W2S (ICLR '25) Shows that the driver behind weak-to-strong generalization is data that simultaneously admits multiple "solutions".
- The ALCHEmist (NeurIPS '24 Spotlight) Distills LLM labelers into programs for massive savings.
- Colander (NeurIPS '24) Learns optimal confidence functions for auto-labeling.
- TBAL (NeurIPS '23 Spotlight) First analysis of the fundamentals of auto-labeling.
- UniversalWS (ICLR '22) Vast generalization of weak supervision to any kind of annotation.
- Evaluation & Benchmarks. How do we validate models, techniques, and algorithms?
- TPBench (new!) Theoretical physics reasoning benchmark testing frontier model capabilities.
- BoxWRENCH (NeurIPS D&B '24) Realistic and specialized weak supervision benchmark.
- AutoWS-Bench-101 (NeurIPS D&B '22) Benchmark testing automated weak supervision methods.
- Foundation Models, Architectures & AutoML. Given the vast space of pretrained models, established and emerging model architectures, and multi-model pipelines and workflows---how can we help practitioners make the optimal choice?
- Manticore (COLM '25) Learned hybrid model architectures: automatically and efficiently combine Transformers, SSMs, and more with our generalization of neural architecture search.
- COSMOS (new!) Ultra-efficient per-task LLM selection and design.
- Model Steering & Efficient Adaptation. How do we get models to behave as desired without needing more compute, training, or data acquisition? (A minimal steering sketch appears after this list.)
- RoboShot (ICLR '24, R0-FoMo '23 Best Paper Honorable Mention) Makes models robust via LLM-generated insights and ultra-efficient model steering.
- OTTER (NeurIPS '24) Fixes mismatches in pretrained models' prediction distributions via optimal transport.
- Chameleon (NAACL Findings '25) Personalizes LLMs by synthesizing user preferences and steering.
- AlignEZ ('24) Low-cost multi-objective alignment via synthetic preferences and representation editing.
- AI for Science. How do we use AI/ML to accelerate scientific discovery?
- TPBench (new!) Theoretical physics reasoning benchmark testing frontier model capabilities.
- Multi-SALAD (JHEP '23) Anomaly detection with multiple reference datasets for resonant searches in HEP.
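To make a few of these directions concrete, some minimal sketches follow. First, data mixtures: one way to adapt a training mixture online is a multiplicative-weights update that upweights domains where the model still has high loss. This is an illustration in the spirit of Skill-it! and R&B, not their actual algorithms; `update_mixture` and its learning rate are hypothetical.

```python
import numpy as np

def update_mixture(weights: np.ndarray, losses: np.ndarray, lr: float = 0.5) -> np.ndarray:
    """One multiplicative-weights step on sampling proportions:
    domains where the model still has high loss get sampled more."""
    w = weights * np.exp(lr * losses)
    return w / w.sum()

# Toy run: three data domains, mixture starts uniform.
mix = np.ones(3) / 3
for step_losses in [np.array([2.0, 0.5, 0.1]), np.array([1.0, 0.4, 0.1])]:
    mix = update_mixture(mix, step_losses)
print(mix)  # mass shifts toward the hardest (highest-loss) domain
```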
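Next, weak supervision. Its core mechanic is aggregating many noisy labeling sources into one training label per point. Below is the simplest possible aggregator, majority vote; real label models (as in UniversalWS, or the distilled programs of The ALCHEmist) instead weight sources by their estimated accuracies. The code is a hypothetical illustration, not released lab code.

```python
import numpy as np

ABSTAIN = -1

def majority_vote(L: np.ndarray) -> np.ndarray:
    """Combine labeling-function votes (n_points x n_sources) into a
    single label per point; each source emits a class id or ABSTAIN."""
    out = np.full(L.shape[0], ABSTAIN)
    for i, row in enumerate(L):
        votes = row[row != ABSTAIN]
        if votes.size:
            vals, counts = np.unique(votes, return_counts=True)
            out[i] = vals[np.argmax(counts)]
    return out

# Three noisy sources vote on four points (binary classes).
L = np.array([[1, 1, ABSTAIN],
              [0, ABSTAIN, 0],
              [1, 0, 1],
              [ABSTAIN, ABSTAIN, ABSTAIN]])
print(majority_vote(L))  # [ 1  0  1 -1]
```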
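Threshold-based auto-labeling, which TBAL analyzes, keeps only the points where a model's confidence clears a threshold chosen so that error on validation data stays within a budget. Here is a sketch of that threshold-selection step, under the simplifying assumption of a single global threshold (names are illustrative):

```python
import numpy as np

def pick_threshold(conf: np.ndarray, correct: np.ndarray, err_budget: float = 0.05) -> float:
    """Smallest validation-confidence threshold whose accepted region
    (points with confidence >= threshold) meets the error budget."""
    for t in np.sort(conf):
        accepted = conf >= t
        if accepted.any() and 1.0 - correct[accepted].mean() <= err_budget:
            return float(t)
    return float("inf")  # no safe threshold exists: auto-label nothing

# Validation confidences, and whether the model was right on each point.
conf = np.array([0.55, 0.62, 0.81, 0.90, 0.97])
correct = np.array([0.0, 1.0, 1.0, 1.0, 1.0])
print(pick_threshold(conf, correct))  # 0.62: all points above it are correct
```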
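Finally, a cartoon of zero-training model steering in the spirit of RoboShot: unwanted concept directions are projected out of the embedding space, with no gradient updates at all. Here a single unit vector stands in for a concept; the actual work derives such directions from LLM-generated insights, and this snippet is only a stand-in sketch.

```python
import numpy as np

def project_out(embeddings: np.ndarray, concept: np.ndarray) -> np.ndarray:
    """Remove each embedding's component along a concept direction,
    keeping only the orthogonal part: steering without retraining."""
    v = concept / np.linalg.norm(concept)
    return embeddings - np.outer(embeddings @ v, v)

emb = np.array([[1.0, 2.0], [3.0, 0.5]])
spurious = np.array([1.0, 0.0])  # stand-in "spurious feature" direction
print(project_out(emb, spurious))  # first coordinate is zeroed out
```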
Students
Graduate Students
Publications
2025
Pretrained Hybrids with MAD Skills
Nicholas Roberts, Samuel Guo, Zhiqi Gao, Srinath Namburi, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala
Conference on Language Modeling (COLM), 2025
arXiv
2023
Understanding Threshold-based Auto-labeling: The Good, the Bad, and the Terra Incognita
Harit Vishwakarma, Frederic Sala, Ramya Vinayak
NeurIPS Workshop on Adaptive Experimental Design and Active Learning in the Real World (RealML-2023), 2023
Mixed Curvature Representation Learning for Biological Pathway Graphs
Daniel McNeela, Frederic Sala, Anthony Gitter
ICML Workshop for Computational Biology, 2023
2022
AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels
Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala
Neural Information Processing Systems (NeurIPS) (Datasets and Benchmarks Track), 2022
arXiv | OpenReview | code
AutoML for Climate Change: A Call to Action
Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White
NeurIPS 2022 Workshop on Tackling Climate Change with Machine Learning, 2022
arXiv
2019
Multi-Resolution Weak Supervision for Sequential Data
Frederic Sala*, Paroma Varma*, Shiori Sagawa, Jason Fries, Daniel Y. Fu, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré
Neural Information Processing Systems (NeurIPS), 2019
paper
Awards
- UW-Madison SACM Students' Choice Professor of the Year Award, 2025
- DARPA Young Faculty Award, 2024
- Best Paper Award Honorable Mention, NeurIPS R0-FoMo Workshop, 2023
- Best Student Paper Runner-Up, UAI, 2022
- Outstanding Ph.D. Dissertation Award, UCLA Department of Electrical Engineering
- UCLA Dissertation Year Fellowship
- Qualcomm Innovation Fellowship Finalist
- Edward K. Rice Outstanding Master's Student Award, UCLA Henry Samueli School of Engineering & Applied Science
- Outstanding M.S. Thesis Award, UCLA Department of Electrical Engineering
- National Science Foundation Graduate Research Fellowship (NSF GRFP)