FREDERIC SALA
fredsala@cs.wisc.edu
I study the fundamentals of data-driven systems.
I have served as a research scientist at Snorkel, which is building a data-first approach to AI.
Previously, I was a postdoc in Stanford CS, associated with the Hazy group. I completed my Ph.D. in electrical engineering at UCLA, where I worked with the LORIS and StarAI groups.
Bio | CV | Google Scholar | Twitter
Interests | News | Publications
2021
Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation
Mayee F. Chen, Ben Cohen-Wang, Steve Mussmann, Frederic Sala, Christopher Ré
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
2019
Multi-Resolution Weak Supervision for Sequential Data
Frederic Sala*, Paroma Varma*, Shiori Sagawa, Jason Fries, Daniel Y. Fu, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré.
Neural Information Processing Systems (NeurIPS), 2019
Weak supervision for machine learning models
ICML '20 | NeurIPS '19 | ICML '19 | AAAI '19 | blog
Obtaining large amounts of labeled data is such a bottleneck that practitioners have increasingly turned to weaker forms of supervision. We study algorithms with theoretical guarantees for synthesizing labels from weak supervision sources, efficient methods for learning the structure of models of such sources, and new ways to tackle labeling data for large-scale video and other applications.
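As a toy illustration of label synthesis (not the algorithms in the papers above, which weight sources by estimated accuracy), one can combine noisy labeling sources by simple majority vote, with sources allowed to abstain:

```python
# Toy sketch: unweighted majority vote over weak supervision sources.
# Each source votes +1, -1, or 0 (abstain) on every example.
def majority_vote(votes):
    """Aggregate one example's source votes into a single label."""
    total = sum(votes)
    if total > 0:
        return 1
    if total < 0:
        return -1
    return 0  # tie, or all sources abstained: leave unlabeled

# Three weak sources voting on four examples.
votes_per_example = [
    [1, 1, -1],   # two of three sources say +1
    [-1, -1, 0],  # one abstains; majority says -1
    [0, 0, 0],    # all abstain
    [1, -1, 0],   # tie
]
labels = [majority_vote(v) for v in votes_per_example]
```

The weighted versions studied in this line of work replace the unweighted sum with accuracy-dependent weights estimated without ground-truth labels.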
Geometry and structure of data
ACL '20 | ICLR '19 | ICML '18 | blog 1 | blog 2
Modern ML methods require first embedding data into a continuous space, traditionally Euclidean space. However, the structure of many types of data (hierarchies, for example) makes Euclidean space unsuitable. My work shows that non-Euclidean spaces such as hyperbolic space (and other manifolds) are better suited for such embeddings, and studies the limits and tradeoffs of these techniques.
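A minimal sketch of why hyperbolic space suits hierarchies: in the Poincaré ball model, distance grows rapidly near the boundary, so points that are close in Euclidean terms can be far apart hyperbolically, leaving exponential "room" for tree leaves. The standard distance formula is d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))):

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit Poincaré ball."""
    sq_norm = lambda x: sum(xi * xi for xi in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1 - sq_norm(u)) * (1 - sq_norm(v))
    return math.acosh(1 + 2 * diff / denom)

# Near the origin, distances look roughly Euclidean...
d_center = poincare_distance((0.0, 0.0), (0.5, 0.0))
# ...but near the boundary, a small Euclidean step is a large hyperbolic one.
d_boundary = poincare_distance((0.9, 0.0), (0.99, 0.0))
```

Here `d_boundary` exceeds `d_center` even though the Euclidean gap (0.09) is much smaller than 0.5, which is the distortion-reducing property hyperbolic embeddings exploit.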
Efficient data synchronization and reconstruction
IT '19 | IT '17 | TCOM '16
What is the least amount of information two parties must exchange to synchronize two versions of a file, or to reconstruct a core piece of data from noisy samples? My work studies bounds and algorithms for these problems.
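As a hedged illustration of the synchronization idea (an rsync-style hash exchange, not the information-theoretic schemes in these papers, which target edit and deletion channels), two parties can compare short per-block hashes instead of the blocks themselves, so only differing blocks need to be transmitted:

```python
import hashlib

def block_hashes(data, block=4):
    """Hash fixed-size blocks; each hash is far smaller than its block."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]

def blocks_to_send(old, new, block=4):
    """Indices of blocks the sender must transmit: hash mismatches,
    plus any blocks the old version doesn't have at all."""
    h_old = block_hashes(old, block)
    h_new = block_hashes(new, block)
    changed = [i for i, (a, b) in enumerate(zip(h_new, h_old)) if a != b]
    return changed + list(range(len(h_old), len(h_new)))

old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCC"
# Only the middle block differs, so only index 1 must be sent.
diff = blocks_to_send(old, new)
```

The papers above ask the sharper question of how few exchanged bits suffice in principle, e.g. as a function of the edit distance between the two versions.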
Reliable data storage & next-gen memories
TCOM '17 | TCOM '13 | CL '14 | SELSE '16 (best of) | book
Modern memories offer speed and efficiency but suffer from physical limitations that lead to errors and corruption. New reliability and error-correction techniques are critical to the future of these devices. My work develops new data representations and coding techniques, shows how to make algorithms more robust, and builds theoretical frameworks to evaluate broad classes of error-correcting codes (ECCs).
Awards
- Top Reviewer, NeurIPS (2018, 2019)
- Outstanding Ph.D. Dissertation Award, UCLA Department of Electrical Engineering (Signals & Systems Track) (2017)
- UCLA Dissertation Year Fellowship (2015-2016)
- Qualcomm Innovation Fellowship Finalist (2015)
- Edward K. Rice Outstanding Masters Student Award, UCLA Henry Samueli School of Engineering & Applied Science (2013)
- Outstanding M.S. Thesis Award, UCLA Department of Electrical Engineering (Signals & Systems Track) (2013)
- National Science Foundation Graduate Research Fellowship (NSF GRFP) (2012-2015)