FREDERIC SALA
fredsala@cs.wisc.edu
I study the fundamentals of data-driven systems.
I have served as a research scientist at Snorkel, which is building a data-first approach to AI.
Previously, I was a postdoc in Stanford CS, associated with the Hazy group. I completed my Ph.D. in electrical engineering at UCLA, where I worked with the LORIS and StarAI groups.
Bio | CV | Google Scholar | Twitter
Interests | News | Publications
2021
Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation
Mayee F. Chen, Ben Cohen-Wang, Steve Mussmann, Frederic Sala, Christopher Ré
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
2019
Multi-Resolution Weak Supervision for Sequential Data
Frederic Sala*, Paroma Varma*, Shiori Sagawa, Jason Fries, Daniel Y. Fu, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré.
Neural Information Processing Systems (NeurIPS), 2019
Weak supervision for machine learning models
ICML '20 | NeurIPS '19 | ICML '19 | AAAI '19 | blog
Obtaining large amounts of labeled data is such a bottleneck that practitioners have increasingly turned to weaker forms of supervision. We study algorithms with theoretical guarantees for synthesizing labels from weak supervision sources, efficient methods for learning the structure of models of such sources, and new ways to tackle labeling data for large-scale video and other applications.
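As a toy illustration of label synthesis (not the algorithms in the papers above, which weight sources by estimated accuracy), one can combine noisy labeling sources by simple majority vote, with sources allowed to abstain:

```python
# Toy sketch: unweighted majority vote over weak supervision sources.
# Each source votes +1, -1, or 0 (abstain) on every example.
def majority_vote(votes):
    """Aggregate one example's source votes into a single label."""
    total = sum(votes)
    if total > 0:
        return 1
    if total < 0:
        return -1
    return 0  # tie, or all sources abstained: leave unlabeled

# Three weak sources voting on four examples.
votes_per_example = [
    [1, 1, -1],   # two of three sources say +1
    [-1, -1, 0],  # one abstains; majority says -1
    [0, 0, 0],    # all abstain
    [1, -1, 0],   # tie
]
labels = [majority_vote(v) for v in votes_per_example]
```

The weighted versions studied in this line of work replace the unweighted sum with accuracy-dependent weights estimated without ground-truth labels.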
Geometry and structure of data
ACL '20 | ICLR '19 | ICML '18 | blog 1 | blog 2
Modern ML methods require first embedding data into a continuous space, traditionally Euclidean space. However, the structure of many types of data (hierarchies, for example) makes Euclidean space unsuitable. My work shows that non-Euclidean spaces such as hyperbolic space (and other manifolds) are better suited for such embeddings, and studies the limits and tradeoffs of these techniques.
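A minimal sketch of why hyperbolic space suits hierarchies: in the Poincaré ball model, distance grows rapidly near the boundary, so points that are close in Euclidean terms can be far apart hyperbolically, leaving exponential "room" for tree leaves. The standard distance formula is d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))):

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit Poincaré ball."""
    sq_norm = lambda x: sum(xi * xi for xi in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1 - sq_norm(u)) * (1 - sq_norm(v))
    return math.acosh(1 + 2 * diff / denom)

# Near the origin, distances look roughly Euclidean...
d_center = poincare_distance((0.0, 0.0), (0.5, 0.0))
# ...but near the boundary, a small Euclidean step is a large hyperbolic one.
d_boundary = poincare_distance((0.9, 0.0), (0.99, 0.0))
```

Here `d_boundary` exceeds `d_center` even though the Euclidean gap (0.09) is much smaller than 0.5, which is the distortion-reducing property hyperbolic embeddings exploit.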
Efficient data synchronization and reconstruction
IT '19 | IT '17 | TCOM '16
What is the least amount of information two parties must exchange to synchronize two versions of a file, or to reconstruct a core piece of data from noisy samples? My work studies bounds and algorithms for these problems.
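As a hedged illustration of the synchronization idea (an rsync-style hash exchange, not the information-theoretic schemes in these papers, which target edit and deletion channels), two parties can compare short per-block hashes instead of the blocks themselves, so only differing blocks need to be transmitted:

```python
import hashlib

def block_hashes(data, block=4):
    """Hash fixed-size blocks; each hash is far smaller than its block."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]

def blocks_to_send(old, new, block=4):
    """Indices of blocks the sender must transmit: hash mismatches,
    plus any blocks the old version doesn't have at all."""
    h_old = block_hashes(old, block)
    h_new = block_hashes(new, block)
    changed = [i for i, (a, b) in enumerate(zip(h_new, h_old)) if a != b]
    return changed + list(range(len(h_old), len(h_new)))

old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCC"
# Only the middle block differs, so only index 1 must be sent.
diff = blocks_to_send(old, new)
```

The papers above ask the sharper question of how few exchanged bits suffice in principle, e.g. as a function of the edit distance between the two versions.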
Reliable data storage & next-gen memories
TCOM '17 | TCOM '13 | CL '14 | SELSE '16 (best of) | book
Modern memories offer speed and efficiency but suffer from physical limitations that lead to errors and corruption. New reliability and error-correction techniques are critical to the future of these devices. My work develops new data representations and coding techniques, shows how to make algorithms more robust, and builds theoretical frameworks to evaluate broad classes of error-correcting codes (ECCs).
Awards
- Top Reviewer, NeurIPS (2018, 2019)
- Outstanding Ph.D. Dissertation Award, UCLA Department of Electrical Engineering (Signals & Systems Track) (2017)
- UCLA Dissertation Year Fellowship (2015-2016)
- Qualcomm Innovation Fellowship Finalist (2015)
- Edward K. Rice Outstanding Masters Student Award, UCLA Henry Samueli School of Engineering & Applied Science (2013)
- Outstanding M.S. Thesis Award, UCLA Department of Electrical Engineering (Signals & Systems Track) (2013)
- National Science Foundation Graduate Research Fellowship (NSF GRFP) (2012-2015)