Photo credit: Hector Garcia-Molina


Theodoros (Theo) Rekatsinas

I am an assistant professor at the University of Wisconsin-Madison. I am a member of the UW-Madison Database Group.

Before joining UW, I was a Postdoc at Stanford with Chris Ré. I received my Ph.D. in Computer Science from the University of Maryland. My advisors were Amol Deshpande and Lise Getoor.

ML-first Data Integration and Enrichment: My group is exploring the fundamental connections between data integration and data enrichment with statistical learning and probabilistic inference. Our latest effort for ML-first data enrichment is HoloClean, which is built on the idea of weak supervision and probabilistic inference; see our blog post. Systems like HoloClean transition from logic to probabilities in a way similar to the AI revolution in the eighties.

Email  /  Google Scholar  /  LinkedIn


  • Received NSF CRII to study statistical learning methods for data cleaning!
  • Thanks to Amazon for their generous support!
  • SIGMOD 2018: New paper and tutorial on ML-first data integration!
  • New manuscript on a noisy channel model for unclean relational data
  • Fonduer @ SIGMOD 2018: Sen shows how to build knowledge bases from richly formatted data!
  • HoloClean: Scalable data cleaning driven by probabilistic inference. Combine all your signals from integrity constraints to outliers and clean your data! Read our blog post.


NEW! A Formal Framework For Probabilistic Unclean Databases
Christopher De Sa, Ihab F. Ilyas, Benny Kimelfeld, Christopher Ré and Theodoros Rekatsinas
Manuscript, 2018

NEW! Data Integration and Machine Learning: A Natural Synergy
Xin Luna Dong and Theodoros Rekatsinas
Tutorial@SIGMOD 2018 (To Appear)

NEW! Deep Learning For Entity Matching: A Design Space Exploration
Sidharth Mudgal, Han Li, AnHai Doan, Theodoros Rekatsinas, Youngchoon Park, Ganesh Krishnan, Rohit Deep, Esteban Arcaute, and Vijay Raghavendra
SIGMOD 2018 (To Appear)

NEW! Fonduer: Knowledge Base Construction from Richly Formatted Data
Sen Wu, Luke Hsiao, Xiao Cheng, Braden Hancock, Theodoros Rekatsinas, Philip Levis and Christopher Ré
SIGMOD 2018 (To Appear)

HoloClean: Holistic Data Repairs with Probabilistic Inference
Theodoros Rekatsinas, Xu Chu, Ihab F. Ilyas and Christopher Ré
VLDB 2017

SLiMFast: Guaranteed Results for Data Fusion and Source Reliability
Theodoros Rekatsinas, Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran and Christopher Ré

Forecasting Rare Disease Outbreaks from Open Source Indicators
Theodoros Rekatsinas, Saurav Ghosh, Sumiko Mekaru, Elaine Nsoesie, John Brownstein, Lise Getoor and Naren Ramakrishnan
Journal of Statistical Analysis and Data Mining, Best of SDM Special Issue, 2016

SourceSight: Enabling Effective Source Selection
Theodoros Rekatsinas, Amol Deshpande, Xin Luna Dong, Lise Getoor and Divesh Srivastava

HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades
Xinran He, Theodoros Rekatsinas, James Foulds, Lise Getoor, and Yan Liu
International Conference on Machine Learning (ICML), 2015

StoryPivot: Comparing and Contrasting Story Evolution
Anja Gruenheid, Donald Kossmann, Theodoros Rekatsinas, and Divesh Srivastava

SourceSeer: Forecasting Rare Disease Outbreaks Using Multiple Data Sources (Best Paper Award)
Theodoros Rekatsinas, Saurav Ghosh, Sumiko Mekaru, Elaine Nsoesie, John Brownstein, Lise Getoor and Naren Ramakrishnan
SIAM International Conference on Data Mining (SDM), 2015

Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration
Theodoros Rekatsinas, Xin Luna Dong, Lise Getoor and Divesh Srivastava
7th Biennial Conference on Innovative Data Systems Research (CIDR), 2015

Characterizing and selecting fresh data sources
Theodoros Rekatsinas, Xin Luna Dong and Divesh Srivastava

SPARSI: partitioning sensitive data amongst multiple adversaries
Theodoros Rekatsinas, Amol Deshpande and Ashwin Machanavajjhala
Proceedings of the VLDB Endowment Volume 6 Issue 13, 2013

Multi-relational Learning Using Weighted Tensor Decomposition with Modular Loss
Ben London, Theodoros Rekatsinas, Bert Huang and Lise Getoor
NIPS 2012 Workshop on Spectral Algorithms for Latent Variable Models

Local structure and determinism in probabilistic databases
Theodoros Rekatsinas, Amol Deshpande and Lise Getoor

Fuzzy rule based neuro-dynamic programming for mobile robot skill acquisition on the basis of a nested multi-agent architecture (Best of Conference)
John Karigiannis, Theodoros Rekatsinas and Costas S. Tzafestas
IEEE International Conference on Robotics and Biomimetics (ROBIO), 2010


Adaptive Querying Strategies for Efficient Crowdsourced Data Extraction
Theodoros Rekatsinas, Amol Deshpande and Aditya Parameswaran, 2016

Quality-Aware Data Source Management
Theodoros Rekatsinas, Doctoral Dissertation, 2015


Database Management Systems
(CS564, Computer Sciences Department, University of Wisconsin-Madison)
Fall 2017


PC-Member: SIGMOD 2017-2018, VLDB 2017, NIPS 2015-2017, IJCAI 2016, CIKM 2017