Photo credit: Hector Garcia-Molina

 

Theodoros (Theo) Rekatsinas

I am an assistant professor at the University of Wisconsin-Madison. I am a member of the UW-Madison Database Group.

Before joining UW, I was a Postdoc at Stanford with Chris Ré. I received my Ph.D. in Computer Science from the University of Maryland. My advisors were Amol Deshpande and Lise Getoor.

ML-first Data Integration and Enrichment: My group explores the fundamental connections that data integration and data enrichment have with statistical learning and probabilistic inference. Our latest effort in ML-first data enrichment is HoloClean, which is built on the ideas of weak supervision and probabilistic inference; see our blog post. Systems like HoloClean transition from logic to probabilities, much like the AI revolution of the eighties.

Email: thodrek [at] cs.wisc.edu  /  Office: CS4361 @ Computer Sciences



News

  • How can data cleaning be addressed via a noisy channel model? Our work on Probabilistic Unclean Databases will appear at ICDT 2019!
  • The slides of our tutorial on the synergy between ML and data integration are available here.
  • Excited to release HoloClean as an open-source project! Check it out here!
  • Learn about ML-first data integration @SIGMOD2018 and @VLDB2018: our tutorial with Xin Luna Dong explains the synergies between data integration and machine learning!



Publications

NEW! A Formal Framework For Probabilistic Unclean Databases
Christopher De Sa, Ihab F. Ilyas, Benny Kimelfeld, Christopher Ré and Theodoros Rekatsinas
ICDT 2019 (To Appear)

Data Integration and Machine Learning: A Natural Synergy
Xin Luna Dong and Theodoros Rekatsinas
Tutorial@SIGMOD 2018 and @VLDB2018

Deep Learning For Entity Matching: A Design Space Exploration
Sidharth Mudgal, Han Li, AnHai Doan, Theodoros Rekatsinas, Youngchoon Park, Ganesh Krishnan, Rohit Deep, Esteban Arcaute, and Vijay Raghavendra
SIGMOD 2018. Code is available here

Fonduer: Knowledge Base Construction from Richly Formatted Data
Sen Wu, Luke Hsiao, Xiao Cheng, Braden Hancock, Theodoros Rekatsinas, Philip Levis and Christopher Ré
SIGMOD 2018

HoloClean: Holistic Data Repairs with Probabilistic Inference
Theodoros Rekatsinas, Xu Chu, Ihab F. Ilyas and Christopher Ré
VLDB 2017

SLiMFast: Guaranteed Results for Data Fusion and Source Reliability
Theodoros Rekatsinas, Manas Joglekar, Hector Garcia-Molina, Aditya Parameswaran and Christopher Ré
ACM SIGMOD 2017

Forecasting Rare Disease Outbreaks from Open Source Indicators
Theodoros Rekatsinas, Saurav Ghosh, Sumiko Mekaru, Elaine Nsoesie, John Brownstein, Lise Getoor and Naren Ramakrishnan
Journal of Statistical Analysis and Data Mining, Best of SDM Special Issue, 2016

SourceSight: Enabling Effective Source Selection
Theodoros Rekatsinas, Amol Deshpande, Xin Luna Dong, Lise Getoor and Divesh Srivastava
ACM SIGMOD, 2016

HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades
Xinran He, Theodoros Rekatsinas, James Foulds, Lise Getoor, and Yan Liu
International Conference on Machine Learning (ICML), 2015

StoryPivot: Comparing and Contrasting Story Evolution
Anja Gruenheid, Donald Kossmann, Theodoros Rekatsinas, and Divesh Srivastava
ACM SIGMOD, 2015

SourceSeer: Forecasting Rare Disease Outbreaks Using Multiple Data Sources (Best Paper Award)
Theodoros Rekatsinas, Saurav Ghosh, Sumiko Mekaru, Elaine Nsoesie, John Brownstein, Lise Getoor and Naren Ramakrishnan
SIAM International Conference on Data Mining (SDM), 2015

Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration
Theodoros Rekatsinas, Xin Luna Dong, Lise Getoor and Divesh Srivastava
7th Biennial Conference on Innovative Data Systems Research (CIDR), 2015

Characterizing and selecting fresh data sources
Theodoros Rekatsinas, Xin Luna Dong and Divesh Srivastava
ACM SIGMOD, 2014

SPARSI: partitioning sensitive data amongst multiple adversaries
Theodoros Rekatsinas, Amol Deshpande and Ashwin Machanavajjhala
Proceedings of the VLDB Endowment Volume 6 Issue 13, 2013

Multi-relational Learning Using Weighted Tensor Decomposition with Modular Loss
Ben London, Theodoros Rekatsinas, Bert Huang and Lise Getoor
NIPS 2012 Workshop on Spectral Algorithms for Latent Variable Models

Local structure and determinism in probabilistic databases
Theodoros Rekatsinas, Amol Deshpande and Lise Getoor
ACM SIGMOD 2012

Fuzzy rule based neuro-dynamic programming for mobile robot skill acquisition on the basis of a nested multi-agent architecture (Best Of Conference)
John Karigiannis, Theodoros Rekatsinas and Costas S. Tzafestas
IEEE International Conference on Robotics and Biomimetics (ROBIO), 2010



Manuscripts

Adaptive Querying Strategies for Efficient Crowdsourced Data Extraction
Theodoros Rekatsinas, Amol Deshpande and Aditya Parameswaran, 2016

Quality-Aware Data Source Management
Theodoros Rekatsinas, Doctoral Dissertation, 2015



Students

Current PhD Students:

Current MS and Undergraduate Students:

  • Meghana Moorthy Bhat
  • Joshua McGrath
  • Jordan Vonderwell

Friends and Collaborators:

  • Alireza Heidari (Waterloo)
  • Han Li (UW-Madison)
  • Sen Wu (Stanford)

Alumni:

  • Sidharth Mudgal (MS 2018, Amazon)
  • Sherine Zhang (BS 2018, Stanford for MS)



Teaching

CS839: Probabilistic Graphical Models, Fall 2018

CS839: Data Management for Machine Learning, Spring 2018

CS564: Database Management Systems, Fall 2017



Service

PC Demo Chair: ICDE 2019

PC-Member: SIGMOD 2017-2019, VLDB 2017, ICDE 2018, NIPS 2015-2017, ICML 2018, IJCAI 2016, CIKM 2017-2018

Reviewer: ICML, SIGMOD, VLDB, WSDM, WWW, TKDE, TODS, TSAS, SIGMOD Record