Professor, Database Group
Department of Computer Sciences, University of Wisconsin
Room 4355, 1210 W. Dayton St, Madison WI 53706
email@example.com, (608) 262 9759
News (Old News)
- Oct 2016: A talk on a system-building agenda for data integration (and data science).
The Magellan system described below is an example of realizing this agenda for entity matching.
- Jul 2016: Launching Magellan,
a new project to build an entity matching management
system. Magellan guides users through the EM workflow, step by
step. It provides automated tools to address the "pain points" of
the steps, and these tools seek to cover the entire EM
workflow. Finally, tools are built on top of the Python data science
and big data eco-system.
Research (Group's Homepage)
Data management, focusing on data integration, data
science, big data, and data-centric software eco-systems.
My work has charted new directions or bet on emerging directions that
I believe would become fundamental for data management. I have been
working on five such directions. The two current directions (from
The past three directions (from 2000-2010):
In between, from 2010 to 2014, I spent
some time in Silicon Valley, at a startup and an e-commerce company,
putting my work in the above three directions to use, and learning a
ton about doing things "in the wild".
- Building data integration systems
- Developing an agenda for building data integration systems (see a recent talk).
- As a part of the above agenda, developing
Magellan, an entity matching management system situated within the PyData eco-system.
- Data science.
Selected Recent Publications
(Google Scholar Entry)
Selected Awards and Honors
- Human-in-the-Loop Challenges for Entity Matching: A Midterm Report,
A. Doan, A. Ardalan, J. Ballard, S. Das, Y. Govind, P. Konda, H. Li, S. Mudgal, E. Paulson, P. Suganthan G.C., H. Zhang.
HILDA Workshop @ SIGMOD-17.
- Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services,
S. Das, P. Suganthan G.C., A. Doan, J. Naughton, G. Krishnan, R. Deep, E. Arcaute, V. Raghavendra, Y. Park.
SIGMOD-17. extended version
- Towards Interactive
Debugging of Rule-Based Entity Matching, F. Panahi, W. Wu,
A. Doan, J. Naughton, EDBT-17.
- Magellan: Toward Building
Entity Matching Management Systems, P. Konda, S. Das,
P. Suganthan G.C., A. Doan, A. Ardalan, J. R. Ballard, H. Li,
F. Panahi, H. Zhang, J. Naughton, S. Prasad, G. Krishnan, R. Deep,
V. Raghavendra. VLDB-16. extended version
- Magellan: Toward
Building Entity Matching Management Systems over Data Science
Stacks, P. Konda, S. Das, P. Suganthan G.C., A. Doan,
A. Ardalan, J. R. Ballard, H. Li, F. Panahi, H. Zhang, J. Naughton,
S. Prasad, G. Krishnan, R. Deep, V. Raghavendra. VLDB-16,
- The Beckman Report on Database Research,
with many authors. Communications of the ACM, 2016. extended version
- Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing,
C. Sun, N. Rampalli, F. Yang, A. Doan. VLDB-14, industrial paper. slides
- Corleone: Hands-off Crowdsourcing for Entity Matching,
C. Gokhale, S. Das, A. Doan, J. Naughton, N. Rampalli, J. Shavlik, J. Zhu.
SIGMOD-14. slides, extended report
- Social Media Analytics: the Kosmix Story, with many authors.
IEEE Data Engineering Bulletin, 2013.
- Entity Extraction,
Linking, Classification, and Tagging for Social Media: A
Wikipedia-Based Approach, A. Gattani, D. Lamba, N. Garera,
M. Tiwari, X. Chai, S. Das, S. Subramaniam, A. Rajaraman,
V. Harinarayan, and A. Doan. VLDB-13, industrial paper. slides
- Building, Maintaining, and Using
Knowledge Bases: A Report from the Trenches, O. Deshpande,
D. Lamba, M. Tourn, S. Das, S. Subramaniam, A. Rajaraman,
V. Harinarayan, A. Doan. SIGMOD-13, industrial paper. slides
- Crowdsourcing Systems on the World-Wide Web,
A. Doan, R. Ramakrishnan, A. Halevy. Communications of the ACM.
Recent classes include data science at the undergrad and graduate
levels, and CS 564 (Introduction to RDBMSs).
- I spent 3 years (2011-2014) setting up a
professional MS program and a
program in CS at UW-Madison (with help from Karu Sankaralingam,
Jeff Naughton, and Suman Banerjee). These programs have been highly
successful, enrolling hundreds of students.
- Selected recent community service
- member, SIGMOD Advisory Board,
- member, ICDE 10-Year Most Influential Paper Award Committee (since 2013),
- associate editor, VLDB-16,
- co-chair, industrial program, VLDB-15,
- co-chair, Beckman meeting (with Mike Carey), 2013,
- chair, industrial program, SIGMOD-12.
- I co-authored a data
integration textbook with Alon Halevy and Zack Ives in 2012.