C. Gokhale, S. Das, A. Doan, J. Naughton, N. Rampali, J. Shavlik & X. Zhu (2014).
Corleone: Hands-Off Crowdsourcing for Entity Matching.
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, Snowbird, Utah.
This publication is available in PDF.
Abstract:
Recent approaches to crowdsourcing entity matching (EM) are limited in that they crowdsource only parts of the EM workflow, requiring a developer to execute the remaining parts. Consequently, these approaches do not scale to the growing EM need at enterprises and crowdsourcing startups, and cannot handle scenarios where ordinary users (i.e., the masses) want to leverage crowdsourcing to match entities. In response, we propose the notion of hands-off crowdsourcing (HOC), which crowdsources the entire workflow of a task, thus requiring no developers. We show how HOC can represent a next logical direction for crowdsourcing research, scale up EM at enterprises and crowdsourcing startups, and open up crowdsourcing for the masses. We describe Corleone, a HOC solution for EM, which uses the crowd in all major steps of the EM process. Finally, we discuss the implications of our work to executing crowdsourced RDBMS joins, cleaning learning models, and soliciting complex information types from crowd workers.
Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison
INFORMATION
~ PEOPLE
~ GRADS
~ UNDERGRADS
~ RESEARCH
~ RESOURCES
5355a Computer Sciences and Statistics ~ 1210 West Dayton Street, Madison,
WI 53706
cs@cs.wisc.edu ~ voice: 608-262-1204 ~
fax: 608-262-9777