Crowdsourcing (2002–2015)

This project was among the first in the database research community to explore crowdsourcing for data management, with a focus on building data integration systems.

In many ways, it was ahead of its time by nearly a decade: broad interest in crowdsourcing within the database community did not emerge until around 2011.

The project examined how to apply crowdsourcing to problems such as schema matching, knowledge base construction, entity matching, and social media analysis.

In 2019, I started a new project called Cymphony, in collaboration with the Qatar Computing Research Institute, to build a general-purpose crowdsourcing platform. That project is ongoing.


Early Work (2002–2004)

I focused on applying crowdsourcing to schema matching. At the time, the term "crowdsourcing" had not yet been coined, so this approach was referred to as "mass collaboration." The core idea, however, was the same: pose a question, collect multiple responses, and aggregate them (e.g., through majority voting).
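
The aggregation step described above can be illustrated with a minimal Python sketch of majority voting. This is not the project's actual code: the function name `majority_vote`, the sample question, and the choice to return `None` on a tie are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(responses):
    """Return the most frequent response, or None if there is no clear winner.

    Hypothetical helper: abstaining on ties and on empty input is an
    assumption of this sketch, not something the project prescribes.
    """
    if not responses:
        return None
    top = Counter(responses).most_common(2)
    # A tie between the two most frequent answers means no majority.
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


# Hypothetical question posed to several users:
# "Does attribute 'phone' in schema A match 'contact_number' in schema B?"
answers = ["yes", "yes", "no", "yes", "no"]
result = majority_vote(answers)
```

Here three of the five respondents answer "yes", so `result` is `"yes"`. A real deployment would likely also need to weigh how reliable each respondent is, which this sketch ignores.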


Work on Crowdsourcing Knowledge Bases (2005–2009)

Subsequently, I focused on using crowdsourcing to build community-centric knowledge bases, and deployed one such knowledge base, DBLife.


Surveys


Crowdsourcing Work in Silicon Valley (2010–2015)

I worked extensively on crowdsourcing in industry from 2010 to 2015, first at Kosmix and later at WalmartLabs.