I was a Ph.D. student working with AnHai Doan in the Computer Science Department at the University of Wisconsin, Madison.
My main research interest is in applying database, Web, and AI technologies to
data management problems, especially the management of unstructured data.
Toward this goal, my research has focused on key problems such as information
extraction and data integration.
Additionally, I have been a key architect for
a prototype structured Web portal that extracts and integrates information
for the database research community.
The Case for a Structured Approach to Managing Unstructured Data. AnHai Doan, Jeffrey Naughton, Akanksha Baid, Xiaoyong Chai, Fei Chen, Ting Chen, Eric Chu, Pedro DeRose, Byron Gao, Chaitanya Gokhale, Jiansheng Huang, Warren Shen, Ba-Quy Vuong. CIDR-09.
Information Extraction Challenges in Managing Unstructured Data. AnHai Doan, Jeffrey Naughton, Raghu Ramakrishnan, Akanksha Baid, Xiaoyong Chai, Fei Chen, Ting Chen, Eric Chu, Pedro DeRose, Byron Gao, Chaitanya Gokhale, Jiansheng Huang, Warren Shen, Ba-Quy Vuong. SIGMOD Record, Special Issue on Managing Information Extraction, Winter 08.
Toward Best-effort Information Extraction. Warren Shen, Pedro DeRose, Robert McCann, AnHai Doan, Raghu Ramakrishnan. SIGMOD-08.
Matching Schemas in Online Communities: A Web 2.0 Approach. Robert McCann, Warren Shen, AnHai Doan. ICDE-08.
Building Community Wikipedias: A Human-Machine Approach. Pedro DeRose, Xiaoyong Chai, Byron Gao, Warren Shen, AnHai Doan, Philip Bohannon, Jerry Zhu. ICDE-08.
Declarative Information Extraction Using Datalog with Embedded Extraction Predicates. Warren Shen, AnHai Doan, Jeffrey Naughton, Raghu Ramakrishnan. VLDB-07.
Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach. Pedro DeRose, Warren Shen, Fei Chen, AnHai Doan, Raghu Ramakrishnan. VLDB-07.
Source-aware Entity Matching: A Compositional Approach.
Warren Shen, Pedro DeRose, Long Vu, AnHai Doan, Raghu Ramakrishnan. ICDE-07. 122/659 = 18%.
DBLife: A Community Information Management Platform for the Database Research Community (demo).
Pedro DeRose, Warren Shen, Fei Chen, Yoonkyong Lee, Doug Burdick, AnHai Doan, Raghu Ramakrishnan. CIDR-07.
User-Centric Research Challenges in Community Information Management Systems. AnHai Doan, Philip Bohannon, Raghu Ramakrishnan, Xiaoyong Chai, Pedro DeRose, Byron Gao, Warren Shen. IEEE Data Engineering Bulletin, Special Issue on data management in social
Community Information Management.
AnHai Doan, Raghu Ramakrishnan, Fei Chen, Pedro DeRose, Yoonkyong Lee, Robert McCann, Mayssam Sayyadian, Warren Shen.
IEEE Data Engineering Bulletin, Special Issue on Probabilistic Databases, 29(1), 2006.
Constraint-Based Entity Matching.
Warren Shen, Xin Li, AnHai Doan. AAAI-05 (Nat. Conf. on AI). 148/803 = 18%.
Integrating Data from Disparate Sources: A Mass Collaboration Approach.
Robert McCann, Alexander Kramnik, Warren Shen, Vanitha Varadarajan, Olu Sobulo, AnHai Doan. ICDE-05. Poster. 100/521 = 19%.
Collective Integration of Information for Virtual Organizations.
Robert McCann, Warren Shen, AnHai Doan. SIGMOD Workshop on Databases in Virtual Organizations (DIVO), 2004.
Ph.D. Computer Science, University of Illinois, Urbana-Champaign, 2009
M.S. Computer Science, University of Illinois, Urbana-Champaign, 2005
B.S. Computer Science, Stanford University, 2002