Sanjib Das

4352 Computer Sciences,
1210 West Dayton Street,
Madison, WI 53706
Email: sanjibkd at cs dot wisc dot edu

I am a third-year graduate student in the Department of Computer Sciences at the University of Wisconsin-Madison. Before joining UW-Madison, I worked at Oracle Server Technologies, Bangalore, for three years. Before that, I graduated with a BTech in Computer Science from the Indian Institute of Technology (IIT), Kharagpur.


I am interested in the broad area of data management, and more specifically in Big Data, information extraction, and crowdsourcing. I work as a research assistant under Prof. AnHai Doan.

Currently, I am looking into the problem of Entity Matching (also known as Data Matching, Entity Resolution, or Record Linkage). Previously, I worked on building large-scale knowledge bases and using them to perform information extraction from text.


  • Entity Extraction, Linking, Classification, and Tagging for Social Media: A Wikipedia-Based Approach
    Abhishek Gattani, Digvijay S. Lamba, Nikesh Garera, Mitul Tiwari, Xiaoyong Chai, Sanjib Das, Sri Subramaniam, Anand Rajaraman, Venky Harinarayan, AnHai Doan
    To appear in VLDB (Industrial Track) 2013
  • Building, Maintaining, and Using Knowledge Bases: A Report from the Trenches
    Omkar Deshpande, Digvijay S. Lamba, Michel Tourn, Sanjib Das, Sri Subramaniam, Anand Rajaraman, Venky Harinarayan, AnHai Doan
    SIGMOD (Industrial Track) 2013

Courses towards Major:

Spring 2012
CS761: Advanced Machine Learning (Jerry Zhu)

Fall 2011
CS784: Advanced Topics in Database Management Systems (AnHai Doan)
CS760: Machine Learning (David Page)
CS537: Introduction to Operating Systems (Remzi Arpaci-Dusseau)

Spring 2011
CS769: Advanced Natural Language Processing (Ben Snyder)
CS525: Linear Programming Methods (Ben Recht)

Fall 2010
CS764: Topics in Database Management Systems (Jignesh Patel)
CS547: Computer Systems Modeling Fundamentals (Mary Vernon)

Courses towards Minor (distributed across Mathematics, Optimization and Statistics):

ISyE635: Tools and Environments for Optimization (Jeff Linderoth)
MA632: Introduction to Stochastic Processes (David Anderson)

Fun Courses:

PE175: Volleyball I (for beginners)
PE262: Tennis II (for intermediates)
PE102: Swimming I (for beginners)


Last updated: July 30, 2013