Yash Govind

1210 W Dayton St, Dept. of Computer Sciences, Madison, WI-53706 ygovind at cs.wisc.edu

I am a fifth-year Ph.D. student in the Dept. of Computer Sciences at the University of Wisconsin - Madison. My advisor is Professor AnHai Doan. My research involves building a system we call CloudMatcher, a hands-off cloud/crowd service for Entity Matching (EM) that uses machine learning techniques. It provides a robust and scalable self-service framework for building macro/micro services that perform end-to-end entity matching and other steps in the EM space. We envision CloudMatcher as a fast, easy-to-use, scalable, and highly available service on the web. Our CloudMatcher code is being deployed at American Family Insurance, a Fortune 500 company.

Before graduate school, I worked for 7 years in the insurance sector as a software engineer; my last position was at Humana Inc. in Green Bay, WI. In 2007, I graduated with a Bachelor's degree in Computer Sciences from Pt. Ravi Shankar Shukla University.

Research Interests

Data management, data integration, entity matching, machine learning, crowdsourcing


Entity Matching Meets Data Science: A Progress Report from the Magellan Project

Yash Govind, P. Konda, and others

Paper (Industrial track)


CloudMatcher: A Hands-Off Cloud/Crowd Service for Entity Matching [Demonstration Proposal]

Yash Govind, E. Paulson, P. Nagarajan, Paul S. G.C., AnHai Doan, Y. Park, G. M. Fung, D. Conathan, M. Carter, M. Sun


VLDB 2018

Toward a System Building Agenda for Data Integration (and Data Science)

AnHai Doan, P. Konda, Paul S. G.C., A. Ardalan, J. R. Ballard, S. Das, Yash Govind, H. Li, P. Martinkus, S. Mudgal, E. Paulson, H. Zhang


IEEE Data Engineering Bulletin 2018

Magellan: Toward Building Entity Matching Management Systems [SIGMOD Research Highlight]

P. Konda, S. Das, Paul S. G.C., P. Martinkus, AnHai Doan, A. Ardalan, J. R. Ballard, Yash Govind, H. Li, F. Panahi, H. Zhang, Jeff Naughton, S. Prasad, G. Krishnan, R. Deep,


SIGMOD Research Highlight 2018

CloudMatcher: A Cloud/Crowd Service for Entity Matching

Yash Govind, E. Paulson, M. Ashok, Paul G.C., A. Hitawala, AnHai Doan, Y. Park, P. Peissig, E. LaRose, J. Badger

Paper Talk


Human-in-the-Loop Challenges for Entity Matching: A Midterm Report

AnHai Doan, A. Ardalan, J.R. Ballard, S. Das, Yash Govind, P. Konda, H. Li, S. Mudgal, E. Paulson, Paul G.C., H. Zhang




Research Assistant

Department of Computer Sciences
Advisor: AnHai Doan

Building a cloud/crowd-based self-service framework for Entity Matching (EM): a platform that supports macro and micro services for performing the different steps in the EM space.

March 2016 - Present

Project Assistant

School of Education - UW Madison

Worked on the VidyaMap project, integrating digital text into design-based science classes using D3, Java, and MySQL.

January 2016 - December 2017

Student Researcher

UW School of Medicine and Public Health

Backend developer for the Macademia application at the UW Carbone Cancer Center. Developed WCF services to extract publication data from PubMed.

September 2015 - January 2016

Research Assistant

Department of Computer Sciences

Studied the copy-on-write (CoW) behaviour of the B-tree file system (Btrfs) and how Btrfs isolates data from metadata.

October 2014 - May 2015


Data Analytics Intern

American Family Insurance

Working to build and deploy the CloudMatcher solution at AmFam to match customers across multiple databases and to solve other matching use cases in the insurance domain.

May 2018 - Present

Systems Software Intern (File System)

Huawei Technologies

Worked on extending the IceFS solution to isolate metadata in the Ext3 file system dynamically, based on the size of the file system. Added space isolation: a cube (an abstraction) is allocated a specific number of block groups, and changes can be made only by an administrator using an online tool. Enhanced user-level tools (mke2fs, dumpe2fs, e2fsck, etc.).

May 2015 - August 2015

Project Lead

Humana Inc. (Green Bay, WI)

Developed and maintained web services and solutions for agent reporting, commissions, and bonuses as a backend developer. Developed ETL SSIS packages and improved the performance of SQL queries and packages.

October 2009 - August 2014

Software Developer

Tech Mahindra (Mahindra Satyam)

Worked as a mainframe developer and production support analyst for CIGNA.

September 2007 - October 2009