Paul Suganthan G. C.

You can contact me at:

Email: paul.suganthan@gmail.com        LinkedIn

Bio

I am a Ph.D. candidate in the Department of Computer Sciences at the University of Wisconsin-Madison. During my Ph.D., I interned at Google in the summer of 2015 and at Walmart Labs in the summer of 2014. Prior to joining UW-Madison, I obtained my Bachelor's degree in Computer Science from the College of Engineering Guindy, Anna University, India.

Research

I am interested in the broad area of Data Management, and more specifically in Data Integration, Data Science, Machine Learning, Big Data, and Crowdsourcing. My advisor is Prof. AnHai Doan.

During my Ph.D., I have worked on diverse problems such as crowdsourced entity matching (EM), scaling the execution of EM workflows that contain both human and machine computations, scaling the execution of ML models over joins (what if the join condition is an ML model?), and building EM management systems.

Publications

  1. MatchCatcher: A Debugger for Blocking in Entity Matching
    H. Li, P. Konda, Paul Suganthan G.C., A. Doan
    EDBT 2018 (To Appear)

  2. Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services  [Paper]   [SIGMOD Talk]
    S. Das, Paul Suganthan G.C., A. Doan, J. Naughton, G. Krishnan, R. Deep, E. Arcaute, V. Raghavendra, Y. Park
    SIGMOD 2017

  3. CloudMatcher: A Cloud/Crowd Service for Entity Matching  [Paper]
    Y. Govind, E. Paulson, M. Ashok, Paul Suganthan G.C., A. Hitawala, A. Doan, Y. Park, P. Peissig, E. LaRose, J. Badger
    BigDAS Workshop, KDD 2017

  4. Human-in-the-Loop Challenges for Entity Matching: A Midterm Report  [Paper]
    A. Doan, A. Ardalan, J. R. Ballard, S. Das, Y. Govind, P. Konda, H. Li, S. Mudgal, E. Paulson, Paul Suganthan G.C., H. Zhang
    HILDA 2017

  5. Magellan: Toward Building Entity Matching Management Systems  [Paper]
    P. Konda, S. Das, Paul Suganthan G.C., A. Doan, A. Ardalan, J. R. Ballard, H. Li, F. Panahi, H. Zhang, J. Naughton, S. Prasad, G. Krishnan, R. Deep, V. Raghavendra
    VLDB 2016

  6. Magellan: Toward Building Entity Matching Management Systems over Data Science Stacks  [Paper]
    P. Konda, S. Das, Paul Suganthan G.C., A. Doan, A. Ardalan, J. R. Ballard, H. Li, F. Panahi, H. Zhang, J. Naughton, S. Prasad, G. Krishnan, R. Deep, V. Raghavendra
    VLDB (Demo) 2016

  7. Why Big Data Industrial Systems Need Rules and What We Can Do About It  [Paper]
    Paul Suganthan G.C., C. Sun, Krishna Gayatri K., H. Zhang, F. Yang, N. Ram, S. Prasad, E. Arcaute, G. Krishnan, R. Deep, V. Raghavendra, A. Doan
    SIGMOD (Industrial Track) 2015

  8. Social Media Analytics: the Kosmix Story  [Paper]
    With many authors
    IEEE Data Engineering Bulletin, Sept 2013

  9. AJAX Crawler  [Paper]
    Paul Suganthan G.C.
    IEEE ICDSE 2012

Open Source Contributions

I am the main developer of two Python packages that provide tools for scalable string matching, py_stringmatching and py_stringsimjoin, and I manage their end-to-end development and release process.

The packages are currently used at multiple organizations (such as RIT, Johnson Controls, and Marshfield Clinic) and in data science classes at UW-Madison. They are available on PyPI and Conda. Feel free to contact me if you run into any issues with the packages.


Last updated: Aug 25, 2017