Arun Kumar
4354 Computer Sciences, 1210 West Dayton Street, Madison, WI-53706
Email: (1stname)@cs.wisc.edu
Phone: 608-890-0447
I am a fourth-year graduate student in the Department of Computer Sciences at the University of Wisconsin-Madison. I obtained my Bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology, Madras.
Research:
My primary research interests are in data management, especially the intersection of data management and machine learning (popularly known as "Big Data Analytics"). I am also interested in applied machine learning. I am advised by Professor Christopher Ré in the Database Group here. My current research focuses on problems in data analytics and the management of uncertain data, as part of Project Hazy. I spent two summers interning at Oracle Labs (2012) and IBM Research Almaden (2011), working on projects in data analytics. I was an RA at the Microsoft Jim Gray Systems Lab for a year (2010-11). In the past, I have worked on data management issues in networked systems.
News:
Projects/Publications:
Brainwash: In this project, we aim to systematize the black art of feature engineering in analytics using data management ideas.
In this project, we are building a unified system to handle several data analytics techniques. We use some ideas from the mathematical optimization literature and standard RDBMS features to achieve simplicity, efficiency and scalability. We also contributed some code to the open-source library MADlib.
In this project, we integrate the management of uncertain content, specifically Optical Character Recognition (OCR) data, with an RDBMS. We use a probabilistic model and some ideas from statistics to trade off quality against performance.
Past Projects/Publications:
InfoNames: Information-based naming of networked content
Courses:
MA443: Applied Linear Algebra (Fall 2012)
ST632: Stochastic Processes (Fall 2011)
CS769: Advanced Natural Language Processing (Spring 2011)
CS537: Operating Systems (Spring 2011)
CS760: Machine Learning (Fall 2010)
CS787: Advanced Algorithms (Fall 2010)
CS764: Advanced Database Management Systems (Spring 2010)
CS740: Advanced Computer Networks (Spring 2010)
CS784: Data Models and Languages (Fall 2009)
CS838: Rethinking the Internet Architecture (Fall 2009)
Misc fun stuff:
Last Updated: Dec 25, 2012