Research Interests:
statistical machine learning, natural language processing
Publications
[ by date | by topic ]
Current Professional Activities
Program Committee: ICML 2008, AAAI 2008, ACL 2008, EMNLP 2008
Program Committee: ICML 2007 (SPC), AAAI 2007, AISTATS 2007, ECML 2007
Ph.D. Carnegie Mellon University, 2005.
(C.V.)
Research
My research interests are in statistical machine learning algorithms and their applications in various areas, including natural language processing, cognitive science, and human-computer interfaces. Some of my group's current research topics are:
Semi-Supervised Learning
Semi-supervised learning is a machine learning paradigm that uses both labeled and unlabeled data to improve on what can be learned from labeled data alone.
It is of great practical interest because it can reduce human annotation effort while maintaining learning quality.
It is of great theoretical interest because natural systems, including ourselves, seem to learn from both labeled and unlabeled data too.
We are working on several open questions, including designing robust semi-supervised learning algorithms,
handling huge amounts of labeled and unlabeled data,
and studying semi-supervised learning behaviors in natural learning systems (e.g., humans).
See my semi-supervised learning literature survey and my tutorial at ICML 2007.
[more publications]
Text-to-Picture Synthesis
We are developing a human-computer interaction technique
called "Text-to-Picture synthesis", so that computers can automatically
generate pictures from natural language sentences. The
goal is for the picture to convey the gist of the text. It may be
useful as a learning aid for children and second language learners, and as an assistive communication tool for people with learning disabilities.
Details of the project can be found here.
Natural Language Processing
We are interested in applying novel statistical machine learning techniques to natural language processing,
for example, diversity ranking and sentiment classification.
[publications]
Applications of Statistical Machine Learning
We are interested in applying statistical machine learning to various traditional computer science problems,
including latent topic models for statistical software debugging,
Markov random fields for binary code analysis,
and support vector regression for network throughput prediction.
[publications]
AI group at UW-Madison
Courses
CS 769 Advanced Natural Language Processing: 2008(S)
CS 838 Topics on Advanced Natural Language Processing: 2007(S), 2006(S)
CS 540 Introduction to Artificial Intelligence: 2006(F), 2005(F)
Students
David Andrzejewski (with Mark Craven)
Nate Fillmore
Andrew Goldberg
Lijie Heng
Alumni
Jurgen Van Gael, M.S. 2007, now Ph.D. student at University of Cambridge
Not sure how to pronounce Chinese names like Zhu, Cai, Qin, Xu?
Learn how in five minutes.