I was a PhD student in the Computer Sciences department at UW-Madison and graduated in August 2012. My next stop is Google, starting in September 2012.
My research interests include
(1) how to build scalable statistical inference/learning/analytics systems; and
(2) how to apply such systems to challenging problems such as text analytics.
I'm co-advised by Christopher Ré and AnHai Doan. Currently I mainly work on the Hazy project and the DARPA Machine Reading program. Please check out our Tuffy and Felix systems, in which we apply data management techniques to scale up a probabilistic logic language called Markov logic.
I got a B.S. in Computer Science from Tsinghua University in 2008.
- Office: Rm 4397 CS Building
- Email: leonn ☢ cs ☀ wisc ☀ edu
- An encyclopedia by the machines, for the people! Are you a student/scholar wishing for an exo-memory of every piece of knowledge in your field? Are you an investigative journalist or consultant detective trying to connect dots? Are you an attorney having to sift through thousands of pages of text? Are you tired of big G's superficiality? DeepDive (and successors) is here to help!
- An operator-based approach to statistical inference in Markov logic that breaks away from existing monolithic approaches by identifying specialized subtasks in the program and solving them using specialized algorithms. It intentionally mimics an RDBMS.
- A scalable Markov logic inference engine built on an RDBMS.
Solving a Markov logic program involves two phases: grounding and search.
We scale up grounding with SQL and search with graph partitioning.
- Demonstrating a few thoughts about keyword search on relational data.
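As a rough illustration of the grounding phase mentioned above for the RDBMS-based Markov logic engine, here is a minimal sketch in Python with SQLite. The rule, the table names (`wrote`, `cites`), and the helpers `ground_rule` and `demo` are hypothetical and chosen only for this example; this is not the actual schema or code of any of the systems listed here, and the search phase is not shown.

```python
import sqlite3


def ground_rule(conn):
    """Ground the hypothetical rule
        wrote(x, p) AND cites(p, q) AND wrote(y, q) => knows(x, y)
    by joining the evidence tables. Each result row (x, y) stands for
    one ground clause whose head is the query atom knows(x, y)."""
    sql = """
        SELECT DISTINCT w1.author AS x, w2.author AS y
        FROM wrote AS w1
        JOIN cites AS c  ON c.src = w1.paper
        JOIN wrote AS w2 ON w2.paper = c.dst
        ORDER BY x, y
    """
    return conn.execute(sql).fetchall()


def demo():
    # Toy evidence, made up for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE wrote (author TEXT, paper TEXT);
        CREATE TABLE cites (src TEXT, dst TEXT);
        INSERT INTO wrote VALUES ('ann', 'p1'), ('bob', 'p2'), ('cat', 'p3');
        INSERT INTO cites VALUES ('p1', 'p2'), ('p3', 'p2');
    """)
    return ground_rule(conn)


if __name__ == "__main__":
    for x, y in demo():
        print(f"knows({x}, {y})")
```

The point of the sketch is that the database's join machinery enumerates the variable bindings that satisfy the rule body, rather than nested loops in application code.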
Web-scale Knowledge-base Construction via Statistical Inference and Learning
Feng Niu. PhD Dissertation, 2012, UW-Madison.
DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference
With Ce Zhang, Christopher Ré, and Jude Shavlik - VLDS 2012
Elementary: Large-scale Knowledge-base Construction via Machine Learning and Statistical Inference
With Ce Zhang, Christopher Ré, and Jude Shavlik - IJSWIS-WEKEX 2012
Big Data versus the Crowd: Looking for Relationships in All the Right Places
With Ce Zhang, Christopher Ré, and Jude Shavlik - ACL 2012
Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
With Benjamin Recht, Christopher Ré, and Stephen J. Wright - NIPS 2011
Felix: Scaling Inference for Markov Logic with a Task-Decomposition Approach
With Ce Zhang, Christopher Ré, and Jude Shavlik - CoRR 2011
Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS
With Christopher Ré, AnHai Doan, and Jude Shavlik - VLDB 2011
The Price of Anarchy in Bertrand Games
With Shuchi Chawla - EC 2009
Videos from HazyResearch
- Article: Branch-and-bound cache
Concept of an elastic cache mechanism for the Net.
- Game: Ninja Parkour
Tilt maze remade and rebranded. Requires .NET framework 3.5+.
- Game: Gomoku
Requires .NET framework 3.5+.