- I'm moving to Stanford (starting this summer). Bucky, you gave me a great home with great students, colleagues, and cheese curds. Thank you!
YouTube
We're now mirroring our videos on a YouTube channel, HazyResearch. Feedback
(positive and negative) is more than welcome.
A few projects are already up:
- GeoDeepDive. With Shanan Peters (UW Geoscience) and help from Miron Livny of Condor, we are combining Macrostrat with DeepDive to (hopefully!) deliver value for geoscientists. One key challenge is extracting the measurement information reported in the literature, which is buried in the dark data of text, graphs, and figures. A demo video! Thank you to the National Science Foundation and Google for supporting this work.
- IceCube. Mark Wellons, Ben Recht, and I have done some work with the IceCube Neutrino Detector. Mark's code now runs in the detector at the South Pole and is used on over 250 million events per day. More details are in this video and this new video! Thank you to the IceCube Collaboration and UW Graduate School for their support of our work!
Upcoming Meetings and Talks
- SILO. Ben Recht and Rob Nowak organize a great meeting.
- PODS13. Dan Suciu and I are organizing a Colloquium on Theory Challenges in Big Data. Dan got a great set of speakers, and I'm excited to hear what they say!
- GraphLab! Carlos has a company, and they are awesome. I'll be talking at their workshop.
- BNCOD13 and DEOS. I'm giving an invited tutorial and talk. Dan Olteanu put together a very interesting BNCOD program, and Wolfgang Gatterbauer put together a very fun DEOS program.
- SPARC13. There is an awesome program with tutorials and incredible invited speakers (modulo a poorly chosen database guy...). I'm excited!
- I'm giving a keynote at ECML-PKDD 2013!
- I will be hanging out at Simons big data events at Cal. They promise to be off-the-charts good! Check 'em out!
Recent Papers, Manuscripts, and Funding News
- VLDB 2013. We have two demos accepted to VLDB. Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System with Oracle people and Ringtail: Nowcasting Made Simple with Michigan people.
- ACL 2013. Vidhya and Ce have an awesome new paper Understanding Tables in Context using Standard NLP Toolkits. We use this work in GeoDeepDive.
- WebDB. Dolan Antenucci, Michael Cafarella, Margaret C. Levenstein, Matthew Shapiro, and I have a paper, Ringtail: Nowcasting Made Easy, in WebDB 2013 with SIGMOD. (Formerly called Automan.)
- StarAI. Sriraam Natarajan, José Picado, Tushar Khot, Kristian Kersting, Jude Shavlik, and my paper Using Commonsense Knowledge to Automatically Create (Noisy) Training Examples from Text has been accepted to StarAI with AAAI 2013.
- ICRC. Mark Wellons's work on making a robust statistical detector for IceCube has been accepted to the International Cosmic Ray Conference 2013. This is joint work with the IceCube Collaboration. Thanks, IceCube!
- AirForce. Thank you to the Air Force for supporting Mathematical Foundations of Secure Computing Clouds; this is the hard work of Jordan Ellenberg (Math), Ben Recht (CS), Tom Ristenpart (CS), Rob Nowak (EE), and Steve Wright (CS).
- MPC. My paper with Ben Recht about Matrix Factorization (Jellyfish) has been accepted to Mathematical Programming Computation. The latest version of the paper is here; it supersedes the 2011 version.
- Alfred P. Sloan Research Fellowship. Thank you for your generous support of our research!
- Oracle. Thank you for your continuing support of the Hazy Research group! This gift will be used to continue our work on feature engineering for structured analytics.
- SIGMOD13. Towards High-Throughput Gibbs Sampling at Scale: A Study across Storage Managers. The paper is here. This is the latest version of our inference engine that runs GeoDeepDive and DeepDive. Code is here. A new video and code release are coming soon!
- SIGMOD13 (demo). GeoDeepDive: Statistical Inference using Familiar Data-Processing Languages has been accepted as a demonstration! Roughly, it will be a live version of how to build the system in this video and this paper.
- ACM Queue and CACM. The students of the Hazy group put together a manuscript describing their vision for Big Data Analytics, Hazy: Making it Easier to Build and Maintain Big-data Analytics, in ACM Queue. It was also invited to CACM.













