CS 784: Advanced Topics in Database Management Systems

Fall 2014, Mon/Wed 2:30-3:45pm, room 1257 COMP S&ST


AnHai Doan, contact information available from my homepage. Office hours: Mon 5-6pm and by appointment (pls send email, thanks). If you show up at 5pm on Monday and I am still on the phone, pls wait a few mins; I will wrap up shortly. Thank you.

Course Description
The official name of this course is "Data models and Languages", a legacy name left over from the past. What this course actually covers is fundamental and hot data management issues beyond relational data management. The goals are to help students prepare for the database qualifying exam and to expose them to current hot and interesting trends in beyond-relational data management.
Prerequisites: Undergraduate knowledge of relational databases is highly recommended. If you do not have it, you should be willing to do a "crash course" on the topic in the first few weeks. The recommended books for the crash course are: The Cow Book, or The Complete Book.

Course Format
The course meets twice a week to discuss research papers. You are required to read the specified paper/textbook chapter/slides before each lecture and attend the lectures. There will be a midterm, a final, and a project.

Midterm: TBD, in class at usual time/room.
Final: TBD, in class at usual time/room.
Other important dates: Nov 27 - Nov 30: Thanksgiving break; last class: Wed Dec 10.

Grade: Midterm: 30%, final: 30%, project: 30%, participation in the class: 10%.

Course Schedule
The course schedule and paper list are below (they may be revised slightly as the course progresses). Each paper will be covered in 1-2 lectures.

Data Integration
Several chapters from a data integration textbook (available on Amazon; I will send out the chapters that you have to read shortly). Slides for these chapters are available from the book's Web site.

IR / Web Search
Read Chapter 27 (IR and XML Data) of the Cow Book, but only from 27.1 to 27.5.
IR overview
Web search, PageRank

Deductive Databases
Read Chapter 24 (Deductive Databases) of the Cow Book.
Deductive databases (Datalog), Ullman notes
Evaluation of recursive programs (scan it only)

Data Mining (tentative, awaiting syncing with CS 764)
Read Chapter 26 (Data Mining) of the Cow Book.
Data mining: association rules
Data mining: clustering

Colliding Worlds: Hot Emerging Topics
We will cover several hot emerging topics, such as big data, NoSQL, crowdsourcing, information extraction, and social media analysis.

Details to be posted later