CS 764: Topics in Database Management Systems

Spring 2007, Tue/Thur 2:30-3:15pm, room 1289

Instructor
AnHai Doan, contact information available from my homepage


Course Description
This course studies principles for managing data effectively. We will focus mostly on how these principles have been developed and implemented for relational databases. But we will also briefly explore how they can be augmented and applied well beyond relational contexts, to managing text data, emails, scientific databases, and data on the Web. This part will provide a glimpse into next-generation search engines, business intelligence, and unstructured data management systems.

Prerequisites: Undergraduate knowledge of relational databases is highly recommended. If you lack this background, you should be willing to take a "crash course" on the topic in the first few weeks. The recommended books for the crash course are The Cow Book or The Complete Book.


Course Format
The course meets twice a week to discuss research papers. You are required to read the specified paper before each lecture and attend the lectures. There will be a midterm and a final, and a project that is done in teams of 1-3 persons. The default project is to extend a community information management system such as DBLife, but students are free to propose their own ideas. At the end of the semester there may be a short project presentation and/or a report (open to discussion).

Midterm: March 8, in class at the usual time and room. Final: TBD by the university.
Other important dates: Apr 3-5: no class (spring break); Apr 17-19: I'm away at the ICDE conference, and Jeff Naughton will substitute.

Grade: midterm = 30%, final = 30%, project = 30%, report/discussion/participation = 10%.


Course Schedule
Papers are drawn mostly from the following list, which comes from the Red Book. Each paper will be covered in 1-2 lectures. Papers will be covered mostly in the order listed below, though I may still move them around a bit. When that happens, I will let you know in advance.

NEW: Project description

First two lectures: the big picture.

Query processing
Query optimization
Join algorithms

Concurrency control
Granularity of locks
Optimistic CC
Oracle CC
B-tree locking

Crash recovery
Aries recovery
2-phase commit

Buffer Management
Buffer Management

Parallel and distributed databases
Parallel databases
Distributed databases
Dangers of replication

Access Methods
R-trees
Bitmap indexes

Misc.
Bucky benchmark (O/R DBMS)
ADTs in DBMS
C-store, C-store paper
XQuery
Data models: from hierarchical to XML
Model management, schema evolution