CS 764: Topics in Database Management Systems
Spring 2007, Tue/Thur 2:30-3:15pm, room 1289
Instructor
AnHai Doan,
contact information available from my homepage
Course Description
This course studies principles to
effectively manage data. We will focus mostly on how these principles
have been developed and implemented for relational databases. But we
will also briefly explore how they can be augmented and applied well
beyond relational contexts, to managing text data, emails, scientific
databases, and data on the Web. This part will provide a glimpse into
next-generation search engines, business intelligence, and
unstructured data management systems.
Prerequisites: Undergraduate knowledge of relational
databases is highly recommended. If you lack this background, you should be
willing to take a "crash course" on the topic in the first few weeks. The
recommended books for the crash course are:
The Cow Book, or
The Complete Book.
Course Format
The course meets twice a week to discuss research papers.
You are required to read the specified
paper before each lecture and attend the lectures. There will be a
midterm and a final, and a project that is done in teams of 1-3
persons. The default project is to extend a community information management
system such as DBLife, but students are free to propose
their own ideas. At the end of the semester there may be a short project
presentation and/or a report (open to discussion).
Midterm: March 8, in class at the usual time/room.
Final: TBD by the university.
Other important dates: Apr 3-5: no class, spring break; Apr 17-19:
I'm away at the ICDE conference, and Jeff Naughton will substitute.
Grade: midterm = 30%, final = 30%, project = 30%, report/discussion/participation = 10%.
Course Schedule
Papers are mostly drawn from the following list, from the Red Book.
Each paper will be covered in 1-2 lectures.
Papers will be covered mostly in the order listed below, though I may still
move them around a bit. When that happens, I will let you know in advance.
NEW: Project description
First two lectures: the big picture.
Query processing
Query optimization
Join algorithms
Concurrency control
Granularity of locks
Optimistic CC
Oracle CC
B-tree locking
Crash recovery
Aries recovery
2-phase commit
Buffer Management
Buffer Management
Parallel and distributed databases
Parallel databases
Distributed databases
Dangers of replication
Access Methods
R-trees
Bitmap indexes
Misc.
Bucky benchmark (O/R DBMS)
ADTs in DBMS
C-store, C-store paper
XQuery
Data models: from hierarchical to XML
Model management, schema evolution