CS-838 (Section 3): Advanced Storage Systems


Basic Information

Who: Professor Remzi Arpaci-Dusseau
When: 11:00-12:15 W and F (but keep M open!)
Where: 4310 Computer Sciences (moved from 1263 Computer Sciences; see Timetable entry)

Notes

3/27: Wiki

1/20: Slight change in the reading list -- please read the AutoRAID paper for Wednesday.

Course Overview

Storage is at the heart of modern computing systems. In this class, you will read about the latest and greatest in storage systems, studying novel ideas from academia and learning about the best "real-world" systems from companies such as Google, Network Appliance, IBM, HP, and EMC. You will also perform a cutting-edge mini-research project, where you design, build, and evaluate a system.

Topics

Underlying technology
Local storage systems
Distributed storage systems
Mobile storage systems
Reliability and fault tolerance
Performance, scalability
Power management
Management and virtualization
Caching, replication, consistency
Storage networking
Security

Course Theme: Cross-Fertilization

One major theme we will be exploring this semester is cross-fertilization of ideas from other domains into storage. For example, can more formal techniques from the programming languages community be used to build more robust, better storage systems? Can techniques from databases be applied to build more functional storage? Can machine learning be useful in building more automated and hence more manageable systems?

What You Will Do

In this course, you will be responsible primarily for two things: (1) reading the papers before class (to be ready for discussion), and (2) a mini-research project with one partner. Reading and understanding the papers is a crucial part of your educational process, so please take this very seriously. The project, however, is the real focal point of the course, where you are to get your hands dirty and do something interesting in the storage research space. Some ideas will be suggested shortly.

The reading list is found below. It is organized into two halves. The first deals with some of the basics of storage systems: how disks and RAIDs work, what the interface to storage is like, issues in on-disk consistency, scheduling, manageability, failure, and a few other low-level topics. The second half is about techniques, organized around ideas developed largely (although not entirely) outside the storage community: programming language ideas, database techniques, AI, and so forth.

Reading List

Disk Technology   W 1/18 Class begins   F 1/20 SCSI/ATA Wilkes
Storage Arrays   W 1/25 AutoRAID   F 1/27 Row-Diagonal
Interfaces   W 2/01 NASD   F 2/03 SSD
On-disk Consistency   W 2/08 Soft updates   F 2/10 Journaling
Scheduling   W 2/15 Rotational Sched   F 2/17 Capacity vs. Bandwidth
Manageability   W 2/22 Network Appliance   F 2/24 Petal and Frangipani
Failure   W 3/01 GoogleFS   F 3/03 IRON
Student Week   W 3/08 GoogleFS and MapReduce   F 3/10 Talk About Projects
Spring Break   W 3/15 No class   F 3/17 No class
PL Techniques   W 3/22 Singularity   F 3/24 CMC
Visiting Stanford   W 3/29 Malicious and FiSC   F 3/31 Deviant
Craziness   W 4/05 Wiki   F 4/07 Wiki
Reboot and Undo   W 4/12 Wiki   F 4/14 Wiki
TBA   W 4/19 Wiki   F 4/21 Wiki
TBA   W 4/26 Wiki   F 4/28 Wiki
Projects   W 5/03 Project presentations   F 5/05 Project presentations

See Wiki for further readings

Random Details

This course is currently being offered as CS-838 (Section 3) but will soon be turned into a "real" 700-level course, offered once every other year or so.

This course will count for core credit.

Prerequisites: CS 736 or permission from instructor.