
Frangipani: A Scalable Distributed File System

Chandramohan Thekkath, Timothy Mann, and Edward Lee. Frangipani: A Scalable Distributed File System. Proc. of the 16th ACM Symposium on Operating Systems Principles, October 1997, pages 224-237.

Review due Thursday 10/23


Summary: This paper introduced Frangipani, a scalable distributed file system built on top of Petal, a distributed virtual disk service.

Problem: A distributed file system is desirable for many reasons: data replication makes data more durable and resilient to failures; file sharing becomes easy when every user sees a single view of the whole file system; and storage capacity can be extended simply by adding new nodes. The main challenge is how to provide these properties while keeping the file system coherent in the face of operations from different nodes.

Contributions:
1. A two-layer file system. The bottom layer is Petal, a distributed virtual disk system, and Frangipani is the top layer. Frangipani runs as an application on top of Petal, which it sees simply as one huge virtual disk. Frangipani does not need to worry about where data is physically stored or how to communicate with the storage nodes.

2. A disk layout based on a virtual address space. Just as programs deal with virtual memory, Frangipani deals with a virtual disk address space. The virtual address space provides flexibility and lets the file system focus on logical structure rather than physical placement; Petal is what makes this possible.
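The payoff of a virtual address space is that locating on-disk structures becomes simple arithmetic. A minimal sketch, where the region offset and inode size are made-up illustrative constants (the paper's actual layout divides Petal's sparse 2^64-byte space into regions for logs, allocation bitmaps, inodes, and data blocks):

```python
# Hypothetical constants for illustration; not the paper's real layout values.
INODE_REGION_START = 1 << 40   # assumed start of the inode region
INODE_SIZE = 512               # assumed on-disk inode size in bytes

def inode_address(inode_number: int) -> int:
    """Compute the virtual-disk address of an inode by arithmetic alone --
    no table of physical locations is needed, since Petal handles the
    mapping from virtual addresses to physical storage."""
    return INODE_REGION_START + inode_number * INODE_SIZE

print(hex(inode_address(0)))   # 0x10000000000
print(hex(inode_address(2)))   # 0x10000000400
```

Because Petal's address space is sparse, regions can be laid out far apart and grow without ever colliding, which is exactly the flexibility the contribution describes.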

3. A log-based recovery mechanism. Frangipani writes a record of each metadata update to its log (kept in Petal) before applying the update to the disk. After a failure, a recovery routine reads the log and redoes any incomplete updates. This mechanism makes Frangipani more reliable.
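The write-ahead discipline can be sketched in a few lines. This is a toy in-process model, not Frangipani's actual log format; the `log` and `disk` dictionaries merely stand in for the per-server log region and metadata blocks in Petal:

```python
class WriteAheadLog:
    """Toy write-ahead log: record the intended update before applying it,
    so recovery can replay updates that never reached the disk."""

    def __init__(self):
        self.log = []    # stand-in for the per-server log region in Petal
        self.disk = {}   # stand-in for metadata blocks on the virtual disk

    def update(self, block, value):
        # 1. Append a log record first (tagged with a sequence number).
        self.log.append((len(self.log), block, value))
        # 2. Only then apply the change to the "disk".
        self.disk[block] = value

    def recover(self, disk):
        # Replay logged updates in order. Replaying is idempotent, so
        # redoing an already-applied record is harmless.
        for _, block, value in self.log:
            disk[block] = value
        return disk

wal = WriteAheadLog()
wal.update("inode-7", "size=4096")
crashed_disk = {}                     # simulate a crash that lost the write
recovered = wal.recover(crashed_disk)
print(recovered["inode-7"])           # size=4096
```

The key ordering constraint is that step 1 completes before step 2 begins; that is what guarantees the log always describes a superset of what reached the disk.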

4. Strong coherence. Frangipani ensures that each read returns the latest data. This is achieved with a lock service and a multiple-reader/single-writer lock scheme.
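The coherence rule behind this is that a server may cache a block only while holding its lock, and dirty data must be flushed to shared storage before the lock can pass to another server. A simplified single-process sketch of that flush-on-release rule (the real lock service is a separate distributed component with leases; the class and method names here are made up):

```python
class CoherentCache:
    """Toy model of flush-on-release coherence: a server caches a block
    only while holding its lock, and flushes dirty data back to shared
    storage before giving the lock up."""

    def __init__(self, shared_disk):
        self.disk = shared_disk   # stand-in for Petal, shared by all servers
        self.cache = {}
        self.dirty = set()

    def write(self, block, value):
        # Caller is assumed to hold the write lock for `block`.
        self.cache[block] = value
        self.dirty.add(block)

    def release_lock(self, block):
        # Flushing before release is what lets the next lock holder
        # (on any server) read the latest data from shared storage.
        if block in self.dirty:
            self.disk[block] = self.cache[block]
            self.dirty.discard(block)
        self.cache.pop(block, None)

disk = {}
server_a = CoherentCache(disk)
server_b = CoherentCache(disk)
server_a.write("file-1", "hello")
server_a.release_lock("file-1")   # flush; lock may now move to server_b
print(disk["file-1"])             # hello
```

With multiple-reader/single-writer semantics, many servers can cache a block read-only at once, but a writer must first force every cached copy to be dropped, which is what makes every read see the latest write.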

Things that confused me: It appears that as long as a file is cached, a corresponding lock on the file (either a read lock or a write lock) must be held. A running system accesses many files, which means each node holds many locks. Does that impose significant overhead?

Things I learned: The locking mechanism and the coherence protocol.
