The LOCUS Distributed Operating System
Bruce Walker, Gerald Popek, Robert English, Charles Kline, and Greg Thiel.
University of California at Los Angeles
One-line Summary
LOCUS is a distributed operating system that provides a distributed file system and distributed process management with network transparency.
Overview/Main Points
- Distributed file system
- tree structured file system
- File representation
- Logical file group
- physical storage sites
- < logical filegroup number, inode number >
- Mount mechanism to construct a uniform naming tree
- State information is replicated at all sites
- Roles in FS operations
- Using Site (US): issues requests to open a file
- Storage Site (SS): supplies file replicas to the US; broadcasts data updates
- Current Synchronization Site (CSS): enforces a global access sync policy (lock mgmt) for the file's filegroup and selects SS for US; only one CSS for any given filegroup
- File open
- Get the ID of the file: < logical filegroup #, inode # >
- US ➝ CSS
- examine the local mount table for CSS
- send the logical filegroup # (along with its version vector) to the CSS
- receive < lfg#, incore inode# > from CSS
- CSS ➝ SSs: sends the latest version vector so that each SS can check its own copy
- SS ➝ CSS: an SS accepts if it holds the latest copy; otherwise, it refuses.
- CSS ➝ US: send SS location to US
- File read
- Start with the “open file table” and the incore inode
- US ➝ SS: send the request with info including < lfg#, inode# >, logical page number, and a guess about where the incore inode is stored at SS.
- SS ➝ US
- incore inode found by the guess
- physical disk block number translated from the logical page#
- allocate buffers for the page if necessary
- prepare messages to the US with the buffer content
- File close
- US ➝ SS
- SS ➝ CSS
- CSS ➝ SS
- SS ➝ US
- Pathname searching: locate /a/b/c
- Similar to open-read procedures
- The only difference is an internal unsynchronized read without global locking
- Special care is needed when a search crosses filegroup boundaries
- File write
- US ➝ SS: sends modified logical pages to SS
- File commit
- The SS guarantees that the US's changes are applied atomically
- mechanism for atomic commit
- logging
- shadow pages or intentions lists
- Concurrency control
- Concurrent writes are not allowed
- Concurrent read and write are allowed by directing reads to the same SS as the write; every write commit informs the CSS, which in turn notifies the readers of the write.
- Data propagation is done by “pulling”, as an internal read from an SS that has the latest version
- File creation
- send placeholder instead of an inode # for a remote creation
- allocate the i-node # from a pool which is local to that physical container of the filegroup
- create a user-defined number of copies at the local site and at SSs of the parent directory
- File deletion
- recycling the inode pool: the other SSs must reach consensus before an inode is recycled
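The file-open exchange listed above (US ➝ CSS, CSS ➝ SSs, SS ➝ CSS, CSS ➝ US) can be sketched as a small in-process simulation. This is only an illustration under stated assumptions, not the LOCUS implementation: the class names `UsingSite`, `SyncSite`, and `StorageSite`, their methods, and the dictionary encoding of version vectors are all hypothetical, and plain method calls stand in for network messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

FileId = Tuple[int, int]          # <logical filegroup number, inode number>
VersionVector = Dict[str, int]    # hypothetical encoding: site name -> update count


@dataclass
class StorageSite:
    name: str
    # Version vector of each locally stored file copy.
    copies: Dict[FileId, VersionVector] = field(default_factory=dict)

    def accepts(self, file_id: FileId, latest: VersionVector) -> bool:
        """SS -> CSS: accept only if the local copy matches the latest version."""
        return self.copies.get(file_id) == latest


@dataclass
class SyncSite:
    """The single CSS of one filegroup (hypothetical structure)."""
    storage_sites: List[StorageSite]
    latest: Dict[FileId, VersionVector] = field(default_factory=dict)

    def open(self, file_id: FileId) -> str:
        want = self.latest[file_id]
        for ss in self.storage_sites:        # CSS -> SSs: poll with latest vector
            if ss.accepts(file_id, want):
                return ss.name               # CSS -> US: location of the chosen SS
        raise LookupError(f"no storage site holds the latest copy of {file_id}")


@dataclass
class UsingSite:
    mount_table: Dict[int, SyncSite]         # filegroup number -> its CSS

    def open(self, file_id: FileId) -> str:
        css = self.mount_table[file_id[0]]   # look up the CSS in the local mount table
        return css.open(file_id)             # US -> CSS: open request


# Example: site-A holds a stale copy, site-B holds the latest one.
fid = (7, 42)
stale = StorageSite("site-A", {fid: {"A": 1}})
fresh = StorageSite("site-B", {fid: {"A": 2}})
css = SyncSite([stale, fresh], {fid: {"A": 2}})
us = UsingSite({7: css})
chosen = us.open(fid)
```

In this example `chosen` is `"site-B"`, because the stale site refuses the request. The sketch deliberately leaves out locking, the incore inode handle returned to the US, and failure handling.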
- Distributed process management in a transparent manner
- remote process creation
- fork
- The parent and child process share open file descriptors.
- The child process has a copy of its parent's address space, both code and data.
- A copy of other process state information.
- Inter-process communication through a token mechanism
- Failure (partition) handling
- Network partition: updates to the same object are allowed independently within each partition
- Inconsistency detection
- Version vector algorithm, similar to Dynamo
- Reconciliation
- For directory: check for name conflicts, resolve on an inode by inode basis, and further merge for links
- For mailboxes: merge based on the version vector algorithm
- For other file types: report the conflict to users
- Dynamic reconfiguration
- Principle: transparency
- Partition protocol
- Merging protocol
- Cleanup protocol
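The version vector comparison mentioned under inconsistency detection can be illustrated with a minimal sketch. It assumes a version vector is a mapping from site name to the number of updates that site has made to the object; the function names `compare` and `merge` and the returned labels are hypothetical and are not taken from the paper.

```python
from typing import Dict

VersionVector = Dict[str, int]  # assumed encoding: site name -> update count


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify two copies as equal, one dominating the other, or in conflict."""
    sites = set(a) | set(b)
    # A missing entry is treated as zero updates from that site.
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in sites)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in sites)
    if a_ahead and b_ahead:
        return "conflict"        # updated independently in different partitions
    if a_ahead:
        return "a_dominates"     # b is simply stale and can pull from a
    if b_ahead:
        return "b_dominates"
    return "equal"


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum, used to label a copy after reconciliation."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}


stale_case = compare({"A": 2, "B": 1}, {"A": 1, "B": 1})
conflict_case = compare({"A": 2, "B": 0}, {"A": 1, "B": 1})
merged = merge({"A": 2, "B": 0}, {"A": 1, "B": 1})
```

Here `stale_case` is `"a_dominates"`, `conflict_case` is `"conflict"`, and `merged` is `{"A": 2, "B": 1}`. Only the conflict case needs type-specific reconciliation (directory merge, mailbox merge, or a report to the user); a dominated copy is just out of date.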
Relevance
Flaws
Compared with the performance and reliability that clusters offer, there is little need for a distributed OS, in particular for remote process management.