CS 537 Notes, Section #24: Unix and DEMOS Disk Allocation
OSTEP: Chapter 40
Storage Management: For a given file, how does the OS find the data blocks
contained in that file?
The data structure that describes the contents of a file is generically
called a file descriptor.
We will see several other names (like inode and MFTE)
as we study file systems.
The file descriptor information has to be stored on
disk, so it will stay around even when the OS does not.
- In Unix, all the descriptors are stored in a fixed-size
array on disk. The descriptors also contain
protection and accounting information.
- A special area of disk is used
for this (the disk contains two parts: the fixed-size
descriptor array, and the remainder, which is allocated
for data and indirect blocks).
- The size of the descriptor array is determined
when the disk is initialized, and cannot be changed. In
Unix, the descriptor is called an i-node, and its
index in the array is called its i-number. Internally,
the OS uses the i-number to refer to the file.
- When a file is open, its descriptor is kept in main
memory. When the file is closed, the descriptor is
stored back to disk.
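Because the descriptors live in a fixed-size array, finding one on disk from its i-number is simple arithmetic. The sketch below illustrates the idea only; the block size, descriptor size, and starting block of the array are assumed values chosen for the example, not the parameters of any particular Unix.

```python
# Hypothetical layout parameters (assumptions for illustration only).
BLOCK_SIZE = 512           # bytes per disk block
INODE_SIZE = 64            # bytes per on-disk descriptor
INODE_ARRAY_START = 2      # first disk block of the descriptor array
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE


def inode_location(i_number, num_inodes):
    """Return (disk block, byte offset in that block) of descriptor i_number."""
    # The array size is fixed when the disk is initialized, so an
    # i-number past the end can never be valid.
    if not 0 <= i_number < num_inodes:
        raise ValueError("i-number outside the fixed-size descriptor array")
    block = INODE_ARRAY_START + i_number // INODES_PER_BLOCK
    offset = (i_number % INODES_PER_BLOCK) * INODE_SIZE
    return block, offset
```

With these assumed values, eight descriptors fit in each block, so i-number 9 is the second descriptor in the second block of the array.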
The Classic Unix Inode
- File descriptors: 13 block pointers. The first 10 point to
data blocks, the next three to indirect, doubly-indirect,
and triply-indirect blocks (256 pointers in each indirect
block). Maximum file length is fixed,
but large. Descriptor space is not allocated until needed.
- Examples: block 23, block 5, block 340.
- Free blocks: stored on a free list in no particular order.
- Go through examples of allocation and freeing.
- Advantages: simple, easy to implement, incremental expansion,
easy access to small files.
- Drawbacks:
- Indirect mechanism does not provide very efficient
access to large files: up to 3 descriptor ops for each
real operation. A cache is used, but this takes
up main memory space.
- Block-by-block organization of the free list means
that file data gets spread around the disk.
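To make the pointer structure concrete, here is a minimal sketch of how a logical block number maps onto the 13 pointers. It uses the numbers given above (10 direct pointers, 256 pointers per indirect block); the function name `locate` and its return format are made up for this illustration.

```python
NUM_DIRECT = 10         # direct pointers in the inode (from the notes)
PTRS_PER_BLOCK = 256    # pointers in each indirect block (from the notes)

# Largest file, in blocks: direct + single + double + triple indirect.
MAX_BLOCKS = (NUM_DIRECT + PTRS_PER_BLOCK
              + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3)


def locate(logical_block):
    """Return (level, indices) describing how to reach logical_block.

    level is 0 for a direct pointer and 1, 2, or 3 for single, double,
    or triple indirection; it is also the number of indirect blocks
    that must be read. indices lists the slot used at each step.
    """
    if logical_block < 0:
        raise ValueError("negative block number")
    n = logical_block
    if n < NUM_DIRECT:
        return 0, (n,)
    n -= NUM_DIRECT
    span = PTRS_PER_BLOCK            # blocks covered by the current level
    for level in (1, 2, 3):
        if n < span:
            indices = []
            for _ in range(level):   # peel off one base-256 digit per level
                indices.append(n % PTRS_PER_BLOCK)
                n //= PTRS_PER_BLOCK
            return level, tuple(reversed(indices))
        n -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("beyond the maximum file length")
```

For the three example blocks: block 5 is reached through a direct pointer, block 23 through slot 13 of the indirect block, and block 340 through slot 0 of the doubly-indirect block and then slot 74 of the block it points to.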
The Demos File System
Demos was an operating system written especially for high-performance
systems, originally the Cray-1.
Its design continues to influence systems today.
The Demos solution: allocates files contiguously, has more
compact file descriptors, uses more CPU time. (Refer to the contiguous
allocation picture in Section 26.)
- File descriptors: point to sequences of contiguous physical blocks, called block groups, rather
than to single blocks. Block groups were called
extents by IBM.
- A block group has three fields:
- Starting disk block: the starting address on disk of this block group,
- Starting logical block: the starting block number within the
file for the block group,
- Count: the number of blocks in the group.
- There are 10 block groups in the file descriptor; if files become large, then these become
pointers to groups of indirect blocks. The resulting
structure is like a B-tree.
- Free blocks: described with a bit map: an array
of bits, one per block. 1 means block free, 0 means
block allocated. For a 300 Mbyte drive there are about
300000 1 Kbyte blocks, so the bit map takes up about 40000 bytes
(300000 bits / 8 = 37500 bytes).
Keep only a small part of the bit map in memory at
once. In allocation, scan the bit map for adjacent free blocks.
- Advantages:
- It is easy to allocate block groups, since the
bit map automatically merges adjacent free blocks.
- File descriptors take up less space on disk, require
fewer accesses in random access to large files.
- Disadvantages:
- Slightly more complex than Unix scheme: trades
CPU time for disk access time (OK for the Cray-1).
- When the disk becomes nearly full, allocation becomes VERY expensive,
and does not get much in the way of adjacency.
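The two pieces of this scheme, the free-block bit map and the block groups in the descriptor, can be sketched together. This is a toy model under stated assumptions, not the Demos implementation: the names `BlockGroup`, `allocate_run`, and `to_disk_block` are invented here, the bit map is a plain Python list, and the search is a simple first-fit scan.

```python
from dataclasses import dataclass


@dataclass
class BlockGroup:
    disk_start: int      # starting disk block of the group
    logical_start: int   # starting logical block within the file
    count: int           # number of blocks in the group


def allocate_run(bitmap, want):
    """First-fit scan for `want` adjacent free blocks (1 = free, 0 = allocated).

    Marks the run allocated and returns its first block, or None if no
    run of that length exists.
    """
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 1:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == want:
                for j in range(run_start, run_start + want):
                    bitmap[j] = 0
                return run_start
        else:
            run_len = 0          # an allocated block breaks the run
    return None


def to_disk_block(groups, logical_block):
    """Translate a logical block number to a disk block via the block groups."""
    for g in groups:
        if g.logical_start <= logical_block < g.logical_start + g.count:
            return g.disk_start + (logical_block - g.logical_start)
    raise KeyError("logical block not allocated")


# A small disk with 10 blocks; blocks 0, 3, and 8 are already in use.
bitmap = [0, 1, 1, 0, 1, 1, 1, 1, 0, 1]
groups = [BlockGroup(allocate_run(bitmap, 3), 0, 3)]      # file blocks 0-2
groups.append(BlockGroup(allocate_run(bitmap, 2), 3, 2))  # file blocks 3-4
```

After these two allocations only blocks 7 and 9 remain free, so a further request for two adjacent blocks fails even though two blocks are available. This is the nearly-full-disk problem in miniature: the scan gets longer and finds little adjacency.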
Even if it is possible to allocate in groups, how do you
know when to do it? Use past history: if a file is already
big, it will probably get bigger.
What Else is Stored in an inode (File Descriptor)?
So far, we have described the primary task of an inode: locating the data blocks
of a file.
Some of the other basic information that is found in a file descriptor includes:
- Permissions:
- Indicates who can read, write or execute the file.
- Size:
- Size of the file in bytes (important because files are allocated on disk in
terms of blocks).
- Owner:
- The user ID and other information (like group ID on UNIX) about the creator of the file.
- Time stamps:
- The time that the file was created, last referenced, and last modified.
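On a running system, most of these fields can be read back through the `stat` system call. The sketch below uses Python's standard `os.stat`; the helper name `describe` and the dictionary keys are chosen for this example. Note that on Unix `st_ctime` reports the time of the last change to the inode itself rather than a creation time.

```python
import os
import stat
import tempfile


def describe(path):
    """Collect the descriptor fields discussed above for one file."""
    st = os.stat(path)
    return {
        "i_number": st.st_ino,
        "permissions": stat.filemode(st.st_mode),  # e.g. "-rw-------"
        "size_bytes": st.st_size,
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "last_referenced": st.st_atime,
        "last_modified": st.st_mtime,
        "inode_changed": st.st_ctime,  # last inode change on Unix
    }


# Usage: create a 5-byte scratch file, inspect it, then remove it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name
info = describe(path)
os.remove(path)
```

The size is reported in bytes (5 here) even though the file occupies at least one whole block on disk, which is why the descriptor has to record it.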
If you want to learn more about how the UNIX file system next evolved, you can
check out this paper we cover in the graduate operating systems class (CS736):
M. K. McKusick, W. N. Joy, S. J. Leffler, R. S. Fabry,
A Fast File System for UNIX,
ACM Trans. on Computer Systems 2, 3,
August 1984, pp. 181-197.
Copyright © 2012, 2018, 2020 Barton P. Miller
Non-University of Wisconsin students and teachers are welcome
to print these notes for their personal use.
Further reproduction requires permission of the author.