**THE FAST FILE SYSTEM**
========================

# 0. Problem with the simple Unix file system

It is not disk-aware; it treats the disk like random-access memory. Hence:
- bad performance
- utilizes only a very small fraction of disk bandwidth (~2%)

Examples of the problem:
1) expensive positioning cost:
   + data blocks of a file are far from the file's inode --> expensive seeks
2) fragmentation because of bad free-space management (a free list):

       A1 A2 B1 B2 C1 C2 D1 D2    in the beginning
       A1 A2 f  f  C1 C2 f  f     B and D are deleted
       A1 A2 E1 E2 C1 C2 E3 E4    E is allocated (4 blocks)

   ==> accessing E requires additional seeks
3) small block size (512 bytes)

Question: how do we make the file system *disk aware*?

# 1. FFS: enforcing disk awareness

FFS seeks to improve both performance and reliability:
- performance:
  + place related stuff close together to avoid seek time
  + disk awareness: parameterization
- reliability: replication of the super block

Note: FFS retains the simple abstraction (i.e. the interface), changing only the underlying implementation.

# 1.1. On-disk structure: cylinder group

- the disk is divided into *cylinder groups*
- each group has:
  + a copy of the super block
    ==> the super block is replicated; if one copy gets corrupted, use another
  + bitmaps for both inodes and data blocks (instead of a free list)
    ==> better than a free list: can find a large chunk of free space to allocate to a file, hence may avoid some fragmentation
  + inodes and data blocks

# 1.2. Policies: place related stuff together

- allocating a directory: find a cylinder group with
  + a low number of allocated directories ==> balances directories across groups
  + a high number of free inodes ==> subsequent files in that directory may be allocated in the same group
- allocating a file:
  + allocate data blocks in the same group as the file's inode ==> avoid long seeks
  + place files in the same directory in the cylinder group of that directory

*Problem*: allocating a large file may entirely fill its first group
==> undesirable, because subsequent "related" files cannot be placed in the same group, which hurts later file-access locality

*Solution*:
+ for a large file, divide it into chunks
+ place the first chunk in the first group
+ place each subsequent chunk in a different group, and so on
==> this might hurt performance, because of the additional seek to read the next chunk
==> hence, the *chunk size* is important here:
  > a large chunk size increases the portion of time spent transferring data (compared to seeking)
  (in the paper: chunk size is 1 MB)

# 1.3. Other things: large block size and parameterization

- 4-KB block size ==> efficient, more data is transferred per disk transaction
  But this causes *internal fragmentation* when the number of small files is large.
  ==> Solution: use *sub-blocks*, each 512 bytes.
  + when a small file, say 512 bytes, is created, it uses up 1 sub-block
    ==> it does not waste an entire 4-KB block
  + as the file grows, FFS keeps allocating 512-byte sub-blocks until it reaches a full 4 KB of data. At that point:
    > FFS finds a whole free 4-KB block,
    > copies the sub-blocks into it,
    > and frees the sub-blocks for future use
  But now comes the problem of *copy overhead*.
  Hence, a hack in the libc library: *the library buffers writes and then issues them in 4-KB chunks to the file system*, thus avoiding the sub-block copying overhead (see the sketch at the end of this section).
- Block layout on disk: parameterization
  Blocks are placed on disk to obtain optimal rotational access, avoiding extra rotations when reading subsequent blocks on a track.
  ==> this still requires at least 2 rotations to read a whole track
  ==> newer disks have a *track buffer* (an internal cache), which removes this complex placement logic from file-system code
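To make the libc write-buffering idea concrete, here is a minimal sketch (not the actual libc code) of a user-level wrapper that accumulates small writes and hands the file system data in full 4-KB units; the names `struct wbuf`, `buffered_write`, and `buffered_flush` are invented for illustration.

```c
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096   /* FFS block size */

/* Per-file write buffer: small writes accumulate here so the file
 * system sees mostly full 4-KB writes and rarely needs to allocate
 * 512-byte sub-blocks and later copy them into a full block. */
struct wbuf {
    int    fd;
    char   data[BLOCK_SIZE];
    size_t used;              /* bytes currently buffered */
};

/* Flush whatever is buffered (used when the buffer fills, or at the end). */
static int buffered_flush(struct wbuf *b) {
    if (b->used > 0) {
        if (write(b->fd, b->data, b->used) != (ssize_t)b->used)
            return -1;
        b->used = 0;
    }
    return 0;
}

/* Buffer the caller's bytes; issue I/O only in BLOCK_SIZE units. */
static int buffered_write(struct wbuf *b, const char *p, size_t n) {
    while (n > 0) {
        size_t room = BLOCK_SIZE - b->used;
        size_t take = n < room ? n : room;
        memcpy(b->data + b->used, p, take);
        b->used += take;
        p += take;
        n -= take;
        if (b->used == BLOCK_SIZE && buffered_flush(b) < 0)
            return -1;
    }
    return 0;
}

int main(void) {
    struct wbuf b = { .fd = STDOUT_FILENO, .used = 0 };
    for (int i = 0; i < 1000; i++)
        buffered_write(&b, "tiny record\n", 12);   /* many small writes */
    buffered_flush(&b);                            /* final partial block */
    return 0;
}
```

With a wrapper like this, only the final partial block of a file can end up in sub-blocks; everything before it reaches the file system as full 4-KB writes.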
# 1.4. File system functionality

FFS also added:
- long file names
- symbolic links: allow cross-file-system references
- atomic rename
- locking (?? which I don't understand much)

# 3.3 Layout policies

- two ways to improve performance:
  1. increase the locality of reference to minimize seek latency
  2. improve the layout of data to make larger transfers possible
- global layout policies:
  + try to improve performance by clustering related information
  + spread unrelated data among different cylinder groups
- inode allocation:
  + place all inodes of files in a directory in the same cylinder group
    ~ since files in the same directory tend to be accessed together (e.g. ls)
  + a new directory is placed in a cylinder group that has a greater-than-average number of free inodes and few existing directories (see the sketch after this list)
    ~ ensures files are distributed throughout the disk
- data block allocation:
  + place a file's data blocks in the same cylinder group as its inode (as in 1.2 above)
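A minimal sketch of the directory-placement heuristic described above. The names `struct cg_summary` and `pick_group_for_new_dir` are invented for illustration; real FFS consults its per-group summary information, this toy version only expresses the policy (above-average free inodes, then fewest directories).

```c
#include <stdio.h>

/* Toy summary of one cylinder group, just enough to express the policy. */
struct cg_summary {
    int free_inodes;   /* inodes still unallocated in this group */
    int ndirs;         /* directories already placed in this group */
};

/* Pick a cylinder group for a new directory: only consider groups with a
 * greater-than-average number of free inodes, and among those take the
 * one with the fewest directories, so directories (and the files later
 * created in them) are spread across the disk. */
static int pick_group_for_new_dir(const struct cg_summary *cg, int ngroups) {
    long total_free = 0;
    for (int i = 0; i < ngroups; i++)
        total_free += cg[i].free_inodes;
    long avg_free = total_free / ngroups;

    int best = -1;
    for (int i = 0; i < ngroups; i++) {
        if (cg[i].free_inodes < avg_free)
            continue;                      /* not enough free inodes */
        if (best < 0 || cg[i].ndirs < cg[best].ndirs)
            best = i;                      /* fewest directories so far */
    }
    return best;   /* -1 only if ngroups == 0 */
}

int main(void) {
    struct cg_summary groups[] = {
        { .free_inodes = 100, .ndirs = 8 },
        { .free_inodes = 300, .ndirs = 2 },
        { .free_inodes = 250, .ndirs = 5 },
    };
    /* Prints group 1: above-average free inodes and the fewest directories. */
    printf("new directory goes in group %d\n",
           pick_group_for_new_dir(groups, 3));
    return 0;
}
```

File inodes then go into the group chosen for their directory, which is what gives "files in the same directory are in the same cylinder group" its locality benefit.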