vnode/vfs allows multiple types of file systems to coexist within a system
In this note, FFS = Berkeley FFS and ufs = FFS within the vnode/vfs framework
Commonly supported attributes
sticky bit for a file: keep the program image in the swap area after the program exits, so that reloading the program later is fast
sticky bit for a directory: restricts deletion and renaming, so that only the owner of a file, the owner of the directory, or root can remove or rename an entry
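The sticky bit is one bit in the file mode. A minimal Python sketch of testing and setting it is shown here; it uses the standard `stat.S_ISVTX` constant, while the helper names and the scratch directory are assumptions made for illustration.

```python
import os
import stat
import tempfile

def has_sticky_bit(path):
    """Return True if the sticky bit (S_ISVTX) is set on path."""
    return bool(os.stat(path).st_mode & stat.S_ISVTX)

def set_sticky_bit(path):
    """Turn on the sticky bit while keeping the other mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_ISVTX)

# Hypothetical scratch directory, created only for this demonstration.
demo_dir = tempfile.mkdtemp()
sticky_before = has_sticky_bit(demo_dir)
set_sticky_bit(demo_dir)
sticky_after = has_sticky_bit(demo_dir)
os.rmdir(demo_dir)
```

Shared directories such as /tmp are typically configured this way, which is why users cannot delete each other's temporary files there.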
inode: unique for the file
Open file object: the status of an open - has offset, mode, etc.
fd: pointer to an open file object
Passing fd between processes
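The relationship between inode, open file object, and fd can be observed from user space. The sketch below is an illustration, not part of the original notes: `os.dup()` gives a second fd for the same open file object (shared offset), while a second `os.open()` creates a new open file object on the same inode. The scratch file is an assumption.

```python
import os
import tempfile

# Scratch file used only for this illustration.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"abcdefghij")
tmp.close()

fd1 = os.open(tmp.name, os.O_RDONLY)   # new open file object
fd2 = os.dup(fd1)                      # second fd, same open file object
fd3 = os.open(tmp.name, os.O_RDONLY)   # separate open file object, same inode

first = os.read(fd1, 4)    # advances the offset shared by fd1 and fd2
second = os.read(fd2, 4)   # continues where fd1 stopped
third = os.read(fd3, 4)    # has its own offset, starts at the beginning

same_inode = os.fstat(fd1).st_ino == os.fstat(fd3).st_ino

for fd in (fd1, fd2, fd3):
    os.close(fd)
os.unlink(tmp.name)
```

An fd handed to another process (inherited across fork, or sent over a Unix-domain socket) behaves like `fd2` here: both processes refer to the same open file object.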
read/write is atomic
Ordinary read/write copies data between a contiguous region of the process address space and a logically contiguous region of the file
readv/writev: copies data between multiple scattered buffers and a logically contiguous region of the file
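A short sketch of scatter/gather I/O, using Python's `os.writev` and `os.readv` wrappers around the system calls. The buffer contents and sizes are made up for illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()

# Gather: three separate buffers are written as one contiguous run in the file.
pieces = [b"header|", b"body|", b"trailer"]
written = os.writev(fd, pieces)

os.lseek(fd, 0, os.SEEK_SET)

# Scatter: one contiguous run in the file is split across two buffers.
buf_a = bytearray(7)
buf_b = bytearray(12)
nread = os.readv(fd, [buf_a, buf_b])

os.close(fd)
os.unlink(path)
```

The buffers are filled in order, so `buf_a` receives the first 7 bytes of the file and `buf_b` the rest.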
read/write atomicity does not guarantee consistency of a file across multiple read/write system calls -> locking required
Advisory & mandatory locking
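To illustrate what "advisory" means, here is a sketch using BSD-style `flock()` through Python's `fcntl` module. It assumes a local file system on Linux or BSD, where two independent opens of the same file contend for the lock; mandatory locking is not shown.

```python
import fcntl
import os
import tempfile

fd_holder, path = tempfile.mkstemp()
fd_other = os.open(path, os.O_RDWR)   # independent open of the same file

# The holder takes an exclusive advisory lock.
fcntl.flock(fd_holder, fcntl.LOCK_EX)

# A cooperating opener that asks for the lock is refused...
try:
    fcntl.flock(fd_other, fcntl.LOCK_EX | fcntl.LOCK_NB)
    second_lock_granted = True
except OSError:
    second_lock_granted = False

# ...but an opener that never asks can still write: the lock is only advisory.
bytes_written = os.write(fd_other, b"unlocked write")

fcntl.flock(fd_holder, fcntl.LOCK_UN)
os.close(fd_holder)
os.close(fd_other)
os.unlink(path)
```

With mandatory locking the kernel itself would enforce the lock on the write instead of relying on every process to ask first.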
The Unix file system is a collection of subtrees connected together into one big tree rooted at the root file system
Logical disk: an independent, randomly accessible, linear disk as seen by the kernel
May contain only one file system
May contain no file system and be used as a swap area
A physical disk can be divided into multiple logical disks
Multiple physical disks can be combined into a single logical disk
Both pipes and FIFOs are first-in, first-out data streams
FIFO: named pipe
Pipe: unnamed pipe
Reading from and writing to a pipe or FIFO work the same way as for a socket
BSD implements pipes and FIFOs using sockets, while SVR4 uses STREAMS
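The first-in, first-out behaviour can be seen with an unnamed pipe. This is a minimal sketch; the message contents and the 4-byte read size are arbitrary choices for illustration.

```python
import os

read_end, write_end = os.pipe()   # unnamed pipe: two fds, no file system name

os.write(write_end, b"first ")
os.write(write_end, b"second ")
os.write(write_end, b"third")
os.close(write_end)               # closing the write end lets the reader see EOF

chunks = []
while True:
    data = os.read(read_end, 4)   # small reads to show the byte-stream behaviour
    if not data:
        break
    chunks.append(data)
os.close(read_end)

stream = b"".join(chunks)
```

The reader sees one ordered byte stream; the boundaries of the three writes are not preserved. A FIFO behaves the same way, except that it is created with a name (for example with `os.mkfifo`) and opened by path.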
Need for support of multiple types of file systems within a single system
vnode/vfs is the de facto standard
Support several file system types
Continue to provide the single big, homogeneous tree structure
Support network file systems
Modular design
Analogy to virtual file system:
Implementation of device driver
Object-oriented implementation
Kernel provides base classes for
Each file system extends the base classes by
=> a vnode object per file & a vfs object per file system
vfs nodes are connected together as a linked list
The root file system is pointed to by the 'rootvfs' variable
Each vnode has a pointer to the vfs to which it belongs
Each vnode has a collection of methods which implement the virtual functions defined by the kernel
Each process has an fd-table whose entries lead, through the open file object, to the corresponding vnode
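The object-oriented structure can be sketched with ordinary classes. This is only an illustration in Python, not kernel code: apart from `rootvfs`, every class, field, and function name here (`Vfs`, `Vnode`, `ToyVfs`, `ToyVnode`, `vfs_types`) is hypothetical.

```python
class Vfs:
    """Base class: one object per mounted file system."""
    def __init__(self, fstype):
        self.fstype = fstype
        self.next = None           # link to the next vfs in the list

    def root(self):
        raise NotImplementedError  # "virtual function" supplied by each file system


class Vnode:
    """Base class: one object per active file."""
    def __init__(self, vfs):
        self.vfs = vfs             # back pointer to the owning vfs

    def read(self, offset, size):
        raise NotImplementedError


class ToyVfs(Vfs):
    """Hypothetical in-memory file system extending the base classes."""
    def __init__(self, fstype, contents):
        super().__init__(fstype)
        self._root = ToyVnode(self, contents)

    def root(self):
        return self._root


class ToyVnode(Vnode):
    def __init__(self, vfs, data):
        super().__init__(vfs)
        self.data = data

    def read(self, offset, size):
        return self.data[offset:offset + size]


def vfs_types(head):
    """Walk the linked list of vfs objects starting at head."""
    names = []
    while head is not None:
        names.append(head.fstype)
        head = head.next
    return names


rootvfs = ToyVfs("toyfs", b"root file system")
rootvfs.next = ToyVfs("toyfs2", b"another file system")
```

Code above the file system layer only calls `root()` and `read()` on the base-class interface, so it never needs to know which concrete file system it is talking to.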
Check whether the logical disk is formatted for the file system type
Create a vfs object
Link the vfs object with the existing vfs objects to form the linked list
Create a vnode for the "/" directory of the new file system
Connect the vnode to the new vfs object
Connect the vnode to the mount point using the "mounted-on" link
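The mount steps above can be sketched as follows. This is a simplified model, not the SVR4 code: the field names (`mounted_here`, `covered`), the dictionary standing in for a logical disk, and the choice to append at the tail of the list are all assumptions.

```python
class Vnode:
    def __init__(self, name, vfs=None):
        self.name = name
        self.vfs = vfs              # file system this vnode belongs to
        self.mounted_here = None    # vfs mounted on this vnode, if any


class Vfs:
    def __init__(self, fstype):
        self.fstype = fstype
        self.next = None            # next vfs in the global list
        self.root = None            # vnode of this file system's "/" directory
        self.covered = None         # mount-point vnode in the parent file system


def mount(rootvfs, mount_point, disk, fstype):
    """Follow the mount steps from the notes; returns the new vfs object."""
    # 1. Check that the logical disk is formatted for this file system type.
    if disk.get("format") != fstype:
        raise ValueError("disk is not formatted as " + fstype)
    # 2. Create a vfs object.
    vfs = Vfs(fstype)
    # 3. Link it at the end of the existing vfs list.
    tail = rootvfs
    while tail.next is not None:
        tail = tail.next
    tail.next = vfs
    # 4 and 5. Create a vnode for the new "/" and attach it to the new vfs.
    vfs.root = Vnode("/", vfs)
    # 6. Connect the new file system to the mount point ("mounted-on" link).
    vfs.covered = mount_point
    mount_point.mounted_here = vfs
    return vfs


rootvfs = Vfs("s5fs")
rootvfs.root = Vnode("/", rootvfs)
usr = Vnode("usr", rootvfs)         # existing directory used as the mount point
new_vfs = mount(rootvfs, usr, {"format": "ufs"}, "ufs")
```

After the call, a pathname lookup that reaches `usr` can notice `mounted_here` and continue from the root vnode of the new file system.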
lookuppn() - namei of the traditional Unix
Start from the root vnode ("rootdir" variable) or the vnode for the current directory ("u_cdir" variable in the u-area)
Call vnode->lookup() for each pathname component
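A rough sketch of the component-by-component walk follows. It models directories as dictionaries and leaves out "..", mount points, and symbolic links; the `Vnode` class and its `add` helper are hypothetical, and `cdir` stands in for `u_cdir`.

```python
class Vnode:
    """Hypothetical directory vnode backed by a dict of children."""
    def __init__(self, name):
        self.name = name
        self.children = {}

    def add(self, child):
        self.children[child.name] = child
        return child

    def lookup(self, component):
        """Per-file-system lookup of one name in this directory."""
        if component not in self.children:
            raise FileNotFoundError(component)
        return self.children[component]


def lookuppn(path, rootdir, cdir):
    """Resolve path one component at a time, as namei does."""
    vnode = rootdir if path.startswith("/") else cdir
    for component in path.split("/"):
        if component in ("", "."):
            continue               # skip empty pieces and "."
        vnode = vnode.lookup(component)
    return vnode


rootdir = Vnode("/")
usr = rootdir.add(Vnode("usr"))
bin_dir = usr.add(Vnode("bin"))
ls = bin_dir.add(Vnode("ls"))
```

Both `lookuppn("/usr/bin/ls", rootdir, usr)` and `lookuppn("bin/ls", rootdir, usr)` end at the same vnode; only the starting point differs.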
Global resource shared by vfs objects
Cache vnode pointers for recently accessed files & directories
Hash buckets keyed on the parent directory & file name
Cache hits eliminate disk I/O for directory lookup
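The idea of the name cache can be sketched in a few lines. In this illustration a Python dict plays the role of the hash buckets, and `read_directory` is a hypothetical stand-in for scanning the directory on disk; eviction of old entries is omitted.

```python
class NameCache:
    """Hypothetical directory name lookup cache keyed on (parent, name)."""
    def __init__(self):
        self.entries = {}          # dict stands in for the hash buckets
        self.hits = 0
        self.misses = 0

    def lookup(self, parent_id, name, read_directory):
        key = (parent_id, name)
        if key in self.entries:
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        vnode = read_directory(parent_id, name)   # stands in for the disk I/O
        self.entries[key] = vnode
        return vnode


disk_reads = []

def read_directory(parent_id, name):
    """Pretend to scan the on-disk directory and return a vnode handle."""
    disk_reads.append((parent_id, name))
    return "vnode:%s/%s" % (parent_id, name)

cache = NameCache()
first = cache.lookup(2, "passwd", read_directory)
second = cache.lookup(2, "passwd", read_directory)
```

The second lookup is answered from the cache, so `read_directory` runs only once.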
System V file system (s5fs) and FFS are the two representative local file systems in Unix
Now SVR4 includes FFS
vnode/vfs made it possible to have both s5fs and FFS within a system
ufs = FFS within the framework of vnode/vfs
Earlier versions of Unix file systems used a "buffer cache", but modern systems integrate file I/O with virtual memory management and use the buffer cache only for metadata
Disk layout: boot area + superblock + inode list + data blocks
Directory: a special file containing a list of files
Each entry is 16 bytes long: 2 bytes for the inode number & 14 bytes for the file name
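The 16-byte entry format can be modelled with Python's `struct` module. This is a sketch: the little-endian byte order and the helper names are assumptions, since the notes only give the field widths.

```python
import struct

# 2-byte inode number followed by a 14-byte name; byte order is an assumption.
DIRENT = struct.Struct("<H14s")

def pack_entry(inode_number, name):
    """Build one 16-byte directory entry; names are NUL-padded to 14 bytes."""
    raw = name.encode("ascii")
    if len(raw) > 14:
        raise ValueError("s5fs names are limited to 14 bytes")
    return DIRENT.pack(inode_number, raw)

def unpack_entry(entry):
    """Return (inode_number, name) from a 16-byte directory entry."""
    inode_number, raw = DIRENT.unpack(entry)
    return inode_number, raw.rstrip(b"\0").decode("ascii")

entry = pack_entry(42, "passwd")
```

The field widths explain two s5fs limits: file names of at most 14 characters, and at most 65535 inodes per file system because the inode number must fit in 2 bytes.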
Metadata for a file
On-disk inode & in-core inode
Fields of on-disk inode
Kernel reads the superblock when it mounts the file system and keeps it in memory
Fields
Fields
To get in-core inode and in turn vnode pointer for the given inode number
read(fd, buf, size)
When the reference count of a vnode reaches zero, the corresponding inode is released -> the inode is added to the free list:
When the kernel needs an inode and cannot find it in core, it takes the inode at the head of the free list
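The free-list behaviour can be sketched as a small table. This is an illustrative model under stated assumptions: released inodes are appended at the tail (so the head is the least recently released), an inode still sitting on the free list can be reclaimed without disk I/O, and the names `InodeTable`, `iget`, and `iput` are used loosely rather than as the kernel's exact interfaces.

```python
from collections import OrderedDict

class InodeTable:
    """Hypothetical in-core inode table with a free list kept in release order."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.active = {}                 # inode number -> reference count
        self.free_list = OrderedDict()   # unreferenced but still cached inodes

    def iget(self, ino):
        if ino in self.active:           # already in core and in use
            self.active[ino] += 1
            return "hit"
        if ino in self.free_list:        # in core but unreferenced: reclaim it
            del self.free_list[ino]
            self.active[ino] = 1
            return "reclaimed"
        if len(self.active) + len(self.free_list) >= self.capacity:
            if not self.free_list:
                raise RuntimeError("inode table full")
            self.free_list.popitem(last=False)   # recycle the head of the free list
        self.active[ino] = 1             # a real kernel would read the disk inode here
        return "loaded"

    def iput(self, ino):
        self.active[ino] -= 1
        if self.active[ino] == 0:        # last reference gone: append to the free list
            del self.active[ino]
            self.free_list[ino] = True


table = InodeTable(capacity=2)
results = [table.iget(10), table.iget(11)]
table.iput(10)
table.iput(11)
results.append(table.iget(10))   # still cached on the free list
results.append(table.iget(12))   # table full: recycles 11 from the head
```

Keeping released inodes on the list instead of discarding them at once means a file that is reopened soon after being closed costs no disk read.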
Reliability: a single copy of the superblock is a single point of failure
Performance
Functionality
Major differences from s5fs are the disk layout, on-disk structures, and free block allocation methods
A disk partition comprises a set of consecutive cylinders
Cylinder group: a small set of consecutive cylinders
Store related information in the same cylinder group -> reduce head movement
Ordinary superblock is partitioned into
Block size increased to 8K
Fragment
File = multiple blocks + consecutive fragments (within the last block)
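A worked example of the "blocks + fragments" rule is sketched below. The 8K block size comes from the notes; the 1K fragment size (eight fragments per block) is an assumption, and details such as indirect blocks are ignored.

```python
BLOCK_SIZE = 8192      # from the notes
FRAGMENT_SIZE = 1024   # assumption: eight fragments per block

def ffs_layout(file_size):
    """Return (full_blocks, tail_fragments) needed to hold file_size bytes."""
    full_blocks, tail_bytes = divmod(file_size, BLOCK_SIZE)
    tail_fragments = -(-tail_bytes // FRAGMENT_SIZE)   # ceiling division
    if tail_fragments == BLOCK_SIZE // FRAGMENT_SIZE:
        # A tail that needs every fragment is simply another full block.
        full_blocks, tail_fragments = full_blocks + 1, 0
    return full_blocks, tail_fragments

def wasted_bytes(file_size):
    """Space allocated beyond the file's actual size."""
    full_blocks, tail_fragments = ffs_layout(file_size)
    return full_blocks * BLOCK_SIZE + tail_fragments * FRAGMENT_SIZE - file_size
```

Under these assumptions a 20000-byte file takes two full blocks plus four fragments and wastes 480 bytes, whereas rounding up to three whole 8K blocks would waste 4576 bytes.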
9.7.2 Allocation Policies
9.8 FFS Functionality Enhancement
9.9 Analysis
9.10 Temporary File Systems
9.10.1 The Memory File System
9.10.2 The tmpfs File System
9.11 Special-Purpose File Systems
9.11.1 specfs File System
9.11.2 /proc File System
9.11.3 Processor File System
9.11.4 Translucent File System
9.12 Old Buffer Cache
9.12.1 Basic Operation
9.12.2 Buffer Headers
9.12.3 Advantages
9.12.4 Disadvantages
9.12.5 Ensuring File System Consistency