by Greg Tracy

Assume you have a RAID-5 array. Which file system would you choose to
use with it, FFS or LFS? Why?

I would choose LFS, assuming that I had control over the configuration of
the RAID-5 LUN. For LFS to take full advantage of the parallel nature of
the RAID unit, the segment size must be matched to the stripe size of the
RAID-5. That is, the number of blocks in a segment must equal (or be a
multiple of) the number of data blocks in a given parity stripe. This
configuration prevents LFS from suffering the write penalty familiar from
RAID-5 small writes. The penalty occurs when a write operation does not
span a full parity stripe: the array must first read the old data and the
old parity, so that it can compute the new parity, before the new data can
be written to disk.
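
The write penalty can be made concrete with a small I/O-counting sketch.
It is a simplified model, not a description of any particular controller:
it assumes one block per disk per stripe, and it assumes every partial-stripe
write is handled by read-modify-write (real arrays may choose a
reconstruct-write instead when that is cheaper). The function name
raid5_write_ios and its parameters are hypothetical.

```python
def raid5_write_ios(start_block, num_blocks, data_disks):
    """Return (reads, writes) for a write under a simplified RAID-5 model.

    Assumptions (illustrative only):
      - each stripe holds `data_disks` data blocks plus one parity block
      - a full-stripe write needs no reads: parity is computed from new data
      - a partial-stripe write reads the old data blocks and the old parity,
        then writes the new data blocks and the new parity
    """
    reads = 0
    writes = 0
    block = start_block
    end = start_block + num_blocks
    while block < end:
        # Find the end of the stripe that contains `block`.
        stripe_end = (block // data_disks + 1) * data_disks
        chunk_end = min(end, stripe_end)
        touched = chunk_end - block
        if touched == data_disks:
            # Full stripe: write all data blocks plus parity, no reads.
            writes += data_disks + 1
        else:
            # Partial stripe: read-modify-write.
            reads += touched + 1
            writes += touched + 1
        block = chunk_end
    return reads, writes
```

With four data disks, a stripe-aligned write of four blocks costs no reads,
while a single-block write costs two reads and two writes. A four-block
write that starts in the middle of a stripe is split into two partial-stripe
updates, which is why segment size alone is not enough: segments also need
to start on a stripe boundary.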

Since FFS does not have a minimum block size for writes, it is much more
difficult to optimize the configuration of the RAID-5 LUN. Although FFS
is optimized for larger write operations, there isn't anything that
prevents the system from doing small I/O. Inevitably, FFS will pay the
price of the RAID-5 write penalty on all of these small I/Os.

We still have the classic indirection problem on reads when using LFS.
When the system performs random writes followed by sequential reads of
the same data, LFS will perform worse than FFS. Although a RAID solution
will not solve this problem, the performance degradation won't be as bad
as it would be on a single disk. With multiple data drives, the sequential
reads (random to LFS running on a RAID-5) can be broken up into pieces and
serviced at the same time, with each piece serviced by a different drive
in parallel. The severity of the fragmentation in the log will determine
how much improvement can be found by applying a RAID solution to LFS.

For instance, if a contiguous address range is written randomly, but
happens to be broken up in such a way that the individual pieces all
fall on the same disk, in different, discontiguous segments, the RAID
solution will not provide any benefit. If the random writes cause the
data to land randomly across all of the drives, however, some improvement
(over single-disk LFS configurations) will be seen.
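
The two cases above can be sketched by mapping each fragment's block
address to a drive and counting the distinct drives involved. This is an
illustration under stated assumptions: it uses one common rotating-parity
layout (left-asymmetric, with parity starting on the last disk), and actual
arrays may lay out data differently. The names disk_for_block and
read_parallelism, and the parameter strip_blocks (blocks per disk per
stripe), are hypothetical.

```python
def disk_for_block(lba, total_disks, strip_blocks):
    """Map a logical block address to a disk index.

    Assumes a left-asymmetric RAID-5 layout: parity sits on the last disk
    for stripe 0 and moves one disk to the left on each following stripe;
    data units fill the remaining disks in order, skipping the parity disk.
    """
    data_disks = total_disks - 1
    unit = lba // strip_blocks        # which strip unit holds this block
    stripe = unit // data_disks       # which parity stripe holds that unit
    index = unit % data_disks         # position of the unit within the stripe
    parity_disk = (total_disks - 1) - (stripe % total_disks)
    return index if index < parity_disk else index + 1


def read_parallelism(fragment_lbas, total_disks, strip_blocks):
    """Count the distinct disks that hold the given fragments.

    A result of 1 means every fragment must be read from the same drive,
    so the array offers no parallelism for this read.
    """
    disks = {disk_for_block(lba, total_disks, strip_blocks)
             for lba in fragment_lbas}
    return len(disks)
```

On a five-disk array with four blocks per strip unit, fragments starting at
blocks 0, 16, and 32 all map to the same drive under this layout, so the
read is serialized. Fragments starting at blocks 0, 4, 8, and 12 map to four
different drives and can be fetched in parallel.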