Cluster I/O with River: Making the Fast Case Common
We introduce River, a data-flow programming environment and I/O
substrate for clusters of computers. River is designed to provide
maximum performance in the common case, even in the face of
non-uniformities in hardware, software, and workload. River is based on
two simple design features: a high-performance distributed queue,
and a storage redundancy mechanism called graduated declustering.
We have implemented a number of data-intensive applications on River,
which validate our design with near-ideal performance in a variety of
non-uniform performance scenarios.