High-Performance Sorting on Networks of Workstations

Appeared in SIGMOD '97

Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau,
David E. Culler, Joseph M. Hellerstein, and David A. Patterson.


We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on NOWs is competitive with sorting on the large-scale SMPs that have traditionally held the performance records. On a 32-node cluster, we finish the Datamation benchmark in 2.41 seconds, and can sort 3.0 GB in just under one minute. On a smaller, better-equipped, 8-node cluster, we run the Datamation benchmark in 2.92 seconds, and sort 1.4 GB in a minute.

Our implementations can be applied to a variety of disk, memory, and processor configurations; we highlight salient issues for tuning each component of the system. Throughout the paper, we evaluate the use of commodity hardware and operating systems for parallel sorting, and note lessons that can be drawn when applying NOW technology to data-intensive applications.


Available as: Abstract, Compressed PostScript, PostScript.

Also see the NOW-Sort Home Page.