Footprint-based Scheduling

Marc de Kruijf and Gwendolyn Stockman

Abstract: As we move into the multicore era, fine-grain multithreading will be among the primary tools used by software developers to enhance the parallel performance of their applications. However, more so than in prior, coarse-grain multithreading systems, intelligent locality-aware scheduling of fine-grain threads is paramount for good performance. Fortunately, in fine-grain multithreaded applications there is an abundance of threads eligible for execution at any given time, suggesting opportunities to develop efficient, locality-aware thread scheduling algorithms that can improve overall performance.

In this paper, we develop techniques for footprint-based thread scheduling -- a previously uncharted area of research. We experimentally evaluate two sets of synthetic workloads to demonstrate the importance of locality-aware scheduling. For array-based workloads, we show a worst-case 5X performance difference between locality-aware and locality-indifferent executions. Similarly, for tree-based workloads, we show in excess of a 2X performance difference across best-case and worst-case executions. We also discuss and present architectural enhancements to expose thread footprint information to software for efficient footprint-based thread scheduling. Finally, we describe a footprint-based thread scheduling algorithm to be evaluated in future work.

