Footprint-based Scheduling
Marc de Kruijf and Gwendolyn Stockman
Abstract:
As we move into the multicore era, fine-grain multithreading
will be among the primary tools used by software
developers to enhance the parallel performance of their
applications. However, more so than in prior, coarse-grain
multithreading systems, intelligent locality-aware
scheduling of fine-grain threads is paramount for good
performance. Fortunately, in fine-grain multithreaded applications
there is an abundance of threads eligible for
execution at any given time, suggesting opportunities to
develop efficient, locality-aware thread scheduling algorithms
that can improve overall performance.
In this paper, we develop techniques for footprint-based
thread scheduling -- a previously uncharted area of research.
We experimentally evaluate two sets of synthetic
workloads to demonstrate the importance of locality-aware
scheduling. For array-based workloads, we show
a worst-case 5X performance difference between locality-aware
and locality-indifferent executions. Similarly, for
tree-based workloads, we show in excess of a 2X performance
difference across best-case and worst-case executions.
We also discuss and present architectural enhancements
to expose thread footprint information to software
for efficient footprint-based thread scheduling. Finally,
we describe a footprint-based thread scheduling algorithm
to be evaluated in future work.
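
The abstract does not spell out the scheduling algorithm, so the following is only an illustrative sketch of the general idea rather than the authors' method. It assumes a thread's footprint can be represented as a set of cache-line addresses and that the scheduler can see which lines a core currently holds. The function name pick_next_thread and both parameters are hypothetical.

```python
from typing import Dict, Hashable, Optional, Set


def pick_next_thread(
    ready_footprints: Dict[Hashable, Set[int]],
    cached_lines: Set[int],
) -> Optional[Hashable]:
    """Pick the ready thread whose footprint overlaps most with the cache.

    ready_footprints maps a thread id to the set of cache-line addresses
    the thread is expected to touch (an assumed representation).
    cached_lines is the set of lines assumed to be resident on the core.
    Returns None when no thread is ready.
    """
    if not ready_footprints:
        return None
    # Ties go to the thread that appears first in the dictionary.
    return max(
        ready_footprints,
        key=lambda tid: len(ready_footprints[tid] & cached_lines),
    )


ready = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {7, 8}}
print(pick_next_thread(ready, cached_lines={3, 4, 5}))
```

Here t2 shares two lines with the cache, t1 shares one, and t3 shares none, so t2 is selected. A locality-indifferent scheduler would ignore cached_lines and choose by arrival order or at random.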
Available as: pdf
Downloads:
- Array workload source code: tar
- Array workload simulator output: tar
- BST workload source code: tar
- BST workload simulator output: tar
- For modified GEMS simulator source code, please contact the
authors.