Project Sponsors
- NSF Next Generation Software Program (September 2001 - August 2004)
- NSF Next Generation Software Program (September 2000 - December 2001)
- DARPA (July 1997 - December 2000)
Project Overview
This project brings together investigators with expertise in state-of-the-art
applications, compilers, performance specification, performance modeling, and
distributed system design/capacity planning to develop the tools needed for
integrated design, development, evolution, and near-optimal adaptive run-time control
of large high-performance applications running on distributed
computational systems.
Target applications include the cutting edge of computational science,
such as complex stochastic optimization codes that are used to
solve key organizational, economic, and financial planning decision
problems that involve uncertainty. Target computational platforms include
widely distributed heterogeneous "Grid" architectures
running Grid middleware such as Condor, Globus, or Legion.
Grid platforms currently provide one of the most attractive environments for
running large compute-intensive applications because the Grid resources are
inexpensive, widely accessible, and powerful.
Over time, each application submitted through the Grid middleware is
given a share of the computational resources that are not
used by higher-priority computations. This enables an application to obtain
large quantities of processing power relatively easily, although the
number and capabilities of the distributed hosts
that will be allocated to the application are unpredictable
and may vary during the course of the computation.
In contrast to previous work, the approach of the POEMS/MASC project
is to develop model-based techniques and tools for near-optimal
adaptive run-time control. That is, high-fidelity performance models
drive the run-time control of the application's execution,
so that the application adapts both to its changing computational requirements and
to unpredictable changes in the allocated computational, communication,
and storage resources. The adaptation is (nearly) optimal and/or meets specified
performance objectives.
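
As a rough illustration of what model-based adaptive control can mean in practice, the sketch below re-selects a work batch size whenever the set of allocated hosts changes, using a performance model to predict completion time. It is a toy example only and not the project's actual models or tools: the names Resources, predicted_time, and choose_batch_size, the per-batch overhead, and the round-based cost formula are all hypothetical assumptions made for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical snapshot of what the Grid middleware currently allocates."""
    hosts: int         # number of hosts available right now
    host_speed: float  # work units per second per host (assumed uniform)


def predicted_time(work, batch_size, res, overhead_per_batch=0.5):
    """Toy performance model (an assumption, not a POEMS/MASC model).

    Work is split into batches; batches run in rounds across the hosts,
    and every batch pays a fixed scheduling/communication overhead.
    """
    batches = math.ceil(work / batch_size)
    rounds = math.ceil(batches / res.hosts)
    return rounds * (batch_size / res.host_speed + overhead_per_batch)


def choose_batch_size(work, res, candidates):
    """Pick the candidate batch size with the lowest predicted completion time."""
    return min(candidates, key=lambda b: predicted_time(work, b, res))


if __name__ == "__main__":
    candidates = (10, 50, 100, 250)
    # Re-run the model each time the allocation changes during the computation.
    for res in (Resources(hosts=4, host_speed=10.0),
                Resources(hosts=20, host_speed=10.0)):
        best = choose_batch_size(1000, res, candidates)
        print(res.hosts, "hosts -> batch size", best)
```

Under these assumptions the preferred batch size shifts as the host count changes, which is the kind of decision a real controller would make with a far more detailed, validated model.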
For more information about how to get involved in the project, see the list of
current topics.