My UW
|
UW Search
Computer Science Home Page
> ~vernon > Research Summary
Activities/Bio
Research Summary
Publications
Graduates
Teaching
Contact Information
|
|
|
Mary K. Vernon
Professor Emerita of Computer Science
|
|
Quantitative System Design Interests:
Analytic system performance models that accurately predict measured
system performance for alternative system design options,
important next-generation commercial system design questions,
quantitative design of near-optimal hardware and software,
near-optimal enterprise storage systems,
network traffic characterization, near-optimal network protocols,
state-of-the-art parallel/distributed applications and architectures.
Research Summary
The theme of my research has been the development and application
of analytic modeling techniques that enable the design of
software, hardware, and communication system architectures with
near-optimal performance and known performance properties.
The contributions have included (1) customized equations that directly
specify an optimal system design, (2) analytic bounds that quantify
the opportunity for improvement as well as the target system performance,
(3) customized analytic models that reflect the mechanics of the system
and that accurately estimate measured
system performance for various design options, and (4) application of the
models to a wide variety of vital commercially-relevant system design questions,
leading to the development of new near-optimal commercialy-adopted system designs
that significantly outperform previous solutions.
The simplest possible analytic models that predict measured system performance
readily expose all system features that inhibit system performance, including
any performance bottlenecks that can be eliminated. Hence, the development
of such models generally leads to significant new design insights
as well as new near-optimal solutions that are often simpler to implement
and significantly outperform previous designs.
In addition, these analytic models typically require 1/10th - 1/100th the development
time of accurate detailed system simulators,
and frequently uncover previously unknown significant errors in a complex system
(or simulator) implementation. Hence, such models
are also highly valuable for validating complex system and simulator implementations.
The new modeling techniques developed together with students and colleagues
include:
- the Generalized Timed Petri Net (GTPN) which can be used
to create models that predict measured system performance for bus arbitration protocols,
cache coherence protocols, and other detailed system architectures that
include operations that have deterministic durations,
- Customized Approximate Mean Value Analysis (CMVA) which supports
queueing-theoretic modeling at the accuracy of measured system performance
for the design of complex high-performance computer/communication architectures/systems,
in that the equations are developed "from scratch" to reflect the
mechanics of the system operation, thus facilitating accurate extensions
to capture the impact of a wide range of complex, nonseparable performance-relevant
system features,
- CMVA approximations for highly accurate modeling of complex system
features, such as
high variability in service times and the resulting highly bursty arrivals
to downstream servers,
pipelined interconnect message transfers with correlated blocking due to finite
buffer space,
in-order delivery of distributed responses to non-blocking memory requests,
mean service time correlated with number of parallel requests in service, and
requests coalescing with other requests for the same file while waiting
for service.
- A CMVA model for provisioning
servers for very low client loss probability (e.g, 0.0001).
- Customized high-fidelity LogGP models that predict measured
system performance for the design of state-of-the-art
applications runing on high-performance message-passing parallel architectures,
such as an 8192-node Cray XT3/XT4 or a 500-node Condor system,
- the LoPC model which supports the development of LogP models
- without the "g" parameter - for estimating measured system performance in
the design of state-of-the-art parallel applications running on
high-performance shared-memory architectures,
due to incorporating CMVA for accurately estimating the delay in
application execution time due to contention (C) at the shared system resources,
- Deterministic Task Graph Analysis for parallel algorithm design,
together with the underlying theory that defines the broad system space
for which this modeling technique is accurate,
- Customized equations for mean response time of alternative
parallel system job scheduling policies
which compute the exact mean response time at particular points in a high-fidelity
system architecture and workload characterization parameter space, and use
interpolation functions to accurately estimate mean response time over the rest of
the parameter space,
- Customized lower bounds on server and network bandwidth
required for on-demand streaming of popular media content for a variety of
alternative settings, such as transmission path data rate, client packet loss rate,
maximum client start-up delay, choose-your-own-ending content,
heavy-tailed distributions of media content popularity, and so forth,
- Customized models of Periodic Broadcast streaming protocols that can be used
to compute the broadcast schedule that minimizes server bandwidth for a specified
maximum client start-up delay, client packet loss recovery rate,
and client data path transmission rate,
- Customized linear programming models for directly determining an optimal
content caching and delivery architecture for a specified client workload
and near-optimal scalable streaming protocol, and
- A Customized Model of TCP-Vegas throughput in the presence of packet loss
that accurately predicts measured performance over a wide variety of Internet paths.
- Customized models of differentiated service link scheduling policies
for bottleneck links in the Internet.
We have used these techniques to obtain significant new system design insights
and new, in many cases near-optimal, designs that
significantly outperform previous solutions. Example design insights
and new near-optimal system designs include:
- new, simpler and near-optimal, commercially-adopted round-robin and
first-come first-served bus arbitration protocols
(pdf),
- new, simpler and near-optimal hardware cache coherence protocols for
a broad range of shared memory workloads,
implemented in commercial cache controllers
(pdf),
- commercially-adopted coherence protocol support for a new
Queue-on-Lock-Bit (QOLB) synchronization primitive,
- the CMVA-based discovery and quantification of a previously unknown,
significant processor throughput imbalance in commercial high-performance
parallel architectures with k-ary n-cube processor/memory interconnections,
due to a previously overlooked load-imbalance on the virtual channels used
for deadlock-avoidance in Dally and Seitz' wormhole-routing protocol
(pdf);
this led Lethin and Dally to use a modified CMVA model in the design a new routing
protocol for future interconnects,
- commercially-adopted CMVA models to explore
design trade-offs in high-performance shared memory architectures
for processors with aggressive instruction-level parallelism
(pdf),
- optimized Cray UNICOS operating system semaphores,
- a new "Priority Backfill" production parallel system job scheduling policy,
implemented in the 1520-processor Origin 2000 system at NCSA for a 5-fold (80%)
reduction in mean and 95-percentile job waiting time
(pdf),
- a new near-optimal task scheduling policy for Condor grids,
- a new near-optimal state-of-the-art ATR parallel stochastic optimization code
with a factor of 8 reduction in runtime on a 500-node Condor server farm
(pdf),
- high-performance parallel particle transport applications
(and other wavefront computations)
optimized for 8192-node (and larger) Cray XT3/XT4 systems
(pdf),
- new near-optimal Hierarchical Stream Mergiing and Bandwidth Skimming
multicast streaming protocols with required server bandwidth that grows with
the logarithm of the client request rate
(pdf, pdf),
- a new TCP-Madison Internet transport protocol that achieves a
target upper-bound on bottleneck link utilization, near-optimal bottleneck
link sharing, negligible packet loss and up to ten-fold higher throughput
than FastTCP over a wide variety of Internet paths
(pdf).
Current research includes the design of commercially-relevant
enterprise storage systems,
parallel/distributed applications,
high performance network transport protocols,
and network traffic characterization.
|
|
|