Performance Summary

CS 736 – Spring 2006

 

Lecture 20: Performance Summary

 

1.   General Problems

a.    Latency: do things faster

                                     i.     E.g. RPC turnaround time

b.    Throughput

                                     i.     Handle more requests/operations per second

c.    Time to completion

                                     i.     How long does it take to compute a fixed workload? E.g. sort a billion values

d.    Scale up: run faster on faster machines – e.g. giant multiprocessors with lots of memory and fast CPUs

                                     i.     Improve speed on faster/more computers

                                    ii.     Run well on a supercomputer

e.    Scale out: run on bigger data sets on more machines

                                     i.     Handle more data on more computers / faster computers

                                    ii.     Run well on a cluster with a billion clients

f.     Predictability

                                     i.     Does computer do as you expect? Is performance predictable, understandable, low variance? If there is a problem, can you understand its source?

g.    Fairness

                                     i.     E.g. proportionally share a resource

h.   Efficiency

                                     i.     Reduce the amount of CPU/bandwidth/storage it takes to do something, even if it isn't the bottleneck

                                    ii.     Frees resources for something else

i.      Overload

                                     i.     How does performance vary with load? Keep it stable (degrade gracefully) rather than collapsing as load grows

2.   General Solutions

a.    Locality

                                     i.     FFS cylinder groups

                                    ii.     LFS logs

b.    Optimize for common case

                                     i.     LRPC

c.    Match underlying functionality

                                     i.     Active Messages: hardware messaging

                                    ii.     Scheduler Activations: scheduling decisions

                                   iii.     Grapevine Naming – shows whether a name is a user or a group

d.    Hints – semantically irrelevant, but useful for performance when correct (sketch below)

                                     i.     Pilot page usage
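
A minimal sketch of the hint idea (the table, keys, and hint values here are made up, not taken from Pilot): the hint is verified before it is used, so a stale hint only costs a fall-back to the slow path and never affects correctness.

/* Sketch: a hint is tried first but never trusted for correctness. */
#include <stdio.h>

#define NSLOTS 64
static int table[NSLOTS];                  /* slot -> key stored there (0 = empty) */

static int slow_lookup(int key)            /* authoritative but expensive search */
{
    for (int i = 0; i < NSLOTS; i++)
        if (table[i] == key)
            return i;
    return -1;
}

static int lookup(int key, int hint)
{
    if (hint >= 0 && hint < NSLOTS && table[hint] == key)
        return hint;                       /* hint was right: fast path */
    return slow_lookup(key);               /* hint was wrong: fall back, still correct */
}

int main(void)
{
    table[17] = 42;
    printf("%d\n", lookup(42, 17));        /* good hint */
    printf("%d\n", lookup(42, 3));         /* stale hint, same answer */
    return 0;
}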

e.    Partitioning – distribute load to multiple servers (sketch below)

                                     i.     Grapevine

                                    ii.     AFS

                                  iii.     Petal

                                  iv.     Frangipani
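
A minimal sketch of partitioning in the spirit of the systems above (the hash function, the names, and the four-server count are illustrative assumptions, not taken from any of the papers): each name is mapped to one server, so requests spread across the set instead of landing on one machine.

/* Sketch: map each name to one of nservers partitions. */
#include <stdio.h>

static unsigned partition(const char *name, unsigned nservers)
{
    unsigned h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;   /* simple string hash */
    return h % nservers;                       /* choose the server that owns this name */
}

int main(void)
{
    const char *names[] = { "birrell.pa", "schroeder.pa", "frangipani.src" };
    for (int i = 0; i < 3; i++)
        printf("%s -> server %u\n", names[i], partition(names[i], 4));
    return 0;
}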

f.     Replication – more read throughput (sketch below)

                                     i.     Grapevine

                                    ii.     AFS

                                  iii.     Petal
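
A minimal sketch of replication for read throughput (the round-robin choice is an illustrative assumption): any copy can serve a read, so read capacity grows with the number of replicas, while a write must still reach every copy.

/* Sketch: spread reads across replicas; writes go to all copies. */
#include <stdio.h>

#define NREPLICAS 3
static int next_replica;                   /* simple round-robin read choice */

static int pick_read_replica(void)
{
    return next_replica++ % NREPLICAS;     /* any replica can answer a read */
}

int main(void)
{
    for (int i = 0; i < 6; i++)
        printf("read %d -> replica %d\n", i, pick_read_replica());
    printf("write -> all %d replicas\n", NREPLICAS);   /* writes cannot be spread this way */
    return 0;
}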

g.    Caching (sketch below)

                                     i.     Grapevine – group membership

                                    ii.     AFS

                                  iii.     NFS
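
A minimal sketch of client-side caching (the table size, the one-character hash, and the fetch_from_server stand-in are all made up): a hit is answered locally and only a miss generates server traffic, which is how the systems above cut server load.

/* Sketch: consult a small local cache before asking the server. */
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 8
struct entry { char name[32]; int value; int valid; };
static struct entry cache[CACHE_SLOTS];

static int fetch_from_server(const char *name)   /* stand-in for the expensive remote call */
{
    return (int)strlen(name);
}

static int lookup(const char *name)
{
    unsigned slot = (unsigned char)name[0] % CACHE_SLOTS;
    if (cache[slot].valid && strcmp(cache[slot].name, name) == 0)
        return cache[slot].value;                /* hit: no server traffic */
    int v = fetch_from_server(name);             /* miss: pay the full cost once */
    snprintf(cache[slot].name, sizeof cache[slot].name, "%s", name);
    cache[slot].value = v;
    cache[slot].valid = 1;
    return v;
}

int main(void)
{
    printf("%d\n", lookup("csl.wisc.edu"));      /* miss, goes to the server */
    printf("%d\n", lookup("csl.wisc.edu"));      /* hit, served locally */
    return 0;
}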

h.   Change data structures

                                     i.     Logging in LFS, Petal, Frangipani

                                    ii.     Message lists in Grapevine

                                  iii.     Free block bitmap in FFS

                                   iv.     A-stack / E-stack in LRPC

                                   v.     Group membership in Grapevine

i.      Batching – amortize per-operation startup costs (sketch below)

                                     i.     Delayed writes in LFS, Frangipani, and NFS
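
A minimal sketch of batching (the buffer size and the printed "large write" stand-in are made up): small writes accumulate in memory, and the fixed per-operation cost is paid once per large transfer rather than once per small write, which is the effect delayed writes have in the systems above.

/* Sketch: absorb many small writes, issue one large one per batch. */
#include <stdio.h>
#include <string.h>

static char buf[8192];
static size_t used;

static void flush_batch(void)
{
    if (used > 0)
        printf("one large write of %zu bytes\n", used);   /* stand-in for the real disk/network write */
    used = 0;
}

static void batched_write(const char *data, size_t len)   /* assumes len <= sizeof buf */
{
    if (used + len > sizeof buf)
        flush_batch();                     /* fixed cost paid once per batch */
    memcpy(buf + used, data, len);
    used += len;
}

int main(void)
{
    for (int i = 0; i < 20000; i++)
        batched_write("x", 1);             /* 20000 one-byte writes ... */
    flush_batch();                         /* ... become a handful of large ones */
    return 0;
}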

j.     Randomization for fairness (sketch below)

                                     i.     Lottery Scheduling
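
A minimal sketch of a single lottery draw in the spirit of lottery scheduling (the ticket counts, the 4:2:1 shares, and the use of rand() are illustrative): a ticket is drawn uniformly at random and the client whose range covers it runs, so over many draws each client's CPU share is proportional to the tickets it holds.

/* Sketch: one lottery draw over per-client ticket counts. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int tickets[] = { 400, 200, 100 };     /* shares of 4 : 2 : 1 */
    int nclients = 3, total = 700;

    srand(736);
    for (int draw = 0; draw < 5; draw++) {
        int winner = rand() % total;       /* pick one ticket uniformly at random */
        int sum = 0, c;
        for (c = 0; c < nclients; c++) {
            sum += tickets[c];
            if (winner < sum)              /* first client whose range covers the ticket wins */
                break;
        }
        printf("draw %d: client %d runs\n", draw, c);
    }
    return 0;
}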

k.    Idempotent operations / stateless operation (sketch below)

                                     i.     Message delivery in Grapevine

                                    ii.     NFS – essentially all operations (stateless server, so retries are safe)
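
A minimal sketch of why idempotent, self-contained requests make recovery easy, in the spirit of an NFS read (the request fields and the lossy send_and_wait stand-in are made up): every request carries all the state it needs, so after a timeout the client just resends it and a duplicate execution does no harm.

/* Sketch: retrying an idempotent, self-contained request after timeouts. */
#include <stdio.h>

struct read_req { int file_handle; long offset; long count; };   /* all state lives in the request */

static int send_and_wait(const struct read_req *r)   /* stand-in: pretend two tries time out */
{
    (void)r;
    static int tries;
    return ++tries > 2;
}

int main(void)
{
    struct read_req r = { 7, 8192, 4096 };
    while (!send_and_wait(&r))             /* same offset, same count every time, */
        printf("timeout, resending\n");    /* so a duplicate read is harmless */
    printf("reply received\n");
    return 0;
}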

l.      Callbacks / leases – reduce server load (sketch below)

                                     i.     AFS

                                    ii.     Frangipani
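
A minimal sketch of a lease check in the spirit of callbacks/leases (the 30-second lease length and the revalidation message are illustrative assumptions): while the lease is valid the client answers from its cache and sends the server nothing, which is where the reduction in server load comes from.

/* Sketch: serve from the cache while the lease is valid, revalidate after it expires. */
#include <stdio.h>
#include <time.h>

static time_t lease_expiry;                /* zero means no lease held yet */

static void cached_read(void)
{
    if (time(NULL) < lease_expiry) {
        printf("served from cache, no server traffic\n");
        return;
    }
    printf("lease expired or missing: revalidate with server\n");
    lease_expiry = time(NULL) + 30;        /* assumed 30 s lease granted by the server */
}

int main(void)
{
    cached_read();                         /* first access: talk to the server, get a lease */
    cached_read();                         /* accesses within the lease are purely local */
    return 0;
}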

m. Move work to client

                                     i.     AFS name translation

n.   Early binding

                                     i.     NFS mounting

                                    ii.     LRPC binding / compiler-generated stubs

o.    Asynchronous operation

                                     i.     Active Messages

p.    Notifications – notify other participant of semantically interesting events

                                     i.     Scheduler Activations

q.    Move control from OS to user code

                                     i.     Scheduler activations

                                    ii.     Active Messages

r.     Delay work

                                     i.     LFS – segment cleaning

s.    Multi-level policy

                                     i.     FFS global/local placement

                                    ii.     Petal global / physical maps

3.   Evaluation Techniques

a.    Questions to ask

                                     i.     When should they be used

                                    ii.     What do they show

b.    Micro benchmarks (timing sketch below)

                                     i.     Used to understand performance problems – where is the speedup / slowdown / problem coming from

                                    ii.     E.g.

1.   Null RPC

2.   Contention

3.   Read 8 / 64 / 1000 KB files
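
A minimal sketch of the micro-benchmark pattern behind numbers like those above (the iteration count is arbitrary and null_op stands in for a null RPC or null fork): time many repetitions of the empty operation and divide, so the per-call cost stands out from one-time overheads. A real harness would also keep the compiler from optimizing the empty calls away.

/* Sketch: time many null operations and report the per-call cost. */
#include <stdio.h>
#include <sys/time.h>

static void null_op(void) { }              /* stand-in for a null RPC / null fork */

int main(void)
{
    const long iters = 1000000;
    struct timeval start, end;

    gettimeofday(&start, NULL);
    for (long i = 0; i < iters; i++)
        null_op();
    gettimeofday(&end, NULL);

    double usec = (end.tv_sec - start.tv_sec) * 1e6 + (end.tv_usec - start.tv_usec);
    printf("%.3f microseconds per call\n", usec / iters);
    return 0;
}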

c.    Synthetic benchmarks

                                     i.     Not fully representative of real workloads:

1.   Andrew Benchmark

2.   Shows higher level performance, more realistic mix of operations

3.   Again, used to understand performance and to indicate potential problems due to workload skew

d.    Record live performance

                                     i.     Shows operational issues, not peak load

1.   CPU utilization

e.    Perform anomalous events, e.g. shut down a server

                                     i.     Show response under duress (e.g. time to reconfigure, time to clean)

f.     Comparisons

                                     i.     Against best research system (LRPC, Scheduler Activations)

                                    ii.     Against industry practice (AFS, FFS, LFS)

                                   iii.     Against tuned industry practices (Petal, Frangipani, Active Messages)

g.    Papers

                                     i.     LRPC:

1.   Null RPC + component timings

2.   Throughput scaling on multiprocessor with simple workload

3.   Compare to Taos RPC

                                    ii.     Scheduler Activations

1.   Micro benchmarks: null fork, signal-wait

2.   Scalability with # of processors on single program

3.   Compare to Unix threads, Topaz FastThreads, and user-level threads

                                  iii.     Active Messages

1.   Null RPC + timings

2.   Utilization as the # of processors scales

3.   Compare to native buffered model

                                  iv.     Lottery Scheduling

1.   Proportionality of sharing under different simple workloads (small # of processes)

                                   v.     FFS

1.   Read / write bandwidth, CPU utilization on simple workloads

2.   Compare to UFS

                                  vi.     LFS

1.   Synthetic, fixed workloads (e.g. uniform vs. hot-and-cold access) to show response to different patterns

2.   Micro benchmarks for create/read/delete with sequential and random access

3.   Usage characteristics from live system

4.   Compare to FFS

                                vii.     AFS

1.   Usage characteristics from live usage

2.   Andrew benchmark – time + scalability as client load increases

3.   Access latency for different size files

4.   Compare to NFS, local

                               viii.     NFS

1.   Compare to local, network disk

2.   Run real programs

                                  ix.     Petal

1.   Compare to local, tuned industry FS

2.   Synthetic read / write workload

3.   Measure latency and scalability with # of servers

4.   Andrew benchmark

                                   x.     Frangipani

1.   Compare to local w/ tuned industry FS

2.   Andrew benchmark

3.   Synthetic read/write microbench

4.   Scaling on microbenchmarks to understand performance