The Dangers of Replication and a Solution
Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha.
Microsoft, UMB, and NYU
Scribe by: Zuyu Zhang
One-line Summary
This paper surveys update propagation techniques in replicated systems
and the problems that replication causes, but its solution,
a two-tier replication algorithm in which updates proposed by disconnected nodes
are later applied to a master copy,
may no longer be promising today.
Overview/Main Points
- Background
- distributed system, replicated data
- want to update data anywhere, anytime, anyway, with transactional consistency
- Options for distributed systems with replication
- propagation strategy
- eager
- keep all replicas synchronized by updating everything in a single transaction
- use a locking scheme for concurrency control
- a transaction may be delayed or aborted if it would violate serializability
- lazy
- asynchronously propagate updates to replicas
- use a multi-version concurrency control scheme to detect non-serializable behavior
- most multi-version isolation schemes return the most recent committed value, but a very old value may be returned as well
- if a serialization problem is detected, a program or person reconciles it; committed replica updates are not automatically reversed
- ownership strategy
- group (update anywhere)
- any node with a copy can update it
- master
- each object has a master node; only the master can update it (other nodes send update requests to the master)
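The difference between the two propagation strategies above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not something from the paper: the `Replica` class, the function names, and the in-memory queue are hypothetical, and locking, failures, and conflicts are ignored. It only shows that an eager update touches every replica before commit, while a lazy update commits at one node and leaves the others temporarily stale.

```python
from collections import deque


class Replica:
    """One node's copy of the data (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def eager_update(replicas, key, value):
    """Eager: a single transaction writes every replica before it commits."""
    for replica in replicas:
        replica.data[key] = value
    # Commit point: all replicas already hold the new value.


def lazy_update(origin, pending, key, value):
    """Lazy: commit at the originating node, queue the update for the others."""
    origin.data[key] = value
    pending.append((origin.name, key, value))


def propagate(replicas, pending):
    """Apply queued updates to the other replicas after the original commit."""
    while pending:
        source, key, value = pending.popleft()
        for replica in replicas:
            if replica.name != source:
                replica.data[key] = value


nodes = [Replica("a"), Replica("b"), Replica("c")]

eager_update(nodes, "x", 1)
eager_reads = [n.data["x"] for n in nodes]

queue = deque()
lazy_update(nodes[0], queue, "x", 2)
stale_reads = [n.data["x"] for n in nodes]  # b and c still hold the old value
propagate(nodes, queue)
fresh_reads = [n.data["x"] for n in nodes]
```

Between `lazy_update` and `propagate`, a read at node b or c returns the old value. That window is where lazy schemes can observe non-serializable behavior.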
- Replication strategies have scale problems
| Propagation / Ownership | group | master |
|---|---|---|
| eager | one xact, N object owners | one xact, one object owner |
| lazy | N xacts, N object owners | N xacts, one object owner |
- Eager systems
- transaction size grows quickly in # of replicas
- higher xact rate ⇒ more deadlocks or waits
- Lazy systems
- need reconciliation
- # of reconciliations grows fast (cubically) with the replication factor, and also grows with the update rate
- “system delusion” due to failed reconciliations
- lazy master system
- no reconciliation; deadlocks instead
- lazy group system
- require convergence instead of transactional consistency
- if no new updates arrive, all replicas eventually converge to the same value
- idea: restrict the form of updates (trivial but doable), e.g.
- append a timestamped value
- timestamped replace of a value (last-writer-wins policy)
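The two restricted update forms in the last bullets can be sketched as follows, to show why they converge regardless of arrival order. This is a minimal sketch under stated assumptions: the `(timestamp, node_id, value)` tuple layout, the tie-break on `node_id`, and all function names are hypothetical choices for illustration, not the paper's definitions.

```python
def lww_merge(current, incoming):
    """Timestamped replace: keep the update with the larger (timestamp, node_id).

    The node_id tie-break is an assumption added here so that every replica
    picks the same winner when two timestamps are equal.
    """
    return max(current, incoming, key=lambda u: (u[0], u[1]))


def apply_replace(updates, initial=(0, "", None)):
    """Fold a stream of updates into one replica's state."""
    state = initial
    for update in updates:
        state = lww_merge(state, update)
    return state


def apply_append(updates):
    """Timestamped append: keep every update; arrival order does not matter."""
    log = set()
    for update in updates:
        log.add(update)
    return sorted(log)


# The same three updates arrive at two replicas in different orders.
updates = [(3, "b", "blue"), (1, "a", "red"), (3, "a", "green")]

replica_1 = apply_replace(updates)
replica_2 = apply_replace(reversed(updates))

log_1 = apply_append(updates)
log_2 = apply_append(reversed(updates))
```

Both replicas end in the same state under either rule. With timestamped replace, the values that lose the comparison are silently discarded, so convergence here does not imply that no update was lost.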
Relevance
Flaws