1.1 Transaction:
- All or nothing semantics -
indivisibility
- ACID properties
- Atomicity - all or nothing
- Consistency - normal transaction end preserves
database consistency
- Isolation - Events within an XACT must be hidden
from other concurrent XACTs
- Durability - Once XACT commits, DBMS must
guarantee results survive subsequent malfunctions
1.2 Failures to anticipate:
- Transaction failure - self abort, DBMS abort,
system crash - happens frequently (10-100/minute)
- System Failure - DBMS bug, OS fault, hardware
failure - several times a week
- Media failure - secondary storage failure, bugs in
OS disk routines, head crash, magnetic decay - 1-2/year
1.3 Recovery action summary
- Transaction UNDO - normal self/system
abort
- Global UNDO - System failure - all incomplete
XACTs rolled back
- Partial REDO - System failure - completed XACT
results may need to be redone
- Global REDO - Media failure - archive
recovery. Restore from backup, then reapply the results of XACTs
committed since that backup
- Thus - XACT is the only unit of
recovery
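The summary above is effectively a lookup from failure class to recovery action. A minimal sketch of that lookup follows; the names `Failure`, `RECOVERY_ACTIONS`, and `recovery_plan` are made up for illustration and do not come from the paper.

```python
from enum import Enum
from typing import List


class Failure(Enum):
    TRANSACTION = "transaction failure"
    SYSTEM = "system failure"
    MEDIA = "media failure"


# Recovery actions per failure class, as listed in section 1.3 of these notes.
RECOVERY_ACTIONS = {
    Failure.TRANSACTION: ["transaction UNDO"],
    Failure.SYSTEM: ["global UNDO", "partial REDO"],
    Failure.MEDIA: ["global REDO"],
}


def recovery_plan(failure: Failure) -> List[str]:
    """Return the recovery actions the notes associate with a failure class."""
    return list(RECOVERY_ACTIONS[failure])
```

For example, `recovery_plan(Failure.SYSTEM)` yields both global UNDO and partial REDO, since a system failure leaves incomplete XACTs to roll back and completed ones whose results may not have reached disk.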
2.1 Mapping Process: Objects and
Operations
- File Management - Lowest layer, direct access to
bit patterns on nonvolatile direct access device - disk. Fixed length
blocks.
- Propagation Control - Subtle distinction between
pages and blocks - page mapped to block at this layer, maybe to another block
later in lifetime
- Access Path Management - Complicated mapping - all
physical object representations (recs/flds) and related access paths
(pointers/hash tables/trees) in a large virtual addr space.
- Navigational Access Layer - Data manipulation
layer - DML style interfaces and operations - single record ops
- Nonprocedural Access Layer - Retrieval layer - SQL
(or QUEL) language used here - ops on sets of records
2.2 Storage Hierarchy: Implementational
Environment
- Main memory - volatile, lost in abnormal DBMS
termination
- Physical DB copy - on disk, maybe lost in media
failure
- Archive DB copy - on tape or other offline
storage
- Temp log - supports crash recovery - selective
XACT UNDO requires random access to log records, therefore on disk
- Archive log - supports global REDO after media
failure. Need archive DB copy, always processed sequentially, therefore
on tape
2.3 Different views of database
- Current database - all accessible objects during
normal DBMS processing. Disk pages + recently modified pages in
memory.
- Materialized database - state during crash
recovery before applying any log information, no memory buffer.
- Physical database - all blocks in the online copy
containing page images - current or not.
- Page contents modification - affects only current
db
- Write operation - affects only physical db - info
about new disk block may not yet be in materialized db
2.4 Mapping Concepts for Updates:
- Direct page allocation: updates occur in place, if
interrupted by system crash we can be inconsistent
- Indirect page allocation: All output goes to new
page on disk, thus we can always get back to the old state for
recovery
- ATOMIC: Any set of modified pages is propagated as a
unit - all or none
- ~ATOMIC: Update-in-place block writes.
Vulnerable to system crashes
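Indirect page allocation with ATOMIC propagation can be pictured as a page table that is switched in one step. The toy model below is only a sketch under that assumption: `ShadowPagedStore` and its methods are hypothetical names, and a Python dict assignment stands in for the atomic switch a real system would have to implement on disk.

```python
class ShadowPagedStore:
    """Toy indirect page allocation: writes go to new blocks, one table swap propagates them."""

    def __init__(self):
        self.blocks = {}       # block id -> page contents (the "disk")
        self.page_table = {}   # page id -> block id (defines the materialized db)
        self.pending = {}      # page id -> new block id, written but not yet propagated
        self._next_block = 0

    def write(self, page_id, contents):
        # Output always goes to a fresh block; the old block stays untouched.
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = contents
        self.pending[page_id] = block_id

    def propagate(self):
        # ATOMIC propagation: build the new table, then switch to it in one step.
        new_table = dict(self.page_table)
        new_table.update(self.pending)
        self.page_table = new_table  # stands in for an atomic pointer switch
        self.pending = {}

    def crash(self):
        # A crash loses whatever was written but not yet propagated.
        self.pending = {}

    def read_materialized(self, page_id):
        return self.blocks[self.page_table[page_id]]
```

A write followed by a crash leaves `read_materialized` returning the value from the last `propagate`, which is the "always get back to the old state" property. With update-in-place (~ATOMIC) there is no old block to fall back on.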
3. Crash recovery
3.1 State of the database after a
crash
- Only have materialized db and temp log (assumes
online db intact)
- With direct page alloc and ~ATOMIC, state of
materialized db is unpredictable
- With indirect page alloc and ATOMIC, db state is
that of the most recent propagation
3.2 Types of log information to support recovery
actions
3.2.1 Dependencies between Buffer Manager and
Recovery Component
- 3.2.1.1 Buffer Management and UNDO
- STEAL: Modified pages may be written and/or
propagated at any time
- ~STEAL: Modified pages are kept in buffer at
least until end of XACT (EOT).
- 3.2.1.2 Buffer Management and REDO
- FORCE: All modified pages are written and
propagated during EOT processing
- ~FORCE: No propagation is triggered during EOT
processing
- With FORCE, no logging for partial REDO is
required.
- However, still need global REDO if online db
lost
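The STEAL/FORCE choices above determine which kinds of log information crash recovery depends on. The function below is a simplified sketch of that relationship. The name `required_log_info` and the dictionary keys are made up here, and it assumes that ~STEAL removes the need for global UNDO, which ignores interactions with ~ATOMIC propagation.

```python
def required_log_info(steal: bool, force: bool) -> dict:
    """Which recovery actions need log support for a given buffer policy (simplified)."""
    return {
        # STEAL lets uncommitted changes reach the materialized db, so they must be undoable.
        "global_undo": steal,
        # ~FORCE lets committed changes stay only in the buffer, so they must be redoable.
        "partial_redo": not force,
        # Losing the online db always requires REDO from the archive, whatever the policy.
        "global_redo": True,
    }
```

Under this simplification, ~STEAL with FORCE needs log data only for archive recovery, while STEAL with ~FORCE needs all three.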
3.2.2 Classification of log data
- Physical State Logging on Page Level - most
basic. Any modification writes entire page to log. UNDO has before
image, REDO uses after image.
- Physical Transition Logging on Page Level - logs the
difference between page states. Uses XOR, which is commutative and
associative, so one diff serves both UNDO and REDO. Long sequences of zeros can be compressed.
- Physical State Logging on Access Path Level -
records, tables, access path structure changes. Log component must be
aware of structures, but log file size is reduced.
- Transition Logging on Access Path Level -
not really different from physical state logging at this level - stores opcode and object id
instead of phys addr and entry representation.
- Logical logging on the Record-Oriented Level -
store update DML statements and parameters. UNDO requires inverse
statements (DELETE for INSERT, UPDATE to old values) (System R
style)
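Page-level transition logging with XOR can be sketched in a few lines. The page contents and the helper name `xor_pages` are invented for illustration. The point is that one logged difference recovers either image from the other.

```python
def xor_pages(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length page images."""
    if len(a) != len(b):
        raise ValueError("page images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


before = b"acct=0042;bal=0100"   # hypothetical page contents before the update
after = b"acct=0042;bal=0250"    # hypothetical page contents after the update

diff = xor_pages(before, after)  # this difference is what gets logged

redone = xor_pages(before, diff)  # REDO: old image XOR diff gives the new image
undone = xor_pages(after, diff)   # UNDO: new image XOR diff gives the old image
```

Unchanged bytes XOR to zero, which is why the logged difference is mostly zeros and compresses well when an update touches only a small part of the page.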
- Rules for writing log information
- UNDO info written to the log file
before corresponding updates are propagated to materialized
db. This is WAL.
- REDO info written to temp and archive log
before EOT is acked to XACT program. This guarantees that we
can provide durability.
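The two log-writing rules can be sketched as ordering constraints inside a toy buffer manager. `WalStore` and its fields are hypothetical, and flushing the whole log tail on every propagation is a simplification. A real system would only need the records for the page being written.

```python
class WalStore:
    """Toy model of the two rules: log before propagation, log before EOT acknowledgement."""

    def __init__(self):
        self.stable_log = []    # log records that have reached stable storage
        self.log_buffer = []    # volatile log tail, lost in a crash
        self.materialized = {}  # page id -> value on disk
        self.buffer = {}        # page id -> value in main memory

    def update(self, xact, page, new_value):
        old = self.buffer.get(page, self.materialized.get(page))
        # Record both before image (UNDO) and after image (REDO).
        self.log_buffer.append(("UPDATE", xact, page, old, new_value))
        self.buffer[page] = new_value

    def flush_log(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def propagate(self, page):
        # WAL rule: UNDO info is on the log before the page reaches the materialized db.
        self.flush_log()
        self.materialized[page] = self.buffer[page]

    def commit(self, xact):
        # Durability rule: REDO info is stable before EOT is acknowledged.
        self.log_buffer.append(("COMMIT", xact))
        self.flush_log()
        return "EOT acknowledged"
```

Reversing either order opens a window where a crash leaves a propagated change with no before image to undo it, or an acknowledged XACT with no after image to redo it.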
3.3.1 Optimization of Recovery actions by
Checkpoints
- Goal: To limit and define what must be done during
a crash recovery
- To temp log, write:
- BEGIN_CHECKPOINT
- Checkpoint data to log file and/or
db
- END_CHECKPOINT
- Complete checkpoints will have all three parts in
the log during recovery.
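During recovery, only a checkpoint with both markers in the log can be trusted. A minimal sketch of locating the newest complete checkpoint follows, assuming the temp log is a list of records in which the two markers appear as plain strings. The function name and record format are made up for illustration.

```python
def last_complete_checkpoint(log):
    """Return the log index of BEGIN_CHECKPOINT for the newest complete checkpoint, or None."""
    begin = None
    last_complete = None
    for i, record in enumerate(log):
        if record == "BEGIN_CHECKPOINT":
            begin = i
        elif record == "END_CHECKPOINT" and begin is not None:
            # Both markers present: this checkpoint finished before any crash.
            last_complete = begin
            begin = None
    return last_complete
```

A checkpoint interrupted by the crash has a BEGIN_CHECKPOINT but no END_CHECKPOINT and is skipped, so recovery falls back to the previous complete one.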
- 3.3.2 XACT-Oriented Checkpoints (TOC)
- Implied by using FORCE
- Hot-spot pages are propagated each time they are modified,
even though they stay in the buffer
- 3.3.3 XACT-Consistent Checkpoints
(TCC)
- To do this, disallow incoming XACTs and allow
incomplete XACTs to finish
- Propagate modified buffer pages and write to
log
- Once done, partial REDO does not need to look
past this point
- Drawback: Not realistic for busy multiuser DBMS
to get quiescent state
- Drawback: Cost of checkpoint will be high with
large buffer pool.
- 3.3.4 Action-Consistent Checkpoints
(ACC)
- At record level, action can be seen as DML
statement
- ACC can be done when no update action is being
processed
- Does limit partial REDO
- Does not bound global UNDO - any XACT incomplete
at the checkpoint must be handled by global UNDO
- Cost of checkpointing large buffers can still be
high and cause delays
- 3.3.5 Fuzzy checkpoints
- indirect checkpointing - write buffer occupation
info to log, not pages
- may have to trace REDO for hot spot pages back a
long way in the log
- only applicable to ~ATOMIC
- fuzzy since it only deals with the log, not the
online db
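Putting section 3 together, crash recovery from the materialized db and the temp log can be sketched as a REDO pass for committed XACTs followed by an UNDO pass for incomplete ones. This is a toy under stated assumptions, not the paper's algorithm: page-level state logging with before and after images, the record format used here, the name `crash_recover`, and a scan from the start of the log rather than from a checkpoint.

```python
def crash_recover(materialized, temp_log):
    """Return the recovered db state given the materialized db and the temp log.

    Log records: ("UPDATE", xact, page, before_image, after_image) or ("COMMIT", xact).
    """
    committed = {rec[1] for rec in temp_log if rec[0] == "COMMIT"}
    db = dict(materialized)

    # Partial REDO: reapply after images of committed XACTs, oldest first.
    for rec in temp_log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, page, _before, after = rec
            db[page] = after

    # Global UNDO: restore before images of incomplete XACTs, newest first.
    for rec in reversed(temp_log):
        if rec[0] == "UPDATE" and rec[1] not in committed:
            _, _, page, before, _after = rec
            db[page] = before

    return db
```

Because the records hold full images, both passes are idempotent and can be repeated if recovery itself crashes. A checkpoint would let the REDO pass start later in the log instead of at the beginning.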
3.4 Examples - see paper
3.5 Evaluation of Logging and Recovery
Concepts
- ATOMIC propagation achieves action or
XACT-consistent materialized db at crash time
- Thus, can do physical or logical
logging
- increased overhead during normal processing due
to redundancy required for indirect page mapping
- recovery can be cheap when combined with
TOC
- ~ATOMIC generally results in chaotic materialized
db at crash time
- Thus must do physical logging
- No overhead during normal processing
- recovery more expensive without good
checkpointing
- Both TOC and TCC are expensive. Made worse in
TOC by high checkpoint frequency
- Probably rule out ATOMIC, FORCE, TOC combination
due to high cost during normal processing
4. Archive recovery
- Creation can be expensive
- to be consistent, updates must be suspended on
database
- can create a fuzzy dump by not worrying about
that
- Also can do incremental dump - just changed pages
since last archive created.
- Must consider most recent archives may be corrupt,
and may need to go to oldest archive to restore
- This means that we must keep archive log from
the oldest archive forward
- archive log could be corrupt too, so must keep
multiple copies
- But this adds too much overhead to normal
processing
- Solution - write multiple copies of temp log,
and asynchronously copy REDO records to multiple archive log
copies
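Global REDO with fallback to an older archive can be sketched as follows. The data layout is an assumption made for illustration: each archive copy is paired with the archive log position at which it was taken, a corrupt copy is represented as `None`, and the archive log holds only REDO records of committed XACTs. The name `archive_recover` is hypothetical.

```python
def archive_recover(archives, archive_log):
    """Restore from the newest usable archive copy and replay the archive log from there.

    archives: list of (log_position, snapshot_dict_or_None), oldest first.
    archive_log: list of (page, after_image) REDO records of committed XACTs.
    """
    for position, snapshot in reversed(archives):
        if snapshot is not None:
            db = dict(snapshot)
            # Sequential pass over the log, starting where this archive was taken.
            for page, after in archive_log[position:]:
                db[page] = after
            return db
    raise RuntimeError("no usable archive copy")
```

Falling back to the oldest archive only works if the log is kept from that archive's position onward, which is the retention requirement stated above.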
_Alan