(2.5.6) Cosmic rays

Shekhar Y. Borkar: Designing Reliable Systems from Unreliable Components: The Challenges of Transistor Variability and Degradation. IEEE Micro 25(6): 10-16 (2005).

Soft errors (single-event upsets)

Special radiation-hardened circuits
Architectural redundancy
Localized ECC

Soft error rate
     Silent data corruption (SDC)
     Detected unrecoverable error (DUE)
Architectural vulnerability factor (AVF): the probability that a soft error in a structure affects the program's actual output
     e.g., a branch predictor's AVF = 0 (a corrupted prediction costs performance, not correctness)
ACE (architecturally correct execution) bits: bits in which an error damages architectural state
     Use Little's law to calculate the average number of ACE bits (conservative) for a program
     Sources of unACE bits
          no-ops
          performance-enhancing instructions
          predicated-false instructions
          dynamically dead code
          logically masked values
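
As a toy illustration of the last item, logical masking, the sketch below flips each bit of an 8-bit operand and checks whether the result of an AND with a fixed mask changes. The operand, the mask, and the helper names are assumptions chosen for illustration. Real AVF studies classify bits inside a performance model rather than by exhaustive injection on a single operation.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with the given bit inverted (a single-event upset)."""
    return value ^ (1 << bit)


def masked_and(operand: int, mask: int) -> int:
    """The operation under study: only bits selected by mask reach the output."""
    return operand & mask


WIDTH = 8                 # hypothetical operand width in bits
OPERAND = 0b1011_0110     # hypothetical value held in a register
MASK = 0b0000_1111        # the upper four bits are logically masked

golden = masked_and(OPERAND, MASK)  # fault-free result

# A bit counts as ACE here if flipping it changes the output, unACE otherwise.
ace_bits = [b for b in range(WIDTH)
            if masked_and(flip_bit(OPERAND, b), MASK) != golden]
unace_bits = [b for b in range(WIDTH) if b not in ace_bits]

print("ACE bits:  ", ace_bits)
print("unACE bits:", unace_bits)
print("fraction of ACE bits:", len(ace_bits) / WIDTH)
```

Only the four low-order bits can change the result, so half of the operand's bits are unACE for this one operation.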

We translate Little’s law as N = B × L, where N is the average number of bits in a processor structure, B is the average bandwidth of bits per cycle into the structure, and L is the average residence time of an individual bit in the structure.
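
A minimal sketch of how this form of Little's law could yield an AVF, assuming the AVF of a structure is taken as the average fraction of its bits that are ACE. B and L are restricted to ACE bits to get the average number of ACE bits resident in the structure, and that count is divided by the structure's total size. The function names and the instruction-queue numbers are hypothetical and do not come from the source.

```python
def average_resident_bits(bandwidth_bits_per_cycle: float,
                          residence_cycles: float) -> float:
    """Little's law, N = B * L: average number of bits resident in a structure."""
    return bandwidth_bits_per_cycle * residence_cycles


def avf(ace_bandwidth_bits_per_cycle: float,
        ace_residence_cycles: float,
        total_bits: int) -> float:
    """Average fraction of a structure's bits that are ACE (assumed AVF definition)."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    n_ace = average_resident_bits(ace_bandwidth_bits_per_cycle,
                                  ace_residence_cycles)
    if n_ace > total_bits:
        raise ValueError("ACE occupancy exceeds the structure's capacity")
    return n_ace / total_bits


# Hypothetical instruction queue: 64 entries of 100 bits each.
TOTAL_BITS = 64 * 100
ACE_BANDWIDTH = 200.0   # assumed ACE bits entering the queue per cycle
ACE_RESIDENCE = 8.0     # assumed average cycles an ACE bit stays in the queue

queue_avf = avf(ACE_BANDWIDTH, ACE_RESIDENCE, TOTAL_BITS)
print("average ACE bits:", average_resident_bits(ACE_BANDWIDTH, ACE_RESIDENCE))
print("AVF:", queue_avf)
```

With these made-up inputs the queue holds 1600 ACE bits on average out of 6400, giving an AVF of 0.25.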
Find AVFs using performance modeling and SPEC benchmarks