Summary of SPARC Architectures
- hyperSPARC
- 2-way superscalar
- 1 ALU
- 1 LSU
- 1 FPU adder and 1 FPU multiplier, but only 1 floating
point instruction can be issued per cycle.
- ALU instructions produce a value that can be used
in the next cycle.
- The SETHI instruction produces a value which can be used
by the following instruction in the same cycle.
- Stores use the LSU for 2 cycles.
- Loads use the LSU for one cycle but the loaded value
is available in the cycle after next.
- The FPU adder typically takes 3 cycles to complete.
- Other FPU operations may take considerably longer.
- SuperSPARC
- 3-way superscalar
- 2 ALU
- 1 FPU adder and 1 FPU multiplier, but only 1 floating
point instruction can be issued per cycle.
- ALU instructions produce a value that can be used by a
dependent instruction in the same cycle, via a cascaded
execution scheme.
- Loaded values are available in the next cycle.
- The FPU adder typically takes 3 cycles to complete.
- Other FPU operations may take considerably longer.
- UltraSPARC
- 4-way superscalar
- 2 ALU
- 1 LSU
- 2 FPU (can issue 2 float instructions per cycle)
- Non-blocking loads allow execution to continue even
when loads miss in the first-level cache. The second-level
cache latency is ~7 cycles.
- Uses branch prediction, and can include instructions from
a predicted branch target in an earlier launch group.
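
The figures above can be collected into a small lookup table, which is
roughly how a scheduler might consume them. The following is only a
sketch: the names RESULT_LATENCY, ISSUE_WIDTH, earliest_use_cycle and
min_issue_cycles are hypothetical, and treating "takes 3 cycles to
complete" as a result latency of 3 is an assumption. UltraSPARC result
latencies are not given in these notes, so only its issue width appears.

```python
# Result latencies in cycles, transcribed from the notes above.
# 0 means a dependent instruction can use the value in the same cycle.
# Assumption: "FPU adder takes 3 cycles" is modeled as a latency of 3.
RESULT_LATENCY = {
    "hyperSPARC": {"alu": 1, "sethi": 0, "load": 2, "fp_add": 3},
    "SuperSPARC": {"alu": 0, "load": 1, "fp_add": 3},
}

# Maximum number of instructions issued per cycle.
ISSUE_WIDTH = {"hyperSPARC": 2, "SuperSPARC": 3, "UltraSPARC": 4}


def earliest_use_cycle(cpu, producer_kind, issue_cycle):
    """Return the first cycle in which a consumer can read the result
    of a producer of the given kind issued at issue_cycle."""
    return issue_cycle + RESULT_LATENCY[cpu][producer_kind]


def min_issue_cycles(cpu, n_instructions):
    """Lower bound on cycles needed to issue n independent instructions.
    Ignores per-unit limits (ALU, LSU, FPU counts) and dependencies."""
    width = ISSUE_WIDTH[cpu]
    return -(-n_instructions // width)  # ceiling division
```

For example, earliest_use_cycle("hyperSPARC", "load", 0) reflects the
"cycle after next" rule for hyperSPARC loads, while the same call for
SuperSPARC reflects its next-cycle availability. The issue-width bound
is optimistic, since real launch groups are also limited by the unit
counts listed above.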