Lecture slides will be made available a week or so in advance.
Date
| Topic
| Reading
| Reviews
|
6-Sep
| Intro, What is Computer Architecture, Technology Trends, Performance
|
|
|
8-Sep
| ISA overview
| Instruction Sets and Beyond: Computers, Complexity, and Controversy. IEEE Computer, September 1985.
|
|
13-Sep
| Microarch basics
| CH1. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Power Struggles: Revisiting the RISC vs. CISC Debate on Contemporary ARM and x86 Architectures, HPCA 2013
|
15-Sep
| Pin demo/Chisel demo
| Chisel tutorial
| The Optimum Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays, ISCA 2002
|
20-Sep
| Multi-issue, OOO execution
| CH3, CH4. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Cramming More Components onto Integrated Circuits, Electronics, April 1965
|
22-Sep
| gem5 demo 1
| none
| Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers
|
27-Sep
| Branch Prediction
| CH5, CH6, CH7, CH8. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Architectural Simulators Considered Harmful
|
29-Sep
| gem5 demo 2
| A Graph-Based Program Representation for Analyzing Hardware Specialization Approaches. Computer Architecture Letters, 10, 2015
| Alternative Implementations of Two-Level Adaptive Branch Prediction
|
4-Oct
| Memory Disambiguation
| Dynamic speculation and synchronization of data dependences
| Dynamic Branch Prediction with Perceptrons
|
6-Oct
| Caches, multi-level caches, NUCA
| CH2. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Fire-and-Forget: Load/Store Scheduling with No Store Queue at All
|
11-Oct
| Caches, multi-level caches, NUCA
| CH1, CH2, CH4. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| The MIPS R10000 Superscalar Microprocessor
|
13-Oct
| OOO policies
| CH5. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Executing a program on the MIT tagged-token dataflow architecture
|
18-Oct
| OOO policies
| CH6. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| A Characterization of Processor Performance in the VAX-11/780
|
20-Oct
| OOO policies
| CH7. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Improving the Energy Efficiency of Big Cores
|
25-Oct
| OOO policies
| CH8. Synthesis Lecture: Processor Microarchitecture: An Implementation Perspective (Access from UW-network only)
| Triggered Instructions: A Control Paradigm for Spatially-Programmed Architectures
|
27-Oct
| zsim demo, McPAT demo
| ZSim: fast and accurate microarchitectural simulation of thousand-core systems
| Analyzing Behavior Specialized Acceleration, ASPLOS 2016
|
1-Nov
| Prefetching (L1, L2)
| Synthesis Lecture: A Primer on Prefetching
| Efficient Embedded Computing
|
3-Nov
| Virtual memory/TLB
| none
| A Prefetch Taxonomy
|
8-Nov
| Virtual memory/TLB
| none
| Virtual Memory in Contemporary Microprocessors, IEEE Micro, vol. 18, no. 4, 1998
|
10-Nov
| DRAM/memory controllers
| The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It
| Agile Paging: Exceeding the Best of Nested and Shadow Paging
|
15-Nov
| Power analysis
| CH1, CH2: Synthesis Lecture on Power
| Fundamental Latency Trade-offs in Architecting DRAM Caches
|
17-Nov
| DRAMSim, Performance counters demo
| none
| McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures
|
22-Nov
| Vector processing/SIMD
| none
|
|
24-Nov
| Thanksgiving
|
|
|
29-Nov
| GPU
| GPUs and the Future of Parallel Computing
| The Cray-1 Computer System, Communications of the ACM, January 1978
|
1-Dec
| VLIW
| Very Long Instruction Word architectures and the ELI-512
| Many-core vs. many-thread machines: Stay away from the valley
|
6-Dec
| Accelerators 1
| Understanding sources of inefficiency in general-purpose chips
| Pushing the Limits of Accelerator Efficiency While Retaining General-Purpose Programmability
|
8-Dec
| Accelerators 2
|
| ASIC Clouds: Specializing the Datacenter
|
13-Dec
| Project Presentations
|
|
|
15-Dec
| Project Presentations
|
|
|