CS/ECE 752 Advanced Computer Architecture I

Spring 2007
Instructor: Karu Sankaralingam; URL: http://www.cs.wisc.edu/~karu
Meeting time: MECH ENGR 1143, 01:00 PM - 02:15 PM, MWF
Office hours: Monday, Wednesday 3-4pm; Thursday 11am-12pm
TA: Derek Hower
Course URL: http://www.cs.wisc.edu/~karu/courses/cs752/Spring2007/
Mailing list: compsci752-1-s07@lists.wisc.edu


Reader 1

This reader may still change slightly, and, if so, I will send email notice of changes. Links need to be verified.

H&P is John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, Fourth Edition, 2006.

HJ&S is Mark D. Hill, Norman P. Jouppi, and Gurindar S. Sohi, Readings in Computer Architecture, Morgan Kaufmann Publishers, 2000.

Technology, Cost, Performance, Power, etc.

  • H&P Chapter 1.
  • Gordon E. Moore, Cramming More Components onto Integrated Circuits, Electronics, April 1965. Reprinted in HJ&S pp. 56-59.
  • Ethan Mollick, MIT Sloan School of Management, Establishing Moore's Law, IEEE Annals of the History of Computing, July-September 2006 (Vol. 28, No. 3), pp. 62-75. IEEE Xplore link.
  • Standard Performance Evaluation Corporation (SPEC). URL: http://www.specbench.org/. Read the "run and reporting rules" for SPEC CPU2006 and SPEC jAppServer2004. You may skim the rest of the web site.
  • Transaction Processing Performance Council (TPC). URL: http://www.tpc.org (reference).
  • ITRS Roadmap -- Executive Summary. Go to URL http://www.itrs.net/Links/2005ITRS/Home2005.htm and click on Executive Summary (89 pages). Read the Introduction (pp. 1-10) and flip through Grand Challenges (pp. 11-18), reading at least the titles. Study the trends in the Overall Roadmap Technology Characteristics (pp. 59-85). Don't try to memorize the tables; rather, identify key facts and trends, such as "what is the overall scaling trend?" and "what is the target yield range for volume production?"
  • David A. Patterson, Latency Lags Bandwidth, Communications of the ACM, October 2004. Online PDF for University of Wisconsin only.

Instruction Sets

  • H&P Chapter 2.
  • William A. Wulf. Compilers and Computer Architecture, IEEE Computer, July 1981. Reprinted in HJ&S pp. 119-125.
  • Robert Colwell et al. Instruction Sets and Beyond: Computers, Complexity, and Controversy. IEEE Computer, September 1985. Reprinted in HJ&S pp. 144-155.
  • J. S. Emer and D. W. Clark. A Characterization of Processor Performance in the VAX-11/780, ISCA 1984. Reprinted in HJ&S pp. 101-110.
  • B. Sprunt. Pentium 4 performance-monitoring features, IEEE Micro, July 2002. IEEE Xplore link.
  • Burger et al. Scaling to the End of Silicon with EDGE Architectures. IEEE Computer, July 2004. PDF download.
  • IA-32 Intel(R) Architecture Software Developer's Manual, Volume 1: Basic Architecture. Online PDF for University of Wisconsin only, 476 pages (reference).


  • H&P Appendix A reviews pipeline background material from CS/ECE 552.
  • T-Y. Yeh and Y. Patt. Two-Level Adaptive Training Branch Prediction, MICRO 1991. Reprinted in HJ&S pp. 228-237.
  • Seznec, A., Felix, S., Krishnan, V., and Sazeides, Y. Design Tradeoffs for the Alpha EV8 Conditional Branch Predictor. ISCA 2002. IEEE Xplore link
  • Tse-Yu Yeh and Patt, Y.N. Alternative Implementations of Two-Level Adaptive Branch Prediction. ISCA 1992. IEEE Xplore link
  • Dan Ernst, et al., Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation, MICRO 2003. PDF download
  • Hrishikesh et al., The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays, ISCA 2002. PDF download
  • Hartstein and Puzak, Optimum Power/Performance Pipeline Depth, MICRO 2003. PDF download

Multiple Issue and Static Scheduling

  • H&P Chapter 4.1, 4.2, 4.3, 4.5, and 4.7
  • Joseph A. Fisher. Very Long Instruction Word Architectures and the ELI-512. ACM download link, and a 2005 retrospective.
  • C. McNairy and D. Soltis, Itanium 2 Processor Microarchitecture, IEEE Micro, Mar-Apr 2003, pp. 44-55. IEEE Xplore link
  • Intel(R) Itanium(R) Architecture Software Developer's Manual, URL: http://www.intel.com/design/itanium/manuals/iiasdmanual.htm. See especially Volume 1 (Application Architecture), Chapter 4 (Application Programming Model), 36 pages (reference).
  • R. Rau and J.A. Fisher, "Instruction-level parallel processing: History, overview, and perspective," The Journal of Supercomputing, 7(1):9-50, 1993. Reprinted in HJ&S pp. 288-308.

Dynamic Scheduling I and II

  • H&P Chapter 3 (but it is too long)
  • J. E. Smith and A. R. Pleszkun. Implementing Precise Interrupts in Pipelined Processors, IEEE Trans. on Computers, May 1988. Reprinted in HJ&S pp. 202-213.
  • Kenneth C. Yeager. The MIPS R10000 Superscalar Microprocessor, IEEE Micro, April 1996. Reprinted in HJ&S pp. 275-287.
  • D. Papworth. Tuning the Pentium Pro Microarchitecture, IEEE Micro, April 1996. Reprinted in HJ&S pp. 660-667.
  • Robert E. Kessler. The Alpha 21264 Microprocessor, IEEE Micro, March/April 1999, (Vol. 19, No. 2), pp. 24-36. IEEE Xplore link
  • Simcha Gochman, Ronny Ronen, Ittai Anati, Ariel Berkovits, Tsvika Kurts, Alon Naveh, Ali Saeed, Zeev Sperber, Robert C. Valentine, The Intel(R) Pentium(R) M Processor: Microarchitecture and Performance, Intel Technology Journal, May 2003. PDF download
  • Gurindar S. Sohi and S. Vajapeyam. Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers, ISCA 1987. Reprinted in HJ&S pp. 244-251.
  • E. Borch, E. Tune, S. Manne, and J. Emer, Loose Loops Sink Chips, HPCA 2002. IEEE Xplore link
  • Nagarajan et al., A Design Space Exploration of Grid Processor Architectures, MICRO 2001. PDF download
  • Sethumadhavan et al., Scalable Hardware Memory Disambiguation for High-ILP Processors, MICRO 2003. PDF download