UW-Madison
Computer Sciences Dept.

CS/ECE 552 Introduction to Computer Architecture


Spring 2012 Section 1
Instructor: David A. Wood; TA: Ramkumar Ravikumar
URL: http://www.cs.wisc.edu/~david/courses/cs552/S12/

General instructions

  • The final demo requires a fully functional pipelined design with multi-cycle memory, plus any additional optimizations you have implemented.

  • Teams are expected to demonstrate the following features:

    1. Two-way set-associative caches with multi-cycle memory
    2. Register file bypassing
    3. Forwarding from beginning of the MEM stage to beginning of EX stage
    4. Forwarding from beginning of the WB stage to the beginning of the EX stage
    5. Branch prediction

How do I submit the code

  • Submit your code into the directory called demo3 that has already been created in your handin folder. For instructions, you can refer to the slides (handin cs552-1 demo3 "path_to_your_demo3_folder") here

What needs to be turned in electronically

  • All your Verilog files, including any primitives you have used in the code (Please do not leave out any files. Do not assume that I will be using files from your HW)

  • Vcheck outputs for all your *.v files (other than test-benches)

  • Synthesis results (area and other reports).

  • Log-files for all tests that you are supposed to run on your pipelined processor

Live demo and final report submission

  • All partners are required to be present on the day of demo and are expected to explain and answer questions about the whole design

  • The final report is also due on May 11. Instructions for the report can be found here

What tests do I need to run and how do I name the log-files

  • /p/course/cs552-david/public/html/S12/project/tests/public/552marks - Log-file name must be 552marks.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/complex_demofinal/all.list - Log-file name must be complex_demofinal.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_ldst/all.list - Log-file name must be rand_ldst.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_idcache/all.list - Log-file name must be rand_idcache.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_icache/all.list - Log-file name must be rand_icache.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_dcache/all.list - Log-file name must be rand_dcache.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/complex_demo2/all.list - Log-file name must be complex_demo2.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/complex_demo1/all.list - Log-file name must be complex_demo1.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_complex/all.list - Log-file name must be rand_complex.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_simple/all.list - Log-file name must be rand_simple.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_ctrl/all.list - Log-file name must be rand_ctrl.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/rand_mem/all.list - Log-file name must be rand_mem.summary.log

  • /p/course/cs552-david/public/html/S12/project/tests/public/inst_tests/all.list - Log-file name must be inst_tests.summary.log
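
The naming rule above follows a pattern: each log-file is named after the test suite directory, with ".summary.log" appended. A minimal sketch of that rule, assuming the paths are exactly as listed (the helper name summary_log_name is hypothetical and not part of the course tools):

```python
from pathlib import PurePosixPath

# Root of the public test suites, copied from the list above.
TEST_ROOT = "/p/course/cs552-david/public/html/S12/project/tests/public"


def summary_log_name(test_path):
    """Return the required log-file name for one entry in the test list.

    Hypothetical helper: entries ending in all.list are named after their
    parent directory; the 552marks entry is a directory and is named after
    itself.
    """
    p = PurePosixPath(test_path)
    suite = p.parent.name if p.name == "all.list" else p.name
    return f"{suite}.summary.log"


if __name__ == "__main__":
    for entry in ("552marks", "rand_ldst/all.list", "inst_tests/all.list"):
        print(summary_log_name(f"{TEST_ROOT}/{entry}"))
```

Checking your handin directory against the names this produces is one way to catch a misnamed log-file before submitting.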

What if my full pipeline+cache is not functional

  • If your design has known failures, bring a short written explanation to the demo for as many of the failures as you can track down

  • If your entire design does not work, you may demo a partially complete processor. It is in your best interest to save snapshots of the working parts of your design as you add more functionality

  • For example, if your full pipeline+cache does not work, you may show any one of the following:

    1. Stalling instruction memory alone
    2. Stalling data memory alone
    3. Stalling inst+data memory
    4. Direct-mapped instruction memory alone
    5. Direct-mapped data memory alone
    6. Direct-mapped inst+data memory
    7. 2-way instruction memory alone
    8. 2-way data memory alone
    9. 2-way inst+data memory

Suggested next steps

  • The main goal of demo3 is to improve the memory and make it more realistic

    Step 1: Aligned memory

    At this step, replace the original single-cycle memory with the aligned single-cycle memory. This is a very similar module, but it has an "err" output that is asserted on unaligned memory accesses. Your processor should halt when an error occurs. Verify your design.

  • To verify your design, use the -align parameter with the wsrun.pl or wsrun_mumble.pl command.

    Step 2: Stalling memory

    At this step, replace the single-cycle memory with the stalling memory. This is a very similar module, but it has stall and done signals similar to those of the cache you built. Your pipeline will need to stall to handle these conditions. Verify your design.

  • Instruction memory: First replace your instruction memory module with this stalling memory and keep your data memory module the same (i.e. the aligned perfect memory from the previous step). Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Data memory: Now replace your data memory module alone with this stalling memory and revert your instruction memory module to the aligned perfect memory. Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Instruction and data memory: Now change both instruction and data memories to the stalling memory design. Verify your design.

    Step 3: Four-banked memory

    The simple memory module is still highly idealized. Real memory systems are pipelined and use multiple banks. You are provided with a four-banked memory module that models a simple DRAM controller. Make sure that your design continues to compile after integrating this memory.

    Step 4: Direct-mapped cache (Sample)

    Replace your memory modules with the cache modules. Note that upon a miss, the previous contents of the cache line must be written back to memory if dirty, and the new line must be loaded into the cache. The main memory will take multiple cycles to perform each access.

  • Instruction memory: First replace your instruction memory module with the mem_system module developed for HW5 and make your data memory module perfect (i.e. the aligned perfect memory from the earlier steps). Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Data memory: Now replace your data memory module alone with the mem_system module developed for HW5 and revert your instruction memory module to the perfect memory. Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Instruction and data memory: Now change both instruction and data memories to the mem_system design. Verify your design.
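
The miss behavior described in this step (write back the old line if it is dirty, then load the new line) can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions, not the HW5 mem_system: the class name, the four-line size, the one-word-per-line layout, and the dictionary standing in for main memory are all hypothetical, and multi-cycle timing is not modeled.

```python
class DirectMappedCache:
    """Behavioral model of a direct-mapped, write-back cache (illustrative only)."""

    def __init__(self, num_lines=4, memory=None):
        self.num_lines = num_lines
        self.lines = [
            {"valid": False, "dirty": False, "tag": None, "data": None}
            for _ in range(num_lines)
        ]
        # Main memory is modeled as a dict: line address -> value (default 0).
        self.memory = memory if memory is not None else {}
        self.writebacks = 0

    def _lookup(self, line_addr):
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        line = self.lines[index]
        hit = line["valid"] and line["tag"] == tag
        if not hit:
            # On a miss, a dirty victim is written back before it is replaced.
            if line["valid"] and line["dirty"]:
                victim_addr = line["tag"] * self.num_lines + index
                self.memory[victim_addr] = line["data"]
                self.writebacks += 1
            # Then the requested line is loaded from memory.
            line.update(valid=True, dirty=False, tag=tag,
                        data=self.memory.get(line_addr, 0))
        return line, hit

    def read(self, line_addr):
        line, hit = self._lookup(line_addr)
        return line["data"], hit

    def write(self, line_addr, value):
        line, hit = self._lookup(line_addr)
        line["data"] = value
        line["dirty"] = True  # memory is updated only on eviction
        return hit


if __name__ == "__main__":
    cache = DirectMappedCache()
    cache.write(1, 10)   # miss, line 1 becomes dirty
    cache.read(5)        # conflicts with line 1, forcing a write-back
    print(cache.memory, cache.writebacks)
```

In the real design each of these miss actions takes multiple cycles, which is why the pipeline has to stall while the cache is busy.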

    Step 5: 2-way set-associative cache

    Use the 2-way set associative cache developed in homework 5.

    Step 6: Optimizations and Extra credit features

    Please see points 4-6 here

How will the processor performance be measured

  • Performance will be measured based on the 552marks tests. More information is available here

  • IMPORTANT: These results should be included in your final report

How do I get the execution time for the 552marks programs that is to be measured against the reference design

  • From your timing report, find the slack. For example, let's say the slack is -0.33 ns

  • Subtract this from the desired 1 ns cycle time, i.e., 1 - (-0.33) = 1 + 0.33 = 1.33 ns = your processor's cycle time

  • Now run any of the 552marks programs. Let's say you ran Primes.asm

  • At the end of the run, on success, you will see:

    /p/course/cs552-david/public/html/S12/project/tests/public/552marks/Primes.asm SUCCESS CPI:XX VALID_CPI:XX CYCLES:XXXXX ICOUNT:XXXXX VALID_ICOUNT:XXXXX IHITRATE: XXX DHITRATE: X

  • Let's assume you obtain CYCLES: 122369

  • Compute EXECUTION TIME = CYCLES * CYCLE-TIME = 122369 * 1.33 = 162750.77 ns = 162.75 us
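
The arithmetic in the steps above can be sketched in a few lines of Python. The function names are hypothetical, and the 1 ns target and the sample slack and cycle count are taken from the example. The instructions do not say how positive slack is treated; this sketch simply applies the same subtraction.

```python
def cycle_time_ns(slack_ns, target_ns=1.0):
    """Cycle time = desired clock period minus the slack from the timing report."""
    return target_ns - slack_ns


def execution_time_ns(cycles, slack_ns, target_ns=1.0):
    """Execution time = CYCLES * CYCLE-TIME, in nanoseconds."""
    return cycles * cycle_time_ns(slack_ns, target_ns)


if __name__ == "__main__":
    slack = -0.33      # example slack from the timing report, in ns
    cycles = 122369    # example CYCLES value from the Primes.asm run
    t_ns = execution_time_ns(cycles, slack)
    print(f"cycle time: {cycle_time_ns(slack):.2f} ns")
    print(f"execution time: {t_ns:.2f} ns = {t_ns / 1000:.2f} us")
```

Substitute the slack from your own timing report and the CYCLES value from each 552marks run to get the numbers for your final report.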

I have done something additional that I want the TA/Professor to know (and/or)

What should I do in case I have test failures

  • If either of these applies, please include a README.txt that briefly explains the issue

  • Additionally, please bring a copy of this file to the final demo

Are there any exceptions to the due date/time

  • The answer is NO. However, Professor Wood will make the final call on this. Please talk to him or let him know in case of any major issues

 