UW-Madison
Computer Sciences Dept.

CS/ECE 552 Introduction to Computer Architecture Spring 2010 Section 1
Instructor David A. Wood and T. A. Tony Nowatzki
URL: http://www.cs.wisc.edu/~david/courses/cs552/S10/

Suggested Project Stages and Implementation

These stages are not necessarily of equal weight; some may take considerably longer than others.

The overall implementation and verification plan for the project is the following:

  • Read the ISA Specification

  • Read the Microarchitecture Specification

  • Reuse homework Verilog modules as much as possible

  • Develop pencil-paper schematics for your processor design

  • Specify the design in Verilog, following the suggested stages below

  • Verify your design.

1.  Stage One - Unpipelined -> Must demonstrate this in demo 1

To start, you should do a single-cycle, non-pipelined implementation. Figure 4.24 on page 329 of the COD fourth edition is a good place to start.

For this stage, use the single-cycle perfect memory. Since you will need to fetch an instruction as well as read or write data in the same cycle, use two memories -- one for instructions and one for data.

2.  Stage Two - Pipelined -> Must demonstrate this in demo 2

After you have completed the single cycle implementation, you will next implement a pipelined version of the architecture. A good starting point is Figure 4.65 on page 384 of COD fourth edition. Continue to use the single cycle memory.

Be sure that the non-pipelined version is functional before you try the pipelined version. While designing the non-pipelined version, make design choices that will allow easy conversion to the pipelined version.

3.  Stage Three - Memory Design

The next few steps will improve our memory and make it more realistic.

3.1  Aligned memory

At this step, replace the original single-cycle memory with the Aligned single cycle memory. This is a very similar module, but it has an "err" output that is generated on unaligned memory accesses. Your processor should halt when an error occurs. Verify your design.
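The alignment check can be illustrated with a small behavioral model in Python. This is only a sketch, not the provided Verilog module: the class name, the 16-bit word size, byte addressing, and big-endian byte order are all assumptions made for illustration.

```python
class AlignedMemory:
    """Behavioral sketch of a memory that flags unaligned word accesses."""

    WORD_BYTES = 2  # assumption: 16-bit words on a byte-addressed memory

    def __init__(self):
        self.store = {}  # byte address -> byte value

    def access(self, addr, data_in=0, wr=False):
        """Return (data_out, err). err is True for an unaligned address."""
        if addr % self.WORD_BYTES != 0:
            return 0, True  # unaligned: no read or write is performed
        if wr:
            # assumption: big-endian byte order within a word
            self.store[addr] = (data_in >> 8) & 0xFF
            self.store[addr + 1] = data_in & 0xFF
            return 0, False
        high = self.store.get(addr, 0)
        low = self.store.get(addr + 1, 0)
        return (high << 8) | low, False


def run_until_halt(mem, addresses):
    """Issue reads in order and stop at the first err, as the processor should.

    Returns the number of accesses that completed before the halt.
    """
    completed = 0
    for addr in addresses:
        _, err = mem.access(addr)
        if err:
            break  # the processor halts here
        completed += 1
    return completed
```

In a real design the "err" signal would feed the same halt logic as the HALT instruction; how you wire that is a design choice the sketch does not address.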

3.2  Stalling memory

At this step, replace the single-cycle memory with the stalling memory. This is a very similar module, but it has a "ready" output. At arbitrary times, it will de-assert "ready" to indicate that valid read data is not available, or that write data has not been written. Your pipeline will need to stall to handle these conditions. Verify your design.

  • Instruction memory: First replace your instruction memory module with this stalling memory, and keep your data memory module the same (i.e. the aligned perfect memory from the previous step). Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Data memory: Now replace only your data memory module with this stalling memory, and revert your instruction memory module to the aligned perfect memory. Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Instruction and Data memory: Now change both instruction and data memories to the stalling memory design. Verify your design.
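The stall behavior for the instruction-memory case can be sketched in Python as follows. This is a behavioral illustration only: the class and function names are hypothetical, the repeating busy pattern stands in for the "arbitrary times" at which the real module de-asserts "ready", and the 2-byte instruction width is an assumption.

```python
class StallingMemory:
    """Behavioral sketch of a memory whose ready output drops on some cycles."""

    def __init__(self, contents, busy_pattern):
        self.contents = contents          # address -> word
        self.busy_pattern = busy_pattern  # True means "not ready" on that cycle
        self.cycle = 0

    def read(self, addr):
        """Return (data, ready) for the current cycle, then advance one cycle."""
        busy = self.busy_pattern[self.cycle % len(self.busy_pattern)]
        self.cycle += 1
        if busy:
            return None, False  # data is not valid this cycle
        return self.contents.get(addr, 0), True


def fetch(mem, start_pc, count, word_bytes=2):
    """Fetch `count` instructions, holding the PC whenever ready is low.

    The busy pattern must contain at least one False entry, or this never ends.
    Returns (fetched_words, cycles_taken).
    """
    pc = start_pc
    fetched = []
    cycles = 0
    while len(fetched) < count:
        data, ready = mem.read(pc)
        cycles += 1
        if ready:
            fetched.append(data)
            pc += word_bytes  # advance only when the fetch completed
        # otherwise stall: the PC keeps its value and the request is retried
    return fetched, cycles
```

The point to carry over to the pipeline is that a stalled cycle must not change any architectural state: the PC holds, and nothing that depends on the missing data advances.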

3.3  Four-banked memory

The simple memory module is still highly idealized. Real memory systems are pipelined and use multiple banks. You are provided with a four-banked memory module that models a simple DRAM controller.

I suggest that you do not interface this four-banked memory directly to your processor. Just use it to fetch blocks for your cache. This sub-step does not require verification. Make sure your design continues to compile. Proceed to stage 4.

4.  Stage Four - Direct-Mapped Cache

Replace your memory modules with cache modules. The cache module has a "stall" output, which takes the place of the "ready" output of the stalling memory. Here, however, you will need to implement a state machine to handle cache misses. On a miss, the previous contents of the cache line must be written back to memory if they are dirty, and the new line must then be loaded into the cache. The main memory takes multiple cycles to perform each access.

Again, follow an incremental approach, as you did for the stalling memory.

  • Instruction memory: First replace your instruction memory module with the mem_system module developed for HW5, and make your data memory module perfect (i.e. the aligned perfect memory from the earlier step). Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Data memory: Now replace only your data memory module with the mem_system module developed for HW5, and revert your instruction memory module to the perfect memory. Verify your design. This will be easier to debug, as only one module's behavior has changed.

  • Instruction and Data memory: Now change both instruction and data memories to the mem_system design. Verify your design.
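The miss handling described for this stage (write back the old line if dirty, then load the new line) can be sketched as a functional Python model. It is not cycle-accurate: the multi-cycle state machine is collapsed into a single method, and the class name, the dictionary backing store, the word addressing, and the cache geometry are all assumptions chosen for illustration.

```python
class DirectMappedCache:
    """Functional sketch of a direct-mapped, write-back cache."""

    def __init__(self, memory, num_lines=4, words_per_line=4):
        self.memory = memory  # hypothetical backing store: word address -> value
        self.num_lines = num_lines
        self.words_per_line = words_per_line
        self.lines = [
            {"valid": False, "dirty": False, "tag": 0, "data": [0] * words_per_line}
            for _ in range(num_lines)
        ]
        self.writebacks = 0
        self.fills = 0

    def _split(self, word_addr):
        """Split a word address into (tag, index, offset)."""
        offset = word_addr % self.words_per_line
        block = word_addr // self.words_per_line
        return block // self.num_lines, block % self.num_lines, offset

    def _base(self, tag, index):
        """First word address of the block identified by (tag, index)."""
        return (tag * self.num_lines + index) * self.words_per_line

    def _handle_miss(self, tag, index):
        line = self.lines[index]
        if line["valid"] and line["dirty"]:
            # Step 1: write the evicted line back to memory.
            old_base = self._base(line["tag"], index)
            for i, word in enumerate(line["data"]):
                self.memory[old_base + i] = word
            self.writebacks += 1
        # Step 2: load the requested line from memory.
        new_base = self._base(tag, index)
        line["data"] = [self.memory.get(new_base + i, 0)
                        for i in range(self.words_per_line)]
        line.update(valid=True, dirty=False, tag=tag)
        self.fills += 1

    def access(self, word_addr, data_in=None):
        """Read (data_in is None) or write one word. Returns (data, hit)."""
        tag, index, offset = self._split(word_addr)
        line = self.lines[index]
        hit = line["valid"] and line["tag"] == tag
        if not hit:
            self._handle_miss(tag, index)  # the processor would stall here
        if data_in is not None:
            line["data"][offset] = data_in
            line["dirty"] = True
        return line["data"][offset], hit
```

In hardware, each step of `_handle_miss` spans several cycles of main-memory access, and the "stall" output stays asserted until the fill completes.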

5.  Stage Five - Two-way Set-associative Cache

Add a second cache module alongside each of your existing cache modules, and implement a two-way set-associative memory. Use the 2-way set associative cache developed in homework 5.
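As a rough illustration of what changes with two ways, the sketch below models the tag lookup and victim choice for a single set in Python. The class name and the replacement policy (fill an invalid way first, otherwise alternate a victim bit) are assumptions for this example only; use whatever policy your homework 5 design specifies.

```python
class TwoWaySet:
    """Sketch of one set in a two-way set-associative cache (tags only)."""

    def __init__(self):
        self.valid = [False, False]
        self.tags = [None, None]
        self.victim = 0  # assumption: a victim bit that alternates between ways

    def lookup(self, tag):
        """Return the way that holds `tag`, or None on a miss."""
        for way in (0, 1):
            if self.valid[way] and self.tags[way] == tag:
                return way
        return None

    def fill(self, tag):
        """Install `tag` after a miss and return the way that was used."""
        if not self.valid[0]:
            way = 0
        elif not self.valid[1]:
            way = 1
        else:
            way = self.victim
            self.victim ^= 1  # the other way is evicted next time
        self.valid[way] = True
        self.tags[way] = tag
        return way
```

Both ways are checked on every access, so a hit in either way must suppress the miss state machine; only the choice of which line to evict is new relative to the direct-mapped case.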

6.  Stage Six - optimizations to design and exceptions -> Final demo

 