CS552 Course Wiki: Spring 2008
Homework 5
Due 04/24

1. Problem 1
Important note: you may work with your project partner on the first 2 problems ONLY. One electronic submission and one handwritten submission is OK. Clearly indicate in your homework if your partner's submission contains these problems. Completely abandoning responsibility for these problems to your partner is not allowed: you must do both problems JOINTLY. Having one partner do the direct-mapped cache while the other does the 2-way cache will severely hurt your productivity! Complete the direct-mapped cache first, then move on to the set-associative cache.

You will implement a hierarchical memory system in Verilog that consists of a level-1 write-back cache and a stalling memory. The system should use a direct-mapped cache and a four-banked, four-cycle memory. See the project modules provided page for the Cache module and the four-banked memory module. Blocks are 4 words wide, and the system is byte-addressable and word-aligned. The top-level module that you will develop is as follows.

Verilog template source for mem_system.v:
module mem_system(/*AUTOARG*/
   // Outputs
   DataOut, Done, Stall, CacheHit, err,
   // Inputs
   Addr, DataIn, Rd, Wr, createdump, clk, rst
   );

   input [15:0]  Addr;
   input [15:0]  DataIn;
   input         Rd;
   input         Wr;
   input         createdump;
   input         clk;
   input         rst;

   output [15:0] DataOut;
   output        Done;
   output        Stall;
   output        CacheHit;
   output        err;

   /* data_mem = 1, inst_mem = 0 *
    * needed for cache parameter */
   parameter mem_type = 0;

   // your code here

endmodule // mem_system
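One way to begin the "your code here" region is with a controller state machine. The following is a minimal sketch of a Moore-style skeleton, not a solution. The module name, the state names, and the req, hit, dirty, and mem_busy inputs are hypothetical and do not reflect the actual interface of the provided cache or memory modules. It also uses behavioral always blocks for brevity, which the course coding rules may not allow for state elements.

```verilog
// Hypothetical Moore-style controller skeleton (illustration only).
// All port and state names here are assumptions, not the course interface.
module cache_ctrl_sketch(
    input  wire clk,
    input  wire rst,
    input  wire req,       // assumed: Rd | Wr from the requester
    input  wire hit,       // assumed: tag match on a valid line
    input  wire dirty,     // assumed: victim line needs a writeback
    input  wire mem_busy,  // assumed: memory has not finished yet
    output wire done,
    output wire stall
);
    localparam [2:0] IDLE      = 3'd0,
                     COMPARE   = 3'd1,
                     WRITEBACK = 3'd2,
                     REFILL    = 3'd3,
                     DONE      = 3'd4;

    reg [2:0] state, next_state;

    // State register with synchronous reset.
    always @(posedge clk) begin
        if (rst) state <= IDLE;
        else     state <= next_state;
    end

    // Next-state logic; defaults to holding the current state.
    always @(*) begin
        next_state = state;
        case (state)
            IDLE:      if (req) next_state = COMPARE;
            COMPARE: begin
                if (hit)        next_state = DONE;
                else if (dirty) next_state = WRITEBACK;
                else            next_state = REFILL;
            end
            WRITEBACK: if (!mem_busy) next_state = REFILL;
            REFILL:    if (!mem_busy) next_state = COMPARE;
            DONE:      next_state = IDLE;
            default:   next_state = IDLE;
        endcase
    end

    // Moore outputs: functions of the current state only.
    assign done  = (state == DONE);
    assign stall = (state != IDLE) && (state != DONE);
endmodule
```

Because the outputs depend only on the state, reporting completion needs its own DONE state here. A Mealy version could assert done directly from COMPARE on a hit, at the cost of outputs that depend on inputs. A real controller for this assignment will need more states than this sketch, for example to sequence the individual words of a block.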
Provided: a top-level module and two testbenches with a reference memory module.
Important Notes:
To complete this problem, you will need to determine how the internal components are arranged and create a cache controller FSM. See the description of the cache module for hints on how this should be done. You can choose to implement either a Mealy or a Moore machine, although I recommend a Moore machine as it will likely be easier to create. Be forewarned that the resulting state machine will be relatively large, so get started early.
The testing for this module should be extensive. You will need to verify that the design works correctly during hits, misses, writebacks, and refills. Also be sure to check the design under various main memory stall conditions.
For extra credit, you can improve performance by adding a two-entry store buffer so that writes can complete in one cycle. The extra credit will not count if the standard system is not working, so be sure to test your design thoroughly before moving on.
For this design, as usual:
Instantiating the cache modules
This is the methodology I suggest for instantiating your cache modules, so that the naming conventions are the same for everyone. Somewhere in mem_system.v:
cache0 (0 + memtype) c0(....)
This will guarantee that when this module is finally connected to your processor, your instruction and data memories will create separate dump files. See the Cache module section for details.
Verification
Verification is an important part of this problem and a significant challenge. You are provided with two testbenches:
After every set of 1000 requests, you will see a message like the following:
To run this testbench: wsrun.pl mem_system_randbench *.v
You must write different address traces to test your module and prove that it implements the cache correctly. Determining what to test and show is an important part of this problem. Carefully document and show in your homework which cases you are testing. Pick representative inputs from this testbench by examining the waveforms. You must hand in annotated waveforms to prove that your design works correctly during hits, misses, writebacks, and refills.
To run this testbench: wsrun.pl mem_system_perfbench *.v
What to submit:
2. Problem 2
Implement your 2-way set-associative cache, which is required for the project. See the Cache module page. Replace the direct-mapped cache from the problem above with this 2-way set-associative cache.
Instantiating the cache modules
This is the methodology I suggest for instantiating your cache modules, so that the naming conventions are the same for everyone. Somewhere in mem_system.v:
cache0 (0 + memtype) c0(....)
cache1 (2 + memtype) c1(....)
This will guarantee that when this module is finally connected to your processor, your instruction and data memories will create separate dump files. See the Cache module section for details.
Parameter Value    File Names
---------------    ----------
0                  Icache_0_data_0, Icache_0_data_1, Icache_0_tags, ...
1                  Dcache_0_data_0, Dcache_0_data_1, Dcache_0_tags, ...
2                  Icache_1_data_0, Icache_1_data_1, Icache_1_tags, ...
3                  Dcache_1_data_0, Dcache_1_data_1, Dcache_1_tags, ...
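The parameter arithmetic behind this table can be illustrated with standard Verilog-2001 parameter override syntax. This is only a sketch under assumptions: cache_stub is a hypothetical stand-in for the provided cache module, cache_id is an invented parameter name, and the wrapper modules exist just to make the example self-contained. The real module's ports and parameter are described on the Cache module page.

```verilog
// Hypothetical stand-in for the provided cache module (illustration only).
module cache_stub #(parameter cache_id = 0) ();
    // Report which parameter value this instance received.
    initial $display("%m: cache_id = %0d", cache_id);
endmodule

// Sketch of the instantiation pattern inside a memory system.
module mem_system_sketch;
    parameter mem_type = 0;  // data_mem = 1, inst_mem = 0 (as in the template)

    cache_stub #(0 + mem_type) c0 ();  // first cache:  parameter 0 or 1
    cache_stub #(2 + mem_type) c1 ();  // second cache: parameter 2 or 3
endmodule

// One instruction-side and one data-side memory system.
module top;
    mem_system_sketch #(0) imem ();  // c0 gets 0, c1 gets 2
    mem_system_sketch #(1) dmem ();  // c0 gets 1, c1 gets 3
endmodule
```

With this pattern, the instruction-side instance receives the even parameter values and the data-side instance the odd ones, so the four caches map to four distinct rows of the table and therefore to distinct dump files.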
What to submit
3. Problem 3
Consider a direct-mapped cache with 32-byte blocks and a total capacity of 512 bytes in a system with a 32-bit address space.
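As a reminder of how a direct-mapped cache divides an address, the relations below may help. The symbols are notation introduced here, not part of the assignment: $C$ is the total capacity in bytes, $B$ the block size in bytes, and $A$ the address width in bits.

```latex
\begin{align*}
\text{number of blocks} &= C / B \\
\text{offset bits}      &= \log_2 B \\
\text{index bits}       &= \log_2 (C / B) \\
\text{tag bits}         &= A - \log_2 (C / B) - \log_2 B
\end{align*}
```

As a worked example with deliberately different, hypothetical parameters ($C = 1024$, $B = 64$, $A = 32$): there are $1024 / 64 = 16$ blocks, so the address splits into 6 offset bits, 4 index bits, and $32 - 4 - 6 = 22$ tag bits.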
0x0000a796 0x000092e8 0x000092f4 0x00004182 0x0000780a 0x0000a690 0x0000408e 0x0000a798 0x00007800 0x000092fc 0x00027c02 0x0000408a 0x00004198 0x00006710 0x0000670c 0x00027c04 0x0000a790
4. Problem 4
Re-do problem 3, but using a two-way set-associative cache. When replacing a block, the least-recently-used block is chosen to be replaced. Everything else (block size and total capacity) remains the same. Determine the speedup over the direct-mapped cache in problem 3. Assume both caches can be accessed in 1 cycle, that the CPI without misses is 1.0, and that the miss penalty is 25 cycles.
5. Problem 5
Consider a cache with the following characteristics:
6. Problem 6 (zero points)
Sun's OpenSPARC chip design is available as open-source Verilog. For this problem you will browse through this design to get a sense of what real industry designs look like. The source code is available here: http://opensparc-t1.sunsource.net/nonav/source/verilog/html/verilog.html
Click on the Hierarchy for cmp_top and navigate through the hierarchy down to the processor core.
cmp_top (top level)
  OpenSPARCT1 (chip main)
    sparc (processor core, 8 instances)
      sparc_exu, sparc_ifu, spu etc. (individual modules inside processor core)
I specifically recommend taking a look at the sparc_ifu and sparc_exu units. You will find similarities to your project design. Look for how the pipelining has been implemented. Also notice the clean separation between modules, the well-defined interfaces, and the hierarchy, as well as the separation between control path and data path. You do not need to turn in anything for this problem. This is an exercise to familiarize you with industrial-strength designs.
Page last modified on April 29, 2008