UW-Madison
Computer Sciences Dept.

CS/ECE 552 Introduction to Computer Architecture


Spring 2012 Section 1
Instructor: David A. Wood; TA: Ramkumar Ravikumar
URL: http://www.cs.wisc.edu/~david/courses/cs552/S12/

Problem Set #3

Due: Wed, Mar. 7, 2012
Approximate Weight : 15% of homework grade

You should do this assignment in your project groups.

You can find the PDF copies of Discussion session 5 and Discussion session 6.

Problem 1 (10 points)

Consider a single-cycle computer design such as the one in Figure 4.15 of cod4e (page 320). Assume a MIPS-like instruction set (32-bit). Suppose you had just completed such a design, and now the compiler group has come to you with a small list of additional instructions they would like you to add. How would you respond? Order the list from easiest to most difficult to add, based on the number of things that would have to change in the datapath; briefly indicate for each one what those changes would be. 

  1. "Split Register": This instruction would read an operand from $rs and move its lower half to $rd with the upper half set to zero. It would also take the upper half of $rs, shift it right 16 bits (with zero fill), and write it to $rt.

  2. "Bit Equal": This instruction does a bit-for-bit compare between two registers. For each bit i, if bit i of $rs is equal to bit i of $rt, set bit i of $rd to one; otherwise set bit i of $rd to zero.

  3. "Replace Under Mask": This I-Format instruction uses the 16-bit sign-extended immediate to select which bits of $rt should be replaced with the corresponding bits of $rs. For each bit of the sign-extended immediate that is a one, the result comes from the corresponding bit of $rs; for each bit that is a zero, the result comes from $rt. The result is written to $rt.
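
The results each instruction is supposed to produce can be written down as combinational expressions. The sketch below only pins down the semantics described above; it says nothing about which datapath changes are needed, which is the actual question. The module and port names (new_instr_ref, split_rd, and so on) are hypothetical.

```verilog
// Combinational reference for the three proposed instructions.
// All names here are hypothetical; this is not a datapath design.
module new_instr_ref (split_rd, split_rt, biteq_rd, rum_rt, rs, rt, imm);
    output [31:0] split_rd;  // Split Register: value written to $rd
    output [31:0] split_rt;  // Split Register: value written to $rt
    output [31:0] biteq_rd;  // Bit Equal: value written to $rd
    output [31:0] rum_rt;    // Replace Under Mask: value written to $rt
    input  [31:0] rs, rt;    // values read from $rs and $rt
    input  [15:0] imm;       // 16-bit immediate field (I-format)

    // Sign-extend the immediate to form the 32-bit mask.
    wire [31:0] mask = {{16{imm[15]}}, imm};

    assign split_rd = {16'h0000, rs[15:0]};        // lower half, upper half zero
    assign split_rt = {16'h0000, rs[31:16]};       // upper half shifted right 16, zero fill
    assign biteq_rd = ~(rs ^ rt);                  // bitwise XNOR: 1 where the bits match
    assign rum_rt   = (rs & mask) | (rt & ~mask);  // mask bit 1 takes rs, 0 keeps rt
endmodule

module new_instr_ref_tb;
    reg  [31:0] rs, rt;
    reg  [15:0] imm;
    wire [31:0] split_rd, split_rt, biteq_rd, rum_rt;

    new_instr_ref dut(split_rd, split_rt, biteq_rd, rum_rt, rs, rt, imm);

    initial begin
        rs = 32'h1234ABCD; rt = 32'h1234FFFF; imm = 16'h80F0;
        #1;
        // split_rd = 0000abcd, split_rt = 00001234
        $display("split_rd=%h split_rt=%h", split_rd, split_rt);
        // biteq_rd = ffffabcd, rum_rt = 1234ffcf (mask is ffff80f0)
        $display("biteq_rd=%h rum_rt=%h", biteq_rd, rum_rt);
    end
endmodule
```

Note that Split Register produces two results in one instruction, while the other two produce one each.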


Problem 2 (10 points)

For this problem, you need not use Mentor or Verilog; just draw the designs (neatly!) on paper.
First, design a 4-bit ripple-carry adder; as a building block use squares representing full adders. (You may use a printout from Homework 1.) 
Next, draw an 8-bit carry-select adder, using as building blocks 4-bit ripple-carry adders and 2:1 muxes.
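
Although this problem is done on paper, the structure of the carry-select adder can be sketched in Verilog as a cross-check for your drawing. This is a minimal sketch: the module names (full_adder, rca4, csa8) are hypothetical, and the full adder is written with plain Boolean equations rather than the gate design supplied for the delay calculation.

```verilog
// One-bit full adder written as equations (not the supplied gate design).
module full_adder (s, cout, a, b, cin);
    output s, cout;
    input  a, b, cin;
    assign s    = a ^ b ^ cin;
    assign cout = (a & b) | (a & cin) | (b & cin);
endmodule

// 4-bit ripple-carry adder: the carry ripples through four full adders.
module rca4 (s, cout, a, b, cin);
    output [3:0] s;
    output       cout;
    input  [3:0] a, b;
    input        cin;
    wire   [2:0] c;
    full_adder fa0(s[0], c[0], a[0], b[0], cin);
    full_adder fa1(s[1], c[1], a[1], b[1], c[0]);
    full_adder fa2(s[2], c[2], a[2], b[2], c[1]);
    full_adder fa3(s[3], cout, a[3], b[3], c[2]);
endmodule

// 8-bit carry-select adder: the upper half is computed twice, once for each
// possible carry-in, and the carry out of the lower half selects the result.
module csa8 (s, cout, a, b, cin);
    output [7:0] s;
    output       cout;
    input  [7:0] a, b;
    input        cin;
    wire         c4;            // carry out of bits 3:0, used as mux select
    wire   [3:0] s_hi0, s_hi1;  // upper sums for carry-in 0 and 1
    wire         c_hi0, c_hi1;
    rca4 lo (s[3:0], c4,    a[3:0], b[3:0], cin);
    rca4 hi0(s_hi0,  c_hi0, a[7:4], b[7:4], 1'b0);
    rca4 hi1(s_hi1,  c_hi1, a[7:4], b[7:4], 1'b1);
    assign s[7:4] = c4 ? s_hi1 : s_hi0;   // 2:1 muxes on the sum bits
    assign cout   = c4 ? c_hi1 : c_hi0;   // 2:1 mux on the carry out
endmodule

// Exhaustive check against the built-in adder.
module csa8_tb;
    reg  [7:0] a, b;
    reg        cin;
    wire [7:0] s;
    wire       cout;
    integer    i, errors;
    csa8 dut(s, cout, a, b, cin);
    initial begin
        errors = 0;
        for (i = 0; i < 131072; i = i + 1) begin
            {cin, a, b} = i[16:0];
            #1 if ({cout, s} !== a + b + cin) errors = errors + 1;
        end
        $display("mismatches: %0d", errors);
    end
endmodule
```

In this structure S5 comes out of a mux, so its arrival time depends both on the upper ripple adders and on the select signal c4 from the lower adder.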

Now, calculate the delay at output S5 of the carry-select adder. To do this, you will need to know the gate design of the full adder and the 2:1 mux; these are given here. You will also need to know the delay function: 
AND, OR, NAND, NOR, NOT: delay = (8 + n^2) τ
XOR: delay = (12 + n^2) τ
where n is the number of inputs to the gate, and τ is a time constant.
Assume that all inputs to your design are available at time zero. Calculate (and mark on your paper) the pertinent critical-path delays through the basic building blocks, through your ripple adder, and finally to S5 of the entire design.
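
As a quick check of the delay function, the values below evaluate it for small gate sizes. They assume the function reads (8 + n^2) τ and (12 + n^2) τ, with the 2 taken as an exponent. Which of these gates actually appear on the critical path depends on the supplied full adder and mux designs, which are not reproduced here.

```latex
\begin{align*}
\text{NOT } (n = 1) &: (8 + 1^2)\,\tau = 9\tau \\
\text{2-input AND, OR, NAND, NOR } (n = 2) &: (8 + 2^2)\,\tau = 12\tau \\
\text{3-input AND, OR, NAND, NOR } (n = 3) &: (8 + 3^2)\,\tau = 17\tau \\
\text{2-input XOR } (n = 2) &: (12 + 2^2)\,\tau = 16\tau \\
\text{3-input XOR } (n = 3) &: (12 + 3^2)\,\tau = 21\tau
\end{align*}
```

A path delay is the sum of the gate delays along that path; where several signals feed one gate, the latest-arriving input determines when the gate output settles.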


Problem 3 (30 points)

Using Verilog, design an 8-by-16-bit register file. See the Verilog interface below. It has one write port, two read ports, three register select inputs (two for read and one for write), a write enable, a reset, and a clock input. All register state changes occur on the rising edge of the clock. As always, your basic building block must be the D flip-flop given on the project page. The read ports should be all combinational logic. Do not use tri-state logic in your design.

Design this register file such that changing the width to 32-bit or 64-bit would be straightforward.


                         +--------------------+
                         |                    |
    read1regsel[2:0] >---|                    |----> read1data[15:0]
    read2regsel[2:0] >---|                    |
                         |                    |
    writeregsel[2:0] >---|                    |
     writedata[15:0] >---|                    |----> read2data[15:0]
                         |                    |
               write >---|                    |
                 clk >---|                    |
                 rst >---|                    |
                         +--------------------+


Use the following Verilog template: (download rf.v)

module rf (
           // Outputs
           read1data, read2data, err,
           // Inputs
           clk, rst, read1regsel, read2regsel, writeregsel, writedata, write
           );
   input clk, rst;
   input [2:0] read1regsel;
   input [2:0] read2regsel;
   input [2:0] writeregsel;
   input [15:0] writedata;
   input        write;

   output [15:0] read1data;
   output [15:0] read2data;
   output        err;

   // your code

endmodule

The read and write data ports are 16 bits each. The select inputs (read and write) are 3 bits each. When the write enable is asserted (high) the selected register will be written with the data from the write port. The write occurs on the next rising clock edge; write data cannot flow through to a read port during the same cycle.

There is no read enable; data from the selected registers will always appear on the corresponding read ports.

The reset signal is synchronous and when asserted (active high), resets all the register values to 0.
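
The read, write, and reset behavior described above can be captured in a small behavioral model for use as a golden reference in a testbench. This sketch is not a valid submission: it uses a reg array and an always block instead of the required D flip-flop, and it omits the err output. The names rf_model and rf_model_tb are hypothetical.

```verilog
// Behavioral golden model of the register file (testbench use only).
module rf_model (read1data, read2data, clk, rst,
                 read1regsel, read2regsel, writeregsel, writedata, write);
    parameter WIDTH = 16;
    output [WIDTH-1:0] read1data, read2data;
    input              clk, rst;
    input  [2:0]       read1regsel, read2regsel, writeregsel;
    input  [WIDTH-1:0] writedata;
    input              write;

    reg [WIDTH-1:0] regs [0:7];
    integer i;

    // Reads are combinational: no read enable and no bypass of writedata.
    assign read1data = regs[read1regsel];
    assign read2data = regs[read2regsel];

    // All state changes happen on the rising edge; reset is synchronous.
    always @(posedge clk) begin
        if (rst) begin
            for (i = 0; i < 8; i = i + 1)
                regs[i] <= {WIDTH{1'b0}};
        end
        else if (write) begin
            regs[writeregsel] <= writedata;
        end
    end
endmodule

module rf_model_tb;
    reg         clk, rst, write;
    reg  [2:0]  r1, r2, w;
    reg  [15:0] wd;
    wire [15:0] d1, d2;

    rf_model dut(d1, d2, clk, rst, r1, r2, w, wd, write);

    always #5 clk = ~clk;

    initial begin
        clk = 0; rst = 1; write = 0;
        r1 = 3; r2 = 3; w = 3; wd = 16'hAAAA;
        @(posedge clk); #1 rst = 0;   // one reset edge clears every register
        write = 1;                    // write r3 while both ports read r3
        #1 $display("before edge: %h %h", d1, d2);  // 0000 0000 (old value)
        @(posedge clk); #1 write = 0;
        $display("after edge:  %h %h", d1, d2);     // aaaa aaaa
        $finish;
    end
endmodule
```

The two $display calls show the point of the timing rule: during the cycle in which write is asserted the read ports still show the old contents, and the new value appears only after the next rising edge.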

You must use a hierarchical design. Design a 16-bit register first, and then put 8 of them together with additional logic to build the register file.
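
One possible shape for the register building block is sketched below, with the width as a parameter so that moving to 32 or 64 bits is a one-number change. This is a sketch under assumptions: dff_standin stands in for the course flip-flop, its port list (q, d, clk, rst) and synchronous reset are assumed rather than taken from dff.v, and the name reg_we is hypothetical. Whether parameters and arrays of instances are acceptable under the course Verilog rules is also an assumption to check.

```verilog
// Stand-in for the course flip-flop so the sketch compiles on its own.
// Port list and synchronous reset are assumptions; use the provided dff.v
// in a real design.
module dff_standin (q, d, clk, rst);
    output q;
    input  d, clk, rst;
    reg    state;
    assign q = state;
    always @(posedge clk) state <= rst ? 1'b0 : d;
endmodule

// WIDTH-bit register with write enable, one flip-flop per bit.
module reg_we (q, d, we, clk, rst);
    parameter WIDTH = 16;
    output [WIDTH-1:0] q;
    input  [WIDTH-1:0] d;
    input              we, clk, rst;

    // Hold the old value unless we is asserted.
    wire [WIDTH-1:0] next = we ? d : q;

    // Array of instances: bit i of q and next goes to flip-flop i.
    dff_standin bits [WIDTH-1:0] (.q(q), .d(next), .clk(clk), .rst(rst));
endmodule

module reg_we_tb;
    reg         clk, rst, we;
    reg  [15:0] d;
    wire [15:0] q;

    reg_we #(16) dut(q, d, we, clk, rst);

    always #5 clk = ~clk;

    initial begin
        clk = 0; rst = 1; we = 0; d = 16'h5555;
        @(posedge clk); #1 rst = 0; we = 1;      // reset, then enable a write
        @(posedge clk); #1 we = 0; d = 16'hAAAA; // 5555 captured; disable writes
        @(posedge clk); #1 $display("q=%h", q);  // q=5555, AAAA was not written
        $finish;
    end
endmodule
```

A register file would instantiate eight of these, drive each we input from a decode of writeregsel gated by write, and select the read data with two 8:1 multiplexers.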

  • Follow the Verilog file naming conventions for this design.

    • Use the clock / reset generator provided in the project modules page.

    • Connect it to your register file module (rf) in a top-level Verilog file called rf_hier.v. Doing this means that you do not need to use forces to create the clock and reset.

    • You must instantiate rf_hier and test it using a testbench or do files.

  • See template below: (download rf_hier.v)

module rf_hier (
   // Outputs
   read1data, read2data, 
   // Inputs
   read1regsel, read2regsel, writeregsel, writedata, write
   );
   input [2:0] read1regsel;
   input [2:0] read2regsel;
   input [2:0] writeregsel;
   input [15:0] writedata;
   input        write;

   output [15:0] read1data;
   output [15:0] read2data;

   wire clk, rst;

   // Ignore err for now
   clkrst clk_generator(.clk(clk), .rst(rst), .err(1'b0) );
   rf rf0(
          // Outputs
          .read1data                    (read1data[15:0]),
          .read2data                    (read2data[15:0]),
          .err                          (err),
          // Inputs
          .clk                          (clk),
          .rst                          (rst),
          .read1regsel                  (read1regsel[2:0]),
          .read2regsel                  (read2regsel[2:0]),
          .writeregsel                  (writeregsel[2:0]),
          .writedata                    (writedata[15:0]),
          .write                        (write));

endmodule

You must verify your design. You may use the example Verilog register file testbench to test your module. This testbench is provided as is, without any guarantees of correctness! You can download the testbench: rf_bench.v. If you use the testbench and discover bugs in it, hand in the corrected testbench with your submission. Enhancements to the testbench may result in bonus points.

What to submit:

  1. Turn in neatly and legibly drawn schematics of your design. (paper) [Block diagram representation]

  2. Annotated simulation trace of the complete design. (paper) Pick representative cases for your simulation input to turn in. Your trace must show the following:

    • Every register gets read and written properly, and that each bit of each register has been both low and high at least once. (For example, use patterns such as 5555 and AAAA.)

    • A simultaneous read and write on the same register must work properly.

    • A read and write in the same cycle but on different registers must also work properly.

    • Both read ports set to the same value must work properly.

  3. If you use the testbench provided, electronically submit the text output of the program as rf_bench.out (see 4 below). Modelsim will write the text output to a file called transcript in your project directory.

  4. Electronically submit the following files.

    • Directory name: hw3_3

    • All your verilog source code.

    • rf_bench.out mentioned above

All files for this problem must be in this directory. If the problem requires files from a directory other than hw3_3, then create a copy of the file in this directory.


Problem 4 (20 Points)

Using Verilog, write, compile, and simulate a six-state saturating counter. The counter should take as input a clock and a reset line (ctr_rst), and output a 3-bit wide bus (and an err output). Reset is synchronous and sets the output to 0 at the rising edge of the clock. The output should increment every clock cycle until it reaches its saturation value (5, i.e. 101 in binary) and then continue to output the maximum value until reset. ctr_rst is different from the global rst signal, which is set high for 2 cycles at the start and then remains low. Use this global rst signal to initialize any state. The "err" output is a standard way of indicating hardware errors or illegal states; we'll get in the habit of using it for all state machines, though in this case it will only be driven if there are states which are supposed to be impossible to get into.

Use the following Verilog template: (download sc.v)

module sc( clk, rst, ctr_rst, out, err);
   input clk;
   input rst;
   input ctr_rst;
   output [2:0] out;
   output err;

endmodule

The ctr_rst line is active high, i.e. a logical value of 1 will reset the counter, while a logical value of 0 will let the counter increment.

If ctr_rst is high while the counter is still counting, the output should reset to 0. If it is held high over multiple cycles, the counter should hold at 0. All state changes should occur on the clock's rising edge.
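
The counting rule above amounts to a three-way choice for the next state: reset to 0, hold at 5, or add one. The sketch below shows one way to write that choice. It is not the required solution: dff_standin stands in for the course flip-flop with an assumed port list (q, d, clk, rst), the name sc_sketch is hypothetical, and the use of the + operator may or may not be allowed by the course Verilog rules.

```verilog
// Stand-in for the course flip-flop (port list and sync reset are assumed).
module dff_standin (q, d, clk, rst);
    output q;
    input  d, clk, rst;
    reg    state;
    assign q = state;
    always @(posedge clk) state <= rst ? 1'b0 : d;
endmodule

module sc_sketch (clk, rst, ctr_rst, out, err);
    input        clk, rst, ctr_rst;
    output [2:0] out;
    output       err;

    wire [2:0] state;
    wire       at_max = (state == 3'd5);

    // Next state: ctr_rst wins, then saturate at 5, otherwise count up.
    wire [2:0] next = ctr_rst ? 3'd0 :
                      at_max  ? 3'd5 : state + 3'd1;

    // The global rst initializes the state through the flip-flop reset.
    dff_standin ff [2:0] (.q(state), .d(next), .clk(clk), .rst(rst));

    assign out = state;
    assign err = state[2] & state[1];   // states 6 and 7 should be unreachable
endmodule

module sc_sketch_tb;
    reg        clk, rst, ctr_rst;
    wire [2:0] out;
    wire       err;

    sc_sketch dut(clk, rst, ctr_rst, out, err);

    always #5 clk = ~clk;
    always @(negedge clk) $display("t=%0t out=%d err=%b", $time, out, err);

    initial begin
        clk = 0; rst = 1; ctr_rst = 0;
        repeat (2) @(posedge clk);   // global reset held for two cycles
        #1 rst = 0;
        repeat (8) @(posedge clk);   // counts up to 5, then holds at 5
        #1 ctr_rst = 1;
        repeat (2) @(posedge clk);   // held ctr_rst keeps the counter at 0
        #1 $finish;
    end
endmodule
```

Because out is taken directly from the flip-flops, it changes only on the rising clock edge, and a ctr_rst that is held high simply reloads 0 every cycle.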

  • Follow the Verilog file naming conventions for this design.

    • Use the clock / reset generator provided in the project modules page. Remember that for this design the global reset signal generated at the start of simulation is different from the ctr_rst that can be set high any time.

    • Connect it to your saturating counter module (sc) in a top-level Verilog file called sc_hier.v. Doing this means that you do not need to use forces to create the clock and global reset.

    • You must instantiate sc_hier and test it using a testbench or do files.

    • Your traces or text log should demonstrate multiple ctr_rst instances and show that your counter behaves correctly.

  • See template below: (download sc_hier.v)

module sc_hier (ctr_rst, out);

    input ctr_rst;
    output [2:0] out;

    wire err;
    wire clk;
    wire rst;

    clkrst clk_generator(.clk(clk), .rst(rst), .err(err) );
    sc sc0( .clk(clk),
            .rst(rst),
            .ctr_rst( ctr_rst ),
            .out(out),
            .err(err) );


endmodule

What to submit:

  1. Turn in neatly and legibly drawn schematics of your design. [Block diagram representation]

  2. Annotated simulation trace of the complete design. Show exhaustive simulation.

  3. If you use a testbench, annotated text log of output.

  4. Electronically submit the following files.

    1. Directory name: hw3_4

    2. All your verilog source code.

    3. Testbench source code if you created one.

All files for this problem must be in this directory. If a problem requires files from a different d

Problem 5 (30 Points)

In this problem, you must design, simulate, and verify in Verilog a 4-entry FIFO that holds 64-bit data values. The FIFO should implement the functionality of a conventional first-in-first-out data structure. You may assume the D flip-flop module provided. The FIFO accepts new input each cycle that data_in_valid is asserted, unless it is full (indicated by fifo_full). Data that is "inserted" into a full FIFO is ignored. The data_out signals always drive the data at the head of the FIFO (the oldest data). The head of the FIFO is popped when pop_fifo is asserted. An empty FIFO drives zeros on data_out and asserts fifo_empty. Popping an empty FIFO has no effect. Asserting reset makes the FIFO empty. All outputs should change only in response to the clock edge.

The "err" output is a standard way of indicating hardware errors or illegal states; we'll get in the habit of using it for all state machines, though in this case it will only be driven if there are states which are supposed to be impossible to get into.
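
The FIFO rules above can be expressed with an entry count, a head pointer, and a tail pointer. The sketch below is one possible arrangement, not the expected solution. Assumptions: dff_standin stands in for the course flip-flop with an assumed port list (q, d, clk, rst); the name fifo_sketch is hypothetical; data_out_valid is taken to mean "not empty", which the text does not define; and when data_in_valid and pop_fifo are both high the sketch takes the simple, non-optimized options from the FAQ (no bypass when empty, insert ignored when full).

```verilog
// Stand-in for the course flip-flop (port list and sync reset are assumed).
module dff_standin (q, d, clk, rst);
    output q;
    input  d, clk, rst;
    reg    state;
    assign q = state;
    always @(posedge clk) state <= rst ? 1'b0 : d;
endmodule

module fifo_sketch (clk, rst, data_in, data_in_valid, pop_fifo,
                    data_out, fifo_empty, fifo_full, data_out_valid, err);
    input  [63:0] data_in;
    input         data_in_valid, pop_fifo, clk, rst;
    output [63:0] data_out;
    output        fifo_empty, fifo_full, data_out_valid, err;

    wire [2:0]  count;            // number of valid entries, 0 to 4
    wire [1:0]  head, tail;       // oldest entry and next free slot
    wire [63:0] e0, e1, e2, e3;   // the four storage entries

    assign fifo_empty = (count == 3'd0);
    assign fifo_full  = (count == 3'd4);

    wire push = data_in_valid & ~fifo_full;   // inserts into a full FIFO are ignored
    wire pop  = pop_fifo & ~fifo_empty;       // popping an empty FIFO has no effect

    // Pointers wrap naturally at 4; count is unchanged on push and pop together.
    wire [2:0] count_n = count + {2'b00, push} - {2'b00, pop};
    wire [1:0] head_n  = head + {1'b0, pop};
    wire [1:0] tail_n  = tail + {1'b0, push};

    // Only the entry selected by tail is written on a push.
    wire [63:0] e0_n = (push & (tail == 2'd0)) ? data_in : e0;
    wire [63:0] e1_n = (push & (tail == 2'd1)) ? data_in : e1;
    wire [63:0] e2_n = (push & (tail == 2'd2)) ? data_in : e2;
    wire [63:0] e3_n = (push & (tail == 2'd3)) ? data_in : e3;

    dff_standin count_ff [2:0]  (.q(count), .d(count_n), .clk(clk), .rst(rst));
    dff_standin head_ff  [1:0]  (.q(head),  .d(head_n),  .clk(clk), .rst(rst));
    dff_standin tail_ff  [1:0]  (.q(tail),  .d(tail_n),  .clk(clk), .rst(rst));
    dff_standin e0_ff    [63:0] (.q(e0),    .d(e0_n),    .clk(clk), .rst(rst));
    dff_standin e1_ff    [63:0] (.q(e1),    .d(e1_n),    .clk(clk), .rst(rst));
    dff_standin e2_ff    [63:0] (.q(e2),    .d(e2_n),    .clk(clk), .rst(rst));
    dff_standin e3_ff    [63:0] (.q(e3),    .d(e3_n),    .clk(clk), .rst(rst));

    // Outputs depend only on registered state, so they change only after a clock edge.
    wire [63:0] head_data = (head == 2'd0) ? e0 :
                            (head == 2'd1) ? e1 :
                            (head == 2'd2) ? e2 : e3;
    assign data_out       = fifo_empty ? 64'd0 : head_data;
    assign data_out_valid = ~fifo_empty;
    assign err            = (count > 3'd4);   // counts 5 to 7 should be unreachable
endmodule

module fifo_sketch_tb;
    reg         clk, rst, data_in_valid, pop_fifo;
    reg  [63:0] data_in;
    wire [63:0] data_out;
    wire        fifo_empty, fifo_full, data_out_valid, err;

    fifo_sketch dut(clk, rst, data_in, data_in_valid, pop_fifo,
                    data_out, fifo_empty, fifo_full, data_out_valid, err);

    always #5 clk = ~clk;
    always @(negedge clk)
        $display("out=%h empty=%b full=%b", data_out, fifo_empty, fifo_full);

    initial begin
        clk = 0; rst = 1; data_in_valid = 0; pop_fifo = 0; data_in = 64'd1;
        @(posedge clk); #1 rst = 0; data_in_valid = 1;
        repeat (5) begin              // the fifth push hits a full FIFO
            @(posedge clk); #1 data_in = data_in + 64'd1;
        end
        data_in_valid = 0; pop_fifo = 1;
        repeat (5) @(posedge clk);    // the fifth pop hits an empty FIFO
        #1 $finish;
    end
endmodule
```

Keeping a separate count makes fifo_empty and fifo_full simple comparisons; comparing head and tail alone cannot tell a full FIFO from an empty one, since the pointers are equal in both cases.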

Use the following Verilog template for your FIFO module. (download fifo.v)

module fifo(clk, rst, data_in, data_in_valid, pop_fifo, data_out, fifo_empty, fifo_full,
            data_out_valid, err);
  input [63:0] data_in;
  input data_in_valid;
  input pop_fifo;

  input clk;
  input rst;
  output [63:0] data_out;
  output fifo_empty;
  output fifo_full;
  output data_out_valid;
  output err;

  //your code here

endmodule
  • Follow the Verilog file naming conventions for this design.

    • Use the clock / reset generator provided in the project modules page.

    • Connect it to your FIFO module (fifo) in a top-level Verilog file called fifo_hier.v. Doing this means that you do not need to use forces to create the clock and global reset.

    • You must instantiate fifo_hier and test it using a testbench or do files.

  • See template below: (download fifo_hier.v)

module fifo_hier(/*AUTOARG*/
   // Outputs
   data_out, fifo_empty, fifo_full, data_out_valid, 
   // Inputs
   data_in, data_in_valid, pop_fifo
   );

   input [63:0] data_in;
   input        data_in_valid;
   input        pop_fifo;

   output [63:0] data_out;
   output        fifo_empty;
   output        fifo_full;
   output        data_out_valid;

   clkrst clk_generator(.clk(clk),
                        .rst(rst),
                        .err(err) );

   fifo fifo0(/*AUTOINST*/
              // Outputs
              .data_out                 (data_out[63:0]),
              .fifo_empty               (fifo_empty),
              .fifo_full                (fifo_full),
              .data_out_valid           (data_out_valid),
              .err                      (err),
              // Inputs
              .data_in                  (data_in[63:0]),
              .data_in_valid            (data_in_valid),
              .pop_fifo                 (pop_fifo),
              .clk                      (clk),
              .rst                      (rst));




endmodule

What to submit:

  1. Turn in neatly and legibly drawn schematics of your design. [Block diagram representation]

  2. Annotated simulation trace of the complete design. Show exhaustive simulation.

  3. You must explain why your traces test all possible cases in the FIFO.

  4. Electronically submit the following files

    1. Directory name: hw3_5

    2. All your verilog source code.

    3. Testbench source code if you created one.

All files for this problem must be in this directory. If the problem requires files from a directory other than hw3_5, then create a copy of the file in this directory.


Fifo FAQ

  1. What is the desired behavior of the FIFO if both data_in_valid and pop_fifo are high at the same time?

    1. If the FIFO is empty:
      a) Assert fifo_empty and write the element into the FIFO; no data is popped.
      b) You may optimize this behavior by bypassing the value and sending out the new value that same cycle. You may need to add a data_out_valid signal to the interface to implement this optimization correctly. This is an optimization, not a required part of the homework.

    2. If the FIFO is non-empty and not full, insert the element into the FIFO, pop the correct element from the FIFO, and drive data_out.

    3. If the FIFO is full (indicated by fifo_full being asserted), you have two choices:
      a) Ignore the data being inserted, pop the correct element from the FIFO, and drive data_out.
      b) Pop the correct element and insert the new element as the newest element in the FIFO. This is an optimization, not a required part of the homework.


 
 
 Resources:

dff.v

clkrst.v

rf.v
rf_hier.v
rf_bench.v

sc.v
sc_hier.v

fifo.v
fifo_hier.v

What to hand in

Hand in your homework using the CS handin program.

  • Make a folder for each problem (hw3_3, hw3_4 and hw3_5).
  • Each folder should contain all the Verilog files for that problem.
  • The name and signals of the top-level module should be as indicated for each problem.
  • Tar these 3 folders into 'cs username'.tar [example: tar cvf ram.tar hw3_3 hw3_4 hw3_5]
  • Copy the tar file into an empty folder and submit it as described in the handin documentation [example: mkdir ram; mv ram.tar ram]
    * <class_name> cs552-1
    * <assignment_name> HW3
    * <directory_path> 'location_of_the folder_you_created'

 