CS/ECE 552-1: Introduction to Computer Architecture
Spring 2005
Problem Set #2

Problem 1

Design an 8-by-16-bit register file using the Mentor Graphics software. Figure 1 gives the high-level interface. It has one write port, two read ports, three register select inputs (two for read and one for write), a write enable, a read enable, a reset, and a clock input.
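
Before drawing schematics, it can help to pin down the intended behavior in software. Here's a minimal behavioral sketch of the interface described above, not the Mentor design itself. The class and method names are made up for illustration, and two behaviors are my assumptions because the problem statement doesn't specify them: reset is synchronous and takes priority over a write, and both read outputs are forced to 0 when read enable is low.

```python
class RegisterFile:
    """Behavioral model: 8 registers x 16 bits, 1 write port, 2 read ports."""

    NUM_REGS = 8
    MASK = 0xFFFF  # 16-bit data width

    def __init__(self):
        self.regs = [0] * self.NUM_REGS

    def clock_edge(self, reset, write_en, write_sel, write_data):
        """Model one rising clock edge.

        Assumption: reset is synchronous and wins over a simultaneous write.
        """
        if reset:
            self.regs = [0] * self.NUM_REGS
        elif write_en:
            # 3-bit select picks one of 8 registers; data is truncated to 16 bits.
            self.regs[write_sel & 0x7] = write_data & self.MASK

    def read(self, read_en, sel_a, sel_b):
        """Combinational read of both ports.

        Assumption: outputs are forced to 0 when read_en is low.
        """
        if not read_en:
            return 0, 0
        return self.regs[sel_a & 0x7], self.regs[sel_b & 0x7]


rf = RegisterFile()
rf.clock_edge(reset=0, write_en=1, write_sel=3, write_data=0xBEEF)
print(rf.read(read_en=1, sel_a=3, sel_b=0))
```

A model like this is mainly useful for generating expected values to compare against the simulation waveforms.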

Here's the top-level schematic of the register file. I've implemented it with clock gating for the writes and muxes to select the outputs, though other variations are possible.

Here are the simulation results:

Problem 2

Design a simple 16-bit ALU using Mentor. The operations to be performed are ADD, bitwise-AND, bitwise-OR, and bitwise-XOR. In addition, it must be able to invert the B input before performing the operation. Another input line determines whether the arithmetic is signed or unsigned. Use a carry look-ahead adder (CLA) in your design. (Hint: First design a 4-bit CLA. Then use blocks of this CLA to build the 16-bit CLA.)
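
To make the hint concrete, here's a rough Python sketch of the two-level look-ahead structure: a 4-bit CLA, then four of those blocks tied together by the same look-ahead equations at the block level. The function names are my own, and this models only the adder path, not the full ALU or the schematics that follow.

```python
def lookahead(g, p, c0):
    """Look-ahead unit for four (generate, propagate) pairs.

    Returns the carries into positions 0..3, plus group generate/propagate.
    """
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    group_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    group_p = p[3] & p[2] & p[1] & p[0]
    return [c0, c1, c2, c3], group_g, group_p


def cla4(a, b, c0):
    """4-bit CLA: returns (sum, group generate, group propagate)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # generate: a AND b
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]    # propagate: a XOR b
    carries, group_g, group_p = lookahead(g, p, c0)
    s = 0
    for i in range(4):
        s |= (p[i] ^ carries[i]) << i                    # sum bit = p XOR carry-in
    return s, group_g, group_p


def cla16(a, b, c0=0):
    """16-bit CLA from four 4-bit blocks: returns (16-bit sum, carry out)."""
    nibbles = [((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF) for k in range(4)]
    # Group generate/propagate don't depend on the carry-in, so get them first.
    block_g = [cla4(x, y, 0)[1] for x, y in nibbles]
    block_p = [cla4(x, y, 0)[2] for x, y in nibbles]
    block_cin, top_g, top_p = lookahead(block_g, block_p, c0)
    total = 0
    for k, (x, y) in enumerate(nibbles):
        total |= cla4(x, y, block_cin[k])[0] << (4 * k)
    return total, top_g | (top_p & c0)


# Inverting B and setting the carry-in computes A - B in two's complement.
print(cla16(0x0005, 0x0003 ^ 0xFFFF, 1))
```

The B-invert input from the problem statement shows up in the last line: with B inverted and a carry-in of 1, the same adder performs subtraction.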

Schematic for the 1-bit ALU:

Schematic for the carry-lookahead unit:

Schematic for the 4-bit ALU:

Schematic for the 16-bit ALU:

Schematic for the zero detect:

Schematic for the overflow detect:

An even simpler way to detect overflow is to compare C15 with C16: if they are different, an overflow occurred.
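
Here's a small sketch of that check, taking C15 as the carry into bit 15 and C16 as the carry out of bit 15. The function name is hypothetical. Note that this test applies to signed (two's complement) arithmetic; for unsigned addition, C16 by itself indicates overflow.

```python
def add16_with_overflow(a, b, cin=0):
    """Add two 16-bit values; return (sum, c15, c16, signed_overflow)."""
    a &= 0xFFFF
    b &= 0xFFFF
    # Adding only bits 0..14 exposes the carry into bit 15.
    low = (a & 0x7FFF) + (b & 0x7FFF) + cin
    c15 = (low >> 15) & 1
    # Adding all 16 bits exposes the carry out of bit 15.
    full = a + b + cin
    c16 = (full >> 16) & 1
    return full & 0xFFFF, c15, c16, c15 ^ c16


# 32767 + 1 overflows in signed arithmetic; -1 + 1 does not.
print(add16_with_overflow(0x7FFF, 0x0001))
print(add16_with_overflow(0xFFFF, 0x0001))
```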

Simulation of the ALU:

Problem 3

Problem 2.13 on page 92

Using compiler C1:
M1 - 400 MHz / (0.3 * 4 CPI + 0.5 * 6 CPI + 0.2 * 8 CPI) = 69 MIPS
M2 - 200 MHz / (0.3 * 2 CPI + 0.5 * 4 CPI + 0.2 * 3 CPI) = 63 MIPS

Using compiler C2:
M1 - 400 MHz / (0.3 * 4 CPI + 0.2 * 6 CPI + 0.5 * 8 CPI) = 63 MIPS
M2 - 200 MHz / (0.3 * 2 CPI + 0.2 * 4 CPI + 0.5 * 3 CPI) = 69 MIPS

Using 3rd-party compiler:
M1 - 400 MHz / (0.5 * 4 CPI + 0.3 * 6 CPI + 0.2 * 8 CPI) = 74 MIPS
M2 - 200 MHz / (0.5 * 2 CPI + 0.3 * 4 CPI + 0.2 * 3 CPI) = 71 MIPS

Using C1, M1 is faster. Using C2, M2 is faster. The 3rd-party compiler produces the fastest code on both M1 and M2. Since M1 is faster in two of the three cases, it would be the best choice.
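
The six calculations above all follow the same pattern, MIPS = clock rate in MHz divided by the mix-weighted average CPI, so they're easy to double-check with a short script. This is only a sketch; the dictionary layout and names are my own, with the clock rates, CPIs, and instruction mixes copied from the lines above.

```python
def mips(clock_mhz, mix, cpi):
    """MIPS = clock rate (MHz) / average CPI, weighting each class CPI by its mix fraction."""
    avg_cpi = sum(f * c for f, c in zip(mix, cpi))
    return clock_mhz / avg_cpi


# (clock in MHz, CPI per instruction class), as used in the answers above
MACHINES = {"M1": (400, (4, 6, 8)), "M2": (200, (2, 4, 3))}
# Instruction mix produced by each compiler
MIXES = {
    "C1": (0.3, 0.5, 0.2),
    "C2": (0.3, 0.2, 0.5),
    "3rd-party": (0.5, 0.3, 0.2),
}

for compiler, mix in MIXES.items():
    for name, (clock, cpi) in MACHINES.items():
        print(f"{compiler:>9} {name}: {mips(clock, mix, cpi):.1f} MIPS")
```

Two of the cases come out to 62.5 MIPS, which the answers above round up to 63.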

Problem 4

Problem 2.18, 2.20 on pages 93-94

Mbase: CPI = 0.4 * 2 + 0.25 * 3 + 0.25 * 3 + 0.1 * 5 = 2.8 CPI
Mopt: CPI = 0.4 * 2 + 0.25 * 2 + 0.25 * 3 + 0.1 * 4 = 2.45 CPI

Mbase: 500 MHz / 2.8 CPI = 179 MIPS
Mopt: 600 MHz / 2.45 CPI = 245 MIPS
Mopt is 245 / 179 = 1.37 times faster than Mbase.
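
As a quick check of the arithmetic, here's the same calculation as a sketch in Python. The variable names are my own; the mix, per-class CPIs, and clock rates are the ones used in the lines above.

```python
MIX = (0.4, 0.25, 0.25, 0.1)   # instruction frequencies
CPI_BASE = (2, 3, 3, 5)        # per-class CPI on Mbase
CPI_OPT = (2, 2, 3, 4)         # per-class CPI on Mopt


def average_cpi(mix, cpi):
    """Mix-weighted average cycles per instruction."""
    return sum(f * c for f, c in zip(mix, cpi))


cpi_base = average_cpi(MIX, CPI_BASE)
cpi_opt = average_cpi(MIX, CPI_OPT)
mips_base = 500 / cpi_base     # 500 MHz clock
mips_opt = 600 / cpi_opt       # 600 MHz clock
speedup = mips_opt / mips_base

print(f"Mbase: {cpi_base:.2f} CPI, {mips_base:.0f} MIPS")
print(f"Mopt:  {cpi_opt:.2f} CPI, {mips_opt:.0f} MIPS")
print(f"Speedup: {speedup:.2f}")
```

Since both machines run the same instruction count, the MIPS ratio equals the execution-time speedup here.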

Problem 5

Problem 2.31

Assume 100 instructions: 10 multiply, 90 other
Fraction of time spent on multiplication = (10 instr * 12 CPI) / (10 instr * 12 CPI + 90 instr * 4 CPI) = 25%

Problem 2.32

Avg. CPI of machine without modification (CPI_old) = 0.1 * 12 + 0.9 * 4 = 4.8

Avg. time to execute one instruction (Time_old) = 4.8 CPI / x MHz

Avg. CPI of machine with modification (CPI_new) = 0.1 * 6 + 0.9 * 4 = 4.2

Avg. time to execute one instruction (Time_new) = 4.2 CPI / (x / 1.2 MHz) = 5.04 CPI / x MHz

Since Time_old / Time_new = 4.8 / 5.04 = 0.95, which is less than 1, the machine without the modification is faster, so the modification should not be made.
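
The clock rate x cancels out of the comparison, so it can be set to any value when checking the numbers. Here's a minimal sketch of the Problem 2.31 and 2.32 arithmetic; the variable names are mine, and x = 1.0 is an arbitrary placeholder.

```python
MULT_FRACTION = 0.1     # 10 of every 100 instructions are multiplies
CPI_MULT_OLD = 12
CPI_MULT_NEW = 6
CPI_OTHER = 4
CLOCK_SLOWDOWN = 1.2    # modified machine's clock rate is x / 1.2

# Problem 2.31: share of cycles spent in multiplies on the original machine
cpi_old = MULT_FRACTION * CPI_MULT_OLD + (1 - MULT_FRACTION) * CPI_OTHER
mult_time_share = MULT_FRACTION * CPI_MULT_OLD / cpi_old

# Problem 2.32: average time per instruction before and after the change
x = 1.0                 # arbitrary clock rate; it cancels in the ratio
cpi_new = MULT_FRACTION * CPI_MULT_NEW + (1 - MULT_FRACTION) * CPI_OTHER
time_old = cpi_old / x
time_new = cpi_new / (x / CLOCK_SLOWDOWN)
ratio = time_old / time_new

print(f"Multiply share of time: {mult_time_share:.0%}")
print(f"Time_old / Time_new = {ratio:.3f}")
print("Keep original" if ratio < 1 else "Apply modification")
```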

Problem 6

Amdahl's Law: Speedup = Execution Time (old) / Execution Time (new) = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

Speedup = 1 / (1 - 0.5 + 0.5 / 5) = 1.67
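
Amdahl's Law as stated above translates directly into a one-line function. This is just a sketch for checking the number; the function and parameter names are my own.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of execution time is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Half of the execution time is made 5 times faster.
print(f"{amdahl_speedup(0.5, 5):.2f}")
```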

Problem 7

Multiply optimization: Speedup = 1 / (1 - 0.2 + 0.2 / 4) = 1.18

Memory optimization: Speedup = 1 / (1 - 0.5 + 0.5 / 2) = 1.33

Both: Speedup = 1 / (1 - 0.2 - 0.5 + 0.2 / 4 + 0.5 / 2) = 1.67
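
The combined case works because the two optimizations apply to separate parts of the execution time, so their fractions simply add up in the denominator. Here's a sketch that handles any number of such non-overlapping enhancements; the function name and the list-of-pairs input are my own choices.

```python
def combined_speedup(enhancements):
    """Amdahl's Law for several enhancements on disjoint portions of execution time.

    enhancements: list of (fraction_of_time, speedup_factor) pairs.
    """
    untouched = 1.0 - sum(f for f, _ in enhancements)
    return 1.0 / (untouched + sum(f / s for f, s in enhancements))


multiply = (0.2, 4)   # 20% of time, 4 times faster
memory = (0.5, 2)     # 50% of time, 2 times faster

print(f"Multiply only: {combined_speedup([multiply]):.2f}")
print(f"Memory only:   {combined_speedup([memory]):.2f}")
print(f"Both:          {combined_speedup([multiply, memory]):.2f}")
```

Note that the combined speedup is not the product of the two individual speedups (1.18 * 1.33 is about 1.57), since each individual figure is measured against the original execution time.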