Solution for the Midterm Exam
Problem 1)
a)
| G'0 | 0 | 
| P'3 | x3 | 
| PIII3 | PII12 PII13 PII14 PII15 | 
| C16 | PIII0C0 | 
| C11 | PI10 PI9 PI8C8 | 
| C54 | PI53 PI52C52 | 
| S39 | X39 XOR C39 | 
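The entries in part a) can be checked with a small sketch. It assumes an incrementor with carry-in C0 = 1, no generate terms, and P(i) = x(i), as the table suggests. The helper names `to_bits`, `from_bits`, `carries` and `increment` are made up for this illustration.

```python
def to_bits(value, width):
    """Return the bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def carries(x_bits, c0=1):
    """Carry into each bit position; the last entry is the carry out."""
    # G = 0 everywhere, so C(i) = P(i-1) ... P(0) C(0) with P(j) = x(j).
    return [c0 & int(all(x_bits[:i])) for i in range(len(x_bits) + 1)]


def increment(x_bits):
    """Add 1 to x_bits using S(i) = X(i) XOR C(i)."""
    c = carries(x_bits)
    sum_bits = [x ^ ci for x, ci in zip(x_bits, c)]
    return sum_bits, c[-1]


bits = to_bits(0x0FFF, 16)
sum_bits, carry_out = increment(bits)
print(hex(from_bits(sum_bits)), carry_out)
```

The grouped form in the table (for example C11 = P10 P9 P8 C8) gives the same carry as the flat product, because C8 already contains the product of the lower propagate bits.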
b)
| S19 | 6T | 4T + C18 + 2T | 
| GIII1 | 0 | no G signal in incrementor | 
| C48 | 3T | T in each level of lookahead | 
Problem 2)
i) T
ii) T
iii) F
iv) F
v) F
vi) F
Problem 3)
i) Easier to pipeline since address of next instruction can be calculated while decoding current instruction.
ii) Datapath and control (which together form the processor), memory, input, and output.
iii) No. of instructions * CPI * clock cycle time
iv) a) When the system is executing one task at a time, e.g. in a single or multi-cycle datapath.
b) When the system is executing more than one task at a time, e.g. in a pipelined datapath.
v) A computer where the instructions of the program are stored in memory; the CPU fetches the instructions from memory, decodes them, and executes them.
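The performance equation in part iii) can be sketched as a one-line function. The numbers in the usage line are hypothetical and are not taken from the exam.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    """Execution time = number of instructions * CPI * clock cycle time."""
    return instruction_count * cpi * clock_cycle_time


# Hypothetical program: 1 million instructions, CPI of 2, 1 ns clock cycle.
print(cpu_time(1_000_000, 2.0, 1e-9), "seconds")
```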
Problem 4)
i) A hazard that occurs when instructions in 2 different pipeline stages (i.e. 2 different instructions) want to access the same resource in the same cycle. It can be handled by duplicating the resource.
ii) Possibility of incorrect operation due to a conditional branch instruction. Which instruction to execute after a branch is not known until the outcome of the branch is known.
iii) No. In addition to the 6-bit opcode, the R-type instructions make use of the bottom 6 bits of the instruction (the func field) to specify the ALU instruction. This allows for 2^6 = 64 different ALU instructions for each combination of the opcode bits.
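The field split described in part iii) can be illustrated with a short decoder. This is a sketch that assumes the standard 32-bit MIPS R-type layout (opcode, rs, rt, rd, shamt, func); the function name `decode_r_type` is made up for this example.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "func": word & 0x3F,            # bottom 6 bits: 2**6 = 64 values
    }


# 0x012A4020 encodes add $t0, $t1, $t2 in the assumed MIPS encoding.
print(decode_r_type(0x012A4020))
```

Because the func field is 6 bits wide, a single opcode value can select among 64 ALU operations.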
Problem 5)
a) It forwards the result of the parent instruction to its children (the dependent instructions) in the EX stage. The dependent instructions that follow no longer have to wait for the result to be written back to the RF.
b)
| Cycle | IF | ID | EX | MEM | WB |
| 1 | ADD1 | | | | |
| 2 | ADD2 | ADD1 | | | |
| 3 | LOAD1 | ADD2 | ADD1 | | |
| 4 | LOAD1 | ADD2 | BUBBLE | ADD1 | |
| 5 | LOAD2 | LOAD1 | ADD2 | BUBBLE | ADD1 |
| 6 | LOAD2 | LOAD1 | BUBBLE | ADD2 | BUBBLE |
| 7 | ADD3 | LOAD2 | LOAD1 | BUBBLE | ADD2 |
| 8 | ADD3 | LOAD2 | BUBBLE | LOAD1 | BUBBLE |
| 9 | | ADD3 | LOAD2 | BUBBLE | LOAD1 |
| 10 | | | ADD3 | LOAD2 | BUBBLE |
| 11 | | | | ADD3 | LOAD2 |
| 12 | | | | | ADD3 |
| 13 | | | | | |
| 14 | | | | | |
c)
Add EX/MEM to EX bypass. It removes all RAW bubbles due to ALU instructions. LOAD-USE bubbles are not removed.
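The cycle counts for parts b) and c) can be checked with a rough sketch. It assumes a 5-stage pipeline and one bubble each for ADD2, LOAD1 and LOAD2, as read off the table in b). The producer types in `STALLS` are an inference from the instruction names, not something the solution states, and the helper names are made up.

```python
def total_cycles(num_instructions, bubble_count, num_stages=5):
    """Cycles until the last instruction leaves the last stage."""
    return num_stages + (num_instructions - 1) + bubble_count


# (stalled instruction, type of the producer it waits on, bubbles).
# Producer types are assumed: ADD2 and LOAD1 wait on an ADD, LOAD2 on a LOAD.
STALLS = [("ADD2", "alu", 1), ("LOAD1", "alu", 1), ("LOAD2", "load", 1)]


def count_bubbles(stalls, ex_bypass):
    """With the EX/MEM to EX bypass, only load-use bubbles remain."""
    return sum(n for _, producer, n in stalls
               if not (ex_bypass and producer == "alu"))


without_bypass = total_cycles(5, count_bubbles(STALLS, ex_bypass=False))
with_bypass = total_cycles(5, count_bubbles(STALLS, ex_bypass=True))
print(without_bypass, with_bypass)
```

Under these assumptions the five instructions finish in 12 cycles without the extra bypass, which agrees with ADD3 reaching the last stage in cycle 12, and in 10 cycles with it.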
Problem 6)
i) Time on B = 50 + 50/6 = 58.33 sec
SpeedUp = 100/58.33 = 1.714
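The same number follows from Amdahl's law. In this sketch the enhanced fraction is 0.5 (50 of the 100 seconds) and the enhancement factor is 6, both taken from the calculation above; the function name `speedup` is chosen for this example.

```python
def speedup(fraction_enhanced, factor):
    """Amdahl's law: overall speedup when a fraction of the time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / factor)


print(round(speedup(0.5, 6), 3))
```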
ii) 1/(1-f + f/6) = 3
=> f = 0.8
=> 80% on machine A
20 + 80/6 = 20 + 13.33
FP = 13.33/33.33 = 0.4
=> 40% time on machine B
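Part ii) inverts Amdahl's law to find f and then asks what share of the new run time the enhanced part takes. A minimal sketch, with function names made up for this example:

```python
def fraction_for_speedup(target_speedup, factor):
    """Solve 1 / (1 - f + f / factor) = target_speedup for f."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / factor)


def enhanced_share_of_new_time(f, factor):
    """Fraction of the new run time spent in the enhanced part."""
    enhanced = f / factor
    return enhanced / ((1.0 - f) + enhanced)


f = fraction_for_speedup(3, 6)
print(round(f, 3), round(enhanced_share_of_new_time(f, 6), 3))
```

With a target speedup of 3 and a factor of 6 this gives f = 0.8 and a share of 0.4, matching the 80% and 40% above.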
Problem 7)
Changes to the datapath
1) The mux on the input of the RF would get an extra input from Instruction[25:21]. Hence the select lines for that mux now have to be 2 bits wide.
2) The lower mux at the input of the ALU would get a constant 1 as an input, and hence the select lines for that mux would now be 3 bits wide.
3) Control has to generate a new signal called PCCondSrc, which works as the select line for the 2:1 mux placed at the Zero output of the ALU.
| 1 | Instruction -> IR; PC = PC + 4 | MemRd=1, ALUSrcA=0, IorD=0, IRWrite=1, ALUSrcB=01, ALUOp=00, PCWrite=1, PCSource=00 |
| 2 | Read registers; compute the branch address | MemRd=0, IRWrite=0, PCWrite=0, ALUOp=00, ALUSrcA=0, ALUSrcB=11 |
| 3 | ALUoutput = A - 1; PC <- PC + shift extended if Rs - 1 != 0 | ALUOp=01, ALUSrcA=1, ALUSrcB=100, PCCondSrc=1, PCWriteCond=1 |
| 4 | Rs <- ALUoutput | PCWriteCond=0, RegWrite=1, RegDst=10, MemtoReg=0 |
| 5 | | RegWrite=0 |
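The net effect of steps 1 to 4 can be sketched as a single function. This is an illustration only. It assumes the usual MIPS branch-target rule (PC + 4 plus the sign-extended 16-bit offset shifted left by 2) and 32-bit registers. The names `sign_extend16` and `dec_and_branch` are hypothetical, since the instruction's mnemonic is not given here.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def dec_and_branch(regs, pc, rs, offset16):
    """Decrement regs[rs]; branch if the result is nonzero. Returns new PC."""
    next_pc = (pc + 4) & MASK32                                    # step 1
    target = (next_pc + (sign_extend16(offset16) << 2)) & MASK32   # step 2
    alu_out = (regs[rs] - 1) & MASK32                              # step 3: A - 1
    if alu_out != 0:                                               # Zero inverted via PCCondSrc
        next_pc = target
    regs[rs] = alu_out                                             # step 4
    return next_pc


regs = {5: 3}
print(hex(dec_and_branch(regs, 0x100, 5, 0xFFFC)), regs[5])
```

The branch is taken while the decremented value is nonzero, which is why the control needs the extra mux on the Zero output.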