The basic concepts within computer science and engineering that make everything work:
These are used (for our purposes) to design programs and computers.
When a problem is large, it needs to be broken down.
We divide and conquer. One way to do this is to introduce a hierarchy (a set of levels), and solve the problem at each level.
For example: the levels used in the design of a computer
#1 is the lowest (or bottom) level. #5 is the highest (or top) and most abstract level.
A diagram showing an interpretation of the various levels of a computer system hardware design, where the diagram is not complete.
                        computer system
                               |
        ----------------------------------------------
        |                      |                     |
       CPU               memory system           I/O system
   (processor)
        |                      |                     |
-------------------     ----------------
|       |   ...   |     |      |       |
registers ALU  control  I-cache D-cache main memory
        |
     hardware
        |
------------------------------------------
|          |            |       ...      |
encoder  multiplexer  flip flop        adder
        |
-----------------------------
|          |       ...      |
nand gate  not gate      pass gate
        |
---------------------------
|           ...           |
CMOS p-type          CMOS n-type
transistor            transistor
We can do our design starting at either end. It makes no sense to start in the middle. In practice, designers always start at the top. After the initial design, a bottom-up effort implements the known/common lowest level, and works up the hierarchy. The design changes through iterations; a change in one aspect of one level causes changes in other levels.
Another example: software levels of abstraction
writing a large program: >10,000 lines of code
A top-down approach divides (or, categorizes) tasks into modules, and designs each module separately. We define procedures (functions, methods) that accomplish a task. Then, we specify interfaces (parameters, arguments) for the procedures.
The implementation of a function is independent of its interface specification! It is a different level in the abstraction of program design.
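As a small illustration of that separation, here is a sketch in C. The function name sum_of_squares is invented for this example; it is not part of the course material. The prototype is the interface, and the body below it is just one possible implementation.

```c
/* Interface: everything a caller needs to know
   (the name, the parameter, and the type of the result). */
int sum_of_squares(int n);

/* Implementation: one of many possible bodies behind that interface.
   It adds up i*i for i = 0 .. n. */
int sum_of_squares(int n)
{
    int sum = 0;
    for (int i = 0; i <= n; i++) {
        sum += i * i;
    }
    return sum;
}
```

The loop could be replaced by a closed-form formula without changing the prototype, so code that calls sum_of_squares(100) would not need to change.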
We have a computer system, with software running on it.
HLL (high level language)                  computer
programs, written in     . . . . . . . . . hardware
Pascal, C, Fortran, Java
(the software)
How do we get from one to the other?
Wanted: write in nice abstract HLL.
Have: stupid computer that only knows how to execute machine language.
What is machine language?
Binary sequences (lots of 1's and 0's in a very specific order)
interpreted by a computer as instructions.
It is not very human readable.
To help the situation, we introduce the abstraction of assembly language.
It is a more human readable form of machine language.
It uses mnemonics for the instruction type,
and operands for variables.
But, now we need something to translate assembly language
to machine language: an assembler.
An example assembly language instruction might be something like:
add AA, BB, CC
The assembly language is defined such that this is a well-defined operation.
For example, this instruction may be defined such that it adds the variable defined by BB to the variable defined by CC, and places the sum in the variable defined by AA.
add is the mnemonic or opcode (operation code).
AA, BB, and CC are the operands, the variables used in the instruction.
Lastly, if we had a program that translated HLL programs to assembly language, then we could be happy. A compiler does this.
Here is an example of these levels within the hierarchy. This example was provided by Prof. James Larus. (You do not need to understand it.)
-------------------------------------------
sum.c   (a C program, the HLL version)
-------------------------------------------
#include <stdio.h>

int main (int argc, char *argv[])
{
    int i;
    int sum = 0;

    for (i = 0; i <= 100; i++)
        sum += i * i;
    printf ("The sum from 0 .. 100 is %d\n", sum);
}

-------------------------------------------
sum.s   (the assembly language version)
-------------------------------------------
        .text
        .align  2
        .globl  main
        .ent    main 2
main:
        subu    $sp, 32
        sw      $31, 20($sp)
        sd      $4, 32($sp)
        sw      $0, 24($sp)
        sw      $0, 28($sp)
loop:
        lw      $14, 28($sp)
        mul     $15, $14, $14
        lw      $24, 24($sp)
        addu    $25, $24, $15
        sw      $25, 24($sp)
        addu    $8, $14, 1
        sw      $8, 28($sp)
        ble     $8, 100, loop
        la      $4, str
        lw      $5, 24($sp)
        jal     printf
        move    $2, $0
        lw      $31, 20($sp)
        addu    $sp, 32
        j       $31
        .end    main

        .data
        .align  0
str:
        .asciiz "The sum from 0 .. 100 is %d\n"

-------------------------------------------
sum.nolabels
-------------------------------------------
addiu   sp,sp,-32
sw      ra,20(sp)
sw      a0,32(sp)
sw      a1,36(sp)
sw      zero,24(sp)
sw      zero,28(sp)
lw      t6,28(sp)
lw      t8,24(sp)
multu   t6,t6
addiu   t0,t6,1
slti    at,t0,101
sw      t0,28(sp)
mflo    t7
addu    t9,t8,t7
bne     at,zero,-9
sw      t9,24(sp)
lui     a0,4096
lw      a1,24(sp)
jal     1048812
addiu   a0,a0,1072
lw      ra,20(sp)
addiu   sp,sp,32
jr      ra
move    v0,zero

-------------------------------------------
sum.machine_lang   (the machine language version)
-------------------------------------------
00100111101111011111111111100000
10101111101111110000000000010100
10101111101001000000000000100000
10101111101001010000000000100100
10101111101000000000000000011000
10101111101000000000000000011100
10001111101011100000000000011100
10001111101110000000000000011000
00000001110011100000000000011001
00100101110010000000000000000001
00101001000000010000000001100101
10101111101010000000000000011100
00000000000000000111100000010010
00000011000011111100100000100001
00010100001000001111111111110111
10101111101110010000000000011000
00111100000001000001000000000000
10001111101001010000000000011000
00001100000100000000000011101100
00100100100001000000010000110000
10001111101111110000000000010100
00100111101111010000000000100000
00000011111000000000000000001000
00000000000000000001000000100001
The complete picture:
          -----------                 ------------
HLL --->  | compiler|---> assembly --->| assembler|--->machine
          -----------     language    ------------    language

(least detailed)                                  (most detailed)
(most abstract)                                   (least abstract)
(top level)                                       (bottom level)
This course deals with the software aspects of assembly language, assemblers and machine language. It also deals with the hardware aspects of what the computer does to execute programs. It is an introduction to the study of computer architecture: the interface between hardware and software. A computer architecture may be defined by an instruction set.
Computer architecture is the relationship between hardware (stuff you can touch) and software (programs, code).
Karen can design a computer that has hardware which directly executes programs in any programming language. For example, a computer that directly executes Pascal. The input to such a computer is Pascal source code.
So, why do we not do just that?
In this class, in whatever assembly language you are writing programs, it will look like you have a machine that executes those programs directly. A simulator provides that illusion for you. It makes it appear as if you are running code directly on a machine.
What we will do:
hll ---> MAL ---> TAL
(C)     (MIPS     (MIPS R2000)
        assembly)
We assume that you know an HLL (high level language, like C++, C, Java, Pascal, Fortran). From that, we will quickly learn C. C clarifies (important) concepts that Java prefers to hide (such as pointers). Later in the semester, you will learn MAL and TAL. MAL is MIPS assembly language. Programs will be written in both C and MAL.
C and MAL are each abstractions. Each may define a computer architecture. TAL happens to correspond to a real (manufactured) architecture.
A simplified diagram of a computer system (hardware!):
------------             ----------
|   CPU    | <---------> | memory |
|processor |             |        |
------------      |      ----------
                  |
                  |
               -------
               | I/O |
               -------
This diagram is the Von Neumann diagram of a computer.
CPU
controls the running of programs
executes instructions
makes requests of the memory
CPU stands for central processing unit
CPU and processor are synonyms
memory
where programs and program variables are stored
handles requests from the CPU
We use a memory to save our program, instead of actually setting up the hardware (buttons/switches/pluggable wires) to represent a computer program.
This was a big deal in the 1940s! It allowed programs to be permanently saved on other media; I/O devices could then be used to place the program in the computer's memory.
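To execute an instruction, the processor must be able to request 3 things from memory:
  1. the instruction itself (a fetch)
  2. the values of operands (a load)
  3. a place to put the result (a store)
The memory really only needs to be able to do 2 operations:
  1. read (used for both the fetch and the load)
  2. write (used for the store)
Where within the memory?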
A label specifies a unique place (a location) in memory.
A label is often identified as an address.
read: processor specifies an address and that the memory is to do a read operation; memory responds with the contents at the address
write: processor specifies an address, data to be stored, and that the memory is to do a write operation; memory responds by overwriting the data at the address specified
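The two memory operations can be sketched in C as follows. This is only a model for discussion: the names mem_read and mem_write, the array memory, and the tiny size MEM_WORDS are all invented for this example, and real memory hardware is not written this way.

```c
#include <stdint.h>

#define MEM_WORDS 16   /* hypothetical size, kept tiny for the example */

/* The memory: a numbered sequence of locations.  The index is the address. */
static uint32_t memory[MEM_WORDS];

/* read: the processor specifies an address;
   the memory responds with the contents at that address. */
uint32_t mem_read(uint32_t address)
{
    return memory[address % MEM_WORDS];   /* wrap around to stay in range */
}

/* write: the processor specifies an address and the data to be stored;
   the memory overwrites whatever was at that address. */
void mem_write(uint32_t address, uint32_t data)
{
    memory[address % MEM_WORDS] = data;
}
```

A fetch and a load would both be calls to mem_read in this model; only a store needs mem_write.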
For discussion: how (most) processors operate with respect to the execution of instructions.
This discussion uses an invented assembly language instruction example:
mult   aa, bb, cc
 ^         ^
 |         |____ list of operands (the variables)
 |
 |_____ the instruction's mnemonic (a short name for an instruction)
Instructions and operands are stored in memory (in a special machine code format, not shown in this example). Before the instruction can be used by the processor, the instruction must be fetched or loaded.
processor steps involved:
  1. Fetch the instruction from memory.
  2. Decode the instruction: figure out that it is a mult instruction. This also reveals how many operands there are, since the number of operands is fixed for any given instruction. There are 3 operands for this mult instruction.
  3. Load the operands bb and cc. The destination operand (aa) does not need to be loaded. Its current value is irrelevant, as it will be overwritten with the product generated by this multiplication instruction.
  4. Do the operation: multiply bb and cc together.
  5. Store the result (the product) in aa.
Next step: suppose we want to execute multiple instructions, like a program.
Except for control instructions, all instructions execute sequentially. (Why? Because the machine code for the program keeps the instructions stored sequentially within memory!)
The processor must keep track of which instruction is to be executed. It does this by the use of an extra variable contained within and maintained by the processor, called a program counter or PC. The contents of the PC is the address of the next instruction to be executed.
Intel calls their version of the PC the IP (Instruction Pointer).
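So, revise the processor steps given above:
  1. Fetch the instruction (from the address contained in the PC).
  2. Update the PC, so that it contains the address of the next instruction.
  3. Decode the instruction.
  4. Load the operands.
  5. Do the operation.
  6. Store the result.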
The added step could come at any time after step 1. It is convenient to think of it as step 2.
This set of steps works fine for all instructions. For control instructions, the interpretation of what each of the steps does must be modified.
Control Instruction example
beq x, y, label
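A control instruction is one that explicitly modifies the PC, other than to update it to point to the next instruction. This is how we accomplish a "goto" type of instruction. Instructions in this category are most often called branches or jumps.
The processor steps for this beq instruction:
  1. Fetch the instruction.
  2. Update the PC.
  3. Decode (it is a beq instruction, and there are 3 operands).
  4. Load the operands (x and y).
  5. Compare the two operands for equality.
  6. If they are equal, overwrite the PC with the address implied by the third operand (label).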
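For all computers, here are the 6 processor steps involved. This is the instruction fetch and execute cycle.
  1. Fetch the instruction.
  2. Update the PC.
  3. Decode the instruction.
  4. Load the operand(s).
  5. Do the operation specified by the instruction.
  6. Store the result(s).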
Notice that this series of steps gets repeated constantly -- to make the computer useful, all that is needed is a way to give the PC an initial value (the address of the first instruction of a program), and to have a way of knowing when the program is done, so the PC can be given the starting address of another program.
This cycle of steps is very important -- it forms the basis for understanding how a computer operates. This cycle of steps is termed the instruction fetch and execute cycle.
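To make the cycle concrete, here is a minimal sketch of it as a C loop. Everything about the machine is invented for the example: the opcodes, the struct instruction layout, the use of array indices in place of memory addresses, and the OP_HALT instruction that ends the run. It is not MIPS, and the step numbers in the comments follow one reading of the cycle (fetch, update the PC, decode, load operands, do the operation, store the result).

```c
/* Hypothetical opcodes for an invented machine (not MIPS). */
enum opcode { OP_ADD, OP_MULT, OP_BEQ, OP_HALT };

/* One instruction: an opcode and three operands.
   For OP_ADD and OP_MULT the operands are variable indices:
   a is the destination, b and c are the sources.
   For OP_BEQ, a and b are variable indices, and c is the index
   of the instruction to branch to. */
struct instruction {
    enum opcode op;
    int a, b, c;
};

/* Runs the program until OP_HALT is executed.
   Returns the number of instructions executed. */
int run(const struct instruction *program, int *vars)
{
    int pc = 0;          /* program counter: index of the next instruction */
    int executed = 0;

    for (;;) {
        struct instruction inst = program[pc];   /* 1. fetch           */
        pc = pc + 1;                             /* 2. update the PC   */
        executed++;

        switch (inst.op) {                       /* 3. decode          */
        case OP_ADD:
            /* 4. load operands, 5. add, 6. store the result */
            vars[inst.a] = vars[inst.b] + vars[inst.c];
            break;
        case OP_MULT:
            /* 4. load operands, 5. multiply, 6. store the result */
            vars[inst.a] = vars[inst.b] * vars[inst.c];
            break;
        case OP_BEQ:
            /* 4. load operands, 5. compare,
               6. if equal, overwrite the PC (the branch is taken) */
            if (vars[inst.a] == vars[inst.b]) {
                pc = inst.c;
            }
            break;
        case OP_HALT:
            return executed;
        }
    }
}
```

Note that the PC is updated before the instruction is decoded, so a branch that is not taken needs no extra work: the PC already holds the address of the sequentially next instruction.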
There are two different methods used in architectures for defining control instructions and how they work. They have equivalent functionality, meaning that both methods can accomplish the same things. The code appears a bit different depending on the method.
The two methods:
  1. Compare and branch in a single instruction. This is the method of the beq x, y, label instruction above. It is used by the MIPS architecture, and the Alpha architecture. Operands (identified by the instruction) are compared. Based on that comparison, the PC may be changed.
  2. Condition codes. The processor keeps and maintains another variable, often called a Condition Code Register or just the Condition Codes. This variable contains information relevant to the result generated by an instruction. For example, the condition code indicates whether the result was positive, zero, or negative.
When the Condition Codes get updated varies from architecture to architecture. One easy-to-remember implementation has the update done during step 6 of every instruction that does an arithmetic type of operation.
The control instruction then looks at the Condition Codes to base a decision on whether or not to overwrite the PC.
An example of an instruction (invented) might be
bpos label
This instruction would overwrite the PC with the address implied by the operand label, if the Condition Codes indicated a positive result.
Note that the ordering of instructions (so that the intended one sets the Condition Codes) becomes important with this method.
This method is used by the Intel IA-32 (x86 or Pentium) architecture. The SPARC also has condition codes. On the SPARC, there are 2 forms of many arithmetic instructions. One form sets condition codes based on the result, and the other form does not.
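The condition code method can be sketched in C like this. The struct cpu, the three-valued enum cc, and the function names sub_setcc and bpos are assumptions made for this example; real architectures keep several separate condition bits, and this sketch ignores arithmetic overflow.

```c
/* Hypothetical condition codes: just the sign of the last result. */
enum cc { CC_NEGATIVE, CC_ZERO, CC_POSITIVE };

/* The two variables the processor maintains in this sketch. */
struct cpu {
    int pc;           /* program counter */
    enum cc codes;    /* Condition Codes */
};

/* An arithmetic instruction: computes x - y and, as a side effect,
   records whether the result was negative, zero, or positive. */
int sub_setcc(struct cpu *cpu, int x, int y)
{
    int result = x - y;
    if (result < 0) {
        cpu->codes = CC_NEGATIVE;
    } else if (result == 0) {
        cpu->codes = CC_ZERO;
    } else {
        cpu->codes = CC_POSITIVE;
    }
    return result;
}

/* The invented control instruction "bpos label": it compares nothing
   itself.  It only looks at the Condition Codes left behind by an
   earlier instruction, and overwrites the PC if they say "positive". */
void bpos(struct cpu *cpu, int target)
{
    if (cpu->codes == CC_POSITIVE) {
        cpu->pc = target;
    }
}
```

Because bpos depends on whichever instruction last set the codes, placing another code-setting instruction between sub_setcc and bpos would change the outcome. This is the ordering concern noted above.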
Control instructions are the mechanism by which the high-level language construct
if (condition) then
execute some code
(an if statement) is implemented. It is also called conditional execution. Depending on the evaluation of a condition, a set of code/instructions is either executed or not executed.
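One way to picture this is to write the same if statement twice in C: once normally, and once using goto so that it has the shape of a conditional branch. The function names abs_hll and abs_branch are invented for this sketch, and a compiler is free to generate different code than the second version suggests.

```c
/* The high-level version: an ordinary if statement. */
int abs_hll(int x)
{
    if (x < 0) {
        x = -x;
    }
    return x;
}

/* The same logic in the shape of a conditional branch: test the
   opposite condition, and jump over the code when it holds. */
int abs_branch(int x)
{
    if (x >= 0) goto skip;   /* conditional branch, taken when x >= 0     */
    x = -x;                  /* reached only when the branch is not taken */
skip:
    return x;
}
```

Both functions return the same value for any input where -x is representable; the second simply makes the change of control flow explicit.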
2 categories of control instructions are generally defined:
  1. Unconditional. This is the equivalent of a goto. No condition is evaluated. A set of instructions is always skipped (not executed).
  2. Conditional. The examples given of invented assembly language instructions were both conditional control instructions. A condition is evaluated. Depending on the outcome of the evaluation, the set of code/instructions will or will not be executed.
Both unconditional and conditional control instructions are implemented in (every architecture's) assembly language with branch (or jump) instructions.
For an unconditional branch (or jump), no condition is evaluated, and the PC is always changed (as step 6 in the instruction fetch and execute cycle).
For a conditional branch (or jump), the condition is evaluated, and the PC is conditionally changed (as step 6 in the instruction fetch and execute cycle).
The terminology used to describe this: if the PC is explicitly changed, other than to update it to contain the address of the sequentially next instruction, then we say the branch is taken. If the PC is not changed, and the resulting contents of the PC is the address of the sequentially next instruction, then we say the branch is not taken.
So, an unconditional branch is always taken. And, a conditional branch may or may not be taken.
One more implementation item to consider:
An unconditional branch can be implemented by using a
conditional branch, where the condition always evaluates
to the one for the taken branch.
Here is a high-level language example that is not likely to ever be used, as it does not make logical sense. It implies the use of an unconditional branch.
if ( 1 == 0 ) {
/* some code in here that never gets executed */
}
Now, reconsider the invented assembly language example given,
beq x, y, target_label
If x and y are always the same, then the branch will be taken, resulting in the same behavior as an unconditional branch.
If x and y represent integer values, and the syntax allows immediate integer values written into the code, the unconditional branch may be implemented with
beq 0, 0, target_label
Copyright © Karen Miller, 2006, 2007, 2008