
Instructions

Instructions are the primary data structure in Simple-SUIF. Each procedure is represented by a doubly-linked list of instructions. These instructions contain references to the various types, symbols, and registers. Simple-SUIF instructions resemble assembly language instructions. Each one performs a relatively simple operation specified by its opcode. Most instructions have several operands in registers.

Instructions are stored in simple_instr structures. All instructions share several fields: the opcode, the result type, and pointers to the next and previous instructions. Since different instructions require different kinds of operands, the rest of the fields in the simple_instr structure are specific to the instruction format. The simple_op_format function may be used to identify the format used for a particular opcode. The possible format values are:

BASE_FORM
This format provides a destination register and two source registers and is used for most of the opcodes. Not all of the registers are always used. For example, nop (no operation) instructions use none of the registers, and cpy (copy) instructions use only the destination and one of the source registers.

BJ_FORM
Branch and jump instructions use this format, which includes a reference to the target label and an optional source register. The source register is only used with conditional branch instructions.

LDC_FORM
Load constant (ldc) instructions use this format. It includes a destination register and a simple_immed structure to hold the immediate value. See section Immediate Values.

CALL_FORM
This format is only used for procedure call instructions. There are at least two register operands: the destination and the address of the procedure to be called. The format also includes a field to record the number of arguments and an array of the argument registers.

MBR_FORM
Multi-way branch (mbr) instructions use this format. The fields include a source register, an integer offset, a default target label, and an array of other target labels.

LABEL_FORM
Label (lab) instructions identify the position of a label in the instruction list. They use this format, which contains only a reference to the label symbol.
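The mapping from opcode to format can be pictured as a simple switch. The following is only a sketch under stated assumptions: the names sketch_format, sketch_op, and sketch_op_format are hypothetical stand-ins, not the library's declarations, and only a handful of opcodes are listed. It mirrors the rule that opcodes use BASE_FORM unless their description says otherwise.

```c
/* Hypothetical stand-ins for the library's enumerations. The real
   simple_op enumeration has many more members. */
typedef enum {
    SK_BASE_FORM, SK_BJ_FORM, SK_LDC_FORM,
    SK_CALL_FORM, SK_MBR_FORM, SK_LABEL_FORM
} sketch_format;

typedef enum {
    SK_NOP, SK_CPY, SK_ADD, SK_LDC, SK_JMP, SK_BTRUE,
    SK_BFALSE, SK_MBR, SK_LAB, SK_CALL, SK_RET
} sketch_op;

/* Mimics what simple_op_format reports for the opcodes above. */
sketch_format sketch_op_format(sketch_op op)
{
    switch (op) {
    case SK_JMP:
    case SK_BTRUE:
    case SK_BFALSE: return SK_BJ_FORM;    /* branches and jumps */
    case SK_LDC:    return SK_LDC_FORM;   /* load constant */
    case SK_CALL:   return SK_CALL_FORM;  /* procedure call */
    case SK_MBR:    return SK_MBR_FORM;   /* multi-way branch */
    case SK_LAB:    return SK_LABEL_FORM; /* label pseudo-instruction */
    default:        return SK_BASE_FORM;  /* everything else */
    }
}
```

In real code the same switch would typically be written on the value returned by simple_op_format, with one case per format, so that each case reads only the fields that its format defines.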

The result type of an instruction identifies the type of the value produced for the destination register. If the instruction format does not contain a destination register or if the particular opcode does not use the destination, the result type should always be a VOID_TYPE. Otherwise, the result type must always be supplied and should generally be the same as the type of the variable in the destination register. In rare cases, the destination register field may be set to NO_REGISTER even though the instruction produces a result. When this happens, the result type should indicate the type of the value that is actually produced even though that value is unused.

Unlike the other Simple-SUIF data structures, instructions can be freely modified and rearranged. The library includes functions to help create new instructions and deallocate them when they are no longer needed. The new_instr function creates a new simple_instr object given the opcode and result type. The other fields must be filled in separately. Note that this function is merely a convenience; nothing special is required when creating new instructions, and users are free to create them by calling malloc and setting the opcode and result type manually. The free_instr function deallocates the storage used by an instruction. Again, this function does nothing magic and is only provided for convenience.
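Because instructions live in a doubly-linked list, rearranging them comes down to pointer updates. The sketch below illustrates this with a deliberately reduced structure. The type sketch_instr, its field names next and prev, and the three helper functions are assumptions made for illustration; the real simple_instr structure has more fields, and the library's new_instr and free_instr should be preferred when available.

```c
#include <stdlib.h>

/* Reduced, hypothetical stand-in for simple_instr. */
typedef struct sketch_instr {
    int opcode;
    struct sketch_instr *next;
    struct sketch_instr *prev;
} sketch_instr;

/* Allocate an unlinked instruction; returns NULL if malloc fails. */
sketch_instr *sketch_new(int opcode)
{
    sketch_instr *i = malloc(sizeof *i);
    if (i == NULL)
        return NULL;
    i->opcode = opcode;
    i->next = NULL;
    i->prev = NULL;
    return i;
}

/* Link instruction i into the list immediately after pos. */
void sketch_insert_after(sketch_instr *pos, sketch_instr *i)
{
    i->prev = pos;
    i->next = pos->next;
    if (pos->next != NULL)
        pos->next->prev = i;
    pos->next = i;
}

/* Unlink instruction i from its neighbors and release its storage. */
void sketch_unlink_and_free(sketch_instr *i)
{
    if (i->prev != NULL)
        i->prev->next = i->next;
    if (i->next != NULL)
        i->next->prev = i->prev;
    free(i);
}
```

Note that removing the first instruction of a list this way leaves the caller responsible for updating whatever pointer refers to the head of the list.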

The following table lists all of the Simple-SUIF opcodes. Each opcode has an internal representation that is a member of the simple_op enumeration. The simple_op_name function may be used to produce a textual representation for an opcode. The table lists both forms. Unless indicated otherwise, the opcodes use the BASE_FORM format.

nop NOP_OP
Do nothing at all. All of the register operands for these instructions should be set to NO_REGISTER, and the result type should be a VOID_TYPE.

load LOAD_OP
Load the value at the address contained in the src1 register and put it in the dst register. The result type may be any type and indicates the type of the value being loaded. The type of the variable in src1 must be a pointer to the result type. The src2 register is not used.

str STR_OP
Store the value in the src2 register at the address contained in the src1 register. Both registers must be specified. The src2 register may have any type. The src1 register should contain a variable that is a pointer to the type of the variable being stored. The dst register is not used.

mcpy MCPY_OP
Memory-to-memory copy. Load the value from the address in the src2 register and store it at the address in the src1 register. Objects of any type may be copied. Both of the source registers must be pointers to the type of the object that is being copied. The dst register is not used.

cpy CPY_OP
Copy the src1 register to the dst register. The src2 register is not used. The result type must be compatible with the type of the source register but need not necessarily be equivalent. Only scalar types are allowed here.

cvt CVT_OP
Convert the src1 register to the result type and put it in the dst register. The src2 register is not used. Conversions should be performed in steps, changing only one attribute at a time. For example, when converting from an unsigned 8-bit type to a signed 32-bit type, use one cvt instruction to change the size and another to make it signed. Conversions between integer and floating-point types should always be made using types that are as close as possible in size. Nothing can be converted to or from a VOID_TYPE type or a RECORD_TYPE. ADDRESS_TYPE types can only be converted to and from integer types.
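The stepwise conversion rule has a direct analogue in C casts. The sketch below is only an illustration of the idea, not Simple-SUIF code: the function name convert_u8_to_s32 is hypothetical, and each cast corresponds to what one cvt instruction would do in the example above.

```c
#include <stdint.h>

/* Convert an unsigned 8-bit value to a signed 32-bit value in two
   steps, changing one attribute per step. */
int32_t convert_u8_to_s32(uint8_t v)
{
    uint32_t widened = (uint32_t)v;       /* step 1: change the size only */
    int32_t result = (int32_t)widened;    /* step 2: change the signedness only */
    return result;
}
```

Every value of an unsigned 8-bit type fits in a signed 32-bit type, so both steps preserve the value.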

ldc LDC_OP
Load a constant value. This instruction uses the LDC_FORM format. The value operand is an immediate constant stored in a simple_immed structure (see section Immediate Values). If the immediate value is an integer, the result type must be a SIGNED_TYPE, UNSIGNED_TYPE or ADDRESS_TYPE type. The result type should be a FLOAT_TYPE type if the immediate is a floating-point value. And finally, if the immediate value is a symbolic address, the result type should always be an ADDRESS_TYPE type.

neg NEG_OP
Negation. Change the sign of the value in the src1 register and put the result in the dst register. The src2 register is unused. The result type and the type of the registers must be compatible integer or floating-point types.

add ADD_OP
Add the values in the src1 and src2 registers and put the result in the dst register. Except for pointer additions, the result type and the types of the registers must be compatible integer or floating-point types. Pointer addition is a special case. One of the source registers may have an ADDRESS_TYPE type, as long as the other source register contains a variable with an integer type of the same size as the pointer; the result type must also be an ADDRESS_TYPE type.

sub SUB_OP
Subtract the value in the src2 register from the value in the src1 register and put the result in the dst register. Except for pointer subtractions, the result type and the types of the registers must be compatible integer or floating-point types. There are two special cases for pointer subtractions. In either case, the src1 register must have an ADDRESS_TYPE type. First, the src2 register may have an integer type of the same size as the pointer to produce an ADDRESS_TYPE value in the dst register. Second, the src2 register may be another ADDRESS_TYPE value to produce a value in the dst register with an integer type that is the same size as the pointers.

mul MUL_OP
div DIV_OP
Multiply or divide the value in the src1 register by the value in the src2 register and put the result in the dst register. The result type and the types of the registers must be compatible integer or floating-point types. Integer multiplication and division are defined according to the rules for ANSI C.

rem REM_OP
mod MOD_OP
Remainder and modulus. These two instructions are very similar. Both divide the value in the src1 register by the value in the src2 register to find the remainder or modulus. The rem instruction is identical to the modulus operator in ANSI C, and the mod instruction is the same except that its result is guaranteed never to be negative. The result type and the types of the destination and source registers must be compatible integer types.
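The difference between the two instructions shows up only when the remainder would be negative. The following sketch models both on C int values. The function names sketch_rem and sketch_mod are hypothetical, the code assumes C99 semantics for the % operator (truncation toward zero), and the handling of a negative divisor in sketch_mod is an assumption, since the text above does not spell it out.

```c
/* Models rem: same as the C % operator. The divisor b must be non-zero. */
int sketch_rem(int a, int b)
{
    return a % b;
}

/* Models mod: like rem, but a negative remainder is shifted into the
   non-negative range by adding the magnitude of the divisor.
   Assumes b is non-zero and its magnitude is representable as an int. */
int sketch_mod(int a, int b)
{
    int r = a % b;
    if (r < 0)
        r += (b < 0) ? -b : b;
    return r;
}
```

For example, with a dividend of -7 and a divisor of 3, sketch_rem yields -1 while sketch_mod yields 2; for non-negative dividends the two agree.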

not NOT_OP
Bit-wise inversion. Compute the one's complement negation of the value in the src1 register and put the result in the dst register. The src2 register is not used. The result type and the types of the registers must be compatible UNSIGNED_TYPE types.

and AND_OP
ior IOR_OP
xor XOR_OP
Compute the bit-wise AND, inclusive OR, or exclusive OR of the values in the src1 and src2 registers and put the result in the dst register. The result type and the types of the registers must be compatible UNSIGNED_TYPE types.

asr ASR_OP
lsr LSR_OP
lsl LSL_OP
Shift the value in the src1 register right or left by the amount specified in the src2 register and put the result in the dst register. The variable in the src2 register must always have an UNSIGNED_TYPE type. The asr instruction performs sign extension and requires that the result type and the types of the dst and src1 registers be compatible SIGNED_TYPE types. The lsr instruction does not perform sign extension and requires that the result type and the types of the dst and src1 registers be compatible UNSIGNED_TYPE types. Sign extension is not an issue for left shifts, so the lsl instruction only requires that the result type and the types of the dst and src1 registers be compatible integer types.
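The three shifts can be modeled on 32-bit C values as follows. This is a sketch, not library code: the function names are hypothetical, the 32-bit width is chosen only for illustration, and shift amounts are assumed to lie in the range 0 to 31. Because right-shifting a negative signed value is implementation-defined in C, the arithmetic shift is written so that only non-negative values are ever shifted.

```c
#include <stdint.h>

/* Models lsr: vacated high bits are filled with zeros. */
uint32_t sketch_lsr32(uint32_t v, unsigned n)
{
    return v >> n;
}

/* Models lsl: vacated low bits are filled with zeros. */
uint32_t sketch_lsl32(uint32_t v, unsigned n)
{
    return (uint32_t)(v << n);
}

/* Models asr: vacated high bits are filled with copies of the sign bit.
   For negative v, ~v is non-negative, so shifting it is well defined;
   complementing again restores the sign-extended result. */
int32_t sketch_asr32(int32_t v, unsigned n)
{
    return (v < 0) ? ~(~v >> n) : (v >> n);
}
```

Shifting -8 right by one gives -4 with sketch_asr32, whereas sketch_lsr32 applied to the same bit pattern (0xFFFFFFF8) gives 0x7FFFFFFC.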

rot ROT_OP
Rotate the value in the src1 register left or right by the amount specified in the src2 register and put the result in the dst register. The variable in the src2 register must always have a SIGNED_TYPE type. If the rotation amount is positive, the value is rotated to the left; if it is negative, the value is rotated to the right. The result type and the types of the dst and src1 registers must be compatible integer types.
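A signed rotation amount can be handled by reducing it to an equivalent left rotation. The sketch below models this on a 32-bit value. The function name sketch_rot32 and the 32-bit width are assumptions for illustration, and reducing the amount modulo the width is also an assumption, since the description above does not say how amounts larger than the width are treated.

```c
#include <stdint.h>

/* Rotate v left for positive amounts and right for negative amounts. */
uint32_t sketch_rot32(uint32_t v, int amount)
{
    /* Map any amount to a left rotation in the range 0..31.
       A right rotation by k equals a left rotation by 32 - k. */
    unsigned n = (unsigned)(((amount % 32) + 32) % 32);
    if (n == 0)
        return v;               /* avoid shifting by the full width */
    return (uint32_t)((v << n) | (v >> (32 - n)));
}
```

Rotating 0x80000001 by 1 yields 0x00000003, and rotating that result by -1 restores the original value.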

seq SEQ_OP
sne SNE_OP
sl SL_OP
sle SLE_OP
Comparison instructions. If the src1 register is equal, not equal, less than, or less than or equal, respectively, to the src2 register, assign the integer value one to the dst register. Otherwise, set the dst register to zero. The result type must always be a SIGNED_TYPE type. The source registers must have compatible scalar types.

jmp JMP_OP
Unconditional jump. This instruction uses the BJ_FORM format, but the src register is unused. The flow of control is unconditionally transferred to the code at the target label.

btru BTRUE_OP
bfls BFALSE_OP
Branch if true or false. These instructions use the BJ_FORM format. If the src register, which must have an integer type, contains a true (non-zero) or false (zero) value, respectively, the flow of control is transferred to the code at the target label. Otherwise, it continues with the next instruction in sequential order.

mbr MBR_OP
Multi-way branch. This instruction uses the MBR_FORM format, which includes an array of target labels. The number of labels in the array is identified by the value of the ntargets field. Control is transferred to one of these labels depending on the value in the src register. The variable in the src register must have a SIGNED_TYPE or UNSIGNED_TYPE type. The integer value in the offset field is subtracted from the value in the src register and the result is used to index into the array of target labels. If the index is within the range of the array, the instruction branches to the label at that position in the array; otherwise, it branches to the default label in the deflab field.
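The target selection rule can be sketched as a small function. This is an illustration only: labels are represented here as strings rather than label symbols, and the function name sketch_mbr_select and its parameter list are hypothetical, although the parameter names follow the fields described above.

```c
/* Return the label that an mbr instruction would branch to.
   index = src - offset; in-range indices select from targets,
   anything else falls back to deflab. */
const char *sketch_mbr_select(long src, long offset,
                              const char *const targets[],
                              unsigned ntargets,
                              const char *deflab)
{
    long index = src - offset;
    if (index >= 0 && (unsigned long)index < ntargets)
        return targets[index];
    return deflab;
}
```

With an offset of 10 and three targets, source values 10, 11, and 12 select the first, second, and third labels, and every other value selects the default label.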

lab LABEL_OP
Label pseudo-instruction. This instruction uses the LABEL_FORM format. No operation is performed by a label instruction. Its only purpose is to mark the location of a label symbol in the instruction list. The lab field must be a pointer to the simple_sym for the label.

call CALL_OP
Call a procedure. This instruction uses the CALL_FORM format. The proc register must hold a pointer to the procedure to be called. The type of the variable in that register must be a pointer to the type of the procedure. Note: Simple-SUIF currently cannot represent procedure types, so you must use the pointer types that are used in the input list when referring to procedure addresses. The result type of the call instruction must match the return type of the procedure. If the result type is not a VOID_TYPE type, the dst register will be assigned the value returned by the procedure. The nargs field in the call instruction indicates how many arguments are in the args array. Each entry in the args array is a register holding the value of an argument to the procedure.

ret RET_OP
Return from a procedure. Only the src1 register is used and it is optional. If specified, it is the return value and may contain a variable of any type.
