SAM - A Symbolic Assembler for the MACC 2

SAM is a line oriented symbolic assembler which has been developed for the MACC 2. The MACC 2 is a 16 bit machine with 16 general purpose registers and a memory of 2^15 two byte words addressed from 0..2^15 - 1. SAM accepts, as input, lines of text and generates a memory image for the MACC 2. Each line of input must be a comment, an assembler directive or an instruction in symbolic form. All blanks and tabs in the input to SAM are ignored except as noted in the description of the STRING directive.

1. The Location Counter

SAM maintains an internal variable, the location counter, which is initialized to zero and incremented when assembler directives and symbolic instructions are processed. SAM uses the location counter to determine the location of the next memory element to be used when it is building a memory image for the MACC 2.

2. Comments

Comments and blank lines (lines containing only blanks and tabs) are ignored by SAM. Comments begin with the character '%' and continue to the end of the line. Comments may follow any assembler directive or instruction.

3. The Assembler Directives

SAM currently recognizes five assembler directives: INT, REAL, STRING, LABEL and SKIP. Each directive must be completely specified on a single line of the input.

3.1 INT

An INT directive consists of the keyword "INT" followed by a signed integer constant in the range -2^15..2^15 - 1. This directive causes SAM to update the memory image being created by placing the 16 bit 2's complement representation of the specified value in the memory location addressed by the current value of the location counter. SAM then increments the location counter by one. Examples illustrating correct and incorrect uses of the INT directive are presented below.

    INT 45        % OK
    INT -4782     % OK
    INT 4.5       % INCORRECT - not an integer value
    INT 4 5 6     % OK - blanks are ignored
    INT - -45     % INCORRECT - too many signs
    INT + 345     % OK - the '+' is redundant
    INT 3,456     % INCORRECT - the ',' is not allowed
    int 3         % INCORRECT - "int" is illegal

3.2 REAL

REAL directives consist of the keyword "REAL" followed by a signed real constant (given in decimal notation) which can be represented in 32 bits using the floating point representation of the MACC 2. SAM will deposit the MACC 2 representation of the specified value in the two consecutive locations starting with the memory location identified by the value of the location counter. SAM then increments the location counter by two. Examples illustrating correct and incorrect uses of the REAL directive are presented below.

    REAL 4.35     % OK
    REAL -5.77    % OK
    REAL 4 E 16   % INCORRECT - exponent notation forbidden
    REAL 4.4.5    % INCORRECT - too many decimal points
    REAL 4.5 6    % OK
    real 4.35     % INCORRECT - "real" is illegal

3.3 STRING

STRING directives are used to deposit sequences of ASCII characters in the memory image being created by SAM. Each STRING directive begins with the keyword "STRING" followed by a string constant. String constants consist of characters delimited by double quotes ("). Blanks and tabs are significant in string constants.

In a string constant, the construct ":ddd" (where ddd is a 3 digit decimal integer) is used to denote an unprintable character. The decimal value following the ':' is interpreted as the ASCII representation of the desired character. Further, a double quote can be embedded in a string constant using the construct ":"". Similarly, "::" is used to embed a single ':'.

When processing STRING directives, SAM stores the character values contained in the string constant in consecutive memory locations (two characters per word) starting with the location addressed by the current value of the location counter.
Moreover, SAM always stores an ASCII null (:000) as the last character in a string constant. SAM then increments the location counter so that it addresses the memory location immediately following the last memory location used to store the string constant. Examples illustrating correct and incorrect uses of the STRING directive are presented below.

    STRING "a typical string"     % OK
    STRING ":"quoted string:""    % OK - note the embedded "'s
    STRING "          "           % OK - lots of blanks
    STRING "a"                    % OK
    STRING ":007"                 % OK - a bel character
    STRING this " is bad"         % INCORRECT - "this" is illegal
    STRING ":"                    % INCORRECT - too few '"'
    STRING 'this'                 % INCORRECT - SAM needs '"'
    STRING ""                     % OK - the empty string
    string "poor"                 % INCORRECT - "string" is illegal

3.4 SKIP

SKIP directives are used to allocate blocks of memory which are to be used for storing data during program execution. A SKIP directive consists of the keyword "SKIP" followed by an unsigned integer constant in the range 1..2^15 - 1. SAM processes SKIP directives by incrementing the location counter by the specified value. Examples illustrating correct and incorrect uses of the SKIP directive are presented below.

    SKIP 15       % OK - reserves 15 words of memory
    SKIP 1.5      % INCORRECT - must be an integer constant
    SKIP +12      % INCORRECT - constant must be unsigned
    skip 34       % INCORRECT - "skip" is illegal

3.5 LABEL

SAM maintains an internal table of user defined symbols. A user defined symbol consists of at most 5 upper case letters and digits and must begin with an upper case letter. User defined symbols can only be associated with memory locations (i.e., these symbols do not denote arbitrary constants). LABEL directives are used to define user defined symbols and associate values with them. Each LABEL directive consists of the keyword "LABEL" followed by a user defined symbol.
SAM processes LABEL directives by entering the user defined symbol into its internal table of symbols and associating the current value of the location counter with this symbol. SAM does not increment the location counter when processing a LABEL directive. An error message will be generated if the user defined symbol has already been defined by a previous LABEL directive.

User defined symbols can be referenced before they are defined (using the LABEL directive). Hence, SAM may need to "patch up" the memory image being created when it processes a LABEL directive. When SAM has finished processing its input (End of File), if there are user defined symbols which have been referenced but not defined, SAM will print an error message. Examples illustrating correct and incorrect uses of the LABEL directive are presented below.

    LABEL L1      % OK
    LABEL SKIP    % OK - "SKIP" is NOT reserved
    LABEL 1F7     % INCORRECT - "1F7" is an illegal symbol
    LABEL a1      % INCORRECT - "a1" is an illegal symbol
    label L2      % INCORRECT - "label" is illegal

4. Instructions

An instruction consists of an instruction name followed by one or two addresses. If there are two addresses in the instruction, they must be separated by a comma. SAM translates the symbolic representation of an instruction into the appropriate MACC 2 instruction. A one word instruction is deposited in the memory location addressed by the current value of the location counter, and the location counter is incremented by one. A two word instruction is deposited in two consecutive memory locations starting with the location addressed by the current value of the location counter, which is then incremented by two.

4.1 The Symbolic Form of an Address

The symbolic address forms and the corresponding effective address calculations are summarized in the table below.
For the purpose of brevity, r is an integer in the range 0..15, w is an integer in the range -2^15..2^15 - 1 (except for direct memory and memory indirect, in which case it is in the range 0..2^15 - 1) and PC denotes the location counter (or program counter). Parentheses are used to denote the contents of a register or memory element.

    Name               Symbolic Form   Effective Address
    ----               -------------   -----------------
    Register Direct    Rr              Rr
    Memory Direct      w               w
    Indexed            w(Rr)           (Rr) + w
    Immediate          #w              (PC) - 1
    Register Indirect  *Rr             (Rr)
    Memory Indirect    *w              (w)
    Indexed Indirect   *w(Rr)          ((Rr) + w)
    PC Relative        &w              (PC) + w

4.2 2-Address Instructions

    Name  Operation               Effect
    ----  ---------               ------
    IN    Integer Negation        (R1) <- -(A2)
    IA    Integer Addition        (R1) <- (R1) + (A2)
    IS    Integer Subtraction     (R1) <- (R1) - (A2)
    IM    Integer Multiplication  (R1) <- (R1) * (A2)
    ID    Integer Division        (R1) <- floor[(R1) / (A2)]
    FN    Float Negation          (R1:R1+1) <- -(A2:A2+1)
    FA    Float Addition          (R1:R1+1) <- (R1:R1+1) + (A2:A2+1)
    FS    Float Subtraction       (R1:R1+1) <- (R1:R1+1) - (A2:A2+1)
    FM    Float Multiplication    (R1:R1+1) <- (R1:R1+1) * (A2:A2+1)
    FD    Float Division          (R1:R1+1) <- (R1:R1+1) / (A2:A2+1)
    BI    Bitwise Inversion       (R1) <- not(A2)
    BO    Bitwise OR              (R1) <- (R1) or (A2)
    BA    Bitwise AND             (R1) <- (R1) and (A2)
    IC    Integer Comparison      (R1) <- compare (R1) to (A2)
    FC    Float Comparison        (R1) <- compare (R1:R1+1) to (A2:A2+1)
    JSR   Jump to Subroutine      (R1) <- (PC); (PC) <- A2
    BKT   Block Transfer          (R1) <- move (R1+1) bytes of memory
                                  starting at location (R1) to memory
                                  starting at A2.
    LD    Load                    (R1) <- (A2)
    STO   Store                   (A2) <- (R1)
    LDA   Load Address            (R1) <- A2
    FLT   Integer to Float        (R1:R1+1) <- (A2)
    FIX   Float to Integer        (R1) <- floor[(A2:A2+1)]

The names of the 2-address instructions are presented in the above table. Every 2-address instruction consists of one of the names given in this table followed by two addresses separated by a comma.
The first address must be a direct register address (R1) while the second address (A2) can be any one of the eight address types described above with the following exceptions:

    1. The second address of a floating point instruction (FN, FA, FS, FM, FD, FC or FIX) cannot be an immediate address.

    2. The second address of a "JSR" instruction cannot be a direct register or an immediate address.

    3. The second address of a "BKT" instruction cannot be a direct register or an immediate address.

    4. The second address of an "STO" instruction cannot be an immediate address.

    5. The second address of an "LDA" instruction cannot be a direct register address.

    6. The second address of a "FIX" instruction cannot be an immediate address.

Examples illustrating the correct and incorrect description of 2-address instructions are shown below.

    IA R4, R5          % OK
    IS R4 R5           % INCORRECT - no comma
    IC R3, *-4(R1)     % OK - indexed indirect addressing
    STO R5, &-10       % OK - PC relative addressing
    IM R10, #47        % OK - immediate addressing
    FA R15, R2         % OK - uses wrap around
    IA 4, +45(R2)      % INCORRECT - first address is illegal
    FA R4, +4          % OK
    FA R4, #3          % INCORRECT - exception 1 above
    JSR R11, R14       % INCORRECT - exception 2 above
    BKT R13, R4        % INCORRECT - exception 3 above
    STO R5, #358       % INCORRECT - exception 4 above
    LDA R2, R5         % INCORRECT - exception 5 above
    FIX R3, #45        % INCORRECT - exception 6 above

4.3 Jump Instructions

    Name  Operation                   Effect
    ----  ---------                   ------
    JMP   Unconditional jump          (PC) <- A
    JLT   Jump when less              if LT = 1 then (PC) <- A
    JLE   Jump when less or equal     if LT = 1 or EQ = 1 then (PC) <- A
    JEQ   Jump when equal             if EQ = 1 then (PC) <- A
    JNE   Jump when not equal         if EQ = 0 then (PC) <- A
    JGE   Jump when greater or equal  if GT = 1 or EQ = 1 then (PC) <- A
    JGT   Jump when greater           if GT = 1 then (PC) <- A
    NOP   No operation                Nothing

The names of the jump instructions are presented in the table above. A jump instruction consists of a jump instruction name followed by an address, A.
This address must be a direct memory address, an indexed address, a register indirect address, a memory indirect address, an indexed indirect address or a PC relative address. SAM will produce an error message if a direct register address or an immediate address is used in a jump instruction. The "NOP" instruction is unique among the jump instructions in that it does not require an address. If an address is found in the description of a "NOP", the address will be ignored (every "NOP" is a 1 word instruction). Examples illustrating the correct and incorrect description of jump instructions are shown below.

    JLE +492      % OK
    JLT *R4       % OK
    JGE *888      % OK
    NOP           % OK
    NOP 888       % OK - the address is ignored
    JMP #45       % INCORRECT - immediate address
    JNE R5        % INCORRECT - direct register address

4.4 Shift Instructions

    Name  Operation
    ----  ---------
    SRZ   Shift right fill with 0
    SRO   Shift right fill with 1
    SRE   Shift right bit extend
    SRC   Shift right circular
    SRCZ  Shift right doubleword fill with 0
    SRCO  Shift right doubleword fill with 1
    SRCE  Shift right doubleword bit extend
    SRCC  Shift right doubleword circular
    SLZ   Shift left fill with 0
    SLO   Shift left fill with 1
    SLE   Shift left bit extend
    SLC   Shift left circular
    SLCZ  Shift left doubleword fill with 0
    SLCO  Shift left doubleword fill with 1
    SLCE  Shift left doubleword bit extend
    SLCC  Shift left doubleword circular

Each shift instruction consists of one of the shift instruction names presented above followed by a direct register address, a comma and an unsigned integer. The unsigned integer specifies the number of bits to be shifted. Thus for a single register shift, this number must be in the range 1..16, while a range of 1..32 is allowed for a doubleword shift. The shift is done in place, so the contents of the register are changed. Examples illustrating the correct and incorrect description of shift instructions are shown below.

    SRZ R5, 5     % OK
    SLC R15, 14   % OK
    SRCE 15, 2    % INCORRECT - first address is incorrect
    SLCZ R4 5     % INCORRECT - missing comma
    SLO R5, 24    % INCORRECT - shift amount too large
    SRZ R3, 16    % OK - clears R3

4.5 The Pseudo-Instruction Clear

A clear instruction consists of the keyword "CLR" followed by a direct register address. This instruction is equivalent to a left or right zero-fill shift of 16 bits. Because there is no MACC 2 clear instruction, this is called a pseudo-instruction.

4.6 I/O Instructions

    Name  Operation
    ----  ---------
    RDI   Read integer
    RDF   Read float
    RDBD  Read binary digit
    RDBW  Read binary word
    RDHD  Read octal digit
    RDHW  Read octal word
    RDCH  Read character
    RDST  Read string
    RDNL  Read new line
    WRI   Write integer
    WRF   Write float
    WRBD  Write binary digit
    WRBW  Write binary word
    WRHD  Write octal digit
    WRHW  Write octal word
    WRCH  Write character
    WRST  Write string
    WRNL  Write new line

The names of the I/O instructions are given above. An I/O instruction consists of an I/O instruction name followed by an address. A read instruction deposits the contents of an input device in the specified location; a write instruction yields the contents of the specified location to an output device. Immediate addressing cannot be used in the input instructions or in the "WRF" and "WRST" instructions. The "WRNL" instruction does not require an address. If an address is given with a "WRNL" instruction, it will be ignored.

Note: On input, strings are terminated by the end of a line; on output, strings are terminated by a null byte (ASCII 0).

4.7 The TRNG Instruction

The test range instruction checks whether its first argument (an integer) is within a range defined by an upper and lower bound (also integers). The lower bound is in the constant area at a location defined by the second argument (a general address), and the upper bound is at the word following the lower bound in the constant area.
The first argument is a register address.

4.8 The Halt Instruction

The halt instruction consists of the keyword "HALT".

4.9 Errors

When SAM detects an error on a line of input, an error message is generated which describes the error and the location of the error. SAM will detect at most one error on each line of input. When the first error is detected, SAM destroys the memory image being created, but continues processing the input looking for further errors.
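
The error rule above, together with the location counter of Section 1 and the directives of Section 3, can be illustrated with a small Python sketch. This is not SAM's implementation. The function name `assemble`, the dictionary used as a memory image, the error message wording, and the restriction to INT, SKIP and LABEL are all assumptions made for illustration; REAL, STRING, instructions and forward references are left out.

```python
import re

WORD_MASK = 0xFFFF  # MACC 2 words are 16 bits wide


def assemble(lines):
    """Toy model of INT, SKIP and LABEL processing (hypothetical helper).

    Returns (image, symbols, errors). image maps location -> 16 bit word,
    or is None when at least one line was in error.
    """
    lc = 0        # location counter, initialized to zero
    image = {}    # sparse memory image: location -> word
    symbols = {}  # user defined symbol -> location
    errors = []   # (line number, message), at most one per line

    for number, raw in enumerate(lines, start=1):
        # Drop the comment, then all blanks and tabs (SAM ignores them).
        text = re.sub(r"[ \t]", "", raw.split("%", 1)[0])
        if not text:
            continue  # blank line or comment only
        if m := re.fullmatch(r"INT([+-]?[0-9]+)", text):
            value = int(m.group(1))
            if -2**15 <= value <= 2**15 - 1:
                image[lc] = value & WORD_MASK  # 2's complement encoding
                lc += 1
            else:
                errors.append((number, "integer out of range"))
        elif m := re.fullmatch(r"SKIP([0-9]+)", text):
            count = int(m.group(1))
            if 1 <= count <= 2**15 - 1:
                lc += count  # reserve words, store nothing
            else:
                errors.append((number, "skip count out of range"))
        elif m := re.fullmatch(r"LABEL([A-Z][A-Z0-9]{0,4})", text):
            name = m.group(1)
            if name in symbols:
                errors.append((number, "symbol already defined"))
            else:
                symbols[name] = lc  # LABEL does not advance the counter
        else:
            errors.append((number, "unrecognized line"))

    # Any error discards the image, but every line has still been checked.
    return (None if errors else image), symbols, errors


image, symbols, errors = assemble([
    "LABEL START   % location 0",
    "INT -1",
    "SKIP 3",
    "LABEL END",
    "INT 4 5 6",
])
print(image)    # {0: 65535, 4: 456}
print(symbols)  # {'START': 0, 'END': 4}
```

In this sketch the image is simply withheld at the end rather than destroyed at the moment of the first error; the visible result is the same, since every remaining line is still scanned and reported.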