Here are several examples, to promote understanding of how assembly language implements code that deals with arrays.
One way to declare an array of 13 characters:
my_chars: .byte 0:13
What was the initial value of each of these declared array elements? Answer: The null character ('\0' in C).
An alternative way to declare the array of 13 characters:
my_chars: .space 13
What is the difference between these two declarations?
Answer: The first way initializes this declared memory space before the program begins execution.
The second way (with the .space directive) does not initialize memory. It allocates the 13 bytes, but does not change their contents before the program begins its execution.
To declare the array of 13 characters, but initialize each character to the value 'A':
my_As: .byte 'A':13
Alternatively:
my_As: .byte 65:13
This second way works fine, but may be less clear to a programmer looking at the code.
It initializes each byte of the array to contain the 8-bit, two's complement representation for the decimal value 65. Since that is the ASCII character encoding for 'A', it works equally well.
To declare an array of integer-sized elements, recall that on the MIPS architecture, each integer requires 4 bytes (or 32 bits). Also, each word on the MIPS architecture is 4 bytes. Therefore, we may use the .word directive to declare an array of integers:
int_array: .word 0:36
This declaration allocates 36 words (integer-sized memory chunks), which are all (nicely) located at word-aligned addresses. The initial value of each array element is 0.
If, instead, we wanted an array of 36 integers, where each element is initialized to the value 2, we may use:
all_twos: .word 2:36
The .space directive might be used to declare an array of integer-sized elements, but can be problematic.
Consider the declaration:
array: .space 100
Space for 25 integer-sized elements is allocated as desired, but there is no guarantee that the elements are at word-aligned addresses. Therefore, this declaration plus code that does
        la   $8, array
        lw   $9, 12($8)     # load the 4th element of the array
may result in an unaligned address exception when the program executes.
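One common way to avoid this problem, assuming the assembler supports the .align directive (SPIM and MARS both document it), is to request word alignment immediately before the .space directive. The following is only a sketch: the variable one_char is hypothetical, added to show a declaration that would otherwise leave the next address unaligned.

```mips
          .data
one_char: .byte 'x'          # hypothetical 1-byte variable; the next free address
                             # is no longer guaranteed to be a multiple of 4
          .align 2           # advance to the next multiple of 2^2 = 4 bytes
array:    .space 100         # 100 bytes = 25 integer-sized elements, word aligned

          .text
          la   $8, array     # $8 is the base address of the array
          lw   $9, 12($8)    # load the 4th element of the array
```

With the .align 2 in place, the base address is a multiple of 4, and so is every offset that is a multiple of 4.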
A related declaration issue that is beyond the scope of this class is the alignment of data within an array, when each element contains more than a single field. An example from C is an array of structures, where each structure has more than one field. The difficulty of the allocation may be seen with the sample structure:
struct fivebytes {
    int oneint;
    char onechar;
};
struct fivebytes array[10];
If each element occupies exactly 5 bytes, most elements do not begin at a word-aligned address. Code that works with elements of this array then has difficulty loading and storing the integer-sized, non-word-aligned oneint in an efficient manner.
One solution pads each element of the array to a multiple of the word size of the machine (8 bytes for this structure), resulting in an inefficient use of memory space.
Another solution requires the code to load and store multiple bytes when the integer field is accessed. This results in code that takes longer to execute.
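As a sketch of the first solution, assume each element is padded to 8 bytes, with oneint at offset 0, onechar at offset 4, and 3 unused bytes after it. The layout, the label, and the index value 3 are assumptions chosen for illustration, not something a particular compiler is guaranteed to produce.

```mips
        .data
        .align 2             # make the base address word aligned
array:  .space 80            # 10 elements * 8 bytes (4 oneint + 1 onechar + 3 padding)

        .text
        li   $8, 3           # $8 is the index of the desired element
        la   $9, array       # $9 is the base address of the array
        mul  $10, $8, 8      # $10 is the offset: index * padded element size
        add  $11, $9, $10    # $11 is the address of array[3]
        lw   $12, 0($11)     # oneint is at offset 0, always word aligned
        lb   $13, 4($11)     # onechar is at offset 4
```

The padding costs 30 bytes across the 10 elements, but the integer field can be accessed with a single lw or sw.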
Assembly language code (high level language code, too!) that does array access may be generally classified as doing either regular accesses or random accesses. A regular access is one that might be stated such as "for each element of the array, do something." Or, "for every 3rd element of the array, do something." A random access is more of an isolated element access similar to the C code:
int array[12];    /* declare an array of 12 integers */
int x;
x = 4;
array[x] = -23;
The code that does a random array element access tends to follow a fixed pattern (a series of steps), as generated by a compiler.
Here is a MIPS assembly language implementation of the C code fragment for the isolated (random) element access:
        .data
array:  .word 0:12           # array of 12 integers
        .text
        li   $8, 4           # $8 is the index, and variable x
        la   $9, array       # $9 is the base address of the array
        mul  $10, $8, 4      # $10 is the offset
        add  $11, $10, $9    # $11 is the address of array[4]
        li   $12, -23        # $12 is the value -23, to be put in array[4]
        sw   $12, ($11)
A regular access will be done within a structured loop. Once the address of the initial element is calculated, further array element addresses are calculated relative to the known one. Only the address changes. This reduces the amount of code necessary within the loop, which results in fewer instructions executed and (therefore) faster code.
Consider the implementation of a code example that is to re-initialize each element of an array of 100 integers to be the value 18. A less efficient implementation places the isolated element access code into a loop. For clarity, this implementation uses the same registers as the isolated element access code.
         .data
array:   .word 0:100         # array of 100 integers
         .text
         li   $8, 0          # $8 is the index, and loop induction variable
         li   $13, 100       # $13 is the sentinel value for the loop
for:     bge  $8, $13, end_for
         la   $9, array      # $9 is the base address of the array
         mul  $10, $8, 4     # $10 is the offset
         add  $11, $10, $9   # $11 is the address of desired element
         li   $12, 18        # $12 is the value 18, to be put in desired element
         sw   $12, ($11)
         add  $8, $8, 1      # increment loop induction variable
         b    for
end_for:
More efficient code to do the same thing increments only the address of the desired element within the loop. As many instructions as possible are removed from within the loop.
         .data
array:   .word 0:100         # array of 100 integers
         .text
         li   $8, 0          # $8 is the loop induction variable
         li   $13, 100       # $13 is the sentinel value for the loop
         la   $9, array      # $9 starts as the base address of the array
                             # and is the address of each element
         li   $12, 18        # $12 is the value 18, to be put in desired element
for:     bge  $8, $13, end_for
         sw   $12, ($9)
         add  $9, $9, 4      # get address of next array element
         add  $8, $8, 1      # increment loop induction variable
         b    for
end_for:
This code might be made even more efficient by eliminating the loop induction variable ($8), instead calculating the address of the last element and using it to decide when to exit the loop. Note that this eliminates a single instruction from within the body of the loop.
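One possible version of that improvement is sketched here. It is an assumption about how the idea might be coded, not the only way: it compares against the address just past the last element (base address plus 100 * 4 bytes) rather than the address of the last element itself, and it uses the unsigned comparison bgeu because addresses are unsigned quantities.

```mips
         .data
array:   .word 0:100         # array of 100 integers
         .text
         la   $9, array      # $9 is the address of the current element
         add  $13, $9, 400   # $13 is the address just past the last element
         li   $12, 18        # $12 is the value 18, to be put in each element
for:     bgeu $9, $13, end_for
         sw   $12, ($9)
         add  $9, $9, 4      # get address of next array element
         b    for
end_for:
```

The loop body now contains one store, one address increment, and the branches. Stepping the address by 12 instead of 4 would visit every 3rd element with the same loop structure.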
There is little formal syntax (in assembly language) to declare or use a 2-dimensional array. Therefore, implementations vary. Here are some MIPS examples to suggest 2-dimensional array implementations.
Without a formalized syntax, a declaration of a 2-dimensional array reduces to the allocation of the correct amount of contiguous memory. The base address identifies the first element of the first row within the first column.
Consider the declaration of an example 2 by 3 array of characters. Each character requires one byte.
chars: .space 6 # 2 by 3 = 6 bytes of allocated space
This is not a satisfying declaration for the abstract thinker, as this declaration might represent a 3 by 2 array, or a 1-dimensional array of 6 characters. The burden is on the programmer to declare the necessary memory space, and then use that space in a consistent manner.
An alternative in MIPS assembly language code allocates a set of arrays. For example, consider a 4 by 6 array of integers, where each element is initialized to the value 18.
arr:    .word 18:6
        .word 18:6
        .word 18:6
        .word 18:6
This declaration conveys the notion of an array of arrays.
Issues with code that operates on a 2-dimensional array are the same as those with 1-dimensional arrays, with the added point of storage order. A 2-dimensional array may be stored in either row major order or column major order.
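To make the effect of storage order concrete, here is a sketch of an isolated element access in a 4 by 6 array of integers, assuming row major order. The label arr and the indices 2 and 5 are chosen only for illustration. The byte offset of the element in row i, column j is (i * 6 + j) * 4.

```mips
        .data
arr:    .word 18:6           # row 0
        .word 18:6           # row 1
        .word 18:6           # row 2
        .word 18:6           # row 3

        .text
        li   $8, 2           # $8 is the row index i
        li   $9, 5           # $9 is the column index j
        la   $10, arr        # $10 is the base address of the array
        mul  $11, $8, 6      # i * (number of columns)
        add  $11, $11, $9    # + j gives the element number
        mul  $11, $11, 4     # * 4 bytes per element gives the byte offset
        add  $11, $11, $10   # $11 is the address of the element at row 2, column 5
        lw   $12, ($11)      # load that element
```

If the same memory were instead treated as column major, the offset would be (j * 4 + i) * 4, since each column then holds 4 consecutive elements. Only the address calculation changes; the declaration is identical.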
If a 2-dimensional array is thought of as an array of 1-dimensional arrays, then operating on one row of a row major ordered array is fairly simple. Likewise, operating on one column of a column major ordered array is fairly simple.
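As an illustration, the following sketch sums one row of a 4 by 6 array of integers stored in row major order. The label arr, the row index 2, and the register choices are assumptions made for the example. Once the address of the first element of the row is known, the row is handled exactly like a 1-dimensional array of 6 integers.

```mips
          .data
arr:      .word 18:6
          .word 18:6
          .word 18:6
          .word 18:6

          .text
          li   $8, 2           # $8 is the row index
          la   $9, arr         # $9 is the base address of the array
          mul  $10, $8, 24     # row offset = row index * 6 elements * 4 bytes
          add  $9, $9, $10     # $9 is the address of the first element of the row
          add  $13, $9, 24     # $13 is the address just past the end of the row
          li   $12, 0          # $12 accumulates the sum
row_loop: bgeu $9, $13, row_done
          lw   $14, ($9)       # load the current element
          add  $12, $12, $14   # add it to the sum
          add  $9, $9, 4       # get address of next element in the row
          b    row_loop
row_done:                      # $12 now holds 6 * 18 = 108
```

Walking down one column of this row major array would use the same loop with a stride of 24 bytes (one full row) instead of 4.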
Copyright © Karen Miller, 2006