The following examples show how assembly language implements code that works with arrays.
One way to declare an array of 13 characters:
my_chars: .byte 0:13
What is the initial value of each of these declared array
elements? Answer: The null character ('\0' in C).
An alternative way to declare the array of 13 characters:
my_chars: .space 13
What is the difference between these two declarations?
Answer: The first way initializes this declared memory space
before the program begins execution.
The second way (with the .space directive)
does not initialize memory. It allocates the
13 bytes, but does not change their contents before the program
begins its execution.
To declare the array of 13 characters with each character
initialized to the value 'A':
my_As: .byte 'A':13
Alternatively:
my_As: .byte 65:13
This second way works fine, but may be less clear
to a programmer looking at the code.
This second way initializes each byte of the array to
contain the 8-bit, two's complement representation for
the decimal value 65. Since that is the ASCII character
encoding for 'A', it works equally well.
To declare an array of integer-sized elements,
recall that on the MIPS architecture,
each integer requires 4 bytes (or 32 bits).
Also, each word on the MIPS architecture is
4 bytes.
Therefore, we may use the .word directive to
declare an array of integers:
int_array: .word 0:36
This declaration allocates 36 words (integer-sized memory chunks), all located at word-aligned addresses. The initial value of each array element is 0.
If, instead, we wanted an array of 36 integers, where each element is initialized to the value 2, we may use:
all_twos: .word 2:36
The .space directive might be used to declare an array
of integer-sized elements, but can be problematic.
Consider the declaration:
array: .space 100
This allocates 100 bytes, enough for 25 integer-sized elements, but there is no guarantee that each element is at a word-aligned address. Therefore, this declaration plus code that does
la $8, array
lw $9, 12($8) # load the 4th element of the array
may result in an unaligned address exception when the program
executes.
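The alignment requirement can be made concrete with a small C sketch. The helper names is_word_aligned and align_up_to_word are illustrative and do not come from these notes; the sketch assumes the 4-byte MIPS word described above and treats addresses as plain unsigned integers.

```c
#include <stdint.h>

#define WORD_SIZE 4u  /* assumption: 4-byte words, as on 32-bit MIPS */

/* Returns 1 if addr is a multiple of the word size, 0 otherwise.
   A lw or sw is only legal at an address for which this returns 1. */
int is_word_aligned(uint32_t addr)
{
    return (addr % WORD_SIZE) == 0;
}

/* Rounds addr up to the next word boundary: the lowest address at or
   after addr where an array of words could safely begin. */
uint32_t align_up_to_word(uint32_t addr)
{
    return (addr + WORD_SIZE - 1u) & ~(WORD_SIZE - 1u);
}
```

As a hypothetical case, if array happened to start at address 0x10010001, the lw at offset 12 would use address 0x1001000d, which is_word_aligned rejects. Starting the array at align_up_to_word(0x10010001), which is 0x10010004, would make every 4-byte element word aligned.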
A related declaration issue that is beyond the scope of this class is the alignment of data within an array, when each element contains more than a single field. An example from C is an array of structures, where each structure has more than one field. The difficulty of the allocation may be seen with the sample structure:
struct fivebytes {
    int oneint;
    char onechar;
};
struct fivebytes array[10];
If the 5-byte elements are packed back to back, the integer-sized
field oneint is not word aligned in most elements, so code that
works with this array cannot load or store it efficiently.
One solution pads each element of the array to a
multiple of the word size of the machine (8 bytes here),
resulting in an inefficient use of memory space.
Another solution requires the code to load and store
multiple bytes, when the integer field is accessed.
This results in code that takes longer to execute.
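The padding trade-off can be checked with a few lines of C. This is a minimal sketch: padded_size, element_offset, and compiler_stride are hypothetical helper names, and the 4-byte word size passed in the usage below is the MIPS assumption from earlier in these notes.

```c
#include <stddef.h>

struct fivebytes {
    int  oneint;
    char onechar;
};

/* Rounds size up to a multiple of align: the element size after padding. */
size_t padded_size(size_t size, size_t align)
{
    return ((size + align - 1) / align) * align;
}

/* Byte offset of element i when every element occupies stride bytes. */
size_t element_offset(size_t i, size_t stride)
{
    return i * stride;
}

/* The stride the C compiler actually chose for this structure.
   The value is implementation-defined. */
size_t compiler_stride(void)
{
    return sizeof(struct fivebytes);
}
```

With a packed stride of 5 bytes, element_offset(1, 5) is 5, so oneint in the second element is not word aligned. With padded_size(5, 4), which is 8, the second element starts at offset 8 and oneint is aligned, at the cost of 3 unused bytes per element. C compilers commonly choose the padded layout, so compiler_stride() often reports 8 on machines with 4-byte integers.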
Assembly language code (high level language code, too!) that does array access may be generally classified as doing either regular accesses or random accesses. A regular access is one that might be stated as "for each element of the array, do something" or "for every 3rd element of the array, do something." A random access is an isolated element access, similar to the C code:
int array[12]; /* declare an array of 12 integers */
int x;
x = 4;
array[x] = -23;
The code that does a random array element access tends to follow a fixed pattern (a series of steps), as generated by a compiler.
Here is a MIPS assembly language implementation of the C code fragment for the isolated (random) element access:
        .data
array:  .word 0:12          # array of 12 integers
        .text
        li   $8, 4          # $8 is the index, and variable x
        la   $9, array      # $9 is the base address of the array
        mul  $10, $8, 4     # $10 is the offset
        add  $11, $10, $9   # $11 is the address of array[4]
        li   $12, -23       # $12 is the value -23, to be put in array[4]
        sw   $12, ($11)
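The same fixed pattern (base address, plus index times element size) can be written out by hand in C. This is only a sketch of the address arithmetic; store_int_at is a hypothetical helper name, and the comments map each step to the MIPS instruction it mirrors.

```c
#include <stddef.h>

/* Stores value into the element at index, computing the element
   address explicitly instead of writing base[index]. */
void store_int_at(int *base, int index, int value)
{
    char     *base_addr = (char *)base;                           /* la  $9, array    */
    ptrdiff_t offset    = (ptrdiff_t)index * (ptrdiff_t)sizeof(int); /* mul $10, $8, 4   */
    int      *elem_addr = (int *)(base_addr + offset);            /* add $11, $10, $9 */

    *elem_addr = value;                                           /* sw  $12, ($11)   */
}
```

Calling store_int_at(array, 4, -23) on an array of 12 integers has the same effect as array[4] = -23. A C compiler generates this arithmetic automatically for the subscript expression.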
A regular access will be done within a structured loop. Once the address of the initial element is calculated, further array element addresses are calculated relative to the known one. Only the address changes. This reduces the amount of code necessary within the loop, which results in fewer instructions executed and (therefore) faster code.
Consider a code example that re-initializes each element of an array of 100 integers to the value 18. A less efficient implementation places the isolated element access code into a loop. For clarity, this implementation uses the same registers as the isolated element access example.
        .data
array:  .word 0:100         # array of 100 integers
        .text
        li   $8, 0          # $8 is the index, and loop induction variable
        li   $13, 100       # $13 is the sentinel value for the loop
for:    bge  $8, $13, end_for
        la   $9, array      # $9 is the base address of the array
        mul  $10, $8, 4     # $10 is the offset
        add  $11, $10, $9   # $11 is the address of desired element
        li   $12, 18        # $12 is the value 18, to be put in desired element
        sw   $12, ($11)
        add  $8, $8, 1      # increment loop induction variable
        b    for
end_for:
More efficient code to do the same thing increments only the address of the desired element within the loop. As many instructions as possible are removed from within the loop.
        .data
array:  .word 0:100         # array of 100 integers
        .text
        li   $8, 0          # $8 is the loop induction variable
        li   $13, 100       # $13 is the sentinel value for the loop
        la   $9, array      # $9 starts as the base address of the array
                            # and is the address of each element
        li   $12, 18        # $12 is the value 18, to be put in desired element
for:    bge  $8, $13, end_for
        sw   $12, ($9)
        add  $9, $9, 4      # get address of next array element
        add  $8, $8, 1      # increment loop induction variable
        b    for
end_for:
This code might be made even more efficient by eliminating the loop induction variable ($8), instead calculating the address of the last element and using it to decide when to exit the loop. Note that this eliminates a single instruction from within the body of the loop.
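A C sketch of that idea is given here, with the pointer p playing the role of $9. The function name fill_ints is hypothetical. As a small variation on the description above, this sketch compares against the address just past the last element rather than the address of the last element itself, which keeps the exit test a simple "less than".

```c
/* Sets array[0] through array[count-1] to value, using only an
   address as the loop variable (no separate index such as $8). */
void fill_ints(int *array, int count, int value)
{
    int *p   = array;          /* address of the current element ($9) */
    int *end = array + count;  /* address just past the last element  */

    while (p < end) {          /* exit test uses addresses only       */
        *p = value;            /* sw  $12, ($9)                       */
        p++;                   /* advances by sizeof(int) bytes,      */
                               /* like add $9, $9, 4                  */
    }
}
```

For the example above, the call would be fill_ints(array, 100, 18). The loop body now contains one store and one address increment, plus the exit test.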
There is little formal syntax (in assembly language) to declare or use a 2-dimensional array. Therefore, implementations vary. Here are some MIPS examples to suggest 2-dimensional array implementations.
Without a formalized syntax, a declaration of a 2-dimensional array reduces to the allocation of the correct amount of contiguous memory. The base address identifies the element in the first row and first column.
Consider the declaration of an example 2 by 3 array of characters. Each character requires one byte.
chars: .space 6 # 2 by 3 = 6 bytes of allocated space
This is not a satisfying declaration for the abstract thinker, as it might equally represent a 3 by 2 array, or a 1-dimensional array of 6 characters. The burden is on the programmer to declare the necessary memory space, and then use that space in a consistent manner.
An alternative in MIPS assembly language code allocates a set of arrays. For example, consider a 4 by 6 array of integers, where each element is initialized to the value 18.
arr: .word 18:6
.word 18:6
.word 18:6
.word 18:6
This declaration conveys the notion of an array of arrays.
Issues with code that operates on a 2-dimensional array are the same as those with 1-dimensional arrays, with the added point of storage order. A 2-dimensional array may be stored in either row major order or column major order.
If a 2-dimensional array is thought of as an array of 1-dimensional arrays, then operating on one row of a row major ordered array is fairly simple, since the elements of that row are contiguous in memory. Likewise, operating on one column of a column major ordered array is fairly simple.
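The two storage orders differ only in how a (row, column) pair is turned into a byte offset from the base address. The following C sketch shows both calculations; the function names are hypothetical, and rows and columns are numbered from 0.

```c
#include <stddef.h>

/* Byte offset of element (row, col) in an array with cols columns,
   stored row by row (row major order). */
size_t row_major_offset(size_t row, size_t col, size_t cols, size_t elem_size)
{
    return (row * cols + col) * elem_size;
}

/* Byte offset of element (row, col) in an array with rows rows,
   stored column by column (column major order). */
size_t col_major_offset(size_t row, size_t col, size_t rows, size_t elem_size)
{
    return (col * rows + row) * elem_size;
}
```

For the 4 by 6 array of integers declared above, the element in row 2, column 3 is at offset (2*6 + 3)*4 = 60 in row major order, and at offset (3*4 + 2)*4 = 56 in column major order. In row major order, neighboring elements of one row are 4 bytes apart, so a row can be walked by adding 4 to an address. In column major order, the same is true for the elements of one column.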
Copyright © Karen Miller, 2006