CS 536 Program 5: Code Generation
Due date: Wednesday, December 15 (by midnight)
Not accepted after midnight on Saturday, December 18
Overview
For this assignment you will write a code generator that generates
MIPS assembly code (suitable as input to the Spim interpreter) for
C-- programs represented as abstract-syntax trees.
Requirements
General Information
As in the third and fourth assignments, the code generator
will be implemented by writing codeGen member functions for
the various kinds of AST nodes.
Implementing code generation for arrays is required only for
students working in pairs;
it is optional for students working alone (if you do implement arrays,
we will ignore that aspect of your work for grading purposes; i.e.,
implementing arrays will neither give you extra credit nor cause you
to lose points for errors in your implementation).
In addition to implementing the code generator, you will also:
- Update the main program (P5.java) so that, if there are no errors
(including type errors), the code generator is called after the type
checker, writing code to the file named by the second command-line
argument.
- Write an input program (in a file called test.C) to test your code
generator.
Since you will need to test code generation for read
statements, you will also need to write a file named input
that contains the input expected by your test.C.
(Note that your main program should no longer call the unparser.)
Getting Started
Skeleton files on which you should build are in:
~cs536-1/public/prog5
You can use these if there were problems with your own versions.
The files are:
Some useful code-generation methods can be found in the file
Codegen.java.
Note that to use the methods and constants defined in that file
you will need to prefix the names with Codegen.;
for example, you would write:
Codegen.genPop(Codegen.T0)
rather than genPop(T0).
(Alternatively, you could put the declarations of the methods and
constants in your ASTnode class; then you would not need the
Codegen prefix.)
Also note that a PrintWriter p is declared as a static
public field in the Codegen class.
The code-generation methods in Codegen.java all write to PrintWriter
p, so you should use it when you open the output file
in your main program (in P5.java);
i.e., you should include:
Codegen.p = IO.openOutputFile(args[1]);
in your main program (or ASTnode.p if you put the declarations
in the ASTnode class).
You should also close that PrintWriter at the end of the program:
Codegen.p.close();
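The pattern described above (one static PrintWriter shared by all code-generation methods, opened in the main program and closed at the end) can be sketched as follows. This is a self-contained illustration, not the course's Codegen.java: the class MiniCodegen, its generate and demo methods, the tab-separated output format, and the use of a temporary file in place of IO.openOutputFile(args[1]) are all assumptions made so that the example runs on its own.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for Codegen.java (not the course file).
final class MiniCodegen {
    // One static writer shared by every code-generation helper.
    static PrintWriter p = null;

    // Writes one tab-indented instruction line to p (the format is an assumption).
    static void generate(String opcode, String args) {
        p.println("\t" + opcode + "\t" + args);
    }

    // Opens the shared writer on a temporary file, emits two instructions,
    // closes the writer, and returns the lines that reached the file.
    static List<String> demo() {
        try {
            Path out = Files.createTempFile("p5demo", ".s");
            p = new PrintWriter(Files.newBufferedWriter(out));
            generate("li", "$t0, 42");
            generate("move", "$a0, $t0");
            p.close(); // flushes buffered output; skipping this can leave the file empty
            List<String> lines = Files.readAllLines(out);
            Files.delete(out);
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Because the writer is buffered, forgetting the close() call is a likely cause of an empty or truncated output file.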
Spim
Documentation on Spim is available on-line:
You can run "plain"
spim by typing: spim -file xxx
where xxx is the name of the file produced by your compiler.
Or you can use the X-windows version:
- Type: xspim (this will open a new window).
- Click on the "load" button.
This will open a small window in which you should type the name of
the file produced by your compiler;
finish with a carriage-return, or click on the
"assembly file" button in the small window.
- Click on the "run" button. This will open another small window.
Click on the "ok" button in that window.
Syntax and runtime errors will be reported at the bottom of the large
window opened when you first typed xspim.
If your C-- program produces output, yet another window will be opened,
and the output will be written to that window.
Here is a link to an example C-- program and
the corresponding MIPS code.
Changes to Old Code
Required changes:
- Add a check to the name analyzer or type checker (your choice) for
whether the program contains a function named main.
If there is no such function, print the error message:
"No main function".
Use 0,0 as the line and character numbers.
- Add a new "offset" field to the Sym class (or to the appropriate
subclass(es) of Sym).
Change the name analyzer to compute offsets for each function's
parameters and local variables (i.e., where in the function's
Activation Record they will be stored at runtime) and to fill
in the new offset field.
Note that each scalar variable requires 4 bytes of storage, and
each array variable requires 4 * size bytes
(where size is the size of the array).
You may find it helpful to verify that you have made this
change correctly by modifying your unparser to print each local
variable's offset.
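The size rule above (4 bytes per scalar, 4 * size bytes per array) can be turned into offsets with a running total, as in the following sketch. It is only an illustration: the Decl class, the method names, and the choice to number offsets upward from 0 in declaration order (parameters first, then locals) are assumptions. How these byte counts map to addresses relative to the frame pointer depends on the activation-record layout you choose.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OffsetDemo {
    static final int WORD = 4; // bytes per scalar, as stated in the assignment

    // Hypothetical declaration record; arraySize == 0 marks a scalar.
    static final class Decl {
        final String name;
        final int arraySize;
        Decl(String name, int arraySize) {
            this.name = name;
            this.arraySize = arraySize;
        }
        int bytes() {
            return arraySize == 0 ? WORD : WORD * arraySize;
        }
    }

    // Gives each variable the number of bytes allocated before it (assumed layout).
    static Map<String, Integer> assignOffsets(List<Decl> decls) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int next = 0;
        for (Decl d : decls) {
            offsets.put(d.name, next);
            next += d.bytes();
        }
        return offsets;
    }

    // Total storage needed for all the given declarations.
    static int totalSize(List<Decl> decls) {
        int total = 0;
        for (Decl d : decls) {
            total += d.bytes();
        }
        return total;
    }

    // Sample function: parameters a, b; locals x, arr[10], y.
    static List<Decl> sample() {
        List<Decl> decls = new ArrayList<>();
        decls.add(new Decl("a", 0));
        decls.add(new Decl("b", 0));
        decls.add(new Decl("x", 0));
        decls.add(new Decl("arr", 10));
        decls.add(new Decl("y", 0));
        return decls;
    }

    public static void main(String[] args) {
        System.out.println(assignOffsets(sample()));
        System.out.println("total bytes: " + totalSize(sample()));
    }
}
```

For the sample declarations, arr starts at byte 12 and occupies 40 bytes, so y lands at byte 52 and the total is 56 bytes.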
Suggested changes:
- Modify the name analyzer to compute and save the total size of the
local variables declared in each function (e.g., in a new field of the
function name's symbol-table entry).
This will be useful when you do code generation for function
entry (to set the SP correctly).
- Either write a method to compute the total size of the formal
parameters declared in a function, or modify the name analyzer to
compute and store that value (in the function name's symbol-table
entry). This will also be useful for code generation for
function entry.
- Change the definition of class WriteStmtNode to include a (private)
field to hold the type of the expression being written, and change your
typecheck method for the WriteStmtNode to fill in that field.
This will be useful for code generation for the write statement
(since you will need to generate different code depending on
the type of the expression being output).
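To illustrate why the write statement needs the expression's type, here is a minimal sketch that picks the Spim system call based on a stored type. Spim's print_int service is number 1 and print_string is number 4, each taking its argument in $a0. Everything else is an assumption for illustration: the ExpType enum, the genWrite method, and the premise that the value (or the string's address) has already been placed in $a0.

```java
import java.util.Arrays;
import java.util.List;

final class WriteDemo {
    // Hypothetical type tags; your type checker's representation may differ.
    enum ExpType { INT, BOOL, STRING }

    // Returns the instructions that print the value assumed to be in $a0.
    static List<String> genWrite(ExpType type) {
        int service;
        switch (type) {
            case STRING:
                service = 4; // print_string: $a0 holds the string's address
                break;
            default:
                service = 1; // print_int: also used for booleans (printed as 1 or 0)
                break;
        }
        return Arrays.asList("li $v0, " + service, "syscall");
    }

    public static void main(String[] args) {
        for (ExpType t : ExpType.values()) {
            System.out.println(t + ": " + genWrite(t));
        }
    }
}
```

Booleans share the integer path because they are output as 1 and 0.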
Non-Obvious Semantic Issues
- All parameters should be passed by value.
- The logical and and or operators (&& and ||) are short-circuited,
just as they are in Java. That means that their right operands are
only evaluated if necessary (for all of the other binary operators,
both operands are always evaluated). If the left operand of "&&"
evaluates to false, then the right operand is not evaluated
(and the value of the whole expression is false);
similarly, if the left operand of "||" evaluates to true,
then the right operand is not evaluated (and the value of the whole
expression is true).
- In C-- (as in C++ and Java), two string literals are considered
equal if they contain the same sequence of characters.
So for example, the first two of the following expressions should
evaluate to false, and the last two should evaluate to
true:
"a" == "abc"
"a" == "A"
"a" == "a"
"abc" == "abc"
- Boolean values should be output as 1 for true and 0
for false (and that is probably how you should represent
them internally as well).
- Boolean values should also be input using 1 for true and 0
for false.
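The short-circuit rule for && and || described above can be sketched as a branch placed between the code for the two operands. This is one possible shape, not a required design: the class and method names, the label scheme, and the convention that each operand's code leaves its value (1 or 0) in $t0 are assumptions. A stack-based scheme using the Codegen.java helpers would look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ShortCircuitDemo {
    private static int labelCount = 0;

    // Hypothetical label generator.
    static String nextLabel() {
        return "L" + (labelCount++);
    }

    // left and right are instruction lists that each leave 1 or 0 in $t0.
    static List<String> genAnd(List<String> left, List<String> right) {
        String done = nextLabel();
        List<String> code = new ArrayList<>(left);
        code.add("beq $t0, $zero, " + done); // left is false: skip right, result stays 0
        code.addAll(right);                  // left is true: result is right's value
        code.add(done + ":");
        return code;
    }

    static List<String> genOr(List<String> left, List<String> right) {
        String done = nextLabel();
        List<String> code = new ArrayList<>(left);
        code.add("bne $t0, $zero, " + done); // left is true: skip right, result stays 1
        code.addAll(right);                  // left is false: result is right's value
        code.add(done + ":");
        return code;
    }

    public static void main(String[] args) {
        List<String> loadFalse = Arrays.asList("li $t0, 0");
        List<String> loadTrue = Arrays.asList("li $t0, 1");
        for (String line : genAnd(loadFalse, loadTrue)) {
            System.out.println(line);
        }
    }
}
```

The key property is that the right operand's instructions sit entirely between the branch and its target label, so they are skipped whenever the left operand already decides the result.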
Arrays
Implementing code generation for arrays is required only for
students working in pairs.
Students working alone may implement them or not (but there will
be no extra credit if they are implemented).
Note that arrays are indexed starting at zero (just as they are
in Java).
Remember that for global array variables, the entire array must
be stored in the static-data area, and
for local array variables, the entire array must be
stored in the Activation Record.
Also note that since array expressions (like A[j*k]) can occur
on the left-hand side of an assignment, you will need to write a
genStore method for ArrayExpNodes as well as a codeGen
method.
When generating code for an array access (like A[k]), you
should not worry about whether the index is in range or not.
Do no compile-time checks, and do not generate any run-time checks.
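As a rough illustration of why an array expression needs both a codeGen method and a genStore method, the sketch below emits the shared address computation once and then either loads from or stores to that address. It makes several assumptions that are not part of the assignment: the array is local, element i lives at address FP + baseOffset + 4*i, the index value is already in $t0, and the value to store is in $t1. The class and method names are hypothetical, and no range check is generated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ArrayAccessDemo {
    // Leaves the element's address in $t0, given the index in $t0 (assumed layout).
    static List<String> genElementAddress(int baseOffset) {
        return Arrays.asList(
            "sll $t0, $t0, 2",               // index * 4: each element is 4 bytes
            "addu $t0, $t0, $fp",            // add the frame pointer
            "addiu $t0, $t0, " + baseOffset  // add the array's offset in the record
        );
    }

    // Right-hand-side use: replace the address in $t0 with the element's value.
    static List<String> genLoad(int baseOffset) {
        List<String> code = new ArrayList<>(genElementAddress(baseOffset));
        code.add("lw $t0, 0($t0)");
        return code;
    }

    // Left-hand-side use: store the value assumed to be in $t1 into the element.
    static List<String> genStore(int baseOffset) {
        List<String> code = new ArrayList<>(genElementAddress(baseOffset));
        code.add("sw $t1, 0($t0)");
        return code;
    }

    public static void main(String[] args) {
        System.out.println(genLoad(-16));
        System.out.println(genStore(-16));
    }
}
```

For a global array, the same idea applies with the address of the array's label in the static-data area in place of FP + baseOffset.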
If you are working alone and do not implement arrays, you do not need
to add any special checks to ensure that no arrays are declared or
used in the program.
Just write your code-generation methods assuming that there will be
no arrays;
it doesn't matter what your program does if it is used on a program with
arrays, as long as it works correctly on all C-- programs that do not
include arrays.
Suggestions
- Start by making the changes to your old code described
above.
Modify the unparse method for IdNode to print
the offset so that you can make sure you are computing
offsets correctly.
- Next, implement code generation for global variable declarations,
function entry, and function exit.
Write a test program that just declares some global variables
and a main function (that does nothing).
Make sure you can generate code and execute the
code for that test program.
- Next, implement code generation for statements and expressions
(one kind at a time).
Start with the write statement and test your code generator
for every new statement and expression by writing out the result
of the newly implemented operation.
- Develop your test program (test.C) as you develop
your code generator. Each time you implement a new feature,
add that feature to test.C.
- Implement arrays last. Make sure everything else works
before even thinking about arrays.
Announcements
Includes: Additions, Revisions, and FAQs
(Frequently Asked Questions).
Please check here frequently.
11/29/2004: Program released.
Handin
What to turn in
See the assignments page for information about how to submit your code.
The late policy is also found on the assignments page.
Electronically submit all of the files that are needed to
create and run P5.class (including your Makefile) as well as your test program
(test.C) and your input file (input).
Do not copy any ".class" files, and do not create any subdirectories
in your handin directory.
If you are working with a partner, only one of you should hand in files.
Include a comment at the top of P5.java with the names of both
partners.
General information on program grading criteria can be found on the Grading Criteria
for Programs page.