Programming Assignment 4 (P4)
CS536-S24 Intro to PLs and Compilers

In this page: Due date | Overview | Specifications | Handing in | Grading criteria

Due Tuesday, April 9 at 11:59 pm


Overview

For this assignment you will write a name analyzer for base programs represented as abstract-syntax trees. Your main task will be to write name analysis methods for the nodes of the AST. In addition you will need to:

  1. Modify the Sym class (by including some new fields and methods and/or by defining some subclasses).
  2. Modify the IdNode class in ast.java (by including a new Sym field and by modifying its unparse method).
  3. Modify P4.java (an extension of P3.java).
  4. Modify the ErrMsg class.
  5. Write two test inputs: nameErrors.base and test.base to test your new code.

Specifications

Getting Started

The files are:

Here is p4.zip file. It is recommended to start your project with it.

Name Analysis

The name analyzer will perform the following tasks:

  1. Build symbol tables. You will use the "list of hashtables" approach (using the SymTable class from program 1).

  2. Find multiply declared names, uses of undeclared names, bad tuple accesses, and bad declarations. Like C, the base language allows the same name to be declared in non-overlapping or nested scopes. The formal parameters of a function are considered to be in the same scope as the function body. All names must be declared before they are used. A bad tuple access happens when either the left-hand side of the colon-access is not a name already declared to be of a tuple type or the right-hand side of the colon-access is not the name of a field for the appropriate type of tuple. A bad declaration is a declaration of anything other than a function to be of type void as well as the declaration of a variable to be of a bad tuple type (the name of the tuple type doesn't exist or is not a tuple type).

  3. Add IdNode links: For each IdNode in the abstract-syntax tree that represents a use of a name (not a declaration) add a "link" to the corresponding symbol-table entry. (As stated above, you will need to modify the IdNode class in ast.java to have a new field of type Sym. That is the field that your name analyzer will fill in with a link to the Sym returned by the symbol table's globalLookup method.)

You must implement your name analyzer by writing appropriate methods for the different subclasses of ASTnode. Exactly what methods you write is up to you (as long as they do name analysis as specified).

It may help to start by writing the name analysis method for ProgramNode, then work "top down", adding a method for DeclListNode (the child of a ProgramNode), then for each kind of DeclNode (except TupleDeclNode), and so on (and then handle TupleDeclNode and perhaps other tuple related nodes at the end). Be sure to think about which nodes' methods need to add a new hashtable to the symbol table (i.e., when is a new scope being entered) and which methods need to remove a hashtable from the symbol table (i.e., when is a scope being exited).

Some of the methods will process the declarations in the program (checking for bad declarations and checking whether the names are multiply declared, and if not, adding appropriate symbol-table entries) and some will process the statements in the program (checking that every name used in a statement has been declared and adding links). Note that you should not add a link for an IdNode that represents a use of an undeclared name.

tuple Handling Issues

Name analysis issues surrounding tuples come up in several situations:

Error Reporting

Your name analyzer should find all of the errors described in the table given below; it should report the specified position of the error, and it should give exactly the specified error message (each message should appear on a single line, rather than how it is formatted in the following table). Error messages should have the same format as in the scanner and parser (i.e., they should be issued using a call to ErrMsg.fatal).

If a declaration is both "bad" (e.g., a non-function declared void) and is a declaration of a name that has already been declared in the same scope, you should give two error messages (first the "bad" declaration error, then the "multiply declared" error).

Type of Error Error Message Position to Report
More than one declaration of an identifier in a given scope (note: includes identifier associated with a tuple definition) Multiply-declared identifier The first character of the ID in the duplicate declaration
Use of an undeclared identifier Undeclared identifier The first character of the undeclared identifier
Bad tuple access (LHS of colon-access is not of a tuple type) Colon-access of non-tuple type The first character of the ID corresponding to the LHS of the colon-access.
Bad tuple access (RHS of colon-access is not a field of the appropriate a tuple) Invalid tuple field name The first character of the ID corresponding to the RHS of the colon-access.
Bad declaration (variable or parameter of type void) Non-function declared void The first character of the ID in the bad declaration.
Bad declaration (attempt to declare variable of a bad tuple type) Invalid name of tuple type The first character of the ID corresponding to the tuple type in the bad declaration.

Note that the names themselves should not be printed as part of the error messages.

During name analysis, if a function name is multiply declared you should still process the formals and the body of the function; don't add a new entry to the current symbol table for the function, but do add a new hashtable to the front of the SymTable's list for the names declared in the body (i.e., the parameters and other local variables of the function).

If you find a bad variable declaration (a variable of type void or of a bad tuple type), give an error message and add nothing to the symbol table.

Summary

tuple

function

if/else/while

Other Tasks

Extending the Sym Class

It is up to you how you store information in each symbol-table entry (each Sym). To implement the changes to the unparser described below you will need to know each name's type. For function names, this includes the return type and the number of parameters and their types. You can modify the Sym class by adding some new fields (e.g., a kind field) and/or by declaring some subclasses (e.g., a subclass for functions that has extra fields for the return type and the list of parameter types). You will probably also want to add new methods that return the values of the new fields and it may be helpful to change the toString method so that you can print the contents of a Sym for debugging purposes.

Modifying the IdNode Class

Two changes to the IdNode class are needed:

  1. Adding a new field of type Sym (to link the node with the corresponding symbol-table entry), and

  2. Changing the unparse method so that every use of an ID has its type (in angle brackets, i.e., < >) after its name. (The point of this is to help you to see whether your name analyzer is working correctly; i.e., does it correctly match each use of a name to the corresponding declaration and does it correctly set the link from the IdNode to the information in the symbol table.) For names of functions, the information should be of the form: param1Type, param2Type, ..., paramNType -> returnType. For names of global variables, parameters, and local variables of a non-tuple type, the information should be integer or logical. For a global or local variable that is of a tuple type, the information should be the name of the tuple type. For example, given a program that contains this code:

    tuple Point {
        integer x.
        integer y.
    }.
    integer f{integer x, logical b} [ ]
    void g{} [
        integer a.
        logical b.
        tuple Point p.
        p:x = a.
        b = a == 3.
        f(a + p:y*2, b).
        g().
    ]
    

    The unparser should print:

    tuple Point {
        integer x.
        integer y.
    }.
    integer f{integer x, logical b} [ 
    ]
    void g{} [
        integer a.
        logical b.
        tuple Point p.
        p<Point>:x<integer> = a<integer>.
        b<logical> = (a<integer> == 3).
        f<integer,logical->integer>((a<integer> + (p<Point>:y<integer> * 2)), b<logical>).
        g< ->void>().
    ]
    

P4.java

The main program, P4.java, will be similar to P3.java except that

Calling the name analyzer means calling the appropriate method of the ASTnode that is the root of the tree built by the parser.

Modifying the ErrMsg Class

Your compiler should quit after the name analyzer has finished if any errors have been detected so far (either by the scanner/parser or the name analyzer). To accomplish this, you can add a static boolean field to the ErrMsg class that is initialized to false and is set to true if the fatal method is ever called (warnings should not change the value of this field). Your main program can check the value of this field and only call the unparser if it is false.

Writing Test Inputs

You will need to write two input files to test your code:

  1. nameErrors.base should contain code with errors detected by the name analyzer. This means that it should include bad and multiply declared names for all of the different kinds of names, and in all of the different places that declarations can appear. It should also include uses of undeclared names in all kinds of statements and expressions as well as bad tuple accesses.

  2. test.base should contain code with no errors that exercises all of the name-analysis methods that you wrote for the different AST nodes. This means that it should include (good) declarations of all of the different kinds of names in all of the places that names can be declared and it should include (good) uses of names in all kinds of statements and expressions.

Note that your nameErrors.base should cause error messages to be output, so to know whether your name analyzer behaves correctly, you will need to know what output to expect.

As usual, you will be graded in part on how thoroughly your input files test your code.

Some Advice

Here are few words of advice about various issues that come up in the assignment:

Handing in

Please read the following handing in instructions carefully.

Turn in the following files to the appropriate assignment in Gradescope (note: these should be the only files changed/needed to run with the provided materials):

Please ensure that you do not turn in any sub-directories or put your Java files in any packages.

If you are working in a pair, make sure both partners are indicated when submitting to Gradescope.

Grading criteria

General information on program grading criteria can be found on the Assignments page.

For more advice on Java programming style, see these style and commenting standards (which are essentially identical to the standards used in CS200 / CS300 / CS400).


Last Updated: 3/22/2024     © 2024 Beck Hasti