CS 536: Program 1  
A simple interpreter of an even simpler language

Overview

This programming assignment serves multiple purposes.  It will introduce you to symbol tables, abstract syntax trees, and recursive traversals of tree data structures. It will also serve as a warm-up exercise in Java programming. (You are going to write a lot of Java code this semester.)  Most importantly, this assignment should provide the fun of being able to execute your own programs.  Well, if it does not provide enough fun, the assignment should at least remove your fear of complexity, by showing you that an interpreter is really a very simple program.

You will write an interpreter for a simple programming language (a subset of a subset of Java).  The language supports integer variables, assignments, basic arithmetic operations, if and while statements, and a print statement.  The language does not contain classes, declarations of variables, procedure calls, or even an input statement.  

The assignment does not involve a lot of coding. Excluding the test programs that you will have to write, the assignment can be accomplished in less than 300 lines of code, many of which are print statements and repeated patterns of code.  Don't let the above statement fool you, though.  Before you can code the 300 lines, you will have to do some thinking, so start early (today!).

Your interpreter will not be a full-fledge interpreter like a Unix shell, perl, or a Java VM:  Your interpreter will not read the program in the source-code format.  Instead, the program to be interpreted will be encoded as an abstract syntax tree (AST). The AST representation of the program will be constructed "manually" by means of allocating AST nodes (using the Java new statement) and linking them together into trees.  The tree construction will be thus "hard-coded" in your interpreter.   Note that, except for being constructed manually, the AST will look as if it was created by a fully functional front-end (i.e., scanner and parser) of a fully-functional interpreter.

Your interpreter will traverse the AST, evaluate expressions, insert assigned values into the symbol table, and execute the statements.

The high-level design of the interpreter has already been done for you. You will be given a skeleton of the interpreter and will have to fill in a few methods in classes that correspond to the various kinds of AST nodes.  The only file that you will have to write from scratch is the symbol table for the interpreter.

In addition to writing the interpreter, you will have to thoroughly test it, by writing AST programs that will exercise the various statements and expressions of the simple language. 

Because the AST programs are hard to read in their hard-coded form (see below), you will have to write routines that will print an AST in its corresponding source-code form.  These routines will essentially "decompile" the ASTs back to our simple subset-of-Java language.  It is recommended that you start the assignment by writing these routines, as they will make you familiar with the format of the AST and will also serve as a useful debugging facility.

You will be graded both on the correctness and readability of your code, and also on how thoroughly your input programs (i.e., the ASTs) test your interpreter. 

In this assignment, you are going to work alone.  The remaining programming assignments will be done in pairs.

The Symbol Table

For each symbol (i.e., each variable in the AST program), the symbol table will maintain its integer value. The symbol table is a hash table. It is indexed by the name of the variable (a String) and is built upon the Java Hashtable class. The symbol table should be in a file named SymbolTable.java. You will have to write the class SymbolTable from scratch.

The class SymbolTable will contain a definition of a (nested) class Sym.  Objects of this class will store the attributes of each symbol.  In our simple interpreter, only the current integer value of the program variable needs to be stored. You must implement the following (public) SymbolTable methods:

SymbolTable ( ) This is the constructor method; it should create the symbol table's Hashtable object.
Sym insert (String name)

This method will be called when the interpreter encounters the first assignment into a variable.

Creates a new instance of a Sym object, inserts it into the hash table, and returns a pointer to the Sym object. If the symbol has previosuly been inserted into the symbol table, a DuplicateException exception must be thrown.

Sym lookup (String name) This method will be called when the interpreter wants to read the current value of a variable, or check whether a variable has been already inserted into the symbol table.

Return the attributes for a symbol with name 'name'.  The symbol must already be in the symbol table. If the symbol is not in the symbol table, the lookup returns null.

The AST programs

The AST programs to be interpreted will be hardcoded in file 

The version of P1.java given to you contains two simple AST programs, stored in fields prog1 and prog2.  During debugging and testing, you will add additional programs, stored in prog3, prog4, ... . Here's an example of a hardcoded AST program (to understand the code, you should draw the AST graphically, as a tree):

    StmtListNode prog2; 

    Sequence loop_body = new Sequence(); 
    loop_body.addToEnd(new AssignStmtNode(new IdNode("a"), 
                                                          new MinusNode(new IdNode("a"), new IntLitNode(1))));
    loop_body.addToEnd(new PrintStmtNode(new IdNode("a"))); 

    Sequence main_stmts = new Sequence(); 
    main_stmts.addToEnd(new AssignStmtNode(new IdNode("a"), new IntLitNode(10))); 
    main_stmts.addToEnd(new WhileStmtNode(new IdNode("a"), 
                                                            new StmtListNode(loop_body))); 

    prog2 = new StmtListNode(main_stmts); 

The AST program above (stored in variable prog2) corresponds to a source program shown below. Note that the condition in while is not a Boolean.  Our simple language uses integer-typed conditions in both while's and if's.  These conditions are considered true when their value is non-zero, and false when the value is zero.

    a = 10;
    while( a ) {
      a = a-1;
      System.out.println(a);
    }

Decompiling

As part of your assignment, you will write methods that will "decompile" the AST into the source code shown above.  The decompilation will help you verify that the (ugly) AST programs that you will add to class P1 are correct programs and that they do what you expect them to.

The decompile routines should print a nicely formatted version of the program (this process is called decompiling the abstract-syntax tree). The output produced by calling decompile should look as follows:
  1. Each statement is printed on a single line (regardless of its length).
  2. The body of the while loop and the body of the if statement is indented by two spaces.  Doubly-nested bodies are indented by four spaces, triply nested bodies are indented by siz spaces, etc.
  3. The bodies are surrounded by curly braces, as in Java. To make grading easier, you should put open curly braces on the same line as the preceding code, and closing curly braces on a line with no other code. The first statement in the body of an if or while should be on the line following the open curly brace.
  4. Whitespace within a line is up to you (as long as it looks reasonable). 

See the example, below, of how the output should look.

The main program

The main program is also in file 

Besides defining the AST programs, class P1 prints (i.e., decompiles) the programs and interprets them.  Here is the important fragment from main():

    out.println("\n\nProgram 1:");
    prog1.decompile(out, 3);
    prog1.interp(); 

    out.println("\n\nProgram 2:");
    prog2.decompile(out, 3);
    prog2.interp();

In this fragment, the main program first prints the the source code that corresponds to the AST stored in prog1 (call to decompile), and then interprets the ASTs (call to interp).  The output of the above program is (the source code for prog1 is in P1.java):

    nova7 $ java P1   // run the interpreter



    Program 1:
       System.out.println(7);
    7


    Program 2:
       a = 10;
       while( a ) {
         a = (a-1);
         System.out.println(a);
       }
    9
    8
    7
    6
    5
    4
    3
    2
    1
    0

You will do the following changes to class P1: (1) you will add new AST programs (into prog3, etc); and (2) you will decompile and interpret these new programs. Your main() should interpret all your AST programs. 

The primary reason why you will create new AST programs is to carefully test your interpreter.   Your new AST programs should include comments that describe what is being tested by that AST program.  

Make sure that your programs test "boundary" cases (such as references to a variable that has not been assigned to yet). Note: To determine what should be tested is up to you!  This course should prepare you for real life.  In real life, the programmers are fully responsible their bugs, and so they need to think what the potential bugs might be and test for them.  You will have to do the same in this assignment.

Your P1.main() should not require any input from the user (including command-line arguments).

Note that we will test your code also with our own version of P1.java and our own AST programs.  Therefore, you must not change the AST interface (i.e., your decompile and interp methods must have the same arguments as shown in ast.java).

The interpreter itself

The interpreter is invoked by calling the method interp() on the root node of the AST program (see the code from P1.java shown above).  The interp() method will recursively invoke interp() on its children. After the children are interpreted (i.e., evaluated), their values can be used for evaluation of the parent node.  

Each node type (assignment node, addition node, etc) has its own interp() method, which knows how to evaluate the node.  The interpreter code is thus "distributed" across the various classes in file 

~cs536-1/public/PROG1/ast.java

which defines the various AST nodes. 

The use of the symbol table: As the interpreter (recursively) traverses the AST, it passes into children a pointer to the symbol table, which is used to maintain values that variables obtain in assignment nodes.  The symbol table is written to at AssingStmtNode nodes, and is read at IdExpNode nodes.

Statements are interpreted slightly differently than expressions: interpretation of statements (i.e., subclasses of class StmtNode) does not return any value; it only updates the symbol table (updates occur when assignment nodes were encountered during the interpretation). On the other hand, the interpretation of expressions (i.e., subclasses of class ExpNode) returns an integer, which is the value of the evaluated expression.

Error messages: Your interpreter will have to detect one kind of error -- when a variable that has not been assigned yet is used in an expression.  Your interpreter will print an error message but will not terminate the interpretation when this error is detected.  Instead, it will recover from this error by assuming that the value of the variable is 1.

You do not need to check for division by zero (you can let the Java VM terminate your interpreter with an exception).

Compiling your Interpreter

You will be provided a Makefile.  For this assignment, you should not have to modify the makefile.  Typing 'make' will recompile your program.

Files

    P1.java: the main file, and the definitions of the AST programs, (you will add more AST programs here).
    ast.java:  the AST nodes, the interpreter and decompiler (you will have to extend this file).
    SymbolTable.java: the symbol table, (you will have to write this file from scratch).
    DuplicateException.java: needed by the symbol table (you don't have to change anything).
    Sequence.java: the list data structure.  Needed by the AST programs.  (No change necessary).
    NoCurrentException.java: Needed by the Sequence class.  (No change needed).
    Makefile: a file that will let you compile the interpreter (you should not have to change the file).

Bonus

Implement print statements with multiple arguments.  Each argument should be able to be an arbitrary integer expression.

This bonus is worth extra 10%.  Note: to receive points for the bonus graded, you need to submit into the submission directory a file called README, in which you briefly describe what you have implemented and how you implemented it.  Keep the description brief (at most 3 - 6 lines).

Summary

You have to write: 

It is recommended that you write the files in the above order.

What to Hand In

To hand in your program, copy files SymbolTable.java, interpreter.java, P1.java, prog.java, and Makefile to using your actual CS login name in place of <your login name>.  Add any comments you need to make into file README that you should also copy into the submit directory. The directory is setup so that you cannot remove or modify the files once you copy them in.  So, be careful which file you copy.