CS 536 Homework 2

Due date: Tuesday, October 26 in class
Not accepted late


Question 1

Part 1

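Write a CFG for the language of regular expressions whose operands are letters and whose operators are the usual ones: "or" (|), "followed by" (written by placing one expression directly after another), "zero-or-more" (*), and "one-or-more" (+).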

A regular expression can also include parentheses for grouping.

Your CFG should be unambiguous, and should be defined so that parse trees reflect the following precedences and associativities of the operators: The "or" operator should have lowest precedence, the "followed by" operator should have the next higher precedence, and the "zero-or-more" and "one-or-more" operators should have the same, highest precedence. Both of the binary operators should be left associative.

Finally, your CFG should exclude expressions that have more than one star or plus in a row (with no intervening close parenthesis). For example, the following should not be in the language of your CFG:

a**
(a|b)++
c*+
d+*

But the following should:

(a*)*
(a|b+)+
(ab|(c|d*))+

When writing your CFG, use lower-case words for the nonterminals (e.g., exp, term, factor, base), and use upper-case words for the terminals (e.g., OR, STAR, LPAREN, LETTER).
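
The exclusion rule above (no two star or plus operators in a row) can be sanity-checked against the listed examples with a small script. This is only a sketch: the helper name has_repeated_postfix is hypothetical, and the check looks only at adjacent operators, so it says nothing about whether a string is otherwise a well-formed regular expression.

```python
import re

# Hypothetical helper, not part of the assignment: reports whether a
# regular-expression string has two postfix operators (* or +) in a row.
def has_repeated_postfix(regex: str) -> bool:
    compact = re.sub(r"\s+", "", regex)  # ignore whitespace between symbols
    return re.search(r"[*+][*+]", compact) is not None

examples = ["a**", "(a|b)++", "c*+", "d+*", "(a*)*", "(a|b+)+", "(ab|(c|d*))+"]
for example in examples:
    verdict = "excluded" if has_repeated_postfix(example) else "allowed"
    print(f"{example:15} {verdict}")
```

A close parenthesis between two operators breaks the adjacency, which is why (a*)* is allowed while a** is not.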

Part 2

Using your CFG, write a leftmost derivation and draw a parse tree for the expression:

a | b +

Question 2

Below is a (slightly modified) version of the C-- grammar for lists of declarations. This grammar is clearly not LL(1) because some productions involve immediate left recursion, and others are not left factored. Write a new grammar for the same language that corrects these two problems.

declList -> declList decl
         -> epsilon

decl -> varDecl
     -> fnDecl

varDecl -> type ID SEMICOLON
        -> type ID LSQBRACKET INTLITERAL RSQBRACKET

fnDecl -> type ID formals SEMICOLON

formals -> LPAREN RPAREN
        -> LPAREN formalsList RPAREN

formalsList -> type ID
            -> type AMPERSAND ID
            -> formalsList COMMA type ID
            -> formalsList COMMA type AMPERSAND ID

type -> INT
     -> BOOL
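
As a reminder of the standard transformation for immediate left recursion, productions of the form A -> A alpha | beta are replaced by A -> beta A' and A' -> alpha A' | epsilon. The sketch below applies that transformation to a toy grammar (items -> items COMMA ITEM | ITEM), which is an assumption made for illustration and is not the grammar in this question; the function name and the "Tail" suffix for the new nonterminal are likewise hypothetical. Left factoring is a separate step and is not shown.

```python
def remove_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    productions is a list of right-hand sides; each right-hand side is a
    list of symbols, and the empty list stands for epsilon.
    """
    # alpha parts: what follows the leading recursive occurrence of A
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # beta parts: right-hand sides that do not start with A
    others = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    tail = nonterminal + "Tail"  # hypothetical name for the new nonterminal A'
    return {
        nonterminal: [beta + [tail] for beta in others],
        tail: [alpha + [tail] for alpha in recursive] + [[]],
    }

# Toy grammar (an assumption): items -> items COMMA ITEM | ITEM
result = remove_immediate_left_recursion(
    "items", [["items", "COMMA", "ITEM"], ["ITEM"]]
)
for lhs, rhss in result.items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) if rhs else "epsilon")
```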

Question 3

Below is a modified version of the C-- grammar for a function body.

fnBody -> LCURLY varDeclList stmtList RCURLY

stmtList -> stmt stmtList

stmtList -> epsilon

stmt -> ID ASSIGN exp SEMICOLON

stmt -> IF LPAREN exp RPAREN LCURLY varDeclList stmtList RCURLY stmt'

stmt' -> ELSE LCURLY varDeclList stmtList RCURLY

stmt' -> epsilon

varDeclList -> varDecl varDeclList

varDeclList -> epsilon

varDecl -> type ID varDecl'

varDecl' -> SEMICOLON

varDecl' -> LSQBRACKET INTLIT RSQBRACKET SEMICOLON

type -> INT

type -> BOOL

Fill in the first table below with the First and Follow sets for all of this grammar's nonterminals except exp (just ignore that nonterminal). Fill in the second table with the First sets for all of the production right-hand sides. (Note: the two tables are provided via links so that you can print them out, fill them in, and hand them in.)

Click here for the first table, and here for the second table.
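
For reference, First sets are usually computed by repeating the First rules until no set changes. The sketch below does this for a toy grammar (pair -> opt B, opt -> A opt | epsilon), which is an assumption made for illustration and is not the grammar in this question; the function names and the string "epsilon" used as a marker are also hypothetical. Follow sets are not computed here.

```python
EPS = "epsilon"  # marker meaning "can derive the empty string"

def first_of_sequence(symbols, first, grammar):
    """First set of a sequence of symbols, given the current First sets."""
    result = set()
    for sym in symbols:
        if sym not in grammar:          # terminal: it starts the sequence
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # this nonterminal cannot vanish
            return result
    result.add(EPS)                     # every symbol can vanish (or no symbols)
    return result

def first_sets(grammar):
    """grammar maps each nonterminal to a list of right-hand sides."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate until a fixed point is reached
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                before = len(first[nt])
                first[nt] |= first_of_sequence(rhs, first, grammar)
                changed = changed or len(first[nt]) != before
    return first

# Toy grammar (an assumption): pair -> opt B ; opt -> A opt | epsilon
toy = {"pair": [["opt", "B"]], "opt": [["A", "opt"], []]}
for nt, symbols in first_sets(toy).items():
    print(nt, sorted(symbols))
```

The same first_of_sequence helper gives the First set of a production right-hand side once the nonterminal sets have stabilized.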


Question 4

Part 1

Write an unambiguous CFG for the language of assignment statements that assign from one identifier to another, and that allow "chaining". For example:

a = b;
a = b = c;
a = b = a = d;
a = b = a = d = ...

Note that your grammar should generate just single assignment statements (e.g., any one of the above lines), not lists of statements (not all four lines).

Use "id", "=", and ";" as the terminal symbols of your grammar, and use single upper-case letters for the nonterminals. Make assignment right associative; i.e., the parse tree for a chained statement such as a = b = c; should have a subtree that groups b = c.

Part 2

Using the CFG you wrote for part 1, define a syntax-directed translation rule for each grammar production, so that the translation of an assignment statement is the number of assignments in the statement. (Define "plain" rules using X.trans = ..., not Java Cup rules.) Here are some example translations:

statement          translation
a = b;             1
a = b = c;         2
a = b = c = d;     3
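
To illustrate what "plain" translation rules of the form X.trans = ... compute, here is a sketch for a different, toy grammar of nested parentheses, where the translation is the nesting depth. The grammar, its rules, and the function names parse_p and translate are all assumptions made for illustration; they are not the grammar or the rules asked for in this question.

```python
# Toy grammar and rules (assumptions, not the assignment grammar):
#   P -> ( P )      P1.trans = 1 + P2.trans
#   P -> epsilon    P.trans  = 0

def parse_p(tokens, pos=0):
    """Recursive-descent parse of P; returns (P.trans, next position)."""
    if pos < len(tokens) and tokens[pos] == "(":
        inner_trans, pos = parse_p(tokens, pos + 1)   # translation of P2
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return 1 + inner_trans, pos + 1               # P1.trans = 1 + P2.trans
    return 0, pos                                     # P.trans = 0

def translate(text):
    tokens = list(text.replace(" ", ""))
    trans, pos = parse_p(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected input after a complete P")
    return trans

for text in ["", "()", "((()))"]:
    print(repr(text), translate(text))
```

Each return statement corresponds to one production's rule, so the translation of the whole string is built bottom-up from the translations of its subtrees.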