Question 1 | Question 2 | Question 3
This question is about the language of regular expressions. One way to define that language is using a context-free grammar (CFG). To make things a little simpler, rather than allowing any characters as operands in regular expressions, we will restrict ourselves to only allowing letters (that way we don't have to worry about how to specifiy characters that are the same as operators or things like newlines). We will also allow ε (epsilon) in our regular expressions. The operators for our language of regular expressions are:
In a regular expression, * and + have the same, highest precedence, "followed by" has middle precedence, and | has the lowest precedence. All of the operators are left associative.
Write an unambiguous CFG for this language of regular expressions so that parse trees correctly reflect the precedences and associativities of the operators. Use lower-case names for nonterminals and use the following terminals:
LTR // any letter EPS // epsilon OR // | STAR // * PLUS // + LPAR // left paren RPAR // right paren
Draw a parse tree for the string:
a+b*|c*(d+f*)+|ε
Use LTR(a) in the parse tree to mean "the LTR token for the letter a" (and similarly for the other letters in the string).
This question is about the language of set expressions defined as follows:
A a comma-separated list of zero or more names enclosed in curly braces is a set expression.
In a set expression, parenthesis has the highest precedence; intersection and union have the same second-highest precedence; subtraction has the lowest precedence. Intersection, union, and subtraction are all left associative.
The following table are tokens for terminals:
NAME // one name in a set ∪ // union ∩ // intersection − // minus ( // left paren ) // right paren { // left curly brace } // right curly brace , // comma
For this question you are not required to write CFG for this language; some CFGs are offered below. For each CFG, do the following check:
Note that the terminals are LPAREN, RPAREN, MINUS, UNION, INTERSECT, LCURLY, RCURLY, COMMA, and NAME.
exp → exp MINUS term | term term → term UNION factor | term INTERSECT factor | LPAREN exp RPAREN | factor factor → LCURLY list RCURLY list → NAME | list COMMA NAME
exp → LCURLY RCURLY | LCURLY list RCURLY | term term → term MINUS factor | factor factor → factor UNION set | factor INTERSECT set | set set → LPAREN set RPAREN | exp list → epsilon | NAME | list COMMA NAME
exp → exp MINUS term | term term → term UNION factor | term INTERSECT factor | LPAREN exp RPAREN | factor factor → LCURLY RCURLY | LCURLY list RCURLY list → NAME | list COMMA NAME
exp → exp MINUS term | term term → factor UNION factor | term INTERSECT factor | factor factor → LPAREN exp RPAREN | LCURLY RCURLY | LCURLY list RCURLY list → NAME | list COMMA NAME
exp → exp MINUS term | term term → term UNION factor | term INTERSECT factor | factor factor → LPAREN exp RPAREN | LCURLY RCURLY | LCURLY list RCURLY list → list COMMA NAME | NAME
For this question you will define a syntax-directed translation for the CFG given below, which defines a very simple programming language.
program → MAIN LPAREN RPAREN LCURLY list RCURLY list → list oneItem | epsilon oneItem → decl | stmt decl → BOOL ID SEMICOLON | INT ID SEMICOLON stmt → ID ASSIGN exp SEMICOLON | IF LPAREN exp RPAREN stmt | LCURLY list RCURLY exp → exp PLUS exp | exp LESS exp | exp EQUALS exp | ID | BOOLLITERAL | INTLITERAL
Write a syntax-directed translation for the CFG given above, so that the translation of an input program is the set of variables used somewhere in the program. Note: here the term use is in contrast to declaration; any appearance of a variable that is not a variable declaration counts as a use of that variable. For the example code in Part b, the translation should be { x, a, b }.
Your translation rules should use the following notation:
Note that you should not try to use something like "{ a, b }" to mean a set with two elements; instead, use set union to combine two sets that each contain one element.
Use the notation that was used in class and in the on-line readings; i.e., use nonterminal.trans to mean the translation of a nonterminal, and terminal.value to mean the value of a terminal. Assume that ID.value is a String (the name of the identifier). Use subscripts for translation rules that include the same nonterminal or the same terminal more than once.
Draw a parse tree for the program given below and annotate each nonterminal in the tree with its translation.
main ( ) { int x; bool y; x = a; int z; if (x == a) { int x; b = x < 18; } }
Last Updated: 2/13/2024 © 2024 CS 536