LL(1) Grammars and Predictive Parsers

LL(1) grammars are parsed by top-down parsers. They construct the derivation tree starting with the start nonterminal and working down. One kind of parser for LL(1) grammars is the predictive parser. The idea is as follows:

Here's how the predictive parser works:

Note also that there are rules for removing non-immediate left recursion; for example, you can read about how to do that in the compiler textbook by Aho, Sethi & Ullman, on page 177. However, we will not discuss that issue here.

Left Factoring

A second property that prevents a grammar from being LL(1) is a lack of left factoring, i.e., some nonterminal has two productions whose right-hand sides share a common prefix. For example, the following grammar is not left factored:

$Exp$ $\longrightarrow$ ( $Exp$ ) | (  )
In this example, the common prefix is "(".

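This problem is solved by left-factoring. Given a pair of productions of the form

$A$ $\longrightarrow$ $\alpha$ $\beta_1$ | $\alpha$ $\beta_2$

where $\alpha$ is the common prefix (a nonempty sequence of terminals and/or nonterminals) and $\beta_1$ and $\beta_2$ have no common prefix, replace them with:

$A$ $\longrightarrow$ $\alpha$ $A'$
$A'$ $\longrightarrow$ $\beta_1$ | $\beta_2$

where $A'$ is a new nonterminal.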

For example, consider the following productions: $Exp$ $\longrightarrow$ ( $Exp$ ) | (  )

Using the rule defined above, they are replaced by:

$Exp$ $\longrightarrow$ ( $Exp'$
$Exp'$ $\longrightarrow$ $Exp$ ) | )
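
The transformation in this example can be sketched in Python. This is a minimal illustration, not a complete algorithm: it performs one round of factoring for a single nonterminal, and the function names (`common_prefix`, `left_factor`) and the convention of naming the new nonterminal by appending an apostrophe are assumptions made for this sketch. Each right-hand side is a list of symbols, and an empty list stands for epsilon.

```python
def common_prefix(alternatives):
    """Return the longest prefix shared by every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(nonterminal, alternatives):
    """One round of left factoring for a single nonterminal.

    Alternatives that start with the same symbol are grouped; each group
    of two or more is replaced by 'prefix + new nonterminal', and the
    new nonterminal receives the leftover suffixes.
    """
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None  # None marks an epsilon alternative
        groups.setdefault(key, []).append(alt)

    result = {nonterminal: []}
    fresh = nonterminal + "'"  # hypothetical naming scheme for new nonterminals
    for key, group in groups.items():
        if key is None or len(group) == 1:
            result[nonterminal].extend(group)  # nothing to factor
            continue
        prefix = common_prefix(group)
        result[nonterminal].append(prefix + [fresh])
        result[fresh] = [alt[len(prefix):] for alt in group]
        fresh += "'"
    return result


if __name__ == "__main__":
    factored = left_factor("Exp", [["(", "Exp", ")"], ["(", ")"]])
    for lhs, alts in factored.items():
        print(lhs, "->", " | ".join(" ".join(a) or "epsilon" for a in alts))
```

The suffixes handed to the new nonterminal may themselves share a prefix, so a full implementation would apply the function again to each new nonterminal until nothing changes.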

Here's the more general algorithm for left factoring (when there may be more than two productions with a common prefix):

Not all entries have been filled in, but we can already see that this grammar is not LL(1), since Table[S, a] and Table[S, c] each contain two entries.

Here's how we filled in this much of the table:

  1. First, we considered the production S -> B c. FIRST(Bc) = { a, c }, so we put the production's right-hand side (B c) in Table[S, a] and in Table[S, c]. FIRST(Bc) does not include epsilon, so we're done with that production.
  2. Next, we considered the production S -> D B. FIRST(DB) = { d, a, c }, so we put the production's right-hand side (D B) in Table[S, d], Table[S, a], and Table[S, c]. FIRST(DB) does not include epsilon, so we're done with that production too.
  3. Next, we considered the production D -> epsilon. FIRST(epsilon) = { epsilon }, so we must look at FOLLOW(D). FOLLOW(D) = { a, c }, so we put the production's right-hand side (epsilon) in Table[D, a] and Table[D, c].
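
The three steps above follow one rule: for each production, put its right-hand side in the table under every terminal in FIRST of the right-hand side, and, if that FIRST set contains epsilon, also under every terminal in FOLLOW of the left-hand side. Here is a rough Python sketch of that rule. It takes the FIRST and FOLLOW sets as given inputs rather than computing them, and it covers only the three productions discussed in the steps; the names `fill_table` and `conflicts` are made up for this illustration.

```python
EPSILON = "epsilon"


def fill_table(productions, first_of_rhs, follow):
    """Build a parse table from precomputed FIRST and FOLLOW sets.

    productions:  list of (lhs, rhs) pairs, rhs being a tuple of symbols
                  (the empty tuple stands for epsilon)
    first_of_rhs: maps each (lhs, rhs) pair to FIRST(rhs)
    follow:       maps a nonterminal to its FOLLOW set
    Returns a dict mapping (nonterminal, terminal) to a list of
    right-hand sides.
    """
    table = {}
    for lhs, rhs in productions:
        first = first_of_rhs[(lhs, rhs)]
        lookaheads = first - {EPSILON}
        if EPSILON in first:
            # rhs can derive epsilon, so FOLLOW(lhs) also selects it
            lookaheads |= follow[lhs]
        for terminal in sorted(lookaheads):
            table.setdefault((lhs, terminal), []).append(rhs)
    return table


def conflicts(table):
    """Return the cells holding more than one right-hand side."""
    return {cell: rhss for cell, rhss in table.items() if len(rhss) > 1}


# Only the productions and sets stated in steps 1-3 above.
PRODUCTIONS = [("S", ("B", "c")), ("S", ("D", "B")), ("D", ())]
FIRST_OF_RHS = {
    ("S", ("B", "c")): {"a", "c"},
    ("S", ("D", "B")): {"d", "a", "c"},
    ("D", ()): {EPSILON},
}
FOLLOW = {"D": {"a", "c"}}

if __name__ == "__main__":
    table = fill_table(PRODUCTIONS, FIRST_OF_RHS, FOLLOW)
    for cell in sorted(conflicts(table)):
        print("conflict in Table[%s, %s]" % cell)
```

A grammar is LL(1) exactly when `conflicts` returns an empty dict for the complete table; here the cells for (S, a) and (S, c) each hold two right-hand sides.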


Finish filling in the parse table given above.


How to Code a Predictive Parser

Now, suppose we actually want to code a predictive parser for a grammar that is LL(1). The simplest idea is to use a table-driven parser with an explicit stack. Here's pseudo-code for a table-driven predictive parser:

   // initialize the stack: EOF marker on the bottom, start nonterminal on top
   Stack.push(EOF);
   Stack.push(start-nonterminal);
   currToken = scan();

   while (! Stack.empty()) {
     topOfStack = Stack.pop();
     if (isNonTerm(topOfStack)) {
       // top of stack symbol is a nonterminal
       p = table[topOfStack, currToken];
       if (p is empty) report syntax error and quit
       else {
         // p is a production's right-hand side
         push p, one symbol at a time, from right to left
       }
     }
     else {
       // top of stack symbol is a terminal
       if (topOfStack == currToken) currToken = scan();
       else report syntax error and quit
     }
   }
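
To make the pseudo-code concrete, here is one possible runnable version in Python. It is a sketch under some assumptions: the input is already a list of tokens, `"$"` serves as the end-of-input marker, the stack starts with that marker below the start nonterminal, and the parser simply returns True or False instead of reporting an error. The table is written by hand for the left-factored Exp grammar from the Left Factoring section; the names `predictive_parse`, `TABLE`, and `NONTERMINALS` are invented for this example.

```python
EOF = "$"  # assumed end-of-input marker


def predictive_parse(tokens, table, start, nonterminals):
    """Table-driven predictive parse; True if tokens are accepted."""
    stream = iter(list(tokens) + [EOF])
    curr = next(stream)
    stack = [EOF, start]  # EOF on the bottom, start nonterminal on top
    while stack:
        top = stack.pop()
        if top in nonterminals:
            rhs = table.get((top, curr))
            if rhs is None:
                return False  # empty table entry: syntax error
            # push the right-hand side from right to left, so that its
            # leftmost symbol ends up on top of the stack
            stack.extend(reversed(rhs))
        else:
            if top != curr:
                return False  # terminal mismatch: syntax error
            if top != EOF:
                curr = next(stream)  # matched a terminal: advance
    return True


# Parse table for:  Exp  -> ( Exp'
#                   Exp' -> Exp ) | )
NONTERMINALS = {"Exp", "Exp'"}
TABLE = {
    ("Exp", "("): ["(", "Exp'"],
    ("Exp'", "("): ["Exp", ")"],
    ("Exp'", ")"): [")"],
}

if __name__ == "__main__":
    for text in ["()", "(())", "(()", "())"]:
        ok = predictive_parse(list(text), TABLE, "Exp", NONTERMINALS)
        print(text, "accepted" if ok else "rejected")
```

Storing the table as a dict keyed by (nonterminal, token) means an empty entry is just a missing key, so the syntax-error check becomes a single `get` call.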


Here is a CFG (with rule numbers):

S -> epsilon               (1)
  |  X Y Z                 (2)

X -> epsilon               (3)
  |  X S                   (4)

Y -> epsilon               (5)
  |  a Y b                 (6)

Z -> c Z                   (7)
  |  d                     (8)

Question 1(a): Compute the First and Follow Sets for each nonterminal.

Question 1(b): Draw the Parse Table.