New River, New Water

Try Understanding SLR Parsing by Walking Through an Example

This blog is aiming to sort out some parsing concepts, especially, by walking through a simple CFG example.

Most concepts are from Stanford CS143 Compilers, and Compilers: Principles, Techniques, & Tools (A.K.A the Dragon Book). This blog is just reflecting and trying to deepen my own understanding of the topic, you should refer to the course and the book for stricter definitions.

CFG Example

$$ E \rightarrow E + E | id \tag{CFG-1}$$

LR(0) DFA

States of the DFA

Transition Diagram

transition diagram

Parsing Example

The table is produced by following example 4.43 of the Dragon Book.

Parsing $id + id$

Line Stack Symbols Input Action
1 0 $ id + id $ shift to I2
2 0 2 $ id + id $ reduce by E -> id
3 0 1 $ E + id $ shift to I3
4 0 1 3 $ E + id $ shift to I2
5 0 1 3 2 $ E + id $ reduce by E -> id
6 0 1 3 4 $ E + E $ reduce by E -> E + E
7 0 1 $ E $ accept

I feel the ordering of the columns is a little misleading, I actually finish the stack column at last of each step. Here are some explanations and considerations from me to better understand this table.

Parsing $id + id + id$

Line Stack Symbols Input Action
1 0 $ id + id + id $ shift to I2
2 0 2 $ id + id + id $ reduce by E -> id
3 0 1 $ E + id + id $ shift to I3
4 0 1 3 $ E + id + id $ shift to I2
5 0 1 3 2 $ E + id + id $ reduce by E -> id
6 0 1 3 4 $ E + E + id $ reduce or shift???

Here in line 6, we have a shift/reduce conflict, meaning we could not decide whether to reduce or to shift. In fact, we could parse the string either by shifting or reducing at the specific stage; but to avoid such problem, we should either specify left associative or right associative.

Some Concepts

Ambiguity

We say a grammar is ambiguous if a string of the language could be parsed into different parsing trees.

For example, $id + id + id$, which is a string belonging to the language defined by CFG-1, could be parsed in either $id + (id + id)$ or $(id + id) + id$.

This is because the Associativity is not specified here.

In some other cases, without specifying Precedence of operations can also lead to ambiguous grammar. For example, $id + id * id$ could be parsed as $(id + id) * id$ or $id + (id * id)$ by grammar $E \rightarrow E + E | E * E | id$.

Conflicts

There are mainly 2 conflicts mentioned in this context: shift/reduce conflict and reduce/reduce conflict.

A simple reduce-reduce conflict example

A no-brain example could be:

$$ E \rightarrow A | B $$ $$ A \rightarrow x $$ $$ B \rightarrow x $$ when we have an $x$ in the parsing stack, while we have 2 productions $A \rightarrow x$ and $B \rightarrow x$ producing $x$, the parser could not decide which one to apply.

A complicated reduce-reduce conflict example

Supposer the grammar is like: $$ S \rightarrow A(S)B | \epsilon $$ $$ A \rightarrow S | SB | x | \epsilon $$ $$ B \rightarrow SB | y $$

The initial SLR(1) parsing automaton state of the grammar contains following LR(0) items: $$ S’ \rightarrow .S $$ $$ S \rightarrow .A(S)B $$ $$ S \rightarrow . $$ $$ A \rightarrow .S $$ $$ A \rightarrow .SB $$ $$ A \rightarrow .x $$ $$ A \rightarrow . $$

This grammar has a reduce-reduce conflict on input (.

Why? Because ( is both in $FOLLOW(A)$ (line 1 of the grammar) and $FOLLOW(S)$ (line 2 of the grammar we know that $FOLLOW(A) \subseteq FOLLOW(S)$).

Intuitively, the fact tells us that ( is valid following both an A and an S. Notice that both A and S could be reduced to an $\epsilon$, that’s why here we’re facing a situation where there’s more than one valid reductions.

Decision of the Action

The root for the conflict is $I_4$, where we have 2 LR(0) items, which are: $$ E \rightarrow E + E. \tag{Item-1} $$ and $$
E \rightarrow E. + E \tag{Item-2} $$ .

For Item-1, we already reach the end, the choice for it is reduction. While for Item-2, if the next symbol is +, then it’s possible to do a shift. In this case, we say we have a shift/reduce conflict on +.


If you have any feedback to this article, feel free to comment here or send an email to me