Removing Ambiguity
One parse tree only!
The role of the grammar
distinguish between syntactically legal and illegal programs
But that’s not enough,it must also define a parse tree
the parse tree conveys the meaning of the program
What if a string can be parsed with multiple parse trees?
we say the grammar is ambiguous
must fix the grammar (the problem is not in the parser)
Note,often a string can be derived in more than one way
ie,with more than one derivation sequence
this does not mean the grammar is ambiguous
2
3
Ambiguity,Example
Grammar
E? E + E | E * E | ( E ) | int
Strings
int + int + int
int * int + int
4
Ambiguity,Example
This string has two parse treesE
E
E E
E+
int +
intint
E
E
E E
E+
int+
intint
+ is left-associative
5
Ambiguity,Example
This string has two parse treesE
E
E E
E*
int +
intint
E
E
E E
E+
int*
intint
* has higher precedence than +
6
Ambiguity (Cont.)
A grammar is ambiguous
if it has more than one parse tree for any
string
Ambiguity is bad
Leaves meaning of some programs ill-defined
Ambiguity is common in programming
languages
Arithmetic expressions
IF-THEN-ELSE
7
Dealing with Ambiguity
There are two ways to remove ambiguity:
1) Rewrite the grammar into a unambiguous grammar
E? E + T | T
T? T * int | int | ( E )
Enforces precedence(优先级) of * over +
Enforces left-associativity(左结合) of + and *
2) Declarations,instruct parser to pick desired parse trees
using %left,%right,%prec declarations
8
Ambiguity,Example
The int * int + int has ony one parse tree
now E
E
E E
E*
int +
intint
E
T
T int
T+
int
*
E
int
Rewriting the grammar,what’s
the trick?
Trick 1,Fixing precedence (* computed before +)
E? E + E | E * E | id
In the parse tree for id + id * id,we want id*id to
be subtree of E+E,How to accomplish that by
rewriting?
Create a new nonterminal (T)
make it derive id*id,…
ensure T’s trees are nested in E’s of E+E
New grammar:
9
Rewriting the grammar,what’s
the trick? (part 2)
Trick 2,Fixing associativity (+,*,associate to the
left)
E? E + E | T
T? T * T | id
In the parse tree for id + id + id,we want the left
id+id to be subtree of the right E+id,Same for
id*id*id.
Use left recursion
it will ensure that +,* associate to the left
New grammar (a simple change),10
11
Ambiguity,The Dangling Else
悬挂 ELSE问题
Consider the grammar
S? if E then S
| if E then S else S
| OTHER
This grammar is also ambiguous
12
The Dangling Else,Example
The expression
if E1 then if E2 then S3 else S4
has two parse trees if
E1 if
E2 S3 S4
if
E1 if
E2 S3
S4
Typically we want the second form
13
The Dangling Else,A Fix
Common rule,else matches the closest
unmatched then (最近匹配原则)
We can describe this in the grammar
Idea,
distinguish matched and unmatched then’s
force matched then’s into lower part of the tree
Rewritten if-then-else grammar
New grammar,Describes the same set of strings
forces all matched ifs (if-then-else) as low in the tree as possible
notice that MIF does not refer to UIF,
so all unmatched ifs (if-then) will be high in the tree
S? MIF /* all then are matched */
| UIF /* some then are unmatched */
MIF? if E then MIF else MIF
| OTHER
UIF? if E then S
| if E then MIF else UIF
14
15
The Dangling Else,Example
Revisited
The expression if E1 then if E2 then S3 else S4
if
E1 if
E2 S3 S4
if
E1 if
E2 S3
S4
Not valid because the
then expression is not a
MIF
A valid parse tree (for
a UIF)
16
Ambiguity
No general techniques for handling ambiguity
Impossible to convert automatically an
ambiguous grammar to an unambiguous one
Used with care,ambiguity can simplify the
grammar
Sometimes allows more natural definitions
We need disambiguation mechanisms
17
Precedence and Associativity
Declarations
Instead of rewriting the grammar
Use the more natural (ambiguous) grammar
Along with disambiguating declarations
Bottom-up parsers like CYK and Earley
allow precedence and associativity
declarations to disambiguate grammars
Examples …
18
Associativity Declarations
Consider the grammar E? E + E | int
Ambiguous,two parse trees of int + int + int
E
E
E E
E+
int +
intint
E
E
E E
E+
int+
intint
Left-associativity declaration,%left +
19
Precedence Declarations
Consider the grammar E? E + E | E * E | int
And the string int + int * int
E
E
E E
E+
int *
intint
E
E
E E
E*
int+
intint
Precedence declarations,%left +
%left *