Use of Tools for Parsing

(Details of Tools for Parsing)

10th lecture, June 16, 2017

Language Theory and Compilers

http://www.sw.it.aoyama.ac.jp/2017/Compiler/lecture10.html

Martin J. Dürst

AGU

© 2005-17 Martin J. Dürst, Aoyama Gakuin University

Today's Schedule

 

Last Week's Homework

Complete calc.y so that with test.in as an input, it produces test.check

Caution: How to deal with unary MINUS

 

Summary of Last Lecture

 

How to Express Priorities

 

How to Express Associativity

 

How to Express Repetition (Lists)

 

How to Express Other Constructs (e.g. if statement)

Write the construct down as it is, carefully distinguishing the alternatives and the terminal and non-terminal symbols

if_statement : IF OPENPAREN cond CLOSEPAREN statement
             | IF OPENPAREN cond CLOSEPAREN statement
               ELSE statement
;

 

Grammar Patterns: Repetition

One or more times:

items: items item
| item
;

Zero or more times:

items: items item
|
;

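Instead of "items item" (left recursion), "item items" (right recursion) is also possible, but then bison has to keep every item on its stack until the last one has been read, so long lists may overflow the stack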
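The stack concern can be made concrete with a small simulation. The Python sketch below is not bison itself: it hand-simulates the shift and reduce steps that a bottom-up parser would take for both versions of the "one or more" rule and records the deepest symbol stack. The function names are hypothetical and chosen only for this illustration.

```python
def max_stack_left_recursive(n):
    """items: items item | item   (reduce after every shifted item)"""
    stack, deepest = [], 0
    for _ in range(n):
        stack.append("item")              # shift the next item
        deepest = max(deepest, len(stack))
        if stack == ["item"]:
            stack[:] = ["items"]          # reduce  items: item
        else:                             # stack is ["items", "item"]
            stack[-2:] = ["items"]        # reduce  items: items item
    return deepest


def max_stack_right_recursive(n):
    """items: item items | item   (nothing can be reduced before the end)"""
    stack, deepest = [], 0
    for _ in range(n):
        stack.append("item")              # shift every item first
        deepest = max(deepest, len(stack))
    stack[-1:] = ["items"]                # reduce  items: item  (last item)
    while len(stack) > 1:
        stack[-2:] = ["items"]            # reduce  items: item items
    return deepest


for n in (1, 10, 1000):
    print(n, max_stack_left_recursive(n), max_stack_right_recursive(n))
```

With left recursion the stack never holds more than two symbols, while with right recursion its depth grows with the length of the list.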

 

Grammar Patterns: Associativity

Left associative:

big_exp: big_exp left_associative_operator small_exp
| small_exp
;

Right associative:

big_exp: small_exp right_associative_operator big_exp
| small_exp
;
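The effect of the two patterns on the result of a calculation can be sketched in Python. The two functions below are hypothetical helpers, not bison output: eval_left groups operands the way the left-recursive rule does, and eval_right groups them the way the right-recursive rule does.

```python
import operator


def eval_left(numbers, op):
    """big_exp: big_exp op small_exp | small_exp   (group from the left)"""
    result = numbers[0]
    for number in numbers[1:]:
        result = op(result, number)
    return result


def eval_right(numbers, op):
    """big_exp: small_exp op big_exp | small_exp   (group from the right)"""
    if len(numbers) == 1:
        return numbers[0]
    return op(numbers[0], eval_right(numbers[1:], op))


print(eval_left([10, 4, 3], operator.sub))    # (10 - 4) - 3
print(eval_right([10, 4, 3], operator.sub))   # 10 - (4 - 3)
print(eval_right([2, 3, 2], operator.pow))    # 2 ** (3 ** 2)
```

Subtraction is conventionally left associative and exponentiation right associative, so choosing the wrong pattern changes the computed value.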

 

Grammar Patterns: Priority

(Priority: small_exp binds tighter than middle_exp, which binds tighter than big_exp; all operators are assumed to be left associative)

big_exp: big_exp operator middle_exp
| middle_exp
;
middle_exp: middle_exp operator small_exp
| small_exp
;

 

Grammar Patterns: Parentheses

small_exp: open_paren big_exp close_paren
| literal
;
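The priority and parentheses patterns can be tried out together in a short Python sketch. It uses a hand-written recursive-descent evaluator, which is a different technique from the bottom-up parser that bison generates; the left-recursive rules are therefore written as loops. The choice of "+" and "-" for big_exp and "*" for middle_exp, as well as all function names, are assumptions made for this illustration.

```python
import re


def tokenize(text):
    # Integers, operators and parentheses; everything else is skipped.
    return re.findall(r"\d+|[-+*()]", text)


def parse_big(tokens, pos):
    # big_exp: big_exp ('+' | '-') middle_exp | middle_exp
    value, pos = parse_middle(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_middle(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos


def parse_middle(tokens, pos):
    # middle_exp: middle_exp '*' small_exp | small_exp
    value, pos = parse_small(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_small(tokens, pos + 1)
        value = value * right
    return value, pos


def parse_small(tokens, pos):
    # small_exp: '(' big_exp ')' | literal
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        value, pos = parse_big(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("missing closing parenthesis")
        return value, pos + 1
    if not tokens[pos].isdigit():
        raise SyntaxError("unexpected token " + tokens[pos])
    return int(tokens[pos]), pos + 1


def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_big(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected token " + tokens[pos])
    return value


print(evaluate("2 + 3 * 4"))
print(evaluate("(2 + 3) * 4"))
```

Because middle_exp is reached only through big_exp, multiplication is grouped before addition, and parentheses re-enter the grammar at the lowest priority level.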

 

Order of Derivation: Leftmost and Rightmost Derivation

With leftmost derivation, the leftmost nonterminal in the syntax tree is always expanded first

With rightmost derivation, the rightmost nonterminal in the syntax tree is always expanded first

Simple example grammar:

E → E '+' T | T
T → integer

Example of input: 5 + 7 + 3
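Both derivations of the example input can be generated with a short Python sketch. It first builds the syntax tree for the example grammar and then repeatedly expands either the leftmost or the rightmost nonterminal. The tree representation (tuples for nonterminals, strings for terminals) and the function names are assumptions made for this illustration.

```python
def build_tree(integers):
    """Syntax tree for  E -> E '+' T | T  and  T -> integer."""
    tree = ("E", [("T", [integers[0]])])
    for number in integers[1:]:
        tree = ("E", [tree, "+", ("T", [number])])
    return tree


def show(form):
    # Nonterminals are (name, children) tuples, terminals are strings.
    return " ".join(n[0] if isinstance(n, tuple) else n for n in form)


def derivation(tree, leftmost=True):
    form = [tree]
    steps = [show(form)]
    while True:
        positions = [i for i, n in enumerate(form) if isinstance(n, tuple)]
        if not positions:
            return steps
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = form[i][1]        # replace nonterminal by its children
        steps.append(show(form))


tree = build_tree(["5", "7", "3"])
print(" => ".join(derivation(tree, leftmost=True)))
print(" => ".join(derivation(tree, leftmost=False)))
```

Both derivations describe the same syntax tree; only the order of the expansion steps differs. Reading the rightmost derivation in reverse order gives the sequence of reductions that a bottom-up parser performs on this input.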

 

Derivation Choices

Different choices may:

 

Kinds of Analysis Methods

The labels are also used for grammars:
"This grammar is LL(1)" means: this grammar can be used with an LL(1) parser

 

Understanding bison: The .output File

bison -v creates a file with extension .output, containing the following interesting details:

 

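Understanding bison: Debugging

#define YYDEBUG 1 compiles the debugging (trace) code into the parser; setting the global variable yydebug to a nonzero value then switches on the trace output at run time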

The output shows how bison works:

 

Conflicts and Ambiguous Grammars

 

Another Example of Ambiguity

The grammar for if-else is a famous example of ambiguity:

if (...) if (...) ...; else ...;

can be parsed in two ways:

if (...) {
    if (...)
        ...;
    else
        ...;
}

or

if (...) {
    if (...)
        ...;
}
else
    ...;

This creates a shift-reduce conflict.

The first way of parsing is the correct one (for C). It is also the one chosen by bison, because bison resolves a shift-reduce conflict by selecting shift.
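The effect of preferring shift can be imitated in a few lines of Python. The sketch below is a hand-written recursive-descent parser, not the parser that bison generates: after the statement of an if, it consumes a following else immediately whenever one is present, which corresponds to choosing shift over reduce. Conditions are left out, and the token names "s1" and "s2" as well as the function name are assumptions made for this illustration.

```python
def parse_statement(tokens, pos=0):
    """statement: 'if' statement | 'if' statement 'else' statement | simple"""
    if tokens[pos] == "if":
        then_part, pos = parse_statement(tokens, pos + 1)
        # An 'else' that follows is taken immediately (like a shift),
        # so it is attached to the nearest 'if'.
        if pos < len(tokens) and tokens[pos] == "else":
            else_part, pos = parse_statement(tokens, pos + 1)
            return ("if", then_part, else_part), pos
        return ("if", then_part, None), pos
    return tokens[pos], pos + 1           # a simple statement


tree, _ = parse_statement(["if", "if", "s1", "else", "s2"])
print(tree)
```

In the resulting tree the else part belongs to the inner if and the outer if has no else part, which matches the first of the two parses.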

 

Grammar of bison Rewriting Rules

rewritingRule → nonterminalSymbol ":" rightHandList ";"
rightHandList → rightHand | rightHand "|" rightHandList
rightHand → symbolList "{" CFragment "}"
symbolList → symbol | symbol symbolList
symbol → nonterminalSymbol | terminalSymbol

 

How to Combine flex and bison

 

Advantages and Problems of Bottom-Up Parsing

 

Homework: Simple Programming Language

Deadline: June 29, 2017 (Thursday in two weeks), 19:00

Prepare questions so that you can ask them in next week's lecture!

Where to submit: Box in front of room O-529 (building O, 5th floor)

Format: Submit the files language.lex and language.y, A4 using BOTH sides (↓↓, not ↓↑), stapled in upper left if more than one page, NO cover page, NO wrapping lines, legible font size, non-proportional font, portrait (not landscape), formatted (indents,...) for easy visibility, name (kanji and kana) and student number as a comment at the top right

Expand the simple calculator of calc.y to a small programming language.

 

Hints for Homework

 

Glossary

unary (operator): 単項 (演算子)
leftmost derivation: 最左導出
rightmost derivation: 最右導出
reverse order: 逆順
lookahead: 先読み
non-proportional font: 等幅のフォント