PRINCIPLES OF COMPILER
DESIGN
UNIT-I Lexical
Analysis
- What is a compiler? [AUT Nov. / Dec. 2011] [AU Nov. / Dec. 2009]
A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language). An important role of the compiler is to report any errors in the source program that it detects during the translation process.
- What is an interpreter?
An interpreter is
another kind of language processor. Instead of producing a target program as a
translation, an interpreter appears to directly execute the operations
specified in the source program on inputs supplied by the user.
- What is a hybrid compiler? Give an example.
A language processor that combines compilation and interpretation is called a hybrid compiler. E.g., the Java language processors.
A Java source program may first be compiled into an intermediate form called bytecodes. The bytecodes are then interpreted by a virtual machine. A benefit of this arrangement is that bytecodes compiled on one machine can be interpreted on another machine. To achieve faster processing, just-in-time compilers translate the bytecodes into machine language immediately before they run the intermediate program to process the input.
- What are the two parts of a compilation? Explain briefly.
Analysis
and Synthesis are the two parts of compilation.
The analysis part is the front end of the compiler. It breaks up the source program into constituent pieces and creates an intermediate representation of the source program. It collects information about the source program and stores it in a data structure called a symbol table.
The
synthesis part is the back end of
the compiler. It constructs the desired target program from the intermediate representation
and the information in the symbol table.
- List the phases of the compiler
- Lexical Analyzer
- Syntactic Analyzer
- Semantic Analyzer
- Intermediate Code Generation
- Machine-Independent Code Optimization
- Code Generation
- Machine-Dependent Code Optimization
- List the phases that constitute the front end of a compiler.
The front end consists of those phases, or parts of phases, that depend primarily on the source language and are largely independent of the target machine. These include:
- Lexical and syntactic analysis
- The creation of the symbol table
- Semantic analysis
- Generation of intermediate code
A certain amount of code optimization can be done by the front end as well. It also includes the error handling that goes along with each of these phases.
- Mention the back-end phases of a compiler.
The back end of the compiler includes those portions that depend on the target machine and generally do not depend on the source language, just the intermediate language. These include:
- Code optimization
- Code generation, along with error handling and symbol-table operations
- List the subparts or phases of analysis part.
Analysis consists of three phases:
- Linear analysis
- Hierarchical analysis
- Semantic analysis
- What are the classifications of a compiler?
Compilers are classified as:
- Single-pass
- Multi-pass
- Load-and-go
- Debugging or optimizing
- What is a symbol table? [AU Nov/Dec 2007]
A symbol table is a data structure
containing a record for each identifier, with fields for the attributes of the
identifier. The data structure allows us to find the record for each identifier
quickly and to store or retrieve data from that record quickly. Whenever an
identifier is detected by a lexical analyzer, it is entered into the symbol
table. The attributes of an identifier cannot be determined by the lexical analyzer.
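The record-per-identifier idea can be sketched with a dictionary keyed by lexeme. This is only an illustration: the attribute names used here (`type`) and the method names are assumptions, not a fixed interface.

```python
# A minimal symbol-table sketch: one record per identifier, keyed by its lexeme.
class SymbolTable:
    def __init__(self):
        self._table = {}

    def insert(self, lexeme, **attributes):
        # Called by the lexical analyzer when an identifier is first seen;
        # later phases fill in attributes such as the type.
        record = self._table.setdefault(lexeme, {"lexeme": lexeme})
        record.update(attributes)
        return record

    def lookup(self, lexeme):
        return self._table.get(lexeme)

table = SymbolTable()
table.insert("newval")                 # entered by the lexer, type still unknown
table.insert("newval", type="int")     # type added later by the semantic analyzer
```

This mirrors the note above: the lexer only enters the name; attributes like the type are filled in by later phases.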
- What is lexical analysis?
Linear analysis is one in which the stream of characters making up the source program is read from left to right and grouped into sequences of characters having a collective meaning, called lexemes. For each lexeme, the lexical analyzer produces a token of the form <token-name, attribute-value>. Lexical analysis is also called linear analysis or scanning.
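A minimal sketch of this grouping, assuming a toy token set (NUM, ID, ASSIGN, OP) rather than any particular language's:

```python
import re

# A toy scanner: groups characters into lexemes and emits
# <token-name, attribute-value> pairs.
TOKEN_SPEC = [
    ("NUM",    r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+\-*/]"),
    ("SKIP",   r"\s+"),          # whitespace: discarded, not tokenized
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

For example, `tokenize("newval := oldval + 12")` yields the token stream for the assignment example used later in these notes.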
- What is syntax analysis?
It is the second phase of the compiler, also called parsing. The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation, i.e., a grammatical structure of the token stream. A typical representation is a syntax tree in which each interior node represents an operation and the children of the node represent the arguments of the operation.
- What is the use of Syntax directed translation engines?
Syntax-directed translation engines produce collections of routines that walk the parse tree and generate the intermediate code.
- What is the function of the semantic analyzer? [AU Nov / Dec 2011]
A semantic analyzer checks the source program for semantic errors and collects the type information for code generation. Type-checking is an important part of the semantic analyzer.
Ex: newval := oldval + 12
The type of the identifier newval must match the type of the expression (oldval + 12).
- What are the functions of Intermediate code generator?
A compiler may produce explicit intermediate code representing the source program. This intermediate code is generally machine (architecture) independent, but its level is close to the level of machine code.
Ex:
newval := oldval * fact + 1
id1 := id2 * id3 + 1
MULT id2,id3,temp1      Intermediate code (quadruples)
ADD temp1,#1,temp2
MOV temp2,id1
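The quadruple generation above can be sketched as follows. The operation names (MULT, ADD, MOV) follow the example; the `new_temp`/`emit` helper names are assumptions of this sketch, not part of the notes.

```python
# Sketch of three-address (quadruple) generation for newval := oldval * fact + 1,
# assuming the expression tree has already been built by earlier phases.
temp_count = 0
quads = []

def new_temp():
    # Generate a fresh compiler temporary: temp1, temp2, ...
    global temp_count
    temp_count += 1
    return f"temp{temp_count}"

def emit(op, arg1, arg2, result):
    quads.append((op, arg1, arg2, result))

t1 = new_temp()
emit("MULT", "id2", "id3", t1)   # oldval * fact
t2 = new_temp()
emit("ADD", t1, "#1", t2)        # temp1 + 1
emit("MOV", t2, None, "id1")     # newval := temp2
```

Each quadruple holds at most one operator and at most three addresses, which is what makes the form easy to reorder and optimize.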
- What is the function of the code generator?
Produces the target language for a specific architecture. The target program is normally a relocatable object file containing the machine code.
Ex: (assume an architecture with instructions in which at least one operand is a machine register)
MOVE id2,R1
MULT id3,R1
ADD #1,R1
MOVE R1,id1
- What is the function of the code optimizer?
The code optimizer optimizes the code produced by the intermediate code generator in terms of time and space.
Ex:
MULT id2,id3,temp1
ADD temp1,#1,id1
- Mention some of the cousins of a compiler. [AU Apr/May 2004, 2005][AU Nov/Dec 2013]
Cousins
of the compiler are:
§ Preprocessors
§ Two Pass Assembly
§ Assemblers
§ Loaders and Link-Editors
- What is the function of pretty printers?
A Pretty printer
analyzes a program and prints it in such a way that the structure of the
program is clearly visible.
- How the Grouping of Phases into Passes is organized?
In an
implementation, activities from several phases may be grouped together into a
pass that reads an input file and writes an output file. (Ex.) The front-end
phases of lexical,syntax,semantic and intermediate code generation might be
grouped together into one pass. Code optimization might be optional pass. Then
the back-end pass consisting of code generation for a particular target
machine.
- What are the various compiler construction Tools available? [AU Nov/Dec 2004]
1. Parser generators
2. Scanner generators
3. Syntax – directed translation engine
4. Code-generator generators
5. Data Flow analysis engine
6. Compiler-construction toolkits
- Define compiler-compiler. [AUT Nov/Dec 2010]
Systems to help with the compiler-writing process have often been referred to as compiler-compilers, compiler-generators, or translator-writing systems. Largely they are oriented around a particular model of languages, and they are most suitable for generating compilers of languages with a similar model.
- What are the functions of preprocessor? [AU Nov/Dec 2007] [AU Nov/Dec 2004] [AUT Nov / Dec 2011]
- Macro processing
- File inclusion
- “Rational” preprocessors
- Language extensions
- Define preprocessor. [AU May/Jun 2007]
Preprocessors produce input to compilers. They perform the following functions:
- Macro processing
- File inclusion
- “Rational” preprocessors
- Language extensions
- Differentiate analysis and synthesis. [AUT Nov/Dec 2010]
Analysis:
It is one Part of
compilation.
The analysis part
breaks up the source program into constituent pieces and creates an intermediate
representation of the source program.
It includes Linear
Analysis, Syntax Analysis and Semantic Analysis.
Synthesis:
It is another part
of compilation.
The
synthesis part constructs the desired target program from the intermediate representation.
It
includes intermediate code generation, code generation and code optimization.
- Represent the following expression into lexical analysis phase and semantic analysis phase: Average = (mark1 + mark2) * 1.5 / 2. [AUT Nov/Dec 2010]
- Compare compiler and interpreter. [AU Nov. / Dec. 2012] [AU May / June 2013]
A compiler is a program that translates the code of a programming language into machine code, also called object code. The object code can be executed directly on the machine where it was compiled.
An interpreter reads the statements of a program, analyzes them, and then executes them on the virtual machine by calling the corresponding instructions of the library.
- What is meant by a cross – compiler? [AU Nov. / Dec. 2012] [AU May / June 2013]
A cross compiler is a compiler capable of creating executable
code for a platform other than the one on which the
compiler is running. Cross compiler tools
are used to generate executables for embedded
system or multiple platforms.
- What is the role of lexical analyzer?[ AU Nov/Dec 2013]
The main task is to read the input characters of the source program, group
them into lexemes, and produce as output a sequence of tokens for each lexeme
in the source program. The stream of tokens is sent to the parser for syntax
analysis. It is common for lexical analyzer to interact with the symbol table
as well.
- Differentiate tokens, patterns, lexeme.[AU Nov / Dec 2010] [AU May / June 2013]
- Tokens: sequences of characters that have a collective meaning.
- Patterns: there is a set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
- Lexemes: a sequence of characters in the source program that is matched by the pattern for a token. Ex. In const a = 4.98; here a is a lexeme for the token “identifier”.
- List the operations on languages.
- Union: L ∪ M = {s | s is in L or s is in M}
- Concatenation: LM = {st | s is in L and t is in M}
- Kleene closure: L* (zero or more concatenations of L)
- Positive closure: L+ (one or more concatenations of L)
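These operations can be illustrated on small finite sets (with the Kleene closure necessarily truncated, since L* is infinite). The sample alphabets here are arbitrary assumptions.

```python
# Finite illustrations of the language operations on small sample sets.
L = {"a", "b"}
M = {"0", "1"}

union = L | M                               # L ∪ M
concat = {s + t for s in L for t in M}      # LM

def closure(lang, max_concats):
    # Kleene closure truncated to at most max_concats concatenations,
    # since L* itself is an infinite set.
    result = {""}          # zero concatenations: the empty string
    current = {""}
    for _ in range(max_concats):
        current = {s + t for s in current for t in lang}
        result |= current
    return result
```

For example, `closure(L, 2)` contains "", "a", "b", "aa", "ab", "ba", "bb"; the positive closure is the same set minus the empty string.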
- Write a regular expression for an identifier.
An identifier is
defined as a letter followed by zero or more letters or digits. The regular
expression for an identifier is given as
letter
(letter | digit)*
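As a quick check, the pattern letter (letter | digit)* translates directly into a Python regular expression (underscore deliberately excluded, to match the definition above exactly):

```python
import re

# letter (letter | digit)* : a letter followed by zero or more letters or digits.
identifier = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")
```

A string such as "count1" matches, while "1count" does not, because the first character must be a letter.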
- Mention the various notational shorthands for representing regular expressions.
i. One or more instances (+)
ii. Zero or one instance (?)
iii. Character classes ([abc], where a, b, c are alphabet symbols, denotes the regular expression a | b | c)
iv. Non-regular sets
- What is the function of a hierarchical analysis?
Hierarchical
analysis is one in which the tokens are grouped hierarchically into nested collections with collective meaning.
Also termed as Parsing.
- What does a semantic analysis do? [AU Nov / Dec 2011]
Semantic
analysis is one in which certain checks are performed to ensure that components of a program fit together
meaningfully.
Mainly
performs type checking.
- List the various error recovery strategies for a lexical analysis. [AU Apr/May 2005]
Possible error recovery actions are:
- Panic mode recovery
- Deleting an extraneous character
- Inserting a missing character
- Replacing an incorrect character by a correct character
- Transposing two adjacent characters
- What are the components of Context Free Grammar?
- A finite set of terminals (in our case, this will be the set of tokens)
- A finite set of non-terminals (syntactic variables)
- A finite set of production rules of the form
A → α
where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
- A start symbol (one of the non-terminal symbols)
Example:
E → E + E | E – E | E * E | E / E | – E
E → ( E )
E → id
- What is the function of the syntactic analyzer?
- A Syntax Analyzer creates the syntactic structure (generally a parse tree) of the given program.
- A syntax analyzer is also called as a parser.
- A parse tree describes a syntactic structure.
A parse tree pictorially shows how the start symbol of a grammar derives a string in the language.
Ex. A → XYZ
- Define input buffering. [AUT Nov/Dec 2010]
It stores the input and is divided into two N-character halves, where N is the number of characters on one disk block, e.g., 1024 or 4096. It is used to reduce the amount of overhead required to process an input character.
- Write the possible error recovery techniques.
1. Deleting an
extraneous character
2. Inserting a
missing character
3. Replacing an
incorrect character by a correct character
4. Transposing two
adjacent character
- What is a Sentinel? [AU Nov / Dec 2010]
The sentinel is a special character that cannot be part of the source program, e.g., eof. It is appended to the input so that a single test identifies the end of the source file.
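A sketch of the idea, using an assumed "\0" sentinel so the inner loop needs only one test per character instead of a separate end-of-buffer bounds check:

```python
# Sentinel-based scanning sketch: appending an eof sentinel lets the inner
# loop test one character instead of also checking the buffer bound.
EOF = "\0"   # assumed sentinel; it cannot occur inside the source program

def count_letters(source):
    buf = source + EOF
    i, count = 0, 0
    while True:
        ch = buf[i]
        if ch == EOF:        # a single test handles end-of-input
            break
        if ch.isalpha():
            count += 1
        i += 1
    return count
```

In a real two-buffer scheme the sentinel ends each buffer half and triggers a reload; here it simply marks end-of-input.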
- How will you specify the patterns?
Regular
expressions are used to describe tokens. Regular
expressions are an important notation for specifying patterns .Each pattern
matches a set of strings, so regular expressions will serve as names for set of
strings.
- What is a Language? What are the operations that can be applied on it?
Language
denotes any set of strings over some fixed alphabet. A language denoted by a regular expression is said to
be a regular set. The possible operations on languages are Union,
Concatenation, Kleene Closure, Positive Closure.
- What is a regular definition?
If Σ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form
d1 → r1
d2 → r2
d3 → r3
…
dn → rn
where each di is a distinct name, and each ri is a regular expression over the symbols in Σ ∪ {d1, d2, d3, …, di-1}.
- Which constructs of a program should be recognized by the lexical analyzer, and which ones by the syntax analyzer?
1. Both of them do similar things, but the lexical analyzer deals with simple non-recursive constructs of the language.
2. The syntax analyzer
deals with recursive constructs of the language.
3. The lexical
analyzer simplifies the job of the syntax analyzer.
4. The lexical
analyzer recognizes the smallest meaningful units (tokens) in a source program.
5. The syntax analyzer
works on the smallest meaningful units (tokens) in a source program to
recognize meaningful structures in our programming language.
- What are the issues in Lexical Analysis? [AU May/Jun 2007, 2009, 2013]
- The separation of lexical analysis from syntax analysis often allows us to simplify one or the other of these phases.
- Compiler efficiency is improved
- Compiler portability is enhanced
- What is a DFA?
A DFA represents a finite state machine that recognizes a regular expression. For example, a DFA can be drawn that recognizes (abc+)+. (The transition diagram originally shown here is omitted.)
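Since the original transition diagram is not shown, here is one possible DFA for (abc+)+, written as a transition table and simulated directly; the state numbering is an assumption of this sketch.

```python
# A DFA for (abc+)+: state 0 is the start, state 3 is accepting.
TRANSITIONS = {
    (0, "a"): 1,     # first a of a block
    (1, "b"): 2,
    (2, "c"): 3,     # "abc" seen: accept
    (3, "c"): 3,     # further c's: abcc, abccc, ...
    (3, "a"): 1,     # start the next abc+ block
}
ACCEPTING = {3}

def accepts(s):
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:          # missing entry: dead state, reject
            return False
    return state in ACCEPTING
```

So "abc", "abcc" and "abcabccc" are accepted, while "ab" and "abca" are rejected.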
- Differentiate between NFA and DFA. [AU Nov / Dec 2012]
NFA: its transition function “move” maps state–symbol pairs to sets of states. It may have ε-transitions, and for a state S and input symbol ‘a’ there may be more than one edge labeled ‘a’ leaving S.
DFA: it has no ε-transitions, and for each state S and input symbol ‘a’ there is at most one edge labeled ‘a’ leaving S.
- Write a regular expression for the set of all strings of 0’s and 1’s with an
even number
of 0’s and an odd number of 1’s. [AU Nov / Dec 2012]
(00 | 11 | (01 | 10)(00 | 11)*(01 | 10))* (1 | 01(00 | 11)*0) (0(00 | 11)*0)*
(Note: the commonly quoted answer (1*01*01*)*(1*10*10*)* is incorrect; it fails even on the string 1.)
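One way to gain confidence in such an expression is to cross-check it exhaustively against a direct parity test. The pattern below is one correct (if unwieldy) regular expression for even 0's and odd 1's, derived by state elimination from the four-state parity DFA; it is an illustration, not the only possible answer.

```python
import re
from itertools import product

# One regular expression for {strings with an even number of 0's and
# an odd number of 1's}, obtained by eliminating states from the
# four-state parity DFA.
PATTERN = re.compile(
    r"^(00|11|(01|10)(00|11)*(01|10))*"   # cycles at the start state
    r"(1|01(00|11)*0)"                    # first arrival at the accept state
    r"(0(00|11)*0)*$"                     # cycles at the accept state
)

def in_language(s):
    # Direct definition: even count of 0's, odd count of 1's.
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 1

# Exhaustive cross-check over all binary strings of length <= 8.
for n in range(9):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        assert bool(PATTERN.match(s)) == in_language(s), s
```

The exhaustive loop is the point: short counterexamples (such as the string "1" for the incorrect published answer) surface immediately.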
- What is LEX? [AU May / June 2013]
Lex is a computer
program that generates lexical
analyzers ("scanners" or "lexers").
Lex reads an input stream specifying the lexical analyzer and
outputs source
code implementing the lexer in the C programming language.
- Write the difference between regular grammar and context free grammar. [AU May / June 2013]
In formal language theory, a context-free
grammar (CFG) is a formal
grammar in which every production rule is of the form
V -> w
where V
is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w
can be empty). A formal grammar is considered "context free" when its
production rules can be applied regardless of the context of a nonterminal. It
does not matter which symbols the nonterminal is surrounded by, the single
nonterminal on the left hand side can always be replaced by the right hand
side. The languages generated by context-free grammars are
known as the context-free languages.
A regular grammar is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms:
- B → a, where B is in N and a is in Σ
- B → aC, where B and C are in N and a is in Σ
- B → ε, where B is in N
- Write regular expression for language to recognize ‘identifier’. [AU May / June 2013]
The regular definition for recognizing identifiers is given by
letter → a | b | ... | z | A | B | ... | Z
digit → 0 | 1 | ... | 9
identifier → letter (letter | digit)*
In the C programming language, where an underscore may also begin or continue an identifier, this corresponds to the pattern [_a-zA-Z][_a-zA-Z0-9]*.
16 Mark Questions
1. Discuss in
detail the various phases of a compiler for translating a source statement to
object code.(or) What are the Six phases of the
compilers? Explain with an example.
2. What are
compiler construction tools? Describe about their features.(or) Why do we need
compiler construction tools? Discuss in detail about some compiler construction
tools.
3. Explain the
needs of grouping of phases of compiler.
4. Write in detail about the cousins of the compiler.
5. Draw the DFA for
the regular expression (a/b)*abb and find minimized DFA.
6. (a)Write in
detail about tool for generating a Lexical Analyser (b) Give a lexical analyser
specification for a model computer.
7. Depict the
language processing system and describe about the cousins of the compiler.
8. Discuss in detail
the efficiency issues concerned with the buffering of input during lexical
analysis. (or) Discuss about the input buffering schemes in lexical analysis.
9. Construct a
nondeterministic finite automata using Thompson’s construction for the regular
expression (a/b)*abb (a/b)*
10. Write the role
of lexical analyzer. (or) Explain in detail about the role of Lexical analyzer
with the possible error recovery actions.
11. What is the
need for lexical analysis? Discuss the issues in lexical analysis.
12. Discuss about
the errors encountered during the lexical analysis phase.
13. Discuss about
the implementation of lexical analyzer.
14. Write the
significance of symbol table during the lexical analysis phase.
15. Construct a
minimum state DFA for the regular expression (a/b)*abb(a/b).
16. Prove that the
given two regular expressions are equivalent by showing that their minimum
state DFA’s are the same (a*/b*) and ((ε/a) b*)*.
UNIT II Syntax
Analysis and Run-time Environments
- Define parser or State the role of parser. [AUT Nov/Dec 2010, Nov/Dec 2013]
Hierarchical
analysis is one in which the tokens are grouped hierarchically into nested
collections with collective meaning. Also termed as Parsing.
Or
It gets the string
of tokens from the lexical analyzer and verifies that the string can be
generated by the grammar for the source language.
- Mention the basic issues in parsing.
There
are two important issues in parsing.
a. Specification of
syntax
b. Representation of
input after parsing.
- Why lexical and syntax analyzers are separated out?
Reasons
for separating the analysis phase into lexical and syntax analyzers:
- Simpler design.
- Compiler efficiency is improved.
- Compiler portability is enhanced.
- Define a context free grammar. [AU Nov / Dec 2009, Nov/Dec 2013]
A context free grammar G is a collection of the following:
1. V is a set of non-terminals
2. T is a set of terminals
3. S is a start symbol
4. P is a set of production rules
5. G can be represented as G = (V, T, S, P)
6. Production rules are given in the following form:
Non-terminal → (V ∪ T)*
- Briefly explain the concept of derivation.
Derivation from S means generation of a string w from S. For constructing a derivation two things are important:
i. the choice of non-terminal from several others, and
ii. the choice of rule from the production rules for the corresponding non-terminal.
Instead of choosing an arbitrary non-terminal, one can choose:
- leftmost derivation: replace the leftmost non-terminal in a sentential form, or
- rightmost derivation: replace the rightmost non-terminal in a sentential form.
- Define ambiguous grammar. [AU Nov/Dec 2007]
A grammar G is said to be ambiguous if it generates more than one parse tree for some sentence of the language L(G), i.e., there is more than one leftmost (or more than one rightmost) derivation for the same sentence.
- What is an operator precedence parser?
A grammar is said to be an operator precedence grammar if it possesses the following properties:
- No production on the right side is ε.
- There should not be any production rule possessing two adjacent non terminals at the right hand side.
- List the properties of LR parser.
a. LR parsers can be
constructed to recognize most of the programming languages for which the
context free grammar can be written.
b. The class of
grammar that can be parsed by LR parser is a superset of class of grammars that
can be parsed using predictive parsers.
c. LR parsers work using a non-backtracking shift-reduce technique, yet it is an efficient one.
- Mention the types of LR parser.
Ø SLR parser- simple LR parser
Ø LALR parser- lookahead LR parser
Ø Canonical LR parser
- What are the problems with top down parsing? [AU May/Jun 2009]
The following are the problems associated
with top down parsing:
1. Backtracking
2. Left recursion
3. Left factoring
4. Ambiguity
- Write the algorithm for FIRST and FOLLOW.
FIRST
1. If X is a terminal, then FIRST(X) is {X}.
2. If X → ε is a
production, then add ε to FIRST(X).
3. If X is non
terminal and X → Y1,Y2..Yk is a production, then place a in FIRST(X) if for
some i , a is in FIRST(Yi) , and ε is in all of FIRST(Y1),…FIRST(Yi-1);
FOLLOW
- Place $ in FOLLOW(S), where S is the start symbol and $ is the input right endmarker.
- If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in FOLLOW(B).
- If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).
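The FIRST rules above can be sketched as a fixed-point computation. The grammar encoding used here ({non-terminal: list of alternatives, each a list of symbols}) is an assumption of this sketch; the grammar itself is the expression grammar used in the FIRST/FOLLOW examples later in these notes.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    # FIRST of a string X1 X2 ... Xn, per the rules above.
    result = set()
    for sym in symbols:
        if sym == EPS:
            result.add(EPS)
            return result
        if sym not in first:             # terminal symbol
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)                      # every Xi can derive ε
    return result

def compute_first(grammar):
    # Iterate until no FIRST set changes (a fixed point).
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                f = first_of_string(alt, first)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
```

The results agree with the worked FIRST example for this grammar: FIRST(E) = {(, id}, FIRST(E′) = {+, ε}, FIRST(T′) = {*, ε}.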
- List the advantages and disadvantages of operator precedence parsing.[AU May/Jun 2007]
Advantages
- This type of parsing is simple to implement.
Disadvantages
- An operator like minus has two different precedences (unary and binary); hence it is hard to handle tokens like the minus sign.
- This kind of parsing is applicable to only a small class of grammars.
- What is dangling else problem?
The dangling-else problem arises in the grammar
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other
which is ambiguous: in a string such as if E1 then if E2 then S1 else S2, it is not clear which if the else belongs to. The ambiguity is usually resolved by matching each else with the closest unmatched then.
- Write short notes on YACC.
YACC is an automatic tool for generating the parser program. YACC stands for Yet Another Compiler-Compiler and is a utility available on UNIX. YACC is basically an LALR parser generator. It can report conflicts or ambiguities in the form of error messages.
- What is meant by handle pruning?
A
rightmost derivation in reverse can be obtained by handle pruning.
If w is a sentence of the grammar at hand,
then w = γn, where γn is the nth right-sentential form of some as yet unknown
rightmost derivation
S = γ0 ⇒ γ1 ⇒ … ⇒ γn-1 ⇒ γn = w
- What is meant by viable prefixes?
The set of prefixes of right sentential
forms that can appear on the stack of a shift-reduce parser are called viable
prefixes. An equivalent definition of a viable prefix is that it is a prefix of
a right sentential form that does not continue past the right end of the
rightmost handle of that sentential form.
- Define handle. [AU Nov /Dec 2012]
A
handle of a string is a substring that matches the right side of a production,
and whose reduction to the
nonterminal on the left side of the production represents one step along the reverse of a rightmost
derivation. A handle of a right – sentential form γ is a production A->β and
a position of γ where the string β may be found and replaced by A to produce
the previous right-sentential form in a rightmost derivation of γ. That is, if S ⇒* αAw ⇒ αβw (rightmost), then A → β in the position following α is a handle of αβw.
- What are kernel & non-kernel items?
Kernel items, which include the initial item S′ → ·S, and all items whose dots are not at the left end.
Non-kernel items, which have their dots at the left end.
- What is phrase level error recovery?
Phrase
level error recovery is implemented by filling in the blank entries in the
predictive parsing table with pointers to error routines. These routines may change, insert, or delete symbols on the
input and issue appropriate error messages.
They may also pop from the stack.
- What is meant by derivation?
A sequence of replacements of non-terminal symbols is called a derivation (e.g., a derivation of id+id from E).
In general, a derivation step is
αAβ ⇒ αγβ if there is a production rule A → γ in our grammar,
where α and β are arbitrary strings of terminal and non-terminal symbols.
α1 ⇒ α2 ⇒ ... ⇒ αn (α1 derives αn, or αn is derived from α1)
⇒ : derives in one step
⇒* : derives in zero or more steps
⇒+ : derives in one or more steps
- State the difference between sentential and sentence of a context free grammar G.
If S ⇒* α and α contains non-terminals, α is called a sentential form of G.
If α does not contain non-terminals, it is called a sentence of G.
- What is L(G)?
- L(G) is the language of G (the language generated by G), which is a set of sentences.
- A sentence of L(G) is a string of terminal symbols of G.
- If S is the start symbol of G, then w is a sentence of L(G) iff S ⇒+ w, where w is a string of terminals of G.
- If G is a context-free grammar, L(G) is a context-free language.
- What is the difference between left most and right most derivation?
- If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
- If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)
- When the Grammar is called as an ambiguous grammar?
A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar.
- What is a Left recursive grammar?
A grammar is left recursive if it has a non-terminal A such that there is a derivation A ⇒+ Aα for some string α. The left-recursion may appear in a single step of the derivation (immediate left-recursion), or may appear in more than one step of the derivation.
- How will you eliminate the Left Recursion in Grammars?
A → Aα | β
where β does not start with A
⇓ eliminate immediate left recursion
A → βA′
A′ → αA′ | ε (an equivalent grammar)
In general,
A → Aα1 | ... | Aαm | β1 | ... | βn, where β1 ... βn do not start with A
⇓ eliminate immediate left recursion
A → β1A′ | ... | βnA′
A′ → α1A′ | ... | αmA′ | ε (an equivalent grammar)
- Give an Example for an Immediate Left Recursion and how it will be eliminated?
Example 1:
E → E+T | T
T → T*F | F
F → id | (E)
After eliminating immediate left recursion:
E → T E′
E′ → +T E′ | ε
T → F T′
T′ → *F T′ | ε
F → id | (E)
Example 2:
S → Aa | b
A → Ac | Sd | f
- Order of non-terminals: S, A
For S:
- we do not enter the inner loop.
- there is no immediate left recursion in S.
For A:
- Replace A → Sd with A → Aad | bd
So, we will have A → Ac | Aad | bd | f
- Eliminate the immediate left-recursion in A:
A → bdA′ | fA′
A′ → cA′ | adA′ | ε
So, the resulting equivalent grammar, which is not left-recursive, is:
S → Aa | b
A → bdA′ | fA′
A′ → cA′ | adA′ | ε
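The transformation for immediate left recursion (A → Aα1 | ... | Aαm | β1 | ... | βn becomes A → β1A′ | ... | βnA′ with A′ → α1A′ | ... | αmA′ | ε) can be sketched directly. The list-of-symbols grammar encoding is an assumption of this sketch.

```python
EPS = "ε"

def eliminate_immediate_left_recursion(head, alternatives):
    # Split the alternatives into A → Aα (recursive) and A → β (the rest).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    rest = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}       # nothing to do
    new = head + "'"
    return {
        head: [beta + [new] for beta in rest],          # A  → β A'
        new: [alpha + [new] for alpha in recursive]     # A' → α A' | ε
             + [[EPS]],
    }

# E → E + T | T  becomes  E → T E',  E' → + T E' | ε
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

This reproduces the first step of Example 1 above; indirect left recursion (as in Example 2) additionally needs the substitution step before this one.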
- Why we need Left Factoring? [AU Nov / Dec 2011]
A predictive parser (a top-down parser without backtracking) insists that the grammar be left-factored:
grammar ⇒ a new equivalent grammar suitable for predictive parsing
stmt → if expr then stmt else stmt
     | if expr then stmt
When we see if, we cannot know which production rule to choose to rewrite stmt in the derivation.
- Give an Example for left Factoring?
A → abB | aB | cdg | cdeB | cdfB
⇓
A → aA′ | cdg | cdeB | cdfB
A′ → bB | B
⇓
A → aA′ | cdA′′
A′ → bB | B
A′′ → g | eB | fB
- Write an algorithm that will do the Left Factoring.
For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, say
A → αβ1 | ... | αβn | γ1 | ... | γm
convert it into
A → αA′ | γ1 | ... | γm
A′ → β1 | ... | βn
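A sketch of one such factoring step; the longest-common-prefix search and the list-of-symbols grammar encoding are assumptions of this illustration.

```python
EPS = "ε"

def left_factor_step(head, alternatives):
    # Find the longest prefix shared by at least two alternatives.
    best = []
    for i in range(len(alternatives)):
        for j in range(i + 1, len(alternatives)):
            a, b = alternatives[i], alternatives[j]
            k = 0
            while k < min(len(a), len(b)) and a[k] == b[k]:
                k += 1
            if k > len(best):
                best = a[:k]
    if not best:
        return {head: alternatives}        # nothing to factor
    new = head + "'"
    factored = [alt for alt in alternatives if alt[:len(best)] == best]
    others = [alt for alt in alternatives if alt[:len(best)] != best]
    suffixes = [alt[len(best):] or [EPS] for alt in factored]
    return {head: [best + [new]] + others,  # A  → α A' | γ1 | ... | γm
            new: suffixes}                  # A' → β1 | ... | βn

# stmt → if expr then stmt else stmt | if expr then stmt
result = left_factor_step("stmt", [
    ["if", "expr", "then", "stmt", "else", "stmt"],
    ["if", "expr", "then", "stmt"],
])
```

Applied to the dangling-if alternatives above, it yields stmt → if expr then stmt stmt′ with stmt′ → else stmt | ε, matching the motivation given for left factoring.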
- List out the important features of Predictive Parsing? [AU Nov/Dec 2007]
Predictive parsing:
- no backtracking
- efficient
- needs a special form of grammar (LL(1) grammars)
Recursive predictive parsing is a special form of recursive-descent parsing without backtracking.
The non-recursive (table-driven) predictive parser is also known as an LL(1) parser.
- Give a simple example for Recursive-Descent Parsing (uses Backtracking)
Backtracking is needed. The parser tries to find the left-most derivation.
S → aBc
B → bc | b
For the input abc, the parser first tries S → aBc with B → bc, which fails to match the remaining input, so it backtracks and tries B → b, which succeeds.
- What is a Non-Recursive Predictive Parsing -- LL(1) Parser?
Non-Recursive predictive parsing is a
table-driven parser.
It is a top-down parser.
It
is also known as LL(1) Parser.
- What are the components of the Parsing Table?
The parsing table is a two-dimensional array M[A, a], where:
- each row is a non-terminal symbol,
- each column is a terminal symbol or the special symbol $,
- each entry holds a production rule.
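The M[A, a] table drives a simple stack-based loop. The tiny grammar S → aSb | c used below is an assumed example (it is LL(1)), not one from the text; the table is written as a dict keyed by (non-terminal, terminal).

```python
EPS = "ε"
# Grammar: S → a S b | c
TABLE = {
    ("S", "a"): ["a", "S", "b"],
    ("S", "c"): ["c"],
}
NONTERMINALS = {"S"}

def parse(tokens):
    stack = ["$", "S"]          # start symbol on top of the end marker
    tokens = tokens + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        if top == "$":
            return tokens[i] == "$"         # accept iff input is exhausted
        if top in NONTERMINALS:
            rule = TABLE.get((top, tokens[i]))
            if rule is None:
                return False                # empty entry: syntax error
            # Push the right-hand side in reverse so its first symbol is on top.
            stack.extend(reversed([s for s in rule if s != EPS]))
        elif top == tokens[i]:
            i += 1                          # matched a terminal
        else:
            return False                    # terminal mismatch
    return False
```

For instance, the token lists ["a", "c", "b"] and ["c"] are accepted, while ["a", "c"] is rejected when the b expected from the stack meets the end marker.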
- What is a FIRST (a)?
FIRST(α) is the set of the terminal symbols which occur as first symbols in strings derived from α, where α is any string of grammar symbols.
If α derives ε, then ε is also in FIRST(α).
- What is a FOLLOW (A)?
FOLLOW(A) is the set of the terminals which occur immediately after (follow) the non-terminal A in strings derived from the starting symbol:
- a terminal a is in FOLLOW(A) if S ⇒* αAaβ
- $ is in FOLLOW(A) if S ⇒* αA
- How will you compute FIRST for Any String X ? [AU Nov / Dec 2011]
If X is a terminal symbol ⇒ FIRST(X) = {X}.
If X is a non-terminal symbol and X → ε is a production rule ⇒ ε is in FIRST(X).
If X is a non-terminal symbol and X → Y1Y2...Yn is a production rule:
- if a terminal a is in FIRST(Yi) and ε is in all of FIRST(Yj) for j = 1, ..., i-1, then a is in FIRST(X);
- if ε is in all of FIRST(Yj) for j = 1, ..., n, then ε is in FIRST(X).
If X is ε ⇒ FIRST(X) = {ε}.
If X is a string Y1Y2...Yn:
- if a terminal a is in FIRST(Yi) and ε is in all of FIRST(Yj) for j = 1, ..., i-1, then a is in FIRST(X);
- if ε is in all of FIRST(Yj) for j = 1, ..., n, then ε is in FIRST(X).
- Write an Example to calculate the FIRST(X).
E → TE′
E′ → +TE′ | ε
T → FT′
T′ → *FT′ | ε
F → (E) | id

FIRST(F) = {(, id}       FIRST(TE′) = {(, id}
FIRST(T′) = {*, ε}       FIRST(+TE′) = {+}
FIRST(T) = {(, id}       FIRST(ε) = {ε}
FIRST(E′) = {+, ε}       FIRST(FT′) = {(, id}
FIRST(E) = {(, id}       FIRST(*FT′) = {*}
                         FIRST((E)) = {(}
                         FIRST(id) = {id}
- How will you Compute FOLLOW for non-terminals ?
If S is the start symbol ⇒ $ is in FOLLOW(S).
If A → αBβ is a production rule ⇒ everything in FIRST(β) except ε is in FOLLOW(B).
If A → αB is a production rule, or A → αBβ is a production rule and ε is in FIRST(β) ⇒ everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any FOLLOW set.
- Give an Example to compute the FOLLOW?
E → TE′
E′ → +TE′ | ε
T → FT′
T′ → *FT′ | ε
F → (E) | id

FOLLOW(E) = { $, ) }
FOLLOW(E′) = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T′) = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
- What do we have to do it if the resulting parsing table contains multiply defined entries?
If we did not eliminate left recursion, eliminate the left recursion in the grammar.
If the grammar is not left-factored, left factor the grammar.
If the (new grammar's) parsing table still contains multiply defined entries, the grammar is ambiguous or it is inherently not an LL(1) grammar.
- What are the properties of LL(1) grammar?
A grammar G is LL(1) if and only if the following conditions hold for every two distinct production rules A → α and A → β:
- α and β cannot both derive strings starting with the same terminal.
- At most one of α and β can derive ε.
- If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).
- List out the possible errors that may occur in predictive parsing (LL(1) parsing)
- The terminal symbol on the top of the stack does not match the current input symbol.
- The top of the stack is a non-terminal A, the current input symbol is a, and the parsing table entry M[A,a] is empty.
- What should the parser do in an error case?
- The parser should give an error message (as meaningful an error message as possible).
- It should recover from the error and continue parsing the rest of the input.
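Both error cases show up in the standard table-driven LL(1) driver, sketched here for the usual expression grammar (the table entries below are hand-derived from its FIRST/FOLLOW sets; the function names are our own):

```python
EPS = "eps"
# LL(1) table for: E->TE'; E'->+TE'|eps; T->FT'; T'->*FT'|eps; F->(E)|id
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [EPS], ("E'", "$"): [EPS],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [EPS], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [EPS], ("T'", "$"): [EPS],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens):
    """Return True if the token list is accepted, False on either error case."""
    stack = ["$", "E"]
    toks = tokens + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        if top == EPS:
            continue
        if top in NONTERMINALS:
            prod = TABLE.get((top, toks[i]))
            if prod is None:            # error case 2: M[A,a] is empty
                return False
            stack.extend(reversed(prod))
        elif top == toks[i]:            # match a terminal (including $)
            i += 1
        else:                           # error case 1: top != current input
            return False
    return i == len(toks)
```

A real parser would report and recover at the two `return False` points instead of giving up.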
- List the various error recovery techniques?
- Panic-Mode Error Recovery
- Phrase-Level Error Recovery
- Error-Productions
- Global-Correction
- What will be performed in Panic-Mode and Phrase-Level Error Recovery?
Panic-Mode Error Recovery - skip the input symbols until a synchronizing token is found.
Phrase-Level Error Recovery - each empty entry in the parsing table is filled with a pointer to a specific error routine that takes care of that error case.
- What is an Error Production?
- If we have a good idea of the common errors that might be encountered, we can augment the grammar with productions that generate erroneous constructs.
- When an error production is used by the parser, we can generate appropriate error diagnostics.
- Since it is almost impossible to know all the errors that can be made by programmers, this method is not practical.
- How is panic-mode error recovery performed in LL(1) parsing?
- All the empty entries are marked as synch to indicate that the parser will skip input symbols until it sees a symbol in the follow set of the non-terminal A on top of the stack. The parser then pops that non-terminal A from the stack, and parsing continues from that state.
- To handle an unmatched terminal symbol, the parser pops that unmatched terminal symbol from the stack and issues an error message saying that the unmatched terminal was inserted.
- What is Shift-Reduce parsing?
Bottom-up parsing is also known as shift-reduce parsing because its two main actions are shift and reduce.
- At each shift action, the current symbol in the input string is pushed onto a stack.
- At each reduce action, the symbols at the top of the stack (this symbol sequence is the right side of a production) are replaced by the non-terminal on the left side of that production.
- There are also two more actions: accept and error.
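The four actions can be sketched with a deliberately naive shift-reduce loop for the tiny grammar E → E + id | id. Note the hedge: this greedily reduces whenever a production body matches the stack top, which only works for grammars with no conflicts at those points; a real LR parser consults a parsing table instead.

```python
# Productions, longest body first so the greedy match prefers it.
PRODS = [("E", ["E", "+", "id"]), ("E", ["id"])]

def shift_reduce(tokens):
    """Return (list of actions taken, accepted?) for a token list."""
    stack, actions = [], []
    toks = list(tokens)
    while True:
        # reduce while some production body matches the top of the stack
        reduced = True
        while reduced:
            reduced = False
            for lhs, body in PRODS:
                if len(stack) >= len(body) and stack[-len(body):] == body:
                    del stack[-len(body):]        # pop the handle...
                    stack.append(lhs)             # ...and push the non-terminal
                    actions.append("reduce " + lhs + " -> " + " ".join(body))
                    reduced = True
                    break
        if not toks:
            break
        stack.append(toks.pop(0))                 # shift the next input symbol
        actions.append("shift")
    accepted = stack == ["E"]
    actions.append("accept" if accepted else "error")
    return actions, accepted
```

For input `id + id` this performs three shifts and two reductions (id → E, then E + id → E) before accepting.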
- What are the Conflicts encountered During Shift-Reduce Parsing?
The stack contents and the next input symbol may not decide the action:
- shift/reduce conflict: the parser cannot decide whether to make a shift operation or a reduction.
- reduce/reduce conflict: the parser cannot decide which of several reductions to make.
- What is a non-LR(k) grammar?
If a shift-reduce parser cannot be used for a grammar, that grammar is called a non-LR(k) grammar.
LR(k): L - left-to-right scanning, R - rightmost derivation in reverse, k - number of lookahead symbols.
An ambiguous grammar can never be an LR grammar.
- What are the types of Parser?
- Predictive Parser
- Shift Reduce Parser
- Operator-Precedence Parser - simple, but handles only a small class of grammars.
- LR-Parsers - cover a wide range of grammars.
  - SLR - simple LR parser
  - LR - most general LR parser
  - LALR - intermediate LR parser (lookahead LR parser)
SLR, LR and LALR work the same way; only their parsing tables are different.
- What is an operator grammar? Give an Example.
Ø In an operator
grammar, no production rule can have:
–
e
at the right side
–
two
adjacent non-terminals at the right side.
Ex:
E®AB E®EOE
E®E+E |
A®a
E®id E*E |
B®b
O®+|*|/ E/E
| id
not operator grammar not operator grammar operator
grammar
- What are the precedence relations between a pair of terminals in operator-precedence parsing?
In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals:
a <. b   b has higher precedence than a
a =. b   b has the same precedence as a
a .> b   b has lower precedence than a
- What is the canonical LR(0) collection?
- Sets of LR(0) items will be the states of the action and goto tables of the SLR parser.
- A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for constructing SLR parsers.
- What is an LR(0) item? Give an Example.
An LR(0) item of a grammar G is a
production of G a dot at the some position of the
right side.
Ex: A
® aBb
Possible LR(0) Items: A ® .aBb
(four different possibility) A
® a.Bb
A
® aB.b
A
® aBb.
- What is an Augmented Grammar?
G' is G with a new production rule S'→S, where S' is the new starting symbol.
- How will you find the Closure (I)?
If
I is a set of LR(0) items
for a grammar G, then closure(I) is the set of LR(0) items constructed from I by the two rules:
- Initially, every LR(0) item in I is added to closure(I).
- If A ® a.Bb is in closure(I) and Bg® is a production rule of G; then B®.g will be in the closure(I). We will apply this rule until no more new LR(0) items can be added to closure(I).
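The two closure rules translate almost directly into code. A sketch (our own encoding: an LR(0) item is a `(lhs, body, dot_position)` tuple, over the augmented expression grammar):

```python
# Augmented expression grammar for the closure example.
GRAMMAR = {
    "E'": [["E"]],                      # S' -> S style augmenting production
    "E":  [["E", "+", "T"], ["T"]],
    "T":  [["T", "*", "F"], ["F"]],
    "F":  [["(", "E", ")"], ["id"]],
}

def closure(items):
    result = set(items)                 # rule 1: every item of I is in closure(I)
    changed = True
    while changed:
        changed = False
        for lhs, body, dot in list(result):
            # rule 2: dot immediately before a non-terminal B adds B -> .gamma
            if dot < len(body) and body[dot] in GRAMMAR:
                for prod in GRAMMAR[body[dot]]:
                    item = (body[dot], tuple(prod), 0)
                    if item not in result:
                        result.add(item)
                        changed = True
    return result
```

Starting from the single kernel item E' → .E, the closure pulls in the dotted form of every production, giving seven items in total.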
- What is an SLR(1) Parser?
- An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G.
- If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short).
- Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.
- What conflicts will be encountered in an SLR Parser?
- If a state does not know whether it will make a shift operation or a reduction for a terminal, we say that there is a shift/reduce conflict.
- If a state does not know whether it will make a reduction using production rule i or j for a terminal, we say that there is a reduce/reduce conflict.
- If the SLR parsing table of a grammar G has a conflict, we say that the grammar is not an SLR grammar.
- Eliminate the Left Recursion from the following Grammars? [AU May/Jun 2007] [AU May / June 2013]
A → Ac | Aad | bd | c (or) A → Ac | Aad | bd | ε
Solution:
For A → Ac | Aad | bd | c:   A → bdA' | cA'   and   A' → cA' | adA' | ε
For A → Ac | Aad | bd | ε:   A → bdA' | A'    and   A' → cA' | adA' | ε
- What is predictive parser? [AU Nov/Dec 2007]
A predictive parser is an efficient way of implementing recursive-descent parsing (top-down parsing without backtracking) by maintaining a stack explicitly rather than implicitly via recursive calls. It has an input, a stack, a parsing table and an output.
- Is the grammar S → SS | (S) | ( ) ambiguous? [AUT Nov/Dec 2010]
Yes. The production S → SS makes the grammar ambiguous: a string such as ( )( )( ) has two different parse trees, depending on whether the first two or the last two parenthesized substrings are grouped together first.
- Determine the FIRST(S) for the following grammar. [AUT Nov/Dec 2010]
S → ScB | B
B → e | efg | efCg
C → SdC | S
Removing left recursion:
S → BS'
S' → cBS' | ε
FIRST(S) = FIRST(B) = { e }.
- Translate a*-(b+c) into postfix code. [ AUT Nov/Dec 2010]
ANS: abc+-*
- Is the following grammar G LL(1)? E → A | B   A → a | c   B → b | c [AUT Nov/Dec 2010]
No. FIRST(A) = {a, c} and FIRST(B) = {b, c} intersect at c, so the parsing table entry M[E, c] is multiply defined. Hence the grammar is not LL(1).
- Derive FIRST and FOLLOW for the following grammar. [AUT Nov/Dec 2010]
S → 0 | 1 | AS0 | BS0   A → ε   B → ε
FIRST(S) = {0, 1}   FIRST(A) = { ε }   FIRST(B) = { ε }
FOLLOW(S) = {0, $}
FOLLOW(A) = FOLLOW(B) = FIRST(S0) = {0, 1}
- What is an annotated parse tree? [AU May / June 2013]
A parse tree
showing the attribute values at each node is called an annotated parse tree.
- Write the difference between Top down parsing and Bottom up parsing. [AU May / June 2013]
- Top-down parsers: starts constructing the parse tree at the top (root) of the parse tree and move down towards the leaves. Easy to implement by hand, but work with restricted grammars.
examples:
- Predictive parsers (e.g., LL(k))
- Bottom-up parsers: build the nodes on the bottom of the parse tree first. Suitable for automatic parser generation, handle a larger class of grammars.
examples:
– shift-reduce parser (or LR(k) parsers)
Both are general techniques that can be made to work for all
languages (but not all grammars!)
- Define handle and handle pruning process. [AU Nov / Dec 2012]
Handle:
A ‘handle’ of a string is a substring that matches the right side
of a production and whose reduction to the non-terminal on the left side of the
production represents one step along the reverse of a rightmost derivation.
Handle Pruning: Reducing β to A in
αβω is called handle pruning,i.e., removing the children of A from the parse
tree. A rightmost derivation in reverse can be obtained by handle pruning.
- Write the LR parsing algorithm. [AU Nov / Dec 2012]
An LR parser consists of an input, an output, a stack, a driver program and a parsing table that has two functions:
- action
- goto
The driver program is the same for all LR parsers; only the parsing table changes from one parser to another.
The parsing program reads characters from an input buffer one at a time. The program uses a stack to store a string of the form S0X1S1X2S2...XmSm, where Sm is on top. Each Si is a symbol called a state and each Xi is a grammar symbol.
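The driver loop can be sketched for the small grammar S → CC, C → cC | d. The ACTION/GOTO tables below are hand-built from its LR(0) item sets (an SLR-style sketch in our own encoding, not a table-construction algorithm):

```python
# 's' = shift to state, 'r' = reduce by production index, 'acc' = accept.
ACTION = {
    (0, "c"): ("s", 3), (0, "d"): ("s", 4),
    (1, "$"): ("acc",),
    (2, "c"): ("s", 3), (2, "d"): ("s", 4),
    (3, "c"): ("s", 3), (3, "d"): ("s", 4),
    (4, "c"): ("r", 1), (4, "d"): ("r", 1), (4, "$"): ("r", 1),
    (5, "$"): ("r", 0),
    (6, "c"): ("r", 2), (6, "d"): ("r", 2), (6, "$"): ("r", 2),
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 6}
PRODS = [("S", 2), ("C", 1), ("C", 2)]   # 0: S->CC, 1: C->d, 2: C->cC

def lr_parse(tokens):
    """The driver loop; the same code serves any LR parser, only the tables change."""
    stack = [0]                          # stack of states S0 ... Sm
    toks = tokens + ["$"]
    i = 0
    while True:
        act = ACTION.get((stack[-1], toks[i]))
        if act is None:
            return False                 # empty entry: error
        if act[0] == "acc":
            return True
        if act[0] == "s":                # shift: push the new state, advance input
            stack.append(act[1])
            i += 1
        else:                            # reduce: pop |body| states, push GOTO state
            lhs, length = PRODS[act[1]]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
```

`lr_parse(["c","d","d"])` shifts c and d, reduces C → d, then C → cC, shifts the second d, reduces again, and accepts.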
- What is Run-time organization?
To implement the
abstractions embodied in the source language, a compiler creates and manages a
run-time environment in concert with the operating system and the target
machine. It has static area for the object code and the static data objects
created at compile time. It also has dynamic stack and heap areas for managing
objects created and destroyed as the target program executes.
- What is Control stack?
Procedure calls and returns are usually managed by a run-time stack called the control stack. We can use a stack because procedure calls (activations) nest in time; i.e., if p calls q, then the activation of q is nested within the activation of p.
- What is Stack allocation?
Storage for local variables can be allocated on a run-time stack for languages that allow or require local variables to become inaccessible when their procedures end. For such languages, each live activation has an activation record on the control stack, with the root of the activation tree at the bottom and the entire sequence of activation records on the stack corresponding to the path in the activation tree to the activation where control currently resides. The latest activation has its record at the top of the stack.
16 Mark Questions
1.
Determine the FIRST and FOLLOW for the Grammar G with productions
S->0/1/AS0/BS0 A->ε
B->ε. Construct a LL(1) parsing table and prove or disprove if the
grammar is LL(1)
2.
Explain how error recovery is done in predictive parsing with suitable example.
3.
Construct Predictive Parsing table for the following grammar S->(L) / a L->L,S/S
Check
whether the following sentences belong to the above grammar or not.
i.)
(a,a) ii.) (a,(a,a))
4.
Construct a CLR parser for the grammar: S->CC, C->cC / d
5.
Give the canonical collection of sets of LR(0) items for the grammar given
below:
E’ ->E
E -> E + T / T
T-> T * F / F
F-> (E) / id
6. Perform LR parsing and derive the input a(a,a(a,a)) using the below given grammar:
S’->S
S->a(L)
S->a
L->S,L
L->S
7. Explain the
working of top down parsing and bottom up parsing?
8. Describe the
error detection and recovery process involved in the lexical phase.
9. Write an algorithm for generating
LR item sets and constructing SLR Parsing table.
10. Write about LALR parser.
11. What are the different
strategies that a parser can employ to recover from syntax errors?
12. Construct the CLR parsing table
form
S->AA
A->Aa/b
13. Write Operator-precedence
parsing algorithm.
14. Compare LR, SLR, Canonical LR,
LALR parser with their advantages.
15. Design a predictive parsing
table for the grammar
E->
E+T/T
T->
T*F/F
F->(E)/id
and
show the moves for input string id*id+id.
16. Design an operator precedence
parser for the grammar
E-> E+T/T
T-> T*F/F
F->(E)/id
and
draw the precedence function graph.
17. How the Storage Organization is performed
in Run-Time Environment?
18. What are the available Storage allocation strategies?
UNIT III Intermediate Code Generation
- What are the different intermediate languages available? [AU Nov/Dec 2007][AU May / June 2013, Nov/Dec 2013]
- Syntax trees can be used as an intermediate language.
- Postfix notation can be used as an intermediate language.
- Three-address code (quadruples) can be used as an intermediate language
  - we will use quadruples to discuss intermediate code generation
  - quadruples are close to machine instructions, but they are not actual machine instructions.
- Some programming languages have well defined intermediate languages.
  - Java - Java Virtual Machine
  - Prolog - Warren Abstract Machine
  - In fact, there are byte-code emulators to execute instructions in these intermediate languages.
- What are various representation of Three address statement? [AU Nov/Dec 2007]
Quadruples:A
Quadruple is a record structure with four fields, op,arg1,arg2 and result.
The op field contains an internal code for the operator. Eg. X’ = Y+Z is represented by
placing + in op, y in arg1, z in arg2 and x in result.
Triples: It is a record with only three
fields:op, arg1,arg2.Eg. X op Y
Indirect Triples: It will list pointers
to the triples rather than the triples itself.
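A tiny emitter can illustrate how quadruples are produced from an expression (a sketch; the tuple encoding of trees and the temporary names t1, t2, ... are our own convention):

```python
from itertools import count

def gen_quads(tree):
    """tree: ('id', name) | (op, operand) | (op, left, right).
    Returns the quadruple list (op, arg1, arg2, result) for the expression."""
    quads = []
    temps = count(1)

    def emit(node):
        if node[0] == "id":
            return node[1]                      # an identifier is its own address
        args = [emit(child) for child in node[1:]]
        t = "t%d" % next(temps)                 # fresh temporary for the result
        # pad unary operators so every quadruple has all four fields
        quads.append((node[0], args[0], args[1] if len(args) > 1 else None, t))
        return t

    emit(tree)
    return quads

# a = b * -c  produces:  t1 = uminus c ; t2 = b * t1
quads = gen_quads(("*", ("id", "b"), ("uminus", ("id", "c"))))
```

Dropping the result field and referring to operands by instruction position would turn the same list into the triple form; keeping a separate list of pointers into the triples gives indirect triples.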
- What are the properties of three address code?
- A three-address instruction has at most one operator in addition to the assignment symbol. The compiler has to decide the order in which operations are to be done.
- The compiler must generate a temporary name to hold the value computed by each instruction.
- Some three-address instructions can have fewer than three operands.
- List the forms of three-address instructions.
- Assignment instruction of form x = y op z.
- Assignment instruction of form x = op y.
- Copy instruction of form x = y.
- An unconditional jump goto L.
- Conditional jumps of form if x goto L and ifFalse x goto L.
- Conditional jump such as if x relop y goto L.
- Procedure calls such as
  param x1
  param x2
  ...
  param xn
  call p, n
- Indexed copy instructions of form x = y[i] and x[i] = y.
- Address and pointer assignments of form x = &y, x = *y and *x = y.
- What is record in type expression?
A record is a data
structure with named fields. A type expression can be formed by applying the
record type constructor to the field names and their types. Record types will
be implemented by applying the constructor record to a symbol table containing
entries for the fields.
- How does the use of a name X for a field within a record not conflict with other uses of the name outside the record?
The three uses of X in the following declarations are distinct and do not conflict with each other.
float X;
record { float X; float Y; } p;
record { int tag; float X; float Y; } q;
A subsequent assignment X = p.X + q.X; sets variable X to the sum of the fields named X in the records p and q. The relative address of X in p differs from the relative address of X in q.
- How can the Elements of arrays be accessed in a One Dimensional array?
location of A[i] = baseA + (i - low) * width
This can be re-written as:
i * width + (baseA - low * width)
The first term must be computed at run-time; the second term can be computed at compile-time.
So, the location of A[i] can be computed at run-time by evaluating the formula i * width + c, where c is (baseA - low * width), which is evaluated at compile-time.
- How can the Elements of arrays be accessed in a two Dimensional array?
The location of A[i1, i2] is baseA + ((i1 - low1) * n2 + i2 - low2) * width
baseA is the location of the array A
low1 is the index of the first row
low2 is the index of the first column
n2 is the number of elements in each row
width is the width of each array element
Again, this formula can be re-written as:
((i1 * n2) + i2) * width + (baseA - ((low1 * n2) + low2) * width)
The first term must be computed at run-time; the second term can be computed at compile-time.
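The compile-time/run-time split in both formulas is plain arithmetic, so it can be checked directly (a sketch; the concrete base addresses and bounds below are made-up example values):

```python
def addr_1d(base, i, low, width):
    """Location of A[i] = base + (i - low) * width, split into two parts."""
    c = base - low * width               # compile-time constant
    return i * width + c                 # run-time part uses only i

def addr_2d(base, i1, i2, low1, low2, n2, width):
    """Row-major location of A[i1, i2], split the same way."""
    c = base - (low1 * n2 + low2) * width    # compile-time constant
    return (i1 * n2 + i2) * width + c        # run-time part uses only i1, i2

# The split agrees with the direct formulas:
assert addr_1d(1000, 5, 1, 4) == 1000 + (5 - 1) * 4
assert addr_2d(1000, 2, 3, 1, 1, 10, 4) == 1000 + ((2 - 1) * 10 + 3 - 1) * 4
```

Note the constant term in the 2-D case uses n2 (elements per row), matching the expansion of the direct formula.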
- How to implement statements using control flow?
Statements can be translated by inheriting a label next, where next marks the first instruction after the code for this statement. The conditional S → if (B) S1 can be translated by attaching a new label marking the beginning of the code for S1 and passing that new label and S.next as the true and false exits, respectively, of B.
- How to generate jumping code for Boolean expressions?
Jumping code is
useful because a Boolean expression is used for control flow as in if(B)S.
Boolean values can be computed by jumping to t=true or t=false, where t is the
temporary name. Using labels for jumps, a Boolean expression can be translated
by inheriting the labels corresponding to its true or false exits. The
constants true and false translate into the jump to the true and false exits,
respectively.
- How is the Translation of Switch-Statements performed?
The intended translation of switch is code to:
1. Evaluate the expression E.
2. Find the value Vj in the list of cases.
3. Execute the statement Sj associated with the value found.
- What is meant by back patching? [AU Nov /Dec 2012] [AU May / June 2013]
By
generating a series of branching statement with the targets of the jumps temporarily left unspecified. Each such
statement will be put on a list of go to statements
whose labels will be filled in when the proper label can be determined. This type of subsequent filling in of
labels as back patching.
- Define Backpatching. [AU May/Jun 2007, 2009] [AUT Nov/Dec 2010]
It is a technique to solve the problem of replacing symbolic names in goto statements by the actual target address. Backpatching can be done for:
- One-pass code generation
- Boolean expressions
- Flow-of-control statements
- Break, Continue and Goto statements.
- Write the syntax directed translation for procedure call.
Sà call id (Elist)
{ for each item p on queue do
emit (‘param’ p);
emit (‘call’ id.place) }
15. What are the two representations to express intermediate languages?
- Graphical representation.
- Linear representation.
- Define triple and give one example. [AU Nov/Dec 2007]
A triple is a data structure with 3 fields: operator, argument-1, argument-2.
Example: a = b * -c
|     | operator | Argument-1 | Argument-2 |
| (0) | uminus   | c          |            |
| (1) | *        | b          | (0)        |
| (2) | =        | a          | (1)        |
- Define quadruple and give one example. [AU Nov/Dec 2007]
A quadruple is a data structure with 4 fields: operator, argument-1, argument-2 and result.
Example: a = b * -c
|     | Operator | Argument-1 | Argument-2 | Result |
| (0) | uminus   | c          |            | T1     |
| (1) | *        | b          | T1         | T2     |
| (2) | =        | T2         |            | A      |
18. What is the merit of quadruples?
If we move a statement computing a variable A, then the statements using A require no change. That is, A can be computed anywhere before the utilization of A.
19. Generate three address code for "if A=B then 1 else 0", using numerical method.
(1) if A=B goto (4)
(2) T1 = 0
(3) goto (5)
(4) T1 = 1
(5) ...
- What are the benefits of intermediate code generation? [AU May/Jun 2009]
- A compiler for a different machine can be created by attaching a different back end to the existing front end.
- A compiler for a different source language can be created by providing a different front end for that source language to the existing back end.
- A machine-independent code optimizer can be applied to the intermediate code in order to optimize the generated code.
- What are the various types of intermediate code representation? [AU Nov/Dec 2007] [AU Nov / Dec 2011][AU Nov/Dec 2013]
There are mainly three types of
intermediate code representations.
Ø Syntax tree
Ø Postfix notation
Ø Three address
code - Quadruple, Triple and Indirect
triple
- Give the syntax-directed definition for if-else statement.
1. S → if E then S1
   E.true := new_label()
   E.false := S.next
   S1.next := S.next
   S.code := E.code || gen_code(E.true ':') || S1.code
2. S → if E then S1 else S2
   E.true := new_label()
   E.false := new_label()
   S1.next := S.next
   S2.next := S.next
   S.code := E.code || gen_code(E.true ':') || S1.code || gen_code('goto', S.next) || gen_code(E.false ':') || S2.code
- Mention the functions that are used in backpatching for one-pass code generation.
- mklist(i) creates a new list. The index i is passed as an argument to this function, where i is an index into the array of quadruples.
- merge_list(p1, p2) concatenates the two lists pointed to by p1 and p2. It returns a pointer to the concatenated list.
- backpatch(p, i) inserts i as the target label for each of the statements on the list pointed to by p.
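These three primitives are tiny when sketched over a quadruple array (a sketch using the function names above; the `[op, arg, target]` encoding and the emit helper are our own):

```python
quads = []                                # each quadruple: [op, arg, target]

def emit(op, arg=None):
    quads.append([op, arg, None])         # jump target left unspecified for now
    return len(quads) - 1                 # index of the new quadruple

def mklist(i):
    return [i]                            # new list containing only index i

def merge_list(p1, p2):
    return p1 + p2                        # concatenated list of jump indices

def backpatch(p, label):
    for i in p:                           # fill label in as the target of
        quads[i][2] = label               # every quadruple on the list

# two pending unconditional jumps, later patched to target label 100
pending = merge_list(mklist(emit("goto")), mklist(emit("goto")))
backpatch(pending, 100)
```

After the `backpatch` call, both previously incomplete gotos carry target 100, which is exactly the one-pass scenario: emit first, learn the label later.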
- What are the properties of intermediate languages? [AU May/Jun 2007]
Retargeting is facilitated; a compiler for a different machine can be created by attaching a back end for the new machine to an existing front end.
- Differentiate triple representation from indirect representation in the three address format. [AUT Nov/Dec 2010]
Triples: It is a record structure with three fields:
op, arg1 and arg2. The field’s arg1 and arg2 are either pointers to the symbol
table or pointers into the triple structure. It is used to avoid temporary
names into the symbol table.
Indirect triples: listing pointers to triples rather
than listing the triples themselves are called indirect triples. It saves some
space compared to quadruples.
- Translate the expression (a+b)*(c+d)+(a+b+c) into quadruples. [AUT Nov/Dec 2010]
|     | Op | Arg1 | Arg2 | Result |
| (0) | +  | a    | b    | t1     |
| (1) | +  | c    | d    | t2     |
| (2) | +  | t1   | c    | t3     |
| (3) | *  | t1   | t2   | t4     |
| (4) | +  | t4   | t3   | t5     |
- Compare a triple and Quadruple. [AU Nov / Dec 2011]
Triples: It is a record structure with three fields: op, arg1 and arg2. The fields arg1 and arg2 are either pointers to the symbol table or pointers into the triple structure. This avoids entering temporary names into the symbol table.
Quadruples: A quadruple is a record structure with four fields: op, arg1, arg2 and result. The op field contains an internal code for the operator.
- Translate a*-(b+c) into postfix code. [AUT Nov/Dec 2010]
Postfix notation is abc+-*
- Write the difference between Syntax Tree and DAG. [AU May / June 2013]
A syntax tree is a graph structure. Nodes in a syntax tree correspond to constructs in the source program, and edges represent the relations between them. For example, let's take the case of a very simple expression: 1 + 2. It contains two sub-expressions, the constant expressions 1 and 2, and the operator +. The syntax tree for this expression has a root node representing a binary operation with the operator + and two child nodes representing the constant expressions 1 and 2.
A directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of edges that eventually loops back to v again. In a DAG for an expression, common subexpressions share a single node instead of being duplicated.
30. Why is SDT needed? [AU May / June 2013]
Syntax-directed translation (SDT) refers to a method of compiler implementation where the source language translation is completely driven by the parser, i.e., based on the syntax of the language. The parsing process and parse trees are used to direct semantic analysis and the translation of the source program. Almost all modern compilers are syntax-directed. SDT can be a separate phase of a compiler, or we can augment our conventional grammar with information to control the semantic analysis and translation. Such grammars are called attribute grammars.
31. How can Procedure or Function calls be translated?
Procedure calls can be translated by using the concepts of:
- Function types
- Symbol tables
- Type checking
- Function calls
16
mark Questions
1. Describe
the method of generating syntax-directed definition for Control statements.
2. Give the semantic for declarations in a
procedure.
3. Explain the generation of intermediate
code for declaration statement.
4. Discuss the methods for translating
Boolean expression.
5. Brief intermediate code generation for
basic block, control flow and Boolean expression.
6. Give the syntax-directed definition for
flow of control statements.
7. How back patching can be used to
generate the code for Boolean expression and flow of control statements.
8. Write the short note on procedure calls.
9. Explain about the different type of
three address statements.
10. Explain in detail how three address
code are generated and implemented.
11. Write syntax directed translation for
desktop calculator implementation. Using the SDT compute the value of 19+4*15.
12. Write short notes on Quadruples,
Triples and Indirect triples.
13. Discuss about back patching techniques
with example.
14. Detail the translation scheme for flow
control statements and procedure call
15. Discuss about back patching of Boolean expression.
UNIT IV Code
Generation
1. What is Code generation?
Code generation is the final phase of a compiler. The code generator maps the intermediate representation produced by the front end (or, if there is a code optimization phase, by the code optimizer) into the target program.
2. What are the issues in the design of a code generator? [AU May/Jun 2009] [AUT Nov/Dec 2010]
The input to the code generator is the intermediate code in which the source program is represented; the output is the target machine language. Mapping names in the source program to addresses of data objects in run-time memory is done cooperatively by the front end and the code generator. The main issues are:
- Instruction selection
- Register allocation
- Choice of evaluation order
- Approaches to code generation.
3. What are the problems identified while using registers?
During register allocation, we select the set of variables that will reside in registers at a point in the program. During a subsequent register assignment phase, we pick the specific register that a variable will reside in.
4.
What is an
activation record?
Information needed during an execution
of a procedure is kept in a block of storage called an activation record.
Storage for names local to the procedure also appears in the activation record.
- What is a basic block?[AU Nov/Dec 2007] [AU May / June 2013]
A basic block is a maximal sequence
of consecutive three-address statements in which flow of control can only enter
at the first statement of the block and leave at the last statement without
halting or branching except possibly at the last statement in the basic block.
- How do we allocate the space for the generated target code and the data objects of our source programs?
- The places of the data objects that can be determined at compile time will be allocated statically.
- But the places for some of the data objects will be allocated at run-time.
- The allocation and de-allocation of the data objects is managed by the run-time support package.
  - The run-time support package is loaded together with the generated target code.
  - The structure of the run-time support package depends on the semantics of the programming language (especially the semantics of procedures in that language).
- Each execution of a procedure is called an activation of that procedure.
- What is an activation tree? [AU Nov/Dec 2007] [AU Nov / Dec 2011]
We can use a tree (called activation
tree) to show the way control enters and leaves activations.
In an activation tree:
–
Each
node represents an activation of a procedure.
–
The
root represents the activation of the main program.
–
The
node a is a parent of the node b iff the control flows from a to b.
–
The
node a is left to to the node b iff the lifetime of a occurs before the
lifetime of b.
- What is the flow of control in a program? [AU Nov/Dec 2007] [AU May / June 2013]
A flow graph is a graphical
representation of a program in which the nodes of a graph are basic blocks and
the edge of the graph shows how control can flow among the blocks.
- What is the use of Control stack?
A stack (called the control stack) can be used to keep track of live procedure activations.
- An activation record is pushed onto the control stack as the activation starts.
- That activation record is popped when that activation ends.
When node n is at the top of the control stack, the stack contains the nodes along the path from n to the root.
- What do you mean by scope of the variable?
The scope rules of the language determine which declaration of a name applies when the name appears in the program.
An occurrence of a variable (a name) is:
- local: if that occurrence is in the same procedure in which that name is declared.
- non-local: otherwise (i.e., it is declared outside of that procedure).
Example:
procedure p;
  var b: real;
  procedure q;
    var a: integer;
  begin
    a := 1;   { a is local }
    b := 2;   { b is non-local }
  end;
begin ... end;
- What are the contents of the activation record? [AU May / June 2012]
- return value
- actual parameters
- optional control link
- optional access link
- saved machine status
- local data
- temporaries
- Who allocates an activation record of a procedure?
- Some part of the activation record of a procedure is created by that procedure immediately after the procedure is entered.
- Some part is created by the caller of the procedure before the procedure is entered.
- Who deallocates an activation record of a procedure?
- The callee de-allocates the part allocated by the callee.
- The caller de-allocates the part allocated by the caller.
- What do you mean by the dangling reference problem? Give an example.
Whenever storage is de-allocated, the problem of dangling references may occur.
Example:
int *dangle() {
    int i = 2;
    return &i;     /* address of a local - invalid once dangle returns */
}

int main() {
    int *p;
    p = dangle();  /* p is now a dangling reference */
}
- What are the various parameter passing mechanisms?
- Call-by-value
- Call-by-reference
- Call-by-name (used by Algol)
- What are the two types of transformation that can be applied to basic blocks? [AU May/Jun 2007] [AU Nov / Dec 2011]
- Structure preserving Transformations
  - Common subexpression elimination
  - Dead code elimination
  - Renaming of temporary variables
  - Interchange of two independent adjacent statements
- Algebraic Transformations
- Translate the following expression in to three address code sequence. d: = (a-b)+(a-c)+(a-c).
t := a-b
u :=a-c
v :=t+u
d := v+u
- Generate the code for the above three address code sequence.
MOV a, R0
SUB b, R0
MOV a, R1
SUB c, R1
ADD R1, R0
ADD R1, R0
MOV R0, d
- What is a DAG? State its uses? [AU May/Jun 2007] [AUT Nov/Dec 2010][AU Nov/Dec 2013]
Directed Acyclic Graphs are useful data structures for implementing transformations on basic blocks. A DAG gives a picture of how the value computed by each statement in the basic block is used in subsequent statements of the block.
- Write an algorithm to construct a DAG?
Input: A basic block.
Output: A DAG for the basic block containing the following information: a label for each node (for leaves the label is an identifier, for interior nodes an operator symbol) and, for each node, a list of attached identifiers.
Method (for each statement x := y op z):
- If node(y) is undefined, create a leaf labeled y; do the same for z.
- Determine whether there is a node labeled op whose left child is node(y) and whose right child is node(z).
- If not, create such a node n.
- Delete x from the list of attached identifiers of node(x), append x to the list of identifiers attached to n, and set node(x) to n.
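The algorithm can be sketched compactly in Python (a sketch with our own encoding: statements are `(result, op, arg1, arg2)` tuples and `node_of` plays the role of the attached-identifier bookkeeping):

```python
def build_dag(block):
    """block: list of (result, op, arg1, arg2) statements of form x := y op z.
    Returns (nodes, node_of): the DAG nodes and the node each name is attached to."""
    nodes = []                            # node i is ('leaf', name, None) or (op, l, r)
    node_of = {}                          # identifier -> index of its current node

    def leaf(name):
        if name not in node_of:           # "if node(y) is undefined, create a leaf"
            nodes.append(("leaf", name, None))
            node_of[name] = len(nodes) - 1
        return node_of[name]

    for result, op, a1, a2 in block:
        key = (op, leaf(a1), leaf(a2))
        if key in nodes:                  # existing node with same op and children:
            n = nodes.index(key)          # a common subexpression, so reuse it
        else:
            nodes.append(key)
            n = len(nodes) - 1
        node_of[result] = n               # re-attach the identifier to node n
    return nodes, node_of
```

Running it on the basic block below (d := b*c; e := a+b; b := b*c; a := e-d) attaches both d and b to the same `*` node, which is the common-subexpression detection the DAG is for.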
- What is the use of Next-use information? [May / June 2012]
If the name in a register is no longer needed, then the register can be assigned to some other name. This idea can be applied in a number of contexts.
- Construct the dag for the following basic block.
d := b*c
e : = a+b
b := b*c
a := e-d
- Write the steps for constructing LEADERS in a Basic block. [AU May/Jun 2009]
i. The first statement is a leader.
ii. Any statement that is the target of a conditional or unconditional goto is a leader.
iii. Any statement that immediately follows a goto or conditional goto statement is a leader.
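The three leader rules partition a block of three-address code directly (a sketch; the tuple encoding and 0-based instruction indices are our own convention):

```python
def leaders(code):
    """code: list of instruction tuples; jumps are ('goto', target) or
    ('if', cond, target), with targets as 0-based instruction indices."""
    lead = {0}                           # rule i: first statement is a leader
    for i, instr in enumerate(code):
        if instr[0] in ("goto", "if"):
            lead.add(instr[-1])          # rule ii: target of a (conditional) goto
            if i + 1 < len(code):
                lead.add(i + 1)          # rule iii: statement after a jump
    return sorted(lead)

def basic_blocks(code):
    """Split the code at its leaders into basic blocks."""
    ls = leaders(code) + [len(code)]
    return [code[ls[i]:ls[i + 1]] for i in range(len(ls) - 1)]
```

On the "if A=B then 1 else 0" sequence from Unit III, the leaders are instructions 0, 1, 3 and 4, giving four basic blocks.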
- State the importance of Next-use information. [AUT Nov/Dec 2010]
Next-use information is a collection
of all the names that are used for next subsequent statements in a block. This
information is generally collected for register allocation and temporary names
or variables allocation.
- What do you mean by instruction cost? [AU Nov / Dec 2011]
The cost of an instruction is one plus the costs associated with the source and destination address modes. This cost corresponds to the length of the instruction.
- Define Register allocation and Register assignment?
Register
allocation: It
is the process of deciding which IR values to keep in registers. Graph coloring
is an effective technique for doing register allocation in compilers.
Register
assignment: It is the process of deciding which register should hold a
given IR value.
- Mention the issues to be considered while applying the techniques for code optimization.
Ø The transformed program must remain semantically equivalent to the source program.
Ø The improvement
over the program efficiency must be achieved without changing the algorithm of
the program.
- What are the basic goals of code movement?
- To reduce the size of the code, i.e., to improve space complexity.
- To reduce the frequency of execution of code, i.e., to improve time complexity.
- What do you mean by machine dependent and machine independent optimization?
Ø The machine
dependent optimization is based on the characteristics of the target machine
for the instruction set used and addressing modes used for the instructions to
produce the efficient target code.
Ø The machine
independent optimization is based on the characteristics of the programming
languages for appropriate programming structure and usage of efficient
arithmetic properties in order to reduce the execution time.
- List the different storage allocation strategies.
The strategies are:
Ø Static allocation
Ø Stack allocation
Ø Heap allocation
- What are the contents of activation record? [AU May/Jun 2007] [AUT Nov/Dec 2010]
The activation record is a block of memory used for managing the information needed by a single execution of a procedure. The various fields of an activation record are:
Ø Temporary variables
Ø Local variables
Ø Saved machine
registers
Ø Control link
Ø Access link
Ø Actual parameters
Ø Return values
- What is code motion? [AU May/Jun 2007] [AUT Nov/Dec 2010] [AU May / June 2012]
Code motion is an optimization technique that decreases the amount of code in a loop. This transformation is applicable to an expression that yields the same result independent of the number of times the loop is executed. Such an expression is placed before the loop.
- What are the various ways to pass a parameter in a function?
Ø Call by value
Ø Call by reference
Ø Copy-restore
Ø Call by name
- What is dynamic scope rule? [AUT Nov/Dec 2011]
Under dynamic
scope, a new activation inherits the existing bindings of nonlocal names to
storage. There are two approaches to implement dynamic scope:
a. Deep access
b. Shallow access.
- Give any four applications of DAG. [AU May / June 2013]
Ø We can detect common sub expressions automatically.
Ø We can determine the identifiers which have their values used in the block.
Ø We can determine the statements which compute values
that could be used outside the block.
16 mark questions
1. Explain various issues involved in the design of code generation. (or) Discuss about the issues in the design of code generation.
2.Explain the code generation algorithm.
3. Explain the process of constructing DAG representation of
Basic block for the below given code segment.
{
prod=0;
i=1;
do
{
prod = prod + a[i]*b[i];
i=i+1;
}
while(i<=10)
}
4. Explain in detail about primary
structure-preserving transformation on Basic blocks.
5. Explain the construction of DAG for the Basic
blocks in detail.
6. Discuss the runtime storage management of a code
generator.
7. Describe a simple Code Generator. Explain how it generates code by an
example.
8. Discuss about
the data structure used for constructing symbol table.
9. Describe in
detail the source language issues.
10. Discuss the
various strategies for recovering from syntax error.
11. Explain the
methods used to access nonlocal names.
UNIT V Code Optimization
- What is meant by code optimization? [AU Nov/Dec 2013]
Eliminating unnecessary instructions in the object code, or replacing one sequence of instructions by a faster sequence that does the same thing, is called code improvement or code optimization. It has two types,
· Local code optimization – code improvement within a basic block
· Global code optimization – code improvement across basic blocks.
- What is an optimizing compiler?
Compilers that apply code-improving transformations are called optimizing compilers. They should preserve the semantics of the original program.
- What are the properties of optimizing compiler? [AU May / June 2013]
It should produce a minimum amount of target code. There should not be any unreachable code, and dead code should be completely removed. The optimizing compiler should apply the following code-improving transformations on the source language:
i) Common sub-expression elimination
ii) Dead code elimination
iii) Code movement
iv) Strength reduction
- What is copy propagation?
A copy statement, u = v, assigns the value of one variable, v, to another, u. After the copy, we can replace later uses of u by v; this may turn the copy statement into dead code, which can then be eliminated.
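A minimal sketch of copy propagation over straight-line three-address code might look as follows. The (target, args) tuple encoding and the function name are assumptions for illustration; a real pass would also have to handle control flow.

```python
# Copy-propagation sketch: after u = v, later uses of u read v directly,
# which may leave the copy statement dead.

def propagate_copies(stmts):
    copies = {}                       # u -> v for copies still valid here
    out = []
    for target, args in stmts:
        # replace uses first: they refer to values before this assignment
        args = [copies.get(a, a) for a in args]
        # an assignment to x invalidates copies that mention x
        copies = {u: v for u, v in copies.items()
                  if u != target and v != target}
        if len(args) == 1:            # this statement is itself a copy
            copies[target] = args[0]
        out.append((target, args))
    return out

prog = [('u', ['v']), ('w', ['u', 'y']), ('z', ['u'])]
print(propagate_copies(prog))
```

After propagation, every use of u reads v directly, so the copy u = v becomes a candidate for dead-code elimination.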
- What is a Function – Preserving Transformation?
There
are number of ways in which a compiler can improve a program without changing
the function it computes. Common sub Expression Elimination, Copy propagation,
dead code elimination and constant folding are example for the above
transformation.
- Which type of expression is called a common sub expression?
An occurrence of an expression E is called a common sub expression if E was previously computed, and the values of the variables in E have not changed since the previous computation.
- What is meant by constant folding? [AU May June 2013]
The substitution of
values for names whose values are constant is known as constant folding. (or)
Deducing at compile time that the value of an expression is a constant and
using the constant instead is known as constant folding.
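A small sketch of constant folding over an expression tree follows. The tuple-based AST and the function name `fold` are assumptions for illustration only.

```python
# Constant-folding sketch: evaluate operator applications whose operands
# are compile-time constants; leave everything else alone.

def fold(node):
    if not isinstance(node, tuple):
        return node                       # leaf: a constant or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)  # fold the children first
    if isinstance(left, int) and isinstance(right, int):
        return {'+': left + right, '-': left - right, '*': left * right}[op]
    return (op, left, right)

# 2 * 3 + x  folds to  6 + x
print(fold(('+', ('*', 2, 3), 'x')))
```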
- What happens in code motion?
The transformation takes an expression that yields the same result independent of the number of times a loop is executed and places that expression outside the loop.
- What is an induction variable?
An induction variable is a variable that takes on a linear sequence of values each time around the loop. Some of these are used only to count iterations, and they often can be eliminated, thus reducing the time it takes to go around the loop.
- What is strength reduction?
The transformation of replacing an expensive operation, such as multiplication, by a cheaper one, such as addition, is known as strength reduction.
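The classic case is replacing a per-iteration multiplication like 4*i by an addition on a temporary kept in step with i. The hand-applied Python sketch below is illustrative (the function name and data are assumptions):

```python
# Strength-reduction sketch: the loop needs t = 4 * i each iteration.
# Instead of one multiply per iteration, keep t in step with i using
# one (cheaper) addition per iteration.

def sum_scaled(a):
    total, t = 0, 0
    for i in range(len(a)):
        total += a[i] * t      # t always equals 4 * i at this point
        t += 4                 # cheaper than recomputing 4 * i
    return total

print(sum_scaled([1, 1, 1]))   # 0*1 + 4*1 + 8*1 = 12
```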
- What are dominators?
A node in the flow
graph dominates another if every path to the latter must go through the former.
A proper dominator is a dominator other than the node itself. Each node except
the entry node has an immediate dominator – that one of its proper dominators
that is dominated by all the other proper dominators.
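Dominators can be computed with the standard iterative data-flow approach: dom(n) is {n} plus the intersection of dom(p) over all predecessors p. The sketch below is an assumption-laden illustration (the adjacency-dict encoding and function name are not from the syllabus text):

```python
# Iterative dominator computation sketch.

def dominators(succ, entry):
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            pred[s].add(n)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = set(nodes)
            for p in pred[n]:
                new &= dom[p]              # intersect over predecessors
            new |= {n}                     # a node always dominates itself
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# diamond: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4
d = dominators({1: [2, 3], 2: [4], 3: [4], 4: []}, entry=1)
print(d[4])    # only 1 and 4 dominate node 4
```

Node 4 is reachable via both 2 and 3, so neither dominates it; its immediate dominator is 1.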
- How are edges classified?
When we construct a
depth-first spanning tree, all the edges of the flow graph can be classified
into,
· Advancing edges – those that go from ancestor to proper descendant
· Retreating edges – those that go from descendant to ancestor
· Cross edges – all others
- What is meant by back edges?
A back edge is one
whose head dominates its tail. Every back edge is a retreating edge, regardless
of which depth-first spanning tree for its flow graph is chosen.
- What is reducible flow graph?
If every retreating
edge is a back edge, regardless of which depth-first spanning tree is chosen,
then the flow graph is said to be reducible. The majority of flow graphs are reducible; in particular, those whose only control-flow statements are the usual loop-forming and branching statements are certainly reducible.
- What is a natural loop? How can it be constructed? What are its properties?
A natural loop is a
set of nodes with a header node that dominates all the nodes in the set and has
at least one back edge entering that node. Given any back edge, we can
construct its natural loop by taking the head of the edge plus all nodes that
can reach the tail of the edge without going through the head. It has two
essential properties,
· It must have a single entry node, called the header. This entry node dominates all nodes in the loop, or it would not be the sole entry to the loop.
· There must be a back edge that enters the loop header. Otherwise, it is not possible for the flow of control to return to the header directly from the loop; (i.e.,) there really is no loop.
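The construction described above (head of the back edge plus all nodes that reach its tail without passing through the head) is a backward graph search. The Python sketch below is illustrative; the predecessor-dict encoding and function name are assumptions.

```python
# Natural-loop sketch: for a back edge n -> h, the loop is h plus all
# nodes that can reach n without going through h.

def natural_loop(pred, n, h):
    loop = {h, n}
    stack = [n]
    while stack:
        m = stack.pop()
        for p in pred.get(m, []):
            if p not in loop:          # h is already in loop, so the
                loop.add(p)            # search never walks through it
                stack.append(p)
    return loop

# flow graph: 1 -> 2, 2 -> 3, 3 -> 4, 4 -> 2  (back edge 4 -> 2)
pred = {2: [1, 4], 3: [2], 4: [3]}
print(sorted(natural_loop(pred, n=4, h=2)))   # [2, 3, 4]
```

Seeding the worklist with h already in the set is what keeps the search from escaping through the header, which is exactly why the header dominates every node collected.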
- What is Data-flow analysis?
It defines the
value at each point in the program. Statements of the program have associated
transfer functions that relate the value before the statement to the value
after. Statements with more than one
predecessor must have their value defined by combining the values at the
predecessors, using a meet operator.
- What does the data-flow analysis framework contain?
A data-flow
analysis framework (D,V,^,F) consists of
- A direction of the data flow D, which is either FORWARDS or BACKWARDS.
- A semi-lattice, which includes a domain of values V and a meet operator ^.
- A family F of transfer functions from V to V. This family must include functions suitable for the boundary conditions, which are constant transfer functions for the special nodes ENTRY and EXIT in any flow graph.
- What is Reaching Definitions in data flow framework?
It has values that
are sets of statements in the program that define values for one or more
variables. The transfer function for the block kills definitions of variables
that are definitely redefined in the block and adds in those definitions of
variables that occur within the block. The confluence operator is union, since
definitions reach a point if they reach any predecessor of that point.
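The description above corresponds to the standard equations OUT[B] = gen[B] ∪ (IN[B] − kill[B]) and IN[B] = ∪ OUT[P] over predecessors P. A minimal iterative sketch, with an assumed dict encoding of the flow graph and gen/kill sets:

```python
# Reaching-definitions sketch (forward direction, confluence = union).

def reaching_definitions(preds, gen, kill):
    blocks = list(gen)
    IN = {b: set() for b in blocks}
    OUT = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # IN[B] = union of OUT over all predecessors
            IN[b] = set().union(*(OUT[p] for p in preds[b]))
            new = gen[b] | (IN[b] - kill[b])   # transfer function
            if new != OUT[b]:
                OUT[b], changed = new, True
    return IN, OUT

# B1: d1 defines x;  B2: d2 redefines x;  edges B1 -> B2 and B2 -> B2
preds = {'B1': [], 'B2': ['B1', 'B2']}
gen   = {'B1': {'d1'}, 'B2': {'d2'}}
kill  = {'B1': {'d2'}, 'B2': {'d1'}}
IN, OUT = reaching_definitions(preds, gen, kill)
print(IN['B2'], OUT['B2'])
```

Both d1 and d2 reach the top of B2 (via the two predecessors), but only d2 leaves it, since B2 kills d1.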
- What is meant by Live Variables?
It computes the
variables that are live at each point. The framework is similar to reaching
definition, except the transfer function runs backward. A variable is live at
the beginning of the block if it is either used before the definition in the
block or is live at the end of the block and not redefined in the block.
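Since liveness runs backward, the equations become IN[B] = use[B] ∪ (OUT[B] − def[B]) and OUT[B] = ∪ IN[S] over successors S. A sketch mirroring the reaching-definitions shape, with an assumed encoding:

```python
# Live-variable sketch (backward direction, confluence = union).

def live_variables(succ, use, defs):
    blocks = list(use)
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # OUT[B] = union of IN over all successors
            OUT[b] = set().union(*(IN[s] for s in succ[b]))
            new = use[b] | (OUT[b] - defs[b])   # backward transfer function
            if new != IN[b]:
                IN[b], changed = new, True
    return IN, OUT

# B1: x = ...;  B2: ... = x;  edge B1 -> B2
succ = {'B1': ['B2'], 'B2': []}
use  = {'B1': set(), 'B2': {'x'}}
defs = {'B1': {'x'}, 'B2': set()}
IN, OUT = live_variables(succ, use, defs)
print(OUT['B1'], IN['B1'])
```

x is live at the end of B1 (it is used in B2), but not at the beginning of B1, since B1 defines it before any use.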
- What are available expressions?
To discover global common sub expressions, we determine the available expressions at each point, (i.e.,) expressions that have been computed and whose arguments have not been redefined since the last computation. The data-flow framework is similar to reaching definitions, but the confluence operator is intersection rather than union.
- Give an example to represent code motion and reduction in strength. [AUT Nov/Dec 2010]
Reduction in strength: replace expensive operations by equivalent cheaper ones.
x² → x * x
Code motion:
while (i <= n-2)
{
…
}
changed to
t = n-2;
while (i <= t)
{
…
}
- Identify the induction variables in the following segment of basic block and eliminate to optimize the code. [AUT Nov/Dec 2010]
B1: j=j-1
t4=4*j
t5=a[t4]
If t5 > j goto B1
Induction variable: j
After optimization (t4 is initialized to 4*j before the loop; since j decreases by 1 each iteration, t4 decreases by 4):
B1: t4 := t4 - 4
t5 := a[t4]
If t5 > 76 goto B1
- Define Peephole optimization. [AU Nov / Dec 2012]
Peephole optimization is a kind of optimization performed over a very
small set of instructions in a segment of generated code. The set is called a
"peephole" or a "window". It works by recognizing sets of
instructions that can be replaced by a shorter or faster set of instructions.
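A toy sketch of the sliding-window idea, for one classic pattern: a load that immediately follows a store to the same location is redundant, since the value is already in the register. The instruction-tuple encoding is an assumption for illustration.

```python
# Peephole sketch: drop a redundant LOAD that immediately follows a
# STORE of the same register to the same location.

def peephole(instrs):
    out = []
    for ins in instrs:
        if (out and ins[0] == 'LOAD' and out[-1][0] == 'STORE'
                and ins[1:] == out[-1][1:]):
            continue                     # value is already in the register
        out.append(ins)
    return out

prog = [('STORE', 'R0', 'a'), ('LOAD', 'R0', 'a'), ('ADD', 'R0', 'R1')]
print(peephole(prog))
```

Real peephole optimizers apply many such patterns repeatedly, since removing one instruction may expose further opportunities in the window.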
- What are the characteristics of peephole optimization?
Ø Redundant-instruction elimination
Ø Flow-of-control optimization
Ø Algebraic simplification
Ø Use of machine idioms
- Give an example for ‘removal of loop invariant’. [AU May / June 2013]
LOOP INVARIANT COMPUTATIONS:
A loop-invariant computation is one that performs the same computation every time the loop is executed. Elimination process: we first identify the invariant computations and then move them outside the loop without changing the meaning of the actual program.
Example
x := init
for ...
a := ...
x := x ∪ {a}
y := x ∩ z
od
In this example:
· z is a loop invariant
· x and y are induction variables
Therefore, we can replace the above code with the following one:
x := init
y := init ∩ z
for ...
a := ...
x := x ∪ {a}
if a ∈ z then
y := y ∪ {a}
od
16 mark questions
- Discuss in detail about peephole optimization. (or) Discuss the peephole optimization techniques with example.
- What are the principal sources of optimization? Write notes on peephole optimization.
- Explain various code optimization techniques in detail.
- Describe the principal sources of optimization.
- Optimize the following code using various optimization techniques.
i=1;s=0
for(i=1;i<=3;i++)
for(j=1;j<=3;j++)
c[i][j]=c[i][j]+a[i][j]+b[i][j].
- What are the various loops in flow graphs.
- The natural loop of a back edge n->h was defined to be h plus the set of nodes that can reach n without going through h. Show that h dominates all the nodes in the natural loop of n->h.
- Suppose V is the set of complex numbers. Which of the following operations can serve as the meet operation for a semilattice on V?
a) Addition:
(a+ib)^(c+id) = (a+c)+i(b+d).
b) Multiplication:
(a+ib)^(c+id) = (ac-bd)+i(ad+bc).
UNIVERSITY
QUESTION PAPERS
B.E/B.Tech. DEGREE EXAMINATION,
NOVEMBER/DECEMBER 2013.
Seventh Semester
Computer Science and Engineering
080230045 - PRINCIPLES OF COMPILER
DESIGN
(Regulation 2008)
Answer ALL questions
PART- A- (10*2=20 marks)
1. What is difference
between compiler and assembler?
2. What are cousins of
the compiler?
3. What is the role of
Lexical analyser?
4. Draw the NFA for
regular expression ab(a/b)*ab*.
5. Define context free
grammar. Give an example.
6. What are role of
parser?
7. What are intermediate
languages?
8. Give some forms of
intermediate languages.
9. What is DAG?
10.
What
is meant by Code Optimization?
PART- B- (5*16=80 marks)
11.
(a).What are six phases of
compilers? Explain with an example.
Or
(b)What are compiler construction tools?
Describe about their features.
12. (a)Draw the DFA for the regular expression (a/b)*abb and
find minimized DFA.
Or
(b).(i). Write
in detail about tool for generating a Lexical Analyser. (8)
(ii).Give a
lexical analyser specification for a model compiler. (8)
13. (a)Design
a predictive parsing table for the grammar
E-> E+T/T
T-> T*F/F
F->(E)/id
and show the moves for input string
id*id+id.
(b)
Design an operator precedence parser for the grammar
E->
E+T/T
T-> T*F/F
F->(E)/id
and
draw the precedence function graph.
14.
(a)Detail
the translation scheme for flow control statements and procedure call
Or
(b).Discuss about back patching of
Boolean expression.
15. (a) What are the principal sources of optimization?
Write notes on peephole optimization.
Or
(b)Describe a simple Code Generator. Explain how it
generates code by an example.
B.E/B.Tech DEGREE EXAMINATION,
MAY/JUNE 2012
Sixth semester
Computer science and Engineering
CS2352/Cs62/10144 Cs602-PRINCIPLES OF
COMPILER DESIGN
(Regulation 2008)
Answer ALL questions
PART- A
1. Mention
few cousins of compiler.
2. What are
the possible error recovery actions in lexical analyzer?
3. Define an ambiguous grammar.
4. What is dangling reference?
5. Why are quadruples preferred over triples in an optimizing compiler?
6. List out the motivations for back patching.
7. Define flow graph.
8. How to perform register assignment for outer loops?
9. What is the use of algebraic identities in optimization of basic blocks?
10. List out two properties of reducible flow graph?
PART B
11. (a) (i)
What are the various phases of the compiler? Explain each phase in detail.
(ii) Briefly explain the compiler construction tools.
OR
(b) (i) What are the issues in lexical analysis?
(ii)Elaborate in detail the recognition of tokens.
12. (a) (i) Construct the predictive parser for the following grammar.
S->(L)/a
L->L,S/S
(ii)Describe the conflicts that may occur during shift reduce parsing.
OR
(b) (i)Explain the detail about the specification of a simple type checker.
(ii)How to subdivide a run-time memory into code and data areas. Explain
13. (a) (i) Describe the various types of three address statements.
(ii)How names can be looked up in the symbol table? Discuss.
OR
(b) (i)Discuss the different methods for translating Boolean expressions in detail.
(ii) Explain the following grammar for a simple procedure call statement S->call id (enlist).
14. (a) (i) Explain in detail about the various issues in design of code generator.
(ii)Write an algorithm to partition a sequence of three address statements into basic blocks.
OR
(b) (i)Explain the code-generation algorithm in detail.
(ii)Construct the dag for the following basic block.
d: =b*c
e: = a +b
b: =b*c
a: =e-d
15. (a) (i) Explain the principal sources of optimization in detail.
(ii)Discuss the various peephole optimization techniques in detail.
OR
(b) (i) How to trace data-flow analysis of structured program?
(ii) Explain the common sub expression elimination, copy propagation, and
transformation for moving loop invariant computations in detail.
B.E/B.Tech DEGREE EXAMINATION,
NOVEMBER/DECEMBER 2011.
Sixth Semester
Computer Science and Engineering
CS 2352 - PRINCIPLES OF COMPILER
DESIGN
(Regulation 2008)
Maximum : 100 marks
Answer ALL questions.
PART A
1. What is the role of
lexical analyzer?
2.
Give the transition diagram for an
identifier.
3.
Define handle pruning.
4.
Mention the two rules for type Checking.
5.
Construct the syntax tree for the following
assignment statement:
6.
What are the types of three address
statements?
7.
Define basic blocks and flow graphs.
8.
What is DAG?
9.
List out the Criteria for Code improving
transformations.
10. When
does dangling reference Occur?
PART B
11. (a).(i) Describe the various phases of a compiler and trace it with the program segment (position := initial + rate * 60). (10)
(ii) State the Compiler Construction tools. Explain them. (6)
Or
(b). (i) Explain briefly about
input buffering in reading the source program for finding the tokens. (8)
(ii)
Construct the minimized DFA for the regular expression (0+1)*(0+1) (8)
12. (a).Construct
a Canonical parsing table for the grammar given below.
Also
explain the algorithm used. (16)
E->E+T E->T T->T*F T->F
F ->(E) F ->id
Or
(b).What
are the different Storage allocation Strategies? Explain. (16)
13.(a).(i)
Write down the translation Scheme to generate Code for assignment statement.
Use the Scheme for generating three address Code for the assignment statement
g := a+b-c*d (8)
(ii)
Describe the various methods of implementing three-address
statements. (8)
Or
(b).(i) How
Can Back patching be used to generate code for Boolean expressions and flow of
Control statements? (10)
(ii) Write
a short note on procedures Calls. (6)
14.(a).(i)
Discuss the issues in the design of Code generator. (10)
(ii).
Explain the Structure-preserving transformations for basic blocks. (6)
Or
(b).(i)
Explain in detail about the simple code generator. (8)
(ii) Discuss
briefly about the Peephole Optimization.
(8)
15.(a).Describe
in detail the principal sources Of Optimization. (16)
Or
(b).(i)
Explain in detail Optimization Of basic blocks with example. (8)
(ii) Write
about Data flow analysis Of structural programs. (8)
B.E.,B.Tech. DEGREE EXAMINATION,
APRIL/MAY 2011
Sixth Semester
Computer Science and Engineering
CS 2352 _ PRINCIPLES OF COMPILER
DESIGN
(Regulation 2008)
Time :
Three hours Maximum : 100 marks
Answer ALL questions
PART A
1. What is an interpreter?
2. Define
token and lexeme.
3. What is
handle pruning?
4. What are
the limitations of static allocation?
5. List out
the benefits of using machine-independent intermediate forms.
6. What is
a syntax tree? Draw the syntax tree for the following statement:
7. List out the primary structure preserving transformations on basic block.
8. What is
the purpose of next-use information?
9. Define
dead-code elimination.
10. What is
loop Optimization?
PART B
11.a.(i). Describe the various phases of a compiler and trace the program segment a := b + c * 4 for all phases. (10)
(ii) Explain
in detail about compiler construction tools. (6)
Or
b.(i)
Discuss the role of lexical analyzer in detail. (8)
(ii)Draw
the transition diagram for relational operators and unsigned numbers in Pascal.
(8)
12.a.(i)
Explain the error recovery strategies in syntax analysis. (6)
(ii) Construct
a SLR construction table for the following grammar.
E->E+T
E->T
T->T*F
T->F
F ->(E)
F
->id
Or
b.(i)Distinguish
between the source text of a procedure and its activation at run time. (8)
(ii)
Discuss the various storage allocation strategies in detail. (8)
13.a.(i).Define
three-address Code. Describe the various methods of implementing three-address statements with an
example. (8)
(ii) Give the translation scheme for converting the assignments into three address code.
(8)
Or
b.(i) Discuss
the various methods for translating Boolean expression. (8)
(ii)Explain
the process of generating the code for a Boolean expression in a single pass
using back patching. (8)
14.a.(i)Write
in detail about the issues in the design of a code generator (10)
(ii) Define
basic block. Write an algorithm to partition a sequence of three-address
statements into basic blocks. (6)
Or
b.(i)How to
generate a code for a basic block from its dag representation? Explain. (6)
(ii)Briefly
explain about simple code generator. (10)
15.a.(i)Write
in detail about function-preserving transformations. (8)
(ii)
Discuss briefly about Peephole optimization. (8)
Or
b.(i)Write
an algorithm to construct the natural loop of a back edge. (6)
(ii)
Explain in detail about code-improving transformations. (10)