Software Construction Project: DSL Compiler Design and Implementation

Verified

Added on  2023/06/10

|13
|1794
|407
Project
AI Summary
This project focuses on designing and implementing a compiler for a domain-specific language (DSL). The solution covers various phases of compiler design, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and target code generation. It defines the grammar for the DSL using context-free grammar, and illustrates the creation of Non-deterministic Finite Automata (NDFA) and Deterministic Finite Automata (DFA) based on the production rules. The project also includes code snippets using Lex and YACC tools for grammar verification, demonstrating the practical aspects of compiler construction. The symbol table manager and error handler modules are also mentioned. Desklib provides access to a wealth of study resources, including past papers and solved assignments, to aid students in mastering compiler design and related topics.
Document Page
COMPUTER SCIENCE
By Name
Course
Instructor
Institution
Location
Date
tabler-icon-diamond-filled.svg

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
[Q3]
(a)
Block diagram
Document Page
The diagram above shows the block diagram showing the scope of the compiler design and the
various phases stepwise the design shall use to fully develop the compiler for the domain specific
language. The scope of this task is to come up with the compiler that shall convert the pure high-
level language from the pre-processor into the assembly language. The interest is to look deep in the
compiler phase of the program execution and come up with the necessary phases that shall be useful
in coming up with the compiler (Fischer, Cytron and LeBlanc, 2009). The individual scope as per
the diagram is explained below,
Lexical Analysis
The scope of this phase is to take the program and read it and then convert it into tokens that are
also called streams of tokens. All the white spaces shall be removed at this stage and any comments.
This shall be made possible using the pattern set up for the compiler to identify tokens, operators,
and identifiers. Once the streams of tokens are output, this shall act as the input for the next scope
(Mogensen, 2017).
Syntax analysis
This stage also called the parse. The syntax analyser shall take the tokens given by the lexical
analysis phase and convert it into a parse tree The syntax analyse shall take all the tokens, one by
one ad it is also going to take a grammar generally called context-free grammar with the help of the
production rules, the input is going to be checked whether it is given according to the production
rules defined in the grammar. Thd scope of this stage is to enable debugging of syntax errors parse
tree formed shall be given as the input for the next phase (Grune et al., 2012).
Semantic analysis
The semantic analysis phase is going to verify whether the parse tree it has received has some
logical meaning or not which is called a parse tree which can be verified semantically. Once the
parse tree is verified semantically. All the type checking shall be done at this stage it will be input to
an intermediate code generator phase as input.
Intermediate code generator
The scope here is to generate the three address code. There is various three address code which shall
be used in the design, The output of the three address code shall input to the code optimizer.
Code optimizer
Document Page
The scope of the code optimizer is to minimize the size of the program or to minimize the number
of lines. The output shall be given to the target code generator.
Target code generator
The scope of this phase is to actually generate the assembly code for the domain specific language.
This phase scope is to write the code that the assembler can understand
During all the phases are going to take support from a module known as the symbol table manager.
Moreover, all the phases shall talk with the module called the error handler
(b)
This sections describes the grammer for the domain specific language. The grammer shown in this
sections shall be during implementation as it will act as reference. Context-free grammar has
application in programming languages and compiler design. Informally the grammar used for the
compiler shall contain four components namely;
1) Terminal Symbols
This is similar to the alphabetical symbols we consider when we talk about the finite
automata. The set of the terminal symbols shall be the alphabet of the grammar.
2) Variables
This represents a finite set of symbols with each representing a specific language
3) Start symbol
This represents the language we first start with
4) production
These are rules that we would use to describe the language
To better define the grammar, the following was defined systematically; Expression, Operators,
Statements, Functions, Classes and the various punctuation to be used
a) Expression
The following defines the expression part of the grammer for the domain specific language.
Expression:
Exp1 [AssignOperator Exp1]
AssignOperator
=
+=
-=
tabler-icon-diamond-filled.svg

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
*=
/=
&=
!=
^=
%=
<<=
>>=
>>>=
Exp1:
Exp2 [Exp1Rest]
Exp1Rest
?Exp : Exp1
Exp2:
Exp3 [Exp2Rest]
Exp2Rest:
{ Infixop Exp3 }
instanceof Type
b) Operators
The operators will act on the operands and hence very important to be defined for our grammer. The
following includes a set of operators that shall be used by the compiler.
AssignOperators
=
+=
-=
*=
/=
&=
|=
^=
%=
<<=
>>=
>>>=
Document Page
LogicOperators
| |
&&
==
!=
<
>
<=
>=
<<
>>
c) Statements
This may be made up of expression as its building block as they represent particular action which
the domain specific language shall run. The following grammer is defined for the statements.
Bloc:
{ BlocStatement }
BlocStatement:
{ BlocStatement }
BlocStatement:
LocalValDeclareStatement
ClassDeclaration
[ id :] Statement
LocalValDeclareStatement:
{ VarModifier } DataType VarDeclarator;
Statement:
Bloc
;
Id : Statement
StatementExp ;
if ParExp [: Exp ] ;
assert Exp [: Exp ] ;
switch ParExp { SwitchBlocStatementGrps }
while ParEx[ Statement
do Statement while ParExp;
for ( ForCtrl ) Statement
Document Page
break [ id ] ;
contnue [ id ] ;
return [ Exp ] ;
throw Exp ;
try Bloc (Catches | [ Catches ] [ FinallyBloc ]
StatementExp :
Exp
Catches :
CatchCtrl { CatchCtrl }
CatchCtrl :
catch ( {varModifier} ) CatchType Id ) Bloc
CatchType :
QualifiedId { | QualifiedId}
FinallyBloc :
FinallyBloc Bloc
SwitchBlocStatementGrps :
{ SwitchBlocStatementGrps }
SwitchBlocStatementGrps :
SwictLbl BlocStatements
SwictLbl :
SwictLbl { SwictLbl }
SwictLbl :
case Exp :
case EnumConstName :
default :
EnumConstName :
Id
ForCtrl
ForCtrl
ForInt ; [ Exp ] ; [ForUpdate]
ForCtrl :
{ VarModifier } ; DataType VarDeclaratorId ForVarCtrlRest
ForVarCtrlRest :
[ = VarInit ] { , VarDeclarator }
ForInit :
tabler-icon-diamond-filled.svg

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
ForUpdate :
StatementExp { , StatementExp}
d) Function AND classes
Function are the block of a program that does a specific task and return a value to the calling
program. The domain specific language shall use functions as defined by the grammer below,
Classbdy :
{ { ClassbdyDeclare } }
ClassbdyDeclare :
;
{ Mod } MemberDeclare
[ Static ] Bloc
MemberDeclare :
MethodOrFieldDeclare
void Id VoidMethodDeclaratorRest
ClassDeclare
MethodOrFieldDeclare :
Datatype Id MethodOrFieldRest
(c)
NDFA
The productions rules were used to create the nondeterministic finite automata and the deterministic
finite automata. The nonterminal strings from the set of production rule were considered as a virtual
terminal string and were denoted by a single node. The NDFAs and their subsequent DFAs are
shown from the bottom to the top hence each non-terminal string present in the production rule are
defined using the NDFA and NFA (Kohavi and Jha, 2009).
Logic '<' | '>' | '>=' | '<=' | '!=' | '=='
NDFA
if Ex1 then if Ex2 then Stmt1 else Stmt2
if Ex1 then {if Ex2 then Stmt1} else Stmt2
if Ex1 then {if Ex2 then Stmt1 else Stmt2}
Document Page
if ( Exp ) Stmt else Stmt
(d)
Document Page
tabler-icon-diamond-filled.svg

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser
Document Page
[Q4]
YEC
#include <stdio.h>
#include "myscanner.h"
ext int yylex();
ext int yylineno;
ext char* yytext;
char *names[] = {NULL, "dbType", "dbName", "dbTablePrefix", "dbPort"};
int main(void)
{
int n_token, v_token;
n_token = yylex();
while(n_token) {
printf("%d\n", n_token);
if(yylex() != COLON) {
printf("Syntax error alert on line %d, Was Expecting a ':' but found %s \n",
yylineno, yytext);
return 1;
}
v_token = yylex();
switch (n_token) {
case TYPE:
case NAME:
case TABLE_PREFIX:
if(v_token != IDENTIFIER) {
printf(" Syntax error alert on line %d,Expecting an identifier but
found %s Instead\n", yylineno, yytext);
Document Page
return 1;
}
printf("%s is set to %s\n", names[ntoken], yytext);
break;
case PORT:
if(v_token != INTEGER) {
printf(" Syntax error alert on line %d, Was Expecting an integer but
found %s Instead\n", yylineno, yytext);
return 1;
}
printf("%s is set to %s\n", names[ntoken], yytext);
break;
default:
printf(" Syntax error alert on %d\n",yylineno);
}
n_token = yylex();
}
return 0;
}
LEC
%{
#include <stdio.h>
#include "y.tab.h"
int c;
extern int yylval;
%}
%%
" " ;
[a-z] {
c = yytext[0];
yylval = c - 'a';
return(LETTER);
}
[0-9] {
c = yytext[0];
yylval = c - '0';
return(DIGIT);
}
[^a-z0-9\b] {
c = yytext[0];
return(c);
}
chevron_up_icon
1 out of 13
circle_padding
hide_on_mobile
zoom_out_icon
[object Object]