Introduction to Compiler

               Prakash Khaire
Department of Computer Science and Technology
                   UTU
Introduction to Compiler
•   Compilers are basically “language translators”.
     – Language translators convert text from one language into
        another, making sure that the translated version conforms to
        the grammar and style rules of the target language.
•   A compiler is a program that takes a program written in one language
    (the source program) as input and converts it into an equivalent
    program in another language (the target program).




            Source program  →  Compiler  →  Target program
Introduction to Compiler
•   If errors are encountered during translation, the compiler reports
    them as error messages.
•   The compiler takes a source program written in a high-level language
    such as C, Pascal or FORTRAN and converts it into a low-level
    language or machine language.
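Process of Compiling
•   Stream of characters → Scanner → stream of tokens
•   Stream of tokens → Parser → parse/syntax tree
•   Parse/syntax tree → Semantic analyzer → annotated tree
•   Annotated tree → Intermediate code generator → intermediate code
•   Intermediate code → Code optimization → intermediate code
•   Intermediate code → Code generator → target code
•   Target code → Code optimization → target code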
Compiler : Analysis-Synthesis Model
• Compilation can be done in two parts
  – Analysis
  – Synthesis
• In the analysis part, the source program is read
  and broken down into its constituent pieces.
  – (The syntax and the meaning of the source string are
    determined, and then an intermediate code is
    created from the input source program.)
• In the synthesis part, this intermediate form of the
  source program is taken and converted into an
  equivalent target program.
Compiler : Analysis-Synthesis Model
Analysis Part
•   The analysis part is carried out in three sub-parts
     – Lexical Analysis
         • The source program is read and broken
           into a stream of strings called tokens
           (a token is a group of characters that has a meaning as a unit).
     – Syntax Analysis
         • The tokens are arranged in a hierarchical
           structure that reveals the syntactic structure of the
           source string.
     – Semantic Analysis
         • The meaning of the source string is determined.
Properties of Compiler
• It must be bug free
• It must generate correct machine code.
• The generated machine code must run fast.
• The compiler itself must run fast (compilation
  time must be proportional to program size)
• The compiler must be portable (i.e., modular,
  supporting separate compilation)
• It must give good diagnostics and error
  messages.
• The generated code must work well with
  existing debuggers.
Phases of Compiler – Lexical Analysis
•   It is also called scanning.
•   It breaks the complete source code into tokens
     – For example : total = count + rate * 10
•   In the lexical analysis phase, this statement is broken up into a
    series of tokens as follows:
     –   The identifier total
     –   The assignment symbol
     –   The identifier count
     –   The plus sign
     –   The identifier rate
     –   The multiplication sign
     –   The constant number 10
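The token list above can be reproduced with a small scanner. The following is only a minimal sketch in Python, not the scanner of any particular compiler: the token category names (`IDENTIFIER`, `ASSIGN`, and so on) and the function name `tokenize` are assumptions made for illustration.

```python
import re

# Token categories for this toy example; the names are illustrative assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize("total = count + rate * 10"):
    print(kind, text)
```

Running it prints seven tokens in the same order as the list above, from the identifier total to the constant 10.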
Phases of Compiler – Syntax Analysis
•   It is also called parsing.
•   In this phase, the tokens generated by the lexical analyzer are grouped
    together to form a hierarchical structure.
•   It determines the structure of the source string by grouping the
    tokens together.
•   The hierarchical structure generated in this phase is called a parse
    tree or syntax tree.
Parse tree

             =
            / \
       total   +
              / \
         count   *
                / \
            rate   10
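One way to build this tree is a recursive-descent parser. The sketch below is a hedged illustration in Python, assuming a tiny grammar in which `*` binds tighter than `+` and the tokens are already separated by spaces; the function name `parse_assignment` and the nested-tuple representation of the tree are assumptions, not part of the slides.

```python
def parse_assignment(tokens):
    """Parse `name = expr` into a nested tuple (operator, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # A factor is a single identifier or constant.
        token = peek()
        if token is None or token in ("=", "+", "*"):
            raise SyntaxError(f"expected an operand, found {token!r}")
        return advance()

    def term():
        # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if peek() != "=":
        raise SyntaxError("expected '='")
    advance()
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("total = count + rate * 10".split())
print(tree)  # ('=', 'total', ('+', 'count', ('*', 'rate', '10')))
```

The nesting of the tuples mirrors the tree: the multiplication sits below the addition because `term` is called from inside `expr`.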
Phases of Compiler – Semantic Analysis
•   It determines the meaning of the source string
•   For example, it checks the matching of parentheses in an expression
    and of if…else statements, checks that arithmetic operations are
    performed on type-compatible operands, and checks the scope of
    operations
Intermediate Representation
•   Most compilers translate the source code into some form of
    intermediate code
•   Intermediate code is later converted into machine code
•   Intermediate code forms include three-address code, quadruples,
    triples and postfix notation
Example : Intermediate Code Generation
•   T1 = inttofloat(10)
•   T2 = rate * T1
•   T3 = count + T2
•   total = T3
(Figure: syntax tree of total = count + rate * 10)
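The temporaries in this example can be produced by walking the syntax tree from the leaves upward. The sketch below is one possible way to do it in Python, under two assumptions that are not stated in the slides: every identifier is a float, so each integer constant is converted first, and the right operand is visited before the left one so that the temporaries are numbered as in the example. The name `generate_tac` and the tuple tree format are hypothetical.

```python
def generate_tac(tree):
    """Return three-address instructions for ('=', target, expr) as strings."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"T{counter}"

    def emit(node):
        # Leaves are identifiers (str) or integer constants (int).
        if isinstance(node, int):
            temp = new_temp()
            code.append(f"{temp} = inttofloat({node})")
            return temp
        if isinstance(node, str):
            return node
        op, left, right = node
        # Assumption: visit the right operand first to match the numbering above.
        right_place = emit(right)
        left_place = emit(left)
        temp = new_temp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    _, target, expr = tree
    result = emit(expr)
    code.append(f"{target} = {result}")
    return code


tree = ("=", "total", ("+", "count", ("*", "rate", 10)))
for line in generate_tac(tree):
    print(line)
```

For the tree of total = count + rate * 10 this prints four instructions, ending with total = T3.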
Code Optimization
•   It attempts to improve the intermediate code
•   The goal is faster-executing code or lower memory consumption
•   Machine-independent code optimization
•   Machine-dependent code optimization
Code Generation
•   In this phase the target code is generated (machine code)
•   The intermediate code instructions are translated into sequence of
    machine instructions
    –   MOV   rate, R1
    –   MUL   #10.0, R1
    –   MOV   count, R2
    –   ADD   R2, R1
    –   MOV   R1, total
Symbol Table Management
• It stores and maintains the identifiers (variables)
  used in the program.
• It stores information about the attributes of each
  identifier (attributes : type, scope,
  information about the storage allocated to it)
• It also stores information about the
  subroutines (functions) used in the program
  –   Number of arguments
  –   Types of these arguments
  –   Method of passing these arguments (call by value or call by reference)
  –   Return type
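The attributes listed above map naturally onto a record per identifier. The following Python sketch is only an illustration under stated assumptions: the class names `Symbol` and `SymbolTable`, the field names, and the simple rule of falling back to the global scope are all hypothetical, and real compilers usually use nested scope tables.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str
    kind: str                 # "variable" or "function"
    type: str                 # variable type, or return type for a function
    scope: str = "global"
    size: int = 0             # bytes of storage allocated (0 for functions)
    params: list = field(default_factory=list)  # (type, passing method) pairs


class SymbolTable:
    def __init__(self):
        self._symbols = {}

    def declare(self, symbol):
        key = (symbol.scope, symbol.name)
        if key in self._symbols:
            raise KeyError(f"{symbol.name!r} already declared in scope {symbol.scope!r}")
        self._symbols[key] = symbol

    def lookup(self, name, scope="global"):
        # Try the given scope first, then fall back to the global scope.
        for candidate in (scope, "global"):
            symbol = self._symbols.get((candidate, name))
            if symbol is not None:
                return symbol
        raise KeyError(f"{name!r} is not declared")


table = SymbolTable()
table.declare(Symbol("rate", "variable", "float", size=4))
table.declare(Symbol("area", "function", "float",
                     params=[("float", "by value"), ("float", "by reference")]))
print(table.lookup("rate").type)         # float
print(len(table.lookup("area").params))  # 2
```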
Symbol Table Management
•   Various phases use the symbol table
     – Semantic analysis and intermediate code generation need
       to know the types of the identifiers that are used.
     – Code generation typically needs to know how much
       storage is allocated to each identifier.
Grouping of Phases
•   Front end: Lexical Analysis → Parser → Semantic Analysis
•   Back end: Code Optimizer → Code Generator
•   Input program → Front End → Intermediate Code → Back End → Output program
Compiler Development Approach
•   Initially, compilers were divided into multiple passes so that the
    compiler had to manage only one pass at a time.
•   This approach was used because main memory was limited.
•   Nowadays a two-pass compiler design is used.
     – The front end translates the source code into an intermediate
        representation.
     – The back end works with the intermediate representation to
        produce the machine code.
Compiler Development Approach
•   In many cases optimizers and error checkers can be shared by both
    phases if they use the intermediate representation.
•   Certain languages can be compiled in a single pass
    because of a few rules of the language, such as
     – All variable declarations are placed at the beginning
     – A function is declared before it is used
Types of Compiler
• Native code compiler
   – A compiler designed to compile source code for the same
     type of platform that it runs on.
• Cross compiler
   – A compiler designed to compile source code for a different
     platform.
   – Such compilers are often used for developing embedded systems
• Source-to-source compiler or transcompiler
   – A compiler that takes high-level language source code as
     input and outputs source code in another high-level language.
   – For example, it may translate a program from Pascal to C. An
     automatic parallelizing compiler will frequently take in a
     high-level language program as input and then transform the
     code and annotate it with parallel code annotations
Types of Compiler
• One-pass compiler
  – A compiler that completes the whole compilation
    process in a single pass,
  – i.e., it traverses the whole source code only
    once.
• Threaded code compiler
  – A compiler that simply replaces a string
    (e.g., the name of a subroutine) with an appropriate binary
    code.
• Incremental compiler
  – A compiler that compiles only the changed lines
    of the source code and updates the object code
Types of Compiler
• Stage compiler
  – A compiler that converts the code into assembly
    code only.
• Just-in-time compiler
  – A compiler that converts the code into machine
    code after the program starts execution.
• Retargetable compiler
  – A compiler that can be easily modified to compile
    source code for different CPU architectures.
• Parallelizing compiler
  – A compiler capable of compiling code for a parallel
    computer architecture.
Language Specification
•   In a computer, all instructions are represented as strings.
     – Instructions may be in the form of numbers, names, pictures
       or sounds
•   Strings used in an organized manner form a language.
•   Every programming language can be described by a grammar.
•   A grammar tells us how to write a valid computer program
•   Program code is checked to see whether its string of statements is
    syntactically correct.
Language Specification
• To design a language, we first have to define an alphabet.
   – Alphabet : A finite non-empty set of symbols that
     are used to form words (strings)
   – Example : An alphabet might be a set like {a, b}.
      • The symbol “∑” denotes an alphabet
      • If ∑ = {a, b}, then we can create strings like a, ab, aab,
        abb, bba and so on; the null (empty) string is denoted by “ε”.
      • The length of a string X is denoted by |X|. Then |aba| =
        3, |a| = 1 and |ε| = 0.
   – The concatenation of X and Y is denoted by XY
   – The set of all strings over an alphabet “∑” is
     denoted by “∑*”
Language Specification
    – The set of nonempty strings over “∑” is denoted by
      “∑+”
     – Languages are sets, so standard set operations such as union,
       intersection and complementation apply to them
•   A language can be described by regular expressions or by
    grammars, which let us determine whether a given string belongs to
    the language or not.
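These definitions can be tried out directly. The sketch below, in Python, enumerates the strings over ∑ = {a, b} up to a chosen length (∑* itself is infinite, so only a finite part can be listed) and then treats two small languages as ordinary sets. The helper name `strings_up_to` and the two example languages are assumptions made for illustration.

```python
from itertools import product


def strings_up_to(alphabet, max_length):
    """Yield every string over `alphabet` of length 0..max_length, shortest first."""
    for length in range(max_length + 1):
        for letters in product(sorted(alphabet), repeat=length):
            yield "".join(letters)


sigma = {"a", "b"}
universe = list(strings_up_to(sigma, 2))
print(universe)                  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(len("aba"), len(""))       # 3 0   (string length; the empty string has length 0)

# Two example languages, restricted to strings of length at most 2.
starts_with_a = {w for w in universe if w.startswith("a")}
ends_with_b = {w for w in universe if w.endswith("b")}

print(sorted(starts_with_a | ends_with_b))   # union
print(sorted(starts_with_a & ends_with_b))   # intersection
print(sorted(set(universe) - starts_with_a)) # complement within the finite universe
```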
Regular Expressions
• A regular expression provides a concise and
  flexible means to "match" (specify and recognize)
  strings of text, such as particular characters,
  words, or patterns of characters.
• The concept of regular expressions was first
  popularized by utilities provided by Unix
  distributions, in particular the editor ed and the
  filter grep.
• A regular expression is written in a formal language
  that can be interpreted by a regular expression
  processor, which is a program that either serves as
  a parser generator or examines text and identifies
  parts that match the provided specification.
Regular Expressions
• Regular expressions are used by many text
  editors, utilities, and programming languages to
  search and manipulate text based on patterns.
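As a small illustration, the Python standard library module `re` can both recognize a whole string and search for matching parts of a longer text. The pattern below for identifiers (a letter or underscore followed by letters, digits or underscores) and the helper names are assumptions chosen for this sketch.

```python
import re

# Assumed identifier rule: a letter or underscore, then letters, digits or underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """Recognize: does the whole string match the pattern?"""
    return IDENTIFIER.fullmatch(text) is not None


def find_identifiers(text):
    """Search: return every part of the text that matches the pattern."""
    return IDENTIFIER.findall(text)


print(is_identifier("rate"))      # True
print(is_identifier("10rate"))    # False
print(find_identifiers("total = count + rate * 10"))  # ['total', 'count', 'rate']
```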
Finite Automata
• A finite-state machine (FSM) or finite-state
  automaton (plural: automata), or simply a state
  machine, is a mathematical model used to design
  computer programs and digital logic circuits.
• It is conceived as an abstract machine that can be in
  one of a finite number of states.
• The machine is in only one state at a time; the state it
  is in at any given time is called the current state.
• One of the states is designated as the “starting state”.
• One or more states are designated as “final states”.
Finite Automata
• It can change from one state to another when initiated
  by a triggering event or condition; this is called a
  transition.
• A particular FSM is defined by a list of the possible
  transition states from each current state, and the
  triggering condition for each transition.
• Finite-state machines can model a large number of
  problems, among which are electronic design
  automation, communication protocol design, parsing
  and other engineering applications.
Finite Automata
•   States are represented as circles
•   Transitions are represented by arrows
•   Each arrow is labeled with a character or a set of characters that
    causes the specified transition to occur.
•   The starting state has an arrow entering it that is not connected to
    anything else
Finite Automata
• Deterministic Finite Automata (DFA)
  – The machine can exist in only one state at any given
    time
• Non-deterministic Finite Automata (NFA)
  – The machine can exist in multiple states at the
    same time
Deterministic Finite Automata
•   A Deterministic Finite Automaton (DFA) consists of:
      Q ==> a finite set of states
      Σ ==> a finite set of input symbols (alphabet)
      q0 ==> a start state
      F ==> a set of final states
      δ ==> a transition function, which is a mapping Q x Σ
       ==> Q
      A DFA is defined by the 5-tuple (Q, Σ, q0, F, δ)
How to use a DFA?
•   Input: a word w in Σ*
     – Question: Is w accepted by the DFA?
     – Steps:
         • Start at the “start state” q0
         • For every input symbol in the sequence w,
           compute the next state from the current state, given the
           current input symbol in w and the transition function
         • If, after all symbols in w are consumed, the current state is
           one of the final states (F), then accept w; otherwise, reject
           w.
Regular Languages
•   Let L(A) be the language recognized by a DFA A.
     – Then L(A) is called a “regular language”.
Example #1
•   Build a DFA for the following language:
     – L = {w | w is a binary string that contains 01 as a substring}
     – Steps for building a DFA to recognize L:
         • Σ = {0,1}
         • Decide on the states: Q
         • Designate start state and final state(s)
         • δ: Decide on the transitions:
     – Final states == same as “accepting states”
     – Other states == same as “non-accepting states”
Regular expression: (0+1)*01(0+1)*

DFA for strings containing 01
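Since the figure is not reproduced here, the following Python sketch uses one possible three-state DFA for this language; the state names q0, q1 and q2 and their meaning are assumptions. q0 means no useful progress yet, q1 means the last symbol was 0, and q2 means the substring 01 has been seen. The loop follows the steps for using a DFA: start in the start state, apply the transition function once per symbol, and accept if the final state is in F.

```python
# Transition function as a dictionary: (state, symbol) -> next state.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
START = "q0"
FINAL = {"q2"}


def dfa_accepts(word):
    """Return True if the DFA ends in a final state after reading `word`."""
    state = START
    for symbol in word:
        if (state, symbol) not in DELTA:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = DELTA[(state, symbol)]
    return state in FINAL


for word in ["1101", "0011", "1110", ""]:
    print(repr(word), dfa_accepts(word))
```

The first two words contain 01 and are accepted; 1110 and the empty string are rejected.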
Non-deterministic Finite Automata (NFA)
•   A Non-deterministic Finite Automaton (NFA)
     – is of course “non-deterministic”,
     – which implies that the machine can exist in more than one state at
       the same time
     – Outgoing transitions can be non-deterministic
Non-deterministic Finite Automata (NFA)
•   A Non-deterministic Finite Automaton (NFA) consists of:
     – Q ==> a finite set of states
     – Σ ==> a finite set of input symbols (alphabet)
     – q0 ==> a start state
     – F ==> a set of final states
     – δ ==> a transition function, which is a mapping Q x Σ
       ==> subsets of Q
     – An NFA is also defined by the 5-tuple (Q, Σ, q0, F, δ)
How to use an NFA?
• Input: a word w in Σ*
• Question: Is w accepted by the NFA?
• Steps:
   – Start at the “start state” q0
   – For every input symbol in the sequence w,
     determine all the possible next states from the
     current states, given the current input symbol in w
     and the transition function
   – If, after all symbols in w are consumed, at least one
     of the current states is a final state, then accept w;
     otherwise, reject w.
Regular expression: (0+1)*01(0+1)*
NFA for strings containing 01
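The figure is not reproduced here, so the Python sketch below assumes one common NFA for this language: q0 loops on 0 and 1 and may also move to q1 on 0, q1 moves to q2 on 1, and q2 loops on 0 and 1. The state names are assumptions. The simulation keeps the set of all states the machine could be in, which is what "more than one state at the same time" means in practice.

```python
# Transition function: (state, symbol) -> set of possible next states.
# Missing entries mean "no transition", which an NFA allows.
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}
NFA_START = "q0"
NFA_FINAL = {"q2"}


def nfa_accepts(word):
    """Return True if at least one possible run of the NFA ends in a final state."""
    current = {NFA_START}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in NFA_DELTA.get((state, symbol), set())
        }
    return bool(current & NFA_FINAL)


for word in ["1101", "0011", "1110", ""]:
    print(repr(word), nfa_accepts(word))
```

Tracking the set of current states is the same idea that underlies converting an NFA to a DFA, where each reachable set becomes one DFA state.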
Differences: DFA vs. NFA
DFA
•   All transitions are deterministic
     – Each transition leads to exactly one state
•   For each state, a transition on every possible symbol (of the
    alphabet) should be defined
•   Accepts the input if the last state is in F
•   Sometimes harder to construct because of the number of states
•   Practical implementation is feasible
NFA
•   Transitions are non-deterministic
     – A transition could lead to a subset of states
•   For each state, not all symbols necessarily have to be defined in
    the transition function
•   Accepts the input if one of the last states is in F
•   Generally easier than a DFA to construct
•   Practical implementation has to be deterministic (so it needs
    conversion to a DFA)
Construct a DFA to accept a string containing a zero followed by a one.
Construct a DFA to accept a string containing two consecutive zeroes
followed by two consecutive ones.
Grammars
• A grammar for any natural language such as Hindi,
  Gujarati, English, etc. is a formal description of the
  correctness of any kind of simple, complex or
  compound sentence of that language.

• A grammar checks the syntactic correctness of a
  sentence.

• Similarly, a grammar for a programming language is a
  formal description of the syntax, form or construction,
  of programs and individual statements written in that
  programming language.
A formal grammar G is a 4-tuple
• G={N, T, P, S}
  – Where, N : Set of non-terminal symbols
  – T : Set of terminal symbols
  – P : Set of production rules, or simply productions
  – S : Start symbol
• Terminal
  – Terminal symbols are literal characters that can appear in the inputs
    to or outputs from the production rules of a formal grammar and that
    cannot be broken down into "smaller" units. To be precise, terminal
    symbols cannot be changed using the rules of the grammar.
• Non-terminal
  – Nonterminal symbols, are the symbols which can be replaced; thus
    there are strings composed of some combination of terminal and
    nonterminal symbols.
Grammar
• Subject : The subject is the person, place, or thing
  that acts, is acted on, or is described in the
  sentence.
  •   Simple subject - a noun or a pronoun (e.g., she, he, cat, city)
  •   Complete subject - a noun or a pronoun plus any modifiers
       (e.g., the black cat, the clouds in the sky)


• Adjectives : They are words that describe nouns or
  pronouns. They may come before the word they
  describe (That is a cute puppy.) or they may follow
  the word they describe (That puppy is cute.).
Grammar
• Predicate : The predicate usually follows the subject and
  tells what the subject does, has, or is, what is done to
  it, or where it is. It is the action or description that
  occurs in the sentence.
• Noun : A noun is a word used to refer to people,
  animals, objects, substances, states, events and
  feelings.
• Article : English has two types of articles: definite (the)
  and indefinite (a, an). The use of these articles
  depends mainly on whether you are referring to any
  member of a group, or to a specific member of a group.
Grammar
• Verbs : Verbs are a class of words used to show the
  performance of an action (do, throw, run), existence
  (be), possession (have), or state (know, love) of a
  subject.
• Direct Object : A direct object is a noun or pronoun
  that receives the action of a "transitive verb" in an
  active sentence or shows the result of the action. It
  answers the question "What?" or "Whom?" after an
  action verb.

•   Consider the English sentence below
     – The small CD contains a large information.
Grammar
•   Subject
     – Article : the
     – Adjective : small
     – Noun : CD
•   Predicate
     – Verb : contains
     – Direct object : a large information
•   A direct object
     – Article : a
     – Adjective : large
     – Noun : information
Grammar
•   The small CD contains a large information.
     1.    <sentence> : <subject><predicate>
     2.    <subject> : <article><adjective><noun>
     3.    <predicate> : <verb><direct-object>
     4.    <direct-object> : <article><adjective><noun>
     5.    <article> : The | a
     6.    <adjective> : small | large
     7.    <noun> : CD | information
     8.    <verb> : contains
Generating a string in language
•   <sentence>
•   <subject><predicate>
•   <article><adjective><noun><verb><direct-object>
•   <article><adjective><noun><verb><article><adjective><noun>
•   The small CD contains a large information
Grammar
•   N = {sentence, subject, predicate, article, adjective, noun, verb,
    direct-object}
•   T = {The, a, small, large, CD, information, contains}
•   S = sentence
•   P={           <sentence> : <subject><predicate>
                  <subject> : <article><adjective><noun>
                  <predicate> : <verb><direct-object>
                  <direct-object> : <article><adjective><noun>
                  <article> : The | a
                  <adjective> : small | large
                  <noun> : CD | information
                  <verb> : contains
         }
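The four components N, T, P and S can be written down as data, and the productions can then be applied mechanically to generate every sentence of the language. The following Python sketch is one way to do this; the dictionary layout and the function name `expand` are assumptions, and the approach only terminates because this grammar has no recursive productions.

```python
from itertools import product

# P: each nonterminal maps to its alternatives; each alternative is a list of symbols.
PRODUCTIONS = {
    "<sentence>": [["<subject>", "<predicate>"]],
    "<subject>": [["<article>", "<adjective>", "<noun>"]],
    "<predicate>": [["<verb>", "<direct-object>"]],
    "<direct-object>": [["<article>", "<adjective>", "<noun>"]],
    "<article>": [["The"], ["a"]],
    "<adjective>": [["small"], ["large"]],
    "<noun>": [["CD"], ["information"]],
    "<verb>": [["contains"]],
}
START_SYMBOL = "<sentence>"  # S


def expand(symbol):
    """Return every terminal string (as a list of words) derivable from `symbol`."""
    if symbol not in PRODUCTIONS:      # a terminal cannot be rewritten
        return [[symbol]]
    results = []
    for alternative in PRODUCTIONS[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results


sentences = {" ".join(words) for words in expand(START_SYMBOL)}
print(len(sentences))  # 64
print("The small CD contains a large information" in sentences)  # True
```

The count is 64 because the subject and the direct object each have 2 x 2 x 2 = 8 possible forms and there is a single verb.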
The C Language Grammar
                             (abbreviated)
•   Terminals:
     – if do while for switch break continue typedef struct return main
       int long char float double void static ; ( ) a b c A B C 0 1 2 + * - / _ #
       include += ++ ...
•   Nonterminals:
     – <statement> <expression> <C source file> <identifier> <digit>
       <nondigit> <selection-statement>
       <loop-statement>
•   Start symbol: <C source file>
•   A string: #include <stdio.h>
              int main(void)
              {
                  printf("Hello World!\n");
                  return 0;
              }
Hierarchy of Grammars
• Grammars can be divided into four classes by increasing
  the restrictions on the form of the productions.
• This hierarchy is also known as the Chomsky hierarchy (Chomsky, 1963)
• It consists of four classes
   – Type 0 : formal or unrestricted grammar
   – Type 1 : context-sensitive grammar
   – Type 2 : context-free grammar
   – Type 3 : right linear or regular grammar
Type 0 Grammars
•   These grammars, known as phrase structure grammars, contain
    productions of the form

                α ::= β
    where both α and β can be strings of terminal and nonterminal symbols
Type-1 Grammar
•   These grammars are known as context-sensitive grammars
•   Derivation or reduction of strings can take place only in
    specific contexts
•   αAβ ::= α∏β
     – The string ∏ in a sentential form can be replaced by ‘A’ (or
       vice versa) only when it is enclosed by the strings α and β.
Type-2 Grammar
•   These grammars are known as context-free grammars

    – A ::= ∏
Type-3 grammar
•   These grammars are also known as linear grammars or regular
    grammars

 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxAmita Gupta
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 

Recently uploaded (20)

Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 

Introduction to compiler

  • 1. Introduction to Compiler Prakash Khaire Department of Computer Science and Technology UTU
  • 2. Introduction to Compiler • Compilers are basically “language translators”. – Language translators convert text from one language into another, making sure that the translated version conforms to the grammar and style rules of the target language. • A compiler is a program that takes a program in one language (the source program) as input and converts it into an equivalent program in another language (the target program). • Diagram: source program → compiler → target program
  • 3. Introduction to Compiler • If errors are encountered during translation, the compiler reports them as error messages. • The compiler takes a source program written in a high-level language such as C, Pascal or FORTRAN and converts it into a low-level language or machine language.
  • 4. Process of Compiling • Stream of characters → scanner → stream of tokens → parser → parse/syntax tree → semantic analyzer → annotated tree → intermediate code generator → intermediate code → code optimization → intermediate code → code generator → target code → code optimization → target code (figure from Chapter 1, 2301373: Introduction)
  • 5. Compiler: Analysis-Synthesis Model • Compilation can be done in two parts – Analysis – Synthesis • In the analysis part – The source program is read and broken down into its constituent pieces. – (The syntax and the meaning of the source string are determined, and an intermediate code is then created from the input source program.) • In the synthesis part – This intermediate form of the source program is taken and converted into an equivalent target program.
  • 6. Compiler: Analysis-Synthesis Model
  • 7. Analysis Part • The analysis part is carried out in three sub-parts – Lexical Analysis • In this part the source program is read and broken into a stream of strings. Such strings are called tokens (a token is a sequence of characters that has a collective meaning). – Syntax Analysis • In this step the tokens are arranged in a hierarchical structure that ultimately helps in finding the syntax of the source string. – Semantic Analysis • In this step the meaning of the source string is determined.
  • 8. Properties of Compiler • It must be bug free. • It must generate correct machine code. • The generated machine code must run fast. • The compiler itself must run fast (compilation time must be proportional to program size). • The compiler must be portable (i.e., modular, supporting separate compilation). • It must give good diagnostics and error messages. • The generated code must work well with existing debuggers.
  • 9. Phases of Compiler – Lexical Analysis • It is also called scanning. • It breaks the complete source code into tokens – For example: total = count + rate * 10 • In the lexical analysis phase this statement is broken up into a series of tokens as follows: – The identifier total – The assignment symbol – The identifier count – The plus sign – The identifier rate – The multiplication sign – The constant number 10
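The token list above can be reproduced with a small scanner. The following is a minimal sketch only: the token names (`IDENT`, `ASSIGN`, and so on) and the `tokenize` function are hypothetical choices for this illustration, and the scanner recognizes just the symbols used in the example statement.

```python
import re

# Hypothetical token categories for this sketch; a real scanner has many more.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # constant numbers such as 10
    ("IDENT",  r"[A-Za-z_]\w*"),    # identifiers such as total, count, rate
    ("ASSIGN", r"="),               # the assignment symbol
    ("PLUS",   r"\+"),              # the plus sign
    ("TIMES",  r"\*"),              # the multiplication sign
    ("SKIP",   r"\s+"),             # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs; raise on an unrecognised character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize("total = count + rate * 10"):
    print(kind, text)
```

The loop prints seven tokens, one per entry in the list on the slide.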
  • 10. Phases of Compiler – Syntax Analysis • It is also called parsing. • In this phase, the tokens generated by the lexical analyzer are grouped together to form a hierarchical structure. • It determines the structure of the source string by grouping the tokens together. • The hierarchical structure generated in this phase is called a parse tree or syntax tree.
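  • 11. Parse tree for total = count + rate * 10 • The root “=” has children total and “+” • “+” has children count and “*” • “*” has children rate and 10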
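One way to obtain such a tree is a small recursive-descent parser. This is a sketch under stated assumptions, not the method of any particular compiler: tokens are taken to be whitespace-separated, only `=`, `+` and `*` are supported, and the tree is represented as nested tuples `(operator, left, right)`. The function name `parse_assignment` is hypothetical.

```python
def parse_assignment(tokens):
    """Parse `ident = expr`, where expr uses + and * with the usual precedence."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # A factor is an identifier or an integer constant.
        token = advance()
        if not (token.isidentifier() or token.isdigit()):
            raise SyntaxError(f"expected identifier or number, got {token!r}")
        return token

    def term():
        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if not target.isidentifier() or peek() != "=":
        raise SyntaxError("expected `identifier = expression`")
    advance()  # consume '='
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("total = count + rate * 10".split())
print(tree)
```

Because `term` is called from `expr`, the multiplication binds tighter than the addition, which is what places “*” below “+” in the tree.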
  • 12. Phases of Compiler – Semantic Analysis • It determines the meaning of the source string. • For example, it checks that parentheses in an expression are matched, that if…else statements are matched, that arithmetic operations are performed on type-compatible operands, and that the scope of each operation is valid.
  • 13. Intermediate Representation • Most compilers translate the source code into some form of intermediate code. • The intermediate code is later converted into machine code. • Intermediate code forms include three-address code, quadruples, triples and postfix notation.
  • 14. Example: Intermediate Code Generation (for total = count + rate * 10) • T1 = inttofloat(10) • T2 = rate * T1 • T3 = count + T2 • total = T3
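The three-address sequence for this statement can be produced by walking the parse tree bottom-up and assigning a fresh temporary to each intermediate result. The sketch below assumes the tree is given as nested tuples `(operator, left, right)` and that every integer constant needs an int-to-float conversion; both are simplifications for illustration, and `generate_tac` is a hypothetical name.

```python
def generate_tac(tree):
    """Return three-address code (a list of strings) for an assignment tree."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"T{counter}"

    def emit(node):
        # Leaves are identifier names or integer constants.
        if isinstance(node, str):
            if node.isdigit():
                temp = new_temp()
                code.append(f"{temp} = inttofloat({node})")  # assumed conversion rule
                return temp
            return node
        # Interior nodes: evaluate both operands first, then the operator.
        op, left, right = node
        left_place = emit(left)
        right_place = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{target} = {result}")
    return code


tree = ("=", "total", ("+", "count", ("*", "rate", "10")))
for line in generate_tac(tree):
    print(line)
```

Each instruction has at most one operator on its right-hand side, which is the defining property of three-address code.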
  • 15. Code Optimization • It attempts to improve the intermediate code so that the resulting program executes faster or consumes less memory. • Machine-independent code optimization • Machine-dependent code optimization
  • 16. Code Generation • In this phase the target code is generated (machine code) • The intermediate code instructions are translated into sequence of machine instructions – MOV rate, R1 – MUL #10.0, R1 – MOV count, R2 – ADD R2, R1 – MOV R1, total
  • 17. Symbol Table Management • It stores and maintains the identifiers (variables) used in the program. • It stores information about the attributes of each identifier (attributes: type, scope, information about the storage allocated to it). • It also stores information about the subroutines (functions) used in the program – The number of arguments – The types of these arguments – The method of passing these arguments (call by value or by reference) – The return type
  • 18. Symbol Table Management • Various phases use the symbol table – Semantic analysis and intermediate code generation need to know the types of the identifiers used. – Code generation typically needs information about how much storage is allocated to each identifier.
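A symbol table is often built as a stack of dictionaries, one per scope. The sketch below is one possible layout, not a prescribed design: the `Symbol` fields (`type_name`, `size_bytes`) and the class and method names are hypothetical, chosen to mirror the attributes listed on the slides (type, scope, storage).

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    type_name: str    # e.g. "float"; consulted by semantic analysis
    size_bytes: int   # storage allocated; consulted by code generation


class SymbolTable:
    """A stack of scopes; the innermost scope is searched first."""

    def __init__(self):
        self._scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self._scopes.append({})

    def exit_scope(self):
        self._scopes.pop()

    def declare(self, symbol):
        scope = self._scopes[-1]
        if symbol.name in scope:
            raise NameError(f"{symbol.name!r} is already declared in this scope")
        scope[symbol.name] = symbol

    def lookup(self, name):
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.declare(Symbol("rate", "float", 4))
table.enter_scope()
table.declare(Symbol("rate", "int", 2))   # shadows the outer declaration
print(table.lookup("rate").type_name)     # inner scope wins
table.exit_scope()
print(table.lookup("rate").type_name)     # outer declaration is visible again
```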
  • 21. Grouping of Phases • Input program → front end (lexical analysis → parser → semantic analysis) → intermediate code → back end (code optimizer → code generator) → output program
  • 22. Compiler Development Approach • Initially compilers were divided into multiple passes so that the compiler had to manage only one pass at a time. • This approach was used because main memory was limited. • Nowadays a two-pass compiler design is used. – The front end translates the source code into an intermediate representation. – The back end works with the intermediate representation to produce the machine code.
  • 23. Compiler Development Approach • In many cases optimizers and error checkers can be shared by both passes if they use the intermediate representation. • Certain languages can also be compiled in a single pass because of rules of the language such as – All variable declarations are placed first – A function is declared before it is used
  • 24. Types of Compiler • Native code compiler – A compiler designed to compile source code for the same type of platform only. • Cross compiler – A compiler designed to compile source code for different platforms. – Such compilers are often used for designing embedded systems. • Source-to-source compiler or transcompiler – A compiler that takes high-level language source code as input and outputs source code in another high-level language. – For example, it may translate a program from Pascal to C. An automatic parallelizing compiler will frequently take a high-level language program as input, transform the code, and annotate it with parallel code annotations.
  • 25. Types of Compiler • One-pass compiler – A compiler that completes the whole compilation process in a single pass. – That is, it traverses the whole source code only once. • Threaded code compiler – A compiler that simply replaces a string (e.g., the name of a subroutine) by an appropriate binary code. • Incremental compiler – A compiler that compiles only the changed lines of the source code and updates the object code.
  • 26. Types of Compiler • Stage compiler – A compiler that converts the code into assembly code only. • Just-in-time compiler – A compiler that converts the code into machine code after the program starts execution. • Retargetable compiler – A compiler that can be easily modified to compile source code for different CPU architectures. • Parallelizing compiler – A compiler capable of compiling code for a parallel computer architecture.
  • 27. Language Specification • In a computer, all instructions are represented as strings. – Instructions are in the form of numbers, names, pictures or sounds. • Strings used in an organized manner form a language. • Every programming language can be described by a grammar. • A grammar allows us to write a computer program. • Program code is checked to determine whether a string of statements is syntactically correct.
  • 28. Language Specification • To design a language, we have to define its alphabet. – Alphabet: a finite non-empty set of symbols that are used to form words (strings) – Example: an alphabet might be a set like {a, b}. • The symbol “∑” denotes an alphabet. • If ∑ = {a, b}, then we can create strings such as a, ab, aab, abb, bba and so on; the null string is denoted as “ ”. • The length of a string X is denoted by |X|. Then |aba| = 3, |a| = 1, and the null string has length 0. – The concatenation of X and Y is denoted by XY – The set of all strings over an alphabet “∑” is denoted by “∑*”
  • 29. Language Specification – The set of nonempty strings over “∑” is denoted by “∑+” – Languages are sets, so standard set operations such as union, intersection and complementation apply to them. • A language can be described by regular expressions or by grammars, which are used to determine whether a given string belongs to the language or not.
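Since ∑* is infinite, it can only be listed up to some length bound. The following sketch enumerates the strings over ∑ = {a, b} shortest first; the function name `strings_up_to` and the length bound are assumptions made for this illustration.

```python
from itertools import product


def strings_up_to(alphabet, max_length):
    """Yield every string over `alphabet` of length 0..max_length, shortest first."""
    for length in range(max_length + 1):
        for symbols in product(sorted(alphabet), repeat=length):
            yield "".join(symbols)


sigma = {"a", "b"}
sigma_star_prefix = list(strings_up_to(sigma, 2))
print(sigma_star_prefix)

# The set of nonempty strings differs from the set of all strings
# only by the null string.
sigma_plus_prefix = [s for s in sigma_star_prefix if s != ""]
print(len("aba"), len("a"), len(""))   # string lengths |aba|, |a| and the null string
```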
  • 30. Regular Expressions • A regular expression provides a concise and flexible means to "match" (specify and recognize) strings of text, such as particular characters, words, or patterns of characters. • The concept of regular expressions was first popularized by utilities provided by Unix distributions, in particular the editor ed and the filter grep. • A regular expression is written in a formal language that can be interpreted by a regular expression processor, which is a program that either serves as a parser generator or examines text and identifies
  • 31. Regular Expressions • Regular expressions are used by many text editors, utilities, and programming languages to search and manipulate text based on patterns.
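As an illustration of pattern-based matching, the sketch below uses Python's standard `re` module to classify lexemes. The two patterns and the `classify` function are hypothetical examples chosen for this note; they are not taken from the slides.

```python
import re

# Assumed lexical rules for this sketch only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")   # a letter or underscore, then letters/digits
INTEGER = re.compile(r"[0-9]+")                      # one or more decimal digits


def classify(lexeme):
    """Return the category whose pattern matches the whole lexeme."""
    if IDENTIFIER.fullmatch(lexeme):
        return "identifier"
    if INTEGER.fullmatch(lexeme):
        return "integer"
    return "unknown"


for lexeme in ["rate", "10", "10x"]:
    print(lexeme, classify(lexeme))
```

`fullmatch` is used rather than `search` so that the pattern must describe the entire string, which is the sense in which a regular expression defines a language.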
  • 32. Finite Automata • A finite-state machine (FSM) or finite-state automaton (plural: automata), or simply a state machine, is a mathematical model used to design computer programs and digital logic circuits. • It is conceived as an abstract machine that can be in one of a finite number of states. • The machine is in only one state at a time; the state it is in at any given time is called the current state. • One of the states is designated as the “starting state”. • One or more states are designated as “final states”.
  • 33. Finite Automata • It can change from one state to another when initiated by a triggering event or condition; this is called a transition. • A particular FSM is defined by a list of the possible transition states from each current state, and the triggering condition for each transition. • Finite-state machines can model a large number of problems, among which are electronic design automation, communication protocol design, parsing and other engineering applications.
  • 34. Finite Automata • States are represented as circles. • Transitions are represented by arrows. • Each arrow is labeled with a character or a set of characters that cause the specified transition to occur. • The starting state has an arrow entering it that is not connected to anything else.
  • 35. Finite Automata • Deterministic Finite Automata (DFA) – The machine can exist in only one state at any given time • Non-deterministic Finite Automata (NFA) – The machine can exist in multiple states at the same time
  • 36. Deterministic Finite Automata • A Deterministic Finite Automaton (DFA) consists of: – Q ==> a finite set of states – Σ ==> a finite set of input symbols (alphabet) – q0 ==> a start state – F ==> a set of final states – δ ==> a transition function, which is a mapping Q x Σ ==> Q – A DFA is defined by the 5-tuple (Q, Σ, q0, F, δ)
  • 37. How to use a DFA? • Input: a word w in Σ* – Question: Is w acceptable by the DFA? – Steps: • Start at the “start state” q0 • For every input symbol in the sequence w do • Compute the next state from the current state, given the current input symbol in w and the transition function • If after all symbols in w are consumed, the current state is one of the final states (F) then accept w; Otherwise, reject w.
  • 38. Regular Languages • Let L(A) be the language recognized by a DFA A. – Then L(A) is called a “regular language”.
  • 39. Example #1 • Build a DFA for the following language: – L = {w | w is a binary string that contains 01 as a substring} – Steps for building a DFA to recognize L: • Σ = {0,1} • Decide on the states: Q • Designate start state and final state(s) • δ: Decide on the transitions: – Final states == same as “accepting states” – Other states == same as “non-accepting states”
  • 40. Regular expression: (0+1)*01(0+1)* DFA for strings containing 01
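The acceptance procedure for a DFA can be written as a short loop over the input. The transition table below is an assumption: it is the usual three-state DFA for “contains 01 as a substring” (state names `q0`, `q1`, `q2` are hypothetical), since the slide's diagram is not part of this transcript.

```python
def run_dfa(transitions, start, finals, word):
    """Return True if the DFA accepts `word`.

    `transitions` maps (state, symbol) to the single next state.
    A symbol outside the alphabet raises KeyError.
    """
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in finals


# Assumed DFA for L = {w | w is a binary string that contains 01}:
#   q0: no useful prefix seen yet
#   q1: the last symbol was 0
#   q2: 01 has been seen (accepting, never left)
CONTAINS_01 = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

for word in ["1001", "1110", ""]:
    print(repr(word), run_dfa(CONTAINS_01, "q0", {"q2"}, word))
```

Every (state, symbol) pair has exactly one entry in the table, which is what makes the automaton deterministic.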
  • 41. Non-deterministic Finite Automata (NFA) • A Non-deterministic Finite Automaton (NFA) – is of course “non-deterministic” – This implies that the machine can exist in more than one state at the same time – Outgoing transitions could be non-deterministic
  • 42. Non-deterministic Finite Automata (NFA) • A Non-deterministic Finite Automaton (NFA) consists of: – Q ==> a finite set of states – Σ ==> a finite set of input symbols (alphabet) – q0 ==> a start state – F ==> a set of final states – δ ==> a transition function, which is a mapping Q x Σ ==> subset of Q – An NFA is also defined by the 5-tuple (Q, Σ, q0, F, δ)
  • 43. How to use an NFA? • Input: a word w in Σ* • Question: Is w acceptable by the NFA? • Steps: – Start at the “start state” q0 – For every input symbol in the sequence w do – Determine all the possible next states from the current state, given the current input symbol in w and the transition function – If after all symbols in w are consumed, at least one of the current states is a final state then accept w; – Otherwise, reject w.
  • 44. Regular expression: (0+1)*01(0+1)* NFA for strings containing 01
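An NFA can be simulated by tracking the set of all states it could currently be in. The transition table below is an assumed NFA for the same language (binary strings containing 01); the slide's diagram is not available in this transcript, so the state names and transitions are illustrative. Epsilon transitions are not handled in this sketch.

```python
def run_nfa(transitions, start, finals, word):
    """Return True if the NFA accepts `word`.

    `transitions` maps (state, symbol) to a set of next states;
    a missing entry means there is no move on that symbol.
    """
    current = {start}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in transitions.get((state, symbol), set())
        }
    # Accept if at least one of the possible current states is final.
    return bool(current & finals)


# Assumed NFA: q0 loops on both symbols and "guesses" where 01 starts.
NFA_CONTAINS_01 = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}

for word in ["1001", "110", ""]:
    print(repr(word), run_nfa(NFA_CONTAINS_01, "q0", {"q2"}, word))
```

Note that `q1` has no transition on 0; unlike a DFA, an NFA does not need an entry for every symbol in every state.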
  • 45. Differences: DFA vs. NFA
      – Transitions. DFA: all transitions are deterministic; each transition leads to exactly one state. NFA: transitions are non-deterministic; a transition could lead to a subset of states.
      – Transition function. DFA: for each state, a transition should be defined on every possible symbol of the alphabet. NFA: for each state, not all symbols necessarily have to be defined in the transition function.
      – Acceptance. DFA: accepts the input if the last state is in F. NFA: accepts the input if one of the last states is in F.
      – Construction. DFA: sometimes harder to construct because of the number of states. NFA: generally easier to construct than a DFA.
      – Implementation. DFA: practical implementation is feasible. NFA: practical implementation has to be deterministic (so it needs conversion to a DFA).
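The conversion from an NFA to a DFA is commonly done with the subset construction, in which each DFA state is a set of NFA states. The sketch below assumes an NFA without epsilon transitions, given as a dictionary from (state, symbol) to a set of states; the example NFA for “contains 01” and all names are illustrative assumptions.

```python
def nfa_to_dfa(nfa, start, finals, alphabet):
    """Subset construction for an NFA without epsilon transitions.

    Returns (dfa_transitions, dfa_start, dfa_finals); each DFA state
    is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa = {}
    seen = {start_set}
    pending = [start_set]
    while pending:
        current = pending.pop()
        for symbol in sorted(alphabet):
            # The DFA target is the union of all NFA moves on this symbol.
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, symbol), ())
            )
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                pending.append(target)
    # A subset is accepting if it contains at least one NFA final state.
    dfa_finals = {subset for subset in seen if subset & finals}
    return dfa, start_set, dfa_finals


# Assumed NFA for binary strings containing 01.
NFA_CONTAINS_01 = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}

dfa, dfa_start, dfa_finals = nfa_to_dfa(NFA_CONTAINS_01, "q0", {"q2"}, {"0", "1"})
for (subset, symbol), target in sorted(dfa.items(), key=lambda item: (sorted(item[0][0]), item[0][1])):
    print(sorted(subset), symbol, "->", sorted(target))
```

Only subsets reachable from the start state are generated, so the result is usually much smaller than the 2^|Q| states of the worst case.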
  • 46. Construct a DFA to accept a string containing a zero followed by a one.
  • 47. Construct a DFA to accept a string containing two consecutive zeroes followed by two consecutive ones
  • 48. Grammars • A grammar for any natural language such as Hindi, Gujarati, English, etc. is a formal description of the correctness of any kind of simple, complex or compound sentence of that language. • Grammar checks the syntactic correctness of a sentence. • Similarly, a grammar for a programming language is a formal description of the syntax, form or construction, of programs and individual statements written in that programming language.
  • 49. A formal grammar G is a 4-tuple • G = (N, T, P, S) – Where N : set of non-terminal symbols – T : set of terminal symbols – P : set of production rules, or simply productions – S : the start symbol • Terminal – Terminal symbols are literal characters that can appear in the inputs to or outputs from the production rules of a formal grammar and that cannot be broken down into "smaller" units. To be precise, terminal symbols cannot be changed using the rules of the grammar. • Non-terminal – Nonterminal symbols are the symbols that can be replaced; thus there are strings composed of some combination of terminal and nonterminal symbols.
  • 50. Grammar • Subject : The subject is the person, place, or thing that acts, is acted on, or is described in the sentence. • Simple subject - a noun or a pronoun (e.g she, he, cat, city) • Complete subject - a noun or a pronoun plus any modifiers (e.g the black cat,the clouds in the sky ) • Adjectives : They are words that describe nouns or pronouns. They may come before the word they describe (That is a cute puppy.) or they may follow the word they describe (That puppy is cute.).
  • 51. Grammar • Predicate :The predicate usually follows the subject , tells what the subject does, has, or is, what is done to it, or where it is. It is the action or description that occurs in the sentence. • Noun : A noun is a word used to refer to people, animals, objects, substances, states, events and feelings. • Article : English has two types of articles: definite (the) and indefinite (a, an.) The use of these articles depends mainly on whether you are referring to any member of a group, or to a specific member of a group
  • 52. Grammar • Verbs : Verbs are a class of words used to show the performance of an action (do, throw, run), existence (be), possession (have), or state (know, love) of a subject. • Direct Object : A direct object is a noun or pronoun that receives the action of a "transitive verb" in an active sentence or shows the result of the action. It answers the question "What?" or "Whom?" after an action verb. • Consider the English statement below – The small CD contains a large information.
  • 53. Grammar • Subject – Article : the – Adjective : small – Noun : CD • Predicate – Verb : contains – Direct object : a large information • A direct object – Article : a – Adjective : large – Noun : information
  • 54. Grammar • The small CD contains a large information. 1. <sentence> : <subject><predicate> 2. <subject> : <article><adjective><noun> 3. <predicate> : <verb><direct-object> 4. <direct-object> : <article><adjective><noun> 5. <article> : The | a 6. <adjective> : small | large 7. <noun> : CD | information 8. <verb> : contains
  • 55. Generating a string in the language • <sentence> • <subject><predicate> • <article><adjective><noun><verb><direct-object> • <article><adjective><noun><verb><article><adjective><noun> • The small CD contains a large information (choosing one alternative from The | a, small | large, CD | information for each symbol)
  • 56. Grammar • N = {sentence, subject, predicate, article, adjective, noun, verb, direct-object} • T = {The, a, small, large, CD, information, contains} • S = sentence • P = { <sentence> : <subject><predicate> <subject> : <article><adjective><noun> <predicate> : <verb><direct-object> <direct-object> : <article><adjective><noun> <article> : The | a <adjective> : small | large <noun> : CD | information <verb> : contains }
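The language generated by this grammar is finite, so it can be enumerated by expanding every nonterminal in every possible way. The sketch below encodes the productions P as a dictionary; the representation and the `expand` function are assumptions made for illustration, and the approach only terminates because this grammar has no recursive rules.

```python
from itertools import product

# Productions P: each nonterminal maps to a list of alternatives,
# and each alternative is a sequence of symbols.
GRAMMAR = {
    "sentence": [["subject", "predicate"]],
    "subject": [["article", "adjective", "noun"]],
    "predicate": [["verb", "direct-object"]],
    "direct-object": [["article", "adjective", "noun"]],
    "article": [["The"], ["a"]],
    "adjective": [["small"], ["large"]],
    "noun": [["CD"], ["information"]],
    "verb": [["contains"]],
}


def expand(symbol):
    """Return every terminal string (as a list of words) derivable from `symbol`."""
    if symbol not in GRAMMAR:          # a terminal derives only itself
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combination in product(*parts):
            results.append([word for piece in combination for word in piece])
    return results


sentences = [" ".join(words) for words in expand("sentence")]
print(len(sentences))
print("The small CD contains a large information" in sentences)
```

A string belongs to the language exactly when it appears in `sentences`, which is the membership question a parser answers without enumerating the whole language.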
  • 57. The C Language Grammar (abbreviated) • Terminals: – if do while for switch break continue typedef struct return main int long char float double void static ; ( ) a b c A B C 0 1 2 + * - / _ # include += ++ ... • Nonterminals: – <statement> <expression> <C source file> <identifier> <digit> <nondigit> <selection-statement> <loop-statement> • Start symbol: <C source file> • A string: #include <stdio.h> int main(void) { printf("Hello World!\n"); return 0; }
  • 61. Hierarchy of Grammars • Grammars can be divided into four classes by increasing the restrictions on the form of the productions. • This hierarchy is also known as the Chomsky hierarchy (Chomsky, 1963). • It consists of four classes – Type 0 : formal or unrestricted grammar – Type 1 : context-sensitive grammar – Type 2 : context-free grammar – Type 3 : right linear or regular grammar
  • 62. Type-0 Grammar • These grammars, known as phrase structure grammars, contain productions of the form α ::= β, where both α and β can be strings of terminal and nonterminal symbols.
  • 63. Type-1 Grammar • These grammars are known as context-sensitive grammars. • Derivation or reduction of strings can take place only in specific contexts. • αAβ ::= α∏β – The string ∏ in a sentential form can be replaced by ‘A’ (or vice versa) only when it is enclosed by the strings α and β.
  • 64. Type-2 Grammar • These grammars are known as context-free grammars. – A ::= ∏
  • 65. Type-3 Grammar • These grammars are also known as linear grammars or regular grammars.