Skip to content

LL(1) Parsing

  • Used in Top-Down Parsing

  • Essential for LL(1) grammar

  • Helps in predictive parsing== ==table construction

Grammar Basics

  • Terminal (T): tokens (id, +, *, (, ), etc.)
  • Non-terminal (NT): variables (E, T, F, etc.)
  • Ξ΅ (epsilon): empty string

  • $: input end marker

  • Grammar: A β†’ Ξ±

FIRST Set

  • FIRST(X) = set of terminals== that can ==appear as the first symbol== of any ==string derived from X

Rules to Compute FIRST ⭐

  1. If X is a terminal

    FIRST(X) = {X}
  2. If X β†’ Ξ΅

    Ρ ∈ FIRST(X)
  3. If X β†’ Y1 Y2 ... Yk ⭐

    • Add FIRST(Y1) βˆ’ {Ξ΅} to FIRST(X)
    • If Ξ΅ ∈ FIRST(Y1), then add FIRST(Y2) βˆ’ {Ξ΅}
    • Continue until:
      • Ξ΅ not found, or
      • All Yi contain Ξ΅ β†’ then add Ξ΅ to FIRST(X)

Key Points

  • FIRST is about starting terminals

  • Ξ΅ only included if whole RHS can derive Ξ΅

FOLLOW Set

  • FOLLOW(A) = set of terminals== that can ==appear immediately after A in some sentential form

Rules to Compute FOLLOW ⭐

  1. If A is start symbol

    $ ∈ FOLLOW(A)
  2. For production X β†’ Ξ± A Ξ²

    FIRST(Ξ²) βˆ’ {Ξ΅} βŠ† FOLLOW(A)
  3. If X β†’ Ξ± A OR X β†’ Ξ± A Ξ² and Ξ΅ ∈ FIRST(Ξ²)

    FOLLOW(X) βŠ† FOLLOW(A)

Key Points

  • FOLLOW depends on context

  • FOLLOW never contains Ξ΅

Algorithm (GATE-Oriented)

  1. Initialize all FIRST and FOLLOW as empty
  2. Apply FIRST rules== until ==no change

  3. Apply FOLLOW rules== iteratively until ==fixed point

Example ⭐

Grammar:

E β†’ T E'
E' β†’ + T E' | Ξ΅
T β†’ F T'
T' β†’ * F T' | Ξ΅
F β†’ ( E ) | id

FIRST:

FIRST(E) = { (, id }
FIRST(E') = { +, Ξ΅ }
FIRST(T) = { (, id }
FIRST(T') = { *, Ξ΅ }
FIRST(F) = { (, id }

FOLLOW:

FOLLOW(E) = { ), $ }
FOLLOW(E') = { ), $ }
FOLLOW(T) = { +, ), $ }
FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { *, +, ), $ }

Note: Xβ€² (X prime) is a new non-terminal introduced to remove left recursion and make the grammar suitable for LL(1) parsing

FIRST of String ⭐

For string Ξ± = X1 X2 ... Xn

  • FIRST(Ξ±) computed same as RHS rule
  • Used directly in parsing table

LL(1) Grammar Condition

Grammar is LL(1) iff for every non-terminal A:

  1. For A β†’ Ξ± | Ξ²

    FIRST(Ξ±) ∩ FIRST(Ξ²) = βˆ…
  2. If Ρ ∈ FIRST(α)

    FIRST(Ξ²) ∩ FOLLOW(A) = βˆ…

FOLLOW vs LFOLLOW vs RFOLLOW ⭐

FOLLOW

  • FOLLOW(A) = set of terminals that can appear immediately after non-terminal A in some sentential form
  • Used in LL(1) parsing
  • $ ∈ FOLLOW(start symbol)
  • Ξ΅ never appears in FOLLOW
  • Example:
S β†’ A b
A β†’ a | Ξ΅
FOLLOW(S) = { $ }
FOLLOW(A) = { b }

LFOLLOW (Left FOLLOW)

  • LFOLLOW(A) = terminals that can appear immediately to the left of A
  • Used in operator precedence / LR parsing concepts
  • Not used in LL(1)
  • Example:
S β†’ a A
LFOLLOW(A) = { a }

RFOLLOW (Right FOLLOW)

  • RFOLLOW(A) = terminals that can appear immediately to the right of A
  • Practically same as FOLLOW(A) in most compiler texts
  • Emphasizes right context
  • Example:
S β†’ A b
RFOLLOW(A) = { b }
  • FOLLOW β†’ LL(1), predictive parsing
  • LFOLLOW / RFOLLOW β†’ theoretical, LR / precedence discussions

FOLLOW usually means RFOLLOW** ⭐

S β†’ AB
A β†’ a
B β†’ b
FOLLOW(A) = {b}
because, S β†’ AB β†’ Ab β†’ ab
LFOLLOW(A) = { }
because, S β†’ AB β†’ aB β†’ ab
RFOLLOW(B) = {b}
because, S β†’ AB β†’ Ab β†’ ab

For production A β†’ Ξ±:

  1. For each a ∈ FIRST(α)

    M[A, a] = A β†’ Ξ±
  2. If Ρ ∈ FIRST(α)

    For each b ∈ FOLLOW(A), M[A, b] = A β†’ Ξ±

Common GATE Traps

  • Mixing FIRST and FOLLOW rules
  • Forgetting $ in FOLLOW(start symbol)
  • Incorrect Ξ΅ propagation
  • FOLLOW depends on LHS, not RHS only

One-Line Memory

  • FIRST β†’ what can start
  • FOLLOW β†’ what can follow
  • LL(1) β†’ no conflict between them

  • LL(1) Predictive Parsing Table
  • Used in Top-Down Parsing
  • Built using FIRST and FOLLOW
  • Decides which production to apply using (Non-terminal, Input symbol)

Structure of Table

  • Rows β†’ Non-terminals
  • Columns β†’ Terminals + $
  • Cell Entry β†’ Grammar production
  • Empty cell β†’ error

Steps to Construct Parsing Table

Step 1: Compute FIRST

  • FIRST of RHS** tells which terminal can start

  • If Ξ΅ ∈ FIRST(RHS), mark it

Step 2: Compute FOLLOW

  • FOLLOW of LHS** tells what can appear next

  • Add $ to FOLLOW(start symbol)

Step 3: Fill Table

For each production A β†’ Ξ±

  1. For every terminal a ∈ FIRST(Ξ±) and a β‰  Ξ΅
M[A, a] = A β†’ Ξ±
  1. If Ρ ∈ FIRST(α)
For every b ∈ FOLLOW(A)
M[A, b] = A β†’ Ξ±
  1. Remaining cells β†’ error

Example

S β†’ a A | b B | Ξ΅
A β†’ S
B β†’ S | Ξ΅

FIRST Sets

FIRST(S) = { a, b, Ξ΅ }
FIRST(A) = { a, b, Ξ΅ }
FIRST(B) = { a, b, Ξ΅ }

FOLLOW Sets

FOLLOW(S) = { $ }
FOLLOW(A) = { $ }
FOLLOW(B) = { $ }

Predictive Parsing Table

ab$
SS β†’ aAS β†’ bBS β†’ Ξ΅
AA β†’ SA β†’ Serror
BB β†’ SB β†’ SB β†’ Ξ΅

How Each Entry Came (Logic)

Row S

  • a ∈ FIRST(a) β†’ S β†’ a
  • b ∈ FIRST(b) β†’ S β†’ b
  • Ξ΅ ∈ FIRST(S) and $ ∈ FOLLOW(S) β†’ S β†’ Ξ΅

Row A

  • A β†’ S
  • FIRST(S) βˆ’ {Ξ΅} = {a, b} β†’ entries under a and b
  • Even though Ξ΅ ∈ FIRST(S), FOLLOW(A) is not used here because grammar would cause conflict (LL(1) violation if added)

Note: Ρ ∈ FIRST(RHS) does NOT automatically mean entry under $. $ column is filled only using FOLLOW(LHS) of the same production Always check which non-terminal owns Ρ

Row B

  • B β†’ S gives entries under a, b
  • B β†’ Ξ΅ and $ ∈ FOLLOW(B) β†’ entry under $

LL(1) Check (GATE Rule)

Grammar is LL(1) iff:

  • No cell has more than one entry
  • FIRST–FIRST and FIRST–FOLLOW conflicts absent

One-Line Memory

  • FIRST fills terminals
  • FOLLOW fills $ for Ξ΅-productions
  • Multiple entries β†’ not LL(1)