
Code Optimization

Directed Acyclic Graph (DAG) ⭐

  • Used for code optimization inside a basic block
  • Graph with directed edges and no cycles
  • Nodes represent variables, constants, or operations
  • Same subexpression → single node
  • Helps in: ⭐
    • Common subexpression elimination

    • Dead code elimination

    • Code motion
  • Scope: local optimization only

Abstract Syntax Tree (AST)

  • Tree representation of program syntax
  • Generated after parsing
  • Interior nodes → operators / constructs
  • Leaf nodes → identifiers / constants
  • Removes unnecessary grammar symbols (unlike parse tree)
  • Used for: ⭐
    • Syntax analysis

    • Semantic analysis

    • Intermediate code generation
  • Structure: tree (hierarchical)
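
The AST structure above can be inspected directly with Python's built-in `ast` module; the expression `a + b * c` below is illustrative:

```python
import ast

# Parse "a + b * c": the AST keeps only operators and operands,
# dropping the grammar-only symbols a full parse tree would retain.
tree = ast.parse("a + b * c", mode="eval")

expr = tree.body                          # BinOp: a + (b * c)
assert isinstance(expr.op, ast.Add)       # interior node → operator
assert isinstance(expr.left, ast.Name)    # leaf node → identifier 'a'
assert isinstance(expr.right, ast.BinOp)  # * bound tighter by the parser
print(ast.dump(expr.right.op))            # Mult()
```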

Control Flow Graph (CFG) ⭐

  • Graph representing flow of control
  • Nodes → basic blocks
  • Edges → possible control transfers
  • Used for:
    • Loop detection
    • Reachability analysis
    • Data flow analysis
    • Global optimization
  • May contain cycles (loops)
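
A minimal sketch of these ideas, with assumed block names B1–B4:

```python
# CFG: nodes are basic blocks, edges are possible control transfers.
cfg = {
    "B1": ["B2"],        # entry block
    "B2": ["B3", "B4"],  # conditional branch
    "B3": ["B2"],        # back edge → the loop B2–B3
    "B4": [],            # exit block
}

def reachable(cfg, entry):
    """Reachability analysis: every block reachable from the entry."""
    seen, stack = set(), [entry]
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(cfg[b])
    return seen

def back_edges(cfg, entry):
    """Loop detection: DFS edges that return to a block still on the path."""
    back, on_path, visited = [], set(), set()
    def dfs(b):
        visited.add(b)
        on_path.add(b)
        for s in cfg[b]:
            if s in on_path:
                back.append((b, s))   # edge into an active ancestor
            elif s not in visited:
                dfs(s)
        on_path.discard(b)
    dfs(entry)
    return back

print(sorted(reachable(cfg, "B1")))  # ['B1', 'B2', 'B3', 'B4']
print(back_edges(cfg, "B1"))         # [('B3', 'B2')] → B2 heads a loop
```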

Key Differences (Exam-Oriented)

| Feature | DAG | AST | CFG |
| --- | --- | --- | --- |
| Structure | Graph | ==Tree== | Graph |
| Cycles | Not allowed | Not allowed | ==Allowed== |
| Level | Expression / basic block | Program syntax | Program flow |
| Optimization | ==Yes (local)== | No | ==Yes (global)== |
| Focus | ==Computation== | Syntax | Control |

One-Line Memory Hooks

  • AST → how code is written → for meaning

  • DAG → how expressions are optimized → for efficiency

  • CFG → how control flows → for execution flow


A ==Directed Acyclic Graph (DAG)== is used to ==represent basic blocks== in compiler optimization to ==eliminate redundant computations==.

Nodes

Each node represents:

  • A variable (leaf node)
  • A constant (leaf node)
  • An operator with its operands (+, -, *, /, etc.)

So,

  • ==Number of nodes = no. of distinct variables + no. of constants + no. of distinct operations==

  • Common subexpressions share the same node

Edges

Edges represent data dependency:

  • Directed from operand → operator

  • If an operator has k operands, it contributes k incoming edges

So,

  • ==Number of edges = sum of operand counts over all operator nodes==

  • Unary operator → 1 edge
  • Binary operator → 2 edges

Key Point

DAG minimizes nodes and edges by:

  • Merging identical subexpressions
  • Avoiding recomputation

Example

a = b + c
d = b + c

  • Nodes: b, c, (b+c), a, d → 5 nodes
  • Edges: b → (b+c), c → (b+c), (b+c) → a, (b+c) → d → 4 edges

      b   c
       \ /
      ( + )
       / \
      a   d

In short: fewer nodes and edges → more optimized code
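
The sharing rule can be mimicked with value numbering: the same (operator, operands) key always maps to the same node, so the second `b + c` reuses the first. The bookkeeping below is an illustrative sketch, not a full DAG builder:

```python
nodes = {}   # key → node id (insertion order gives fresh ids)
edges = []   # (src, dst) data-dependency edges

def node(key):
    return nodes.setdefault(key, len(nodes))

def op(operator, *operands):
    key = (operator,) + operands
    fresh = key not in nodes
    nid = node(key)
    if fresh:                        # operand edges only on first build
        edges.extend((o, nid) for o in operands)
    return nid

def assign(target, expr_id):
    tid = node(("var", target))
    edges.append((expr_id, tid))     # (b+c) → a, (b+c) → d
    return tid

b, c = node(("leaf", "b")), node(("leaf", "c"))
assign("a", op("+", b, c))           # a = b + c
assign("d", op("+", b, c))           # d = b + c: shared node, no new op

print(len(nodes), len(edges))        # 5 4, matching the count above
```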


Data Flow Analysis (DFA) ⭐

  • Technique to collect information about program variables and expressions at different program points

  • Performed on Control Flow Graph (CFG)

  • Foundation for global code optimization
  • Based on iterative equations until fixed point

Basic Terms

  • Basic Block: maximal sequence of statements with single entry and exit
  • CFG Node: basic block
  • CFG Edge: possible control transfer
  • IN[B]: information before block B

  • OUT[B]: information after block B

  • GEN[B]: information generated by block B
  • KILL[B]: information invalidated by block B

Direction of Analysis

  • ==Forward analysis==: information ==flows along control flow==. Example: Available Expressions

  • ==Backward analysis==: information ==flows opposite to control flow==. Example: Live Variables

Data Flow Equation (General)

  • Forward:

    IN[B] = ⋂ / ⋃ OUT[P] for all predecessors P
    OUT[B] = GEN[B] ∪ (IN[B] − KILL[B])
  • Backward:

    OUT[B] = ⋂ / ⋃ IN[S] for all successors S
    IN[B] = GEN[B] ∪ (OUT[B] − KILL[B])

Meet Operator

  • Union (⋃): may analysis
  • Intersection (⋂): must analysis

A. Live Variable Analysis ⭐

  • A variable is ==live== if its ==value may be used later==

  • Type: Backward analysis

  • Meet: Union

  • GEN: variables used before definition
  • KILL: variables defined
  • Used for: Dead Code Elimination
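
A runnable sketch of the analysis on an assumed two-block CFG (B1: `a = b + c`, B2: `return a`):

```python
# Live-variable analysis: backward direction, meet = union.
succ = {"B1": ["B2"], "B2": []}
use  = {"B1": {"b", "c"}, "B2": {"a"}}   # GEN: used before defined
defs = {"B1": {"a"},      "B2": set()}   # KILL: defined in the block

IN  = {b: set() for b in succ}
OUT = {b: set() for b in succ}
changed = True
while changed:                           # iterate until fixed point
    changed = False
    for b in succ:
        OUT[b] = set().union(*(IN[s] for s in succ[b]))
        new_in = use[b] | (OUT[b] - defs[b])
        if new_in != IN[b]:
            IN[b], changed = new_in, True

print(sorted(IN["B1"]))   # ['b', 'c']: live on entry to B1
print(sorted(OUT["B1"]))  # ['a']: live across the edge into B2
```

Since nothing is live after B2's use of `a`, a definition of `a` that never reached a use would have empty live-out and could be removed (dead code elimination).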

B. Available Expression Analysis ⭐

  • An expression is ==available== if already computed and not killed

  • Type: Forward analysis

  • Meet: Intersection

  • GEN: expressions computed
  • KILL: operands redefined
  • Used for: Common Sub-expression Elimination
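
The same iterative scheme, forward with intersection, on an assumed diamond-shaped CFG (B1 computes `b+c`, the branches B2/B3 leave `b` and `c` untouched, B4 recomputes `b+c`):

```python
# Available-expression analysis: forward direction, meet = intersection.
pred = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
gen  = {"B1": {"b+c"}, "B2": set(), "B3": set(), "B4": {"b+c"}}
kill = {b: set() for b in pred}      # no block redefines b or c
ALL  = {"b+c"}                       # universe of expressions

IN  = {b: set() for b in pred}
OUT = {b: (gen[b] if not pred[b] else ALL) for b in pred}  # must-analysis init
changed = True
while changed:
    changed = False
    for b in pred:
        if pred[b]:
            IN[b] = set.intersection(*(OUT[p] for p in pred[b]))
        new_out = gen[b] | (IN[b] - kill[b])
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

print(sorted(IN["B4"]))   # ['b+c']: available on entry, so B4 can reuse it
```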

Code Optimization

  • Process of improving code without changing semantics
  • Goal: reduce execution time, memory, power
  • Performed on Intermediate Representation (IR)

Types of Optimization

  1. Machine Independent
  2. Machine Dependent

A. Local Optimization

  • Applied within a basic block
  • Techniques:
    • Common sub-expression elimination (via DAG)

    • Constant folding

    • Dead code elimination

B. Global Optimization

  • Applied across multiple basic blocks
  • Requires Data Flow Analysis
  • Techniques:
    • ==Live-variable-based== dead code elimination

    • Global sub-expression elimination
    • Code motion ⭐

C. Loop Optimization

  • Improves frequently executed loops
  • Techniques:
    1. Loop Invariant Code Motion ⭐

    2. Strength Reduction ⭐ Example: i * 2 → i + i ⭐

    3. Induction Variable Elimination
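
The three techniques, applied by hand to an assumed loop that accumulates `a[i] * 2 + x * y`:

```python
def original(a, x, y):
    s = 0
    for i in range(len(a)):
        s += a[i] * 2 + x * y   # x*y is loop-invariant; *2 can be cheaper
    return s

def optimized(a, x, y):
    s = 0
    t = x * y                   # loop-invariant code motion: computed once
    for v in a:                 # induction variable i eliminated entirely
        s += v + v + t          # strength reduction: v * 2 → v + v
    return s

data = [1, 2, 3]
assert original(data, 4, 5) == optimized(data, 4, 5)  # semantics preserved
print(optimized(data, 4, 5))  # 72
```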

Techniques: ⭐⭐

  1. Dead Code Elimination – Removes statements whose results are never used (based on live variable analysis)
  2. Common Sub-Expression Elimination – Computes an expression once and reuses its value instead of recomputing
  3. Loop Invariant Code Motion – Moves computations that do not change inside a loop to outside the loop
  4. Strength Reduction – Replaces expensive operations with cheaper ones (e.g., * → +)
  5. Induction Variable Elimination – Removes extra loop variables derivable from a primary induction variable
  6. Code Motion – Relocates computations to ==points where they execute fewer times== without changing semantics

  7. Constant Folding – Evaluates constant expressions at compile time

  8. Constant Propagation – Replaces variables with known constant values

  9. Copy Propagation – Replaces variables that are simple copies with the original variable

  10. Algebraic Simplification – Uses algebraic identities to simplify expressions
  11. Unreachable Code Elimination – Removes code that can never be executed
  12. Peephole Optimization – Performs small local optimizations on a short sequence of instructions
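
As an example of technique 7, constant folding can be sketched with Python's `ast` module (only `+` and `*` are handled here, for illustration):

```python
import ast

class Fold(ast.NodeTransformer):
    """Constant folding sketch: evaluate constant BinOps at 'compile time'."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # fold children first (bottom-up)
        if (isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.op, (ast.Add, ast.Mult))):
            ops = {ast.Add: lambda a, b: a + b, ast.Mult: lambda a, b: a * b}
            value = ops[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

tree = ast.parse("x = 2 * 3 + 4")
folded = ast.fix_missing_locations(Fold().visit(tree))
print(ast.unparse(folded))   # x = 10
```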

GATE Focus Points

  • Direction of analysis
  • Meet operator choice
  • GEN/KILL construction
  • Mapping:
    DFA → enables optimization
    Optimization → improves performance

One-Line Memory Rule

  • CFG + DFA = Global Optimization