Code Optimisation
AST vs DAG vs CFG Graphs
Directed Acyclic Graph (DAG)
- Used for code optimization inside a basic block
- Graph with directed edges and no cycles
- Nodes represent variables, constants, or operations
- Same subexpression → single node
- Helps in:
  - Common subexpression elimination
  - Dead code elimination
  - Code motion
- Scope: local optimization only
Abstract Syntax Tree (AST)
- Tree representation of program syntax
- Generated after parsing
- Interior nodes → operators / constructs
- Leaf nodes → identifiers / constants
- Removes unnecessary grammar symbols (unlike parse tree)
- Used for:
  - Syntax analysis
  - Semantic analysis
  - Intermediate code generation
- Structure: tree (hierarchical)
Control Flow Graph (CFG)
- Graph representing flow of control
- Nodes → basic blocks
- Edges → possible control transfers
- Used for:
  - Loop detection
  - Reachability analysis
  - Data flow analysis
  - Global optimization
- May contain cycles (loops)
Key Differences (Exam-Oriented)
| Feature | DAG | AST | CFG |
|---|---|---|---|
| Structure | Graph | Tree | Graph |
| Cycles | Not allowed | Not allowed | Allowed |
| Level | Expression / basic block | Program syntax | Program flow |
| Optimization | Yes (local) | No | Yes (global) |
| Focus | Computation | Syntax | Control |
One-Line Memory Hooks
- AST → how code is written → for meaning
- DAG → how expressions are optimized → for efficiency
- CFG → how control flows → for execution flow
Nodes and Edges in DAG
A Directed Acyclic Graph (DAG) is used to represent basic blocks in compiler optimization and to eliminate redundant computations.
Nodes
Each node represents:
- A variable (leaf node)
- A constant (leaf node)
- An operator with operands (`+`, `-`, `*`, `/`, etc.)

So,

- Number of nodes = no. of distinct variables + no. of constants + no. of distinct operations
- Common subexpressions share the same node
Edges
Edges represent data dependency:
- Directed from operand → operator
- If an operator has `k` operands, it contributes `k` incoming edges

So,

- Number of edges = sum of operand counts over all operator nodes
- Unary operator → 1 edge
- Binary operator → 2 edges
Key Point
DAG minimizes nodes and edges by:
- Merging identical subexpressions
- Avoiding recomputation
Example
```
a = b + c
d = b + c
```

- Nodes: `b`, `c`, `(b+c)`, `a`, `d` → 5 nodes
- Edges: `b → (b+c)`, `c → (b+c)`, `(b+c) → a`, `(b+c) → d` → 4 edges

```
b   c
 \ /
 (+)
 / \
a   d
```

Fewer nodes and edges = more optimized code
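The node and edge counts above can be reproduced with a small hash-consing sketch (the `DAG` class and its method names are invented for this illustration): identical subexpressions map to the same node, as in value numbering.

```python
# Minimal DAG-construction sketch for a basic block (illustrative only).
class DAG:
    def __init__(self):
        self.nodes = {}   # structural key -> node id
        self.edges = []   # (operand node id, operator node id)

    def leaf(self, name):
        # Variables and constants become shared leaf nodes.
        return self.nodes.setdefault(("leaf", name), len(self.nodes))

    def op(self, operator, *operands):
        key = ("op", operator, operands)
        if key not in self.nodes:          # reuse node for common subexpression
            self.nodes[key] = len(self.nodes)
            for o in operands:             # edges directed operand -> operator
                self.edges.append((o, self.nodes[key]))
        return self.nodes[key]

dag = DAG()
b, c = dag.leaf("b"), dag.leaf("c")
t = dag.op("+", b, c)                      # a = b + c
a = dag.leaf("a"); dag.edges.append((t, a))
d = dag.leaf("d"); dag.edges.append((t, d))  # d = b + c reuses the same (+) node

print(len(dag.nodes), len(dag.edges))      # 5 4
```

Because `d = b + c` hits the cached `(+)` node, no second addition node or operand edges are created, which is exactly the redundancy elimination the DAG is for.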
Data Flow Analysis & Code Optimisation
Data Flow Analysis (DFA)
- Technique to collect information about program variables and expressions at different program points
- Performed on the Control Flow Graph (CFG)
- Foundation for global code optimization
- Based on solving iterative equations until a fixed point is reached
Basic Terms
- Basic Block: maximal sequence of statements with single entry and exit
- CFG Node: basic block
- CFG Edge: possible control transfer
- `IN[B]`: information before block B
- `OUT[B]`: information after block B
- `GEN[B]`: information generated by block B
- `KILL[B]`: information invalidated by block B
Direction of Analysis
- Forward analysis: information flows along control flow. Example: available expressions
- Backward analysis: information flows opposite to control flow. Example: live variables
Data Flow Equation (General)
- Forward:
  - `IN[B] = ∪ / ∩ OUT[P]` for all predecessors P
  - `OUT[B] = GEN[B] ∪ (IN[B] − KILL[B])`
- Backward:
  - `OUT[B] = ∪ / ∩ IN[S]` for all successors S
  - `IN[B] = GEN[B] ∪ (OUT[B] − KILL[B])`
Meet Operator
- Union (∪): may analysis
- Intersection (∩): must analysis
A. Live Variable Analysis
- A variable is live if its value may be used later
- Type: Backward analysis
- Meet: Union
- GEN: variables used before definition
- KILL: variables defined
- Used for: Dead Code Elimination
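A minimal sketch of this analysis, assuming a hypothetical three-block straight-line CFG with invented USE/DEF sets. It iterates the backward equations `OUT[B] = ∪ IN[S]` and `IN[B] = USE[B] ∪ (OUT[B] − DEF[B])` to a fixed point:

```python
# Backward live-variable analysis sketch (block contents are hypothetical).
blocks = {                    # B: (USE, DEF, successors)
    "B1": ({"a"}, {"b"}, ["B2"]),
    "B2": ({"b"}, {"c"}, ["B3"]),
    "B3": ({"c"}, set(), []),
}

IN = {b: set() for b in blocks}
OUT = {b: set() for b in blocks}

changed = True
while changed:                # iterate until no IN set changes (fixed point)
    changed = False
    for b, (use, defs, succs) in blocks.items():
        # Meet: union of successor IN sets (may analysis)
        OUT[b] = set().union(*(IN[s] for s in succs)) if succs else set()
        new_in = use | (OUT[b] - defs)
        if new_in != IN[b]:
            IN[b], changed = new_in, True

print(IN)   # IN["B1"] == {"a"}: only `a` is live on entry to B1
```

A definition of a variable that is not in the OUT set of its block (e.g. one never used downstream) is dead and can be eliminated.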
B. Available Expression Analysis
- An expression is available if it has already been computed and not killed since
- Type: Forward analysis
- Meet: Intersection
- GEN: expressions computed
- KILL: expressions whose operands are redefined
- Used for: Common Sub-expression Elimination
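The forward direction with an intersection meet can be sketched the same way, using hypothetical GEN/KILL sets over a diamond-shaped CFG. Intersection makes this a must analysis: `b+c` is available at the join only if it survives along every incoming path.

```python
# Forward available-expressions sketch (GEN/KILL sets are hypothetical).
blocks = {                    # B: (GEN, KILL, predecessors)
    "B1": ({"b+c"}, set(), []),
    "B2": ({"b+c"}, set(), ["B1"]),
    "B3": (set(), {"b+c"}, ["B1"]),   # redefines an operand of b+c
    "B4": (set(), set(), ["B2", "B3"]),
}

ALL = {"b+c"}
IN = {b: set() for b in blocks}
OUT = {b: set(ALL) for b in blocks}   # optimistic init for intersection meet

changed = True
while changed:
    changed = False
    for b, (gen, kill, preds) in blocks.items():
        # Meet: intersection over predecessors (must analysis)
        IN[b] = set.intersection(*(OUT[p] for p in preds)) if preds else set()
        new_out = gen | (IN[b] - kill)
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

print(IN["B4"])   # empty: b+c is killed on the B3 path, so not available at B4
```

With a union meet instead, `b+c` would wrongly appear available at B4; the intersection is what makes reuse of the earlier computation safe.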
Code Optimization
- Process of improving code without changing semantics
- Goal: reduce execution time, memory, power
- Performed on Intermediate Representation (IR)
Types of Optimization
- Machine Independent
- Machine Dependent
A. Local Optimization
- Applied within a basic block
- Techniques:
  - Common sub-expression elimination (via DAG)
  - Constant folding
  - Dead code elimination
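Constant folding can be sketched on three-address-style tuples (the instruction encoding and names here are invented for illustration): any operation whose operands are both literal constants is evaluated at compile time and replaced by a plain assignment.

```python
# Constant-folding sketch over (dest, op, arg1, arg2) tuples.
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def fold(code):
    out = []
    consts = {}                           # var -> known constant value
    for dest, op, a, b in code:
        a = consts.get(a, a)              # simple constant propagation
        b = consts.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            val = OPS[op](a, b)           # evaluate at "compile time"
            consts[dest] = val
            out.append((dest, "=", val, None))
        else:
            out.append((dest, op, a, b))
    return out

code = [("t1", "+", 2, 3),       # t1 = 2 + 3   -> t1 = 5
        ("t2", "*", "t1", 4),    # t2 = t1 * 4  -> t2 = 20
        ("t3", "+", "x", "t2")]  # t3 = x + t2  -> t3 = x + 20
print(fold(code))
```

Note the folding cascades: once `t1` is known to be 5, `t1 * 4` also folds, and only the operation involving the unknown `x` survives.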
B. Global Optimization
- Applied across multiple basic blocks
- Requires Data Flow Analysis
- Techniques:
  - Live-variable-based dead code elimination
  - Global sub-expression elimination
  - Code motion
C. Loop Optimization
- Improves frequently executed loops
- Techniques:
  - Loop Invariant Code Motion
  - Strength Reduction → Example: `i * 2` → `i + i`
  - Induction Variable Elimination
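Strength reduction inside a loop can be shown with a before/after pair (the address-computation scenario is invented): a per-iteration multiplication `i * 4` is replaced by an accumulator that is advanced with a cheaper addition.

```python
# Strength-reduction sketch: multiplication replaced by repeated addition.
def before(n):
    addrs = []
    for i in range(n):
        addrs.append(i * 4)      # multiplication every iteration
    return addrs

def after(n):
    addrs = []
    t = 0                        # induction-variable accumulator
    for _ in range(n):
        addrs.append(t)
        t += 4                   # cheaper addition replaces i * 4
    return addrs

assert before(5) == after(5) == [0, 4, 8, 12, 16]
```

The loop counter `i` is no longer needed in the transformed body, which is what makes induction variable elimination a natural follow-up.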
Techniques
- Dead Code Elimination → Removes statements whose results are never used (based on live variable analysis)
- Common Sub-Expression Elimination → Computes an expression once and reuses its value instead of recomputing
- Loop Invariant Code Motion → Moves computations that do not change inside a loop to outside the loop
- Strength Reduction → Replaces expensive operations with cheaper ones (e.g., `*` → `+`)
- Induction Variable Elimination → Removes extra loop variables derivable from a primary induction variable
- Code Motion → Relocates computations to points where they execute fewer times without changing semantics
- Constant Folding → Evaluates constant expressions at compile time
- Constant Propagation → Replaces variables with known constant values
- Copy Propagation → Replaces variables that are simple copies with the original variable
- Algebraic Simplification → Uses algebraic identities to simplify expressions
- Unreachable Code Elimination → Removes code that can never be executed
- Peephole Optimization → Performs small local optimizations on a short sequence of instructions
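Peephole optimization can be sketched as a scan over a short window of pseudo-assembly (the instruction set is invented for this example), deleting a redundant `STORE`/`LOAD` pair when the value is already in the register.

```python
# Peephole sketch: remove `STORE R, x` immediately followed by `LOAD x, R`.
def peephole(instrs):
    out = []
    for ins in instrs:
        if out and ins.startswith("LOAD "):
            addr, reg = ins[5:].split(", ")       # "LOAD x, R1" -> ("x", "R1")
            if out[-1] == f"STORE {reg}, {addr}":
                continue                          # value already in register
        out.append(ins)
    return out

code = ["STORE R1, x", "LOAD x, R1", "ADD R1, R2"]
print(peephole(code))   # ['STORE R1, x', 'ADD R1, R2']
```

Real peephole passes apply many such pattern rules (redundant moves, algebraic identities, unreachable jumps) repeatedly over a sliding window until nothing matches.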
GATE Focus Points
- Direction of analysis
- Meet operator choice
- GEN/KILL construction
- Mapping:
  - DFA → enables optimization
  - Optimization → improves performance
One-Line Memory Rule
- CFG + DFA = Global Optimization