interprocedural analysis · the gate before every memory optimization

What can this pointer actually point to?

Pointer & alias analysis decides whether two memory references can touch the same cell. Get it right and the optimizer is free to reorder, hoist, and forward. Get it wrong by one bit and you ship a miscompile.

This page is a debugger, not a blog post. Step the visualizations: watch points-to sets propagate, nodes coalesce, kills strike old arrows out, and the danger color flash the moment soundness breaks.

§ 01 — first principles

What aliasing is, and why it gates everything

Two expressions alias if they may denote the same storage location at runtime. *p and x alias if p may hold &x. Aliasing is a relation over access paths, derived from a points-to relation over pointers — "what objects can p point to?"

Watch a points-to graph form. Each statement adds a directed pointer edge; after p = q, p copies q's target and gains an arrow to b.

Points-to graph stepperlive memory

1p = &a;   // {a} ⊆ pts(p)
2q = &b;   // {b} ⊆ pts(q)
3p = q;    // pts(q) ⊆ pts(p)  [copy]

step 0 / 3

Press Step. pts(p) and pts(q) build up; after line 3, pts(p) = {a, b}.

Two flavors — the difference is the whole game

may-alias the pair might refer to the same location on some execution. An over-approximation; a "yes" can be a false positive. Used to forbid unsafe transforms — if two refs may alias, you cannot reorder them.

must-alias the pair definitely refers to the same location on every reaching execution. An under-approximation; a "yes" is a guarantee. Used to enable transforms — forward a stored value only if the pointers must-alias.

Mnemonic. may-alias is for safety (don't break things); must-alias is for opportunity (rewrite aggressively). Two pointers may-alias iff their points-to sets intersect. Empty intersection ⇒ provably no-alias ⇒ the optimizer is free.

May vs Must aliasset intersection

Why aliasing is the enabling/blocking analysis

Every transform that moves, removes, or reorders a memory op must first prove the locations are independent. Without alias info, the compiler must assume any store may clobber any load — which kills:

Optimization	What it needs alias info to prove
Dead store elimination	a store to `*p` is dead only if no later load may-alias it
Load/store forwarding	replace `y=q` with the stored value only if `p` must-alias* `q`
GVN / CSE over memory	two loads of `*p` match only if no intervening store may-alias `p`
Register promotion	promote `*p` to SSA only if `p` doesn't may-alias an escaping access
LICM of a load	hoist `*p` from a loop only if no store in the loop may-alias `p`
Vectorization	dependence between `a[i]` and `a[i-k]` proven absent or distant

Wrong alias info breaks a transform — the miscompile

This is the most important interaction on the page. The optimizer wants to forward S1's stored 1 into the load at L2. That is legal only if no intervening store may-alias p. Toggle the analysis assumption and the runtime pointer relationship, then play it.

The miscompile demosoundness

S1store 1, *p      // write through p
S2store 7, *q      // write through q
L2z = load *p   // read back through p

step 0 / 3

§ 02 — precision axes

The dimensions of precision

Analyses are classified by which axes of program structure they distinguish. Each axis costs more but buys precision: flow (order), context (callers), field (struct members), heap (object naming). They multiply — which is the scalability wall.

Flow-sensitivity — does statement order matter?

Flow-insensitive computes one global points-to map valid everywhere; an assignment only generates facts, never kills them. Flow-sensitive computes a solution per program point, tracking gen and kill — a re-assignment kills the old target (a strong update). Toggle the mode and watch line 2 strike the old p→a arrow.

Flow-sensitivity & strong update (kill)gen / kill

1p = &a;
2p = &b;   // flow-sensitive: KILLS p→a
3x = *p;   // what does *p alias?

step 0 / 3

Context-sensitivity — are different calls distinguished?

Context-insensitive analyzes a procedure once; all call sites merge. Context-sensitive specializes per calling context (clones per call site), so facts from caller A don't leak into caller B. With the identity function below, merging makes a spuriously may-alias y and b may-alias x.

Context-sensitivitycall-site cloning

1id(q) { return q; }   // identity
2a = id(&x);          // call site 1
3b = id(&y);          // call site 2

Field & heap, briefly

Field-sensitivity — field-insensitive treats o.f and o.g as one location (so p=o.f ⇒ pts(p)={x,y}); field-sensitive keeps them distinct (pts(p)={x}).

Heap modeling — the heap is unbounded, so objects are named by allocation site: every object from one new shares one abstract name. That merges all objects from a factory into one node; heap cloning (alloc-site + context) separates them.

§ 03 — the canonical tradeoff

Andersen vs Steensgaard

Both are flow-insensitive, whole-program analyses that reduce the program to four constraint forms, then solve. They differ only in the solving discipline: Andersen uses directional inclusion (subset) constraints; Steensgaard uses symmetric unification (equality), merging nodes with union-find.

The four constraint forms

Stmt	Name	Constraint
`a = &b`	address-of	`{b} ⊆ pts(a)`
`a = b`	copy	`pts(b) ⊆ pts(a)`
`a = *b`	load	`∀o∈pts(b): pts(o) ⊆ pts(a)`
`*a = b`	store	`∀o∈pts(a): pts(b) ⊆ pts(o)`

The load/store forms are dynamic: new points-to facts create new copy edges, so the solver iterates to a fixpoint. Step each constraint form below to see its dereference hops.

The four constraint formsdereference hops

Side-by-side: inclusion vs unification on five statements

Step both engines in lockstep. Andersen (left) keeps directionality — q never receives anything, so pts(q) stays {b}. Steensgaard (right) unifies on each copy: step 3 merges {p,q}, step 5 merges {p,q,r} into one node that points to {a,b,c} — so q spuriously gains a and c.

Andersen vs Steensgaard — lockstep stepperthe centerpiece

1p = &a;   // {a} ⊆ pts(p)
2q = &b;   // {b} ⊆ pts(q)
3p = q;    // pts(q) ⊆ pts(p)   [copy, ONE-WAY]
4r = &c;   // {c} ⊆ pts(r)
5r = p;    // pts(p) ⊆ pts(r)   [copy]

Andersen · inclusion · O(n³)

Steensgaard · unification · O(n·α(n))

step 0 / 5

Takeaway. Andersen distinguishes pts(q)={b} from pts(p)={a,b}; Steensgaard collapses both to {a,b,c}. Same soundness (both over-approximate), wildly different precision and cost. Steensgaard when you need whole-program points-to in near-linear time; Andersen when precision matters and n is moderate.

§ 04 — dragon book framing

Flow-insensitive Datalog + call-string contexts

The Dragon Book (12.4) frames pointer analysis as a flow-insensitive, inclusion-based Datalog program with two IDB relations: pts(V,H) (variable V may point to heap object H) and hpts(H,F,G) (field F of object H points to G — field-sensitive). The four rules, one per statement form, are the Datalog encoding of the inclusion constraints, solved by iterative evaluation.

Interprocedural, context-insensitive (12.5): model parameter passing and returns as copy statements, and discover the call graph on the fly — because you need points-to to compute call targets, but you need call targets to compute points-to. This mutual dependence is iterated to a joint fixpoint.

Context-sensitive (12.6): a context is a call string — the sequence of call-site locations on the stack. Recursion makes call strings infinite, so recursive SCCs are collapsed. Each method is cloned per context; with ~10¹⁴ contexts in real programs, results are stored as BDDs.

§ 05 — why it's hard

The interview thread

This is the perennial follow-up. Four pillars:

Pillar	The hard part
Undecidability	Precise may-alias with dynamic allocation and ≥2 dereference levels is undecidable (Landi 1992; Ramalingam 1994). Any sound analysis must over-approximate — there is no perfect answer, only a soundness/precision/cost frontier.
Call-graph ↔ points-to	You need points-to to resolve indirect/virtual calls, but the call graph to propagate points-to. Neither comes first; solve jointly to a fixpoint.
Heap / arrays / recursion	Arrays summarized to one element ⇒ `a[i]`/`a[j]` always may-alias unless a separate dependence test proves otherwise. Recursive structures defeat alloc-site naming.
Scalability wall	The product of all four axes is exponential. In practice, production analyses drop flow-sensitivity first (Andersen/Steensgaard are flow-insensitive).

The one-liner: aliasing is hard because precise answers are undecidable, the analysis is mutually recursive with the call graph, the heap/arrays/recursion force lossy abstraction, and the precision axes multiply — so every real analysis is a deliberate point on a soundness/precision/cost surface.

§ 06 — in practice

LLVM AliasAnalysis, TBAA, strict aliasing

LLVM AAResults. The core query alias(LocA, LocB) returns one of NoAlias MayAlias PartialAlias MustAlias. Implementations are chained in an AA pipeline: basic-aa, CFL-AA, globals-aa, scev-aa, and TBAA. Consumers — GVN, LICM, DSE, the loop vectorizer — all gate on these results.

TBAA (type-based). Uses the source language's type rules: in C/C++ an access of type T1 can't alias one of incompatible T2 (modulo the char* escape hatch). Encoded as !tbaa metadata — it proves float* and int* NoAlias without a points-to analysis.

Strict-aliasing UB. The standard makes it UB to access an object through an incompatible-type lvalue, so the compiler may assume type-incompatible pointers never alias — sound only because violating it is UB. Type-punning through a float* may be legally miscompiled. Mitigations: -fno-strict-aliasing, memcpy, char*/std::byte. The restrict qualifier is a programmer NoAlias assertion that enables vectorization on CPU and GPU.

three worked traces · diagnose depth

Run the examples

Each example pairs its exact code with a stepped visualization and a short read of what it shows. Use Play / Step / Reset, or the toggles.

example 01

Andersen vs Steensgaard on the same code

The single statement a = b is what diverges them: Andersen makes pts(a) a superset of pts(b) (one-way); Steensgaard makes them equal, so the later c = a drags b into {x,y,z}.

Inclusion vs unification — lockstep5 statements

1a = &x;   // address-of
2b = &y;   // address-of
3a = b;    // copy  [pts(b) ⊆ pts(a)]
4c = a;    // copy  [pts(a) ⊆ pts(c)]
5c = &z;   // address-of

Andersen · inclusion

Steensgaard · unification

step 0 / 5

What this shows. Final Andersen: pts(a)={x,y}, pts(b)={y}, pts(c)={x,y,z} — b never aliases x or z. Steensgaard collapses all three to {x,y,z}: b spuriously aliases x and z. Three spurious facts come purely from unification's symmetry.

example 02

Flow-sensitivity enables dead-store elimination

Is the store at line 2 dead? In this variant line 5 reads h, so the value written to g at line 2 is never read — but only a flow-sensitive analysis can prove it, because it separates the g-write from the h-write with a kill.

Dead-store elimination via strong updategen / kill

1p = &g;
2*p = 1;        // store — dead?
3p = &h;       // flow-sens: KILLS p→g
4*p = 2;        // store
5y = h;        // reads h only

step 0 / 5

What this shows. Flow-insensitive: pts(p)={g,h}, no kill — it can't prove line 2 dead. Flow-sensitive: line 3 kills p→g, so line 2 writes g, line 4 writes h, and line 5 reads only h ⇒ line 2's write to g is never read ⇒ dead, eliminate it. A strong update (kill) is sound only when the pointer must point to exactly one object.

example 03

Aliasing blocks LICM

We want to hoist the loop-invariant load t = *p to the preheader, so we load once instead of n times. Legality hinges entirely on alias(p, q). Toggle it.

Hoist a load out of a loopLICM

step 0 / 3

What this shows. If p NoAlias q, the store *q=i can't change *p ⇒ the load is loop-invariant ⇒ hoist is legal (n loads → 1). If p MayAlias q, the store may clobber *p each iteration ⇒ not invariant ⇒ hoist illegal, reload every iteration. Real compilers get the NoAlias from distinct objects, TBAA types, or restrict.

Pointer & alias analysis · diagnose depth · the analysis that gates every memory optimization.

Built as a memory/dataflow debugger. Colors encode meaning — see the legend up top. All semantics traced from the lesson.

What can this pointer actually point to?

What aliasing is, and why it gates everything

Two flavors — the difference is the whole game

Why aliasing is the enabling/blocking analysis

Wrong alias info breaks a transform — the miscompile

The dimensions of precision

Flow-sensitivity — does statement order matter?

Context-sensitivity — are different calls distinguished?

Field & heap, briefly

Andersen vs Steensgaard

The four constraint forms

Side-by-side: inclusion vs unification on five statements

Flow-insensitive Datalog + call-string contexts

The interview thread

LLVM AliasAnalysis, TBAA, strict aliasing

Run the examples

Andersen vs Steensgaard on the same code

Flow-sensitivity enables dead-store elimination

Aliasing blocks LICM

Test the model in your head

Sit the interview