Skip to main content

Reaching Definitions Analysis

Reaching definitions analysis is a fundamental forward data flow analysis that tracks which assignments (definitions) of variables may reach each point in a program. It answers the critical question: "Where could this variable's value have come from?"

Understanding reaching definitions

Imagine tracking packages through a delivery network. When a package arrives at your door, you want to know all the possible distribution centers it might have passed through. Similarly, when you use a variable, reaching definitions tells you all the assignments that could have provided its current value.

Core concepts

A definition "reaches" a program point if there's an execution path from the definition to that point where the variable isn't redefined.

int x = 5;      // Definition D1
int y = 10; // Definition D2
if (condition) {
x = 15; // Definition D3
}
// At this point:
// - D2 reaches (y = 10)
// - Either D1 or D3 reaches (depending on condition)
print(x); // x could be 5 or 15

The mathematics behind it

Formal definition

For each program point, we track a set of definitions that reach it:

  • GEN[B]: Definitions generated in basic block B
  • KILL[B]: Definitions killed (overwritten) in B
  • IN[B]: Definitions reaching the entry of B
  • OUT[B]: Definitions reaching the exit of B

Data flow equations

This is a forward analysis with these equations.

IN[B] = ∪ OUT[P] for all predecessors P of B
OUT[B] = GEN[B] ∪ (IN[B] - KILL[B])

The analysis uses union because a definition reaches if it reaches along any path (may analysis).

Implementation in OpenRewrite

Here's how reaching definitions analysis is implemented.

public class ReachingDefinitionsAnalysis extends ForwardDataFlowAnalysis<Definition, ReachingDefinitions> {

@Override
protected Set<Definition> transfer(BasicBlock block, Set<Definition> inputDefs) {
Set<Definition> result = new HashSet<>(inputDefs);

// Process each statement in the block
for (Tree stmt : block.getStatements()) {
if (isDefinition(stmt)) {
Definition newDef = createDefinition(stmt);

// KILL: Remove previous definitions of the same variable
result.removeIf(def -> def.getVariable().equals(newDef.getVariable()));

// GEN: Add this new definition
result.add(newDef);
}
}

return result;
}

@Override
protected Set<Definition> merge(List<Set<Definition>> inputs) {
// Union all reaching definitions from predecessors
Set<Definition> result = new HashSet<>();
for (Set<Definition> input : inputs) {
result.addAll(input);
}
return result;
}

@Override
protected ReachingDefinitions createResult(Map<BasicBlock, Set<Definition>> analysisResult) {
return new ReachingDefinitions(cfg, analysisResult);
}
}

Working with reachingdefinitions results

The ReachingDefinitions result type provides rich querying capabilities:

Basic queries

ReachingDefinitionsAnalysis analysis = new ReachingDefinitionsAnalysis(cfg);
ReachingDefinitions reachingDefs = analysis.analyze();

// Get all definitions reaching a specific use
J.Identifier use = ...; // Variable use
Set<Definition> definitions = reachingDefs.getReachingDefinitions(use);

// Check if a variable might be uninitialized
boolean mightBeUninitialized = definitions.isEmpty();

// Find all possible values
Set<Expression> possibleValues = definitions.stream()
.map(Definition::getValue)
.collect(Collectors.toSet());

Advanced queries

// Find uses with multiple reaching definitions (join points)
List<J.Identifier> ambiguousUses = reachingDefs.findAmbiguousUses();

// Get definition-use chains
Map<Definition, Set<J.Identifier>> defUseChains = reachingDefs.getDefUseChains();

// Find dead definitions (defined but never used)
Set<Definition> deadDefs = reachingDefs.findDeadDefinitions();

// Check if definitions are "killed" before use
boolean isKilled = reachingDefs.isDefinitionKilled(definition, programPoint);

Common applications

Uninitialized variable detection

Find variables used before initialization.

public class UninitializedVariableDetector extends Recipe {
@Override
public TreeVisitor<?, ExecutionContext> getVisitor() {
return new JavaIsoVisitor<ExecutionContext>() {
@Override
public J.Identifier visitIdentifier(J.Identifier ident, ExecutionContext ctx) {
if (isVariableUse(ident)) {
Boolean uninitialized = ControlFlowSupport.analyze(getCursor(), false,
(cursor, cfg) -> {
ReachingDefinitionsAnalysis analysis = new ReachingDefinitionsAnalysis(cfg);
ReachingDefinitions reachingDefs = analysis.analyze();

Set<Definition> defs = reachingDefs.getReachingDefinitions(ident);
return defs.isEmpty(); // No definitions reach this use
});

if (Boolean.TRUE.equals(uninitialized)) {
return SearchResult.found(ident,
"Variable '" + ident.getSimpleName() + "' may be uninitialized");
}
}
return ident;
}
};
}
}

Constant propagation

Replace variable uses with constant values when possible.

public class ConstantPropagation {
public void propagateConstants(J.MethodDeclaration method) {
ControlFlowSupport.analyze(getCursor(), method,
(cursor, cfg) -> {
ReachingDefinitions reachingDefs = new ReachingDefinitionsAnalysis(cfg).analyze();

// Find uses with single constant definition
for (J.Identifier use : findAllVariableUses(method)) {
Set<Definition> defs = reachingDefs.getReachingDefinitions(use);

if (defs.size() == 1) {
Definition def = defs.iterator().next();
if (isConstant(def.getValue())) {
replaceWithConstant(use, def.getValue());
}
}
}

return method;
});
}
}

Def-use chain analysis

Understanding data dependencies.

public class DataDependencyAnalyzer {
public DependencyGraph buildDataDependencies(ControlFlowGraph cfg) {
ReachingDefinitions reachingDefs = new ReachingDefinitionsAnalysis(cfg).analyze();
DependencyGraph graph = new DependencyGraph();

// Build def-use edges
Map<Definition, Set<J.Identifier>> defUse = reachingDefs.getDefUseChains();
for (Map.Entry<Definition, Set<J.Identifier>> entry : defUse.entrySet()) {
Definition def = entry.getKey();
for (J.Identifier use : entry.getValue()) {
graph.addDataDependency(def.getStatement(), use);
}
}

return graph;
}
}

Advanced patterns

Copy propagation

Eliminate unnecessary variable copies.

// Before: x = y; z = x + 1;
// After: x = y; z = y + 1;

public void propagateCopies(ReachingDefinitions reachingDefs) {
for (J.Identifier use : findVariableUses()) {
Set<Definition> defs = reachingDefs.getReachingDefinitions(use);

// If all reaching definitions are copies from the same variable
if (allCopiesFromSame(defs)) {
String sourceVar = getCommonSource(defs);
replaceUse(use, sourceVar);
}
}
}

Available expressions

Combine with expression analysis.

public class AvailableExpressions {
private final ReachingDefinitions reachingDefs;

public boolean isExpressionAvailable(J.Binary expr, ProgramPoint point) {
// Check if operands have the same reaching definitions
Set<Definition> leftDefs = reachingDefs.getReachingDefinitions(expr.getLeft());
Set<Definition> rightDefs = reachingDefs.getReachingDefinitions(expr.getRight());

// Expression is available if computed earlier with same operand values
return findEarlierComputation(expr, leftDefs, rightDefs, point) != null;
}
}

Type inference

Use reaching definitions for type refinement.

public class TypeRefinement {
public JavaType refineType(J.Identifier var, ReachingDefinitions reachingDefs) {
Set<Definition> defs = reachingDefs.getReachingDefinitions(var);

if (defs.isEmpty()) {
return var.getType(); // No refinement possible
}

// Find most specific common type
JavaType refined = null;
for (Definition def : defs) {
JavaType defType = getDefinitionType(def);
refined = refined == null ? defType : commonSupertype(refined, defType);
}

return refined;
}
}

Performance optimization

Sparse representations

For large methods, use sparse representations.

public class SparseReachingDefinitions {
// Only track definitions at points where they're used
private final Map<VariableUse, Set<Definition>> sparse;

public Set<Definition> query(VariableUse use) {
return sparse.computeIfAbsent(use, this::computeReaching);
}
}

Incremental updates

Update reaching definitions incrementally when code changes.

public class IncrementalReachingDefs {
public void updateAfterEdit(Edit edit) {
BasicBlock affected = findAffectedBlock(edit);

// Only recompute from affected block
Set<BasicBlock> toUpdate = new HashSet<>();
toUpdate.add(affected);

// Add all blocks reachable from affected
Queue<BasicBlock> worklist = new LinkedList<>();
worklist.add(affected);

while (!worklist.isEmpty()) {
BasicBlock current = worklist.poll();
for (BasicBlock succ : cfg.getSuccessors(current)) {
if (toUpdate.add(succ)) {
worklist.add(succ);
}
}
}

// Recompute only affected blocks
recomputeBlocks(toUpdate);
}
}

Integration with other analyses

With liveness analysis

Combine for more precise dead code detection.

public Set<Definition> findTrulyDeadDefinitions(
ReachingDefinitions reachingDefs,
LiveVariables liveVars) {

Set<Definition> dead = new HashSet<>();

for (Definition def : reachingDefs.getAllDefinitions()) {
String var = def.getVariable();

// Dead if variable not live after definition
if (!liveVars.isLive(var, def.getStatement())) {
dead.add(def);
continue;
}

// Also dead if immediately redefined
Set<Definition> reachingNext = reachingDefs.getReachingDefinitions(
getNextUse(var, def.getStatement()));
if (!reachingNext.contains(def)) {
dead.add(def);
}
}

return dead;
}

With taint analysis

Track taint propagation through definitions.

public class TaintTracking {
public Set<Definition> findTaintedDefinitions(
ReachingDefinitions reachingDefs,
Set<Expression> taintSources) {

Set<Definition> tainted = new HashSet<>();
Queue<Definition> worklist = new LinkedList<>();

// Start with definitions from taint sources
for (Expression source : taintSources) {
worklist.addAll(findDefinitionsFrom(source));
}

// Propagate through def-use chains
while (!worklist.isEmpty()) {
Definition def = worklist.poll();
if (tainted.add(def)) {
// Find uses of this definition
for (J.Identifier use : reachingDefs.getUses(def)) {
// Find definitions that use this tainted value
worklist.addAll(findDefinitionsUsing(use));
}
}
}

return tainted;
}
}

Common pitfalls

Aliasing

Multiple variables can refer to the same memory.

int[] a = new int[10];
int[] b = a; // a and b alias
b[0] = 5; // Also defines a[0]

Field definitions

Field assignments need special handling.

class Container {
int value;

void method() {
this.value = 10; // Defines this.value
Container c = this;
c.value = 20; // Also defines this.value!
}
}

Control flow complexity

Some definitions are conditional.

int x;  // Declaration without initialization
if (checkCondition()) {
x = 10;
} else if (otherCondition()) {
x = 20;
}
// x might still be uninitialized here!

Testing reaching definitions

Always test with complex patterns.

@Test
void testMultipleDefinitions() {
rewriteRun(
java("""
class Test {
void method(boolean flag) {
int x = 1; // D1
if (flag) {
x = 2; // D2
} else {
x = 3; // D3
}
use(x); // Reached by D2 or D3, not D1
}
}
""")
);

// Verify that use of x has definitions D2 and D3 reaching
}

Next steps