Moderne Only: This recipe is proprietary to Moderne and runs on the Moderne platform or CLI; it isn't part of the open-source catalog. Available with a Moderne subscription.

Track data lineage

Recipe ID: org.openrewrite.analysis.java.datalineage.TrackDataLineage
Artifact: io.moderne.recipe:rewrite-program-analysis

Tracks the flow of data from database sources to API sinks to understand data dependencies and support compliance requirements.

Prerequisites for detecting a data flow

All of the following conditions must be met for the recipe to report a flow:

  1. The source code must contain at least one method call matching a recognized source (see below).
  2. The source code must contain at least one method call matching a recognized sink (see below).
  3. The tainted data must propagate from the source to the sink through variable assignments within the same method or via fields across methods in the same compilation unit.
  4. No flow breaker (see below) may appear on the path between source and sink.
  5. The relevant library types (e.g., java.sql.ResultSet, javax.ws.rs.core.Response) must be on the classpath so that OpenRewrite can resolve types. If types are unresolved, method matchers will not trigger and no flows will be detected.
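
Condition 3 can be illustrated with a small sketch in which a value read from a database is stored in a field by one method and written out by another method of the same class. The class name `CustomerExport` and its methods are hypothetical. Whether the recipe reports this exact flow depends on which `ResultSet` and `PrintWriter` methods it matches, which the lists below do not spell out, so treat this as the shape of a qualifying flow rather than a guaranteed result.

```java
import java.io.PrintWriter;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical class: source and sink live in different methods,
// connected only by a field in the same compilation unit.
class CustomerExport {
    // Field that carries the value from load() to write().
    private String email;

    void load(ResultSet rs) throws SQLException {
        // Candidate source: a read from java.sql.ResultSet (JDBC category).
        this.email = rs.getString("email");
    }

    void write(PrintWriter out) {
        // Candidate sink: a write to java.io.PrintWriter (Java I/O category).
        out.println(email);
    }
}
```

If `load` and `write` were moved into classes in separate source files, the flow would no longer stay within one compilation unit, and condition 3 suggests it would not be reported.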

Recognized sources (database reads)

| Category | Classes |
| --- | --- |
| JDBC | java.sql.ResultSet |
| JPA (javax) | javax.persistence.EntityManager, Query, TypedQuery |
| JPA (jakarta) | jakarta.persistence.EntityManager, Query, TypedQuery |
| Hibernate | org.hibernate.Session, org.hibernate.query.Query |
| Spring Data | org.springframework.data.repository.CrudRepository |
| Spring JDBC | org.springframework.jdbc.core.JdbcTemplate |
| MyBatis | org.apache.ibatis.session.SqlSession, org.mybatis.spring.SqlSessionTemplate |
| MongoDB | com.mongodb.client.MongoCollection, org.springframework.data.mongodb.core.MongoTemplate |
| Redis | redis.clients.jedis.Jedis, org.springframework.data.redis.core.RedisTemplate, ValueOperations, HashOperations |
| Cassandra | com.datastax.driver.core.Session, org.springframework.data.cassandra.core.CassandraTemplate |
| Elasticsearch | org.elasticsearch.client.RestHighLevelClient, org.springframework.data.elasticsearch.core.ElasticsearchTemplate |
| Heuristic | Any class with Repository, Dao, or Mapper in its name calling methods starting with find, get, query, search, load, fetch, or select |
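
The heuristic row can be read as a simple name check. The sketch below is an illustrative re-implementation of that description, not the recipe's actual code: the class `SourceHeuristic`, the method `looksLikeDatabaseRead`, and the choice of case-sensitive matching on the simple class name are all assumptions.

```java
import java.util.List;

// Hypothetical helper that mirrors the heuristic as described in the table.
class SourceHeuristic {
    private static final List<String> TYPE_HINTS =
            List.of("Repository", "Dao", "Mapper");
    private static final List<String> METHOD_PREFIXES =
            List.of("find", "get", "query", "search", "load", "fetch", "select");

    // True when the class name contains one of the hints AND the method
    // name starts with one of the read-style prefixes (case-sensitive,
    // which is an assumption).
    static boolean looksLikeDatabaseRead(String simpleClassName, String methodName) {
        boolean typeMatches = TYPE_HINTS.stream().anyMatch(simpleClassName::contains);
        boolean methodMatches = METHOD_PREFIXES.stream().anyMatch(methodName::startsWith);
        return typeMatches && methodMatches;
    }
}
```

Under these assumptions, `UserRepository.findById` and `AccountMapper.selectAll` would count as sources, while `OrderDao.save` (no read-style prefix) and `UserService.findById` (no matching class name) would not.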

Recognized sinks (API responses)

| Category | Classes |
| --- | --- |
| JAX-RS (javax) | javax.ws.rs.core.Response, Response.ResponseBuilder |
| JAX-RS (jakarta) | jakarta.ws.rs.core.Response, Response.ResponseBuilder |
| Spring MVC | org.springframework.http.ResponseEntity, ResponseEntity.BodyBuilder |
| Servlet (javax) | javax.servlet.http.HttpServletResponse, javax.servlet.ServletOutputStream |
| Servlet (jakarta) | jakarta.servlet.http.HttpServletResponse, jakarta.servlet.ServletOutputStream |
| Java I/O | java.io.PrintWriter, java.io.Writer, java.io.OutputStream |
| Jackson | com.fasterxml.jackson.databind.ObjectMapper, com.fasterxml.jackson.core.JsonGenerator |
| Gson | com.google.gson.Gson, com.google.gson.JsonWriter |
| GraphQL | graphql.schema.DataFetcher, graphql.schema.PropertyDataFetcher |
| Spring WebFlux | ServerResponse, reactor.core.publisher.Mono, reactor.core.publisher.Flux |
| gRPC | io.grpc.stub.StreamObserver |
| WebSocket | javax.websocket.Session, RemoteEndpoint.Basic, jakarta.websocket.*, org.springframework.web.socket.WebSocketSession |

Flow breakers

Flows are broken by methods matching common sanitization patterns (anonymize, redact, mask, encrypt, hash, sanitize, etc.) or authorization checks (isAuthorized, hasPermission, hasRole, etc.).
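
As a rough illustration of a flow breaker, the sketch below places a masking step between a JDBC read and a write to a `PrintWriter`. The class `MaskedExport` and the helper `maskEmail` are hypothetical, and it is an assumption that a method named `maskEmail` matches the recipe's "mask" pattern; the exact matching rules are not documented here.

```java
import java.io.PrintWriter;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical example of a sanitizer sitting on the source-to-sink path.
class MaskedExport {
    // Keeps the first character and the domain, hides the rest.
    // The name contains "mask", which is assumed to match a flow-breaker pattern.
    static String maskEmail(String email) {
        int at = email.indexOf('@');
        if (at <= 1) {
            return "***"; // too short to mask partially, or not an email
        }
        return email.charAt(0) + "***" + email.substring(at);
    }

    static void export(ResultSet rs, PrintWriter out) throws SQLException {
        String email = rs.getString("email"); // candidate source
        String safe = maskEmail(email);       // assumed flow breaker
        out.println(safe);                    // candidate sink
    }
}
```

If the assumption holds, writing `safe` would not be reported, whereas writing `email` directly would leave the path from source to sink unbroken.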

Single recipe · OpenRewrite · Moderne Proprietary License

Examples

Before (Java)

    import java.sql.ResultSet;
    import javax.ws.rs.core.Response;

    class UserController {
        public Response getUser(String id, ResultSet rs) throws Exception {
            String name = rs.getString("name");
            String email = rs.getString("email");

            User user = new User(name, email);
            return Response.ok(user).build();
        }

        class User {
            String name, email;
            User(String n, String e) { name = n; email = e; }
        }
    }

After

    import java.sql.ResultSet;
    import javax.ws.rs.core.Response;

    class UserController {
        public Response getUser(String id, ResultSet rs) throws Exception {
            String name = rs.getString("name");
            String email = rs.getString("email");

            User user = new User(name, email);
            return /*~~(DATA_LINEAGE use)~~>*/Response.ok(user).build();
        }

        class User {
            String name, email;
            User(String n, String e) { name = n; email = e; }
        }
    }

Usage

Run this recipe

This recipe has no required configuration options. Users of Moderne can run it via the Moderne CLI.

You will need to have configured the Moderne CLI on your machine before you can run the following command.

    mod run . --recipe TrackDataLineage

If the recipe is not available locally, then you can install it using:

    mod config recipes jar install io.moderne.recipe:rewrite-program-analysis:0.13.1

Data tables

Taint flow
org.openrewrite.analysis.java.taint.table.TaintFlowTable

Records taint flows from sources to sinks with their taint types.

| Column | Description |
| --- | --- |
| Source file | The source file that the method call occurred in. |
| Source line | The line number where the taint source is located. |
| Source | The source code where taint originates. |
| Sink line | The line number where the taint sink is located. |
| Sink | The sink code where taint flows to. |
| Taint type | The taint type that matched at the sink. |

Source files that had results
org.openrewrite.table.SourcesFileResults

Source files that were modified by the recipe run.

| Column | Description |
| --- | --- |
| Source path before the run | The source path of the file before the run. null when a source file was created during the run. |
| Source path after the run | A recipe may modify the source path. This is the path after the run. null when a source file was deleted during the run. |
| Parent of the recipe that made changes | In a hierarchical recipe, the parent of the recipe that made a change. Empty if this is the root of a hierarchy or if the recipe is not hierarchical at all. |
| Recipe that made changes | The specific recipe that made a change. |
| Estimated time saving | An estimate of the effort a developer would need to make the fix manually instead of using this recipe, in seconds. |
| Cycle | The recipe cycle in which the change was made. |

Source files that had search results
org.openrewrite.table.SearchResults

Search results that were found during the recipe run.

| Column | Description |
| --- | --- |
| Source path of search result before the run | The source path of the file with the search result markers present. |
| Source path of search result after the run | A recipe may modify the source path. This is the path after the run. null when a source file was deleted during the run. |
| Result | The trimmed printed tree of the LST element that the marker is attached to. |
| Description | The content of the description of the marker. |
| Recipe that added the search marker | The specific recipe that added the Search marker. |

Source files that errored on a recipe
org.openrewrite.table.SourcesFileErrors

The details of all errors produced by a recipe run.

| Column | Description |
| --- | --- |
| Source path | The file that failed to parse. |
| Recipe that made changes | The specific recipe that made a change. |
| Stack trace | The stack trace of the failure. |

Recipe performance
org.openrewrite.table.RecipeRunStats

Statistics used in analyzing the performance of recipes.

| Column | Description |
| --- | --- |
| The recipe | The recipe whose stats are being measured both individually and cumulatively. |
| Source file count | The number of source files the recipe ran over. |
| Source file changed count | The number of source files which were changed in the recipe run. Includes files created, deleted, and edited. |
| Cumulative scanning time (ns) | The total time spent across the scanning phase of this recipe. |
| Max scanning time (ns) | The max time scanning any one source file. |
| Cumulative edit time (ns) | The total time spent across the editing phase of this recipe. |
| Max edit time (ns) | The max time editing any one source file. |