Lot Genealogy at Scale: Graph Database Patterns for Traceability

Lot genealogy is the record of which input lots went into which output lots, through every transformation. Regulations like FDA 21 CFR Part 11 and FSMA 204 require you to trace it both ways: forward (where did lot X end up) and backward (what went into lot Y). Model that in relational tables and recall queries get slow and recursive. Model it as a graph and it becomes natural and fast, which on a recall is exactly what you need.

Why genealogy is a graph

Lots split, merge, and transform: a raw material lot is consumed across several work orders; a finished lot blends several inputs. That is a directed acyclic graph, lots are nodes, “consumed into” relationships are edges. The questions regulators ask are graph traversals:

Backward trace (recall a finished lot): walk edges upstream to every input lot and supplier.
Forward trace (a raw lot is contaminated): walk downstream to every finished lot and shipment it reached.

Why relational struggles, and graph fits

In SQL, multi-level genealogy is recursive joins (or WITH RECURSIVE) that degrade as depth and volume grow; a deep recall can fan out across millions of rows. A graph database stores the relationships as first-class edges, so a traversal follows pointers instead of joining tables, and the query reads like the question:

// illustrative (Cypher-style): everything a contaminated raw lot reached
MATCH (raw:Lot {id: $lotId})-[:CONSUMED_INTO*1..]->(downstream:Lot)
RETURN downstream

The *1.. is the variable-depth traversal that is painful in SQL and idiomatic in a graph.

Patterns that keep it fast and compliant

Model lots and transformations explicitly. Nodes for lots, edges for consumption/production, with quantities and timestamps on the edges.
Make it append-only and audited. 21 CFR Part 11 wants tamper-evident, attributable records, so genealogy edges should be immutable once written, the same posture as an event-sourced ledger.
Index the entry points (lot id, supplier, ship-to) so a recall starts instantly, then traverses.
Bound the blast radius in the query, so a forward trace returns the affected shipments and customers directly, which is the FSMA 204 “one step forward” requirement at speed.

The takeaway

Lot genealogy is a graph, so store it as one. A graph database turns multi-level forward and backward traces from slow recursive joins into fast traversals, and an append-only, audited model satisfies the tamper-evidence regulators expect. When a recall hits, the difference between a graph and a pile of recursive SQL is hours versus minutes, which is the whole point of traceability.

Implementing this at your scale?

The walkthrough above comes from production work. AvanSaber’s inventory practice has implemented variations of this pattern across multiple customer engagements.

If you are building this and want expert review of your design, or would rather have the team that built this build yours, book a discovery conversation or describe your situation at [email protected].

See our Implementation engagement model