Lineage and correlation
Lineage is the forensic record of what happened in a pipeline. One row per message per adapter invocation. Input hash, output hash, status, error payload (if failed), timestamps, correlation id. Stored in PostgreSQL, queried via the API, browsed via the CLI.
What a lineage row captures
    id            UUID
    execution_id  UUID            -- which execution
    route_id      string          -- which route
    adapter_name  string          -- which adapter
    parent_id     UUID|null       -- correlation: the previous row in this chain
    status        enum            -- pending | completed | failed | cancelled
    input_hash    string          -- hash of the inbound message
    output_hash   string|null
    error         JSON|null       -- exception class + message + traceback (if failed)
    started_at    timestamp
    completed_at  timestamp|null
    batch_id      UUID|null       -- if part of a batch
    metadata      JSON            -- adapter-specific extras

A message that flows through three adapters in a route produces three lineage rows, linked parent-to-child.
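To make the parent-to-child linking concrete, here is a minimal sketch of the row shape as a Python dataclass, with one message passing through three adapters. The field names come from the listing above; the Python types, the `run_route` helper, the route id, and the adapter names are assumptions for illustration, not Factflow's actual code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class LineageRow:
    # Field names follow the row listing; the Python types are assumed.
    execution_id: UUID
    route_id: str
    adapter_name: str
    input_hash: str
    parent_id: Optional[UUID] = None
    status: str = "pending"  # pending | completed | failed | cancelled
    output_hash: Optional[str] = None
    error: Optional[dict] = None
    completed_at: Optional[datetime] = None
    batch_id: Optional[UUID] = None
    metadata: dict = field(default_factory=dict)
    id: UUID = field(default_factory=uuid4)
    started_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def run_route(adapters: list) -> list:
    """Hypothetical helper: emit one row per adapter invocation, chained by parent_id."""
    execution_id = uuid4()
    rows = []
    parent_id = None
    for name in adapters:
        row = LineageRow(
            execution_id=execution_id,
            route_id="demo-route",          # made-up route id
            adapter_name=name,
            input_hash=f"hash-in-{name}",   # placeholder, not a real hash
            parent_id=parent_id,
        )
        rows.append(row)
        parent_id = row.id  # the next row points back at this one
    return rows


rows = run_route(["fetch", "parse", "embed"])  # made-up adapter names
```

The first row has `parent_id=None`, and each later row carries the id of the row before it.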
Correlation: parent_id as the backbone
Lineage is a DAG over messages. Every child row carries its parent’s id. This makes two queries trivial:
- Forward trace: starting at message X, find every descendant (transitive closure over parent_id).
- Backward trace: starting at a failed message, find the full ancestor chain up to the route’s initial message.
The factflow lineage chain <id> command returns both directions.
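Both traces reduce to simple walks over parent_id. The sketch below uses an in-memory mapping as a stand-in for the lineage table; the function names and the toy data are hypothetical, and the real service presumably runs the equivalent query against PostgreSQL.

```python
from collections import defaultdict

# Toy lineage table: row id -> parent_id (None marks a route's initial message).
PARENT_OF = {"a": None, "b": "a", "c": "b", "d": "b", "e": None}


def backward_trace(row_id, parent_of):
    """Return the ancestor chain, nearest parent first."""
    chain = []
    current = parent_of[row_id]
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    return chain


def forward_trace(row_id, parent_of):
    """Return every descendant (transitive closure over parent_id)."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)
    found, stack = [], [row_id]
    while stack:
        node = stack.pop()
        for child in children[node]:
            found.append(child)
            stack.append(child)
    return found


ancestors_of_c = backward_trace("c", PARENT_OF)
descendants_of_a = forward_trace("a", PARENT_OF)
```

Because each row stores a single parent_id, the backward walk is a plain loop, while the forward walk needs an index from parent to children.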
Commits independently
Lineage writes happen out of band from the pipeline. The pipeline publishes an async write; lineage commits it on a separate connection. Consequence: a lineage DB hiccup never fails the pipeline, and a pipeline crash still records the lineage up to the point of crash.
This is why you can debug a broken pipeline by reading lineage even when the orchestrator is fatally stuck — the lineage trail is independent.
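One way to picture the out-of-band write is a queue drained by a separate task. This is a sketch under assumptions: `LineageWriter`, its `publish` method, and the list standing in for the database are all hypothetical, and the `fail` flag only simulates a lineage DB outage.

```python
import asyncio


class LineageWriter:
    """Hypothetical writer: the pipeline enqueues rows, a separate task commits them."""

    def __init__(self, store, fail=False):
        self.queue = asyncio.Queue()
        self.store = store  # a list standing in for the lineage database
        self.fail = fail    # simulate a lineage DB outage

    def publish(self, row):
        # Called from the pipeline: enqueue and return immediately.
        self.queue.put_nowait(row)

    async def run(self):
        while True:
            row = await self.queue.get()
            try:
                if self.fail:
                    raise ConnectionError("lineage DB unavailable")
                self.store.append(row)
            except ConnectionError:
                pass  # a real writer would buffer or retry; the pipeline never sees this
            finally:
                self.queue.task_done()


async def pipeline(writer):
    results = []
    for step in ("fetch", "parse", "embed"):  # made-up adapter names
        writer.publish({"adapter_name": step, "status": "completed"})
        results.append(step)
    return results


async def demo(fail):
    store = []
    writer = LineageWriter(store, fail=fail)
    task = asyncio.create_task(writer.run())
    results = await pipeline(writer)
    await writer.queue.join()  # wait until every queued row was handled
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return results, store
```

With `fail=True` the pipeline still returns all three results; only the stored lineage is missing.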
Pending children race (and why it’s fixed)
Early versions of the lineage service had a race: a parent row could be marked completed before its children registered. A naive query “is this row done?” would return yes, but children were still pending.
Fix: children pre-register before the parent’s status transition. The pending_children count on the parent is incremented when a child is about to be written, decremented when the child finishes. A parent is only “truly” done when status=completed AND pending_children=0.
The API and CLI respect this. factflow lineage chain returns completion states that honour pending children.
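The completion rule can be sketched in a few lines. The `status` and `pending_children` names come from the text above; the `Row` class and the helper functions are hypothetical and ignore the locking a real service would need.

```python
from dataclasses import dataclass


@dataclass
class Row:
    id: str
    status: str = "pending"
    pending_children: int = 0


def register_child(parent: Row) -> None:
    # Pre-registration: runs before the child row is written.
    parent.pending_children += 1


def finish_child(parent: Row) -> None:
    parent.pending_children -= 1


def is_truly_done(row: Row) -> bool:
    return row.status == "completed" and row.pending_children == 0


parent = Row(id="parent")
register_child(parent)         # the child announces itself first
parent.status = "completed"    # the parent's own work finishes

naive_done = parent.status == "completed"  # the old, racy check
done_before_child = is_truly_done(parent)

finish_child(parent)
done_after_child = is_truly_done(parent)
```

The naive check reports the parent as done while a child is still outstanding; the combined check does not.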
Failure isolation
A failure while recording lineage for one adapter never cascades. If lineage itself has issues (DB down, pool exhausted), the pipeline continues; rows are queued in an in-memory buffer until the DB returns. On catastrophic loss, the buffer is flushed to a structured log so operators can reconstruct the lineage after the fact.
Compare with systems where observability failures halt the main flow — Factflow explicitly decouples them.
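The buffering behaviour might look roughly like the sketch below. `BufferedLineageSink`, the `lineage_row_unwritten` event name, and the JSON log format are assumptions made for illustration; the source only states that rows are buffered in memory and flushed to a structured log on catastrophic loss.

```python
import json
from collections import deque


class BufferedLineageSink:
    """Hypothetical sink: buffer rows while the DB is down, never raise to the pipeline."""

    def __init__(self, write_row, log_line):
        self._write_row = write_row  # callable; raises ConnectionError when the DB is down
        self._log_line = log_line    # callable taking one structured log line
        self._buffer = deque()

    def record(self, row):
        self._buffer.append(row)
        self.drain()

    def drain(self):
        # Write buffered rows in order; stop quietly at the first failure.
        while self._buffer:
            try:
                self._write_row(self._buffer[0])
            except ConnectionError:
                return
            self._buffer.popleft()

    def dump_to_log(self):
        # Last resort: emit whatever is still buffered as JSON lines.
        while self._buffer:
            row = self._buffer.popleft()
            self._log_line(
                json.dumps({"event": "lineage_row_unwritten", "row": row}, sort_keys=True)
            )


state = {"db_up": False}
written, log_lines = [], []


def write_row(row):
    if not state["db_up"]:
        raise ConnectionError("lineage DB down")
    written.append(row)


sink = BufferedLineageSink(write_row, log_lines.append)
sink.record({"id": "r1"})  # DB is down: buffered, no exception
sink.record({"id": "r2"})
state["db_up"] = True
sink.drain()               # DB is back: both rows are written in order
```

If the database never comes back, `dump_to_log()` turns the remaining buffer into log lines that can be replayed later.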
Querying lineage
Over the API
- GET /api/v1/lineage?execution_id=...&status=failed - filter + paginate
- GET /api/v1/lineage/{id}/chain - forward + backward chain
- GET /api/v1/lineage/{id}/children
- GET /api/v1/lineage/failures - most recent failures across executions
- GET /api/v1/lineage/stats - summary counters per execution
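As a small illustration, the helpers below build request URLs for two of these endpoints. The paths and the `execution_id` and `status` query parameters come from the list above; the base URL and the function names are assumptions, and authentication and pagination parameters are left out because the source does not name them.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumed address; adjust for your deployment


def lineage_list_url(execution_id, status=None, base_url=BASE_URL):
    """URL for the filtered lineage listing."""
    params = {"execution_id": execution_id}
    if status is not None:
        params["status"] = status
    return f"{base_url}/api/v1/lineage?{urlencode(params)}"


def lineage_chain_url(row_id, base_url=BASE_URL):
    """URL for the forward + backward chain of one lineage row."""
    return f"{base_url}/api/v1/lineage/{row_id}/chain"


failed_rows_url = lineage_list_url("abc", status="failed")
chain_url = lineage_chain_url("1234")
```

Any HTTP client can then fetch these URLs, for example `urllib.request.urlopen` from the standard library.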
Over the CLI
    factflow lineage list --execution ID
    factflow lineage chain MSG_ID
    factflow lineage children MSG_ID
    factflow lineage failures
    factflow lineage stats --execution ID
Typical debugging flow
- An execution fails: factflow execution get ID shows status=failed
- factflow lineage failures --execution ID - find the adapter that tripped it
- factflow lineage chain MSG_ID - walk back to the original input
- factflow storage get <key> - read the problematic input bytes
- Fix the adapter locally; test it; redeploy
- factflow execution replay ID --from-stage <stage> - rerun from the broken stage with the same input
Related
- factflow-lineage reference: the service and repository
- Lineage guide: CLI-first debugging walkthrough
- Replay: uses lineage + storage to re-run stages