# Replay
Replay republishes stored artefacts back into pipeline queues so a downstream stage can be re-executed. It comes in two flavours: same-pipeline (rerun within the existing execution) and cross-pipeline (a full new coordinator, detached from the original).
## When to replay

- A downstream adapter failed; fix it and rerun from that stage
- You updated adapter logic and want to test the new version on the same inputs
- An LLM call timed out; retry with a different model
- You want to fan the same upstream data out to a new pipeline
## Prerequisite: storage still has the data

Replay reads from `executions/SRC/ROUTE/STAGE/*`. If objects are missing, replay fails loudly. Storage is not a cache — retention is a deployment concern.
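The check can be sketched as follows, assuming a filesystem store laid out as `executions/<execution>/<route>/<stage>/`; the helper names here are illustrative, not factflow API:

```python
from pathlib import Path

def stage_objects(root: Path, execution: str, route: str, stage: str) -> list[Path]:
    """Return the stored objects a replay of `stage` would republish."""
    stage_dir = root / "executions" / execution / route / stage
    if not stage_dir.is_dir():
        return []
    return sorted(stage_dir.glob("*"))

def can_replay(root: Path, execution: str, route: str, stage: str) -> bool:
    # Replay fails loudly on missing objects, so check presence up front.
    return len(stage_objects(root, execution, route, stage)) > 0
```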
Verify before replay:
```sh
factflow storage list --execution SRC_ID | head
```

## Same-pipeline replay

Rerun one stage within the original execution. The orchestrator republishes stored messages to the specified stage’s queue; existing processors pick them up; downstream stages run as usual.
```sh
factflow execution replay SRC_ID --from-stage web_scraper
```

This creates a fresh execution that continues from the specified stage. Upstream stages don’t rerun; stages before `--from-stage` are skipped.
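The behaviour can be modelled as a small sketch (an illustration of the semantics described above, not factflow internals): only the target stage’s queue is seeded from storage, and upstream stages stay untouched.

```python
from collections import deque

def replay_from(pipeline: list[str], stored: dict[str, list[str]],
                from_stage: str) -> dict[str, deque]:
    """Seed per-stage queues for a replay starting at `from_stage`.

    Only the target stage's queue receives stored messages; downstream
    stages fill up as processors consume and forward, upstream stages
    are skipped entirely.
    """
    if from_stage not in pipeline:
        raise ValueError(f"unknown stage: {from_stage}")
    queues: dict[str, deque] = {stage: deque() for stage in pipeline}
    queues[from_stage].extend(stored[from_stage])
    return queues
```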
## Cross-pipeline replay

Route the source’s output into a different pipeline’s route. This is used when:
- Two pipelines share a stage output format (e.g., both consume segmented markdown)
- You want to re-process existing scraped pages through an updated consolidation pipeline
```sh
factflow execution replay SRC_ID \
  --from-stage markdown \
  --to-route embeddings_route
```

The server constructs a new detached replay coordinator. It reads from SRC_ID’s storage and publishes to embeddings_route’s queue. The parent execution’s config snapshot is used to resolve route → queue mappings — not the current global directory. This prevents a class of subtle bugs when the same route name appears in multiple configs with different queues.
## Via the API

```sh
curl -X POST http://localhost:8000/api/v1/executions/SRC_ID/replay \
  -H 'Content-Type: application/json' \
  -d '{"from_stage": "markdown", "to_route": "embeddings_route"}'
```

Returns a new execution id. Poll or wait on it as you would any other execution.
## Recovery

Similar shape, different intent: recovery restarts a pipeline that was interrupted mid-execution (server crash, manual kill). It uses lineage to find the last successful stage per route and resumes from there.
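The lineage walk can be sketched minimally as follows; the record shape (`route`, `stage`, `status`) is an assumption for illustration, not factflow’s actual schema:

```python
def last_successful_stage(lineage: list[dict]) -> dict[str, str]:
    """Given lineage records in execution order, return the last
    successfully completed stage per route, i.e. where recovery
    would resume from."""
    last: dict[str, str] = {}
    for record in lineage:
        if record["status"] == "success":
            last[record["route"]] = record["stage"]
    return last
```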
Recovery isn’t exposed as a CLI subcommand today — it’s engaged automatically on server startup for executions with `status=interrupted`.
## Route resolution detail

When a cross-pipeline replay runs:

1. The parent execution’s `config_snapshot` is loaded
2. `--to-route` is looked up in that snapshot — this gives the queue name the snapshot used
3. Replay publishes to that queue name (execution-scoped under the NEW replay execution id)
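Roughly, the resolution looks like this; the snapshot layout and the execution-scoped queue naming scheme are assumptions for illustration:

```python
def resolve_replay_queue(parent_snapshot: dict, to_route: str,
                         replay_execution_id: str) -> str:
    """Resolve a route to a queue using the parent execution's
    config snapshot, never the live global config, so the replay
    publishes where the original execution actually published."""
    queue = parent_snapshot["routes"][to_route]["queue"]
    # Queue names are scoped under the NEW replay execution's id
    # (exact naming scheme assumed here for illustration).
    return f"{replay_execution_id}.{queue}"
```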
If you added a new route to a pipeline and want to replay existing data through it, first edit the config, then create a new execution of that config, then replay into the new execution (not the old one).
## Related

- Concept: Storage model — why storage is the replay source of truth
- Lineage guide — use lineage to find the right `--from-stage`
- factflow-replay reference
- API: Executions — the `POST /executions/{id}/replay` endpoint