
Replay

Replay republishes stored artefacts back into pipeline queues so a downstream stage can be re-executed. It comes in two flavours: same-pipeline (rerun a stage within the existing pipeline) and cross-pipeline (a fully detached coordinator feeding a different pipeline). Typical scenarios:

  • A downstream adapter failed: fix it, then rerun from that stage
  • You updated adapter logic and want to test the new version on the same inputs
  • An LLM call timed out; retry with a different model
  • You want to fan the same upstream data out to a new pipeline

Replay reads from executions/SRC/ROUTE/STAGE/*. If objects are missing, replay fails loudly. Storage is not a cache — retention is a deployment concern.

Verify before replay:

Terminal window
factflow storage list --execution SRC_ID | head

Rerun one stage using the original execution’s stored inputs. The orchestrator republishes the stored messages to the specified stage’s queue; the existing processors pick them up, and downstream stages run as usual.

Terminal window
factflow execution replay SRC_ID --from-stage web_scraper

This still creates a fresh execution that continues from the specified stage; stages before --from-stage are skipped rather than rerun.
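For a linear pipeline, the skip rule amounts to a slice over the ordered stage list. A minimal sketch, assuming a linear stage order (a fan-out pipeline would need a DAG walk); the helper name is hypothetical:

```python
def stages_to_run(stages: list[str], from_stage: str) -> list[str]:
    """Return the stages a replay re-executes: from_stage and everything after it."""
    if from_stage not in stages:
        raise ValueError(f"unknown stage: {from_stage}")
    return stages[stages.index(from_stage):]

# Assumed stage order for illustration:
pipeline = ["web_scraper", "markdown", "segmenter", "consolidator"]
print(stages_to_run(pipeline, "markdown"))
# -> ['markdown', 'segmenter', 'consolidator']
```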

Route the source’s output into a different pipeline’s route. This is used when:

  • Two pipelines share a stage output format (e.g., both consume segmented markdown)
  • You want to re-process existing scraped pages through an updated consolidation pipeline

Terminal window
factflow execution replay SRC_ID \
  --from-stage markdown \
  --to-route embeddings_route

The server constructs a new detached replay coordinator. It reads from SRC_ID’s storage and publishes to embeddings_route’s queue. The parent execution’s config snapshot is used to resolve route → queue mappings — not the current global directory. This prevents a class of subtle bugs when the same route name appears in multiple configs with different queues.

Terminal window
curl -X POST http://localhost:8000/api/v1/executions/SRC_ID/replay \
-H 'Content-Type: application/json' \
-d '{"from_stage": "markdown", "to_route": "embeddings_route"}'

Returns a new execution id. Poll or wait on it as you would any other execution.
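A generic polling loop might look like this. It is a sketch under assumptions: get_status stands in for a GET against the executions API, and of the terminal status names only interrupted is confirmed by this page; the rest are guesses.

```python
import time

TERMINAL = {"completed", "failed", "interrupted"}  # assumed status vocabulary

def wait_for_execution(get_status, timeout_s: float = 300.0,
                       poll_s: float = 2.0) -> str:
    """Poll get_status() until the execution reaches a terminal status."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(poll_s)
    raise TimeoutError("execution did not reach a terminal status in time")
```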

Similar shape, different intent: recovery restarts a pipeline that was interrupted mid-execution (server crash, manual kill). Uses lineage to find the last successful stage per route and resumes from there.

Recovery isn’t exposed as a CLI subcommand today — it’s engaged automatically on server startup for executions with status=interrupted.
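The per-route resume computation could be sketched like this. The lineage record shape here is an assumption for illustration, not factflow’s real schema:

```python
def resume_points(lineage: list[dict]) -> dict[str, str]:
    """Map each route to its last successfully completed stage.

    lineage: records like {"route": ..., "stage": ..., "status": ...},
    in execution order (shape assumed for this sketch).
    """
    last_ok: dict[str, str] = {}
    for rec in lineage:
        if rec["status"] == "success":
            last_ok[rec["route"]] = rec["stage"]
    return last_ok

lineage = [
    {"route": "main", "stage": "web_scraper", "status": "success"},
    {"route": "main", "stage": "markdown", "status": "success"},
    {"route": "main", "stage": "segmenter", "status": "interrupted"},
]
print(resume_points(lineage))  # resume "main" after its markdown stage
```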

When a cross-pipeline replay runs:

  1. Parent execution’s config_snapshot is loaded
  2. --to-route is looked up in that snapshot — gives the queue name the snapshot used
  3. Replay publishes to that queue name (execution-scoped under the NEW replay execution id)
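The three steps above can be sketched as one resolution function. Both the snapshot shape and the dot-separated execution-scoped queue naming are assumptions made for this sketch, not factflow’s real schema:

```python
def resolve_replay_queue(config_snapshot: dict, to_route: str,
                         replay_execution_id: str) -> str:
    """Resolve --to-route against the PARENT execution's config snapshot,
    then scope the queue under the NEW replay execution id."""
    routes = config_snapshot["routes"]       # step 1: snapshot already loaded
    if to_route not in routes:
        raise KeyError(f"route {to_route!r} not in parent config snapshot")
    queue = routes[to_route]["queue"]        # step 2: route -> queue, per snapshot
    return f"{replay_execution_id}.{queue}"  # step 3: execution-scoped queue name

snapshot = {"routes": {"embeddings_route": {"queue": "embeddings_q"}}}
print(resolve_replay_queue(snapshot, "embeddings_route", "exec_new"))
# -> exec_new.embeddings_q
```

Resolving through the snapshot rather than the current global directory is what makes the lookup deterministic even if the route’s queue mapping has since changed.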

If you added a new route to a pipeline and want to replay existing data through it, first edit the config, then create a new execution of that config, then replay into the new execution (not the old one).