
Run your first pipeline

You’ll start a local backend, register a pipeline config, run it, and wait for the execution to complete. ~10 minutes.

You’ll need:

  • The factflow CLI installed
  • uv for running the Python backend
  • Docker (embedded mode auto-starts Postgres and a queue broker)
  • The factflow repo cloned locally:

```sh
git clone dnb@dnb.ghe.com:radicalAI/factflow.git
cd factflow
```

In one terminal, start the backend:

```sh
cd backend
uv sync --all-groups
uv run python -m factflow_server serve --embedded
```

Embedded mode auto-provisions PostgreSQL and an ActiveMQ Artemis broker in Docker. When the startup log shows `Uvicorn running on http://0.0.0.0:8000`, the backend is ready.

In another terminal, register the pipeline config:

```sh
factflow config create -f backend/config/pipelines/webscraper.yaml
```

Response (abbreviated):

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "webscraper",
  "version": 1,
  "active": true
}
```

Record the `id` field; that’s the config id you’ll pass to the next commands.
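If you’re scripting these steps, you can capture the id from the JSON response instead of copying it by hand. A minimal sketch that parses the sample response above with python3 (pipe the real `factflow config create` output in the same way):

```sh
# Sample create response from above; in a real script, substitute
# RESPONSE=$(factflow config create -f backend/config/pipelines/webscraper.yaml)
RESPONSE='{"id":"550e8400-e29b-41d4-a716-446655440000","name":"webscraper","version":1,"active":true}'

# Extract the config id with python3 (avoids a jq dependency)
CONFIG_ID=$(printf '%s' "$RESPONSE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["id"])')
echo "$CONFIG_ID"   # → 550e8400-e29b-41d4-a716-446655440000
```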

Start an execution from the config:

```sh
factflow config run 550e8400-e29b-41d4-a716-446655440000
```

Response:

```json
{
  "execution_id": "b3a1c2d3-e4f5-6789-abcd-ef0123456789",
  "status": "running"
}
```

To block until it finishes:

```sh
factflow execution wait b3a1c2d3-e4f5-6789-abcd-ef0123456789
```

The CLI blocks until the execution reaches a terminal state: exit code 0 on `completed`, non-zero on `failed` or `cancelled`.
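That exit code makes `wait` easy to use in scripts and CI. A minimal sketch of the pattern; the `factflow` shell function below is a stub standing in for the real CLI so the snippet is self-contained (delete it to run against your backend):

```sh
# Stub so the snippet runs anywhere: pretend the execution completed (exit 0).
# Remove this function to call the real CLI instead.
factflow() { return 0; }

if factflow execution wait b3a1c2d3-e4f5-6789-abcd-ef0123456789; then
  outcome="completed"
else
  outcome="failed or cancelled"
fi
echo "$outcome"   # → completed (with the stub above)
```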

For a live view:

```sh
factflow execution get b3a1c2d3-e4f5-6789-abcd-ef0123456789
```

This shows the status, per-route progress, and total message counts.
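If you’d rather poll than block, a loop over `execution get` works. This sketch assumes the command can print JSON with a `status` field, which is an assumption; the stub below fakes that output so the snippet runs on its own:

```sh
# Stub faking JSON output from `factflow execution get`; delete to use the real CLI.
factflow() { echo '{"status":"completed","routes":{},"messages":0}'; }

status="running"
while [ "$status" = "running" ]; do
  status=$(factflow execution get b3a1c2d3-e4f5-6789-abcd-ef0123456789 \
    | python3 -c 'import sys, json; print(json.load(sys.stdin)["status"])')
  sleep 1   # use a longer interval in practice
done
echo "$status"
```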

The webscraper pipeline you just ran:

  1. `sitemap_parser` — fetched the sitemap XML
  2. `url_expander` — fanned out to one message per URL
  3. `web_scraper` — fetched each page with adaptive rate limiting
  4. `web_content_storage` — persisted HTML + metadata to storage

Full adapter details are in the factflow-webscraper reference.