
Run your first pipeline

You’ll start a local backend, register a pipeline config, run it, and wait for the execution to complete. ~10 minutes.

You’ll need:

  • The factflow CLI installed
  • uv for running the Python backend
  • Docker (embedded mode auto-starts Postgres and a queue broker)
  • The factflow repo cloned locally:

```sh
git clone dnb@dnb.ghe.com:radicalAI/factflow.git
cd factflow
```

In one terminal, start the backend:

```sh
cd backend
uv sync --all-groups
uv run python -m factflow_server serve --embedded
```

Embedded mode auto-provisions PostgreSQL and an ActiveMQ Artemis broker in Docker. When the startup log shows `Uvicorn running on http://0.0.0.0:8000`, the backend is ready.

In another terminal, register the pipeline config:

```sh
factflow config create -f backend/config/pipelines/webscraper.yaml
```

Response (abbreviated):

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "webscraper",
  "version": 1,
  "active": true
}
```

Record the `id` field; that’s the config id you’ll pass to the next commands.
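If you’re scripting these steps, you can capture the id from the JSON response instead of copying it by hand. A minimal sketch that parses the sample response above with python3 (pipe the real `factflow config create` output in the same way):

```sh
# Sample create response from above; in a real script, substitute
# RESPONSE=$(factflow config create -f backend/config/pipelines/webscraper.yaml)
RESPONSE='{"id":"550e8400-e29b-41d4-a716-446655440000","name":"webscraper","version":1,"active":true}'

# Extract the config id with python3 (avoids a jq dependency)
CONFIG_ID=$(printf '%s' "$RESPONSE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["id"])')
echo "$CONFIG_ID"   # → 550e8400-e29b-41d4-a716-446655440000
```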

Start an execution from the config:

```sh
factflow config run 550e8400-e29b-41d4-a716-446655440000
```

Response:

```json
{
  "execution_id": "b3a1c2d3-e4f5-6789-abcd-ef0123456789",
  "status": "running"
}
```

To block until it finishes:

```sh
factflow execution wait b3a1c2d3-e4f5-6789-abcd-ef0123456789
```

The CLI blocks until the execution reaches a terminal state: exit code 0 on `completed`, non-zero on `failed` or `cancelled`.
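That exit code makes `wait` easy to use in scripts and CI. A minimal sketch of the pattern; the `factflow` shell function below is a stub standing in for the real CLI so the snippet is self-contained (delete it to run against your backend):

```sh
# Stub so the snippet runs anywhere: pretend the execution completed (exit 0).
# Remove this function to call the real CLI instead.
factflow() { return 0; }

if factflow execution wait b3a1c2d3-e4f5-6789-abcd-ef0123456789; then
  outcome="completed"
else
  outcome="failed or cancelled"
fi
echo "$outcome"   # → completed (with the stub above)
```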

For a live view:

```sh
factflow execution get b3a1c2d3-e4f5-6789-abcd-ef0123456789
```

This shows the status, per-route progress, and total message counts.
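If you’d rather poll than block, a loop over `execution get` works. This sketch assumes the command can print JSON with a `status` field, which is an assumption; the stub below fakes that output so the snippet runs on its own:

```sh
# Stub faking JSON output from `factflow execution get`; delete to use the real CLI.
factflow() { echo '{"status":"completed","routes":{},"messages":0}'; }

status="running"
while [ "$status" = "running" ]; do
  status=$(factflow execution get b3a1c2d3-e4f5-6789-abcd-ef0123456789 \
    | python3 -c 'import sys, json; print(json.load(sys.stdin)["status"])')
  sleep 1   # use a longer interval in practice
done
echo "$status"
```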

The webscraper pipeline you just ran:

  1. `sitemap_parser` — fetched the sitemap XML
  2. `url_expander` — fanned out to one message per URL
  3. `web_scraper` — fetched each page with adaptive rate limiting
  4. `web_content_storage` — persisted HTML + metadata to storage

Full adapter details are in the factflow-webscraper reference.