Adapter catalog

Every type: value supported in pipeline YAML, grouped by workflow package.

For the full config shape of each adapter, see the adapter’s source file (linked) or call GET /api/v1/adapters/{type}, which returns the config as JSON Schema.

| type: | Purpose | Config class |
| --- | --- | --- |
| sitemap_parser | Fetch a sitemap XML, extract URLs | (see sitemap_adapter.py) |
| url_expander | Fan-out: URL list → one message per URL | URLExpanderConfig |
| web_scraper | HTTP fetch with adaptive rate limiting | WebScraperSettings |
| web_crawler | JS-rendered fetch via crawl4ai | WebCrawlerConfig |
| web_content_storage | Persist HTML + metadata | WebContentStorageConfig |

| type: | Purpose | Config class |
| --- | --- | --- |
| storage_retriever | Read bytes from storage by lineage ref | (see pipeline_adapters.py) |
| html_to_markdown | GitHub-flavoured HTML → markdown | (see pipeline_adapters.py) |
| smart_segmenter | Token-aware markdown splitting | (see pipeline_adapters.py) |
| segment_publisher | Fan-out: one message per segment | (see pipeline_adapters.py) |
| markdown_storage_writer | Persist canonical markdown + segments | MarkdownStorageConfig |

| type: | Purpose | Config class |
| --- | --- | --- |
| embedding_generator | Generate embeddings, write to configured storage | EmbeddingGeneratorConfig |

Adapters under factflow_boost.boost_processor:

| type: | Purpose |
| --- | --- |
| boost_enumerator | Walk export folder, emit per-conversation messages |
| boost_filter | Drop conversations matching exclusion rules |
| boost_norwegian_filter | Language-gate to Norwegian |
| boost_deduplicate | MinHash near-duplicate removal |
| boost_clustering | Group similar conversations |
| boost_catalog | Build structured catalogue |
| boost_storage_writer | Persist catalogue |
| boost_renderer | Render to HTML / CSV / tree |

Config classes live alongside each adapter in factflow_boost.boost_processor.

| type: | Purpose | Config class |
| --- | --- | --- |
| llm_translator | LLM-based translation with markdown preservation | LLMTranslatorConfig |

| type: | Purpose |
| --- | --- |
| concept_detection | LLM-based concept extraction |
| concept_consolidation | Merge concepts across sources; write to Avalon |
| knowledge_diff | Compare two concept maps, emit structured diff |

| type: | Purpose |
| --- | --- |
| sharepoint_ingest | Pull documents from Microsoft Graph |
| document_converter | Convert Office docs to markdown |

Replay isn’t composed into YAML pipelines as a type: value; it’s invoked via POST /executions/{id}/replay or the CLI. See the Replay guide.
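Because replay is an HTTP call, not a pipeline stage, it can be scripted. The following is a minimal sketch of building that request in Python. The /api/v1 prefix and host are assumptions borrowed from the adapter endpoints on this page, the helper name build_replay_request is hypothetical, and the request is sent without a body because this page does not document one; check the Replay guide for the real contract.

```python
import urllib.parse
import urllib.request

# Assumption: replay lives under the same base URL as the adapter endpoints.
ASSUMED_BASE_URL = "http://localhost:8000/api/v1"


def build_replay_request(execution_id, base_url=ASSUMED_BASE_URL):
    """Build (but do not send) a POST to /executions/{id}/replay.

    Hypothetical helper: no request body is attached because none is
    documented on this page.
    """
    safe_id = urllib.parse.quote(str(execution_id), safe="")
    url = f"{base_url}/executions/{safe_id}/replay"
    return urllib.request.Request(url, method="POST")
```

To send it, pass the result to urllib.request.urlopen(); "abc123" in build_replay_request("abc123") would be a placeholder execution id.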

A few utility adapters ship from factflow-engine itself:

| type: | Purpose |
| --- | --- |
| stage_storage | Generic “write the current message to storage” step (used implicitly by most pipelines) |

The live catalogue with full config schemas:

factflow config validate --list-adapters

Or HTTP:

curl http://localhost:8000/api/v1/adapters
curl http://localhost:8000/api/v1/adapters/{type}

GET /api/v1/adapters/{type} returns the adapter’s config class as JSON Schema, which can be used to generate IDE autocomplete and other tooling.
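As an illustration of the tooling use case, here is a minimal Python sketch that fetches an adapter's schema and lists its top-level fields. It assumes the response body is a plain JSON Schema object with standard properties and required keys; the helper names are hypothetical, and EXAMPLE_SCHEMA is invented for demonstration and is not the real schema of any adapter in this catalog.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # host and prefix from the curl examples


def fetch_adapter_schema(adapter_type, base_url=BASE_URL):
    """GET /adapters/{type} and parse the body as JSON (assumed JSON Schema)."""
    with urllib.request.urlopen(f"{base_url}/adapters/{adapter_type}") as resp:
        return json.load(resp)


def summarize_schema(schema):
    """Return (field name, JSON type, is_required) rows, sorted by field name."""
    required = set(schema.get("required", []))
    rows = []
    for name, spec in sorted(schema.get("properties", {}).items()):
        rows.append((name, spec.get("type", "any"), name in required))
    return rows


# Invented schema, for illustration only.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "source_url": {"type": "string"},
        "timeout_seconds": {"type": "integer"},
    },
    "required": ["source_url"],
}
```

Against a running server, summarize_schema(fetch_adapter_schema("web_scraper")) would produce the same kind of rows for a real adapter, provided the response matches the assumed shape.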