Adapter catalog
Every type: value supported in pipeline YAML. Grouped by workflow package.
For the full config shape of each adapter, see the adapter’s source file (linked) or call GET /api/v1/adapters/{type} which returns the JSON Schema.
| type: | Purpose | Config class |
|---|---|---|
| sitemap_parser | Fetch a sitemap XML, extract URLs | (see sitemap_adapter.py) |
| url_expander | Fan-out: URL list → one message per URL | URLExpanderConfig |
| web_scraper | HTTP fetch with adaptive rate limiting | WebScraperSettings |
| web_crawler | JS-rendered fetch via crawl4ai | WebCrawlerConfig |
| web_content_storage | Persist HTML + metadata | WebContentStorageConfig |
| type: | Purpose | Config class |
|---|---|---|
| storage_retriever | Read bytes from storage by lineage ref | (see pipeline_adapters.py) |
| html_to_markdown | HTML → GitHub-flavoured markdown | (see pipeline_adapters.py) |
| smart_segmenter | Token-aware markdown splitting | (see pipeline_adapters.py) |
| segment_publisher | Fan-out: one message per segment | (see pipeline_adapters.py) |
| markdown_storage_writer | Persist canonical markdown + segments | MarkdownStorageConfig |
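
To make "token-aware markdown splitting" concrete, here is a minimal Python sketch of the general idea. It is not the smart_segmenter implementation: the function names, the whitespace word count standing in for a real tokenizer, and the `max_tokens` default are all assumptions made for illustration.

```python
import re


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: whitespace-separated words."""
    return len(text.split())


def split_markdown(markdown: str, max_tokens: int = 200) -> list[str]:
    """Pack heading-delimited blocks into segments within a token budget.

    A single block larger than max_tokens is kept whole rather than cut.
    """
    # Split before every ATX heading so each block starts with its heading.
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    blocks = [part.strip() for part in parts if part.strip()]

    segments: list[str] = []
    current: list[str] = []
    used = 0
    for block in blocks:
        size = count_tokens(block)
        # Close the running segment when the next block would exceed the budget.
        if current and used + size > max_tokens:
            segments.append("\n\n".join(current))
            current, used = [], 0
        current.append(block)
        used += size
    if current:
        segments.append("\n\n".join(current))
    return segments
```

Splitting on headings first keeps each segment aligned with the document's own structure; the budget only decides how many adjacent sections share a segment.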
| type: | Purpose | Config class |
|---|---|---|
| embedding_generator | Generate embeddings, write to configured storage | EmbeddingGeneratorConfig |
Adapters under factflow_boost.boost_processor:
| type: | Purpose |
|---|---|
| boost_enumerator | Walk export folder, emit per-conversation messages |
| boost_filter | Drop conversations matching exclusion rules |
| boost_norwegian_filter | Language-gate to Norwegian |
| boost_deduplicate | MinHash near-duplicate removal |
| boost_clustering | Group similar conversations |
| boost_catalog | Build structured catalogue |
| boost_storage_writer | Persist catalogue |
| boost_renderer | Render to HTML / CSV / tree |
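
For readers unfamiliar with the technique named in the boost_deduplicate row, the following is a small, self-contained Python sketch of MinHash near-duplicate removal. It illustrates the general method only and is not the adapter's code: the shingle size, the number of hash functions, the 0.8 threshold, and every function name are assumptions.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """Keep the minimum hash value per salted hash function."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a distinct salt acts as a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(a: list[int], b: list[int]) -> float:
    """Fraction of matching signature slots approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def drop_near_duplicates(texts: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a text only if it is not too similar to any text kept earlier."""
    kept: list[str] = []
    kept_signatures: list[list[int]] = []
    for text in texts:
        signature = minhash_signature(shingles(text))
        if all(estimated_jaccard(signature, other) < threshold for other in kept_signatures):
            kept.append(text)
            kept_signatures.append(signature)
    return kept
```

This version compares each signature against every kept one, which is quadratic in the number of conversations; larger corpora usually pair MinHash with locality-sensitive hashing to avoid that.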
Config classes live alongside each adapter in factflow_boost.boost_processor.
| type: | Purpose | Config class |
|---|---|---|
| llm_translator | LLM-based translation with markdown preservation | LLMTranslatorConfig |
| type: | Purpose |
|---|---|
| concept_detection | LLM-based concept extraction |
| concept_consolidation | Merge concepts across sources; write to Avalon |
| knowledge_diff | Compare two concept maps, emit structured diff |
| type: | Purpose |
|---|---|
| sharepoint_ingest | Pull documents from Microsoft Graph |
| document_converter | Convert Office docs to markdown |
Replay isn’t composed into YAML pipelines as a type: value; it’s invoked via POST /executions/{id}/replay or the CLI. See the Replay guide.
Engine built-ins
A few utility adapters ship from factflow-engine itself:
| type: | Purpose |
|---|---|
| stage_storage | Generic “write the current message to storage” step (used implicitly by most pipelines) |
Introspection at runtime
The live catalogue with full config schemas:

    factflow config validate --list-adapters

Or HTTP:

    curl http://localhost:8000/api/v1/adapters
    curl http://localhost:8000/api/v1/adapters/{type}

GET /api/v1/adapters/{type} returns the adapter’s config class as JSON Schema, usable for generating IDE autocomplete and tooling.
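
As one example of the tooling this enables, here is a hedged Python sketch that fetches an adapter's schema and lists its fields. It assumes the response body is the JSON Schema object itself with top-level `properties` and `required` keys; the helper names are hypothetical, and the host and port are taken from the curl examples.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000/api/v1"  # host and port as in the curl examples


def fetch_adapter_schema(adapter_type: str, base_url: str = BASE_URL) -> dict:
    """GET /adapters/{type} and decode the JSON body (assumed to be the schema)."""
    with urlopen(f"{base_url}/adapters/{adapter_type}", timeout=10) as response:
        return json.load(response)


def summarize_schema(schema: dict) -> list[tuple[str, str, bool]]:
    """Return (field name, declared type, is required) rows sorted by field name."""
    required = set(schema.get("required", []))
    return [
        (name, spec.get("type", "any"), name in required)
        for name, spec in sorted(schema.get("properties", {}).items())
    ]
```

With a running server, `summarize_schema(fetch_adapter_schema("web_scraper"))` would print one row per config field; the exact fields depend on what the server returns.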
Related
- Pipeline YAML reference
- Writing a new adapter
- Per-package references under Reference → Packages