Embeddings workflow

Turns text segments into vector embeddings and writes them to a storage provider (pgvector by default) so the search endpoints can retrieve them.

version: "1.0"
routes:
embedding_generator:
inbound:
queue: "/queue/embeddings.input"
subscription: "embedding-processors"
concurrency: 3 # LLM rate limit likely dominant
adapters:
- type: "embedding_generator"
config:
provider: "default"
model: "text-embedding-3-small"
storage_providers:
- type: "pgvector18"
Backend                      Config type    Use
PostgreSQLStorageProvider    pgvector18     Production. Queryable via the server’s search endpoints.
MessagePackStorageProvider   messagepack    Local experimentation. No search server needed.

Both satisfy EmbeddingStorageProvider. Swap via config.
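For local experimentation, point the same route at the MessagePack backend instead. A minimal sketch, assuming the messagepack provider needs no config keys beyond its type:

adapters:
  - type: "embedding_generator"
    config:
      provider: "default"
      model: "text-embedding-3-small"
      storage_providers:
        - type: "messagepack"  # assumed to need no extra keys; swap back to "pgvector18" for production

The rest of the route is unchanged; only the provider type differs.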

Bind a pipeline to multiple embedding slots (e.g., English + Norwegian):

- type: "embedding_generator"
config:
provider: "default"
model: "text-embedding-3-large"
storage_slots:
- name: "en-large"
provider: {type: "pgvector18"}
- name: "no-large"
provider: {type: "pgvector18"}

Queries sent to POST /api/v1/search/multi-model can target specific slots or fan out across all of them.

Once written, search via the API (or CLI):

factflow search semantic "your question"
factflow search hybrid "exact term AND intent"
factflow search multi-model "cross-language query" --models en-large,no-large

See the chat & search guide for the full surface.

embedding_generator batches requests to the LLM provider for efficiency. Batch size is configurable; the default respects provider-specific token limits.
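If you need to pin the batch size rather than rely on the default, it would sit alongside the other adapter settings. The batch_size key below is hypothetical, shown only to illustrate where such an option would live; check the adapter reference for the actual name and default:

- type: "embedding_generator"
  config:
    provider: "default"
    model: "text-embedding-3-small"
    batch_size: 64  # hypothetical key for illustration; not a documented option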

Lineage records one row per input message (not per batch entry) with batch_id set — see the lineage guide for per-batch inspection.

Outside a pipeline:

from factflow_embeddings import EmbeddingService, EmbeddingConfig

# embed() is a coroutine, so call it from an async context (e.g. under asyncio.run)
service = EmbeddingService(config=EmbeddingConfig(...))
vectors = await service.embed(["text one", "text two"])