Chat and search
Factflow ships 12 chat endpoints and 6 search endpoints. Together they expose the indexed content (segments + embeddings + full-text) to end users as a RAG-backed chat interface or a programmatic search API.
What runs behind this
- Segments and embeddings produced by pipelines (typically factflow-markdown → factflow-embeddings)
- Stored in PostgreSQL with pgvector for vector similarity + full-text search (FTS)
- factflow-server exposes the HTTP surface; internally calls factflow-embeddings.EmbeddingService and SegmentContentRepository
Search endpoints
POST /api/v1/search — semantic
Pure vector similarity. The query is embedded with the same model the pipeline used, and the top-k nearest neighbours are returned.
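To make "top-k nearest neighbours" concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the metric (pgvector also supports other distances, and this page does not say which one Factflow uses). In a real deployment the ranking happens inside PostgreSQL, not in application code. The segment ids and two-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, segments, k=10):
    """Return the k (segment_id, score) pairs most similar to query_vec.

    `segments` maps a hypothetical segment id to its embedding vector.
    """
    scored = [
        (seg_id, cosine_similarity(query_vec, vec))
        for seg_id, vec in segments.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for this example.
segments = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], segments, k=2)
```

The `limit` field in the request plays the role of `k` here.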
curl -X POST http://localhost:8000/api/v1/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "how do I configure retries", "limit": 10}'
CLI:
factflow search semantic "how do I configure retries" --limit 10
POST /api/v1/search/hybrid — vector + FTS
Combines semantic similarity with keyword ranking. Better for queries that mix specific terms (product names, error codes) with intent.
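This page does not say how the two signals are blended, so the following is only one common approach, not Factflow's confirmed method: rescale each score set to [0, 1], then take a weighted sum. The `alpha` weight, the function names, and the sample scores are all assumptions for illustration.

```python
def min_max(scores):
    """Rescale a {id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend two score mappings; alpha is the (assumed) weight of the vector side."""
    v = min_max(vector_scores)
    kw = min_max(keyword_scores)
    ids = set(v) | set(kw)
    blended = {
        i: alpha * v.get(i, 0.0) + (1 - alpha) * kw.get(i, 0.0) for i in ids
    }
    # Highest blended score first; ties broken by id for stable output.
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))


# Invented scores: similarity values on one side, FTS rank values on the other.
vector_scores = {"s1": 0.90, "s2": 0.80, "s3": 0.40}
keyword_scores = {"s2": 12.0, "s3": 7.0, "s4": 2.0}
ranked = hybrid_scores(vector_scores, keyword_scores)
```

A segment that scores well on both signals (`s2` here) rises above one that only matches semantically, which is the behaviour you want for queries containing exact terms.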
factflow search hybrid "retry circuit breaker configuration"
POST /api/v1/search/rrf — reciprocal rank fusion
Multi-signal ranking: semantic, FTS, and optionally other signals, combined via RRF. This is the default for the chat endpoint.
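For readers new to reciprocal rank fusion, a short sketch of the standard formula follows: each result earns 1 / (k + rank) from every ranked list it appears in, and the contributions are summed. The constant k = 60 is the value commonly used in the literature; whether Factflow uses that value, and which signals it fuses, are assumptions here. The segment ids are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: score(d) = sum of 1 / (k + rank), ranks starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for stable output.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Two invented rankings, best match first in each.
semantic = ["s1", "s2", "s3"]
full_text = ["s2", "s4", "s1"]
fused = reciprocal_rank_fusion([semantic, full_text])
```

Because only ranks are used, RRF needs no score normalisation across signals, which is one reason it suits mixing vector similarity with full-text ranking.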
POST /api/v1/search/multi-model
Query across multiple embedding slots (e.g., English + Norwegian models, or general + domain-specific). Returns merged results with per-model scores.
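As an illustration of what "merged results with per-model scores" could look like, the sketch below groups hits by segment and keeps each model's score side by side. The result shape, the ordering policy (best score from any model), and the sample data are assumptions, not the documented response schema. Note that raw scores from different embedding models are not strictly comparable, so a real implementation may rank differently.

```python
def merge_multi_model(results_by_model):
    """Merge {model: [(segment_id, score), ...]} into one list with per-model scores."""
    merged = {}
    for model, hits in results_by_model.items():
        for segment_id, score in hits:
            merged.setdefault(segment_id, {})[model] = score
    # Assumed policy: order by the best score any single model gave the segment.
    ordered = sorted(
        merged.items(), key=lambda item: (-max(item[1].values()), item[0])
    )
    return [{"segment_id": sid, "scores": scores} for sid, scores in ordered]


# Invented hits for the two slots named in the CLI example.
results = merge_multi_model({
    "no-large": [("s7", 0.91), ("s2", 0.60)],
    "en-large": [("s2", 0.72), ("s9", 0.55)],
})
```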
factflow search multi-model "transaksjon avvist" --models no-large,en-large
GET /api/v1/search/sources, GET /api/v1/search/capabilities
Enumerate what’s searchable (which origins are indexed, which embedding slots exist). Use these to build dynamic UIs.
Chat endpoints
Chat is stateful: every conversation is a thread with a stored history. Messages are RAG-grounded: the server retrieves relevant segments, builds a prompt, calls the LLM, and returns the response plus the sources it used.
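The retrieve, prompt, generate, return sequence can be sketched as below. This is a simplified illustration under stated assumptions, not the server's implementation: `Segment`, `build_prompt`, `answer`, the prompt wording, and the returned dict shape are all hypothetical, and the retriever and LLM are replaced by stubs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    segment_id: str
    content: str


def build_prompt(question: str, segments: List[Segment]) -> str:
    """Put numbered excerpts ahead of the question so the model can cite them."""
    context = "\n".join(
        f"[{i}] {seg.content}" for i, seg in enumerate(segments, start=1)
    )
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


def answer(question: str,
           retrieve: Callable[[str], List[Segment]],
           call_llm: Callable[[str], str]) -> dict:
    """Retrieve segments, build the prompt, call the model, return answer and sources."""
    segments = retrieve(question)
    prompt = build_prompt(question, segments)
    return {
        "content": call_llm(prompt),
        "sources": [seg.segment_id for seg in segments],
    }


# Stand-ins for the real retriever and model, for illustration only.
def fake_retrieve(question: str) -> List[Segment]:
    return [Segment("s1", "Example excerpt about replays.")]


def fake_llm(prompt: str) -> str:
    return "Stub answer citing [1]."


reply = answer("How do I set up a replay?", fake_retrieve, fake_llm)
```

Passing the retriever and model in as callables keeps the flow testable without a database or an LLM provider.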
POST /api/v1/chat/threads — start a thread
curl -X POST http://localhost:8000/api/v1/chat/threads \
  -H 'Content-Type: application/json' \
  -d '{"title": "Onboarding questions"}'
Returns thread_id.
POST /api/v1/chat/threads/{id}/messages — ask a question
curl -X POST http://localhost:8000/api/v1/chat/threads/THREAD_ID/messages \
  -H 'Content-Type: application/json' \
  -d '{"content": "How do I set up a replay for a failed execution?"}'
The response streams the LLM output (SSE by default, or JSON if the request sends Accept: application/json), plus the source segments used.
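For clients consuming the SSE variant, the sketch below parses raw Server-Sent Events text according to the standard wire format (`event:` and `data:` fields, blank line between events). The event names `token` and `sources` and the payloads are assumptions; check the API reference for the actual stream schema.

```python
def parse_sse(stream_text):
    """Split raw Server-Sent Events text into a list of {"event", "data"} dicts."""
    events = []
    event_type, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format allows one optional leading space
            if field == "event":
                event_type = value
            elif field == "data":
                data_lines.append(value)
    return events


# Hypothetical stream: event names and payloads are invented for this example.
raw = (
    "event: token\ndata: Hello\n\n"
    ": keep-alive\n"
    "event: token\ndata: world\n\n"
    "event: sources\ndata: [\"seg-1\"]\n\n"
)
events = parse_sse(raw)
answer_text = " ".join(e["data"] for e in events if e["event"] == "token")
```

A production client would read the response incrementally instead of buffering the whole body, but the per-line handling stays the same.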
GET /api/v1/chat/threads
List threads (optionally filter by user).
GET /api/v1/chat/threads/{id}/messages
Returns the full conversation history for the thread.
GET /api/v1/chat/capabilities + GET /api/v1/chat/sources
Same purpose as their search counterparts: they tell a UI what’s available.
Citing sources
Every chat response includes a sources array referencing the segments that informed the answer. Each source carries:
- segment_id: primary key
- content: the excerpt
- metadata: origin, URL, title, page
- score: retrieval rank
Frontends use this to build “cited sources” UIs with links back to the original content.
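A frontend could turn the sources array into numbered citations roughly as follows. This is a sketch only: the `Source` class mirrors the four fields listed above, but the lowercase metadata keys `title` and `url`, the assumption that a higher score means a better match, and the sample values are not confirmed by this page.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    segment_id: str
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


def format_citations(sources, excerpt_len=80):
    """Return one numbered citation line per source, highest score first."""
    ordered = sorted(sources, key=lambda s: s.score, reverse=True)
    lines = []
    for number, src in enumerate(ordered, start=1):
        # Fall back to the segment id when no title is present.
        title = src.metadata.get("title", src.segment_id)
        url = src.metadata.get("url", "")
        link = f" <{url}>" if url else ""
        lines.append(f"[{number}] {title}{link}: {src.content[:excerpt_len]}")
    return lines


# Invented sources for illustration.
sources = [
    Source("seg-2", "Second excerpt.", 0.41, {"title": "Guide B"}),
    Source("seg-1", "First excerpt.", 0.87,
           {"title": "Guide A", "url": "https://example.com/a"}),
]
citations = format_citations(sources)
```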
Authentication
Chat threads are owned by a user id. The server pulls this from the auth context (typically via DNB SSO headers forwarded by the ingress). Unauthenticated requests either fail or create anonymous threads, depending on deployment config.
Related
- Reference: API — every endpoint with request/response schemas
- factflow-server reference — chat subpackage implementation
- factflow-embeddings reference — the search backend
- Running pipelines — pipelines that produce searchable content