# factflow-llm
LLM and embedding client infrastructure — provider factory, adaptive rate limiting, and error classification. Used by every adapter that calls a language model.
## Tier and role
- Tier: shared service
- Import name: `factflow_llm`
- Source: `backend/packages/factflow-llm/`
Adapter authors construct clients via `LLMClientFactory`, receive an `LLMClientProtocol` or `EmbeddingClientProtocol` instance, and call it. The adapter doesn’t know which provider backs the call — that’s the factory’s decision, based on config.
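To make the indirection concrete, here is a minimal, self-contained sketch of the pattern: an adapter codes against a structural protocol and a factory picks the concrete provider. All names except `LLMClientProtocol` are illustrative stand-ins, not the real `factflow_llm` API.

```python
from typing import Protocol


class LLMClientProtocol(Protocol):
    """Structural contract the adapter codes against (simplified sketch)."""
    def complete(self, prompt: str) -> str: ...


class FakeOpenAIClient:
    # Stand-in provider client; satisfies the protocol structurally.
    def complete(self, prompt: str) -> str:
        return f"openai-response: {prompt}"


class FakeAnthropicClient:
    def complete(self, prompt: str) -> str:
        return f"anthropic-response: {prompt}"


# The real factory resolves the provider from config; a dict stands in here.
_PROVIDERS = {"openai": FakeOpenAIClient, "anthropic": FakeAnthropicClient}


def create_completion_client(provider_name: str) -> LLMClientProtocol:
    return _PROVIDERS[provider_name]()


# The adapter only ever sees the protocol, never the concrete class.
client = create_completion_client("anthropic")
print(client.complete("hello"))  # anthropic-response: hello
```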
## Context
Five concrete providers are wired behind one factory:
- OpenAI (chat + embeddings)
- Azure OpenAI (chat + embeddings, OpenAI-compatible deployment)
- Anthropic (chat)
- Bedrock (embeddings via Titan)
- HuggingFace (embeddings via sentence-transformers, local)
The `_AVAILABLE` flags (`ANTHROPIC_AVAILABLE`, `BEDROCK_AVAILABLE`, `SENTENCE_TRANSFORMERS_AVAILABLE`) let the factory skip providers whose optional dependency isn’t installed, without failing at import time.
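The import-guard pattern behind such flags can be sketched as follows. This is an illustrative reconstruction, not the module's actual code; `available_providers` is a hypothetical helper added for demonstration.

```python
# Guard the optional import so the module loads even when the dep is absent.
try:
    import anthropic  # optional dependency
    ANTHROPIC_AVAILABLE = True
except ImportError:
    anthropic = None
    ANTHROPIC_AVAILABLE = False


def available_providers() -> list[str]:
    """Hypothetical helper: list providers whose deps are importable."""
    providers = ["openai"]  # assume the core dependency is always installed
    if ANTHROPIC_AVAILABLE:
        providers.append("anthropic")
    return providers


print(available_providers())
```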
## Rationale
- Adaptive rate limiting. `AdaptiveRateLimiter` uses AIMD (additive increase / multiplicative decrease) to probe provider quotas rather than obeying a hardcoded RPS. Adapter throughput rises until it hits a 429, then backs off proportionally.
- Error classification. `classify_llm_error` maps provider-specific exceptions to a taxonomy (`retryable`, `terminal`, `rate_limited`) so adapter retry logic doesn’t need provider-specific try/except trees.
- Lazy client creation. The factory caches clients per provider profile; the first call constructs a client, subsequent calls reuse it.
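The AIMD control loop described above can be sketched in a few lines. This is a toy model of the policy, not the real `AdaptiveRateLimiter` API; all field names and step sizes are illustrative.

```python
from dataclasses import dataclass


@dataclass
class AIMDLimiter:
    """Toy AIMD sketch: rate climbs by a fixed step on success, and is
    multiplied by a factor < 1 when the provider signals a 429."""
    rate_rps: float = 1.0
    increase_step: float = 0.5
    decrease_factor: float = 0.5
    max_rps: float = 50.0
    min_rps: float = 0.1

    def on_success(self) -> None:
        # Additive increase: probe upward toward the real quota.
        self.rate_rps = min(self.rate_rps + self.increase_step, self.max_rps)

    def on_rate_limited(self) -> None:
        # Multiplicative decrease: back off proportionally after a 429.
        self.rate_rps = max(self.rate_rps * self.decrease_factor, self.min_rps)


limiter = AIMDLimiter()
for _ in range(4):
    limiter.on_success()
print(limiter.rate_rps)   # 3.0
limiter.on_rate_limited()
print(limiter.rate_rps)   # 1.5
```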
## Public API
Every symbol listed here is in `factflow_llm.__all__` (or is the top-level `LLMClientFactory`).
### Factory
```python
from factflow_llm import LLMClientFactory
from factflow_llm.settings import LLMConfig

config: LLMConfig = ...
factory = LLMClientFactory(config)

chat = factory.create_completion_client(provider_name="default")
emb = factory.create_embedding_client(provider_name="openai-embed")
```
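The per-profile caching noted under Rationale (first call constructs, later calls reuse) can be sketched as below. `CachingFactory` and `_build_client` are hypothetical names for illustration, not the real factory internals.

```python
class CachingFactory:
    """Toy sketch of per-profile lazy client caching."""

    def __init__(self) -> None:
        self._cache: dict[str, object] = {}

    def _build_client(self, provider_name: str) -> object:
        # Stands in for real construction (HTTP clients, auth, etc.).
        return {"provider": provider_name}

    def create_completion_client(self, provider_name: str = "default") -> object:
        # First call for a profile constructs; subsequent calls reuse.
        if provider_name not in self._cache:
            self._cache[provider_name] = self._build_client(provider_name)
        return self._cache[provider_name]


factory = CachingFactory()
a = factory.create_completion_client("default")
b = factory.create_completion_client("default")
print(a is b)  # True -- same cached instance
```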
### Clients (direct constructors, rarely used)

```python
from factflow_llm import (
    BaseLLMClient,
    OpenAIClient,
    AzureOpenAIClient,
    AnthropicClient,
    BedrockEmbeddingClient,
    HuggingFaceEmbeddingClient,
)
```

Prefer the factory. Direct construction is used only in unit tests.
### Rate limiting
```python
from factflow_llm import (
    AdaptiveRateLimiter,
    AIMDMetrics,
    RateLimitConfig,
    RateLimitSignal,
    RateLimitedClient,           # wraps a BaseLLMClient with rate-limiting
    RateLimitedEmbeddingClient,  # wraps an embedding client
)
```
### Error classification

```python
from factflow_llm import (
    LLMErrorClassification,  # enum: RETRYABLE / TERMINAL / RATE_LIMITED
    classify_llm_error,      # exception → classification
    get_error_metadata,      # extract retry-after, error code, etc.
    is_fatal_llm_error,      # quick boolean check
)
```
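A retry loop driven by such a taxonomy can be sketched as follows. The `classify_llm_error` body here is a made-up stand-in (mapping stdlib exceptions, not real provider errors), and `call_with_retries` is a hypothetical adapter helper; only the enum names mirror the taxonomy above.

```python
from enum import Enum


class LLMErrorClassification(Enum):
    RETRYABLE = "retryable"
    TERMINAL = "terminal"
    RATE_LIMITED = "rate_limited"


def classify_llm_error(exc: Exception) -> LLMErrorClassification:
    """Stand-in classifier: maps stdlib exceptions for demonstration only."""
    if isinstance(exc, TimeoutError):
        return LLMErrorClassification.RETRYABLE
    if isinstance(exc, ConnectionError):
        return LLMErrorClassification.RATE_LIMITED  # pretend this is a 429
    return LLMErrorClassification.TERMINAL


def call_with_retries(fn, max_attempts: int = 3):
    # One branch per classification -- no provider-specific except trees.
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            kind = classify_llm_error(exc)
            if kind is LLMErrorClassification.TERMINAL or attempt == max_attempts - 1:
                raise
            # RETRYABLE / RATE_LIMITED: back off and retry (sleep omitted)


calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(call_with_retries(flaky))  # ok
```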
### Message + response types

```python
from factflow_llm import (
    Message,
    Role,
    StreamChunk,
    CompletionResponse,
    EmbeddingResponse,
)
```
### Protocol re-exports

```python
from factflow_llm import LLMClientProtocol, EmbeddingClientProtocol
```

These are the same protocols defined in `factflow-protocols`, re-exported for convenience.
### Settings (from `factflow_llm.settings`)
```python
from factflow_llm.settings import (
    LLMConfig,          # root config
    LLMProviderConfig,  # per-provider profile
    ModelType,          # enum: CHAT / EMBEDDING
)
```
### Availability flags

```python
from factflow_llm import (
    ANTHROPIC_AVAILABLE,
    BEDROCK_AVAILABLE,
    SENTENCE_TRANSFORMERS_AVAILABLE,
)
```

These are runtime checks so the factory can skip a provider whose optional deps aren’t installed.
## Dependencies
- Runtime: `openai>=2.26.0`, `anthropic>=0.84.0`, `tiktoken>=0.12.0`. Bedrock and HuggingFace deps are optional (surfaced via the `_AVAILABLE` flags).
- Workspace: `factflow-protocols`, `factflow-foundation`
- External services: depends on which providers are configured; credentials are injected via env / config.
## Testing
Section titled “Testing”Tests at backend/packages/factflow-llm/tests/. See the llm-unit-testing skill for mocking patterns — MockLLMClient is not a library class; tests use unittest.mock directly.
## Related
- `factflow-protocols` — the `LLMClientProtocol` / `EmbeddingClientProtocol` contracts
- Rule: `.claude/rules/llm-conventions.md` — factory, rate limiting, and error classification invariants
- Workflow packages that consume LLM clients — embedding adapters in `factflow-embeddings`, extraction / chat adapters across the rest