factflow-sharepoint

SharePoint ingest and document conversion adapters. The package pulls documents from SharePoint via Microsoft Graph and converts them (Word, PDF, etc.) to markdown for downstream processing.

Together, the two adapters form a self-contained pipeline that produces markdown from SharePoint-hosted Office documents. It slots into larger pipelines where segmentation, embedding, or knowledge extraction run downstream.

Two subpackages:

sharepoint: ingestion from Microsoft Graph.

  • adapter.py — pipeline adapter (type: sharepoint_ingest in YAML)
  • graph_client.py — thin async wrapper over the Microsoft Graph REST API
  • settings.py — auth + site / drive selection
  • models.py — SharePoint-specific metadata
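
As a rough illustration of what a thin wrapper over the Microsoft Graph REST API has to assemble, the sketch below builds the token request and two drive URLs. It is not the package's graph_client.py: the function names are hypothetical, the real client is async and presumably sends these requests, and only the endpoint shapes (client-credentials token endpoint, drive children listing, item content download) follow Microsoft's documented format. No network call is made.

```python
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, str]:
    """Return (url, form_body) for an OAuth2 client-credentials token request.

    Hypothetical helper; the caller would POST form_body to url.
    """
    url = f"https://login.microsoftonline.com/{quote(tenant_id, safe='')}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # The .default scope requests the app's pre-consented Graph permissions.
        "scope": "https://graph.microsoft.com/.default",
    })
    return url, body


def list_children_url(drive_id: str) -> str:
    """URL that lists the items in the root folder of a drive."""
    # "!" is kept unescaped because drive IDs commonly contain it.
    return f"{GRAPH_BASE}/drives/{quote(drive_id, safe='!')}/root/children"


def download_url(drive_id: str, item_id: str) -> str:
    """URL that returns the binary content of a single drive item."""
    return (
        f"{GRAPH_BASE}/drives/{quote(drive_id, safe='!')}"
        f"/items/{quote(item_id, safe='!')}/content"
    )
```

Keeping URL construction in pure functions like these makes the request shapes testable without a tenant, which is useful given that the integration tests are skipped by default.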

document_converter: converts the downloaded binary documents to markdown.

  • adapter.py — pipeline adapter (type: document_converter)
  • models.py — conversion output metadata

Design decisions:

  • Graph over SOAP. Microsoft Graph is the current API; the legacy SOAP SharePoint API is not supported.
  • Converter stage is separate. Ingest and conversion are separate adapters, and the binary is persisted between stages, so conversion can be replayed independently when the converter is updated.
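
The value of persisting the binary between stages can be sketched with a toy two-stage flow. Everything here is a stand-in and not the package's API: a dict plays the role of the storage provider, and the converters are plain functions from bytes to markdown text. The point is only that the second stage reads what the first stage stored, so a changed converter can be rerun without downloading again.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: a dict acts as the storage provider, and a
# converter is any function from raw bytes to markdown text.
BinaryStore = Dict[str, bytes]
Converter = Callable[[bytes], str]


def ingest(store: BinaryStore, downloads: Dict[str, bytes]) -> None:
    """Stage 1: persist the raw binaries exactly as downloaded."""
    store.update(downloads)


def convert_all(store: BinaryStore, converter: Converter) -> Dict[str, str]:
    """Stage 2: read the persisted binaries and produce markdown."""
    return {name: converter(blob) for name, blob in store.items()}


def converter_v1(blob: bytes) -> str:
    return blob.decode("utf-8")


def converter_v2(blob: bytes) -> str:
    # An "updated" converter: same input, different markdown output.
    return "# " + blob.decode("utf-8").strip()


store: BinaryStore = {}
ingest(store, {"report.docx": b"Quarterly report\n"})

first = convert_all(store, converter_v1)
# Replay conversion with the new converter; ingest is not run again.
replayed = convert_all(store, converter_v2)
```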

The top-level factflow_sharepoint/__init__.py is empty, so nothing is re-exported from the package root. Consumers:

  • Configure both adapters in pipeline YAML
  • Set Graph credentials in env (SHAREPOINT_CLIENT_ID, SHAREPOINT_CLIENT_SECRET, SHAREPOINT_TENANT_ID, or equivalent per settings.py)
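
A small pre-flight check can catch missing credentials before a pipeline run. This is a sketch under the assumption that the three variable names listed above are the ones settings.py reads; if settings.py uses different names, adjust REQUIRED_VARS accordingly. The helper name is hypothetical.

```python
import os
from typing import List, Mapping

# Assumed variable names, taken from the list above; settings.py is authoritative.
REQUIRED_VARS = (
    "SHAREPOINT_CLIENT_ID",
    "SHAREPOINT_CLIENT_SECRET",
    "SHAREPOINT_TENANT_ID",
)


def missing_graph_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_graph_credentials() with no arguments inspects the current process environment; passing a plain dict makes the check easy to unit test.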

Direct imports for custom integrations:

from factflow_sharepoint.sharepoint.adapter import SharePointIngestAdapter
from factflow_sharepoint.document_converter.adapter import DocumentConverterAdapter

Dependencies:

  • Workspace: factflow-protocols, factflow-foundation, factflow-engine
  • External services: Microsoft Graph (tenant access), and a storage provider for downloaded binaries and converted markdown

Tests live at backend/packages/workflows/factflow-sharepoint/tests/. Integration tests hit a sandboxed tenant and are skipped by default in CI.