
Architecture Overview

Asclepius runs as a single Docker container. A Python/FastAPI backend serves both the REST API and the pre-built React frontend, and every LLM call goes out to an external service you point it at. There is no bundled model server.

For deployments that need to publish doctor shares to the public internet, the bundled docker-compose.yml ships a sibling asclepius-share service: the same image started with ASCLEPIUS_MODE=share, which mounts only the doctor-share routers and refuses every admin and patient route. Both containers share the SQLite database and vault on disk. See Doctor shares → Publishing the share surface for the deployment topology.
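That two-service topology might look like the compose sketch below. The service name and ASCLEPIUS_MODE come from the description above; the image name, ports, and mount paths are illustrative assumptions, not the bundled file verbatim.

```yaml
services:
  asclepius:
    image: asclepius            # image name is a placeholder
    ports:
      - "8000:8000"             # full admin/patient surface; keep it internal
    volumes:
      - ./data:/data            # SQLite database + vault (paths are assumptions)

  asclepius-share:
    image: asclepius            # same image as above
    environment:
      ASCLEPIUS_MODE: share     # mounts only the doctor-share routers
    ports:
      - "8001:8000"             # the only surface you publish
    volumes:
      - ./data:/data            # shares the database and vault on disk
```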

[Architecture diagram] Browser (React UI) → FastAPI backend (REST · auth · static, Python 3.13) → background pipeline (asyncio task) with a credential gate (max_concurrent) and a normalization resolver (alias lookup) → SQLite + FTS5 (WAL · aiosqlite) and the vault volume. External services: Tesseract 5 OCR (in-container), LLM providers (Ollama · vLLM · claude · openai), vision LLMs (qwen2.5-vl · gpt-4o), remote OCR (chandra · gvision). The gate enforces per-credential concurrency; the resolver collapses aliases before writes.
Component responsibilities:

  • FastAPI Backend: REST API, authentication (session + OIDC), database access, file serving, settings management
  • React Frontend: web UI for browsing, searching, managing records, uploading documents, and configuring settings
  • Processing Pipeline: file watcher (watchdog), OCR, LLM extraction, page sectioning, file organization; runs in a background asyncio task
  • SQLite + FTS5: all structured data storage, with WAL mode for concurrent reads; FTS5 virtual table for full-text search
  • Tesseract OCR: local OCR engine bundled in the container (5 language packs)
  • Ollama / Claude: external LLM providers for document classification, data extraction, chat, and AI editing
  • Vault: organized file storage on the filesystem, mounted as a Docker volume
Request flow:

  1. User interacts with the React UI in the browser
  2. UI makes REST API calls to the FastAPI backend
  3. Backend validates authentication via signed session cookies (or OIDC)
  4. Backend checks authorization via the user_patient_access table
  5. Backend queries SQLite and serves files from the vault

Processing flow:

  1. File watcher (watchdog) detects new files in vault/inbox/
  2. Files are queued with priority (smallest files first)
  3. For each file:
    • Compute SHA-256 hash for deduplication
    • Run OCR (Tesseract, LLM Vision, Google Vision, or Remote)
    • If document >5 pages: smart page-level sectioning
    • Phase 1: Classify document type and extract basic metadata
    • Phase 2: Type-specific extraction (lab results, medications, encounters, etc.)
    • Normalize doctor/facility names, match to existing records
    • Organize file into vault/patients/{slug}/{year}/
  4. Per-document progress tracking (step + current page) visible on Dashboard
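The intake logic in steps 1–3 (hash-based deduplication plus a smallest-files-first queue) can be sketched as follows. The function names and the in-memory `seen` set are illustrative; the real pipeline checks hashes against the database.

```python
import hashlib
import heapq
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 in 64 KiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def enqueue(inbox: Path):
    """Yield unique files, smallest first, via a min-heap keyed on size."""
    seen: set[str] = set()  # hashes already queued (the real app consults the DB)
    heap: list[tuple[int, str]] = []
    for path in sorted(inbox.iterdir()):
        digest = sha256_of(path)
        if digest in seen:
            continue  # duplicate upload: skip it
        seen.add(digest)
        heapq.heappush(heap, (path.stat().st_size, str(path)))
    while heap:
        _size, path = heapq.heappop(heap)
        yield path


# Demo: three files in a temp "inbox", one an exact duplicate.
inbox = Path(tempfile.mkdtemp())
(inbox / "big.pdf").write_bytes(b"x" * 3000)
(inbox / "small.pdf").write_bytes(b"y" * 100)
(inbox / "dupe.pdf").write_bytes(b"x" * 3000)  # same bytes as big.pdf

order = [Path(p).name for p in enqueue(inbox)]
print(order)  # the duplicate is dropped; the smallest file comes first
```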

See Processing Pipeline for the complete flow.

  • No ORM. Raw SQL with aiosqlite. Easier to reason about, easier to optimize, fewer hidden N+1s.
  • SQLite with WAL. Portable, no extra service to run, fast enough for single-instance use. WAL mode lets the web server keep reading while the pipeline writes.
  • Session-based auth. Signed cookies via itsdangerous, bcrypt for passwords. No JWTs to rotate or revoke.
  • File-based storage. Files live on disk under patient/year folders; metadata lives in the database.
  • No bundled LLM. You point Asclepius at your own Ollama, vLLM, Claude, or OpenAI endpoint. The container stays small and the model lifecycle is yours to manage.
  • Two-phase extraction. A cheap classification pass runs first; the second pass loads only the type-specific prompt. The LLM never sees a kitchen-sink schema.
  • Pipeline in a background asyncio task. The web server never blocks on processing. Cancellation works through an in-memory set of cancelled document IDs that the pipeline checks between steps.
  • Runtime pipeline control. The Settings UI starts and stops the pipeline at runtime via app.state.pipeline_task. After five consecutive provider connectivity failures, the pipeline pauses itself.
  • Settings are live. Configuration changes are written back to YAML and applied to the in-memory config immediately, no restart.
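The raw-SQL-plus-WAL choice can be demonstrated with the stdlib sqlite3 module (the app itself uses aiosqlite, and the file path here is hypothetical): after `PRAGMA journal_mode=WAL`, a second connection keeps reading committed data while the first holds an open write transaction.

```python
import os
import sqlite3
import tempfile

# WAL needs a real file; in-memory databases ignore the journal mode.
path = os.path.join(tempfile.mkdtemp(), "asclepius.db")  # hypothetical path
db = sqlite3.connect(path)
mode = db.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # 'wal'

db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO documents (title) VALUES (?)", ("Lab results 2024",))
db.commit()

# Start a second, uncommitted write; a reader still sees the committed snapshot.
db.execute("INSERT INTO documents (title) VALUES (?)", ("Discharge summary",))
reader = sqlite3.connect(path)
rows = reader.execute("SELECT title FROM documents").fetchall()
print(rows)  # only the committed row is visible to the reader
```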
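The cancellation mechanism described above (an in-memory set of cancelled document IDs that the pipeline checks between steps) can be sketched like this; the step names and the set's shape are illustrative, not the app's actual identifiers.

```python
import asyncio

cancelled: set[int] = set()  # document IDs flagged for cancellation (hypothetical)


async def process_document(doc_id: int, steps: list[str], log: list[str]) -> None:
    """Run each step, checking the cancellation set between steps."""
    for step in steps:
        if doc_id in cancelled:
            log.append(f"{doc_id}: cancelled before {step}")
            return
        log.append(f"{doc_id}: {step}")
        await asyncio.sleep(0)  # yield so the web server (and cancel requests) can run


async def main() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(
        process_document(1, ["hash", "ocr", "extract", "organize"], log)
    )
    await asyncio.sleep(0)  # let the pipeline complete its first step
    cancelled.add(1)        # e.g. the user clicked "cancel" in the UI
    await task
    return log


log = asyncio.run(main())
print(log)  # one step runs, then the pipeline bails out before the next
```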