Technical architecture (how it works)
This page describes how log-triage works end-to-end and what each major file/module is responsible for.
The codebase is organized around three entry points:
- logtriage (CLI)
- logtriage-webui (Web UI)
- logtriage-rag (RAG service)
High-level data flow
Batch analysis (CLI)
- Load config (logtriage/config.py)
- Build pipelines/modules (logtriage/config.py)
- Analyze a file or directory (logtriage/engine.py)
- Group log lines (logtriage/grouping/*)
- Classify grouped chunks (logtriage/classifiers/*) to produce Finding objects
- Optionally call an LLM (logtriage/llm_client.py) using a payload rendered by logtriage/llm_payload.py
- Optionally store findings (database helpers in logtriage/webui/db.py)
- Optionally send alerts (logtriage/alerts.py / logtriage/notifications.py)
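The batch steps above can be sketched as one function that groups, classifies, and then runs whichever optional stages were supplied. This is an illustration only, not the project's code: run_batch, the blank-line grouping, the ERROR regex, and the fields of this Finding are all hypothetical stand-ins.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Finding:
    """Hypothetical stand-in for the project's Finding model."""
    severity: str
    message: str
    llm_summary: Optional[str] = None


def group_lines(lines: Iterable[str]) -> List[List[str]]:
    """Toy grouping (an assumption): blank lines separate chunks."""
    chunks, current = [], []
    for line in lines:
        if line.strip():
            current.append(line)
        elif current:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def classify(chunk: List[str]) -> List[Finding]:
    """Toy classifier: one Finding per line containing the word ERROR."""
    return [Finding("ERROR", line) for line in chunk if re.search(r"\bERROR\b", line)]


def run_batch(
    lines: Iterable[str],
    llm: Optional[Callable[[Finding], str]] = None,
    store: Optional[Callable[[Finding], None]] = None,
    alert: Optional[Callable[[Finding], None]] = None,
) -> List[Finding]:
    """Group, classify, then apply the optional LLM/store/alert stages in order."""
    findings = [f for chunk in group_lines(lines) for f in classify(chunk)]
    for finding in findings:
        if llm is not None:
            finding.llm_summary = llm(finding)
        if store is not None:
            store(finding)
        if alert is not None:
            alert(finding)
    return findings
```

Passing the optional stages as callables keeps the core loop identical whether or not an LLM, a database, or an alert channel is configured.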
Follow/tail analysis (CLI)
- Follow a file like tail -F with rotation handling (logtriage/stream.py)
- For each batch of appended lines:
  - classify
  - optionally enrich with LLM + RAG
  - optionally store/send
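One common way to get tail -F-like behavior is to remember the file's inode and read offset, and to restart from the beginning when either indicates a rotation. The sketch below assumes that approach; FollowState and read_appended are hypothetical names, and logtriage/stream.py may detect rotation differently.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass
class FollowState:
    """Remembers which file was being read and how far reading got."""
    inode: int = -1
    offset: int = 0


def read_appended(path: str, state: FollowState) -> List[str]:
    """Return the complete lines appended since the previous call.

    Rotation is assumed when the inode changes or the file shrinks; in both
    cases reading restarts from the beginning of the current file.
    """
    try:
        stat = os.stat(path)
    except FileNotFoundError:
        return []  # mid-rotation: the new file does not exist yet
    if stat.st_ino != state.inode or stat.st_size < state.offset:
        state.inode, state.offset = stat.st_ino, 0
    with open(path, "rb") as fh:
        fh.seek(state.offset)
        data = fh.read()
    # Only consume up to the last newline; a partial line waits for the next call.
    end = data.rfind(b"\n") + 1
    state.offset += end
    return data[:end].decode("utf-8", errors="replace").splitlines()
```

A follow loop would call read_appended on a timer and hand each non-empty batch to the classifier. Note that this sketch starts at the beginning of the file on the first call, whereas a real follower may prefer to start at the end.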
Web UI
- The Web UI is a FastAPI app (logtriage/webui/app.py) that:
  - reads the same YAML configuration
  - shows findings (in-memory or via database)
  - can edit config.yaml
  - can test/save regexes
  - can trigger LLM calls
  - can optionally integrate with the RAG service
RAG (Retrieval-Augmented Generation)
- The RAG service (logtriage/rag/service.py) builds and serves a documentation index.
- The CLI/Web UI optionally call the service via logtriage/rag/service_client.py.
- When available, the client retrieves documentation snippets relevant to a finding and appends them to the LLM prompt (logtriage/llm_client.py, logtriage/llm_payload.py).
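The "optional" part of this flow can be illustrated with a client interface that has a do-nothing fallback, so the prompt code never needs to check whether RAG is enabled. The retrieve method, the prompt wording, and render_prompt are assumptions made for this sketch; only the NoOpRAGClient name comes from the codebase, and its real interface may differ.

```python
from typing import List, Protocol


class RAGClient(Protocol):
    """Assumed minimal interface: return documentation snippets for a query."""
    def retrieve(self, query: str) -> List[str]: ...


class NoOpRAGClient:
    """Fallback for when the RAG service is unavailable: never returns snippets."""
    def retrieve(self, query: str) -> List[str]:
        return []


def render_prompt(finding_text: str, rag: RAGClient) -> str:
    """Build a plain-text prompt and append numbered snippets when there are any."""
    prompt = f"Analyze this log finding:\n{finding_text}"
    snippets = rag.retrieve(finding_text)
    if snippets:
        numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
        prompt += f"\n\nRelevant documentation:\n{numbered}"
    return prompt
```

Numbering the snippets gives the LLM something to cite, which is one plausible way to support citations in the response.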
Key concepts
- Pipeline: a reusable set of grouping + classifier rules applied to matching log files.
- Module: binds a pipeline to an on-disk path and a runtime mode (batch or follow).
- Finding: a structured result representing a problem detected in logs.
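The relationship between these three concepts can be pictured with a few dataclasses. The field names below are hypothetical and chosen only to mirror the definitions above; the actual models live in logtriage/models.py and may be shaped differently.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Pattern


class Mode(Enum):
    BATCH = "batch"
    FOLLOW = "follow"


@dataclass
class Pipeline:
    """Reusable grouping + classifier rules (illustrative fields)."""
    name: str
    path_pattern: Pattern[str]  # which log files the pipeline applies to
    grouping: str               # e.g. "whole_file", "marker", "separator"
    error_patterns: List[Pattern[str]] = field(default_factory=list)


@dataclass
class Module:
    """Binds one pipeline to an on-disk path and a runtime mode."""
    name: str
    path: str
    mode: Mode
    pipeline: Pipeline


@dataclass
class Finding:
    """Structured result describing one detected problem."""
    module: str
    severity: str
    message: str


# Example wiring: one pipeline reused by a batch module and a follow module.
nginx = Pipeline("nginx", re.compile(r"nginx.*\.log$"), "whole_file")
nightly = Module("nginx-nightly", "/var/log/nginx", Mode.BATCH, nginx)
live = Module("nginx-live", "/var/log/nginx/error.log", Mode.FOLLOW, nginx)
```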
Repository structure / role of each file
Packaging and entry points
- setup.py
  - Packaging metadata and extras (webui, alerts, rag).
  - Defines console scripts:
    - logtriage=logtriage.cli:main
    - logtriage-webui=logtriage.webui.__main__:main
    - logtriage-rag=logtriage.rag.service:main
- logtriage/__init__.py
  - Exposes main and __version__.
- logtriage/__main__.py
  - Allows python -m logtriage to run the CLI.
- logtriage/cli.py
  - CLI implementation:
    - argument parsing and command dispatch
    - module execution in batch/follow modes
    - optional config reload (--reload-on-change)
    - optional DB initialization and retention cleanup
    - starts a background RAG monitor to detect service readiness
- logtriage/version.py
  - Stores the package version string.
Core domain model
- logtriage/models.py
  - Dataclasses and enums used across the application:
    - Severity, Finding
    - pipeline/module/LLM configuration models
    - RAG configuration models and retrieval result types
Configuration loading and validation
- logtriage/config.py
  - Loads config.yaml.
  - Compiles regexes and builds:
    - PipelineConfig list (build_pipelines)
    - ModuleConfig list (build_modules)
    - LLM config (build_llm_config)
    - RAG config (build_rag_config, build_module_rag_config)
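Compiling regexes at load time means a bad pattern fails fast with a message that names the offending pipeline, rather than surfacing later during analysis. The sketch below shows that idea on an already-parsed dictionary; the keys (pipelines, name, match, error_regexes) and the PipelineSketch type are hypothetical, not the real config.yaml schema or the real build_pipelines.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Pattern


@dataclass
class PipelineSketch:
    name: str
    match: Pattern[str]
    error_regexes: List[Pattern[str]]


def _compile(pattern: str, where: str) -> Pattern[str]:
    """Compile a pattern, turning re.error into a ValueError with context."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex in {where}: {pattern!r} ({exc})") from exc


def build_pipelines_sketch(raw: Dict[str, Any]) -> List[PipelineSketch]:
    """Turn a parsed config dict into pipeline objects with compiled regexes."""
    pipelines = []
    for entry in raw.get("pipelines", []):
        name = entry["name"]
        where = f"pipeline {name!r}"
        pipelines.append(
            PipelineSketch(
                name=name,
                match=_compile(entry.get("match", ".*"), where),
                error_regexes=[_compile(p, where) for p in entry.get("error_regexes", [])],
            )
        )
    return pipelines
```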
Log analysis engine
- logtriage/engine.py
  - Batch analysis:
    - analyze_file: read file, group lines, classify groups
    - analyze_path: apply the appropriate pipeline(s) across a file or directory
- logtriage/utils.py
  - Utility helpers:
    - enumerate log files (iter_log_files)
    - pick a pipeline for a file (select_pipeline)
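The two utility roles can be illustrated as follows. This is a sketch under assumptions: the _sketch functions are hypothetical, and matching on the file's base name with a first-match-wins rule is a guess at reasonable behavior, not a description of iter_log_files or select_pipeline.

```python
import os
import re
from typing import Iterator, List, Optional, Pattern, Tuple


def iter_log_files_sketch(path: str) -> Iterator[str]:
    """Yield the path itself if it is a file, else every file below it, sorted."""
    if os.path.isfile(path):
        yield path
        return
    for root, dirs, files in os.walk(path):
        dirs.sort()  # makes the traversal order deterministic
        for name in sorted(files):
            yield os.path.join(root, name)


def select_pipeline_sketch(
    file_path: str, pipelines: List[Tuple[str, Pattern[str]]]
) -> Optional[str]:
    """Return the name of the first pipeline whose pattern matches the base name."""
    base = os.path.basename(file_path)
    for name, pattern in pipelines:
        if pattern.search(base):
            return name
    return None
```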
Streaming / follow mode
- logtriage/stream.py
  - Implements tail -F-like follow mode with rotation detection.
  - For each appended batch:
    - classify
    - optional LLM
    - optional persistence/alerts
Grouping strategies (logtriage/grouping/)
- logtriage/grouping/__init__.py
  - Exposes grouping dispatch.
- logtriage/grouping/marker.py
  - Marker-based grouping (start/end regex delimit chunks).
- logtriage/grouping/separator.py
  - Separator-based grouping (split on a regex separator line; supports only_last).
- logtriage/grouping/whole_file.py
  - Whole-file grouping (treats the entire input as one chunk).
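The three strategies differ only in where chunk boundaries fall, which a small sketch makes concrete. The function names and edge-case choices here are assumptions: separator lines are dropped, lines outside a start/end pair are discarded, an unterminated marker chunk is kept, and only_last keeps just the final chunk. The real modules may decide these cases differently.

```python
import re
from typing import List, Optional, Pattern


def group_whole_file(lines: List[str]) -> List[List[str]]:
    """The entire input is one chunk (no chunk at all for empty input)."""
    return [lines] if lines else []


def group_by_separator(
    lines: List[str], separator: Pattern[str], only_last: bool = False
) -> List[List[str]]:
    """Split on lines matching the separator; the separator line itself is dropped."""
    chunks: List[List[str]] = []
    current: List[str] = []
    for line in lines:
        if separator.search(line):
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        chunks.append(current)
    return chunks[-1:] if only_last else chunks


def group_by_marker(
    lines: List[str], start: Pattern[str], end: Pattern[str]
) -> List[List[str]]:
    """Collect lines from a start match through the next end match, inclusive."""
    chunks: List[List[str]] = []
    current: Optional[List[str]] = None
    for line in lines:
        if current is None:
            if start.search(line):
                current = [line]
        else:
            current.append(line)
            if end.search(line):
                chunks.append(current)
                current = None
    if current:
        chunks.append(current)  # unterminated chunk at end of input is kept
    return chunks
```

An only_last option is useful for logs where each run appends a new section and only the most recent run matters.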
Classifiers (logtriage/classifiers/)
- logtriage/classifiers/__init__.py
  - Exposes classifier dispatch.
- logtriage/classifiers/regex_counter.py
  - Regex-based classification:
    - apply ignore patterns first
    - emit a Finding for each match
- logtriage/classifiers/* (other files in the folder)
  - Additional classifier strategies (for example heuristics tailored to specific log formats).
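The "ignore first, then match" order matters because an ignored line must never produce a finding, even if it also matches an error rule. A minimal sketch of that ordering follows; classify_chunk, the (severity, pattern) rule shape, the first-rule-wins choice, and this Finding type are hypothetical and not taken from regex_counter.py.

```python
import re
from dataclasses import dataclass
from typing import List, Pattern, Sequence, Tuple


@dataclass
class Finding:
    """Hypothetical stand-in for the project's Finding model."""
    severity: str
    line_no: int
    message: str


def classify_chunk(
    chunk: Sequence[str],
    rules: Sequence[Tuple[str, Pattern[str]]],
    ignore: Sequence[Pattern[str]] = (),
) -> List[Finding]:
    """Emit one Finding per matching line, skipping ignored lines first."""
    findings = []
    for line_no, line in enumerate(chunk, start=1):
        if any(p.search(line) for p in ignore):
            continue  # ignore patterns take precedence over every rule
        for severity, pattern in rules:
            if pattern.search(line):
                findings.append(Finding(severity, line_no, line))
                break  # first matching rule wins (an assumption)
    return findings
```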
LLM integration
- logtriage/llm_payload.py
  - Renders a plain-text prompt payload for the LLM.
  - Appends RAG context when provided.
- logtriage/llm_client.py
  - Selects provider config and routes to the correct backend via _call_llm().
    - openai backend (_call_chat_completion): OpenAI chat-completions format (/v1/chat/completions), Authorization: Bearer header. Covers OpenAI, local vLLM, Ollama, Azure OpenAI, and any compatible API.
    - anthropic backend (_call_anthropic): Anthropic Messages API (/v1/messages), x-api-key header, system message extracted and sent as a top-level field. Response is normalized to the same internal shape so the rest of the pipeline is provider-agnostic.
    - provider_type is read from LLMProviderConfig and auto-detected from api_base (anything containing anthropic.com defaults to anthropic).
  - Handles provider auth via environment variables.
  - Optionally retrieves RAG context and adds citations.
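The differences between the two backends come down to the URL, the auth header, where the system message goes, and where the reply text sits in the response. The sketch below builds the request pieces without sending anything. The function names, the assumption that api_base carries no /v1 suffix, the max_tokens default, and the pinned anthropic-version value are choices made for this illustration, not details taken from llm_client.py.

```python
from typing import Any, Dict, List, Optional, Tuple

Message = Dict[str, str]


def detect_provider_type(api_base: str, provider_type: Optional[str] = None) -> str:
    """An explicit provider_type wins; otherwise guess from the base URL."""
    if provider_type:
        return provider_type
    return "anthropic" if "anthropic.com" in api_base else "openai"


def build_request(
    api_base: str,
    api_key: str,
    model: str,
    messages: List[Message],
    provider_type: Optional[str] = None,
    max_tokens: int = 1024,
) -> Tuple[str, Dict[str, str], Dict[str, Any]]:
    """Return (url, headers, json_body) for the selected backend."""
    kind = detect_provider_type(api_base, provider_type)
    base = api_base.rstrip("/")
    if kind == "anthropic":
        # The Messages API takes the system prompt as a top-level field.
        system = "\n".join(m["content"] for m in messages if m["role"] == "system")
        body: Dict[str, Any] = {
            "model": model,
            "max_tokens": max_tokens,
            "messages": [m for m in messages if m["role"] != "system"],
        }
        if system:
            body["system"] = system
        headers = {
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        }
        return f"{base}/v1/messages", headers, body
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    return f"{base}/v1/chat/completions", headers, {"model": model, "messages": messages}


def extract_text(kind: str, response: Dict[str, Any]) -> str:
    """Normalize both response shapes to a plain string."""
    if kind == "anthropic":
        blocks = response.get("content", [])
        return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    return response["choices"][0]["message"]["content"]
```

Normalizing the response in one place is what lets storage, alerts, and the Web UI stay unaware of which provider produced the text.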
Alerts, notifications, and logging
- logtriage/alerts.py
  - Dispatches outbound alerts (webhook and/or MQTT) based on severity thresholds.
- logtriage/notifications.py
  - In-process notification collection used by the Web UI and RAG service endpoints.
- logtriage/logging_setup.py
  - Configures logging from the loaded configuration.
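Threshold-based dispatch can be pictured as each channel declaring the minimum severity it cares about. In the sketch below the Severity members, the channel mapping, and dispatch_alert are hypothetical; the real Severity enum and the webhook/MQTT senders in logtriage/alerts.py are not reproduced here.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Tuple


class Severity(IntEnum):
    """Illustrative levels only; ordered so that comparisons work."""
    INFO = 10
    WARNING = 20
    ERROR = 30
    CRITICAL = 40


Channel = Tuple[Severity, Callable[[str], None]]


def dispatch_alert(
    severity: Severity, message: str, channels: Dict[str, Channel]
) -> List[str]:
    """Send to every channel whose threshold is met; return the channel names used."""
    sent = []
    for name, (threshold, send) in channels.items():
        if severity >= threshold:
            send(message)
            sent.append(name)
    return sent
```

With separate thresholds, a noisy channel such as a webhook can receive warnings while a paging channel only fires on critical findings.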
Web UI (logtriage/webui/)
- logtriage/webui/__main__.py
  - Web UI entry point; loads config for logging and starts uvicorn.
- logtriage/webui/app.py
  - Main FastAPI application:
    - routing for dashboard, log explorer, config editor, regex tools
    - session middleware setup
    - optional background RAG monitor
- logtriage/webui/auth.py
  - Password verification and session-token helpers.
- logtriage/webui/config.py
  - Web UI-specific settings parsing (webui.* section of config).
- logtriage/webui/db.py
  - SQLAlchemy models + persistence helpers for findings and LLM results.
  - Provides retention cleanup and statistics queries.
- logtriage/webui/ingestion_status.py
  - Derives module “stale/active” status for the dashboard.
- logtriage/webui/regex_utils.py
  - Regex validation and sample preparation helpers for the regex lab.
- logtriage/webui/assets/* and logtriage/webui/templates/*
  - Static assets and server-rendered HTML templates.
RAG implementation (logtriage/rag/)
- logtriage/rag/service.py
  - Standalone FastAPI service:
    - initializes the index in the background
    - exposes endpoints for health, status, repository progress and retrieval
- logtriage/rag/service_client.py
  - HTTP client used by CLI/Web UI to talk to the RAG service.
  - Provides a NoOpRAGClient fallback.
- logtriage/rag/monitor.py
  - Background thread helper used by CLI/Web UI to monitor whether the RAG service is up and ready.
- logtriage/rag/rag_client.py
  - In-process coordinator (used by the RAG service):
    - knowledge management
    - document processing
    - embeddings
    - vector store
    - retrieval
- logtriage/rag/knowledge_manager.py
  - Clones/updates Git repositories and enumerates documentation files.
- logtriage/rag/document_processor.py
  - Splits documentation files into chunks (by headings/paragraphs) with memory cleanup.
- logtriage/rag/embeddings.py
  - Wraps embedding generation (SentenceTransformers) and batching.
- logtriage/rag/subprocess_embeddings.py
  - Alternate embedding approach using subprocess isolation (when used).
- logtriage/rag/vector_store.py
  - Persistent FAISS index + SQLite metadata store.
- logtriage/rag/retrieval.py
  - Builds a query from a Finding, embeds it, queries vector store, filters by similarity.
- logtriage/rag/__init__.py
  - Exposes public RAG symbols for import by the rest of the package.
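The retrieval step (embed, query, filter by similarity) can be illustrated without the real dependencies. The sketch below swaps FAISS for a brute-force cosine-similarity scan over a small in-memory list and takes ready-made vectors instead of calling SentenceTransformers. The retrieve function, top_k, and min_similarity are hypothetical names chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
    min_similarity: float = 0.5,
) -> List[Tuple[float, str]]:
    """Score every chunk, drop weak matches, return the best top_k as (score, text)."""
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored = [item for item in scored if item[0] >= min_similarity]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

Filtering by a minimum similarity before truncating to top_k avoids padding the prompt with unrelated documentation when nothing in the index is a good match.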
Where to look for specific behavior
- How findings are created: logtriage/classifiers/* and logtriage/engine.py
- How grouping works: logtriage/grouping/*
- How follow-mode works: logtriage/stream.py
- How LLM calls happen: logtriage/llm_client.py
- How RAG is appended to prompts: logtriage/llm_client.py + logtriage/llm_payload.py
- How the docs index is built: logtriage/rag/rag_client.py + logtriage/rag/service.py