RAG Quick Start Guide

This guide gets you started quickly with Retrieval-Augmented Generation (RAG) in log-triage.

Prerequisites

  • log-triage installed with RAG dependencies
  • Git repositories containing documentation
  • LLM provider configured (OpenAI, local model, etc.)

5-Minute Setup

1. Install Dependencies

# Install RAG
pip install -e .[rag]

2. Add RAG to Existing config.yaml

Add RAG configuration to your existing config.yaml:

# Add this section to your existing config.yaml
rag:
  enabled: true
  cache_dir: "./rag_cache"
  vector_store_dir: "./rag_vector_store"
  embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
  device: "cpu"
  batch_size: 3
  top_k: 5
  similarity_threshold: 0.7
  max_chunks: 10

# Add rag section to existing modules
modules:
  my_service:
    path: "/var/log/my_service"
    pipeline: "my_pipeline"
    llm:
      enabled: true
      provider: "openai"
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/docs"
          branch: "main"
          include_paths:
            - "docs/**/*.md"      # All .md files in docs and subdirectories
            - "README.md"          # Specific file in root

Understanding include_paths

include_paths (Glob Patterns)

Use glob patterns to specify which files to include:

  • "docs/**/*.md" - All .md files in docs and all subdirectories
  • "README.md" - Specific file in the repository root
  • "troubleshooting/*.md" - .md files in the troubleshooting directory only
  • "**/*.md" - All .md files in the entire repository
  • "docs/**/*.rst" - All .rst files in docs and subdirectories
  • "source/**/*.markdown" - All .markdown files in the source directory

Note: File extensions are specified directly in the glob patterns. No separate include_extensions field is needed.
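The patterns above follow standard recursive-glob semantics. A quick way to sanity-check a pattern before putting it in config.yaml is Python's pathlib (illustration only; the file layout below is made up, and log-triage's own matcher may differ in detail):

```python
import pathlib
import tempfile

# Build a tiny throwaway repository layout to test patterns against
root = pathlib.Path(tempfile.mkdtemp())
for rel in ["README.md", "docs/setup.md", "docs/deep/faq.md", "src/main.py"]:
    path = root / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()

def match(pattern):
    """Return paths under root matching a glob pattern, relative to root."""
    return sorted(str(p.relative_to(root)) for p in root.glob(pattern))

print(match("docs/**/*.md"))  # .md files in docs and all its subdirectories
print(match("README.md"))     # a single file in the repository root
print(match("**/*.md"))       # every .md file anywhere in the tree
```

Note that "docs/*.md" would match only docs/setup.md, while "docs/**/*.md" also picks up files in nested directories such as docs/deep/.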

3. Restart log-triage

python -m logtriage.webui

4. Check Dashboard

  • Navigate to http://localhost:8000
  • Check the "Knowledge Base (RAG) Status" section
  • Wait for initial indexing to complete

5. Test RAG Analysis

  • Go to AI Logs Explorer
  • Find a log entry marked as a finding
  • Click "Query LLM" button
  • Review the AI response with citations

Common Use Cases

Service Documentation

modules:
  api_service:
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/api-docs"
          include_paths:
            - "docs/*.md"
            - "troubleshooting/*.md"

Multiple Knowledge Sources

modules:
  complex_app:
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/user-guide"
          include_paths: ["user/*.md"]
        - repo_url: "https://github.com/myorg/dev-docs"
          include_paths: ["dev/*.md", "api/*.md"]
        - repo_url: "https://github.com/myorg/runbooks"
          include_paths: ["runbooks/*.md"]

Private Repositories

For private repositories, ensure SSH keys or credentials are configured. Adding the host key (below) only prevents interactive host-verification prompts during non-interactive clones; authentication itself still requires an SSH key (e.g. a deploy key) or an HTTPS token with access to the repository:

# Trust GitHub's host key so git clones don't prompt
ssh-keyscan github.com >> ~/.ssh/known_hosts

Performance Tips

For Better Performance

rag:
  device: "cuda"  # Use a CUDA GPU if one is available
  similarity_threshold: 0.8  # Higher threshold = fewer chunks, less context for the LLM to process

For Better Quality

rag:
  embedding_model: "sentence-transformers/all-mpnet-base-v2"  # Stronger but slower model
  top_k: 10  # More context passed to the LLM
  similarity_threshold: 0.6  # Lower threshold = more results
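To see how top_k and similarity_threshold interact, here is a minimal sketch of a retrieval filter, assuming similarity scores in [0, 1] (the chunk names and scores are made up, and log-triage's actual implementation may differ):

```python
def select_chunks(scored_chunks, top_k=5, similarity_threshold=0.7):
    """Keep chunks scoring at or above the threshold, best-first, capped at top_k."""
    kept = [(text, score) for text, score in scored_chunks
            if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

candidates = [("restart runbook", 0.91), ("api overview", 0.74),
              ("style guide", 0.42), ("oncall faq", 0.69)]

# At threshold 0.7 the two low-scoring chunks are dropped
print(select_chunks(candidates, top_k=5, similarity_threshold=0.7))
# Lowering the threshold to 0.6 lets "oncall faq" back in
print(select_chunks(candidates, top_k=5, similarity_threshold=0.6))
```

Raising similarity_threshold trades recall for relevance, while top_k caps how much retrieved text is added to the LLM prompt regardless of how many chunks pass the threshold.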