RAG Quick Start Guide
This guide walks through enabling RAG (Retrieval-Augmented Generation) in log-triage with a minimal configuration.
Prerequisites
- log-triage installed with RAG dependencies
- Git repositories containing documentation
- LLM provider configured (OpenAI, local model, etc.)
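5-Minute Setup
1. Install Dependencies
# Install log-triage in editable mode with the RAG extras
# (the quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[rag]"
2. Add RAG to Existing config.yaml
Add a top-level rag section to your existing config.yaml, then enable RAG in each module that should use it: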
# Add this section to your existing config.yaml
rag:
  enabled: true
  cache_dir: "./rag_cache"
  vector_store_dir: "./rag_vector_store"
  embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
  device: "cpu"
  batch_size: 3
  top_k: 5
  similarity_threshold: 0.7
  max_chunks: 10
# Add rag section to existing modules
modules:
  my_service:
    path: "/var/log/my_service"
    pipeline: "my_pipeline"
    llm:
      enabled: true
      provider: "openai"
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/docs"
          branch: "main"
          include_paths:
            - "docs/**/*.md"  # All .md files in docs and subdirectories
            - "README.md"     # Specific file in root
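A typo in this section can go unnoticed until indexing fails, so a quick check before restarting may help. The sketch below is not part of log-triage: `check_rag_config` is a hypothetical helper, and the rules it applies (a similarity threshold between 0 and 1, a positive `top_k`, every knowledge source carrying a `repo_url` and a non-empty `include_paths`) are assumptions drawn from the example above rather than a documented schema. It works on a configuration that has already been parsed into a dictionary.

```python
def check_rag_config(config):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []

    # Global rag section (assumed rules, not an official schema).
    rag = config.get("rag", {})
    if not rag.get("enabled", False):
        problems.append("rag.enabled is not true")
    threshold = rag.get("similarity_threshold")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        problems.append("rag.similarity_threshold should be between 0 and 1")
    top_k = rag.get("top_k")
    if top_k is not None and top_k < 1:
        problems.append("rag.top_k should be at least 1")

    # Per-module rag sections.
    for name, module in config.get("modules", {}).items():
        module_rag = module.get("rag", {})
        if not module_rag.get("enabled", False):
            continue  # this module does not use RAG
        sources = module_rag.get("knowledge_sources", [])
        if not sources:
            problems.append(f"modules.{name}.rag has no knowledge_sources")
        for i, source in enumerate(sources):
            where = f"modules.{name}.rag.knowledge_sources[{i}]"
            if not source.get("repo_url"):
                problems.append(f"{where} is missing repo_url")
            if not source.get("include_paths"):
                problems.append(f"{where} is missing include_paths")
    return problems


example = {
    "rag": {"enabled": True, "top_k": 5, "similarity_threshold": 0.7},
    "modules": {
        "my_service": {
            "rag": {
                "enabled": True,
                "knowledge_sources": [
                    {
                        "repo_url": "https://github.com/myorg/docs",
                        "branch": "main",
                        "include_paths": ["docs/**/*.md", "README.md"],
                    },
                ],
            },
        },
    },
}

# An empty list means the check found nothing to complain about.
print(check_rag_config(example))
```

To run this against a real file, parse config.yaml into a dictionary first (for example with PyYAML's `yaml.safe_load`) and pass the result to the helper.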
Understanding include_paths
include_paths (Glob Patterns)
Use glob patterns to specify which files to include:
- "docs/**/*.md" - All .md files in docs directory and all subdirectories
- "README.md" - Specific file in repository root
- "troubleshooting/*.md" - .md files in troubleshooting directory only
- "**/*.md" - All .md files in entire repository
- "docs/**/*.rst" - All .rst files in docs and subdirectories
- "source/**/*.markdown" - All .markdown files in source directory
Note: File extensions are specified directly in the glob patterns. No separate include_extensions field is needed.
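To see which files a set of patterns would select before indexing a repository, the patterns can be tried against a local checkout. The sketch below uses Python's `pathlib` globbing as a stand-in; whether log-triage resolves `include_paths` with exactly the same rules is an assumption, and `preview_matches` is a hypothetical helper, not a log-triage function. The temporary directory tree only imitates a cloned repository.

```python
import tempfile
from pathlib import Path


def preview_matches(repo_root, include_paths):
    """Return the sorted repository-relative files matched by the glob patterns."""
    root = Path(repo_root)
    matched = set()
    for pattern in include_paths:
        for path in root.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(root).as_posix())
    return sorted(matched)


# Build a small throwaway tree that stands in for a cloned repository.
with tempfile.TemporaryDirectory() as tmp:
    for name in [
        "README.md",
        "docs/intro.md",
        "docs/api/errors.md",
        "docs/api/schema.rst",
        "troubleshooting/disk.md",
    ]:
        file_path = Path(tmp, name)
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.write_text("placeholder\n")

    # "**" descends into subdirectories; a single "*" stays in one directory.
    recursive = preview_matches(tmp, ["docs/**/*.md", "README.md"])
    shallow = preview_matches(tmp, ["docs/*.md"])

print(recursive)
print(shallow)
```

In this sample tree the recursive pattern picks up `docs/intro.md` and `docs/api/errors.md`, while `docs/*.md` selects only `docs/intro.md`. The `.rst` file is skipped in both cases because no pattern names that extension.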
3. Restart log-triage
python -m logtriage.webui
4. Check Dashboard
- Navigate to http://localhost:8000
- Check the "Knowledge Base (RAG) Status" section
- Wait for initial indexing to complete
5. Test RAG Analysis
- Go to the AI Logs Explorer
- Find a log entry marked as a finding
- Click the "Query LLM" button
- Review the AI response with citations
Common Use Cases
Service Documentation
modules:
  api_service:
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/api-docs"
          include_paths:
            - "docs/*.md"
            - "troubleshooting/*.md"
Multiple Knowledge Sources
modules:
  complex_app:
    rag:
      enabled: true
      knowledge_sources:
        - repo_url: "https://github.com/myorg/user-guide"
          include_paths: ["user/*.md"]
        - repo_url: "https://github.com/myorg/dev-docs"
          include_paths: ["dev/*.md", "api/*.md"]
        - repo_url: "https://github.com/myorg/runbooks"
          include_paths: ["runbooks/*.md"]
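Private Repositories
For private repositories, make sure the user running log-triage has SSH keys or other git credentials configured for the repository host:
# Add GitHub's host key to known_hosts so non-interactive clones are not blocked by a host-key prompt.
# This does not create or install an SSH key; a key with read access to the repository must already be set up.
ssh-keyscan github.com >> ~/.ssh/known_hosts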
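Before adding a private repository as a knowledge source, it can be useful to confirm that the configured credentials can actually reach it. The sketch below shells out to `git ls-remote`, which lists remote refs without cloning. `check_repo_access` is a hypothetical helper rather than a log-triage command, and the 30-second timeout is an arbitrary choice.

```python
import os
import subprocess


def check_repo_access(repo_url, branch="main", runner=subprocess.run):
    """Return True if `git ls-remote` can see the given branch on the remote."""
    cmd = ["git", "ls-remote", "--exit-code", "--heads", repo_url, branch]
    # Disable interactive credential prompts so a missing credential fails fast.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=30, env=env)
    except (OSError, subprocess.TimeoutExpired):
        return False  # git is not installed, or the remote did not answer in time
    # With --exit-code, git returns 0 only when a matching ref was listed.
    return result.returncode == 0
```

For example, `check_repo_access("git@github.com:myorg/docs.git", "main")` (a placeholder URL) would return `False` if the SSH key is missing, the host is unreachable, or the branch does not exist. The `runner` parameter exists only so the function can be exercised without network access.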
Performance Tips
For Better Performance
rag:
  device: "cuda"             # If a GPU is available
  similarity_threshold: 0.8  # Higher threshold = fewer chunks retrieved, so less context to process
For Better Quality
rag:
  embedding_model: "sentence-transformers/all-mpnet-base-v2"  # Larger, more accurate model (slower than all-MiniLM-L6-v2)
  top_k: 10                  # More context
  similarity_threshold: 0.6  # Lower threshold = more results
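The trade-off between `top_k` and `similarity_threshold` is easier to see on a toy example. The sketch below assumes cosine similarity and a "filter by threshold, then keep the best `top_k`" order; how log-triage combines the two settings internally is not stated in this guide, so treat this as an illustration of the general idea only. The two-dimensional vectors and chunk names are made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, chunks, top_k=5, similarity_threshold=0.7):
    """Return chunk names scoring at or above the threshold, best first, at most top_k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in kept[:top_k]]


# Toy embeddings: similarity to the query falls from 1.0 to 0.0 down the list.
query = [1.0, 0.0]
chunks = [
    ("restart-procedure", [1.0, 0.0]),   # similarity 1.0
    ("timeout-tuning", [1.0, 1.0]),      # similarity about 0.71
    ("connection-pool", [1.0, 1.15]),    # similarity about 0.66
    ("release-notes", [0.0, 1.0]),       # similarity 0.0
]

strict = retrieve(query, chunks, similarity_threshold=0.8)
default = retrieve(query, chunks, similarity_threshold=0.7)
loose = retrieve(query, chunks, top_k=10, similarity_threshold=0.6)

print(strict)
print(default)
print(loose)
```

With these toy vectors the strict setting returns one chunk, the default two, and the loose setting three. The unrelated chunk is excluded in every case, which is the behaviour the threshold is meant to provide.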