Update README with fancy formatting and full instructions

This commit is contained in:
Vera-AI
2026-03-26 12:55:58 -05:00
parent 5a0562f2ef
commit 46361153a9

README.md

@@ -1,223 +1,249 @@
<div align="center">

# Vera-AI

### *Vera* (Latin): **True** — *True AI*

**Persistent Memory Proxy for Ollama**

*A transparent proxy that gives your AI conversations lasting memory.*

[![Docker](https://img.shields.io/docker/pulls/vera-ai/latest?style=for-the-badge)](https://hub.docker.com/r/vera-ai/latest)
[![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
[![Gitea](https://img.shields.io/badge/repo-Gitea-orange?style=for-the-badge)](http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2)

---

**Vera-AI sits between your AI client and Ollama, automatically augmenting conversations with relevant context from previous sessions.**

Every conversation is stored in the Qdrant vector database and retrieved contextually — giving your AI **true memory**.

</div>

---
## 🌟 Features
| Feature | Description |
|---------|-------------|
| **🧠 Persistent Memory** | Conversations stored in Qdrant, retrieved contextually |
| **📅 Monthly Curation** | Daily + monthly cleanup of raw memories |
| **🔍 4-Layer Context** | System + semantic + recent + current messages |
| **👤 Configurable UID/GID** | Match container user to host for permissions |
| **🌍 Timezone Support** | Scheduler runs in your local timezone |
| **📝 Debug Logging** | Optional logs written to configurable directory |
| **🐳 Docker Ready** | One-command build and run |
## 📋 Prerequisites
| Requirement | Description |
|-------------|-------------|
| **Ollama** | LLM inference server (e.g., `http://10.0.0.10:11434`) |
| **Qdrant** | Vector database (e.g., `http://10.0.0.22:6333`) |
| **Docker** | Docker and Docker Compose installed |
| **Git** | For cloning the repository |
## 🚀 Quick Start
```bash
# 1. Clone
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2

# 2. Configure
cp .env.example .env
nano .env  # Set APP_UID, APP_GID, TZ

# 3. Create directories
mkdir -p config prompts logs
cp config.toml config/

# 4. Run
docker compose build
docker compose up -d

# 5. Test
curl http://localhost:11434/
# Expected: {"status":"ok","ollama":"reachable"}
```
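Because the proxy exposes the standard Ollama API on port 11434, any Ollama client can point at it unchanged. A minimal Python sketch of the request shape (the model name and helper are illustrative, not part of the project):

```python
import json

def chat_payload(model: str, content: str, stream: bool = False) -> dict:
    """Build an Ollama-compatible /api/chat request body (illustrative helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": stream,
    }

body = json.dumps(chat_payload("llama3", "hello")).encode()
# POST `body` to http://localhost:11434/api/chat; Vera-AI injects the memory
# layers and forwards the augmented request to Ollama.
print(body.decode())
```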
## 📖 Full Setup Guide

### Step 1: Clone Repository

```bash
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2
```
### Step 2: Environment Configuration

Create `.env` file (or copy from `.env.example`):

```bash
# ═══════════════════════════════════════════════════════════════
# User/Group Configuration
# ═══════════════════════════════════════════════════════════════
# IMPORTANT: Match these to your host user for volume permissions
APP_UID=1000  # Run: id -u to get your UID
APP_GID=1000  # Run: id -g to get your GID

# ═══════════════════════════════════════════════════════════════
# Timezone Configuration
# ═══════════════════════════════════════════════════════════════
# Affects curator schedule (daily at 02:00, monthly on 1st at 03:00)
TZ=America/Chicago

# ═══════════════════════════════════════════════════════════════
# Optional: Cloud Model Routing
# ═══════════════════════════════════════════════════════════════
# OPENROUTER_API_KEY=your_api_key_here
```
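To see the values that belong in `APP_UID`/`APP_GID`, you can ask the OS directly (POSIX hosts only; equivalent to `id -u` / `id -g`):

```python
import os

# These should match APP_UID / APP_GID in .env so the container user
# can write to the mounted prompts/ and logs/ directories.
print(f"APP_UID={os.getuid()}")
print(f"APP_GID={os.getgid()}")
```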
### Step 3: Directory Structure

```bash
# Create required directories
mkdir -p config prompts logs

# Copy default configuration
cp config.toml config/

# Verify prompts exist
ls -la prompts/
# Should show: curator_prompt.md, systemprompt.md
```
### Step 4: Configure Services

Edit `config/config.toml`:

```toml
[general]
# Your Ollama server
ollama_host = "http://10.0.0.10:11434"

# Your Qdrant server
qdrant_host = "http://10.0.0.22:6333"
qdrant_collection = "memories"

# Embedding model for semantic search
embedding_model = "snowflake-arctic-embed2"

debug = false

[layers]
# Token budgets for context layers
semantic_token_budget = 25000
context_token_budget = 22000
semantic_search_turns = 2
semantic_score_threshold = 0.6

[curator]
# Daily curator: processes recent 24h
run_time = "02:00"

# Monthly curator: processes ALL raw memories
full_run_time = "03:00"
full_run_day = 1  # Day of month (1st)

# Model for curation
curator_model = "gpt-oss:120b"
```
### Step 5: Build and Run

```bash
# Build with your UID/GID
APP_UID=$(id -u) APP_GID=$(id -g) docker compose build

# Start container
docker compose up -d

# Check status
docker ps
docker logs vera-ai --tail 20
```
### Step 6: Verify Installation

```bash
# ✅ Health check
curl http://localhost:11434/
# Expected: {"status":"ok","ollama":"reachable"}

# ✅ Container status
docker ps --format "table {{.Names}}\t{{.Status}}"
# Expected: vera-ai   Up X minutes (healthy)

# ✅ Timezone
docker exec vera-ai date
# Should show your timezone (e.g., CDT for America/Chicago)

# ✅ User permissions
docker exec vera-ai id
# Expected: uid=1000(appuser) gid=1000(appgroup)

# ✅ Directories
docker exec vera-ai ls -la /app/prompts/
# Should show: curator_prompt.md, systemprompt.md

# ✅ Test chat
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5:397b-cloud","messages":[{"role":"user","content":"hello"}],"stream":false}'
```
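The health probe above can also be scripted. This sketch only parses the response body, so it runs without a live container; swap in a real HTTP call against `http://localhost:11434/` when wiring it into monitoring:

```python
import json

def is_healthy(body: bytes) -> bool:
    """Check a health response like {"status":"ok","ollama":"reachable"}."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return data.get("status") == "ok" and data.get("ollama") == "reachable"

print(is_healthy(b'{"status":"ok","ollama":"reachable"}'))  # True
print(is_healthy(b'not json'))                              # False
```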
## ⚙️ Configuration Reference

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `APP_UID` | `999` | Container user ID (match host) |
| `APP_GID` | `999` | Container group ID (match host) |
| `TZ` | `UTC` | Container timezone |
| `OPENROUTER_API_KEY` | - | Cloud model routing key |
| `VERA_CONFIG_DIR` | `/app/config` | Config directory |
| `VERA_PROMPTS_DIR` | `/app/prompts` | Prompts directory |
| `VERA_LOG_DIR` | `/app/logs` | Debug logs directory |

### Volume Mappings

| Host Path | Container Path | Mode | Purpose |
|-----------|----------------|------|---------|
| `./config/config.toml` | `/app/config/config.toml` | `ro` | Configuration |
| `./prompts/` | `/app/prompts/` | `rw` | Curator prompts |
| `./logs/` | `/app/logs/` | `rw` | Debug logs |
### Directory Structure

```
vera-ai-v2/
├── 📁 config/
│   └── 📄 config.toml         # Main configuration
├── 📁 prompts/
│   ├── 📄 curator_prompt.md   # Memory curation prompt
│   └── 📄 systemprompt.md     # System context
├── 📁 logs/                   # Debug logs (when debug=true)
├── 📁 app/
│   ├── 🐍 main.py             # FastAPI application
│   ├── 🐍 config.py           # Configuration loader
│   ├── 🐍 curator.py          # Memory curation
│   ├── 🐍 proxy_handler.py    # Chat handling
│   ├── 🐍 qdrant_service.py   # Vector operations
│   ├── 🐍 singleton.py        # QdrantService singleton
│   └── 🐍 utils.py            # Utilities
├── 📁 static/                 # Legacy symlinks
├── 📄 .env.example            # Environment template
├── 📄 docker-compose.yml      # Docker Compose
├── 📄 Dockerfile              # Container definition
├── 📄 requirements.txt        # Python dependencies
└── 📄 README.md               # This file
```
## 🐳 Docker Compose

```yaml
services:
@@ -233,8 +259,11 @@ services:
    env_file:
      - .env
    volumes:
      # Configuration (read-only)
      - ./config/config.toml:/app/config/config.toml:ro
      # Prompts (read-write for curator)
      - ./prompts:/app/prompts:rw
      # Debug logs (read-write)
      - ./logs:/app/logs:rw
    network_mode: "host"
    restart: unless-stopped
@@ -246,37 +275,36 @@ services:
      start_period: 10s
```
## 🌍 Timezone Configuration

The `TZ` variable sets the container timezone for the scheduler:

```bash
# Common timezones
TZ=UTC                  # Coordinated Universal Time
TZ=America/New_York     # Eastern Time
TZ=America/Chicago      # Central Time
TZ=America/Los_Angeles  # Pacific Time
TZ=Europe/London        # GMT/BST
```

**Curation Schedule:**

| Schedule | Time | What | Frequency |
|----------|------|------|-----------|
| Daily | 02:00 | Recent 24h | Every day |
| Monthly | 03:00 on 1st | ALL raw memories | 1st of month |
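A quick way to sanity-check what "02:00 local" means for your `TZ` value, using Python's stdlib `zoneinfo`:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Chicago")
now = datetime.now(tz)
# The curator's daily run fires at 02:00 in this zone, not 02:00 UTC
print(f"Container-local time: {now:%H:%M %Z} (UTC offset {now:%z})")
```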
## 🔌 API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | `GET` | Health check |
| `/api/chat` | `POST` | Chat completion (with memory) |
| `/api/tags` | `GET` | List available models |
| `/api/generate` | `POST` | Generate completion |
| `/curator/run` | `POST` | Trigger curator manually |

### Manual Curation

```bash
# Daily curation (recent 24h)
curl -X POST http://localhost:11434/curator/run

# Full curation (all raw memories)
curl -X POST "http://localhost:11434/curator/run?full=true"
```
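From code, the same trigger is a plain POST; the `full` query flag is built like any other URL parameter:

```python
from urllib.parse import urlencode

base = "http://localhost:11434/curator/run"
daily_url = base                                    # daily curation
full_url = f"{base}?{urlencode({'full': 'true'})}"  # full pass over raw memories
print(full_url)  # http://localhost:11434/curator/run?full=true
```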
## 🧠 Memory System

### 4-Layer Context

```
┌──────────────────────────────────────────────────┐
│ Layer 1: System Prompt                           │
│ - From prompts/systemprompt.md                   │
│ - Static context, curator can append rules       │
├──────────────────────────────────────────────────┤
│ Layer 2: Semantic Memory                         │
│ - Curated Q&A pairs from Qdrant                  │
│ - Retrieved by relevance to current message      │
├──────────────────────────────────────────────────┤
│ Layer 3: Recent Context                          │
│ - Last N conversation turns from Qdrant          │
│ - Chronological order                            │
├──────────────────────────────────────────────────┤
│ Layer 4: Current Messages                        │
│ - User/assistant messages from current request   │
│ - Passed through unchanged                       │
└──────────────────────────────────────────────────┘
```
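The assembly order can be sketched in a few lines. The function name and memory formatting here are illustrative, not the proxy's actual internals:

```python
def assemble_context(system_prompt, semantic_memories, recent_turns, current_messages):
    """Layer order: system -> semantic -> recent -> current (sketch)."""
    messages = [{"role": "system", "content": system_prompt}]              # Layer 1
    for mem in semantic_memories:                                          # Layer 2
        messages.append({"role": "system", "content": f"[memory] {mem}"})
    messages.extend(recent_turns)                                          # Layer 3
    messages.extend(current_messages)                                      # Layer 4
    return messages

ctx = assemble_context(
    "You are Vera.",
    ["Q: favorite color? A: blue"],
    [{"role": "user", "content": "earlier question"}],
    [{"role": "user", "content": "hello"}],
)
print(len(ctx))  # 4
```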
### Memory Types

| Type | Description | Retention |
|------|-------------|-----------|
| `raw` | Unprocessed conversation turns | Until curation |
| `curated` | Cleaned Q&A pairs | Permanent |
| `test` | Test entries | Can be ignored |
## 🔧 Troubleshooting

### Permission Denied

```bash
# Check your UID/GID
id

# Rebuild with correct values
APP_UID=$(id -u) APP_GID=$(id -g) docker compose build --no-cache
docker compose up -d
```
### Wrong Timezone

```bash
# Check container time
docker exec vera-ai date

# Fix in .env
TZ=America/Chicago
```
### Health Check Failing

```bash
# Check logs
docker logs vera-ai --tail 50

# Test Ollama connectivity
docker exec vera-ai python -c "
import urllib.request
print(urllib.request.urlopen('http://YOUR_OLLAMA_IP:11434/').read())
"

# Test Qdrant connectivity
docker exec vera-ai python -c "
import urllib.request
print(urllib.request.urlopen('http://YOUR_QDRANT_IP:6333/').read())
"
```
### Port Already in Use

```bash
# Check what's using port 11434
sudo lsof -i :11434

# Stop the conflicting service or change the port in config
```
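The `lsof` check can also be done portably from Python; `connect_ex` returns 0 when something is already listening on the port:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

print(port_in_use(11434))
```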
## 🛠️ Development

### Build from Source

```bash
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2

# Install dependencies locally (optional)
pip install -r requirements.txt

# Build Docker image
docker compose build
```
### Run Tests

```bash
# Health check
curl http://localhost:11434/

# Non-streaming chat
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5:397b-cloud","messages":[{"role":"user","content":"test"}],"stream":false}'

# Trigger curation
curl -X POST http://localhost:11434/curator/run
```
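When `stream` is true, Ollama-style endpoints return NDJSON, one chunk per line. A client-side parsing sketch (the sample chunks are made up):

```python
import json

def stream_text(lines):
    """Yield the content field from each NDJSON chat chunk."""
    for line in lines:
        chunk = json.loads(line)
        yield chunk.get("message", {}).get("content", "")

sample = [b'{"message":{"content":"Hel"}}', b'{"message":{"content":"lo"}}']
print("".join(stream_text(sample)))  # Hello
```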
## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🤝 Support

| Resource | Link |
|----------|------|
| **Repository** | http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2 |
| **Issues** | http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2/issues |

---

<div align="center">

**Vera-AI** · *True AI Memory*

Made with ❤️ by the OpenClaw team

</div>