<div align="center">

# Vera-AI

### *Vera* (Latin): **True** (*True AI*)

**Persistent Memory Proxy for Ollama**

*A transparent proxy that gives your AI conversations lasting memory.*

[Docker Hub](https://hub.docker.com/r/vera-ai/latest)
[License](LICENSE)
[Repository](http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2)

---

**Vera-AI sits between your AI client and Ollama, automatically augmenting conversations with relevant context from previous sessions.**

Every conversation is stored in the Qdrant vector database and retrieved contextually, giving your AI **true memory**.

</div>

---

## 🌟 Features

| Feature | Description |
|---------|-------------|
| **🧠 Persistent Memory** | Conversations stored in Qdrant, retrieved contextually |
| **📅 Monthly Curation** | Daily and monthly cleanup of raw memories |
| **🔍 4-Layer Context** | System + semantic + recent + current messages |
| **👤 Configurable UID/GID** | Match container user to host for permissions |
| **🌍 Timezone Support** | Scheduler runs in your local timezone |
| **📝 Debug Logging** | Optional logs written to a configurable directory |
| **🐳 Docker Ready** | One-command build and run |

## 📋 Prerequisites

| Requirement | Description |
|-------------|-------------|
| **Ollama** | LLM inference server (e.g., `http://10.0.0.10:11434`) |
| **Qdrant** | Vector database (e.g., `http://10.0.0.22:6333`) |
| **Docker** | Docker and Docker Compose installed |
| **Git** | For cloning the repository |

## 🚀 Quick Start

```bash
# 1. Clone
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2

# 2. Configure
cp .env.example .env
nano .env  # Set APP_UID, APP_GID, TZ

# 3. Create directories
mkdir -p config prompts logs
cp config.toml config/

# 4. Run
docker compose build
docker compose up -d

# 5. Test
curl http://localhost:11434/
# Expected: {"status":"ok","ollama":"reachable"}
```

## 📖 Full Setup Guide

### Step 1: Clone Repository

```bash
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2
```

### Step 2: Environment Configuration

Create a `.env` file (or copy it from `.env.example`):

```bash
# ═══════════════════════════════════════════════════════════════
# User/Group Configuration
# ═══════════════════════════════════════════════════════════════
# IMPORTANT: Match these to your host user for volume permissions

APP_UID=1000       # Run: id -u  to get your UID
APP_GID=1000       # Run: id -g  to get your GID

# ═══════════════════════════════════════════════════════════════
# Timezone Configuration
# ═══════════════════════════════════════════════════════════════
# Affects curator schedule (daily at 02:00, monthly on 1st at 03:00)

TZ=America/Chicago

# ═══════════════════════════════════════════════════════════════
# Optional: Cloud Model Routing
# ═══════════════════════════════════════════════════════════════
# OPENROUTER_API_KEY=your_api_key_here
```

### Step 3: Directory Structure

```bash
# Create required directories
mkdir -p config prompts logs

# Copy default configuration
cp config.toml config/

# Verify prompts exist
ls -la prompts/
# Should show: curator_prompt.md, systemprompt.md
```

### Step 4: Configure Services

Edit `config/config.toml`:

```toml
[general]
# Your Ollama server
ollama_host = "http://10.0.0.10:11434"

# Your Qdrant server
qdrant_host = "http://10.0.0.22:6333"
qdrant_collection = "memories"

# Embedding model for semantic search
embedding_model = "snowflake-arctic-embed2"
debug = false

[layers]
# Token budgets for context layers
semantic_token_budget = 25000
context_token_budget = 22000
semantic_search_turns = 2
semantic_score_threshold = 0.6

[curator]
# Daily curator: processes recent 24h
run_time = "02:00"

# Monthly curator: processes ALL raw memories
full_run_time = "03:00"
full_run_day = 1  # Day of month (1st)

# Model for curation
curator_model = "gpt-oss:120b"
```

### Step 5: Build and Run

```bash
# Build with your UID/GID
APP_UID=$(id -u) APP_GID=$(id -g) docker compose build

# Start container
docker compose up -d

# Check status
docker ps
docker logs vera-ai --tail 20
```

### Step 6: Verify Installation

```bash
# ✅ Health check
curl http://localhost:11434/
# Expected: {"status":"ok","ollama":"reachable"}

# ✅ Container status
docker ps --format "table {{.Names}}\t{{.Status}}"
# Expected: vera-ai   Up X minutes (healthy)

# ✅ Timezone
docker exec vera-ai date
# Should show your timezone (e.g., CST or CDT for America/Chicago)

# ✅ User permissions
docker exec vera-ai id
# Expected: uid=1000(appuser) gid=1000(appgroup)

# ✅ Directories
docker exec vera-ai ls -la /app/prompts/
# Should show: curator_prompt.md, systemprompt.md

# ✅ Test chat
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5:397b-cloud","messages":[{"role":"user","content":"hello"}],"stream":false}'
```

## ⚙️ Configuration Reference

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `APP_UID` | `999` | Container user ID (match host) |
| `APP_GID` | `999` | Container group ID (match host) |
| `TZ` | `UTC` | Container timezone |
| `OPENROUTER_API_KEY` | - | Cloud model routing key |
| `VERA_CONFIG_DIR` | `/app/config` | Config directory |
| `VERA_PROMPTS_DIR` | `/app/prompts` | Prompts directory |
| `VERA_LOG_DIR` | `/app/logs` | Debug logs directory |

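For illustration, the runtime variables in the table could be read with their documented defaults along these lines. This is a sketch, not the project's code: the `Settings` dataclass and `load_settings` function are hypothetical names, and `APP_UID`/`APP_GID` are left out because they are build arguments rather than values the application needs to read.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the runtime environment variables."""
    timezone: str
    config_dir: Path
    prompts_dir: Path
    log_dir: Path
    openrouter_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the defaults in the table."""
    return Settings(
        timezone=env.get("TZ", "UTC"),
        config_dir=Path(env.get("VERA_CONFIG_DIR", "/app/config")),
        prompts_dir=Path(env.get("VERA_PROMPTS_DIR", "/app/prompts")),
        log_dir=Path(env.get("VERA_LOG_DIR", "/app/logs")),
        # An unset or empty key means cloud routing is not configured.
        openrouter_api_key=env.get("OPENROUTER_API_KEY") or None,
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment in as a mapping keeps the function easy to test with a plain dictionary.
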
### Volume Mappings

| Host Path | Container Path | Mode | Purpose |
|-----------|----------------|------|---------|
| `./config/config.toml` | `/app/config/config.toml` | `ro` | Configuration |
| `./prompts/` | `/app/prompts/` | `rw` | Curator prompts |
| `./logs/` | `/app/logs/` | `rw` | Debug logs |

### Directory Structure

```
vera-ai-v2/
├── 📁 config/
│   └── 📄 config.toml        # Main configuration
├── 📁 prompts/
│   ├── 📄 curator_prompt.md  # Memory curation prompt
│   └── 📄 systemprompt.md    # System context
├── 📁 logs/                  # Debug logs (when debug=true)
├── 📁 app/
│   ├── 🐍 main.py            # FastAPI application
│   ├── 🐍 config.py          # Configuration loader
│   ├── 🐍 curator.py         # Memory curation
│   ├── 🐍 proxy_handler.py   # Chat handling
│   ├── 🐍 qdrant_service.py  # Vector operations
│   ├── 🐍 singleton.py       # QdrantService singleton
│   └── 🐍 utils.py           # Utilities
├── 📁 static/                # Legacy symlinks
├── 📄 .env.example           # Environment template
├── 📄 docker-compose.yml     # Docker Compose
├── 📄 Dockerfile             # Container definition
├── 📄 requirements.txt       # Python dependencies
└── 📄 README.md              # This file
```

## 🐳 Docker Compose

```yaml
services:
  vera-ai:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        APP_UID: ${APP_UID:-999}
        APP_GID: ${APP_GID:-999}
    image: vera-ai:latest
    container_name: vera-ai
    env_file:
      - .env
    volumes:
      # Configuration (read-only)
      - ./config/config.toml:/app/config/config.toml:ro
      # Prompts (read-write for curator)
      - ./prompts:/app/prompts:rw
      # Debug logs (read-write)
      - ./logs:/app/logs:rw
    network_mode: "host"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:11434/')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
```

## 🌍 Timezone Configuration

The `TZ` variable sets the container timezone, which the curation scheduler uses for its run times:

```bash
# Common timezones
TZ=UTC                    # Coordinated Universal Time
TZ=America/New_York       # Eastern Time
TZ=America/Chicago        # Central Time
TZ=America/Los_Angeles    # Pacific Time
TZ=Europe/London          # GMT/BST
```

**Curation Schedule:**

| Schedule | Time | Scope | Frequency |
|----------|------|-------|-----------|
| Daily | 02:00 | Recent 24h | Every day |
| Monthly | 03:00 on 1st | ALL raw memories | 1st of month |

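To make the schedule concrete, here is a minimal sketch of how the next daily and monthly run times could be computed from a timezone-aware "now". It is not the scheduler Vera-AI ships; `next_daily_run` and `next_monthly_run` are hypothetical helpers whose defaults mirror `run_time`, `full_run_time`, and `full_run_day` from `config.toml`.

```python
from datetime import datetime, time, timedelta, timezone


def parse_hhmm(value: str) -> time:
    """Turn an "HH:MM" string such as "02:00" into a time object."""
    hour, minute = value.split(":")
    return time(int(hour), int(minute))


def next_daily_run(now: datetime, run_time: str = "02:00") -> datetime:
    """Return the next occurrence of run_time strictly after now."""
    at = parse_hhmm(run_time)
    candidate = now.replace(hour=at.hour, minute=at.minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_monthly_run(now: datetime, run_time: str = "03:00", day: int = 1) -> datetime:
    """Return the next occurrence of run_time on the given day of the month.

    Assumes day is between 1 and 28 so that it exists in every month.
    """
    at = parse_hhmm(run_time)
    candidate = now.replace(day=day, hour=at.hour, minute=at.minute, second=0, microsecond=0)
    if candidate <= now:
        # Roll over to the same day in the following month.
        year, month = (now.year + 1, 1) if now.month == 12 else (now.year, now.month + 1)
        candidate = candidate.replace(year=year, month=month)
    return candidate


if __name__ == "__main__":
    # A fixed UTC-5 offset stands in for a real zone in this demo.
    now = datetime.now(timezone(timedelta(hours=-5)))
    print("next daily run:  ", next_daily_run(now))
    print("next monthly run:", next_monthly_run(now))
```

Inside the container, `now` would more naturally come from `datetime.now(ZoneInfo(os.environ.get("TZ", "UTC")))`, using `zoneinfo` from the standard library, so that the run times follow the configured timezone.
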
## 🔌 API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | `GET` | Health check |
| `/api/chat` | `POST` | Chat completion (with memory) |
| `/api/tags` | `GET` | List available models |
| `/api/generate` | `POST` | Generate completion |
| `/curator/run` | `POST` | Trigger curator manually |

### Manual Curation

```bash
# Daily curation (recent 24h)
curl -X POST http://localhost:11434/curator/run

# Full curation (all raw memories)
curl -X POST "http://localhost:11434/curator/run?full=true"
```

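The same calls can be made from Python with only the standard library. The sketch below builds the requests for `/api/chat` and `/curator/run` as documented above; `chat_request`, `curator_request`, and `send` are hypothetical helper names, `BASE_URL` assumes the proxy listens on `localhost:11434`, and the response body is returned as raw bytes because its exact shape is not documented here.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:11434"  # assumption: proxy running locally


def chat_request(model: str, content: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a non-streaming POST /api/chat request with one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def curator_request(full: bool = False, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /curator/run request; full=True adds ?full=true."""
    query = "?" + urlencode({"full": "true"}) if full else ""
    return urllib.request.Request(f"{base_url}/curator/run{query}", data=b"", method="POST")


def send(request: urllib.request.Request, timeout: float = 120.0) -> bytes:
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(send(chat_request("qwen3.5:397b-cloud", "hello")).decode("utf-8"))
```

Separating request construction from sending makes the request shape easy to inspect without a running proxy.
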
## 🧠 Memory System

### 4-Layer Context

```
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: System Prompt                                      │
│   - From prompts/systemprompt.md                            │
│   - Static context, curator can append rules                │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: Semantic Memory                                    │
│   - Curated Q&A pairs from Qdrant                           │
│   - Retrieved by relevance to current message               │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: Recent Context                                     │
│   - Last N conversation turns from Qdrant                   │
│   - Chronological order                                     │
├─────────────────────────────────────────────────────────────┤
│ Layer 4: Current Messages                                   │
│   - User/assistant messages from current request            │
│   - Passed through unchanged                                │
└─────────────────────────────────────────────────────────────┘
```

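The layering can be sketched in a few lines of Python. This is an illustration of the idea, not the logic in `app/proxy_handler.py`: the function names, the message and hit dictionaries, the 4-characters-per-token estimate, and the mapping of `semantic_token_budget` to layer 2 and `context_token_budget` to layer 3 are all assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token), not a real tokenizer."""
    return max(1, len(text) // 4)


def take_within_budget(messages: list, budget: int) -> list:
    """Keep messages in the given order until the token budget would be exceeded."""
    kept, used = [], 0
    for message in messages:
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return kept


def assemble_context(system_prompt, semantic_hits, recent_turns, current_messages,
                     semantic_token_budget=25000, context_token_budget=22000,
                     semantic_score_threshold=0.6):
    """Build the final message list from the four layers."""
    # Layer 1: static system prompt.
    messages = [{"role": "system", "content": system_prompt}]

    # Layer 2: hits at or above the score threshold, best first, within budget.
    relevant = sorted(
        (hit for hit in semantic_hits if hit["score"] >= semantic_score_threshold),
        key=lambda hit: hit["score"],
        reverse=True,
    )
    as_messages = [{"role": "system", "content": hit["content"]} for hit in relevant]
    messages += take_within_budget(as_messages, semantic_token_budget)

    # Layer 3: keep the newest turns that fit, then restore chronological order.
    newest_first = take_within_budget(list(reversed(recent_turns)), context_token_budget)
    messages += list(reversed(newest_first))

    # Layer 4: the current request's messages, passed through unchanged.
    messages += current_messages
    return messages
```

Filling the recent-context layer from the newest turn backwards means that when the budget is tight, the oldest turns are the ones dropped.
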
### Memory Types

| Type | Description | Retention |
|------|-------------|-----------|
| `raw` | Unprocessed conversation turns | Until curation |
| `curated` | Cleaned Q&A pairs | Permanent |
| `test` | Test entries | Can be ignored |

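As a rough illustration of how these types interact with the two curation runs, the sketch below picks the entries a run would process: only `raw` entries, limited to the last 24 hours unless a full run is requested. The `type` and `timestamp` field names and the `select_for_curation` helper are hypothetical; the actual Qdrant payload layout is not documented here.

```python
from datetime import datetime, timedelta, timezone


def select_for_curation(memories: list, now: datetime, full: bool = False) -> list:
    """Return the raw memories a curation run would look at.

    Daily run (full=False): raw entries from the last 24 hours.
    Monthly run (full=True): all raw entries.
    Curated and test entries are never selected.
    """
    cutoff = now - timedelta(hours=24)
    return [
        memory
        for memory in memories
        if memory["type"] == "raw" and (full or memory["timestamp"] >= cutoff)
    ]


if __name__ == "__main__":
    now = datetime(2024, 6, 2, 2, 0, tzinfo=timezone.utc)
    memories = [
        {"id": 1, "type": "raw", "timestamp": now - timedelta(hours=3)},
        {"id": 2, "type": "raw", "timestamp": now - timedelta(days=10)},
        {"id": 3, "type": "curated", "timestamp": now - timedelta(hours=1)},
    ]
    print([m["id"] for m in select_for_curation(memories, now)])
    print([m["id"] for m in select_for_curation(memories, now, full=True)])
```
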
## 🔧 Troubleshooting

### Permission Denied

```bash
# Check your UID/GID
id

# Rebuild with correct values
APP_UID=$(id -u) APP_GID=$(id -g) docker compose build --no-cache
docker compose up -d
```

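If it is unclear which mounted path is causing the error, a small script can compare the ownership of the host directories with the UID/GID the container was built for. This is a sketch for Linux hosts; `ownership_mismatches` is a hypothetical helper, and the paths in the usage example are the ones from the volume mappings.

```python
import os
from pathlib import Path


def ownership_mismatches(paths, uid: int, gid: int) -> list:
    """Return (path, owner_uid, owner_gid) for every path not owned by uid:gid."""
    mismatches = []
    for path in paths:
        info = Path(path).stat()
        if (info.st_uid, info.st_gid) != (uid, gid):
            mismatches.append((str(path), info.st_uid, info.st_gid))
    return mismatches


if __name__ == "__main__":
    # Run from the repository root; compares against the current host user.
    existing = [p for p in ("config", "prompts", "logs") if Path(p).exists()]
    for path, uid, gid in ownership_mismatches(existing, os.getuid(), os.getgid()):
        print(f"{path} is owned by {uid}:{gid}")
```

Any path it prints is a candidate for `chown`, or for a rebuild with `APP_UID`/`APP_GID` set to the owner shown.
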
### Wrong Timezone

```bash
# Check container time
docker exec vera-ai date

# Fix in .env
TZ=America/Chicago

# Recreate the container so the new value takes effect
docker compose up -d
```

### Health Check Failing

```bash
# Check logs
docker logs vera-ai --tail 50

# Test Ollama connectivity
docker exec vera-ai python -c "
import urllib.request
print(urllib.request.urlopen('http://YOUR_OLLAMA_IP:11434/').read())
"

# Test Qdrant connectivity
docker exec vera-ai python -c "
import urllib.request
print(urllib.request.urlopen('http://YOUR_QDRANT_IP:6333/').read())
"
```

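The two one-liners above raise a traceback when a service is down. A slightly friendlier variant, sketched below, wraps the same `urllib.request.urlopen` call with a timeout and returns a boolean. `is_reachable` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if a GET to url answers with a 2xx status within timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False


if __name__ == "__main__":
    # Replace the placeholders with the hosts from config/config.toml.
    for name, url in (("Ollama", "http://YOUR_OLLAMA_IP:11434/"),
                      ("Qdrant", "http://YOUR_QDRANT_IP:6333/")):
        print(f"{name}: {'reachable' if is_reachable(url) else 'NOT reachable'}")
```
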
### Port Already in Use

```bash
# Check what's using port 11434
sudo lsof -i :11434

# Stop conflicting service or change port in config
```

## 🛠️ Development

### Build from Source

```bash
git clone http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2.git
cd vera-ai-v2
pip install -r requirements.txt
docker compose build
```

### Run Tests

```bash
# Health check
curl http://localhost:11434/

# Non-streaming chat
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3.5:397b-cloud","messages":[{"role":"user","content":"test"}],"stream":false}'

# Trigger curation
curl -X POST http://localhost:11434/curator/run
```

## 📄 License

MIT License. See the [LICENSE](LICENSE) file for details.

## 🤝 Support

| Resource | Link |
|----------|------|
| **Repository** | http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2 |
| **Issues** | http://10.0.0.61:3000/SpeedyFoxAi/vera-ai-v2/issues |

---

<div align="center">

**Vera-AI**: *True AI Memory*

Brought to you by SpeedyFoxAi

</div>