# Vera-AI Project

**Persistent Memory Proxy for Ollama**

**Status:** Built and running on deb8. **Goal:** Validate and improve.

Vera-AI sits between AI clients and Ollama, storing conversations in Qdrant and retrieving context semantically, giving the AI true memory.

## Architecture

```
Client → Vera-AI (port 11434) → Ollama
              ↓
           Qdrant (vector DB)
              ↓
           Memory Storage
```
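The proxy flow above can be sketched in a few lines: requests arrive on the Ollama-compatible port, chat requests get memory context injected, and everything is forwarded upstream. The helper names here are illustrative assumptions; the real logic lives in `app/proxy_handler.py`.

```python
# Illustrative sketch of the proxy's routing decisions, NOT the project's
# actual code: Vera-AI exposes the same paths as Ollama and forwards them.
OLLAMA_HOST = "http://10.0.0.10:11434"  # upstream, from config/config.toml

MEMORY_PATHS = {"/api/chat"}  # only chat completions receive injected memory

def upstream_url(path: str) -> str:
    """Rewrite an incoming Vera-AI request path to the Ollama upstream URL."""
    return f"{OLLAMA_HOST}/{path.lstrip('/')}"

def needs_memory(path: str) -> bool:
    """True for endpoints whose request bodies get memory context injected."""
    return path in MEMORY_PATHS
```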

## Key Components

| File | Purpose |
|------|---------|
| `app/main.py` | FastAPI application entry point |
| `app/proxy_handler.py` | Chat request handling |
| `app/qdrant_service.py` | Vector DB operations |
| `app/curator.py` | Memory curation (daily/monthly) |
| `app/config.py` | Configuration loader |
| `config/config.toml` | Main configuration file |

## 4-Layer Context System

1. **System Prompt** — from `prompts/systemprompt.md`
2. **Semantic Memory** — curated Q&A from Qdrant (relevance search)
3. **Recent Context** — last N conversation turns
4. **Current Messages** — user's current request
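A minimal sketch of how the four layers could be assembled into one Ollama-style message list, trimming semantic memory to its token budget. The function name and the crude 4-characters-per-token estimate are assumptions for illustration, not the project's actual code.

```python
# Sketch of 4-layer context assembly. Token counting is approximated as
# len(text) // 4; the real proxy presumably uses a proper tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def build_context(system_prompt, memories, recent_turns, current_messages,
                  semantic_budget=25000):
    """Assemble the 4 layers into a single message list.

    memories: list of (score, text) Qdrant hits, best first.
    recent_turns / current_messages: lists of {"role", "content"} dicts.
    """
    messages = [{"role": "system", "content": system_prompt}]    # layer 1
    used, selected = 0, []
    for score, text in memories:                                 # layer 2
        cost = estimate_tokens(text)
        if used + cost > semantic_budget:
            break  # stay inside semantic_token_budget
        selected.append(text)
        used += cost
    if selected:
        messages.append({"role": "system",
                         "content": "Relevant memories:\n" + "\n".join(selected)})
    messages.extend(recent_turns)                                # layer 3
    messages.extend(current_messages)                            # layer 4
    return messages
```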

## Configuration

Key settings in `config/config.toml`:

```toml
[general]
ollama_host = "http://10.0.0.10:11434"
qdrant_host = "http://10.0.0.22:6333"
qdrant_collection = "memories"
embedding_model = "snowflake-arctic-embed2"

[layers]
semantic_token_budget = 25000
context_token_budget = 22000
semantic_search_turns = 2
semantic_score_threshold = 0.6

[curator]
run_time = "02:00"  # Daily curation time
curator_model = "gpt-oss:120b"
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `APP_UID` | `999` | Container user ID |
| `APP_GID` | `999` | Container group ID |
| `TZ` | `UTC` | Timezone |
| `VERA_DEBUG` | `false` | Enable debug logging |
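A hedged sketch of reading these variables with the defaults from the table; the function name and the truthy spellings accepted for `VERA_DEBUG` are assumptions, and the real entry point may parse them differently.

```python
import os

# Illustrative reader for the container environment; defaults mirror the
# table above. Not the project's actual settings code.
def env_settings(environ=os.environ):
    return {
        "APP_UID": int(environ.get("APP_UID", "999")),
        "APP_GID": int(environ.get("APP_GID", "999")),
        "TZ": environ.get("TZ", "UTC"),
        # Accept common truthy spellings for the debug flag.
        "VERA_DEBUG": environ.get("VERA_DEBUG", "false").lower()
                      in ("1", "true", "yes"),
    }
```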

## Running

```bash
# Build and start
docker compose build
docker compose up -d

# Check status
docker ps
docker logs VeraAI --tail 20

# Health check
curl http://localhost:11434/
```

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Health check |
| `/api/chat` | POST | Chat completion (with memory) |
| `/api/tags` | GET | List models |
| `/api/generate` | POST | Generate completion |
| `/curator/run` | POST | Trigger curation manually |
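Because Vera-AI is a drop-in Ollama proxy, `/api/chat` accepts a standard Ollama chat body; memory injection happens server-side, so clients send a plain payload. A sketch of building one (the model name is only an example):

```python
import json

# Sketch of an Ollama-compatible /api/chat request body. POST this to
# http://<vera-host>:11434/api/chat; Vera-AI injects memory context itself.
def chat_payload(model: str, user_text: str, stream: bool = False) -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": stream,
    }
    return json.dumps(body)
```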

## Development Workflow

This project is synced with deb9 (10.0.0.48). To sync changes:

```bash
# Pull from deb9
sshpass -p 'passw0rd' scp -r -o StrictHostKeyChecking=no n8n@10.0.0.48:/home/n8n/vera-ai/* /home/n8n/vera-ai/

# Push to deb9 (after local changes)
sshpass -p 'passw0rd' scp -r -o StrictHostKeyChecking=no /home/n8n/vera-ai/* n8n@10.0.0.48:/home/n8n/vera-ai/
```

## Memory System

- **raw** memories — unprocessed conversation turns (until curation)
- **curated** memories — cleaned Q&A pairs (permanent)
- **test** memories — test entries (can be ignored)

Curation runs daily at 02:00 and monthly on the 1st at 03:00.
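Curation can be pictured as partitioning stored records by type. The sketch below assumes each memory payload carries a `type` field with the three values above; that field name is a guess about the schema, and the actual logic lives in `app/curator.py`.

```python
# Sketch of routing memories by type ahead of curation. The "type" payload
# field is an assumed schema detail, not confirmed project code.
def partition_memories(records):
    """Split payloads into raw (to curate), curated (keep), test (ignore)."""
    buckets = {"raw": [], "curated": [], "test": []}
    for rec in records:
        # Unknown or missing types default to "raw" so nothing is lost.
        buckets.setdefault(rec.get("type", "raw"), []).append(rec)
    return buckets
```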

## Hosts & Services

| Service | Host | Port / Role |
|---------|------|-------------|
| Qdrant | 10.0.0.22 | 6333 |
| Ollama | 10.0.0.10 | 11434 |
| deb9 | 10.0.0.48 | Source project (SSH) |
| deb8 | 10.0.0.46 | Docker runtime |

## Qdrant Collections

| Collection | Purpose |
|------------|---------|
| `python_kb` | Python code-pattern reference for this project |
| `memories` | Conversation memory storage (default) |
| `vera_memories` | Alternative memory collection |