128 lines
3.3 KiB
Markdown
128 lines
3.3 KiB
Markdown
|
|
# Vera-AI Project
|
||
|
|
|
||
|
|
**Persistent Memory Proxy for Ollama**
|
||
|
|
|
||
|
|
> **Status:** Built and running on deb8. Goal: Validate and improve.
|
||
|
|
|
||
|
|
Vera-AI sits between AI clients and Ollama, storing conversations in Qdrant and retrieving context semantically — giving AI **true memory**.
|
||
|
|
|
||
|
|
## Architecture
|
||
|
|
|
||
|
|
```
|
||
|
|
Client → Vera-AI (port 11434) → Ollama
|
||
|
|
↓
|
||
|
|
Qdrant (vector DB)
|
||
|
|
↓
|
||
|
|
Memory Storage
|
||
|
|
```
|
||
|
|
|
||
|
|
## Key Components
|
||
|
|
|
||
|
|
| File | Purpose |
|
||
|
|
|------|---------|
|
||
|
|
| `app/main.py` | FastAPI application entry point |
|
||
|
|
| `app/proxy_handler.py` | Chat request handling |
|
||
|
|
| `app/qdrant_service.py` | Vector DB operations |
|
||
|
|
| `app/curator.py` | Memory curation (daily/monthly) |
|
||
|
|
| `app/config.py` | Configuration loader |
|
||
|
|
| `config/config.toml` | Main configuration file |
|
||
|
|
|
||
|
|
## 4-Layer Context System
|
||
|
|
|
||
|
|
1. **System Prompt** — From `prompts/systemprompt.md`
|
||
|
|
2. **Semantic Memory** — Curated Q&A from Qdrant (relevance search)
|
||
|
|
3. **Recent Context** — Last N conversation turns
|
||
|
|
4. **Current Messages** — User's current request
|
||
|
|
|
||
|
|
## Configuration
|
||
|
|
|
||
|
|
Key settings in `config/config.toml`:
|
||
|
|
|
||
|
|
```toml
|
||
|
|
[general]
|
||
|
|
ollama_host = "http://10.0.0.10:11434"
|
||
|
|
qdrant_host = "http://10.0.0.22:6333"
|
||
|
|
qdrant_collection = "memories"
|
||
|
|
embedding_model = "snowflake-arctic-embed2"
|
||
|
|
|
||
|
|
[layers]
|
||
|
|
semantic_token_budget = 25000
|
||
|
|
context_token_budget = 22000
|
||
|
|
semantic_search_turns = 2
|
||
|
|
semantic_score_threshold = 0.6
|
||
|
|
|
||
|
|
[curator]
|
||
|
|
run_time = "02:00" # Daily curation time
|
||
|
|
curator_model = "gpt-oss:120b"
|
||
|
|
```
|
||
|
|
|
||
|
|
## Environment Variables
|
||
|
|
|
||
|
|
| Variable | Default | Description |
|
||
|
|
|----------|---------|-------------|
|
||
|
|
| `APP_UID` | `999` | Container user ID |
|
||
|
|
| `APP_GID` | `999` | Container group ID |
|
||
|
|
| `TZ` | `UTC` | Timezone |
|
||
|
|
| `VERA_DEBUG` | `false` | Enable debug logging |
|
||
|
|
|
||
|
|
## Running
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# Build and start
|
||
|
|
docker compose build
|
||
|
|
docker compose up -d
|
||
|
|
|
||
|
|
# Check status
|
||
|
|
docker ps
|
||
|
|
docker logs VeraAI --tail 20
|
||
|
|
|
||
|
|
# Health check
|
||
|
|
curl http://localhost:11434/
|
||
|
|
```
|
||
|
|
|
||
|
|
## API Endpoints
|
||
|
|
|
||
|
|
| Endpoint | Method | Description |
|
||
|
|
|----------|--------|-------------|
|
||
|
|
| `/` | GET | Health check |
|
||
|
|
| `/api/chat` | POST | Chat completion (with memory) |
|
||
|
|
| `/api/tags` | GET | List models |
|
||
|
|
| `/api/generate` | POST | Generate completion |
|
||
|
|
| `/curator/run` | POST | Trigger curation manually |
|
||
|
|
|
||
|
|
## Development Workflow
|
||
|
|
|
||
|
|
This project is synced with **deb9** (10.0.0.48). To sync changes:
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# Pull from deb9
|
||
|
|
sshpass -p 'passw0rd' scp -r -o StrictHostKeyChecking=no n8n@10.0.0.48:/home/n8n/vera-ai/* /home/n8n/vera-ai/
|
||
|
|
|
||
|
|
# Push to deb9 (after local changes)
|
||
|
|
sshpass -p 'passw0rd' scp -r -o StrictHostKeyChecking=no /home/n8n/vera-ai/* n8n@10.0.0.48:/home/n8n/vera-ai/
|
||
|
|
```
|
||
|
|
|
||
|
|
## Memory System
|
||
|
|
|
||
|
|
- **raw** memories — Unprocessed conversation turns (until curation)
|
||
|
|
- **curated** memories — Cleaned Q&A pairs (permanent)
|
||
|
|
- **test** memories — Test entries (can be ignored)
|
||
|
|
|
||
|
|
Curation runs daily at 02:00 and monthly on the 1st at 03:00.
|
||
|
|
|
||
|
|
## Related Infrastructure
|
||
|
|
|
||
|
|
| Service | Host | Port |
|
||
|
|
|---------|------|------|
|
||
|
|
| Qdrant | 10.0.0.22 | 6333 |
|
||
|
|
| Ollama | 10.0.0.10 | 11434 |
|
||
|
|
| deb9 | 10.0.0.48 | Source project (SSH) |
|
||
|
|
| deb8 | 10.0.0.46 | Docker runtime |
|
||
|
|
|
||
|
|
## Qdrant Collections
|
||
|
|
|
||
|
|
| Collection | Purpose |
|
||
|
|
|------------|---------|
|
||
|
|
| `python_kb` | Python code patterns reference for this project |
|
||
|
|
| `memories` | Conversation memory storage (default) |
|
||
|
|
| `vera_memories` | Alternative memory collection |
|