Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

root
2026-02-24 20:27:44 -06:00
commit 8bb1abaf18
23 changed files with 4112 additions and 0 deletions


@@ -0,0 +1,494 @@
# TrueRecall v2 - Session Notes
**Last Updated:** 2026-02-24 19:02 CST
**Status:** ✅ Active & Verified
**Version:** v2.2 (Timer-based curation deployed)
---
## Session End (18:09 CST)
**Reason:** User starting new session
**Current State:**
- Real-time watcher: ✅ Active (capturing live)
- Timer curator: ✅ Deployed (every 30 min via cron)
- Daily curator: ❌ Removed (replaced by timer)
- Total memories: 12,378 (all tagged with `curated: false`)
- Gems: 5 (from Feb 18 test)
**Next session start:** Read this file, then check:
```bash
# Quick status
python3 /root/.openclaw/workspace/.projects/true-recall-v2/tr-continuous/curator_by_count.py --status
sudo systemctl status mem-qdrant-watcher
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
```
---
## Executive Summary
TrueRecall v2 is a complete memory system with real-time capture, timer-based curation (every 30 minutes), and context injection. All components are operational.
---
## Current State (Verified 18:09 CST)
### Qdrant Collections
| Collection | Points | Purpose | Status |
|------------|--------|---------|--------|
| `memories_tr` | **12,378** | Full text (live capture) | ✅ Active |
| `gems_tr` | **5** | Curated gems (injection) | ✅ Active |
| `true_recall` | existing | Legacy archive | 📦 Preserved |
| `kimi_memories` | 12,223 | Original backup | 📦 Preserved |
**Note:** All memories tagged with `curated: false` for timer curator.
### Services
| Service | Status | Uptime / Details |
|---------|--------|--------|
| `mem-qdrant-watcher` | ✅ Active | 30+ min |
| OpenClaw Gateway | ✅ Running | 2026.2.23 |
| memory-qdrant plugin | ✅ Loaded | recall: gems_tr, capture: memories_tr |
---
## Architecture
### v2.2: Timer-Based Curation (DEPLOYED)
**Data Flow:**
```
┌─────────────────┐     ┌──────────────────────┐     ┌─────────────┐
│  OpenClaw Chat  │────▶│  Real-Time Watcher   │────▶│ memories_tr │
│ (Session JSONL) │     │   (Python daemon)    │     │  (Qdrant)   │
└─────────────────┘     └──────────────────────┘     └──────┬──────┘
                                                            │ Every 30 min
                                                   ┌──────────────────┐
                                                   │  Timer Curator   │
                                                   │  (cron/qwen3)    │
                                                   └────────┬─────────┘
                                                   ┌──────────────────┐
                                                   │     gems_tr      │
                                                   │     (Qdrant)     │
                                                   └────────┬─────────┘
                                                   Per turn │
                                                   ┌──────────────────┐
                                                   │  memory-qdrant   │
                                                   │      plugin      │
                                                   └──────────────────┘
```
**Key Changes:**
- ✅ Replaced daily 2:45 AM batch with 30-minute timer
- ✅ All memories tagged `curated: false` on write
- ✅ Migration completed for 12,378 existing memories
- ✅ No Redis dependency (direct Qdrant only)
---
## Components
### Curation Mode: Timer-Based (DEPLOYED v2.2)
| Setting | Value | Adjustable |
|---------|-------|------------|
| **Trigger** | Cron timer | ✅ |
| **Interval** | 30 minutes | ✅ Config file |
| **Batch size** | 100 memories max | ✅ Config file |
| **Minimum** | None (0 is OK) | — |
**Config:** `/tr-continuous/curator_config.json`
```json
{
  "timer_minutes": 30,
  "max_batch_size": 100,
  "user_id": "rob",
  "source_collection": "memories_tr",
  "target_collection": "gems_tr"
}
```
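A minimal sketch of how a script might load and validate these settings. The `CuratorConfig` class and both helper functions are hypothetical names for illustration; how `curator_timer.py` actually reads its config is not shown in these notes.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CuratorConfig:
    # Defaults mirror the values in curator_config.json above.
    timer_minutes: int = 30
    max_batch_size: int = 100
    user_id: str = "rob"
    source_collection: str = "memories_tr"
    target_collection: str = "gems_tr"


def parse_config(raw: dict) -> CuratorConfig:
    """Build a config from a parsed JSON object.

    Unknown keys raise TypeError (from the dataclass constructor);
    non-positive numeric settings raise ValueError.
    """
    cfg = CuratorConfig(**raw)
    if cfg.timer_minutes <= 0:
        raise ValueError("timer_minutes must be positive")
    if cfg.max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    return cfg


def load_config(path: str) -> CuratorConfig:
    """Read the JSON file at `path` and validate it."""
    return parse_config(json.loads(Path(path).read_text(encoding="utf-8")))
```

Rejecting unknown keys early is a design choice here: a typo such as `max_batch_sizes` fails loudly instead of silently falling back to the default.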
**Cron:**
```
*/30 * * * * cd .../tr-continuous && python3 curator_timer.py
```
**Old modes deprecated:**
- ❌ Turn-based (every N turns)
- ❌ Hybrid (timer + turn)
- ❌ Daily batch (2:45 AM)
### 1. Real-Time Watcher (Primary Capture)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Function:**
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
- Parses every conversation turn in real-time
- Embeds with `snowflake-arctic-embed2` (Ollama @ 10.0.0.10)
- Stores directly to `memories_tr` (no Redis)
- **Cleans content:** Removes markdown, tables, metadata, thinking tags
**Service:** `mem-qdrant-watcher.service`
- **Status:** Active since 16:46:53 CST
- **Systemd:** Enabled, auto-restart
**Log:** `journalctl -u mem-qdrant-watcher -f`
---
### 2. Content Cleaner (Existing Data)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
**Function:**
- Batch-cleans existing `memories_tr` points
- Removes: `**bold**`, `|tables|`, `` `code` ``, `---` rules, `# headers`
- Flattens nested content dicts
- Rate-limited to prevent Qdrant overload
**Usage:**
```bash
# Dry run (preview)
python3 clean_memories_tr.py --dry-run
# Clean all
python3 clean_memories_tr.py --execute
# Clean limited (test)
python3 clean_memories_tr.py --execute --limit 100
```
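For orientation, the cleaning rules listed above could look roughly like the sketch below. It is not the code in `clean_memories_tr.py`. It assumes that markers are stripped while the wrapped text is kept, that table rows are dropped entirely, and that nested dicts carry their text under a `content` or `text` key (hypothetical key names).

```python
import re


def clean_content(content) -> str:
    """Strip markdown noise from one memory's content."""
    # Flatten nested dicts (assumed keys: "content" or "text").
    while isinstance(content, dict):
        content = content.get("content", content.get("text", ""))
    text = str(content)  # tolerate non-string content

    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"-{3,}", stripped):
            continue  # horizontal rule
        if stripped.startswith("|"):
            continue  # table row
        kept.append(re.sub(r"^\s*#{1,6}\s+", "", line))  # header markers

    text = "\n".join(kept)
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)  # **bold** -> bold
    text = re.sub(r"`([^`]*)`", r"\1", text)      # `code` -> code
    return text.strip()
```

Running it on a small sample shows the effect of each rule:

```python
sample = {"content": {"text": "# Title\n**bold** and `code`\n---\n| a | b |\nplain"}}
print(clean_content(sample))
```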
---
### 3. Timer Curator (v2.2 - DEPLOYED)
**Replaces:** Daily curator (2:45 AM batch) and turn-based curator
**Location:** `/root/.openclaw/workspace/.projects/true-recall-v2/tr-continuous/curator_timer.py`
**Schedule:** Every 30 minutes (cron)
**Flow:**
1. Query uncurated memories (`curated: false`)
2. Send batch to qwen3 (max 100)
3. Extract gems using curator prompt
4. Store gems to `gems_tr`
5. Mark processed memories as `curated: true`
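The five steps above can be sketched as follows. This is an illustration, not the contents of `curator_timer.py`: `run_once`, `post`, `extract_gems`, and `store_gems` are hypothetical names, and the LLM call and gem storage are passed in as callables so that only the Qdrant request shapes are spelled out. The request bodies follow Qdrant's REST API for scrolling with a payload filter and for setting payload fields.

```python
from typing import Callable, List


def scroll_uncurated_body(limit: int = 100) -> dict:
    """Body for POST /collections/<name>/points/scroll (step 1)."""
    return {
        "filter": {"must": [{"key": "curated", "match": {"value": False}}]},
        "limit": limit,
        "with_payload": True,
    }


def mark_curated_body(point_ids: List) -> dict:
    """Body for POST /collections/<name>/points/payload (step 5)."""
    return {"payload": {"curated": True}, "points": point_ids}


def run_once(
    post: Callable[[str, dict], dict],      # sends JSON to Qdrant, returns parsed reply
    extract_gems: Callable[[list], list],   # steps 2-3: the qwen3 call (not shown)
    store_gems: Callable[[list], None],     # step 4: write to gems_tr (not shown)
    source: str = "memories_tr",
    limit: int = 100,
) -> int:
    """Process one batch and return how many memories were handled."""
    reply = post(f"/collections/{source}/points/scroll", scroll_uncurated_body(limit))
    points = reply["result"]["points"]
    if not points:
        return 0  # nothing to curate is a normal outcome
    gems = extract_gems([p["payload"] for p in points])
    if gems:
        store_gems(gems)
    # Mark the whole batch, including memories that yielded no gems,
    # so they are not re-sent to the model on the next run.
    post(f"/collections/{source}/points/payload",
         mark_curated_body([p["id"] for p in points]))
    return len(points)
```

Marking memories only after gems are stored means a crash mid-run leaves the batch uncurated and it is retried, at the cost of possible duplicate gems.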
**Files:**
| File | Purpose |
|------|---------|
| `curator_timer.py` | Main curator script |
| `curator_config.json` | Adjustable settings |
| `migrate_add_curated.py` | One-time migration (completed) |
**Usage:**
```bash
# Dry run (preview)
python3 curator_timer.py --dry-run
# Manual run
python3 curator_timer.py --config curator_config.json
```
**Status:** ✅ Deployed. The first runs will work through the ~12,378 existing memories, at most 100 per run under the current `max_batch_size`.
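A rough estimate of how long the existing backlog takes to clear, assuming each timer run curates at most one batch of `max_batch_size` memories and no new memories arrive in the meantime (both are simplifying assumptions; `backlog_estimate` is a hypothetical helper):

```python
import math


def backlog_estimate(backlog: int, batch_size: int, interval_minutes: int):
    """Return (number of runs, hours of timer intervals) to clear the backlog."""
    runs = math.ceil(backlog / batch_size)
    return runs, runs * interval_minutes / 60


runs, hours = backlog_estimate(12_378, 100, 30)
print(runs, hours)  # 124 62.0
```

Under those assumptions the backlog needs 124 runs, or about 62 hours at the 30-minute interval; a larger batch size or shorter interval in `curator_config.json` shortens this proportionally.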
### 5. Silent Compacting (NEW - Concept)
**Idea:** Automatically remove old context from prompt when token limit approached.
**Behavior:**
- Trigger: Context window > 80% full
- Action: Remove oldest messages (silently)
- Preserve: Gems always kept, recent N turns kept
- Result: Seamless conversation without "compacting" notification
**Config:**
```json
{
  "compacting": {
    "enabled": true,
    "triggerAtPercent": 80,
    "keepRecentTurns": 20,
    "preserveGems": true,
    "silent": true
  }
}
```
**Status:** ⏳ Concept only - requires OpenClaw core changes
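Since this is only a concept, the following is a sketch of the intended behavior rather than anything in OpenClaw. It assumes one message per turn, a precomputed `tokens` count on each message, and an `is_gem` flag marking injected gems; all three are hypothetical.

```python
def compact(messages, limit, trigger_at_percent=80,
            keep_recent_turns=20, preserve_gems=True):
    """Drop the oldest removable messages until usage is back at the threshold.

    `messages` is ordered oldest first; each item is a dict with a
    "tokens" count and an optional "is_gem" flag.
    """
    threshold = limit * trigger_at_percent / 100
    total = sum(m["tokens"] for m in messages)
    if total <= threshold:
        return list(messages)  # below the trigger: nothing to do

    # Messages at or after this index are the protected recent turns.
    protected_from = max(0, len(messages) - keep_recent_turns)
    kept = []
    for i, msg in enumerate(messages):
        is_protected_gem = preserve_gems and msg.get("is_gem", False)
        removable = i < protected_from and not is_protected_gem
        if removable and total > threshold:
            total -= msg["tokens"]  # drop silently, oldest first
            continue
        kept.append(msg)
    return kept
```

Note that if gems and recent turns alone exceed the threshold, this sketch leaves the context over the limit; a real implementation would need a fallback for that case.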
### 6. memory-qdrant Plugin
**Location:** `/root/.openclaw/extensions/memory-qdrant/`
**Config:**
```json
{
  "collectionName": "gems_tr",
  "captureCollection": "memories_tr",
  "autoRecall": true,
  "autoCapture": true
}
```
**Function:**
- **Recall:** Searches `gems_tr`, injects as context (hidden)
- **Capture:** Session-level capture to `memories_tr` (backup)
**Status:** Loaded, dual collection support working
---
## Files & Locations
### Core Project Files
```
/root/.openclaw/workspace/.projects/true-recall-v2/
├── README.md                     # Architecture docs
├── session.md                    # This file
├── curator-prompt.md             # Gem extraction prompt
├── tr-daily/                     # Daily batch curation
│   └── curate_from_qdrant.py     # Daily curator (2:45 AM)
├── tr-continuous/                # Real-time curation (NEW)
│   ├── curator_by_count.py       # Turn-based curator
│   ├── curator_turn_based.py     # Alternative approach
│   ├── curator_cron.sh           # Cron wrapper
│   ├── turn-curator.service      # Systemd service
│   └── README.md                 # Documentation
└── shared/
    └── (shared resources)
```
### New Files (2026-02-24 19:00)
| File | Purpose |
|------|---------|
| `tr-continuous/curator_timer.py` | Timer-based curator (deployed) |
| `tr-continuous/curator_config.json` | Curator settings |
| `tr-continuous/migrate_add_curated.py` | Migration script (completed) |
### Legacy Files (Pre-v2.2)
| File | Status | Note |
|------|--------|------|
| `tr-daily/curate_from_qdrant.py` | 📦 Archived | Replaced by timer |
| `tr-continuous/curator_by_count.py` | 📦 Archived | Replaced by timer |
| `tr-continuous/curator_turn_based.py` | 📦 Archived | Replaced by timer |
### System Locations
| File | Purpose |
|------|---------|
| `/root/.openclaw/extensions/memory-qdrant/` | Plugin code |
| `/root/.openclaw/openclaw.json` | Plugin configuration |
| `/etc/systemd/system/mem-qdrant-watcher.service` | Systemd service |
---
## Changes Made Today (2026-02-24 19:00)
### 1. Timer Curator Deployed (v2.2)
- Created `curator_timer.py` — simplified timer-based curation
- Created `curator_config.json` — adjustable settings
- Removed daily 2:45 AM cron job
- Added `*/30 * * * *` cron timer
- **Status:** ✅ Deployed, logs to `/var/log/true-recall-timer.log`
### 2. Migration Completed
- Created `migrate_add_curated.py`
- Tagged 12,378 existing memories with `curated: false`
- Updated watcher to add `curated: false` to new memories
- **Status:** ✅ Complete
### 3. Simplified Architecture
- ❌ Removed turn-based curator complexity
- ❌ Removed daily batch processing
- ✅ Single timer trigger every 30 minutes
- ✅ No minimum threshold (processes 0-N memories)
---
## Configuration
### memory-qdrant Plugin
**File:** `/root/.openclaw/openclaw.json`
```json
{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://10.0.0.10:11434",
      "qdrantUrl": "http://10.0.0.40:6333"
    },
    "enabled": true
  }
}
```
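To make the recall settings concrete, here is a Python sketch of how they might map onto one embedding request and one search request. The plugin's own implementation is not shown in these notes, so treat this as an approximation: it assumes Ollama's `/api/embeddings` endpoint and Qdrant's `/points/search` endpoint, and `recall`, `search_body`, and `post_json` are hypothetical helper names.

```python
import json
import urllib.request


def search_body(vector, max_results=2, min_score=0.7) -> dict:
    """Qdrant search body: top `max_results` hits scoring at least `min_score`."""
    return {
        "vector": vector,
        "limit": max_results,
        "score_threshold": min_score,
        "with_payload": True,
    }


def post_json(url: str, body: dict, timeout: float = 10.0) -> dict:
    """POST a JSON body and return the parsed JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def recall(text: str, cfg: dict, post=post_json) -> list:
    """Embed `text`, search the gems collection, and return matching payloads."""
    embedding = post(
        f"{cfg['ollamaUrl']}/api/embeddings",
        {"model": cfg["embeddingModel"], "prompt": text},
    )["embedding"]
    hits = post(
        f"{cfg['qdrantUrl']}/collections/{cfg['collectionName']}/points/search",
        search_body(embedding, cfg["maxRecallResults"], cfg["minRecallScore"]),
    )["result"]
    return [hit["payload"] for hit in hits]
```

With `maxRecallResults: 2` and `minRecallScore: 0.7`, at most two gems are injected per turn, and none when nothing scores above the threshold.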
### Gateway (OpenClaw Update Fix)
```json
{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}
```
---
## Validation Commands
### Check Collections
```bash
# Points count
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://10.0.0.40:6333/collections/gems_tr | jq '.result.points_count'
# Recent points
curl -s -X POST http://10.0.0.40:6333/collections/memories_tr/points/scroll \
-H "Content-Type: application/json" \
-d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content'
```
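The point counts above do not show how much of `memories_tr` is still waiting for the timer curator. One way to check, sketched here with Qdrant's count endpoint and a filter on the `curated` flag (`count_uncurated` and its helpers are hypothetical names, not part of the project):

```python
import json
import urllib.request


def uncurated_count_body() -> dict:
    """Body for POST /collections/<name>/points/count with an exact count."""
    return {
        "filter": {"must": [{"key": "curated", "match": {"value": False}}]},
        "exact": True,
    }


def parse_count(reply: dict) -> int:
    """Pull the count out of Qdrant's reply envelope."""
    return int(reply["result"]["count"])


def count_uncurated(qdrant_url: str = "http://10.0.0.40:6333",
                    collection: str = "memories_tr") -> int:
    """Ask Qdrant how many points still have curated == false."""
    req = urllib.request.Request(
        f"{qdrant_url}/collections/{collection}/points/count",
        data=json.dumps(uncurated_count_body()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_count(json.load(resp))
```

If the timer curator is working, this number should fall by up to the batch size after each run.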
### Check Services
```bash
# Watcher status
sudo systemctl status mem-qdrant-watcher
# Watcher logs
sudo journalctl -u mem-qdrant-watcher -n 20
# OpenClaw status
openclaw status
```
---
## Troubleshooting
### Issue: Watcher Not Capturing
**Check:**
1. Service running? `systemctl status mem-qdrant-watcher`
2. Logs: `journalctl -u mem-qdrant-watcher -f`
3. Qdrant accessible? `curl http://10.0.0.40:6333/`
4. Ollama accessible? `curl http://10.0.0.10:11434/api/tags`
### Issue: Cleaner Fails
**Common causes:**
- Qdrant connection timeout (add `time.sleep(0.1)` between batches)
- Nested content dicts (handled in updated script)
- Type errors (non-string content — handled)
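The pause between batches mentioned above can be sketched like this. The helper names and the default batch size of 100 are assumptions for illustration; the `sleep` parameter exists only so the pause can be replaced in tests.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def process_rate_limited(
    items: Iterable[T],
    handle_batch: Callable[[List[T]], None],
    size: int = 100,
    pause: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Handle items in batches, pausing between batches; return the batch count."""
    count = 0
    for batch in batched(items, size):
        if count:
            sleep(pause)  # give Qdrant room between writes
        handle_batch(batch)
        count += 1
    return count
```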
### Issue: Plugin Not Loading
**Check:**
1. `openclaw.json` syntax valid? `openclaw config validate`
2. Plugin compiled? `cd /root/.openclaw/extensions/memory-qdrant && npx tsc`
3. Gateway logs: `tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log`
---
## Cron Schedule (Updated v2.2)
| Time | Job | Script | Status |
|------|-----|--------|--------|
| Every 30 min | Timer curator | `tr-continuous/curator_timer.py` | ✅ Active |
| Per turn | Capture | `mem-qdrant-watcher` | ✅ Daemon |
| Per turn | Injection | `memory-qdrant` plugin | ✅ Active |
**Removed:**
- ❌ 2:45 AM daily curator
- ❌ Every-minute turn curator check
---
## Next Steps
### Immediate
- ⏳ Monitor first timer run (logs: `/var/log/true-recall-timer.log`)
- ⏳ Validate gem extraction quality from timer curator
- ⏳ Archive old curator scripts if timer works
### Completed ✅
- **Compactor config:** minimal overhead with `mode: default`, `reserveTokensFloor: 0`, `memoryFlush.enabled: false`
### Future
- ⏳ Curator tuning based on timer results
- ⏳ Silent compacting (requires OpenClaw core changes)
### Planned Features (Backlog)
- **Interactive install script:** prompts for embedding model, timer interval, batch size, endpoints
- **Single embedding model option:** use one model for both collections
- **Configurable thresholds:** per-user customization via prompts
**Compactor Settings (Applied):**
```json5
{
  agents: {
    defaults: {
      compaction: {
        mode: "default",
        reserveTokensFloor: 0,
        memoryFlush: { enabled: false }
      }
    }
  }
}
```
**Note:** Only `mode`, `reserveTokensFloor`, and `memoryFlush` are valid under `agents.defaults.compaction`. Other settings are Pi runtime parameters.
**Install script prompts:**
1. Embedding model (snowflake vs mxbai)
2. Timer interval (5 min / 30 min / hourly)
3. Batch size (50 / 100 / 500)
4. Qdrant/Ollama URLs
5. User ID
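The install script is still only planned, so the following is one possible shape for its prompts, not an existing tool. The function names, the short model labels, and the defaults (taken from the current configuration in these notes) are assumptions; `read` is injectable so the prompts can be exercised without a terminal.

```python
def ask_choice(prompt, choices, default, read=input):
    """Ask for one of `choices`; an empty answer selects `default`."""
    answer = read(f"{prompt} [{'/'.join(choices)}] (default {default}): ").strip()
    if not answer:
        return default
    if answer not in choices:
        raise ValueError(f"{answer!r} is not one of {choices}")
    return answer


def ask_text(prompt, default, read=input):
    """Ask for free-form text; an empty answer selects `default`."""
    return read(f"{prompt} (default {default}): ").strip() or default


def collect_settings(read=input) -> dict:
    """Walk through the five prompts in order and return the chosen settings."""
    return {
        "embedding_model": ask_choice("Embedding model", ["snowflake", "mxbai"], "snowflake", read),
        "timer_minutes": int(ask_choice("Timer interval in minutes", ["5", "30", "60"], "30", read)),
        "max_batch_size": int(ask_choice("Batch size", ["50", "100", "500"], "100", read)),
        "qdrant_url": ask_text("Qdrant URL", "http://10.0.0.40:6333", read),
        "ollama_url": ask_text("Ollama URL", "http://10.0.0.10:11434", read),
        "user_id": ask_text("User ID", "rob", read),
    }


if __name__ == "__main__":
    print(collect_settings())
```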
---
## Session Recovery
If starting fresh:
1. Read `README.md` for architecture overview
2. Check service status: `sudo systemctl status mem-qdrant-watcher`
3. Check timer curator: `tail /var/log/true-recall-timer.log`
4. Verify collections: `curl http://10.0.0.40:6333/collections`
---
*Last Verified: 2026-02-24 19:29 CST*
*Version: v2.2 (30b curator, install script planned)*