# TrueRecall v2

**Project:** Gem extraction and memory recall system

**Status:** ✅ Active & Verified

**Location:** `~/.openclaw/workspace/.projects/true-recall-v2/`

**Last Updated:** 2026-02-24 19:02 CST

---

## Table of Contents

- [Quick Start](#quick-start)
- [Overview](#overview)
- [Current State](#current-state)
- [Comparison](#comparison-truerecall-v2-vs-jarvis-memory-vs-v1)
- [Architecture](#architecture)
- [Components](#components)
- [Files & Locations](#files--locations)
- [Configuration](#configuration)
- [Validation](#validation)
- [Troubleshooting](#troubleshooting)
- [Status Summary](#status-summary)
- [Roadmap](#roadmap)

---

## Quick Start

```bash
# Check system status
openclaw status
sudo systemctl status mem-qdrant-watcher

# Count captured points
curl -s http://:6333/collections/memories_tr | jq '.result.points_count'

# Check collections
curl -s http://:6333/collections | jq '.result.collections[].name'
```

---

## Overview

TrueRecall v2 extracts "gems" (key insights) from conversations and injects them as context. It consists of three layers:

1. **Capture**: Real-time watcher saves every turn to `memories_tr`
2. **Curation**: Timer curator extracts gems to `gems_tr` (every 30 min)
3. **Injection**: Plugin searches `gems_tr` and injects gems per turn

---

## Current State

### Verified at 19:02 CST

| Collection | Points | Purpose | Status |
|------------|--------|---------|--------|
| `memories_tr` | **12,378** | Full text (live capture) | ✅ Active |
| `gems_tr` | **5** | Curated gems (injection) | ✅ Active |

**All memories are tagged with `curated: false` for timer curation.**

### Services Status

| Service | Status | Details |
|---------|--------|---------|
| `mem-qdrant-watcher` | ✅ Active | PID 1748, capturing |
| Timer curator | ✅ Deployed | Every 30 min via cron |
| OpenClaw Gateway | ✅ Running | Version 2026.2.23 |
| memory-qdrant plugin | ✅ Loaded | recall: gems_tr |

---

## Comparison: TrueRecall v2 vs Jarvis Memory vs v1

| Feature | Jarvis Memory | TrueRecall v1 | TrueRecall v2 |
|---------|---------------|---------------|---------------|
| **Storage** | Redis | Redis + Qdrant | Qdrant only |
| **Capture** | Session batch | Session batch | Real-time |
| **Curation** | Manual | Daily 2:45 AM | Timer (configurable; 30 min as deployed) |
| **Embedding** | n/a | snowflake | snowflake + mxbai |
| **Curator LLM** | n/a | qwen3:4b | qwen3:30b |
| **State tracking** | n/a | n/a | `curated` tag |
| **Batch size** | n/a | 24h worth | Configurable |
| **JSON parsing** | n/a | Fallback needed | Native (30b) |

**Key Improvements v2:**

- ✅ Real-time capture (no batch delay)
- ✅ Timer-based curation (responsive vs daily)
- ✅ 30b curator (better gems, faster ~3s)
- ✅ `curated` tag (reliable state tracking)
- ✅ No Redis dependency (simpler stack)

---

## Architecture

### v2.2: Timer-Based Curation

```
┌─────────────────┐     ┌──────────────────────┐     ┌─────────────┐
│  OpenClaw Chat  │────▶│  Real-Time Watcher   │────▶│ memories_tr │
│  (Session JSONL)│     │  (Python daemon)     │     │  (Qdrant)   │
└─────────────────┘     └──────────────────────┘     └──────┬──────┘
                                                            │
                                                            │ Every 30 min
                                                            ▼
                                                   ┌──────────────────┐
                                                   │  Timer Curator   │
                                                   │  (cron/qwen3)    │
                                                   └────────┬─────────┘
                                                            │
                                                            ▼
                                                   ┌──────────────────┐
                                                   │     gems_tr      │
                                                   │    (Qdrant)      │
                                                   └────────┬─────────┘
                                                            │ Per turn
                                                            │
                                                            ▼
                                                   ┌──────────────────┐
                                                   │  memory-qdrant   │
                                                   │     plugin       │
                                                   └──────────────────┘
```

**Key Changes in v2.2:**

- ✅ Timer-based curation (30 min intervals)
- ✅ All memories tagged `curated: false` on capture
- ✅ Migration complete (12,378 memories)
- ❌ Removed daily batch processing (2:45 AM)

---

## Components

### 1. Real-Time Watcher

**File:** `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`

**What it does:**

- Watches `~/.openclaw/agents/main/sessions/*.jsonl`
- Parses each turn (user + AI)
- Embeds with `snowflake-arctic-embed2`
- Stores to `memories_tr` instantly
- **Cleans:** Removes markdown, tables, metadata

**Service:** `mem-qdrant-watcher.service`

**Commands:**

```bash
# Check status
sudo systemctl status mem-qdrant-watcher

# View logs
sudo journalctl -u mem-qdrant-watcher -f

# Restart
sudo systemctl restart mem-qdrant-watcher
```

---

### 2. Content Cleaner

**File:** `skills/qdrant-memory/scripts/clean_memories_tr.py`

**Purpose:** Batch-clean existing points

**Usage:**

```bash
# Preview changes
python3 clean_memories_tr.py --dry-run

# Clean all
python3 clean_memories_tr.py --execute

# Clean 100 (test)
python3 clean_memories_tr.py --execute --limit 100
```

**Cleans:**

- `**bold**` → plain text
- `|tables|` → removed
- `` `code` `` → plain text
- `---` rules → removed
- `# headers` → removed

---

### 3. Timer Curator

**File:** `tr-continuous/curator_timer.py`

**Schedule:** Every 30 minutes (cron)

**Flow:**

1. Query uncurated memories from `memories_tr`
2. Send batch to qwen3 (max 100)
3. Extract gems → store to `gems_tr`
4. Mark memories as `curated: true`

**Config:** `tr-continuous/curator_config.json`

```json
{
  "timer_minutes": 30,
  "max_batch_size": 100
}
```

**Logs:** `/var/log/true-recall-timer.log`

---

### 4. Curation Model Comparison

**Current:** `qwen3:4b-instruct`

| Metric | 4b | 30b |
|--------|----|----|
| Speed | ~10-30s per batch | **~3.3s** (tested 2026-02-24) |
| JSON reliability | ⚠️ Needs fallback | ✅ Native |
| Context quality | Basic extraction | ✅ Nuanced |
| Snippet accuracy | ~80% | ✅ Expected: 95%+ |

**30b Benchmark (2026-02-24):**

- Load: 108ms
- Prompt eval: 49ms (1,576 tok/s)
- Generation: 2.9s (233 tokens, 80 tok/s)
- **Total: 3.26s**

**Trade-offs:**

- **4b:** Lightweight, catches explicit decisions, but needs a JSON fallback and took ~10-30s per batch
- **30b:** Deeper context and better inference at ~3.3s per batch in the benchmark above, with a larger memory footprint

**Gem Quality Comparison (Sample Review):**

| Aspect | 4b | 30b |
|--------|----|----|
| **Context depth** | "Extracted via fallback" | Explains *why* decisions were made |
| **Confidence scores** | 0.7-0.85 | 0.9-0.97 |
| **Snippet accuracy** | ~80% (wrong source) | ✅ 95%+ (relevant quotes) |
| **Categories** | Generic "extracted" | Specific: knowledge, technical, decision |
| **Example** | "User implemented BorgBackup" (no context) | "User selected mxbai... due to top MTEB score of 66.5" (explains reasoning) |

**Verdict:** 30b produces significantly higher quality gems: richer context, accurate snippets, and architectural intent rather than just surface facts.

---

### 5. Semantic Deduplication (Similarity Checking)

**Why:** Smaller models (4b) often extract duplicate or near-duplicate gems. Without a check, the `gems_tr` collection fills with redundant entries.

**The Problem:**

- "User decided on Redis" and "User selected Redis for caching" are the same gem
- Smaller models lack nuance and extract surface variations as separate gems
- Over time, 30-50% of gems may be duplicates

**Solution: Semantic Similarity Check**

Before inserting a new gem:

1. Embed the candidate gem text
2. Search `gems_tr` for similar embeddings (past 24h)
3. If similarity > 0.85, SKIP (don't insert)
4. If similarity 0.70-0.85, MERGE (update existing with richer context)
5. If similarity < 0.70, INSERT (new unique gem)

**Implementation Options:**

#### Option A: Built-in Curator Check (Recommended)

Modify `curator_timer.py` to add a pre-insertion similarity check:

```python
import logging
from datetime import datetime, timedelta, timezone

import requests
from qdrant_client import QdrantClient, models

log = logging.getLogger(__name__)
qdrant = QdrantClient(url="http://:6333")


def is_duplicate(gem_text: str, user_id: str = "rob", threshold: float = 0.85) -> bool:
    """Return True if a similar gem was stored in the past 24h."""
    # Embed the candidate
    response = requests.post(
        "http://:11434/api/embeddings",
        json={"model": "mxbai-embed-large", "prompt": gem_text},
        timeout=30,
    )
    response.raise_for_status()
    embedding = response.json()["embedding"]

    # Qdrant has no relative "now-24h" syntax, so compute the cutoff here.
    # Assumes `timestamp` is stored as an RFC 3339 datetime; use models.Range
    # with epoch seconds instead if it is stored as a number.
    cutoff = datetime.now(timezone.utc) - timedelta(hours=24)

    # Search for similar gems (newer qdrant-client releases prefer query_points)
    results = qdrant.search(
        collection_name="gems_tr",
        query_vector=embedding,
        limit=3,
        query_filter=models.Filter(
            must=[
                models.FieldCondition(key="user_id", match=models.MatchValue(value=user_id)),
                models.FieldCondition(key="timestamp", range=models.DatetimeRange(gte=cutoff)),
            ]
        ),
    )

    # Any hit above the threshold counts as a duplicate
    return any(result.score > threshold for result in results)


# In the curator's main loop, before inserting:
for gem in gems:  # `gems`: placeholder for the batch extracted by the curator LLM
    if is_duplicate(gem["gem"]):
        log.info("Skipping duplicate gem: %s...", gem["gem"][:50])
        continue
    # ... insert gem into gems_tr ...
```

**Pros:** Catches duplicates at source, no extra jobs

**Cons:** Adds ~50-100ms per gem (embedding call)

#### Option B: Periodic AI Review (Subagent Task)

Have a subagent periodically review and merge duplicates:

```bash
# Run weekly via cron
0 3 * * 0 cd && python3 dedup_gems.py
```

**dedup_gems.py approach:**

1. Load all gems from past 7 days
2. Group by semantic similarity (clustering)
3. For each cluster with more than one gem:
   - Keep highest confidence gem as primary
   - Merge context from others into primary
   - Delete duplicates

**Pros:** Can use reasoning model for nuanced merging

**Cons:** Batch job, duplicates exist until cleanup runs

#### Option C: Real-time Watcher Hook

Add deduplication to the real-time watcher before memories are even stored:

```python
# In watcher, before upsert to memories_tr.
# find_similar_recent is a hypothetical helper that returns the ID of a
# similar memory from the given window, or None if there is none.
similar_id = find_similar_recent(memory_text, window="1h")
if similar_id is not None:
    memory["duplicate_of"] = similar_id  # Tag but still store
```

**Pros:** Flags duplicate memories upstream

**Cons:** Memories may differ slightly even if gems would be the same

**Recommendation by Model:**

| Model | Recommended Approach | Reason |
|-------|---------------------|--------|
| **4b** | **Option A + B** | Built-in check prevents duplicates; periodic review catches edge cases |
| **30b** | **Option B only** | 30b produces fewer duplicates; weekly review sufficient |
| **Production** | **Option A** | Best balance of prevention and performance |

**Configuration:**

Add to `curator_config.json`:

```json
{
  "deduplication": {
    "enabled": true,
    "similarity_threshold": 0.85,
    "lookback_hours": 24,
    "mode": "skip"
  }
}
```

`mode` accepts `"skip"`, `"merge"`, or `"flag"`.

---

### 6. OpenClaw Compactor Configuration

**Status:** ✅ Applied

**Goal:** Minimal overhead: just remove context, do nothing else.
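A minimal sketch for checking a config against this goal, assuming the OpenClaw config has already been parsed into a Python dict. The three key names are the ones under `agents.defaults.compaction` documented in this section; the helper name `compaction_overhead_issues` and the `EXPECTED_COMPACTION` table are hypothetical and not part of OpenClaw.

```python
from typing import Any

# Expected values for the minimal-overhead goal. Key names follow the
# `agents.defaults.compaction` block documented in this section.
EXPECTED_COMPACTION = {
    ("mode",): "default",
    ("reserveTokensFloor",): 0,
    ("memoryFlush", "enabled"): False,
}


def compaction_overhead_issues(config: dict) -> list:
    """Hypothetical helper: list settings that deviate from the goal."""
    node: Any = config
    for key in ("agents", "defaults", "compaction"):
        node = node.get(key, {}) if isinstance(node, dict) else {}
    compaction = node

    issues = []
    for path, expected in EXPECTED_COMPACTION.items():
        value: Any = compaction
        for key in path:
            value = value.get(key) if isinstance(value, dict) else None
        # Compare the type as well, because 0 == False in Python.
        if type(value) is not type(expected) or value != expected:
            issues.append(f"{'.'.join(path)}: expected {expected!r}, found {value!r}")
    return issues


# Usage: a config that matches the goal yields no issues.
applied = {
    "agents": {"defaults": {"compaction": {
        "mode": "default",
        "reserveTokensFloor": 0,
        "memoryFlush": {"enabled": False},
    }}}
}
print(compaction_overhead_issues(applied))  # []
```

The check is read-only, so it can run before a gateway restart without touching the config file.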
**Config Applied:**

```json5
{
  agents: {
    defaults: {
      compaction: {
        mode: "default",           // "default" or "safeguard"
        reserveTokensFloor: 0,     // Disable safety floor (default: 20000)
        memoryFlush: {
          enabled: false           // Disable silent .md file writes
        }
      }
    }
  }
}
```

**What this does:**

- `mode: "default"`: Standard summarization (faster)
- `reserveTokensFloor: 0`: Allow aggressive settings (disables 20k minimum)
- `memoryFlush.enabled: false`: No silent "write memory" turns

**Known Issue: UI Glitch During Compaction**

When compaction runs, the Control UI may briefly behave unexpectedly:

- Typed text may not appear immediately after hitting Enter
- Messages may render out of order briefly
- UI "catches up" within 1-2 seconds after compaction completes

**Why:** Compaction replaces the full conversation history with a summary. The UI's WebSocket state can get briefly out of sync during this transition.

**Workaround:**

- Wait 2-3 seconds after hitting Enter during compaction
- Or hard refresh (Ctrl+Shift+R) if the UI seems stuck
- **Note:** This is an OpenClaw Control UI limitation and cannot be fixed from the TrueRecall side at this time.

**Note:** `reserveTokens` and `keepRecentTokens` are Pi runtime settings, not configurable via `agents.defaults.compaction`. They are set per-model in `contextWindow`/`contextTokens`.

---

### 7. Configuration Options Reference

**All configurable options with defaults:**

| Option | Default | Description |
|--------|---------|-------------|
| **Embedding model** | `mxbai-embed-large` | Model for generating gem embeddings. `mxbai` = higher accuracy (MTEB 66.5). `snowflake` = faster processing. |
| **Timer interval** | `5` minutes | How often the curator runs. `5 min` = fast backlog clearing. `30 min` = balanced. `60 min` = minimal overhead. The deployment described in this README runs every 30 min. |
| **Batch size** | `100` | Max memories sent to curator per run. Higher = fewer API calls but more memory usage. |
| **Max gems per run** | *(unlimited)* | Hard limit on gems extracted per batch. Not set by default, so all found gems are extracted. |
| **Qdrant URL** | `http://:6333` | Vector database endpoint. Change if Qdrant runs on a different host/port. |
| **Ollama URL** | `http://:11434` | LLM endpoint for gem extraction. Change if Ollama runs elsewhere. |
| **Curator LLM** | `qwen3:30b-a3b-instruct` | Model for extracting gems. `30b` = best quality (~3s). `4b` = lighter but needs JSON fallback. |
| **User ID** | `rob` | Owner identifier for memories. Used for filtering and multi-user setups. |
| **Source collection** | `memories_tr` | Qdrant collection for raw captured memories. |
| **Target collection** | `gems_tr` | Qdrant collection for curated gems (injected into context). |
| **Watcher service** | `enabled` | Real-time capture daemon. Reads session JSONL and writes to Qdrant. |
| **Cron timer** | `enabled` | Periodic curation job. Runs `curator_timer.py` on schedule. |
| **Log path** | `/var/log/true-recall-timer.log` | Where curator output is written. Check with `tail -f`. |
| **Dry-run mode** | `disabled` | Test mode: shows what would be curated without writing to Qdrant. |

**OpenClaw-side options:**

| Option | Default | Description |
|--------|---------|-------------|
| **Compactor mode** | `default` | How context is summarized. `default` = fast standard. `safeguard` = chunked for very long sessions. |
| **Memory flush** | `disabled` | If enabled, writes a silent "memory" turn before compaction. Adds overhead, so it is disabled for minimal lag. |
| **Context pruning** | `cache-ttl` | Removes old tool results from context. `cache-ttl` = prunes hourly. `off` = no pruning. |

---

### 8. Embedding Models

**Current Setup:**

- `memories_tr`: `snowflake-arctic-embed2` (capture similarity)
- `gems_tr`: `mxbai-embed-large` (recall similarity)

**Rationale:**

- mxbai has a higher MTEB score (66.5) for semantic search
- snowflake is faster for high-volume capture

**Note:** For simplicity, a single embedding model could be used for both collections. This would reduce complexity and memory overhead, at the cost of slightly lower recall performance.

---

### 9. memory-qdrant Plugin

**Location:** `~/.openclaw/extensions/memory-qdrant/`

**Config (openclaw.json):**

```json
{
  "collectionName": "gems_tr",
  "captureCollection": "memories_tr",
  "autoRecall": true,
  "autoCapture": true
}
```

**Functions:**

- **Recall:** Searches `gems_tr`, injects gems (hidden)
- **Capture:** Session-level to `memories_tr` (backup)

---

## Files & Locations

### Core Project

```
~/.openclaw/workspace/.projects/true-recall-v2/
├── README.md                    # This file
├── session.md                   # Detailed notes
├── curator-prompt.md            # Extraction prompt
├── tr-daily/
│   └── curate_from_qdrant.py    # Daily curator (archived)
└── shared/
```

### New Files (2026-02-24)

| File | Purpose |
|------|---------|
| `tr-continuous/curator_timer.py` | Timer curator (v2.2) |
| `tr-continuous/curator_config.json` | Curator settings |
| `tr-continuous/migrate_add_curated.py` | Migration script |
| `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` | Capture daemon |
| `skills/qdrant-memory/mem-qdrant-watcher.service` | Systemd service |

### Archived Files (v2.1)

| File | Status | Note |
|------|--------|------|
| `tr-daily/curate_from_qdrant.py` | 📦 Archived | Replaced by timer |
| `tr-continuous/curator_by_count.py` | 📦 Archived | Replaced by timer |

### System Files

| File | Purpose |
|------|---------|
| `~/.openclaw/extensions/memory-qdrant/` | Plugin code |
| `~/.openclaw/openclaw.json` | Configuration |
| `/etc/systemd/system/mem-qdrant-watcher.service` | Service file |

---

## Configuration

### memory-qdrant Plugin

**File:** `~/.openclaw/openclaw.json`

```json
{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://:11434",
      "qdrantUrl": "http://:6333"
    },
    "enabled": true
  }
}
```

### Gateway Control UI (OpenClaw 2026.2.23)

```json
{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}
```

---

## Validation

### Check Collections

```bash
# Count points
curl -s http://:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://:6333/collections/gems_tr | jq '.result.points_count'

# View recent captures
curl -s -X POST http://:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 3, "with_payload": true}' | jq '.result.points[].payload.content'
```

### Check Services

```bash
# Watcher
sudo systemctl status mem-qdrant-watcher
sudo journalctl -u mem-qdrant-watcher -n 20

# OpenClaw
openclaw status
openclaw gateway status
```

### Test Capture

Send a message, then check:

```bash
# Should increase by 1-2 points
curl -s http://:6333/collections/memories_tr | jq '.result.points_count'
```

---

## Troubleshooting

### Watcher Not Capturing

```bash
# Check logs
sudo journalctl -u mem-qdrant-watcher -f

# Verify dependencies
curl http://:6333/            # Qdrant
curl http://:11434/api/tags   # Ollama
```

### Plugin Not Loading

```bash
# Validate config
openclaw config validate

# Check logs
tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep memory-qdrant

# Restart gateway
openclaw gateway restart
```

### Gateway Won't Start (OpenClaw 2026.2.23+)

**Error:** `non-loopback Control UI requires gateway.controlUi.allowedOrigins`

**Fix:** Add to `openclaw.json`:

```json
"gateway": {
  "controlUi": {
    "allowedOrigins": ["*"]
  }
}
```

---

## Status Summary

| Component | Status | Notes |
|-----------|--------|-------|
| Real-time watcher | ✅ Active | PID 1748, capturing |
| memories_tr | ✅ 12,378 pts | All tagged `curated: false` |
| gems_tr | ✅ 5 pts | Injection ready |
| Timer curator | ✅ Deployed | Every 30 min via cron |
| Plugin injection | ✅ Working | Uses gems_tr |
| Migration | ✅ Complete | 12,378 memories |

**Logs:** `tail /var/log/true-recall-timer.log`

**Next:** Monitor first timer run

---

## Roadmap

### Planned Features

| Feature | Status | Description |
|---------|--------|-------------|
| Interactive install script | ⏳ Planned | Prompts for embedding model, timer interval, batch size, endpoints |
| Single embedding model | ⏳ Planned | Option to use one model for both collections |
| Configurable thresholds | ⏳ Planned | Per-user customization via prompts |

**Install script will prompt for:**

1. **Embedding model**: snowflake (fast) vs mxbai (accurate)
2. **Timer interval**: 5 min / 30 min / hourly
3. **Batch size**: 50 / 100 / 500 memories
4. **Endpoints**: Qdrant/Ollama URLs
5. **User ID**: for multi-user setups

---

**Maintained by:** Rob

**AI Assistant:** Kimi 🎙️

**Version:** 2026.02.24-v2.2
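
**Supplementary sketch: three-way deduplication decision**

The semantic deduplication rule in Components section 5 names three outcomes (SKIP above 0.85, MERGE from 0.70 to 0.85, INSERT below 0.70). A minimal sketch of that decision as a pure function is shown here, assuming the caller has already collected the similarity scores of the nearest existing gems (for example the `score` values of Qdrant search hits). The function name `dedup_action` and its parameter names are hypothetical; only the thresholds come from this README.

```python
from typing import Sequence


def dedup_action(
    scores: Sequence[float],
    skip_above: float = 0.85,
    merge_from: float = 0.70,
) -> str:
    """Map the best similarity score among existing gems to an action.

    Returns "skip", "merge", or "insert". An empty score list means no
    comparable gem exists yet, so the candidate is inserted.
    """
    if not scores:
        return "insert"
    best = max(scores)
    if best > skip_above:
        return "skip"      # near-identical gem already stored
    if best >= merge_from:
        return "merge"     # related gem exists; enrich it instead
    return "insert"        # sufficiently different; store as new


# Usage with scores from three nearest neighbours:
print(dedup_action([0.91, 0.62, 0.40]))
print(dedup_action([0.78, 0.55]))
print(dedup_action([0.31]))
```

Keeping the decision separate from the embedding and search calls makes the thresholds easy to tune and to test without a running Qdrant or Ollama instance.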