root ae36275153 feat: Add v1 detection to installer
- Detect v1 installation (folder, cron, systemd service)
- Warn user that upgrade is not supported
- Explain removal steps clearly
- Allow continue-at-own-risk option
- Exit if user declines
2026-02-24 21:20:37 -06:00

TrueRecall v2

Project: Gem extraction and memory recall system
Status: Active & Verified
Location: ~/.openclaw/workspace/.projects/true-recall-v2/
Last Updated: 2026-02-24 19:02 CST



Quick Start

# Check system status
openclaw status
sudo systemctl status mem-qdrant-watcher

# View recent captures
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'

# Check collections
curl -s http://<QDRANT_IP>:6333/collections | jq '.result.collections[].name'

Overview

TrueRecall v2 extracts "gems" (key insights) from conversations and injects them as context. It consists of three layers:

  1. Capture — Real-time watcher saves every turn to memories_tr
  2. Curation — Daily curator extracts gems to gems_tr
  3. Injection — Plugin searches gems_tr and injects gems per turn

Current State

Verified at 19:02 CST

| Collection | Points | Purpose | Status |
|---|---|---|---|
| memories_tr | 12,378 | Full text (live capture) | Active |
| gems_tr | 5 | Curated gems (injection) | Active |

All memories are tagged curated: false so the timer curator can pick them up.

Services Status

| Service | Status | Details |
|---|---|---|
| mem-qdrant-watcher | Active | PID 1748, capturing |
| Timer curator | Deployed | Every 30 min via cron |
| OpenClaw Gateway | Running | Version 2026.2.23 |
| memory-qdrant plugin | Loaded | recall: gems_tr |

Comparison: TrueRecall v2 vs Jarvis Memory vs v1

| Feature | Jarvis Memory | TrueRecall v1 | TrueRecall v2 |
|---|---|---|---|
| Storage | Redis | Redis + Qdrant | Qdrant only |
| Capture | Session batch | Session batch | Real-time |
| Curation | Manual | Daily 2:45 AM | Timer (5 min) |
| Embedding | | snowflake | snowflake + mxbai |
| Curator LLM | | qwen3:4b | qwen3:30b |
| State tracking | | | curated tag |
| Batch size | | 24h worth | Configurable |
| JSON parsing | | Fallback needed | Native (30b) |

Key Improvements v2:

  • Real-time capture (no batch delay)
  • Timer-based curation (responsive vs daily)
  • 30b curator (better gems, faster ~3s)
  • curated tag (reliable state tracking)
  • No Redis dependency (simpler stack)

Architecture

v2.2: Timer-Based Curation

┌─────────────────┐     ┌──────────────────────┐     ┌─────────────┐
│  OpenClaw Chat  │────▶│ Real-Time Watcher    │────▶│ memories_tr │
│  (Session JSONL)│     │ (Python daemon)      │     │ (Qdrant)    │
└─────────────────┘     └──────────────────────┘     └──────┬──────┘
                                                          │
                                                          │ Every 30 min
                                                          ▼
                                                ┌──────────────────┐
                                                │  Timer Curator   │
                                                │   (cron/qwen3)   │
                                                └────────┬─────────┘
                                                         │
                                                         ▼
                                                ┌──────────────────┐
                                                │    gems_tr       │
                                                │   (Qdrant)       │
                                                └────────┬─────────┘
                                                         │
                                              Per turn   │
                                                         ▼
                                                ┌──────────────────┐
                                                │ memory-qdrant    │
                                                │     plugin       │
                                                └──────────────────┘

Key Changes in v2.2:

  • Timer-based curation (30 min intervals)
  • All memories tagged curated: false on capture
  • Migration complete (12,378 memories)
  • Removed daily batch processing (2:45 AM)

Components

1. Real-Time Watcher

File: skills/qdrant-memory/scripts/realtime_qdrant_watcher.py

What it does:

  • Watches ~/.openclaw/agents/main/sessions/*.jsonl
  • Parses each turn (user + AI)
  • Embeds with snowflake-arctic-embed2
  • Stores to memories_tr instantly
  • Cleans: Removes markdown, tables, metadata
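
The watch-and-parse step above can be sketched roughly as follows. This is a minimal sketch, not the code in realtime_qdrant_watcher.py: the `read_new_turns` helper and the `role`/`content` field names are assumptions made for illustration (the real session JSONL schema may differ), and embedding and the Qdrant upsert are left out.

```python
import json
from pathlib import Path


def read_new_turns(path: Path, offset: int) -> tuple[list[dict], int]:
    """Read JSONL turns appended after byte `offset`; return (turns, new_offset).

    Assumes each line is a JSON object with "role" and "content" keys
    (hypothetical schema; the real session format may differ).
    """
    turns = []
    with path.open("rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a partially written line: retry on the next poll
            offset = fh.tell()
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of crashing the daemon
            if isinstance(record, dict) and record.get("role") in ("user", "assistant"):
                turns.append({"role": record["role"], "content": record.get("content", "")})
    return turns, offset
```

Tracking a byte offset per file lets a polling loop pick up only the turns written since the last pass, and stopping at an unterminated line avoids parsing a turn that is still being written.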

Service: mem-qdrant-watcher.service

Commands:

# Check status
sudo systemctl status mem-qdrant-watcher

# View logs
sudo journalctl -u mem-qdrant-watcher -f

# Restart
sudo systemctl restart mem-qdrant-watcher

2. Content Cleaner

File: skills/qdrant-memory/scripts/clean_memories_tr.py

Purpose: Batch-clean existing points

Usage:

# Preview changes
python3 clean_memories_tr.py --dry-run

# Clean all
python3 clean_memories_tr.py --execute

# Clean 100 (test)
python3 clean_memories_tr.py --execute --limit 100

Cleans:

  • **bold** → plain text
  • |tables| → removed
  • `code` → plain text
  • --- rules → removed
  • # headers → removed
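
The cleaning rules above could be approximated with a few regular expressions. This is a minimal sketch, not the logic in clean_memories_tr.py: the `clean_text` name is hypothetical, and it assumes that "removed" means the whole table row, rule, or header line is dropped.

```python
import re


def clean_text(text: str) -> str:
    """Strip common markdown markup, keeping the readable text."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("|") and stripped.endswith("|"):
            continue  # table row: removed
        if re.fullmatch(r"-{3,}", stripped):
            continue  # horizontal rule: removed
        if re.match(r"#{1,6}\s", stripped):
            continue  # header line: removed (assumption: the whole line goes)
        lines.append(line)
    text = "\n".join(lines)
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)  # **bold** -> bold
    text = re.sub(r"`([^`]+)`", r"\1", text)      # `code` -> code
    return text.strip()
```

For example, `clean_text("Use **bold** and `code` here.")` returns the same sentence with the markers stripped.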

3. Timer Curator

File: tr-continuous/curator_timer.py

Schedule: Every 30 minutes (cron)

Flow:

  1. Query uncurated memories from memories_tr
  2. Send batch to qwen3 (max 100)
  3. Extract gems → store to gems_tr
  4. Mark memories as curated: true
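
The four steps above can be sketched as one curation pass. This is a minimal sketch under stated assumptions, not the contents of curator_timer.py: the function names and the injected callables are hypothetical, and memories are assumed to be dicts with an `id` key. The two request bodies follow Qdrant's REST API for scrolling points with a filter and for setting payload fields.

```python
def uncurated_scroll_body(max_batch_size: int = 100) -> dict:
    """Body for POST /collections/memories_tr/points/scroll (step 1)."""
    return {
        "filter": {"must": [{"key": "curated", "match": {"value": False}}]},
        "limit": max_batch_size,
        "with_payload": True,
    }


def mark_curated_body(point_ids: list) -> dict:
    """Body for POST /collections/memories_tr/points/payload (step 4)."""
    return {"payload": {"curated": True}, "points": point_ids}


def curate_once(fetch, extract_gems, store_gems, mark_curated, max_batch_size=100):
    """Run one curation pass; the four callables are injected (hypothetical)."""
    memories = fetch(uncurated_scroll_body(max_batch_size))  # step 1: query uncurated
    if not memories:
        return 0
    gems = extract_gems(memories)                            # step 2: send batch to the LLM
    store_gems(gems)                                         # step 3: write to gems_tr
    ids = [m["id"] for m in memories]
    mark_curated(mark_curated_body(ids))                     # step 4: flag as curated
    return len(gems)
```

Marking memories as curated only after the gems are stored means a failed run leaves the batch untouched, so the next timer run retries it.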

Config: tr-continuous/curator_config.json

{
  "timer_minutes": 30,
  "max_batch_size": 100
}

Logs: /var/log/true-recall-timer.log


4. Curation Model Comparison

Current: qwen3:4b-instruct

| Metric | 4b | 30b |
|---|---|---|
| Speed | ~10-30s per batch | ~3.3s (tested 2026-02-24) |
| JSON reliability | ⚠️ Needs fallback | Native |
| Context quality | Basic extraction | Nuanced |
| Snippet accuracy | ~80% | Expected: 95%+ |

30b Benchmark (2026-02-24):

  • Load: 108ms
  • Prompt eval: 49ms (1,576 tok/s)
  • Generation: 2.9s (233 tokens, 80 tok/s)
  • Total: 3.26s
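
As a quick consistency check on the figures above, the reported rates can be recomputed from the timings. This is only arithmetic on the numbers already listed; the variable names are illustrative.

```python
load_s, prompt_eval_s, generation_s = 0.108, 0.049, 2.9
generated_tokens = 233
prompt_rate = 1576  # tok/s, as reported

generation_rate = generated_tokens / generation_s  # about 80 tok/s
prompt_tokens = prompt_rate * prompt_eval_s        # about 77 tokens
phase_sum = load_s + prompt_eval_s + generation_s  # about 3.06 s

print(f"generation: {generation_rate:.1f} tok/s")
print(f"prompt size: ~{prompt_tokens:.0f} tokens")
print(f"phases sum to {phase_sum:.2f}s of the 3.26s total")
```

The three phases account for roughly 3.06s; the remaining ~0.2s of the reported total is not broken out in the benchmark.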

Trade-offs:

  • 4b: Faster batch processing, lightweight, catches explicit decisions
  • 30b: Deeper context, better inference, ~3x slower but superior quality

Gem Quality Comparison (Sample Review):

| Aspect | 4b | 30b |
|---|---|---|
| Context depth | "Extracted via fallback" | Explains why decisions were made |
| Confidence scores | 0.7-0.85 | 0.9-0.97 |
| Snippet accuracy | ~80% (wrong source) | 95%+ (relevant quotes) |
| Categories | Generic "extracted" | Specific: knowledge, technical, decision |
| Example | "User implemented BorgBackup" (no context) | "User selected mxbai... due to top MTEB score of 66.5" (explains reasoning) |

Verdict: 30b produces significantly higher quality gems — richer context, accurate snippets, and captures architectural intent, not just surface facts.


5. Semantic Deduplication (Similarity Checking)

Why: Smaller models (4b) often extract duplicate or near-duplicate gems. Without checking, your gems_tr collection fills with redundant entries.

The Problem:

  • "User decided on Redis" and "User selected Redis for caching" are the same gem
  • Smaller models lack nuance — they extract surface variations as separate gems
  • Over time, 30-50% of gems may be duplicates

Solution: Semantic Similarity Check

Before inserting a new gem:

  1. Embed the candidate gem text
  2. Search gems_tr for similar embeddings (past 24h)
  3. If similarity > 0.85, SKIP (don't insert)
  4. If similarity 0.70-0.85, MERGE (update existing with richer context)
  5. If similarity < 0.70, INSERT (new unique gem)
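
The three-way decision above maps onto a small function. This is a minimal sketch: the `dedup_action` and `decide` names are hypothetical, and treating a score of exactly 0.85 or 0.70 as MERGE is an assumption about how the boundaries are meant to be read.

```python
def dedup_action(similarity: float, skip_above: float = 0.85, merge_from: float = 0.70) -> str:
    """Map the best similarity score to "skip", "merge", or "insert"."""
    if similarity > skip_above:
        return "skip"    # near-identical gem already stored
    if similarity >= merge_from:
        return "merge"   # related gem: enrich the existing entry
    return "insert"      # nothing similar: store as a new gem


def decide(scores: list[float]) -> str:
    """Decide from all search hits; with no hits the gem is new."""
    return dedup_action(max(scores, default=0.0))
```

Only the highest-scoring hit matters for the decision, so `decide([0.72, 0.91])` yields "skip" and `decide([])` yields "insert".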

Implementation Options:

Option A: Built-in Curator Check

Modify curator_timer.py to add a pre-insertion similarity check:

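import logging
from datetime import datetime, timedelta, timezone

import requests
from qdrant_client import QdrantClient, models

log = logging.getLogger(__name__)
qdrant = QdrantClient(url="http://<QDRANT_IP>:6333")

def is_duplicate(gem_text: str, user_id: str = "rob", threshold: float = 0.85) -> bool:
    """Return True if a similar gem was stored in the past 24 hours."""
    # Embed the candidate
    response = requests.post(
        "http://<OLLAMA_IP>:11434/api/embeddings",
        json={"model": "mxbai-embed-large", "prompt": gem_text},
        timeout=30,
    )
    response.raise_for_status()
    embedding = response.json()["embedding"]

    # Qdrant filters have no "now-24h" shorthand, so compute the cutoff here.
    # Assumes the "timestamp" payload field holds RFC 3339 datetimes.
    cutoff = datetime.now(timezone.utc) - timedelta(hours=24)

    # Search for similar gems (newer qdrant-client releases favor query_points)
    results = qdrant.search(
        collection_name="gems_tr",
        query_vector=embedding,
        limit=3,
        query_filter=models.Filter(must=[
            models.FieldCondition(key="user_id", match=models.MatchValue(value=user_id)),
            models.FieldCondition(key="timestamp", range=models.DatetimeRange(gte=cutoff)),
        ]),
    )

    # Any hit above the threshold counts as a duplicate
    return any(result.score > threshold for result in results)

# In the main loop, before inserting (gems = the list extracted in this run):
for gem in gems:
    if is_duplicate(gem["gem"]):
        log.info("Skipping duplicate gem: %s...", gem["gem"][:50])
        continue
    # ... insert gem into gems_tr as before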

Pros: Catches duplicates at source, no extra jobs

Cons: Adds ~50-100ms per gem (embedding call)

Option B: Periodic AI Review (Subagent Task)

Have a subagent periodically review and merge duplicates:

# Run weekly via cron
0 3 * * 0 cd <PROJECT_PATH> && python3 dedup_gems.py

dedup_gems.py approach:

  1. Load all gems from past 7 days
  2. Group by semantic similarity (clustering)
  3. For each cluster > 1 gem:
    • Keep highest confidence gem as primary
    • Merge context from others into primary
    • Delete duplicates
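
The grouping step above could look like the following. This is a minimal sketch, not an existing dedup_gems.py: it swaps in a simple greedy clustering on cosine similarity rather than a full clustering algorithm, the `id`, `vector`, and `confidence` field names are assumptions, and merging context into the primary gem is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_gems(gems, threshold=0.85):
    """Greedy clustering: a gem joins the first cluster whose primary gem is at
    least `threshold` similar, otherwise it starts a new cluster."""
    clusters = []
    # Highest confidence first, so each cluster's first member is its primary.
    for gem in sorted(gems, key=lambda g: g["confidence"], reverse=True):
        for cluster in clusters:
            if cosine(gem["vector"], cluster[0]["vector"]) >= threshold:
                cluster.append(gem)
                break
        else:
            clusters.append([gem])
    return clusters


def plan_merges(gems, threshold=0.85):
    """Return (primary_id, [duplicate_ids]) for every cluster with more than one gem."""
    return [
        (c[0]["id"], [g["id"] for g in c[1:]])
        for c in cluster_gems(gems, threshold)
        if len(c) > 1
    ]
```

Returning a merge plan instead of deleting directly makes it possible to review the result, or hand it to a reasoning model, before anything is removed from gems_tr.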

Pros: Can use reasoning model for nuanced merging

Cons: Batch job, duplicates exist until cleanup runs

Option C: Real-time Watcher Hook

Add deduplication to the real-time watcher before memories are even stored:

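# In watcher, before upsert to memories_tr.
# find_similar_recent() is a hypothetical helper: it returns the ID of a
# similar memory from the past hour, or None if there is none.
similar_id = find_similar_recent(memory_text, window="1h")
if similar_id is not None:
    memory["duplicate_of"] = similar_id  # Tag but still store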

Pros: Prevents duplicate memories upstream

Cons: Memories may differ slightly even if gems would be same

Recommendation by Model:

| Model | Recommended Approach | Reason |
|---|---|---|
| 4b | Option A + B | Built-in check prevents duplicates; periodic review catches edge cases |
| 30b | Option B only | 30b produces fewer duplicates; weekly review sufficient |
| Production | Option A | Best balance of prevention and performance |

Configuration:

Add to curator_config.json:

{
  "deduplication": {
    "enabled": true,
    "similarity_threshold": 0.85,
    "lookback_hours": 24,
    "mode": "skip"
  }
}

Valid values for mode: "skip", "merge", or "flag".

6. OpenClaw Compactor Configuration

Status: Applied

Goal: Minimal overhead — just remove context, do nothing else.

Config Applied:

{
  agents: {
    defaults: {
      compaction: {
        mode: "default",              // "default" or "safeguard"
        reserveTokensFloor: 0,        // Disable safety floor (default: 20000)
        memoryFlush: {
          enabled: false              // Disable silent .md file writes
        }
      }
    }
  }
}

What this does:

  • mode: "default" — Standard summarization (faster)
  • reserveTokensFloor: 0 — Allow aggressive settings (disables 20k minimum)
  • memoryFlush.enabled: false — No silent "write memory" turns

Known Issue: UI Glitch During Compaction

When compaction runs, the Control UI may briefly behave unexpectedly:

  • Typed text may not appear immediately after hitting Enter
  • Messages may render out of order briefly
  • UI "catches up" within 1-2 seconds after compaction completes

Why: Compaction replaces the full conversation history with a summary. The UI's WebSocket state can get briefly out of sync during this transition.

Workaround:

  • Wait 2-3 seconds after hitting Enter during compaction
  • Or hard refresh (Ctrl+Shift+R) if UI seems stuck
  • Note: This is an OpenClaw Control UI limitation — cannot be fixed from TrueRecall side at this time.

Note: reserveTokens and keepRecentTokens are Pi runtime settings, not configurable via agents.defaults.compaction. They are set per-model in contextWindow/contextTokens.


7. Configuration Options Reference

All configurable options with defaults:

| Option | Default | Description |
|---|---|---|
| Embedding model | mxbai-embed-large | Model for generating gem embeddings. mxbai = higher accuracy (MTEB 66.5). snowflake = faster processing. |
| Timer interval | 5 minutes | How often the curator runs. 5 min = fast backlog clearing. 30 min = balanced. 60 min = minimal overhead. |
| Batch size | 100 | Max memories sent to curator per run. Higher = fewer API calls but more memory usage. |
| Max gems per run | (unlimited) | Hard limit on gems extracted per batch. Not set by default; extracts all found gems. |
| Qdrant URL | http://<QDRANT_IP>:6333 | Vector database endpoint. Change if Qdrant runs on different host/port. |
| Ollama URL | http://<OLLAMA_IP>:11434 | LLM endpoint for gem extraction. Change if Ollama runs elsewhere. |
| Curator LLM | qwen3:30b-a3b-instruct | Model for extracting gems. 30b = best quality (~3s). 4b = faster but needs JSON fallback. |
| User ID | rob | Owner identifier for memories. Used for filtering and multi-user setups. |
| Source collection | memories_tr | Qdrant collection for raw captured memories. |
| Target collection | gems_tr | Qdrant collection for curated gems (injected into context). |
| Watcher service | enabled | Real-time capture daemon. Reads session JSONL and writes to Qdrant. |
| Cron timer | enabled | Periodic curation job. Runs curator_timer.py on schedule. |
| Log path | /var/log/true-recall-timer.log | Where curator output is written. Check with tail -f. |
| Dry-run mode | disabled | Test mode: shows what would be curated without writing to Qdrant. |

OpenClaw-side options:

| Option | Default | Description |
|---|---|---|
| Compactor mode | default | How context is summarized. default = fast standard. safeguard = chunked for very long sessions. |
| Memory flush | disabled | If enabled, writes silent "memory" turn before compaction. Adds overhead; disabled for minimal lag. |
| Context pruning | cache-ttl | Removes old tool results from context. cache-ttl = prunes hourly. off = no pruning. |

8. Embedding Models

Current Setup:

  • memories_tr: snowflake-arctic-embed2 (capture similarity)
  • gems_tr: mxbai-embed-large (recall similarity)

Rationale:

  • mxbai has higher MTEB score (66.5) for semantic search
  • snowflake is faster for high-volume capture

Note: For simplicity, a single embedding model could be used for both collections. This would reduce complexity and memory overhead, though with slightly lower recall performance.


9. memory-qdrant Plugin

Location: ~/.openclaw/extensions/memory-qdrant/

Config (openclaw.json):

{
  "collectionName": "gems_tr",
  "captureCollection": "memories_tr",
  "autoRecall": true,
  "autoCapture": true
}

Functions:

  • Recall: Searches gems_tr, injects gems (hidden)
  • Capture: Session-level to memories_tr (backup)

Files & Locations

Core Project

~/.openclaw/workspace/.projects/true-recall-v2/
├── README.md                    # This file
├── session.md                   # Detailed notes
├── curator-prompt.md            # Extraction prompt
├── tr-daily/
│   └── curate_from_qdrant.py   # Daily curator
└── shared/

New Files (2026-02-24)

| File | Purpose |
|---|---|
| tr-continuous/curator_timer.py | Timer curator (v2.2) |
| tr-continuous/curator_config.json | Curator settings |
| tr-continuous/migrate_add_curated.py | Migration script |
| skills/qdrant-memory/scripts/realtime_qdrant_watcher.py | Capture daemon |
| skills/qdrant-memory/mem-qdrant-watcher.service | Systemd service |

Archived Files (v2.1)

| File | Status | Note |
|---|---|---|
| tr-daily/curate_from_qdrant.py | 📦 Archived | Replaced by timer |
| tr-continuous/curator_by_count.py | 📦 Archived | Replaced by timer |

System Files

| File | Purpose |
|---|---|
| ~/.openclaw/extensions/memory-qdrant/ | Plugin code |
| ~/.openclaw/openclaw.json | Configuration |
| /etc/systemd/system/mem-qdrant-watcher.service | Service file |

Configuration

memory-qdrant Plugin

File: ~/.openclaw/openclaw.json

{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://<OLLAMA_IP>:11434",
      "qdrantUrl": "http://<QDRANT_IP>:6333"
    },
    "enabled": true
  }
}
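
To illustrate how maxRecallResults and minRecallScore interact, here is a minimal sketch of the selection they imply. It is not the plugin's code: the `select_gems` name, the `score` key, and the inclusive threshold are assumptions.

```python
def select_gems(hits, min_score=0.7, max_results=2):
    """Keep hits scoring at least `min_score`, best first, capped at `max_results`."""
    kept = [h for h in hits if h["score"] >= min_score]
    kept.sort(key=lambda h: h["score"], reverse=True)
    return kept[:max_results]
```

With the values above, a turn receives at most two gems, and none at all when nothing in gems_tr scores 0.7 or higher.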

Gateway Control UI (OpenClaw 2026.2.23)

{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}

Validation

Check Collections

# Count points
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'

# View recent captures
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 3, "with_payload": true}' | jq '.result.points[].payload.content'

Check Services

# Watcher
sudo systemctl status mem-qdrant-watcher
sudo journalctl -u mem-qdrant-watcher -n 20

# OpenClaw
openclaw status
openclaw gateway status

Test Capture

Send a message, then check:

# Should increase by 1-2 points
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
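
The same before/after check can be scripted. This is a minimal sketch using only the standard library; the function names are hypothetical, and the response shape (`result.points_count`) is the one the curl and jq commands above rely on.

```python
import json
import urllib.request


def parse_points_count(body: bytes) -> int:
    """Extract result.points_count from a collection-info response."""
    return json.loads(body)["result"]["points_count"]


def points_count(base_url: str, collection: str, timeout: float = 5.0) -> int:
    """Fetch the current point count, e.g. points_count("http://<QDRANT_IP>:6333", "memories_tr")."""
    url = f"{base_url}/collections/{collection}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_points_count(resp.read())


def capture_delta(before: int, after: int) -> str:
    """Describe the change in point count after sending a test message."""
    added = after - before
    return f"captured {added} new point(s)" if added > 0 else "no new points captured"
```

Call `points_count` once before sending the test message and once after, then pass both values to `capture_delta`.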

Troubleshooting

Watcher Not Capturing

# Check logs
sudo journalctl -u mem-qdrant-watcher -f

# Verify dependencies
curl http://<QDRANT_IP>:6333/          # Qdrant
curl http://<OLLAMA_IP>:11434/api/tags # Ollama

Plugin Not Loading

# Validate config
openclaw config validate

# Check logs
tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep memory-qdrant

# Restart gateway
openclaw gateway restart

Gateway Won't Start (OpenClaw 2026.2.23+)

Error: non-loopback Control UI requires gateway.controlUi.allowedOrigins

Fix: Add to openclaw.json:

"gateway": {
  "controlUi": {
    "allowedOrigins": ["*"]
  }
}

Status Summary

| Component | Status | Notes |
|---|---|---|
| Real-time watcher | Active | PID 1748, capturing |
| memories_tr | 12,378 pts | All tagged curated: false |
| gems_tr | 5 pts | Injection ready |
| Timer curator | Deployed | Every 30 min via cron |
| Plugin injection | Working | Uses gems_tr |
| Migration | Complete | 12,378 memories |

Logs: tail /var/log/true-recall-timer.log

Next: Monitor first timer run


Roadmap

Planned Features

| Feature | Status | Description |
|---|---|---|
| Interactive install script | Planned | Prompts for embedding model, timer interval, batch size, endpoints |
| Single embedding model | Planned | Option to use one model for both collections |
| Configurable thresholds | Planned | Per-user customization via prompts |

Install script will prompt for:

  1. Embedding model — snowflake (fast) vs mxbai (accurate)
  2. Timer interval — 5 min / 30 min / hourly
  3. Batch size — 50 / 100 / 500 memories
  4. Endpoints — Qdrant/Ollama URLs
  5. User ID — for multi-user setups

Maintained by: Rob
AI Assistant: Kimi 🎙️
Version: 2026.02.24-v2.2

Description
True Recall - AI memory extraction and gem detection system