Go to file

root 99a1aabd11 docs: Add Semantic Deduplication section with similarity checking

- Why smaller models need deduplication (4b vs 30b)
- Three implementation options (built-in, periodic AI, watcher hook)
- Code example for pre-insertion similarity check
- Configuration options for deduplication settings
- Recommendations by model size
- Fixed section numbering

2026-02-24 21:15:04 -06:00

__pycache__

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

tr-compact

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

tr-continuous

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

tr-daily

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

checklist.md

docs: Add Security & Privacy Review section to checklist

2026-02-24 21:01:52 -06:00

curator-prompt.md

fix: Strengthen curator prompt to prevent malformed gems

2026-02-24 21:12:00 -06:00

debug_curator.py

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

install.py

feat: Add plugin configuration options to installer

2026-02-24 20:59:59 -06:00

migrate_memories.py

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

README.md

docs: Add Semantic Deduplication section with similarity checking

2026-02-24 21:15:04 -06:00

README.md.bak.2026-02-24-1518

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

README.md.bak.2026-02-24-1518-final

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

README.md.neuralstream.bak

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

session.md

security: Remove private IPs and paths from repository

2026-02-24 21:02:53 -06:00

session.md.bak.2026-02-24-1518

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

session.md.neuralstream.bak

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

test_curator.py

Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation

2026-02-24 20:27:44 -06:00

README.md

TrueRecall v2

Project: Gem extraction and memory recall system
Status: ✅ Active & Verified
Location: ~/.openclaw/workspace/.projects/true-recall-v2/
Last Updated: 2026-02-24 19:02 CST

Quick Start
Overview
Current State
Architecture
Components
Files & Locations
Configuration
Validation
Troubleshooting
Status Summary

Quick Start

# Check system status
openclaw status
sudo systemctl status mem-qdrant-watcher

# View recent captures
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'

# Check collections
curl -s http://<QDRANT_IP>:6333/collections | jq '.result.collections[].name'

Overview

TrueRecall v2 extracts "gems" (key insights) from conversations and injects them as context. It consists of three layers:

Capture — Real-time watcher saves every turn to memories_tr
Curation — Daily curator extracts gems to gems_tr
Injection — Plugin searches gems_tr and injects gems per turn

Current State

Verified at 19:02 CST

Collection	Points	Purpose	Status
`memories_tr`	12,378	Full text (live capture)	✅ Active
`gems_tr`	5	Curated gems (injection)	✅ Active

All memories tagged with curated: false for timer curation.

Services Status

Service	Status	Details
`mem-qdrant-watcher`	✅ Active	PID 1748, capturing
Timer curator	✅ Deployed	Every 30 min via cron
OpenClaw Gateway	✅ Running	Version 2026.2.23
memory-qdrant plugin	✅ Loaded	recall: gems_tr

Comparison: TrueRecall v2 vs Jarvis Memory vs v1

Feature	Jarvis Memory	TrueRecall v1	TrueRecall v2
Storage	Redis	Redis + Qdrant	Qdrant only
Capture	Session batch	Session batch	Real-time
Curation	Manual	Daily 2:45 AM	Timer (5 min)
Embedding	—	snowflake	snowflake + mxbai
Curator LLM	—	qwen3:4b	qwen3:30b
State tracking	—	—	`curated` tag
Batch size	—	24h worth	Configurable
JSON parsing	—	Fallback needed	Native (30b)

Key Improvements v2:

✅ Real-time capture (no batch delay)
✅ Timer-based curation (responsive vs daily)
✅ 30b curator (better gems, faster ~3s)
✅ curated tag (reliable state tracking)
✅ No Redis dependency (simpler stack)

Architecture

v2.2: Timer-Based Curation

┌─────────────────┐     ┌──────────────────────┐     ┌─────────────┐
│  OpenClaw Chat  │────▶│ Real-Time Watcher    │────▶│ memories_tr │
│  (Session JSONL)│     │ (Python daemon)      │     │ (Qdrant)    │
└─────────────────┘     └──────────────────────┘     └──────┬──────┘
                                                          │
                                                          │ Every 30 min
                                                          ▼
                                                ┌──────────────────┐
                                                │  Timer Curator   │
                                                │   (cron/qwen3)   │
                                                └────────┬─────────┘
                                                         │
                                                         ▼
                                                ┌──────────────────┐
                                                │    gems_tr       │
                                                │   (Qdrant)       │
                                                └────────┬─────────┘
                                                         │
                                              Per turn   │
                                                         ▼
                                                ┌──────────────────┐
                                                │ memory-qdrant    │
                                                │     plugin       │
                                                └──────────────────┘

Key Changes in v2.2:

✅ Timer-based curation (30 min intervals)
✅ All memories tagged curated: false on capture
✅ Migration complete (12,378 memories)
❌ Removed daily batch processing (2:45 AM)

Components

1. Real-Time Watcher

File: skills/qdrant-memory/scripts/realtime_qdrant_watcher.py

What it does:

Watches ~/.openclaw/agents/main/sessions/*.jsonl
Parses each turn (user + AI)
Embeds with snowflake-arctic-embed2
Stores to memories_tr instantly
Cleans: Removes markdown, tables, metadata

Service: mem-qdrant-watcher.service

Commands:

# Check status
sudo systemctl status mem-qdrant-watcher

# View logs
sudo journalctl -u mem-qdrant-watcher -f

# Restart
sudo systemctl restart mem-qdrant-watcher

2. Content Cleaner

File: skills/qdrant-memory/scripts/clean_memories_tr.py

Purpose: Batch-clean existing points

Usage:

# Preview changes
python3 clean_memories_tr.py --dry-run

# Clean all
python3 clean_memories_tr.py --execute

# Clean 100 (test)
python3 clean_memories_tr.py --execute --limit 100

Cleans:

**bold** → plain text
|tables| → removed
`code` → plain text
--- rules → removed
# headers → removed

3. Timer Curator

File: tr-continuous/curator_timer.py

Schedule: Every 30 minutes (cron)

Flow:

Query uncurated memories from memories_tr
Send batch to qwen3 (max 100)
Extract gems → store to gems_tr
Mark memories as curated: true

Config: tr-continuous/curator_config.json

{
  "timer_minutes": 30,
  "max_batch_size": 100
}

Logs: /var/log/true-recall-timer.log

4. Curation Model Comparison

Current: qwen3:4b-instruct

Metric	4b	30b
Speed	~10-30s per batch	~3.3s (tested 2026-02-24)
JSON reliability	⚠️ Needs fallback	✅ Native
Context quality	Basic extraction	✅ Nuanced
Snippet accuracy	~80%	✅ Expected: 95%+

30b Benchmark (2026-02-24):

Load: 108ms
Prompt eval: 49ms (1,576 tok/s)
Generation: 2.9s (233 tokens, 80 tok/s)
Total: 3.26s

Trade-offs:

4b: Faster batch processing, lightweight, catches explicit decisions
30b: Deeper context, better inference, ~3x slower but superior quality

Gem Quality Comparison (Sample Review):

Aspect	4b	30b
Context depth	"Extracted via fallback"	Explains why decisions were made
Confidence scores	0.7-0.85	0.9-0.97
Snippet accuracy	~80% (wrong source)	✅ 95%+ (relevant quotes)
Categories	Generic "extracted"	Specific: knowledge, technical, decision
Example	"User implemented BorgBackup" (no context)	"User selected mxbai... due to top MTEB score of 66.5" (explains reasoning)

Verdict: 30b produces significantly higher quality gems — richer context, accurate snippets, and captures architectural intent, not just surface facts.

5. Semantic Deduplication (Similarity Checking)

Why: Smaller models (4b) often extract duplicate or near-duplicate gems. Without checking, your gems_tr collection fills with redundant entries.

The Problem:

"User decided on Redis" and "User selected Redis for caching" are the same gem
Smaller models lack nuance — they extract surface variations as separate gems
Over time, 30-50% of gems may be duplicates

Solution: Semantic Similarity Check

Before inserting a new gem:

Embed the candidate gem text
Search gems_tr for similar embeddings (past 24h)
If similarity > 0.85, SKIP (don't insert)
If similarity 0.70-0.85, MERGE (update existing with richer context)
If similarity < 0.70, INSERT (new unique gem)

Implementation Options:

Option A: Built-in Curator Check (Recommended)

Modify curator_timer.py to add pre-insertion similarity check:

import numpy as np
from qdrant_client import QdrantClient

qdrant = QdrantClient("http://<QDRANT_IP>:6333")

def is_duplicate(gem_text: str, user_id: str = "rob", threshold: float = 0.85) -> bool:
    """Check if similar gem exists in past 24h"""
    # Embed the candidate
    response = requests.post(
        "http://<OLLAMA_IP>:11434/api/embeddings",
        json={"model": "mxbai-embed-large", "prompt": gem_text}
    )
    embedding = response.json()["embedding"]
    
    # Search for similar gems
    results = qdrant.search(
        collection_name="gems_tr",
        query_vector=embedding,
        limit=3,
        query_filter={
            "must": [
                {"key": "user_id", "match": {"value": user_id}},
                {"key": "timestamp", "range": {"gte": "now-24h"}}
            ]
        }
    )
    
    # Check similarity scores
    for result in results:
        if result.score > threshold:
            return True  # Duplicate found
    return False

# In main loop, before inserting:
if is_duplicate(gem["gem"]):
    log.info(f"Skipping duplicate gem: {gem['gem'][:50]}...")
    continue

Pros: Catches duplicates at source, no extra jobs Cons: Adds ~50-100ms per gem (embedding call)

Option B: Periodic AI Review (Subagent Task)

Have a subagent periodically review and merge duplicates:

# Run weekly via cron
0 3 * * 0 cd <PROJECT_PATH> && python3 dedup_gems.py

dedup_gems.py approach:

Load all gems from past 7 days
Group by semantic similarity (clustering)
For each cluster > 1 gem:
- Keep highest confidence gem as primary
- Merge context from others into primary
- Delete duplicates

Pros: Can use reasoning model for nuanced merging Cons: Batch job, duplicates exist until cleanup runs

Option C: Real-time Watcher Hook

Add deduplication to the real-time watcher before memories are even stored:

# In watcher, before upsert to memories_tr
if is_similar_to_recent(memory_text, window="1h"):
    memory["duplicate_of"] = similar_id  # Tag but still store

Pros: Prevents duplicate memories upstream Cons: Memories may differ slightly even if gems would be same

Recommendation by Model:

Model	Recommended Approach	Reason
4b	Option A + B	Built-in check prevents duplicates; periodic review catches edge cases
30b	Option B only	30b produces fewer duplicates; weekly review sufficient
Production	Option A	Best balance of prevention and performance

Configuration:

Add to curator_config.json:

{
  "deduplication": {
    "enabled": true,
    "similarity_threshold": 0.85,
    "lookback_hours": 24,
    "mode": "skip"  // "skip", "merge", or "flag"
  }
}

6. OpenClaw Compactor Configuration

Status: ✅ Applied

Goal: Minimal overhead — just remove context, do nothing else.

Config Applied:

{
  agents: {
    defaults: {
      compaction: {
        mode: "default",              // "default" or "safeguard"
        reserveTokensFloor: 0,        // Disable safety floor (default: 20000)
        memoryFlush: {
          enabled: false              // Disable silent .md file writes
        }
      }
    }
  }
}

What this does:

mode: "default" — Standard summarization (faster)
reserveTokensFloor: 0 — Allow aggressive settings (disables 20k minimum)
memoryFlush.enabled: false — No silent "write memory" turns

Note: reserveTokens and keepRecentTokens are Pi runtime settings, not configurable via agents.defaults.compaction. They are set per-model in contextWindow/contextTokens.

7. Configuration Options Reference

All configurable options with defaults:

Option	Default	Description
Embedding model	`mxbai-embed-large`	Model for generating gem embeddings. `mxbai` = higher accuracy (MTEB 66.5). `snowflake` = faster processing.
Timer interval	`5` minutes	How often the curator runs. `5 min` = fast backlog clearing. `30 min` = balanced. `60 min` = minimal overhead.
Batch size	`100`	Max memories sent to curator per run. Higher = fewer API calls but more memory usage.
Max gems per run	(unlimited)	Hard limit on gems extracted per batch. Not set by default — extracts all found gems.
Qdrant URL	`http://<QDRANT_IP>:6333`	Vector database endpoint. Change if Qdrant runs on different host/port.
Ollama URL	`http://<OLLAMA_IP>:11434`	LLM endpoint for gem extraction. Change if Ollama runs elsewhere.
Curator LLM	`qwen3:30b-a3b-instruct`	Model for extracting gems. `30b` = best quality (~3s). `4b` = faster but needs JSON fallback.
User ID	`rob`	Owner identifier for memories. Used for filtering and multi-user setups.
Source collection	`memories_tr`	Qdrant collection for raw captured memories.
Target collection	`gems_tr`	Qdrant collection for curated gems (injected into context).
Watcher service	`enabled`	Real-time capture daemon. Reads session JSONL and writes to Qdrant.
Cron timer	`enabled`	Periodic curation job. Runs `curator_timer.py` on schedule.
Log path	`/var/log/true-recall-timer.log`	Where curator output is written. Check with `tail -f`.
Dry-run mode	`disabled`	Test mode — shows what would be curated without writing to Qdrant.

OpenClaw-side options:

Option	Default	Description
Compactor mode	`default`	How context is summarized. `default` = fast standard. `safeguard` = chunked for very long sessions.
Memory flush	`disabled`	If enabled, writes silent "memory" turn before compaction. Adds overhead — disabled for minimal lag.
Context pruning	`cache-ttl`	Removes old tool results from context. `cache-ttl` = prunes hourly. `off` = no pruning.

8. Embedding Models

Current Setup:

memories_tr: snowflake-arctic-embed2 (capture similarity)
gems_tr: mxbai-embed-large (recall similarity)

Rationale:

mxbai has higher MTEB score (66.5) for semantic search
snowflake is faster for high-volume capture

Note: For simplicity, a single embedding model could be used for both collections. This would reduce complexity and memory overhead, though with slightly lower recall performance.

9. memory-qdrant Plugin

Location: ~/.openclaw/extensions/memory-qdrant/

Config (openclaw.json):

{
  "collectionName": "gems_tr",
  "captureCollection": "memories_tr",
  "autoRecall": true,
  "autoCapture": true
}

Functions:

Recall: Searches gems_tr, injects gems (hidden)
Capture: Session-level to memories_tr (backup)

Files & Locations

Core Project

~/.openclaw/workspace/.projects/true-recall-v2/
├── README.md                    # This file
├── session.md                   # Detailed notes
├── curator-prompt.md            # Extraction prompt
├── tr-daily/
│   └── curate_from_qdrant.py   # Daily curator
└── shared/

New Files (2026-02-24)

File	Purpose
`tr-continuous/curator_timer.py`	Timer curator (v2.2)
`tr-continuous/curator_config.json`	Curator settings
`tr-continuous/migrate_add_curated.py`	Migration script
`skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`	Capture daemon
`skills/qdrant-memory/mem-qdrant-watcher.service`	Systemd service

Archived Files (v2.1)

File	Status	Note
`tr-daily/curate_from_qdrant.py`	📦 Archived	Replaced by timer
`tr-continuous/curator_by_count.py`	📦 Archived	Replaced by timer

System Files

File	Purpose
`~/.openclaw/extensions/memory-qdrant/`	Plugin code
`~/.openclaw/openclaw.json`	Configuration
`/etc/systemd/system/mem-qdrant-watcher.service`	Service file

Configuration

memory-qdrant Plugin

File: ~/.openclaw/openclaw.json

{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://<OLLAMA_IP>:11434",
      "qdrantUrl": "http://<QDRANT_IP>:6333"
    },
    "enabled": true
  }
}

Gateway Control UI (OpenClaw 2026.2.23)

{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}

Validation

Check Collections

# Count points
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'

# View recent captures
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 3, "with_payload": true}' | jq '.result.points[].payload.content'

Check Services

# Watcher
sudo systemctl status mem-qdrant-watcher
sudo journalctl -u mem-qdrant-watcher -n 20

# OpenClaw
openclaw status
openclaw gateway status

Test Capture

Send a message, then check:

# Should increase by 1-2 points
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'

Troubleshooting

Watcher Not Capturing

# Check logs
sudo journalctl -u mem-qdrant-watcher -f

# Verify dependencies
curl http://<QDRANT_IP>:6333/          # Qdrant
curl http://<OLLAMA_IP>:11434/api/tags # Ollama

Plugin Not Loading

# Validate config
openclaw config validate

# Check logs
tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep memory-qdrant

# Restart gateway
openclaw gateway restart

Gateway Won't Start (OpenClaw 2026.2.23+)

Error: non-loopback Control UI requires gateway.controlUi.allowedOrigins

Fix: Add to openclaw.json:

"gateway": {
  "controlUi": {
    "allowedOrigins": ["*"]
  }
}

Status Summary

Component	Status	Notes
Real-time watcher	✅ Active	PID 1748, capturing
memories_tr	✅ 12,378 pts	All tagged `curated: false`
gems_tr	✅ 5 pts	Injection ready
Timer curator	✅ Deployed	Every 30 min via cron
Plugin injection	✅ Working	Uses gems_tr
Migration	✅ Complete	12,378 memories

Logs: tail /var/log/true-recall-timer.log

Next: Monitor first timer run

Roadmap

Planned Features

Feature	Status	Description
Interactive install script	⏳ Planned	Prompts for embedding model, timer interval, batch size, endpoints
Single embedding model	⏳ Planned	Option to use one model for both collections
Configurable thresholds	⏳ Planned	Per-user customization via prompts

Install script will prompt for:

Embedding model — snowflake (fast) vs mxbai (accurate)
Timer interval — 5 min / 30 min / hourly
Batch size — 50 / 100 / 500 memories
Endpoints — Qdrant/Ollama URLs
User ID — for multi-user setups

Maintained by: Rob
AI Assistant: Kimi 🎙️
Version: 2026.02.24-v2.2

README.md

TrueRecall v2

Table of Contents

Quick Start

Overview

Current State

Verified at 19:02 CST

Services Status

Comparison: TrueRecall v2 vs Jarvis Memory vs v1

Architecture

v2.2: Timer-Based Curation

Components

1. Real-Time Watcher

2. Content Cleaner

3. Timer Curator

4. Curation Model Comparison

5. Semantic Deduplication (Similarity Checking)

Option A: Built-in Curator Check (Recommended)

Option B: Periodic AI Review (Subagent Task)

Option C: Real-time Watcher Hook

6. OpenClaw Compactor Configuration

7. Configuration Options Reference

8. Embedding Models

9. memory-qdrant Plugin

Files & Locations

Core Project

New Files (2026-02-24)

Archived Files (v2.1)

System Files

Configuration

memory-qdrant Plugin

Gateway Control UI (OpenClaw 2026.2.23)

Validation

Check Collections

Check Services

Test Capture

Troubleshooting

Watcher Not Capturing

Plugin Not Loading

Gateway Won't Start (OpenClaw 2026.2.23+)

Status Summary

Roadmap

Planned Features