# TrueRecall v2 - Session Notes

**Last Updated:** 2026-02-25 12:04 CST
**Status:** ✅ **Context Injection FIXED & Working**
**Version:** v2.2.1 (Post-fix validation)

---

## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST)

### Issues Found & Fixed

| Issue | Root Cause | Fix Applied |
|-------|------------|-------------|
| **Context injection broken** | Embedding model mismatch | ✅ Changed curator from `mxbai-embed-large` to `snowflake-arctic-embed2` |
| **Gems had no vectors** | `store_gem()` used wrong field | ✅ Updated to use `text` field for embedding |
| **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt |
| **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields |
| **Silent embedding failures** | No error logging | ✅ Added explicit error messages |
| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` |
| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter |
| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
| **Third-person gems** | Gems were phrased in third person ("User decided...") | ✅ Changed to first person ("I decided...") for better matching |

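
The field-mismatch and gem-ID fixes in the table above can be sketched roughly as follows. This is a minimal illustration, not the code in `curator_timer.py`: the function names, the SHA-256 choice, and the 32-character ID length are assumptions. Only the `text`/`content` fallback and the `[:100]` slice come from the notes.

```python
import hashlib
from typing import Optional


def get_memory_text(payload: dict) -> Optional[str]:
    """Return the memory body, preferring `text` and falling back to `content`."""
    for key in ("text", "content"):
        value = payload.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None  # caller should log this instead of failing silently


def make_gem_id(embedding_text: str) -> str:
    """Derive a stable ID from the first 100 characters of the embedded text.

    Assumption: SHA-256 truncated to 32 hex characters; the real hash may differ.
    """
    digest = hashlib.sha256(embedding_text[:100].encode("utf-8")).hexdigest()
    return digest[:32]
```

Note that hashing only the first 100 characters means two gems sharing the same 100-character prefix map to the same ID, which may or may not be the intended behavior.
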
### Validation Results

```bash
# Test query: "OpenClaw gateway update fixed gems"
# Result: Score 0.587 - SUCCESS ✅
```

**Current State:**
- ✅ Gems in `gems_tr` now have 1024-dim vectors
- ✅ Context injection returns relevant gems with scores >0.5
- ✅ Curator extracting and storing gems successfully
- ✅ All 5 fixes verified and working

### Files Modified

| File | Change |
|------|--------|
| `tr-continuous/curator_timer.py` | Embedding model, field handling, JSON parsing |
| `README.md` | Updated status and embedding model info |
| `function_check.md` | Added fixes section, updated sign-off |
| `session.md` | This update |

---

## Needed Improvements

| Issue | Description | Priority |
|-------|-------------|----------|
| **Semantic Deduplication** | No dedup between similar gems. Same fact phrased differently creates multiple gems. | High |
| **Search Result Deduplication** | Similar gems above threshold both injected, causing redundancy. | Medium |
| **Gem Quality Scoring** | No quality metric. Some gems may be low value. | Medium |
| **Temporal Decay** | All gems treated equally regardless of age. | Low |
| **Gem Merging/Updating** | When user changes preference, old gem still exists. | Low |
| **Importance Calibration** | All curator gems marked "medium". Should be dynamic. | Low |

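
One possible approach to the two deduplication items is a cosine-similarity check on gem vectors, both before storing a new gem and when filtering search hits. The sketch below is an assumption, not existing TrueRecall code: the function names, the `score`/`vector` keys on each hit, and the `0.92` threshold are all hypothetical and would need tuning.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_duplicate(candidate_vector, existing_vectors, threshold=0.92):
    """True if the candidate is at least `threshold` similar to any existing vector."""
    return any(
        cosine_similarity(candidate_vector, vector) >= threshold
        for vector in existing_vectors
    )


def dedupe_results(hits, threshold=0.92):
    """Keep the highest-scoring hits, skipping any that near-duplicate a kept hit."""
    kept = []
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if not is_duplicate(hit["vector"], [k["vector"] for k in kept], threshold):
            kept.append(hit)
    return kept
```

`is_duplicate` would sit in front of gem storage (semantic deduplication), and `dedupe_results` would run on recall hits before injection (search result deduplication).
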

---

## Session End (18:09 CST)

**Reason:** User starting new session

**Current State:**
- Real-time watcher: ✅ Active (capturing live)
- Timer curator: ✅ Deployed (every 5 min via cron)
- Daily curator: ❌ Removed (replaced by timer)
- Total memories: 12,729 (1,502 uncurated, 11,227 curated)
- Gems: 73 (actively extracting)

**Next session start:** Read this file, then check:
```bash
# Quick status
python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
sudo systemctl status mem-qdrant-watcher
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
```

---

## Executive Summary

TrueRecall v2 is a complete memory system with real-time capture, timer-based curation, and context injection. All components are operational.

---

## Current State (Verified 18:09 CST)

### Qdrant Collections

| Collection | Points | Purpose | Status |
|------------|--------|---------|--------|
| `memories_tr` | **12,729** | Full text (live capture) | ✅ Active |
| `gems_tr` | **73** | Curated gems (injection) | ✅ Active |
| `true_recall` | existing | Legacy archive | 📦 Preserved |
| `kimi_memories` | 12,223 | Original backup | 📦 Preserved |

**Note:** All memories are tagged `curated: false` on write so the timer curator can pick them up.

### Services

| Service | Status | Uptime / Details |
|---------|--------|------------------|
| `mem-qdrant-watcher` | ✅ Active | 30+ min |
| OpenClaw Gateway | ✅ Running | 2026.2.23 |
| memory-qdrant plugin | ✅ Loaded | recall: gems_tr, capture: memories_tr |

---

## Architecture

### v2.2: Timer-Based Curation (DEPLOYED)

**Data Flow:**
```
┌─────────────────┐     ┌──────────────────────┐     ┌─────────────┐
│  OpenClaw Chat  │────▶│  Real-Time Watcher   │────▶│ memories_tr │
│ (Session JSONL) │     │   (Python daemon)    │     │  (Qdrant)   │
└─────────────────┘     └──────────────────────┘     └──────┬──────┘
                                                            │
                                                            │ Every 30 min
                                                            ▼
                                                   ┌──────────────────┐
                                                   │  Timer Curator   │
                                                   │   (cron/qwen3)   │
                                                   └────────┬─────────┘
                                                            │
                                                            ▼
                                                   ┌──────────────────┐
                                                   │     gems_tr      │
                                                   │     (Qdrant)     │
                                                   └────────┬─────────┘
                                                            │
                                                   Per turn │
                                                            ▼
                                                   ┌──────────────────┐
                                                   │  memory-qdrant   │
                                                   │      plugin      │
                                                   └──────────────────┘
```

**Key Changes:**
- ✅ Replaced daily 2:45 AM batch with 30-minute timer
- ✅ All memories tagged `curated: false` on write
- ✅ Migration completed for 12,378 existing memories
- ✅ No Redis dependency (direct Qdrant only)

---

## Components

### Curation Mode: Timer-Based (DEPLOYED v2.2)

| Setting | Value | Adjustable |
|---------|-------|------------|
| **Trigger** | Cron timer | ✅ |
| **Interval** | 30 minutes | ✅ Config file |
| **Batch size** | 100 memories max | ✅ Config file |
| **Minimum** | None (0 is OK) | N/A |

**Config:** `/tr-continuous/curator_config.json`
```json
{
  "timer_minutes": 30,
  "max_batch_size": 100,
  "user_id": "rob",
  "source_collection": "memories_tr",
  "target_collection": "gems_tr"
}
```

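
A minimal sketch of how a script could load this config with fallbacks, assuming the keys shown above. The `load_config` helper and its validation rules are hypothetical; the actual loader in `curator_timer.py` is not shown in these notes.

```python
import json
from pathlib import Path

# Defaults mirror the config file documented above.
DEFAULTS = {
    "timer_minutes": 30,
    "max_batch_size": 100,
    "user_id": "rob",
    "source_collection": "memories_tr",
    "target_collection": "gems_tr",
}


def load_config(path):
    """Merge a JSON config file over DEFAULTS; a missing file yields the defaults."""
    config = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.is_file():
        with config_path.open(encoding="utf-8") as handle:
            config.update(json.load(handle))
    # Guard against values that would make the timer or batch loop meaningless.
    for key in ("timer_minutes", "max_batch_size"):
        if not isinstance(config[key], int) or config[key] < 1:
            raise ValueError(f"{key} must be a positive integer, got {config[key]!r}")
    return config
```
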
**Cron:**
```
*/30 * * * * cd .../tr-continuous && python3 curator_timer.py
```

**Old modes deprecated:**
- ❌ Turn-based (every N turns)
- ❌ Hybrid (timer + turn)
- ❌ Daily batch (2:45 AM)

### 1. Real-Time Watcher (Primary Capture)

**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`

**Function:**
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
- Parses every conversation turn in real-time
- Embeds with `snowflake-arctic-embed2` (Ollama @ <OLLAMA_IP>)
- Stores directly to `memories_tr` (no Redis)
- **Cleans content:** Removes markdown, tables, metadata, thinking tags

**Service:** `mem-qdrant-watcher.service`
- **Status:** Active since 16:46:53 CST
- **Systemd:** Enabled, auto-restart

**Log:** `journalctl -u mem-qdrant-watcher -f`

---

### 2. Content Cleaner (Existing Data)

**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`

**Function:**
- Batch-cleans existing `memories_tr` points
- Removes: `**bold**`, `|tables|`, `` `code` ``, `---` rules, `# headers`
- Flattens nested content dicts
- Rate-limited to prevent Qdrant overload

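
The markdown stripping described above might look something like the sketch below. This is an illustration under assumptions, not the contents of `clean_memories_tr.py`: the `clean_text` name is hypothetical, and it assumes table rows and rules are dropped entirely while bold, inline code, and header markers are removed but their text is kept.

```python
import re


def clean_text(text: str) -> str:
    """Strip common markdown markup from a memory body."""
    kept_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"-{3,}", stripped):
            continue  # horizontal rule
        if stripped.startswith("|") and stripped.endswith("|"):
            continue  # table row (including separator rows)
        kept_lines.append(line)
    text = "\n".join(kept_lines)
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)  # header markers
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)  # bold markers
    text = re.sub(r"`([^`]*)`", r"\1", text)  # inline code markers
    return re.sub(r"\n{3,}", "\n\n", text).strip()  # collapse blank runs
```
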
**Usage:**
```bash
# Dry run (preview)
python3 clean_memories_tr.py --dry-run

# Clean all
python3 clean_memories_tr.py --execute

# Clean limited (test)
python3 clean_memories_tr.py --execute --limit 100
```

---

### 3. Timer Curator (v2.2 - DEPLOYED)

**Replaces:** Daily curator (2:45 AM batch) and turn-based curator

**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`

**Schedule:** Every 30 minutes (cron)

**Flow:**
1. Query uncurated memories (`curated: false`)
2. Send batch to qwen3 (max 100)
3. Extract gems using curator prompt
4. Store gems to `gems_tr`
5. Mark processed memories as `curated: true`

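
The five-step flow above can be sketched as a single function with the Qdrant and LLM calls passed in as callables. This is a hedged outline, not the real `curator_timer.py`: `run_curation_cycle` and its four callable parameters are hypothetical names, and the memory dicts are assumed to carry an `id` key.

```python
def run_curation_cycle(fetch_uncurated, extract_gems, store_gem, mark_curated,
                       max_batch_size=100):
    """Run one timer tick: fetch, extract, store, then mark as curated.

    fetch_uncurated(limit) -> iterable of memory dicts with an "id" key
    extract_gems(batch)    -> list of gem dicts (the LLM call)
    store_gem(gem)         -> True if the gem was embedded and stored
    mark_curated(ids)      -> flips `curated` to true for the given memory IDs
    """
    batch = list(fetch_uncurated(max_batch_size))[:max_batch_size]
    if not batch:
        return {"memories": 0, "gems": 0}  # no minimum: an empty run is fine

    gems = extract_gems(batch)
    stored = sum(1 for gem in gems if store_gem(gem))

    # Mark last, so a crash before this point leaves the batch uncurated
    # and it is simply retried on the next tick.
    mark_curated([memory["id"] for memory in batch])
    return {"memories": len(batch), "gems": stored}
```

Marking memories only after gems are stored trades possible duplicate gems on retry for never losing a batch, which ties into the deduplication item under Needed Improvements.
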
**Files:**
| File | Purpose |
|------|---------|
| `curator_timer.py` | Main curator script |
| `curator_config.json` | Adjustable settings |
| `migrate_add_curated.py` | One-time migration (completed) |

**Usage:**
```bash
# Dry run (preview)
python3 curator_timer.py --dry-run

# Manual run
python3 curator_timer.py --config curator_config.json
```

**Status:** ✅ Deployed, first run will process ~12,378 existing memories

### 5. Silent Compacting (NEW - Concept)

**Idea:** Automatically remove old context from prompt when token limit approached.

**Behavior:**
- Trigger: Context window > 80% full
- Action: Remove oldest messages (silently)
- Preserve: Gems always kept, recent N turns kept
- Result: Seamless conversation without "compacting" notification

**Config:**
```json
{
  "compacting": {
    "enabled": true,
    "triggerAtPercent": 80,
    "keepRecentTurns": 20,
    "preserveGems": true,
    "silent": true
  }
}
```

**Status:** ⏳ Concept only - requires OpenClaw core changes

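
Since this is concept-only, the sketch below is just one way the behavior could work. Everything here is hypothetical: the `compact_messages` function, the `is_gem` flag on messages, and treating each message as one turn are assumptions, and the parameters only mirror the proposed config keys.

```python
def compact_messages(messages, token_count, context_limit,
                     trigger_at_percent=80, keep_recent_turns=20,
                     preserve_gems=True):
    """Drop the oldest unprotected messages until usage is back under the trigger.

    token_count(message) -> int is supplied by the caller (tokenizer-specific).
    """
    budget = context_limit * trigger_at_percent / 100
    total = sum(token_count(message) for message in messages)
    if total <= budget:
        return list(messages)  # below the trigger: nothing to do

    protected_from = max(0, len(messages) - keep_recent_turns)
    kept = []
    for index, message in enumerate(messages):
        is_recent = index >= protected_from
        is_gem = preserve_gems and message.get("is_gem", False)
        if total > budget and not is_recent and not is_gem:
            total -= token_count(message)  # silently drop the oldest message
            continue
        kept.append(message)
    return kept
```

If the protected messages alone exceed the budget, this sketch returns them anyway; a real implementation would need a policy for that case.
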
### 6. memory-qdrant Plugin

**Location:** `/root/.openclaw/extensions/memory-qdrant/`

**Config:**
```json
{
  "collectionName": "gems_tr",
  "captureCollection": "memories_tr",
  "autoRecall": true,
  "autoCapture": true
}
```

**Function:**
- **Recall:** Searches `gems_tr`, injects as context (hidden)
- **Capture:** Session-level capture to `memories_tr` (backup)

**Status:** Loaded, dual collection support working

---

## Files & Locations

### Core Project Files

```
~/.openclaw/workspace/.local_projects/true-recall-v2/
├── README.md                   # Architecture docs
├── session.md                  # This file
├── curator-prompt.md           # Gem extraction prompt
├── tr-daily/                   # Daily batch curation
│   └── curate_from_qdrant.py   # Daily curator (2:45 AM)
├── tr-continuous/              # Real-time curation (NEW)
│   ├── curator_by_count.py     # Turn-based curator
│   ├── curator_turn_based.py   # Alternative approach
│   ├── curator_cron.sh         # Cron wrapper
│   ├── turn-curator.service    # Systemd service
│   └── README.md               # Documentation
└── shared/
    └── (shared resources)
```

### New Files (2026-02-24 19:00)

| File | Purpose |
|------|---------|
| `tr-continuous/curator_timer.py` | Timer-based curator (deployed) |
| `tr-continuous/curator_config.json` | Curator settings |
| `tr-continuous/migrate_add_curated.py` | Migration script (completed) |

### Legacy Files (Pre-v2.2)

| File | Status | Note |
|------|--------|------|
| `tr-daily/curate_from_qdrant.py` | 📦 Archived | Replaced by timer |
| `tr-continuous/curator_by_count.py` | 📦 Archived | Replaced by timer |
| `tr-continuous/curator_turn_based.py` | 📦 Archived | Replaced by timer |

### System Locations

| File | Purpose |
|------|---------|
| `/root/.openclaw/extensions/memory-qdrant/` | Plugin code |
| `/root/.openclaw/openclaw.json` | Plugin configuration |
| `/etc/systemd/system/mem-qdrant-watcher.service` | Systemd service |

---

## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)

### Issues Found & Fixed Today

| Issue | Root Cause | Fix Applied |
|-------|------------|-------------|
| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) |
| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher |
| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items |

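
The plugin's `extractMessageText()` lives in TypeScript (`index.ts`) and is not reproduced in these notes. The Python sketch below only illustrates the same idea, in case the watcher needs the equivalent handling: accept either a plain string or an array of `{type, text}` items. The function name and the newline join are assumptions.

```python
def extract_message_text(content) -> str:
    """Return message text from either a plain string or a content array."""
    if isinstance(content, str):
        return content  # legacy format: content is already the text
    if isinstance(content, list):
        parts = []
        for item in content:
            # Only `type: "text"` items carry text; skip images, tool calls, etc.
            if isinstance(item, dict) and item.get("type") == "text":
                text = item.get("text")
                if isinstance(text, str) and text:
                    parts.append(text)
        return "\n".join(parts)
    return ""  # unknown shape: treat as empty rather than raising
```
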
### Validation Results (2026-02-25 12:41)

```
memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges
memory-qdrant: first msg role=user, contentType=array
```

**Before:** 0 exchanges extracted
**After:** 9 exchanges captured per session

### Components Status

| Component | Before | After | Status |
|-----------|--------|-------|--------|
| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed |
| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed |
| Context injection | Working | Still working | ✅ Verified |

### Files Modified (2026-02-25)

| File | Change |
|------|--------|
| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging |
| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes |
| `session.md` | This update |
| `function_check.md` | Added fixes section |

---

## Changes Made Today (2026-02-24 19:00)

### 1. Timer Curator Deployed (v2.2)

- Created `curator_timer.py`: simplified timer-based curation
- Created `curator_config.json`: adjustable settings
- Removed daily 2:45 AM cron job
- Added `*/30 * * * *` cron timer
- **Status:** ✅ Deployed, logs to `/var/log/true-recall-timer.log`

### 2. Migration Completed

- Created `migrate_add_curated.py`
- Tagged 12,378 existing memories with `curated: false`
- Updated watcher to add `curated: false` to new memories
- **Status:** ✅ Complete

### 3. Simplified Architecture

- ❌ Removed turn-based curator complexity
- ❌ Removed daily batch processing
- ✅ Single timer trigger every 30 minutes
- ✅ No minimum threshold (processes 0-N memories)

---

## Configuration

### memory-qdrant Plugin

**File:** `/root/.openclaw/openclaw.json`

```json
{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://<OLLAMA_IP>:11434",
      "qdrantUrl": "http://<QDRANT_IP>:6333"
    },
    "enabled": true
  }
}
```

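
To make the effect of `minRecallScore` and `maxRecallResults` concrete, here is a small sketch of how recall hits might be filtered. It is an assumption about the semantics, not the plugin's code: the `select_gems` name, the `score` key, and the inclusive `>=` comparison are all hypothetical.

```python
def select_gems(hits, min_score=0.7, max_results=2):
    """Keep hits scoring at least `min_score`, best first, capped at `max_results`."""
    eligible = [hit for hit in hits if hit["score"] >= min_score]
    eligible.sort(key=lambda hit: hit["score"], reverse=True)
    return eligible[:max_results]
```

Under this reading, a hit scoring 0.587 (the validation query result recorded in these notes) would not be injected at `minRecallScore: 0.7`, so it is worth confirming which threshold was active during that test.
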
### Gateway (OpenClaw Update Fix)

```json
{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}
```

---

## Validation Commands

### Check Collections

```bash
# Points count
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'

# Sample points (scroll returns points in ID order, not by recency)
# Memories store their body in `text`; older points may use `content`
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload | (.text // .content)'
```

### Check Services

```bash
# Watcher status
sudo systemctl status mem-qdrant-watcher

# Watcher logs
sudo journalctl -u mem-qdrant-watcher -n 20

# OpenClaw status
openclaw status
```

---

## Troubleshooting

### Issue: Watcher Not Capturing

**Check:**
1. Service running? `systemctl status mem-qdrant-watcher`
2. Logs: `journalctl -u mem-qdrant-watcher -f`
3. Qdrant accessible? `curl http://<QDRANT_IP>:6333/`
4. Ollama accessible? `curl http://<OLLAMA_IP>:11434/api/tags`

### Issue: Cleaner Fails

**Common causes:**
- Qdrant connection timeout (add `time.sleep(0.1)` between batches)
- Nested content dicts (handled in updated script)
- Type errors (non-string content, also handled)

### Issue: Plugin Not Loading

**Check:**
1. `openclaw.json` syntax valid? `openclaw config validate`
2. Plugin compiled? `cd /root/.openclaw/extensions/memory-qdrant && npx tsc`
3. Gateway logs: `tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log`

---

## Cron Schedule (Updated v2.2)

| Time | Job | Script | Status |
|------|-----|--------|--------|
| Every 30 min | Timer curator | `tr-continuous/curator_timer.py` | ✅ Active |
| Per turn | Capture | `mem-qdrant-watcher` | ✅ Daemon |
| Per turn | Injection | `memory-qdrant` plugin | ✅ Active |

**Removed:**
- ❌ 2:45 AM daily curator
- ❌ Every-minute turn curator check

---

## Next Steps

### Immediate
- ⏳ Monitor first timer run (logs: `/var/log/true-recall-timer.log`)
- ⏳ Validate gem extraction quality from timer curator
- ⏳ Archive old curator scripts if timer works

### Completed ✅
- ✅ **Compactor config:** minimal overhead (`mode: default`, `reserveTokensFloor: 0`, `memoryFlush: false`)

### Future
- ⏳ Curator tuning based on timer results
- ⏳ Silent compacting (requires OpenClaw core changes)

### Planned Features (Backlog)
- ⏳ **Interactive install script:** prompts for embedding model, timer interval, batch size, endpoints
- ⏳ **Single embedding model option:** use one model for both collections
- ⏳ **Configurable thresholds:** per-user customization via prompts

**Compactor Settings (Applied):**
```json5
{
  agents: {
    defaults: {
      compaction: {
        mode: "default",
        reserveTokensFloor: 0,
        memoryFlush: { enabled: false }
      }
    }
  }
}
```

**Note:** Only `mode`, `reserveTokensFloor`, and `memoryFlush` are valid under `agents.defaults.compaction`. Other settings are Pi runtime parameters.

**Install script prompts:**
1. Embedding model (snowflake vs mxbai)
2. Timer interval (5 min / 30 min / hourly)
3. Batch size (50 / 100 / 500)
4. Qdrant/Ollama URLs
5. User ID

---

## Session Recovery

If starting fresh:
1. Read `README.md` for architecture overview
2. Check service status: `sudo systemctl status mem-qdrant-watcher`
3. Check timer curator: `tail /var/log/true-recall-timer.log`
4. Verify collections: `curl http://<QDRANT_IP>:6333/collections`

---

*Last Verified: 2026-02-24 19:29 CST*
*Version: v2.2 (30b curator, install script planned)*