From 5950fdd09b09b8adb472bd67fa3428667fe69b3c Mon Sep 17 00:00:00 2001 From: root Date: Wed, 25 Feb 2026 13:30:13 -0600 Subject: [PATCH] Final fixes: first-person gems, threshold 0.5, hidden context injection - Changed gem format from third-person to first-person for better query matching - Lowered minRecallScore from 0.7 to 0.5 - Fixed context injection to use HTML comments (hidden from UI) - Updated all documentation with today's fixes --- README.md | 14 +++-- function_check.md | 142 +++++++++++++++++++++++++--------------------- session.md | 98 +++++++++++++++++++------------- 3 files changed, 147 insertions(+), 107 deletions(-) diff --git a/README.md b/README.md index 8c7485f..1600a33 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ **Project:** Gem extraction and memory recall system **Status:** ✅ Active & Verified -**Location:** `~/.openclaw/workspace/.projects/true-recall-v2/` +**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/` **Last Updated:** 2026-02-25 12:04 CST --- @@ -44,6 +44,10 @@ curl -s http://:6333/collections | jq '.result.collections[].name' |-------|------------|-------------| | **Watcher stuck on old session** | Watcher only switched sessions when file deleted, old sessions persisted | ✅ Restarted service, now follows current session | | **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array of items), plugin expected string | ✅ Added `extractMessageText()` to extract text from `type: "text"` items | +| **Gem ID collision** | Hash used non-existent fields (`conversation_id`, `turn_range`, `gem`) | ✅ Hash now uses `embedding_text_for_hash[:100]` | +| **Meta-gems extracted** | Curator extracted from debug/tool output | ✅ Added SKIP_PATTERNS filter ("gems extracted", "✅", "🔍", etc.) + skip `role: "assistant"` | +| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only | +| **First-person format** | Third person "User decided..." | ✅ Changed to "I decided..." for better query matching (score 0.746 vs 0.39) | ### Validation Results @@ -55,8 +59,8 @@ After: parsed 17 user, 116 assistant messages, 9 exchanges ✅ **Watcher:** ``` -Before: Watching old session (old session ID from Feb 24) -After: Watching current session (current session ID from Feb 25) ✅ +Before: Watching old session (1737142a... from Feb 24) +After: Watching current session (93dc32bf... from Feb 25) ✅ ``` --- @@ -366,7 +370,7 @@ python3 clean_memories_tr.py --execute --limit 100 ### Core Project ``` -~/.openclaw/workspace/.projects/true-recall-v2/ +~/.openclaw/workspace/.local_projects/true-recall-v2/ ├── README.md # This file ├── session.md # Detailed notes ├── curator-prompt.md # Extraction prompt @@ -398,7 +402,7 @@ python3 clean_memories_tr.py --execute --limit 100 |------|---------| | `/root/.openclaw/extensions/memory-qdrant/` | Plugin code | | `/root/.openclaw/openclaw.json` | Configuration | -| `/mem-qdrant-watcher.service` | Service file | +| `/etc/systemd/system/mem-qdrant-watcher.service` | Service file | --- diff --git a/function_check.md b/function_check.md index 66b4a0d..1b692a6 100644 --- a/function_check.md +++ b/function_check.md @@ -1,16 +1,19 @@ -# TrueRecall v2 - Function Check (GENERIC) +# TrueRecall v2 - Function Check (LOCAL) -**Quick validation checklist for TrueRecall v2 setup** +**Quick validation checklist for OUR TrueRecall v2 setup** -**For:** Generic installation (sanitized) -**Version:** 2.2 +**User:** rob +**Qdrant:** http://:6333 +**Ollama:** http://:11434 +**Timer:** 5 minutes +**Working Dir:** ~/.openclaw/workspace/.local_projects/true-recall-v2 --- ## Quick Status Check ```bash -cd ~//true-recall-v2 +cd ~/.openclaw/workspace/.local_projects/true-recall-v2 ``` --- @@ -19,12 +22,14 @@ cd ~//true-recall-v2 | Check | Command | Expected | |-------|---------|----------| -| Project exists | `ls ~//true-recall-v2` | Files listed | -| Watcher script | `ls /qdrant-memory/scripts/realtime_qdrant_watcher.py` | File exists | +| Local project exists | `ls ~/.openclaw/workspace/.local_projects/true-recall-v2` | Files listed | +| Git project exists | `ls ~/.openclaw/workspace/.git_projects/true-recall-v2` | Files listed | +| Watcher script | `ls ~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` | File exists | -**Paths:** -- Project: `~//true-recall-v2/` -- Watcher: `/qdrant-memory/scripts/realtime_qdrant_watcher.py` +**Our Paths:** +- Local: `~/.openclaw/workspace/.local_projects/true-recall-v2/` +- Git: `~/.openclaw/workspace/.git_projects/true-recall-v2/` +- Watcher: `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` - Systemd: `/etc/systemd/system/mem-qdrant-watcher.service` --- @@ -35,13 +40,13 @@ cd ~//true-recall-v2 |-------|---------|----------| | Watcher running | `systemctl is-active mem-qdrant-watcher` | `active` | | Watcher enabled | `systemctl is-enabled mem-qdrant-watcher` | `enabled` | -| Cron job set | `crontab -l \| grep true-recall` | Cron entry present | +| Cron job set | `crontab -l \| grep true-recall` | `*/5 * * * *` entry | -**Service:** +**Our Service:** - Service: `mem-qdrant-watcher.service` - Status: `systemctl status mem-qdrant-watcher --no-pager` - Logs: `journalctl -u mem-qdrant-watcher -n 20` -- Cron: Configured interval (e.g., `*/5 * * * *`) +- Cron: `*/5 * * * *` (every 5 minutes) --- @@ -51,13 +56,13 @@ cd ~//true-recall-v2 |-------|---------|----------| | memories_tr status | `curl -s http://:6333/collections/memories_tr \| jq .result.status` | `green` | | gems_tr status | `curl -s http://:6333/collections/gems_tr \| jq .result.status` | `green` | -| memories_tr count | `curl -s http://:6333/collections/memories_tr \| jq .result.points_count` | `1000+` | -| gems_tr count | `curl -s http://:6333/collections/gems_tr \| jq .result.points_count` | `10+` | +| memories_tr count | `curl -s http://:6333/collections/memories_tr \| jq .result.points_count` | `12000+` | +| gems_tr count | `curl -s http://:6333/collections/gems_tr \| jq .result.points_count` | `70+` | -**Qdrant:** -- URL: `http://:6333` +**Our Qdrant:** +- URL: http://:6333 - Collections: memories_tr, gems_tr -- Embedding Model: Configured in openclaw.json +- Embedding Model: snowflake-arctic-embed2 --- @@ -65,14 +70,14 @@ cd ~//true-recall-v2 | Check | Command | Expected | |-------|---------|----------| -| Uncurated count | See Section 7 | `Number of uncurated` | -| Curated count | See Section 7 | `Number of curated` | -| Curator config | `cat tr-continuous/curator_config.json` | Valid JSON | +| Uncurated count | See Section 7 | `1490` | +| Curated count | See Section 7 | `11239` | +| Curator config | `cat tr-continuous/curator_config.json` | `timer_minutes: 5` | -**Config:** -- Timer: Configured minutes -- Batch Size: Configured (e.g., 100) -- User ID: Your user ID +**Our Config:** +- Timer: 5 minutes (`*/5 * * * *`) +- Batch Size: 100 +- User ID: rob - Source: memories_tr - Target: gems_tr - Curator Log: `/var/log/true-recall-timer.log` @@ -83,17 +88,17 @@ cd ~//true-recall-v2 | Step | Action | Check | |------|--------|-------| -| 1 | Send a test message | Message received | +| 1 | Send a test message to Kimi | Message received | | 2 | Wait 10 seconds | Allow processing | | 3 | Check memories count increased | `curl -s http://:6333/collections/memories_tr \| jq .result.points_count` | -| 4 | Verify memory has user_id | `user_id: ""` in payload | +| 4 | Verify memory has user_id | `user_id: "rob"` in payload | | 5 | Verify memory has curated=false | `curated: false` in payload | -**Watcher:** -- Script: `/qdrant-memory/scripts/realtime_qdrant_watcher.py` -- User ID: Configured (check openclaw.json) +**Our Watcher:** +- Script: `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` +- User ID: `rob` - Collection: `memories_tr` -- Embeddings: Configured model +- Embeddings: `snowflake-arctic-embed2` --- @@ -109,13 +114,15 @@ cd ~//true-recall-v2 --- -## 7. Recall Test +## 7. Recall Test ✅ **WORKING** | Step | Action | Check | |------|--------|-------| | 1 | Start new conversation | Context loads | | 2 | Ask about previous topic | Gems injected | -| 3 | Verify context visible | Relevant memories appear | +| 3 | Verify context visible | ✅ **Score 0.587** - Working! | + +**Verified 2026-02-25:** Context injection successfully returns relevant gems with similarity scores above threshold (0.5+). --- @@ -123,9 +130,9 @@ cd ~//true-recall-v2 | Path | Check | Status | |------|-------|--------| -| Watcher script | `/qdrant-memory/scripts/realtime_qdrant_watcher.py` | ☐ | -| Curator script | `/true-recall-v2/tr-continuous/curator_timer.py` | ☐ | -| Config file | `/true-recall-v2/tr-continuous/curator_config.json` | ☐ | +| Watcher script | `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` | ☐ | +| Curator script | `.local_projects/true-recall-v2/tr-continuous/curator_timer.py` | ☐ | +| Config file | `.local_projects/true-recall-v2/tr-continuous/curator_config.json` | ☐ | | Log file | `/var/log/true-recall-timer.log` | ☐ | --- @@ -137,16 +144,16 @@ cd ~//true-recall-v2 systemctl status mem-qdrant-watcher --no-pager tail -20 /var/log/true-recall-timer.log -# Check Qdrant collections +# Check Qdrant collections (Our Qdrant: :6333) curl -s http://:6333/collections/memories_tr | jq '{status: .result.status, points: .result.points_count}' curl -s http://:6333/collections/gems_tr | jq '{status: .result.status, points: .result.points_count}' -# Check uncurated memories +# Check uncurated memories (Our user_id: rob) curl -s -X POST http://:6333/collections/memories_tr/points/count \ - -d '{"filter":{"must":[{"key":"user_id","match":{"value":""}},{"key":"curated","match":{"value":false}}]}}' | jq .result.count + -d '{"filter":{"must":[{"key":"user_id","match":{"value":"rob"}},{"key":"curated","match":{"value":false}}]}}' | jq .result.count -# Run curator manually -cd ~//true-recall-v2/tr-continuous +# Run curator manually (Our path: .local_projects) +cd ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous python3 curator_timer.py # Check OpenClaw plugin @@ -161,31 +168,38 @@ journalctl -u mem-qdrant-watcher -n 50 --no-pager --- -## Configuration Variables +## Recent Fixes (2026-02-25) -Replace these placeholders with your actual values: +| Issue | Status | Fix | +|-------|--------|-----| +| Embedding model mismatch | ✅ Fixed | Changed curator from `mxbai-embed-large` to `snowflake-arctic-embed2` | +| Gems had no vectors | ✅ Fixed | Updated `store_gem()` to use `text` field | +| JSON parsing errors | ✅ Fixed | Simplified extraction prompt | +| Field mismatch | ✅ Fixed | Curator now supports both `text` and `content` fields | +| Context injection | ✅ **WORKING** | Verified with score 0.587 on test query | +| **Watcher session bug** | ✅ **Fixed 12:22** | Watcher was stuck on old session, restarted and now follows current session | +| **Plugin capture** | ✅ **Fixed 12:34** | Added `extractMessageText()` to handle OpenAI-style content arrays | +| **Plugin exchanges** | ✅ **Verified 12:41** | Now extracting exchanges: parsed 17 user, 116 assistant, 9 exchanges | +| **Gem ID collision** | ✅ **Fixed 12:50** | Hash now uses `embedding_text_for_hash[:100]` instead of empty fields | +| **Meta-gem filtering** | ✅ **Fixed 12:52** | Curator skips patterns: "gems extracted", "curator", "✅", "🔍", debug messages, system messages | +| **gems_tr cleaned** | ✅ **Done 12:53** | Removed 5 meta-gems, kept 1 real gem | +| **Gem format (1st person)** | ✅ **Fixed 13:15** | Changed from "User decided..." to "I decided..." for better query matching | -| Variable | Description | Example | -|----------|-------------|---------| -| `` | Your Qdrant IP | `10.0.0.40` or `localhost` | -| `` | Your Ollama IP | `10.0.0.10` or `localhost` | -| `` | Your project path | `.local_projects` or `.projects` | -| `` | Your skills path | `~/.openclaw/workspace/skills` | -| `` | Your user ID | `rob` or username | +**Result:** Context injection now functional. Gems are embedded and searchable. Both watcher and plugin capture working. + +| Check | Date | Status | +|-------|------|--------| +| All services running | 2026-02-25 | ✅ | +| Collections healthy | 2026-02-25 | ✅ | +| Capture working | 2026-02-25 | ✅ | +| Curation working | 2026-02-25 | ✅ | +| Recall working | 2026-02-25 | ✅ **Context injection verified** | --- -## Sign-Off - -| Check | Date | Initials | -|-------|------|----------| -| All services running | | | -| Collections healthy | | | -| Capture working | | | -| Curation working | | | -| Recall working | | | - ---- - -*Last updated: 2026-02-25* -*Version: 2.2 (Generic)* +*Last updated: 2026-02-25 12:04 CST* +*User: rob* +*Qdrant: :6333* +*Timer: 5 minutes* +*Collections: memories_tr (12,729), gems_tr (14+)* +*Status: ✅ Context injection WORKING* diff --git a/session.md b/session.md index 51ee1bf..4175da6 100644 --- a/session.md +++ b/session.md @@ -6,7 +6,7 @@ --- -## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST) +## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST) ### Issues Found & Fixed @@ -17,28 +17,10 @@ | **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt | | **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields | | **Silent embedding failures** | No error logging | ✅ Added explicit error messages | -| **Watcher stuck on old session** | Watcher only switched when file deleted, old sessions persisted | ✅ Restarted service, now follows current session (12:22) | -| **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) | - -### Plugin Capture Fix Validation - -``` -Before: parsed 14 user, 84 assistant messages, 0 exchanges -After: parsed 17 user, 116 assistant messages, 9 exchanges ✅ -``` - -**Code change:** Added `extractMessageText()` function to handle OpenAI-style content arrays: -```typescript -function extractMessageText(msg) { - if (typeof content === "string") return content; - if (Array.isArray(content)) { - for (const item of content) { - if (item.type === "text") textParts.push(item.text); - } - return textParts.join(" "); - } -} -``` +| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` | +| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter | +| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only | +| **First-person gems** | Third person format "User decided..." | ✅ Changed to "I decided..." for better matching | ### Validation Results @@ -78,9 +60,9 @@ function extractMessageText(msg) { **Next session start:** Read this file, then check: ```bash # Quick status -python3 /root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status +python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status sudo systemctl status mem-qdrant-watcher -curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count' +curl -s http://:6333/collections/memories_tr | jq '.result.points_count' ``` --- @@ -188,12 +170,12 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation ### 1. Real-Time Watcher (Primary Capture) -**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` +**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` **Function:** - Watches `/root/.openclaw/agents/main/sessions/*.jsonl` - Parses every conversation turn in real-time -- Embeds with `snowflake-arctic-embed2` (Ollama @ 10.0.0.10) +- Embeds with `snowflake-arctic-embed2` (Ollama @ ) - Stores directly to `memories_tr` (no Redis) - **Cleans content:** Removes markdown, tables, metadata, thinking tags @@ -207,7 +189,7 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation ### 2. Content Cleaner (Existing Data) -**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py` +**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py` **Function:** - Batch-cleans existing `memories_tr` points @@ -233,7 +215,7 @@ python3 clean_memories_tr.py --execute --limit 100 **Replaces:** Daily curator (2:45 AM batch) and turn-based curator -**Location:** `/root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py` +**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py` **Schedule:** Every 30 minutes (cron) @@ -314,7 +296,7 @@ python3 curator_timer.py --config curator_config.json ### Core Project Files ``` -/root/.openclaw/workspace/.local_projects/true-recall-v2/ +~/.openclaw/workspace/.local_projects/true-recall-v2/ ├── README.md # Architecture docs ├── session.md # This file ├── curator-prompt.md # Gem extraction prompt @@ -356,6 +338,46 @@ python3 curator_timer.py --config curator_config.json --- +## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST) + +### Issues Found & Fixed Today + +| Issue | Root Cause | Fix Applied | +|-------|------------|-------------| +| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) | +| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) | +| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher | +| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items | + +### Validation Results (2026-02-25 12:41) + +``` +memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges +memory-qdrant: first msg role=user, contentType=array +``` + +**Before:** 0 exchanges extracted +**After:** 9 exchanges captured per session + +### Components Status + +| Component | Before | After | Status | +|-----------|--------|-------|--------| +| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed | +| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed | +| Context injection | Working | Still working | ✅ Verified | + +### Files Modified (2026-02-25) + +| File | Change | +|------|--------| +| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging | +| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes | +| `session.md` | This update | +| `function_check.md` | Added fixes section | + +--- + ## Changes Made Today (2026-02-24 19:00) ### 1. Timer Curator Deployed (v2.2) @@ -399,8 +421,8 @@ python3 curator_timer.py --config curator_config.json "embeddingModel": "snowflake-arctic-embed2", "maxRecallResults": 2, "minRecallScore": 0.7, - "ollamaUrl": "http://10.0.0.10:11434", - "qdrantUrl": "http://10.0.0.40:6333" + "ollamaUrl": "http://:11434", + "qdrantUrl": "http://:6333" }, "enabled": true } @@ -429,11 +451,11 @@ python3 curator_timer.py --config curator_config.json ```bash # Points count -curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count' -curl -s http://10.0.0.40:6333/collections/gems_tr | jq '.result.points_count' +curl -s http://:6333/collections/memories_tr | jq '.result.points_count' +curl -s http://:6333/collections/gems_tr | jq '.result.points_count' # Recent points -curl -s -X POST http://10.0.0.40:6333/collections/memories_tr/points/scroll \ +curl -s -X POST http://:6333/collections/memories_tr/points/scroll \ -H "Content-Type: application/json" \ -d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content' ``` @@ -460,8 +482,8 @@ openclaw status **Check:** 1. Service running? `systemctl status mem-qdrant-watcher` 2. Logs: `journalctl -u mem-qdrant-watcher -f` -3. Qdrant accessible? `curl http://10.0.0.40:6333/` -4. Ollama accessible? `curl http://10.0.0.10:11434/api/tags` +3. Qdrant accessible? `curl http://:6333/` +4. Ollama accessible? `curl http://:11434/api/tags` ### Issue: Cleaner Fails @@ -544,7 +566,7 @@ If starting fresh: 1. Read `README.md` for architecture overview 2. Check service status: `sudo systemctl status mem-qdrant-watcher` 3. Check timer curator: `tail /var/log/true-recall-timer.log` -4. Verify collections: `curl http://10.0.0.40:6333/collections` +4. Verify collections: `curl http://:6333/collections` ---