Final fixes: first-person gems, threshold 0.5, hidden context injection

- Changed gem format from third-person to first-person for better query matching
- Lowered minRecallScore from 0.7 to 0.5
- Fixed context injection to use HTML comments (hidden from UI)
- Updated all documentation with today's fixes
2026-02-25 13:30:13 -06:00
parent 87a390901d
commit 5950fdd09b
3 changed files with 147 additions and 107 deletions
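
The commit message describes two recall-side behaviours: a lower score threshold (0.5) and context injected as an HTML comment so it stays hidden from the UI. The plugin code that does this is not part of the diff shown here, so the following is only a minimal sketch of how the two could fit together. `RecalledGem`, `buildHiddenContext`, and the `recalled-memory` label are hypothetical names chosen for illustration. The defaults mirror `minRecallScore` from the commit message and `maxRecallResults` from the config excerpt.

```typescript
// Hypothetical shape of a recalled gem; the real plugin's types are not shown in this commit.
interface RecalledGem {
  text: string;  // first-person gem text, e.g. "I decided ..."
  score: number; // similarity score returned by the vector search
}

// Assumed defaults: 0.5 from the commit message, 2 from the config excerpt.
const MIN_RECALL_SCORE = 0.5;
const MAX_RECALL_RESULTS = 2;

// Keep gems at or above the threshold, best first, and wrap them in an
// HTML comment so a UI that renders Markdown/HTML does not display them.
function buildHiddenContext(
  gems: RecalledGem[],
  minScore: number = MIN_RECALL_SCORE,
  maxResults: number = MAX_RECALL_RESULTS,
): string {
  const kept = gems
    .filter((g) => g.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);

  if (kept.length === 0) return "";

  // A literal "-->" inside the text would close the comment early, so break it up.
  const body = kept
    .map((g) => `- ${g.text.split("-->").join("-- >")}`)
    .join("\n");

  return `<!-- recalled-memory\n${body}\n-->`;
}
```

With the earlier threshold of 0.7, a gem scoring 0.62 would be dropped. At 0.5 it is kept, as long as it ranks within the top `maxResults`.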
--- a/session.md
+++ b/session.md
@@ -6,7 +6,7 @@

 ---

-## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
+## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST)

 ### Issues Found & Fixed

@@ -17,28 +17,10 @@
 | **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt |
 | **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields |
 | **Silent embedding failures** | No error logging | ✅ Added explicit error messages |
-| **Watcher stuck on old session** | Watcher only switched when file deleted, old sessions persisted | ✅ Restarted service, now follows current session (12:22) |
-| **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
-
-### Plugin Capture Fix Validation
-
-```
-Before: parsed 14 user, 84 assistant messages, 0 exchanges
-After:  parsed 17 user, 116 assistant messages, 9 exchanges ✅
-```
-
-**Code change:** Added `extractMessageText()` function to handle OpenAI-style content arrays:
-```typescript
-function extractMessageText(msg) {
-  if (typeof content === "string") return content;
-  if (Array.isArray(content)) {
-    for (const item of content) {
-      if (item.type === "text") textParts.push(item.text);
-    }
-    return textParts.join(" ");
-  }
-}
-```
+| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` |
+| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter |
+| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
+| **First-person gems** | Third person format "User decided..." | ✅ Changed to "I decided..." for better matching |

 ### Validation Results

@@ -78,9 +60,9 @@ function extractMessageText(msg) {
 **Next session start:** Read this file, then check:
 ```bash
 # Quick status
-python3 /root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
+python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
 sudo systemctl status mem-qdrant-watcher
-curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
+curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
 ```

 ---
@@ -188,12 +170,12 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation

 ### 1. Real-Time Watcher (Primary Capture)

-**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
+**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`

 **Function:**
 - Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
 - Parses every conversation turn in real-time
- Embeds with `snowflake-arctic-embed2` (Ollama @ 10.0.0.10)
+- Embeds with `snowflake-arctic-embed2` (Ollama @ <OLLAMA_IP>)
 - Stores directly to `memories_tr` (no Redis)
 - **Cleans content:** Removes markdown, tables, metadata, thinking tags

@@ -207,7 +189,7 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation

 ### 2. Content Cleaner (Existing Data)

-**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
+**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`

 **Function:**
 - Batch-cleans existing `memories_tr` points
@@ -233,7 +215,7 @@ python3 clean_memories_tr.py --execute --limit 100

 **Replaces:** Daily curator (2:45 AM batch) and turn-based curator

-**Location:** `/root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
+**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`

 **Schedule:** Every 30 minutes (cron)

@@ -314,7 +296,7 @@ python3 curator_timer.py --config curator_config.json
 ### Core Project Files

 ```
-/root/.openclaw/workspace/.local_projects/true-recall-v2/
+~/.openclaw/workspace/.local_projects/true-recall-v2/
 ├── README.md                           # Architecture docs
 ├── session.md                          # This file
 ├── curator-prompt.md                   # Gem extraction prompt
@@ -356,6 +338,46 @@ python3 curator_timer.py --config curator_config.json

 ---

+## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
+
+### Issues Found & Fixed Today
+
+| Issue | Root Cause | Fix Applied |
+|-------|------------|-------------|
+| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) |
+| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
+| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher |
+| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items |
+
+### Validation Results (2026-02-25 12:41)
+
+```
+memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges
+memory-qdrant: first msg role=user, contentType=array
+```
+
+**Before:** 0 exchanges extracted  
+**After:** 9 exchanges captured per session
+
+### Components Status
+
+| Component | Before | After | Status |
+|-----------|--------|-------|--------|
+| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed |
+| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed |
+| Context injection | Working | Still working | ✅ Verified |
+
+### Files Modified (2026-02-25)
+
+| File | Change |
+|------|--------|
+| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging |
+| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes |
+| `session.md` | This update |
+| `function_check.md` | Added fixes section |
+
+---
+
 ## Changes Made Today (2026-02-24 19:00)

 ### 1. Timer Curator Deployed (v2.2)
@@ -399,8 +421,8 @@ python3 curator_timer.py --config curator_config.json
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
-      "ollamaUrl": "http://10.0.0.10:11434",
-      "qdrantUrl": "http://10.0.0.40:6333"
+      "ollamaUrl": "http://<OLLAMA_IP>:11434",
+      "qdrantUrl": "http://<QDRANT_IP>:6333"
    },
    "enabled": true
  }
@@ -429,11 +451,11 @@ python3 curator_timer.py --config curator_config.json

 ```bash
 # Points count
-curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
-curl -s http://10.0.0.40:6333/collections/gems_tr | jq '.result.points_count'
+curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
+curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'

 # Recent points
-curl -s -X POST http://10.0.0.40:6333/collections/memories_tr/points/scroll \
+curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content'
 ```
@@ -460,8 +482,8 @@ openclaw status
 **Check:**
 1. Service running? `systemctl status mem-qdrant-watcher`
 2. Logs: `journalctl -u mem-qdrant-watcher -f`
-3. Qdrant accessible? `curl http://10.0.0.40:6333/`
-4. Ollama accessible? `curl http://10.0.0.10:11434/api/tags`
+3. Qdrant accessible? `curl http://<QDRANT_IP>:6333/`
+4. Ollama accessible? `curl http://<OLLAMA_IP>:11434/api/tags`

 ### Issue: Cleaner Fails

@@ -544,7 +566,7 @@ If starting fresh:
 1. Read `README.md` for architecture overview
 2. Check service status: `sudo systemctl status mem-qdrant-watcher`
 3. Check timer curator: `tail /var/log/true-recall-timer.log`
-4. Verify collections: `curl http://10.0.0.40:6333/collections`
+4. Verify collections: `curl http://<QDRANT_IP>:6333/collections`

 ---