Final fixes: first-person gems, threshold 0.5, hidden context injection

- Changed gem format from third-person to first-person for better query matching
- Lowered minRecallScore from 0.7 to 0.5
- Fixed context injection to use HTML comments (hidden from UI)
- Updated all documentation with today's fixes
This commit is contained in:
root
2026-02-25 13:30:13 -06:00
parent 87a390901d
commit 5950fdd09b
3 changed files with 147 additions and 107 deletions

View File

@@ -6,7 +6,7 @@
---
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST)
### Issues Found & Fixed
@@ -17,28 +17,10 @@
| **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt |
| **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields |
| **Silent embedding failures** | No error logging | ✅ Added explicit error messages |
| **Watcher stuck on old session** | Watcher only switched when file deleted, old sessions persisted | ✅ Restarted service, now follows current session (12:22) |
| **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
### Plugin Capture Fix Validation
```
Before: parsed 14 user, 84 assistant messages, 0 exchanges
After: parsed 17 user, 116 assistant messages, 9 exchanges ✅
```
**Code change:** Added `extractMessageText()` function to handle OpenAI-style content arrays:
```typescript
function extractMessageText(msg) {
if (typeof content === "string") return content;
if (Array.isArray(content)) {
for (const item of content) {
if (item.type === "text") textParts.push(item.text);
}
return textParts.join(" ");
}
}
```
| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` |
| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter |
| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
| **First-person gems** | Third person format "User decided..." | ✅ Changed to "I decided..." for better matching |
### Validation Results
@@ -78,9 +60,9 @@ function extractMessageText(msg) {
**Next session start:** Read this file, then check:
```bash
# Quick status
python3 /root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
sudo systemctl status mem-qdrant-watcher
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
```
---
@@ -188,12 +170,12 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation
### 1. Real-Time Watcher (Primary Capture)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Function:**
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
- Parses every conversation turn in real-time
- Embeds with `snowflake-arctic-embed2` (Ollama @ 10.0.0.10)
- Embeds with `snowflake-arctic-embed2` (Ollama @ <OLLAMA_IP>)
- Stores directly to `memories_tr` (no Redis)
- **Cleans content:** Removes markdown, tables, metadata, thinking tags
@@ -207,7 +189,7 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation
### 2. Content Cleaner (Existing Data)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
**Function:**
- Batch-cleans existing `memories_tr` points
@@ -233,7 +215,7 @@ python3 clean_memories_tr.py --execute --limit 100
**Replaces:** Daily curator (2:45 AM batch) and turn-based curator
**Location:** `/root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
**Schedule:** Every 30 minutes (cron)
@@ -314,7 +296,7 @@ python3 curator_timer.py --config curator_config.json
### Core Project Files
```
/root/.openclaw/workspace/.local_projects/true-recall-v2/
~/.openclaw/workspace/.local_projects/true-recall-v2/
├── README.md # Architecture docs
├── session.md # This file
├── curator-prompt.md # Gem extraction prompt
@@ -356,6 +338,46 @@ python3 curator_timer.py --config curator_config.json
---
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
### Issues Found & Fixed Today
| Issue | Root Cause | Fix Applied |
|-------|------------|-------------|
| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) |
| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher |
| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items |
### Validation Results (2026-02-25 12:41)
```
memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges
memory-qdrant: first msg role=user, contentType=array
```
**Before:** 0 exchanges extracted
**After:** 9 exchanges captured per session
### Components Status
| Component | Before | After | Status |
|-----------|--------|-------|--------|
| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed |
| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed |
| Context injection | Working | Still working | ✅ Verified |
### Files Modified (2026-02-25)
| File | Change |
|------|--------|
| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging |
| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes |
| `session.md` | This update |
| `function_check.md` | Added fixes section |
---
## Changes Made Today (2026-02-24 19:00)
### 1. Timer Curator Deployed (v2.2)
@@ -399,8 +421,8 @@ python3 curator_timer.py --config curator_config.json
"embeddingModel": "snowflake-arctic-embed2",
"maxRecallResults": 2,
"minRecallScore": 0.7,
"ollamaUrl": "http://10.0.0.10:11434",
"qdrantUrl": "http://10.0.0.40:6333"
"ollamaUrl": "http://<OLLAMA_IP>:11434",
"qdrantUrl": "http://<QDRANT_IP>:6333"
},
"enabled": true
}
@@ -429,11 +451,11 @@ python3 curator_timer.py --config curator_config.json
```bash
# Points count
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://10.0.0.40:6333/collections/gems_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'
# Recent points
curl -s -X POST http://10.0.0.40:6333/collections/memories_tr/points/scroll \
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
-H "Content-Type: application/json" \
-d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content'
```
@@ -460,8 +482,8 @@ openclaw status
**Check:**
1. Service running? `systemctl status mem-qdrant-watcher`
2. Logs: `journalctl -u mem-qdrant-watcher -f`
3. Qdrant accessible? `curl http://10.0.0.40:6333/`
4. Ollama accessible? `curl http://10.0.0.10:11434/api/tags`
3. Qdrant accessible? `curl http://<QDRANT_IP>:6333/`
4. Ollama accessible? `curl http://<OLLAMA_IP>:11434/api/tags`
### Issue: Cleaner Fails
@@ -544,7 +566,7 @@ If starting fresh:
1. Read `README.md` for architecture overview
2. Check service status: `sudo systemctl status mem-qdrant-watcher`
3. Check timer curator: `tail /var/log/true-recall-timer.log`
4. Verify collections: `curl http://10.0.0.40:6333/collections`
4. Verify collections: `curl http://<QDRANT_IP>:6333/collections`
---