Final fixes: first-person gems, threshold 0.5, hidden context injection

- Changed gem format from third-person to first-person for better query matching
- Lowered minRecallScore from 0.7 to 0.5
- Fixed context injection to use HTML comments (hidden from UI)
- Updated all documentation with today's fixes
This commit is contained in:
root
2026-02-25 13:30:13 -06:00
parent 87a390901d
commit 5950fdd09b
3 changed files with 147 additions and 107 deletions

View File

@@ -2,7 +2,7 @@
**Project:** Gem extraction and memory recall system
**Status:** ✅ Active & Verified
**Location:** `~/.openclaw/workspace/.projects/true-recall-v2/`
**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/`
**Last Updated:** 2026-02-25 12:04 CST
---
@@ -44,6 +44,10 @@ curl -s http://<QDRANT_IP>:6333/collections | jq '.result.collections[].name'
|-------|------------|-------------|
| **Watcher stuck on old session** | Watcher only switched sessions when file deleted, old sessions persisted | ✅ Restarted service, now follows current session |
| **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array of items), plugin expected string | ✅ Added `extractMessageText()` to extract text from `type: "text"` items |
| **Gem ID collision** | Hash used non-existent fields (`conversation_id`, `turn_range`, `gem`) | ✅ Hash now uses `embedding_text_for_hash[:100]` |
| **Meta-gems extracted** | Curator extracted from debug/tool output | ✅ Added SKIP_PATTERNS filter ("gems extracted", "✅", "🔍", etc.) + skip `role: "assistant"` |
| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
| **First-person format** | Third person "User decided..." | ✅ Changed to "I decided..." for better query matching (score 0.746 vs 0.39) |
### Validation Results
@@ -55,8 +59,8 @@ After: parsed 17 user, 116 assistant messages, 9 exchanges ✅
**Watcher:**
```
Before: Watching old session (old session ID from Feb 24)
After: Watching current session (current session ID from Feb 25) ✅
Before: Watching old session (1737142a... from Feb 24)
After: Watching current session (93dc32bf... from Feb 25) ✅
```
---
@@ -366,7 +370,7 @@ python3 clean_memories_tr.py --execute --limit 100
### Core Project
```
~/.openclaw/workspace/.projects/true-recall-v2/
~/.openclaw/workspace/.local_projects/true-recall-v2/
├── README.md # This file
├── session.md # Detailed notes
├── curator-prompt.md # Extraction prompt
@@ -398,7 +402,7 @@ python3 clean_memories_tr.py --execute --limit 100
|------|---------|
| `/root/.openclaw/extensions/memory-qdrant/` | Plugin code |
| `/root/.openclaw/openclaw.json` | Configuration |
| `<SYSTEMD_PATH>/mem-qdrant-watcher.service` | Service file |
| `/etc/systemd/system/mem-qdrant-watcher.service` | Service file |
---

View File

@@ -1,16 +1,19 @@
# TrueRecall v2 - Function Check (GENERIC)
# TrueRecall v2 - Function Check (LOCAL)
**Quick validation checklist for TrueRecall v2 setup**
**Quick validation checklist for OUR TrueRecall v2 setup**
**For:** Generic installation (sanitized)
**Version:** 2.2
**User:** rob
**Qdrant:** http://<QDRANT_IP>:6333
**Ollama:** http://<OLLAMA_IP>:11434
**Timer:** 5 minutes
**Working Dir:** ~/.openclaw/workspace/.local_projects/true-recall-v2
---
## Quick Status Check
```bash
cd ~/<PROJECT_PATH>/true-recall-v2
cd ~/.openclaw/workspace/.local_projects/true-recall-v2
```
---
@@ -19,12 +22,14 @@ cd ~/<PROJECT_PATH>/true-recall-v2
| Check | Command | Expected |
|-------|---------|----------|
| Project exists | `ls ~/<PROJECT_PATH>/true-recall-v2` | Files listed |
| Watcher script | `ls <SKILL_PATH>/qdrant-memory/scripts/realtime_qdrant_watcher.py` | File exists |
| Local project exists | `ls ~/.openclaw/workspace/.local_projects/true-recall-v2` | Files listed |
| Git project exists | `ls ~/.openclaw/workspace/.git_projects/true-recall-v2` | Files listed |
| Watcher script | `ls ~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` | File exists |
**Paths:**
- Project: `~/<PROJECT_PATH>/true-recall-v2/`
- Watcher: `<SKILL_PATH>/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Our Paths:**
- Local: `~/.openclaw/workspace/.local_projects/true-recall-v2/`
- Git: `~/.openclaw/workspace/.git_projects/true-recall-v2/`
- Watcher: `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
- Systemd: `/etc/systemd/system/mem-qdrant-watcher.service`
---
@@ -35,13 +40,13 @@ cd ~/<PROJECT_PATH>/true-recall-v2
|-------|---------|----------|
| Watcher running | `systemctl is-active mem-qdrant-watcher` | `active` |
| Watcher enabled | `systemctl is-enabled mem-qdrant-watcher` | `enabled` |
| Cron job set | `crontab -l \| grep true-recall` | Cron entry present |
| Cron job set | `crontab -l \| grep true-recall` | `*/5 * * * *` entry |
**Service:**
**Our Service:**
- Service: `mem-qdrant-watcher.service`
- Status: `systemctl status mem-qdrant-watcher --no-pager`
- Logs: `journalctl -u mem-qdrant-watcher -n 20`
- Cron: Configured interval (e.g., `*/5 * * * *`)
- Cron: `*/5 * * * *` (every 5 minutes)
---
@@ -51,13 +56,13 @@ cd ~/<PROJECT_PATH>/true-recall-v2
|-------|---------|----------|
| memories_tr status | `curl -s http://<QDRANT_IP>:6333/collections/memories_tr \| jq .result.status` | `green` |
| gems_tr status | `curl -s http://<QDRANT_IP>:6333/collections/gems_tr \| jq .result.status` | `green` |
| memories_tr count | `curl -s http://<QDRANT_IP>:6333/collections/memories_tr \| jq .result.points_count` | `1000+` |
| gems_tr count | `curl -s http://<QDRANT_IP>:6333/collections/gems_tr \| jq .result.points_count` | `10+` |
| memories_tr count | `curl -s http://<QDRANT_IP>:6333/collections/memories_tr \| jq .result.points_count` | `12000+` |
| gems_tr count | `curl -s http://<QDRANT_IP>:6333/collections/gems_tr \| jq .result.points_count` | `70+` |
**Qdrant:**
- URL: `http://<QDRANT_IP>:6333`
**Our Qdrant:**
- URL: http://<QDRANT_IP>:6333
- Collections: memories_tr, gems_tr
- Embedding Model: Configured in openclaw.json
- Embedding Model: snowflake-arctic-embed2
---
@@ -65,14 +70,14 @@ cd ~/<PROJECT_PATH>/true-recall-v2
| Check | Command | Expected |
|-------|---------|----------|
| Uncurated count | See Section 7 | `Number of uncurated` |
| Curated count | See Section 7 | `Number of curated` |
| Curator config | `cat tr-continuous/curator_config.json` | Valid JSON |
| Uncurated count | See Section 7 | `1490` |
| Curated count | See Section 7 | `11239` |
| Curator config | `cat tr-continuous/curator_config.json` | `timer_minutes: 5` |
**Config:**
- Timer: Configured minutes
- Batch Size: Configured (e.g., 100)
- User ID: Your user ID
**Our Config:**
- Timer: 5 minutes (`*/5 * * * *`)
- Batch Size: 100
- User ID: rob
- Source: memories_tr
- Target: gems_tr
- Curator Log: `/var/log/true-recall-timer.log`
@@ -83,17 +88,17 @@ cd ~/<PROJECT_PATH>/true-recall-v2
| Step | Action | Check |
|------|--------|-------|
| 1 | Send a test message | Message received |
| 1 | Send a test message to Kimi | Message received |
| 2 | Wait 10 seconds | Allow processing |
| 3 | Check memories count increased | `curl -s http://<QDRANT_IP>:6333/collections/memories_tr \| jq .result.points_count` |
| 4 | Verify memory has user_id | `user_id: "<YOUR_USER_ID>"` in payload |
| 4 | Verify memory has user_id | `user_id: "rob"` in payload |
| 5 | Verify memory has curated=false | `curated: false` in payload |
**Watcher:**
- Script: `<SKILL_PATH>/qdrant-memory/scripts/realtime_qdrant_watcher.py`
- User ID: Configured (check openclaw.json)
**Our Watcher:**
- Script: `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
- User ID: `rob`
- Collection: `memories_tr`
- Embeddings: Configured model
- Embeddings: `snowflake-arctic-embed2`
---
@@ -109,13 +114,15 @@ cd ~/<PROJECT_PATH>/true-recall-v2
---
## 7. Recall Test
## 7. Recall Test ✅ **WORKING**
| Step | Action | Check |
|------|--------|-------|
| 1 | Start new conversation | Context loads |
| 2 | Ask about previous topic | Gems injected |
| 3 | Verify context visible | Relevant memories appear |
| 3 | Verify context visible | **Score 0.587** - Working! |
**Verified 2026-02-25:** Context injection successfully returns relevant gems with similarity scores above threshold (0.5+).
---
@@ -123,9 +130,9 @@ cd ~/<PROJECT_PATH>/true-recall-v2
| Path | Check | Status |
|------|-------|--------|
| Watcher script | `<SKILL_PATH>/qdrant-memory/scripts/realtime_qdrant_watcher.py` | ☐ |
| Curator script | `<PROJECT_PATH>/true-recall-v2/tr-continuous/curator_timer.py` | ☐ |
| Config file | `<PROJECT_PATH>/true-recall-v2/tr-continuous/curator_config.json` | ☐ |
| Watcher script | `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py` | ☐ |
| Curator script | `.local_projects/true-recall-v2/tr-continuous/curator_timer.py` | ☐ |
| Config file | `.local_projects/true-recall-v2/tr-continuous/curator_config.json` | ☐ |
| Log file | `/var/log/true-recall-timer.log` | ☐ |
---
@@ -137,16 +144,16 @@ cd ~/<PROJECT_PATH>/true-recall-v2
systemctl status mem-qdrant-watcher --no-pager
tail -20 /var/log/true-recall-timer.log
# Check Qdrant collections
# Check Qdrant collections (Our Qdrant: <QDRANT_IP>:6333)
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '{status: .result.status, points: .result.points_count}'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '{status: .result.status, points: .result.points_count}'
# Check uncurated memories
# Check uncurated memories (Our user_id: rob)
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/count \
-d '{"filter":{"must":[{"key":"user_id","match":{"value":"<YOUR_USER_ID>"}},{"key":"curated","match":{"value":false}}]}}' | jq .result.count
-d '{"filter":{"must":[{"key":"user_id","match":{"value":"rob"}},{"key":"curated","match":{"value":false}}]}}' | jq .result.count
# Run curator manually
cd ~/<PROJECT_PATH>/true-recall-v2/tr-continuous
# Run curator manually (Our path: .local_projects)
cd ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous
python3 curator_timer.py
# Check OpenClaw plugin
@@ -161,31 +168,38 @@ journalctl -u mem-qdrant-watcher -n 50 --no-pager
---
## Configuration Variables
## Recent Fixes (2026-02-25)
Replace these placeholders with your actual values:
| Issue | Status | Fix |
|-------|--------|-----|
| Embedding model mismatch | ✅ Fixed | Changed curator from `mxbai-embed-large` to `snowflake-arctic-embed2` |
| Gems had no vectors | ✅ Fixed | Updated `store_gem()` to use `text` field |
| JSON parsing errors | ✅ Fixed | Simplified extraction prompt |
| Field mismatch | ✅ Fixed | Curator now supports both `text` and `content` fields |
| Context injection | ✅ **WORKING** | Verified with score 0.587 on test query |
| **Watcher session bug** | ✅ **Fixed 12:22** | Watcher was stuck on old session, restarted and now follows current session |
| **Plugin capture** | ✅ **Fixed 12:34** | Added `extractMessageText()` to handle OpenAI-style content arrays |
| **Plugin exchanges** | ✅ **Verified 12:41** | Now extracting exchanges: parsed 17 user, 116 assistant, 9 exchanges |
| **Gem ID collision** | ✅ **Fixed 12:50** | Hash now uses `embedding_text_for_hash[:100]` instead of empty fields |
| **Meta-gem filtering** | ✅ **Fixed 12:52** | Curator skips patterns: "gems extracted", "curator", "✅", "🔍", debug messages, system messages |
| **gems_tr cleaned** | ✅ **Done 12:53** | Removed 5 meta-gems, kept 1 real gem |
| **Gem format (1st person)** | ✅ **Fixed 13:15** | Changed from "User decided..." to "I decided..." for better query matching |
| Variable | Description | Example |
|----------|-------------|---------|
| `<QDRANT_IP>` | Your Qdrant IP | `10.0.0.40` or `localhost` |
| `<OLLAMA_IP>` | Your Ollama IP | `10.0.0.10` or `localhost` |
| `<PROJECT_PATH>` | Your project path | `.local_projects` or `.projects` |
| `<SKILL_PATH>` | Your skills path | `~/.openclaw/workspace/skills` |
| `<YOUR_USER_ID>` | Your user ID | `rob` or username |
**Result:** Context injection now functional. Gems are embedded and searchable. Both watcher and plugin capture working.
| Check | Date | Status |
|-------|------|--------|
| All services running | 2026-02-25 | ✅ |
| Collections healthy | 2026-02-25 | ✅ |
| Capture working | 2026-02-25 | ✅ |
| Curation working | 2026-02-25 | ✅ |
| Recall working | 2026-02-25 | ✅ **Context injection verified** |
---
## Sign-Off
| Check | Date | Initials |
|-------|------|----------|
| All services running | | |
| Collections healthy | | |
| Capture working | | |
| Curation working | | |
| Recall working | | |
---
*Last updated: 2026-02-25*
*Version: 2.2 (Generic)*
*Last updated: 2026-02-25 12:04 CST*
*User: rob*
*Qdrant: <QDRANT_IP>:6333*
*Timer: 5 minutes*
*Collections: memories_tr (12,729), gems_tr (14+)*
*Status: ✅ Context injection WORKING*

View File

@@ -6,7 +6,7 @@
---
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST)
### Issues Found & Fixed
@@ -17,28 +17,10 @@
| **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt |
| **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields |
| **Silent embedding failures** | No error logging | ✅ Added explicit error messages |
| **Watcher stuck on old session** | Watcher only switched when file deleted, old sessions persisted | ✅ Restarted service, now follows current session (12:22) |
| **Plugin capture 0 exchanges** | OpenClaw uses OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
### Plugin Capture Fix Validation
```
Before: parsed 14 user, 84 assistant messages, 0 exchanges
After: parsed 17 user, 116 assistant messages, 9 exchanges ✅
```
**Code change:** Added `extractMessageText()` function to handle OpenAI-style content arrays:
```typescript
function extractMessageText(msg) {
if (typeof content === "string") return content;
if (Array.isArray(content)) {
for (const item of content) {
if (item.type === "text") textParts.push(item.text);
}
return textParts.join(" ");
}
}
```
| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` |
| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter |
| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
| **First-person gems** | Third person format "User decided..." | ✅ Changed to "I decided..." for better matching |
### Validation Results
@@ -78,9 +60,9 @@ function extractMessageText(msg) {
**Next session start:** Read this file, then check:
```bash
# Quick status
python3 /root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
sudo systemctl status mem-qdrant-watcher
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
```
---
@@ -188,12 +170,12 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation
### 1. Real-Time Watcher (Primary Capture)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
**Function:**
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
- Parses every conversation turn in real-time
- Embeds with `snowflake-arctic-embed2` (Ollama @ 10.0.0.10)
- Embeds with `snowflake-arctic-embed2` (Ollama @ <OLLAMA_IP>)
- Stores directly to `memories_tr` (no Redis)
- **Cleans content:** Removes markdown, tables, metadata, thinking tags
@@ -207,7 +189,7 @@ TrueRecall v2 is a complete memory system with real-time capture, daily curation
### 2. Content Cleaner (Existing Data)
**Location:** `/root/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
**Function:**
- Batch-cleans existing `memories_tr` points
@@ -233,7 +215,7 @@ python3 clean_memories_tr.py --execute --limit 100
**Replaces:** Daily curator (2:45 AM batch) and turn-based curator
**Location:** `/root/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
**Schedule:** Every 30 minutes (cron)
@@ -314,7 +296,7 @@ python3 curator_timer.py --config curator_config.json
### Core Project Files
```
/root/.openclaw/workspace/.local_projects/true-recall-v2/
~/.openclaw/workspace/.local_projects/true-recall-v2/
├── README.md # Architecture docs
├── session.md # This file
├── curator-prompt.md # Gem extraction prompt
@@ -356,6 +338,46 @@ python3 curator_timer.py --config curator_config.json
---
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
### Issues Found & Fixed Today
| Issue | Root Cause | Fix Applied |
|-------|------------|-------------|
| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) |
| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher |
| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items |
### Validation Results (2026-02-25 12:41)
```
memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges
memory-qdrant: first msg role=user, contentType=array
```
**Before:** 0 exchanges extracted
**After:** 9 exchanges captured per session
### Components Status
| Component | Before | After | Status |
|-----------|--------|-------|--------|
| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed |
| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed |
| Context injection | Working | Still working | ✅ Verified |
### Files Modified (2026-02-25)
| File | Change |
|------|--------|
| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging |
| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes |
| `session.md` | This update |
| `function_check.md` | Added fixes section |
---
## Changes Made Today (2026-02-24 19:00)
### 1. Timer Curator Deployed (v2.2)
@@ -399,8 +421,8 @@ python3 curator_timer.py --config curator_config.json
"embeddingModel": "snowflake-arctic-embed2",
"maxRecallResults": 2,
"minRecallScore": 0.7,
"ollamaUrl": "http://10.0.0.10:11434",
"qdrantUrl": "http://10.0.0.40:6333"
"ollamaUrl": "http://<OLLAMA_IP>:11434",
"qdrantUrl": "http://<QDRANT_IP>:6333"
},
"enabled": true
}
@@ -429,11 +451,11 @@ python3 curator_timer.py --config curator_config.json
```bash
# Points count
curl -s http://10.0.0.40:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://10.0.0.40:6333/collections/gems_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'
# Recent points
curl -s -X POST http://10.0.0.40:6333/collections/memories_tr/points/scroll \
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
-H "Content-Type: application/json" \
-d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content'
```
@@ -460,8 +482,8 @@ openclaw status
**Check:**
1. Service running? `systemctl status mem-qdrant-watcher`
2. Logs: `journalctl -u mem-qdrant-watcher -f`
3. Qdrant accessible? `curl http://10.0.0.40:6333/`
4. Ollama accessible? `curl http://10.0.0.10:11434/api/tags`
3. Qdrant accessible? `curl http://<QDRANT_IP>:6333/`
4. Ollama accessible? `curl http://<OLLAMA_IP>:11434/api/tags`
### Issue: Cleaner Fails
@@ -544,7 +566,7 @@ If starting fresh:
1. Read `README.md` for architecture overview
2. Check service status: `sudo systemctl status mem-qdrant-watcher`
3. Check timer curator: `tail /var/log/true-recall-timer.log`
4. Verify collections: `curl http://10.0.0.40:6333/collections`
4. Verify collections: `curl http://<QDRANT_IP>:6333/collections`
---