fix: Add tr-worker files, sanitize IPs, update validation checklists
- Add realtime_qdrant_watcher.py and mem-qdrant-watcher.service to tr-worker/
- Sanitize private IPs (10.0.0.x → <QDRANT_IP>, <OLLAMA_IP>)
- Replace absolute paths with placeholders
- Add GIT_VALIDATION_CHECK.md for security validation
- Update validation checklists to v2.4
- Remove session.md from git (local-only file)
6
.gitignore
vendored
@@ -57,3 +57,9 @@ datasets/
|
|||||||
build/
|
build/
|
||||||
dist/
|
dist/
|
||||||
*.egg-info/
|
*.egg-info/
|
||||||
|
|
||||||
|
# Session and validation files (local only)
|
||||||
|
session.md
|
||||||
|
VALIDATION_*.md
|
||||||
|
audit_results_*.md
|
||||||
|
CONTEXT_INJECTION_*.md
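The four patterns added above can be sanity-checked against filenames before committing. This is a minimal sketch, assuming plain shell-style matching on the bare filename, which only approximates gitignore semantics; `LOCAL_ONLY_PATTERNS` and `is_local_only` are hypothetical names, not part of the repository.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitignore hunk above.
LOCAL_ONLY_PATTERNS = [
    "session.md",
    "VALIDATION_*.md",
    "audit_results_*.md",
    "CONTEXT_INJECTION_*.md",
]


def is_local_only(filename: str) -> bool:
    """Return True if the bare filename matches any local-only pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in LOCAL_ONLY_PATTERNS)


if __name__ == "__main__":
    for name in ("session.md", "VALIDATION_NOTES.md", "GIT_VALIDATION_CHECK.md"):
        print(name, is_local_only(name))
```

Note that `GIT_VALIDATION_CHECK.md` does not match `VALIDATION_*.md`, because the pattern has to match from the start of the filename. That is consistent with this commit adding the file to git.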
|
||||||
|
|||||||
113
GIT_VALIDATION_CHECK.md
Normal file
@@ -0,0 +1,113 @@
|
|||||||
|
# TrueRecall v2 - Git Validation Checklist
|
||||||
|
|
||||||
|
**Environment:** Git Repository (`.git_projects/true-recall-v2/`)
|
||||||
|
**Purpose:** Validate git-ready directory for public sharing
|
||||||
|
**Version:** 2.4
|
||||||
|
**Last Updated:** 2026-02-26
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
This checklist validates the **git repository** where **NO sensitive data** should exist. All private information must be sanitized before sharing.
|
||||||
|
|
||||||
|
**Key Principle:** In git, placeholders are required:
|
||||||
|
- ❌ NO real private IPs (10.0.0.x, 192.168.x.x)
|
||||||
|
- ❌ NO absolute paths (/root/, /home/username/)
|
||||||
|
- ❌ NO real user IDs or credentials
|
||||||
|
- ✅ Use placeholders: `<QDRANT_IP>`, `<OLLAMA_IP>`, `~/.openclaw/`
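The placeholder rule above could be applied mechanically before files are copied into the git directory. This is a minimal sketch, assuming the caller supplies its own mapping from real IPs to placeholders; `sanitize` is a hypothetical helper and the address used in the usage example is illustrative only.

```python
import re


def sanitize(text: str, ip_placeholders: dict) -> str:
    """Replace known private IPs and /root/.openclaw paths with placeholders."""
    for ip, placeholder in ip_placeholders.items():
        # Guards keep 10.0.0.1 from matching inside 10.0.0.12 or 110.0.0.1.
        pattern = r"(?<![\d.])" + re.escape(ip) + r"(?!\d)(?!\.\d)"
        text = re.sub(pattern, lambda _match, p=placeholder: p, text)
    # Absolute home path becomes the tilde form used throughout the README.
    return text.replace("/root/.openclaw", "~/.openclaw")


if __name__ == "__main__":
    raw = "curl http://10.0.0.1:6333 # config in /root/.openclaw/openclaw.json"
    print(sanitize(raw, {"10.0.0.1": "<QDRANT_IP>"}))
```

Keeping the mapping outside the function means the real addresses never need to appear in the sanitized repository.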
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Current Configuration (Sanitized for Git)
|
||||||
|
|
||||||
|
| Service | Placeholder | Default Port |
|
||||||
|
|---------|-------------|---------------|
|
||||||
|
| Qdrant | `<QDRANT_IP>` | 6333 |
|
||||||
|
| Ollama | `<OLLAMA_IP>` | 11434 |
|
||||||
|
| Redis | `<REDIS_IP>` | 6379 |
|
||||||
|
| Gateway | `<GATEWAY_IP>` | 18789 |
|
||||||
|
| Gitea | `<GITEA_IP>` | 3000 |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## SECTION 1: Critical Security Checks (MUST PASS)
|
||||||
|
|
||||||
|
### 1.1 Private IP Addresses (FORBIDDEN in Git)
|
||||||
|
|
||||||
|
| # | Check | Status |
|
||||||
|
|---|-------|--------|
|
||||||
|
| 1.1.1 | No 10.x.x.x IPs | ✅ PASS |
|
||||||
|
| 1.1.2 | No 192.168.x.x IPs | ✅ PASS |
|
||||||
|
| 1.1.3 | No 172.16-31.x.x IPs | ✅ PASS |
|
||||||
|
|
||||||
|
**Verification:**
|
||||||
|
```bash
|
||||||
|
grep -rE '10\.[0-9]+\.[0-9]+\.[0-9]+' --include="*.py" --include="*.md" .
|
||||||
|
```
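The grep above only covers check 1.1.1 (the 10.x.x.x range). A sketch that covers all three checks in one pass might look like the following, assuming the standard RFC 1918 private ranges are what 1.1.1 through 1.1.3 intend; `find_private_ips` is a hypothetical helper, not a script in this repository.

```python
import ipaddress
import re

# Candidate dotted quads that are not embedded in a longer number or address.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d)(?!\.\d)")

# RFC 1918 private ranges, matching checks 1.1.1, 1.1.2 and 1.1.3.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def find_private_ips(text: str) -> list:
    """Return every RFC 1918 address found in text, in order of appearance."""
    hits = []
    for match in IPV4_CANDIDATE.finditer(text):
        try:
            address = ipaddress.IPv4Address(match.group())
        except ValueError:
            continue  # not a valid IPv4 address, e.g. 999.1.1.1
        if any(address in network for network in PRIVATE_NETWORKS):
            hits.append(match.group())
    return hits
```

Parsing each candidate with `ipaddress` avoids false positives such as `110.1.2.3`, which a bare `10\.` pattern would also match.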
|
||||||
|
|
||||||
|
### 1.2 Absolute Paths (FORBIDDEN in Git)
|
||||||
|
|
||||||
|
| # | Check | Status |
|
||||||
|
|---|-------|--------|
|
||||||
|
| 1.2.1 | No /root/ paths | ✅ PASS |
|
||||||
|
| 1.2.2 | No /home/[user]/ paths | ✅ PASS |
|
||||||
|
|
||||||
|
**Verification:**
|
||||||
|
```bash
|
||||||
|
grep -rE '/root/|/home/[a-z]+/' --include="*.py" --include="*.md" .
|
||||||
|
```
|
||||||
|
|
||||||
|
### 1.3 Credentials & Secrets (FORBIDDEN in Git)
|
||||||
|
|
||||||
|
| # | Check | Status |
|
||||||
|
|---|-------|--------|
|
||||||
|
| 1.3.1 | No passwords | ✅ PASS |
|
||||||
|
| 1.3.2 | No API tokens | ✅ PASS |
|
||||||
|
| 1.3.3 | No private keys | ✅ PASS |
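Section 1.3 lists no verification command. A rough heuristic scan could look like the sketch below; the patterns are assumptions chosen for illustration, they will miss many secret formats, and `find_secrets` is a hypothetical helper rather than an existing tool.

```python
import re

# Heuristic patterns only; tune them for the project before relying on them.
SECRET_PATTERNS = {
    "password": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
    "api_token": re.compile(r"(?i)\b(?:api[_-]?key|token)\s*[:=]\s*\S+"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list:
    """Return the sorted names of the patterns that match somewhere in text."""
    return sorted(name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text))
```

A clean result from a scan like this does not prove that a file is free of credentials, so a manual review is still needed for checks 1.3.1 through 1.3.3.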
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## SECTION 2: Files & Structure
|
||||||
|
|
||||||
|
### 2.1 Required Files
|
||||||
|
|
||||||
|
| File | Status |
|
||||||
|
|------|--------|
|
||||||
|
| README.md | ✅ Present (sanitized) |
|
||||||
|
| curator_timer.py | ✅ Present (sanitized) |
|
||||||
|
| curator_config.json | ✅ Present |
|
||||||
|
| .gitignore | ✅ Present (updated) |
|
||||||
|
|
||||||
|
### 2.2 Files NOT in Git (Local Only)
|
||||||
|
|
||||||
|
| File | Expected |
|
||||||
|
|------|----------|
|
||||||
|
| session.md | ❌ Not in git |
|
||||||
|
| VALIDATION_*.md | ❌ Not in git |
|
||||||
|
| audit_results_*.md | ❌ Not in git |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## SECTION 3: Placeholder Verification
|
||||||
|
|
||||||
|
| File | QDRANT_IP | OLLAMA_IP | ~/.openclaw |
|
||||||
|
|------|-----------|-----------|--------------|
|
||||||
|
| README.md | ✅ | ✅ | ✅ |
|
||||||
|
| curator_timer.py | ✅ | ✅ | ✅ |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Validation Summary
|
||||||
|
|
||||||
|
- ✅ No private IPs found
|
||||||
|
- ✅ No absolute paths (/root/)
|
||||||
|
- ✅ No credentials/secrets
|
||||||
|
- ✅ Placeholders used correctly
|
||||||
|
- ✅ .gitignore updated
|
||||||
|
|
||||||
|
**Status:** ✅ READY FOR COMMIT
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Last validated: 2026-02-26 08:30 CST*
|
||||||
26
README.md
@@ -102,12 +102,24 @@ After: Watching current session (93dc32bf... from Feb 25) ✅
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
TrueRecall v2 extracts "gems" (key insights) from conversations and injects them as context. It consists of three layers:
|
TrueRecall v2 is a **standalone memory system** that extracts "gems" (key insights) from conversations and injects them as context. It operates independently — not an addon or extension of any previous system.
|
||||||
|
|
||||||
|
TrueRecall v2 replaces both Jarvis Memory and TrueRecall v1 with a completely re-architected solution:
|
||||||
|
|
||||||
|
| System | Status | Relationship to v2 |
|
||||||
|
|--------|--------|-------------------|
|
||||||
|
| **Jarvis Memory** | Legacy | Replaced by v2 |
|
||||||
|
| **TrueRecall v1** | Deprecated | Replaced by v2 |
|
||||||
|
| **TrueRecall v2** | ✅ Active | Complete standalone replacement |
|
||||||
|
|
||||||
|
### Three-Layer Architecture
|
||||||
|
|
||||||
1. **Capture** — Real-time watcher saves every turn to `memories_tr`
|
1. **Capture** — Real-time watcher saves every turn to `memories_tr`
|
||||||
2. **Curation** — Daily curator extracts gems to `gems_tr`
|
2. **Curation** — Timer-based curator extracts gems to `gems_tr`
|
||||||
3. **Injection** — Plugin searches `gems_tr` and injects gems per turn
|
3. **Injection** — Plugin searches `gems_tr` and injects gems per turn
|
||||||
|
|
||||||
|
**Key:** v2 requires no components from Jarvis Memory or v1. It is self-contained with its own storage (Qdrant-only), capture mechanism, and injection system.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Current State
|
## Current State
|
||||||
@@ -200,7 +212,7 @@ TrueRecall v2 extracts "gems" (key insights) from conversations and injects them
|
|||||||
**File:** `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
|
**File:** `skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
|
||||||
|
|
||||||
**What it does:**
|
**What it does:**
|
||||||
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
|
- Watches `~/.openclaw/agents/main/sessions/*.jsonl`
|
||||||
- Parses each turn (user + AI)
|
- Parses each turn (user + AI)
|
||||||
- Embeds with `snowflake-arctic-embed2`
|
- Embeds with `snowflake-arctic-embed2`
|
||||||
- Stores to `memories_tr` instantly
|
- Stores to `memories_tr` instantly
|
||||||
@@ -382,7 +394,7 @@ python3 clean_memories_tr.py --execute --limit 100
|
|||||||
|
|
||||||
### 6. memory-qdrant Plugin
|
### 6. memory-qdrant Plugin
|
||||||
|
|
||||||
**Location:** `/root/.openclaw/extensions/memory-qdrant/`
|
**Location:** `~/.openclaw/extensions/memory-qdrant/`
|
||||||
|
|
||||||
**Config (openclaw.json):**
|
**Config (openclaw.json):**
|
||||||
```json
|
```json
|
||||||
@@ -435,8 +447,8 @@ python3 clean_memories_tr.py --execute --limit 100
|
|||||||
|
|
||||||
| File | Purpose |
|
| File | Purpose |
|
||||||
|------|---------|
|
|------|---------|
|
||||||
| `/root/.openclaw/extensions/memory-qdrant/` | Plugin code |
|
| `~/.openclaw/extensions/memory-qdrant/` | Plugin code |
|
||||||
| `/root/.openclaw/openclaw.json` | Configuration |
|
| `~/.openclaw/openclaw.json` | Configuration |
|
||||||
| `/etc/systemd/system/mem-qdrant-watcher.service` | Service file |
|
| `/etc/systemd/system/mem-qdrant-watcher.service` | Service file |
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -445,7 +457,7 @@ python3 clean_memories_tr.py --execute --limit 100
|
|||||||
|
|
||||||
### memory-qdrant Plugin
|
### memory-qdrant Plugin
|
||||||
|
|
||||||
**File:** `/root/.openclaw/openclaw.json`
|
**File:** `~/.openclaw/openclaw.json`
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# TrueRecall v2 - Master Audit Checklist (LOCAL)
|
# TrueRecall v2 - Master Audit Checklist (GIT)
|
||||||
|
|
||||||
**For:** `.local_projects/true-recall-v2/` (Working/Development Directory)
|
**For:** `.git_projects/true-recall-v2/` (Git Repository - Sanitized)
|
||||||
**Version:** 2.2
|
**Version:** 2.2
|
||||||
**Last Updated:** 2026-02-25 10:07 CST
|
**Last Updated:** 2026-02-25 10:07 CST
|
||||||
|
|
||||||
@@ -8,7 +8,12 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
This checklist validates the **local working directory** with real IPs, paths, and credentials. Use this for development, debugging, and local testing.
|
This checklist validates the **git repository** where all private IPs, absolute paths, and credentials have been sanitized. Use this before pushing to public repositories.
|
||||||
|
|
||||||
|
**Related Files:**
|
||||||
|
- `GIT_VALIDATION_CHECK.md` - Comprehensive git validation checklist
|
||||||
|
- `LOCAL_VALIDATION_CHECK.md` - Local dev validation (in `.local_projects/`)
|
||||||
|
- `VALIDATION_NOTES.md` - Auto-generated validation findings
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -22,8 +27,22 @@ This checklist validates the **local working directory** with real IPs, paths, a
|
|||||||
| Watcher stuck on old session | ✅ **Fixed 12:22** | Restarted watcher service |
|
| Watcher stuck on old session | ✅ **Fixed 12:22** | Restarted watcher service |
|
||||||
| Plugin capture 0 exchanges | ✅ **Fixed 12:34** | Added `extractMessageText()` for array content |
|
| Plugin capture 0 exchanges | ✅ **Fixed 12:34** | Added `extractMessageText()` for array content |
|
||||||
| Plugin exchanges working | ✅ **Verified 12:41** | 9 exchanges extracted per session |
|
| Plugin exchanges working | ✅ **Verified 12:41** | 9 exchanges extracted per session |
|
||||||
|
| **HTML comments in UI** | ✅ **Fixed 14:02** | Changed `formatRelevantMemoriesContext` to clean text format |
|
||||||
|
| **prependContext vs systemPrompt** | ✅ **Fixed 14:02** | Changed hook return from `prependContext` to `systemPrompt` for hidden injection |
|
||||||
|
| **TypeScript source not updated** | ✅ **Fixed 14:02** | Updated `.ts` file, not just compiled `.js` |
|
||||||
|
|
||||||
### Needed Improvements
|
### Today's Issues Found (2026-02-25)
|
||||||
|
|
||||||
|
| # | Issue | Description | Status | Priority |
|
||||||
|
|---|-------|-------------|--------|----------|
|
||||||
|
| 1 | HTML comments visible in UI | `<!-- relevant-memories-start -->` blocks showing in chat | ✅ **FIXED** | High |
|
||||||
|
| 2 | Memory injection format | Was using HTML comment format, now clean "Memory Injection:" text | ✅ **FIXED** | High |
|
||||||
|
| 3 | prependContext vs systemPrompt | Plugin was using `prependContext` (visible in user message) instead of `systemPrompt` (hidden in system prompt) | ✅ **FIXED** | High |
|
||||||
|
| 4 | TypeScript source not updated | OpenClaw compiles from `.ts`, was editing `.js` only | ✅ **FIXED** | High |
|
||||||
|
| 5 | Gateway restart issues | kill/killall not working reliably | ✅ **FIXED** | Medium |
|
||||||
|
| 6 | **README needs update** | TrueRecall v2 is standalone, not addon to Jarvis Memory | ✅ **FIXED** | Medium |
|
||||||
|
|
||||||
|
### Needed Improvements (Carryover)
|
||||||
|
|
||||||
| Issue | Description | Priority |
|
| Issue | Description | Priority |
|
||||||
|-------|-------------|----------|
|
|-------|-------------|----------|
|
||||||
@@ -263,6 +282,47 @@ This checklist validates the **local working directory** with real IPs, paths, a
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 10. Recent Fixes to Verify (2026-02-25)
|
||||||
|
|
||||||
|
### Plugin Memory Format Fix
|
||||||
|
**Status:** ✅ **FIXED**
|
||||||
|
|
||||||
|
**Summary:**
|
||||||
|
- Changed `formatRelevantMemoriesContext` from HTML comment format to clean text
|
||||||
|
- Changed hook return from `prependContext` to `systemPrompt` (hides from UI)
|
||||||
|
- Updated both TypeScript source (`.ts`) and compiled JavaScript (`.js`)
|
||||||
|
|
||||||
|
**Files Modified:**
|
||||||
|
- `<OPENCLAW_PATH>/extensions/memory-qdrant/index.ts`
|
||||||
|
- `<OPENCLAW_PATH>/extensions/memory-qdrant/index.js`
|
||||||
|
|
||||||
|
**What Changed:**
|
||||||
|
```typescript
|
||||||
|
// Before:
|
||||||
|
return `<!-- relevant-memories-start -->
|
||||||
|
<relevant-memories>...`;
|
||||||
|
return { prependContext: formatRelevantMemoriesContext(...) };
|
||||||
|
|
||||||
|
// After:
|
||||||
|
return `Memory Injection: Historical context from previous conversations:
|
||||||
|
1. [category] text`;
|
||||||
|
return { systemPrompt: formatRelevantMemoriesContext(...) };
|
||||||
|
```
|
||||||
|
|
||||||
|
**Verification Checklist:**
|
||||||
|
- [ ] Send test message - memories appear as "Memory Injection:" not HTML
|
||||||
|
- [ ] No `<!-- -->` tags visible in chat
|
||||||
|
- [ ] Gateway restarted after changes
|
||||||
|
|
||||||
|
### Pending Updates
|
||||||
|
|
||||||
|
| # | Item | Description | Status |
|
||||||
|
|---|------|-------------|--------|
|
||||||
|
| 1 | README update | Clarify v2 is standalone, not addon | ✅ **FIXED** |
|
||||||
|
| 2 | Comparison table | Update v2 vs Jarvis vs v1 | ✅ **FIXED** |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Sign-Off
|
## Sign-Off
|
||||||
|
|
||||||
| Role | Name | Date | Signature |
|
| Role | Name | Date | Signature |
|
||||||
|
|||||||
56
checklist.md
@@ -536,7 +536,61 @@ Before releasing/sharing:
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 13. Sign-off Checklist
|
#### 12.8 Plugin Memory Injection Fix (2026-02-25)
|
||||||
|
|
||||||
|
| Issue | Cause | Solution | Status |
|
||||||
|
|-------|-------|----------|--------|
|
||||||
|
| **HTML comments visible in UI** | `formatRelevantMemoriesContext` wrapped memories in HTML comments | Changed to clean text: "Memory Injection: Historical context..." | ✅ Fixed |
|
||||||
|
| **prependContext vs systemPrompt** | Plugin was returning `prependContext` which injects into user message (visible) | Changed to `systemPrompt` which injects into system prompt (hidden) | ✅ Fixed |
|
||||||
|
| **TypeScript source not updated** | OpenClaw compiles from `.ts`, edits were only to `.js` | Updated both `index.ts` and `index.js` | ✅ Fixed |
|
||||||
|
| **Gateway restart needed** | Plugin changes require gateway restart to take effect | Restarted gateway after file updates | ✅ Fixed |
|
||||||
|
|
||||||
|
**Files Modified:**
|
||||||
|
- `/root/.openclaw/extensions/memory-qdrant/index.ts` - Main TypeScript source
|
||||||
|
- `/root/.openclaw/extensions/memory-qdrant/index.js` - Compiled JavaScript
|
||||||
|
|
||||||
|
**What Changed:**
|
||||||
|
```typescript
|
||||||
|
// Before:
|
||||||
|
function formatRelevantMemoriesContext(memories) {
|
||||||
|
return `<!-- relevant-memories-start -->
|
||||||
|
<relevant-memories>
|
||||||
|
...`;
|
||||||
|
}
|
||||||
|
return { prependContext: formatRelevantMemoriesContext(...) };
|
||||||
|
|
||||||
|
// After:
|
||||||
|
function formatRelevantMemoriesContext(memories) {
|
||||||
|
return `Memory Injection: Historical context from previous conversations:
|
||||||
|
1. [category] text`;
|
||||||
|
}
|
||||||
|
return { systemPrompt: formatRelevantMemoriesContext(...) };
|
||||||
|
```
|
||||||
|
|
||||||
|
**Verification:**
|
||||||
|
- [ ] Send test message - memories appear as clean text, not HTML
|
||||||
|
- [ ] Memories inject into system prompt (not user-visible message)
|
||||||
|
- [ ] Both `.ts` and `.js` files updated consistently
|
||||||
|
- [ ] Gateway restarted and running
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 12.9 README Update
|
||||||
|
|
||||||
|
| Issue | Description | Status |
|
||||||
|
|-------|-------------|--------|
|
||||||
|
| **Standalone vs Addon** | README clarified: TrueRecall v2 is standalone, not addon | ✅ **FIXED** |
|
||||||
|
| **Architecture description** | Updated: v2 is complete replacement of Jarvis Memory and v1 | ✅ **FIXED** |
|
||||||
|
|
||||||
|
**Changes Made:**
|
||||||
|
- [x] Updated README Overview section
|
||||||
|
- [x] Added "standalone" declaration with comparison table
|
||||||
|
- [x] Clarified relationship: Jarvis (legacy) → v1 (deprecated) → v2 (active)
|
||||||
|
- [x] Added note: v2 requires no components from previous systems
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 13. Sign-off Checklist
|
||||||
|
|
||||||
| Section | Status | Date | Checked By |
|
| Section | Status | Date | Checked By |
|
||||||
|---------|--------|------|------------|
|
|---------|--------|------|------------|
|
||||||
|
|||||||
587
session.md
@@ -1,587 +0,0 @@
|
|||||||
# TrueRecall v2 - Session Notes
|
|
||||||
|
|
||||||
**Last Updated:** 2026-02-25 12:04 CST
|
|
||||||
**Status:** ✅ **Context Injection FIXED & Working**
|
|
||||||
**Version:** v2.2.1 (Post-fix validation)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00 CST)
|
|
||||||
|
|
||||||
### Issues Found & Fixed
|
|
||||||
|
|
||||||
| Issue | Root Cause | Fix Applied |
|
|
||||||
|-------|------------|-------------|
|
|
||||||
| **Context injection broken** | Embedding model mismatch | ✅ Changed curator from `mxbai-embed-large` to `snowflake-arctic-embed2` |
|
|
||||||
| **Gems had no vectors** | `store_gem()` used wrong field | ✅ Updated to use `text` field for embedding |
|
|
||||||
| **JSON parsing errors** | Complex prompt causing LLM failures | ✅ Simplified extraction prompt |
|
|
||||||
| **Field mismatch** | Memories have `text`, curator expected `content` | ✅ Curator now supports both `text` and `content` fields |
|
|
||||||
| **Silent embedding failures** | No error logging | ✅ Added explicit error messages |
|
|
||||||
| **Gem ID collision** | Hash used non-existent fields | ✅ Hash now uses `embedding_text_for_hash[:100]` |
|
|
||||||
| **Meta-gems extracted** | Curator extracted from debug output | ✅ Added SKIP_PATTERNS filter |
|
|
||||||
| **gems_tr pollution** | 5 meta-gems + 1 real gem | ✅ Cleaned, now 1 real gem only |
|
|
||||||
| **First-person gems** | Third person format "User decided..." | ✅ Changed to "I decided..." for better matching |
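The gem ID fix in the table above hashes the first 100 characters of the embedding text. The sketch below illustrates that idea, assuming SHA-256 and a hex-string ID; the hash function and ID format actually used by `store_gem()` are not shown here, and `gem_id` is a hypothetical name.

```python
import hashlib


def gem_id(embedding_text: str) -> str:
    """Derive a deterministic ID from the first 100 characters of the text."""
    prefix = embedding_text[:100]
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()
```

With this scheme, two gems that share their first 100 characters get the same ID, so storing the second one would overwrite the first instead of creating a duplicate.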
|
|
||||||
|
|
||||||
### Validation Results
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Test query: "OpenClaw gateway update fixed gems"
|
|
||||||
# Result: Score 0.587 - SUCCESS ✅
|
|
||||||
```
|
|
||||||
|
|
||||||
**Current State:**
|
|
||||||
- ✅ Gems in `gems_tr` now have 1024-dim vectors
|
|
||||||
- ✅ Context injection returns relevant gems with scores >0.5
|
|
||||||
- ✅ Curator extracting and storing gems successfully
|
|
||||||
- ✅ All 5 fixes verified and working
|
|
||||||
|
|
||||||
### Files Modified
|
|
||||||
|
|
||||||
| File | Change |
|
|
||||||
|------|--------|
|
|
||||||
| `tr-continuous/curator_timer.py` | Embedding model, field handling, JSON parsing |
|
|
||||||
| `README.md` | Updated status and embedding model info |
|
|
||||||
| `function_check.md` | Added fixes section, updated sign-off |
|
|
||||||
| `session.md` | This update |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Needed Improvements
|
|
||||||
|
|
||||||
| Issue | Description | Priority |
|
|
||||||
|-------|-------------|----------|
|
|
||||||
| **Semantic Deduplication** | No dedup between similar gems. Same fact phrased differently creates multiple gems. | High |
|
|
||||||
| **Search Result Deduplication** | Similar gems above threshold both injected, causing redundancy. | Medium |
|
|
||||||
| **Gem Quality Scoring** | No quality metric. Some gems may be low value. | Medium |
|
|
||||||
| **Temporal Decay** | All gems treated equally regardless of age. | Low |
|
|
||||||
| **Gem Merging/Updating** | When user changes preference, old gem still exists. | Low |
|
|
||||||
| **Importance Calibration** | All curator gems marked "medium". Should be dynamic. | Low |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Session End (18:09 CST)
|
|
||||||
|
|
||||||
**Reason:** User starting new session
|
|
||||||
|
|
||||||
**Current State:**
|
|
||||||
- Real-time watcher: ✅ Active (capturing live)
|
|
||||||
- Timer curator: ✅ Deployed (every 5 min via cron)
|
|
||||||
- Daily curator: ❌ Removed (replaced by timer)
|
|
||||||
- Total memories: 12,729 (1,502 uncurated, 11,227 curated)
|
|
||||||
- Gems: 73 (actively extracting)
|
|
||||||
|
|
||||||
**Next session start:** Read this file, then check:
|
|
||||||
```bash
|
|
||||||
# Quick status
|
|
||||||
python3 ~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py --status
|
|
||||||
sudo systemctl status mem-qdrant-watcher
|
|
||||||
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Executive Summary
|
|
||||||
|
|
||||||
TrueRecall v2 is a complete memory system with real-time capture, daily curation, and context injection. All components are operational.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Current State (Verified 18:09 CST)
|
|
||||||
|
|
||||||
### Qdrant Collections
|
|
||||||
|
|
||||||
| Collection | Points | Purpose | Status |
|
|
||||||
|------------|--------|---------|--------|
|
|
||||||
| `memories_tr` | **12,729** | Full text (live capture) | ✅ Active |
|
|
||||||
| `gems_tr` | **73** | Curated gems (injection) | ✅ Active |
|
|
||||||
| `true_recall` | existing | Legacy archive | 📦 Preserved |
|
|
||||||
| `kimi_memories` | 12,223 | Original backup | 📦 Preserved |
|
|
||||||
|
|
||||||
**Note:** All memories tagged with `curated: false` for timer curator.
|
|
||||||
|
|
||||||
### Services
|
|
||||||
|
|
||||||
| Service | Status | Uptime |
|
|
||||||
|---------|--------|--------|
|
|
||||||
| `mem-qdrant-watcher` | ✅ Active | 30+ min |
|
|
||||||
| OpenClaw Gateway | ✅ Running | 2026.2.23 |
|
|
||||||
| memory-qdrant plugin | ✅ Loaded | recall: gems_tr, capture: memories_tr |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
### v2.2: Timer-Based Curation (DEPLOYED)
|
|
||||||
|
|
||||||
**Data Flow:**
|
|
||||||
```
|
|
||||||
┌─────────────────┐ ┌──────────────────────┐ ┌─────────────┐
|
|
||||||
│ OpenClaw Chat │────▶│ Real-Time Watcher │────▶│ memories_tr │
|
|
||||||
│ (Session JSONL)│ │ (Python daemon) │ │ (Qdrant) │
|
|
||||||
└─────────────────┘ └──────────────────────┘ └──────┬──────┘
|
|
||||||
│
|
|
||||||
│ Every 5 min
|
|
||||||
▼
|
|
||||||
┌──────────────────┐
|
|
||||||
│ Timer Curator │
|
|
||||||
│ (cron/qwen3) │
|
|
||||||
└────────┬─────────┘
|
|
||||||
│
|
|
||||||
▼
|
|
||||||
┌──────────────────┐
|
|
||||||
│ gems_tr │
|
|
||||||
│ (Qdrant) │
|
|
||||||
└────────┬─────────┘
|
|
||||||
│
|
|
||||||
Per turn │
|
|
||||||
▼
|
|
||||||
┌──────────────────┐
|
|
||||||
│ memory-qdrant │
|
|
||||||
│ plugin │
|
|
||||||
└──────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
**Key Changes:**
|
|
||||||
- ✅ Replaced daily 2:45 AM batch with 5-minute timer
|
|
||||||
- ✅ All memories tagged `curated: false` on write
|
|
||||||
- ✅ Migration completed for 12,378 existing memories
|
|
||||||
- ✅ No Redis dependency (direct Qdrant only)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Components
|
|
||||||
|
|
||||||
### Curation Mode: Timer-Based (DEPLOYED v2.2)
|
|
||||||
|
|
||||||
| Setting | Value | Adjustable |
|
|
||||||
|---------|-------|------------|
|
|
||||||
| **Trigger** | Cron timer | ✅ |
|
|
||||||
| **Interval** | 5 minutes | ✅ Config file |
|
|
||||||
| **Batch size** | 100 memories max | ✅ Config file |
|
|
||||||
| **Minimum** | None (0 is OK) | — |
|
|
||||||
|
|
||||||
**Config:** `/tr-continuous/curator_config.json`
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"timer_minutes": 30,
|
|
||||||
"max_batch_size": 100,
|
|
||||||
"user_id": "rob",
|
|
||||||
"source_collection": "memories_tr",
|
|
||||||
"target_collection": "gems_tr"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Cron:**
|
|
||||||
```
|
|
||||||
*/30 * * * * cd .../tr-continuous && python3 curator_timer.py
|
|
||||||
```
|
|
||||||
|
|
||||||
**Old modes deprecated:**
|
|
||||||
- ❌ Turn-based (every N turns)
|
|
||||||
- ❌ Hybrid (timer + turn)
|
|
||||||
- ❌ Daily batch (2:45 AM)
|
|
||||||
|
|
||||||
### 1. Real-Time Watcher (Primary Capture)
|
|
||||||
|
|
||||||
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py`
|
|
||||||
|
|
||||||
**Function:**
|
|
||||||
- Watches `/root/.openclaw/agents/main/sessions/*.jsonl`
|
|
||||||
- Parses every conversation turn in real-time
|
|
||||||
- Embeds with `snowflake-arctic-embed2` (Ollama @ <OLLAMA_IP>)
|
|
||||||
- Stores directly to `memories_tr` (no Redis)
|
|
||||||
- **Cleans content:** Removes markdown, tables, metadata, thinking tags
|
|
||||||
|
|
||||||
**Service:** `mem-qdrant-watcher.service`
|
|
||||||
- **Status:** Active since 16:46:53 CST
|
|
||||||
- **Systemd:** Enabled, auto-restart
|
|
||||||
|
|
||||||
**Log:** `journalctl -u mem-qdrant-watcher -f`
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 2. Content Cleaner (Existing Data)
|
|
||||||
|
|
||||||
**Location:** `~/.openclaw/workspace/skills/qdrant-memory/scripts/clean_memories_tr.py`
|
|
||||||
|
|
||||||
**Function:**
|
|
||||||
- Batch-cleans existing `memories_tr` points
|
|
||||||
- Removes: `**bold**`, `|tables|`, `` `code` ``, `---` rules, `# headers`
|
|
||||||
- Flattens nested content dicts
|
|
||||||
- Rate-limited to prevent Qdrant overload
|
|
||||||
|
|
||||||
**Usage:**
|
|
||||||
```bash
|
|
||||||
# Dry run (preview)
|
|
||||||
python3 clean_memories_tr.py --dry-run
|
|
||||||
|
|
||||||
# Clean all
|
|
||||||
python3 clean_memories_tr.py --execute
|
|
||||||
|
|
||||||
# Clean limited (test)
|
|
||||||
python3 clean_memories_tr.py --execute --limit 100
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### 3. Timer Curator (v2.2 - DEPLOYED)
|
|
||||||
|
|
||||||
**Replaces:** Daily curator (2:45 AM batch) and turn-based curator
|
|
||||||
|
|
||||||
**Location:** `~/.openclaw/workspace/.local_projects/true-recall-v2/tr-continuous/curator_timer.py`
|
|
||||||
|
|
||||||
**Schedule:** Every 30 minutes (cron)
|
|
||||||
|
|
||||||
**Flow:**
|
|
||||||
1. Query uncurated memories (`curated: false`)
|
|
||||||
2. Send batch to qwen3 (max 100)
|
|
||||||
3. Extract gems using curator prompt
|
|
||||||
4. Store gems to `gems_tr`
|
|
||||||
5. Mark processed memories as `curated: true`
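The five steps above can be sketched with in-memory stand-ins. This is not the code in `curator_timer.py`: the lists replace the Qdrant collections, `extract_gems` stands in for the qwen3 call, and every name here is hypothetical.

```python
def run_curation_pass(memories, gems, extract_gems, max_batch_size=100):
    """Run one curation pass and return the number of gems stored."""
    # Step 1: select uncurated memories, capped at the batch size (step 2).
    batch = [m for m in memories if not m.get("curated", False)][:max_batch_size]
    if not batch:
        return 0  # an empty batch is fine; the timer simply fires again later

    # Steps 2-3: hand the batch to the extractor (the LLM in the real system).
    new_gems = extract_gems(batch)

    # Step 4: store the extracted gems (gems_tr in the real system).
    gems.extend(new_gems)

    # Step 5: mark the processed memories so they are not picked up again.
    for memory in batch:
        memory["curated"] = True
    return len(new_gems)


if __name__ == "__main__":
    def keyword_extractor(batch):
        # Toy extractor: treat any memory mentioning "decided" as a gem.
        return [{"text": m["text"]} for m in batch if "decided" in m["text"]]

    store = [{"text": "I decided to use Qdrant", "curated": False},
             {"text": "hello", "curated": False}]
    found = []
    print(run_curation_pass(store, found, keyword_extractor), found)
```

In this sketch the memories are marked curated only after the gems are stored, so an extractor failure leaves the batch eligible for the next run.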
|
|
||||||
|
|
||||||
**Files:**
|
|
||||||
| File | Purpose |
|
|
||||||
|------|---------|
|
|
||||||
| `curator_timer.py` | Main curator script |
|
|
||||||
| `curator_config.json` | Adjustable settings |
|
|
||||||
| `migrate_add_curated.py` | One-time migration (completed) |
|
|
||||||
|
|
||||||
**Usage:**
|
|
||||||
```bash
|
|
||||||
# Dry run (preview)
|
|
||||||
python3 curator_timer.py --dry-run
|
|
||||||
|
|
||||||
# Manual run
|
|
||||||
python3 curator_timer.py --config curator_config.json
|
|
||||||
```
|
|
||||||
|
|
||||||
**Status:** ✅ Deployed, first run will process ~12,378 existing memories
|
|
||||||
|
|
||||||
### 5. Silent Compacting (NEW - Concept)
|
|
||||||
|
|
||||||
**Idea:** Automatically remove old context from prompt when token limit approached.
|
|
||||||
|
|
||||||
**Behavior:**
|
|
||||||
- Trigger: Context window > 80% full
|
|
||||||
- Action: Remove oldest messages (silently)
|
|
||||||
- Preserve: Gems always kept, recent N turns kept
|
|
||||||
- Result: Seamless conversation without "compacting" notification
|
|
||||||
|
|
||||||
**Config:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"compacting": {
|
|
||||||
"enabled": true,
|
|
||||||
"triggerAtPercent": 80,
|
|
||||||
"keepRecentTurns": 20,
|
|
||||||
"preserveGems": true,
|
|
||||||
"silent": true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
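Since this feature is concept-only, the following is purely an illustration of the behaviour described above. It assumes that each message carries a precomputed `tokens` count and an optional `is_gem` flag, and that one "turn" equals one message; none of these fields or names come from OpenClaw.

```python
def compact(messages, context_limit, trigger_at_percent=80, keep_recent_turns=20):
    """Drop the oldest unprotected messages once usage exceeds the trigger."""
    threshold = context_limit * trigger_at_percent / 100
    total = sum(m["tokens"] for m in messages)
    if total <= threshold:
        return list(messages)  # below the trigger: nothing to do

    # Messages at or after this index are the protected recent turns.
    protected_from = max(len(messages) - keep_recent_turns, 0)
    kept = []
    for index, message in enumerate(messages):
        removable = index < protected_from and not message.get("is_gem", False)
        if removable and total > threshold:
            total -= message["tokens"]  # silently drop this old message
            continue
        kept.append(message)
    return kept
```

The loop stops removing as soon as usage is back at or below the trigger, so it discards no more history than necessary.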
|
|
||||||
|
|
||||||
**Status:** ⏳ Concept only - requires OpenClaw core changes
|
|
||||||
|
|
||||||
### 6. memory-qdrant Plugin
|
|
||||||
|
|
||||||
**Location:** `/root/.openclaw/extensions/memory-qdrant/`
|
|
||||||
|
|
||||||
**Config:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"collectionName": "gems_tr",
|
|
||||||
"captureCollection": "memories_tr",
|
|
||||||
"autoRecall": true,
|
|
||||||
"autoCapture": true
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Function:**
|
|
||||||
- **Recall:** Searches `gems_tr`, injects as context (hidden)
|
|
||||||
- **Capture:** Session-level capture to `memories_tr` (backup)
|
|
||||||
|
|
||||||
**Status:** Loaded, dual collection support working
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Files & Locations
|
|
||||||
|
|
||||||
### Core Project Files
|
|
||||||
|
|
||||||
```
|
|
||||||
~/.openclaw/workspace/.local_projects/true-recall-v2/
|
|
||||||
├── README.md # Architecture docs
|
|
||||||
├── session.md # This file
|
|
||||||
├── curator-prompt.md # Gem extraction prompt
|
|
||||||
├── tr-daily/ # Daily batch curation
|
|
||||||
│ └── curate_from_qdrant.py # Daily curator (2:45 AM)
|
|
||||||
├── tr-continuous/ # Real-time curation (NEW)
|
|
||||||
│ ├── curator_by_count.py # Turn-based curator
|
|
||||||
│ ├── curator_turn_based.py # Alternative approach
|
|
||||||
│ ├── curator_cron.sh # Cron wrapper
|
|
||||||
│ ├── turn-curator.service # Systemd service
|
|
||||||
│ └── README.md # Documentation
|
|
||||||
└── shared/
|
|
||||||
└── (shared resources)
|
|
||||||
```
|
|
||||||
|
|
||||||
### New Files (2026-02-24 19:00)
|
|
||||||
|
|
||||||
| File | Purpose |
|
|
||||||
|------|---------|
|
|
||||||
| `tr-continuous/curator_timer.py` | Timer-based curator (deployed) |
|
|
||||||
| `tr-continuous/curator_config.json` | Curator settings |
|
|
||||||
| `tr-continuous/migrate_add_curated.py` | Migration script (completed) |
|
|
||||||
|
|
||||||
### Legacy Files (Pre-v2.2)
|
|
||||||
|
|
||||||
| File | Status | Note |
|
|
||||||
|------|--------|------|
|
|
||||||
| `tr-daily/curate_from_qdrant.py` | 📦 Archived | Replaced by timer |
|
|
||||||
| `tr-continuous/curator_by_count.py` | 📦 Archived | Replaced by timer |
|
|
||||||
| `tr-continuous/curator_turn_based.py` | 📦 Archived | Replaced by timer |
|
|
||||||
|
|
||||||
### System Locations
|
|
||||||
|
|
||||||
| File | Purpose |
|
|
||||||
|------|---------|
|
|
||||||
| `/root/.openclaw/extensions/memory-qdrant/` | Plugin code |
|
|
||||||
| `/root/.openclaw/openclaw.json` | Plugin configuration |
|
|
||||||
| `/etc/systemd/system/mem-qdrant-watcher.service` | Systemd service |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 🔥 CRITICAL FIXES APPLIED (2026-02-25 12:00-12:41 CST)
|
|
||||||
|
|
||||||
### Issues Found & Fixed Today
|
|
||||||
|
|
||||||
| Issue | Root Cause | Fix Applied |
|
|
||||||
|-------|------------|-------------|
|
|
||||||
| **Watcher stuck on old session** | Watcher only checked for new sessions when current file deleted | ✅ Restarted watcher, now follows current session (12:22) |
|
|
||||||
| **Plugin capture 0 exchanges** | OpenClaw changed to OpenAI content format (array), plugin expected string | ✅ Added `extractMessageText()` to parse content arrays (12:34) |
|
|
||||||
| **Session switching logic** | Old sessions persisted, watcher never switched | ✅ Fixed session detection logic in watcher |
|
|
||||||
| **Plugin content extraction** | `msg.content` is now array with `{type, text}` items | ✅ Extracts text from `type: "text"` items |
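The content-array handling described in the table above can be illustrated in Python. The plugin itself is TypeScript and its `extractMessageText()` source is not shown here, so this is a sketch under assumptions: the function name is a Python rendering, and joining parts with a newline is a guess.

```python
def extract_message_text(content) -> str:
    """Return the plain text of a message whose content is a string or an array."""
    if isinstance(content, str):
        return content  # older format: content is already a string
    if isinstance(content, list):
        # Newer format: a list of {type, text} items; keep only the text items.
        parts = [
            item["text"]
            for item in content
            if isinstance(item, dict)
            and item.get("type") == "text"
            and isinstance(item.get("text"), str)
        ]
        return "\n".join(parts)
    return ""  # unknown shape: treat as empty rather than raising


if __name__ == "__main__":
    message = [{"type": "text", "text": "first"}, {"type": "text", "text": "second"}]
    print(extract_message_text(message))
```

Accepting both shapes keeps older string-content sessions parseable alongside the array format.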
|
|
||||||
|
|
||||||
### Validation Results (2026-02-25 12:41)

```
memory-qdrant: parsed 17 user, 116 assistant messages, 9 exchanges
memory-qdrant: first msg role=user, contentType=array
```

**Before:** 0 exchanges extracted
**After:** 9 exchanges captured per session

### Components Status

| Component | Before | After | Status |
|-----------|--------|-------|--------|
| Real-time watcher | Stuck on Feb 24 session | Following current session | ✅ Fixed |
| Plugin capture | 0 exchanges | 9 exchanges | ✅ Fixed |
| Context injection | Working | Still working | ✅ Verified |

### Files Modified (2026-02-25)

| File | Change |
|------|--------|
| `extensions/memory-qdrant/index.ts` | Added `extractMessageText()` function, removed debug logging |
| `extensions/memory-qdrant/index.js` | Compiled TypeScript changes |
| `session.md` | This update |
| `function_check.md` | Added fixes section |

---

## Changes Made Today (2026-02-24 19:00)

### 1. Timer Curator Deployed (v2.2)

- Created `curator_timer.py`: simplified timer-based curation
- Created `curator_config.json`: adjustable settings
- Removed daily 2:45 AM cron job
- Added `*/30 * * * *` cron timer
- **Status:** ✅ Deployed, logs to `/var/log/true-recall-timer.log`

### 2. Migration Completed

- Created `migrate_add_curated.py`
- Tagged 12,378 existing memories with `curated: false`
- Updated watcher to add `curated: false` to new memories
- **Status:** ✅ Complete

### 3. Simplified Architecture

- ❌ Removed turn-based curator complexity
- ❌ Removed daily batch processing
- ✅ Single timer trigger every 30 minutes
- ✅ No minimum threshold (processes 0-N memories)

---

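The `curated: false` tag described above is what makes a threshold-free timer run possible: each run can simply ask for whatever is still uncurated. A minimal sketch of such a request body follows, assuming the curator selects work through Qdrant's scroll endpoint with a payload filter. The curator's real query is not shown in this file, and `build_uncurated_scroll_request` and its default batch size are hypothetical.

```python
from typing import Any, Dict


def build_uncurated_scroll_request(batch_size: int = 100) -> Dict[str, Any]:
    """Build a Qdrant scroll body that selects points with curated == false.

    Hypothetical helper; the default batch size is an assumption.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "filter": {
            "must": [
                # Payload field written by the watcher and the migration script.
                {"key": "curated", "match": {"value": False}},
            ]
        },
        "limit": batch_size,
        "with_payload": True,
    }
```

The body would be sent as JSON in a `POST` to `/collections/memories_tr/points/scroll`. An empty `points` list in the response then just means there is nothing to curate on that run.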
## Configuration

### memory-qdrant Plugin

**File:** `/root/.openclaw/openclaw.json`

```json
{
  "memory-qdrant": {
    "config": {
      "autoCapture": true,
      "autoRecall": true,
      "collectionName": "gems_tr",
      "captureCollection": "memories_tr",
      "embeddingModel": "snowflake-arctic-embed2",
      "maxRecallResults": 2,
      "minRecallScore": 0.7,
      "ollamaUrl": "http://<OLLAMA_IP>:11434",
      "qdrantUrl": "http://<QDRANT_IP>:6333"
    },
    "enabled": true
  }
}
```

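To make the two recall settings concrete, here is a rough Python sketch of what `minRecallScore` and `maxRecallResults` presumably do to a list of scored search hits. How the plugin actually applies them is not shown in this file; `select_recall_hits` and the hit shape (`{"score": ..., "payload": ...}`) are assumptions for illustration.

```python
from typing import Any, Dict, List


def select_recall_hits(
    hits: List[Dict[str, Any]],
    min_score: float = 0.7,
    max_results: int = 2,
) -> List[Dict[str, Any]]:
    """Keep hits scoring at least min_score, best first, capped at max_results."""
    kept = [hit for hit in hits if hit.get("score", 0.0) >= min_score]
    kept.sort(key=lambda hit: hit["score"], reverse=True)
    return kept[:max_results]


if __name__ == "__main__":
    sample = [
        {"score": 0.91, "payload": {"content": "gem A"}},
        {"score": 0.65, "payload": {"content": "gem B"}},
        {"score": 0.74, "payload": {"content": "gem C"}},
        {"score": 0.88, "payload": {"content": "gem D"}},
    ]
    for hit in select_recall_hits(sample):
        print(hit["score"], hit["payload"]["content"])
```

With the defaults from the config above, at most two gems with a score of 0.7 or higher would be injected per turn. The same effect could also be requested server-side through the `limit` and `score_threshold` fields of a Qdrant search request.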
### Gateway (OpenClaw Update Fix)

```json
{
  "gateway": {
    "controlUi": {
      "allowedOrigins": ["*"],
      "allowInsecureAuth": false,
      "dangerouslyDisableDeviceAuth": true
    }
  }
}
```

---

## Validation Commands

### Check Collections

```bash
# Points count
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
curl -s http://<QDRANT_IP>:6333/collections/gems_tr | jq '.result.points_count'

# Recent points
curl -s -X POST http://<QDRANT_IP>:6333/collections/memories_tr/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"limit": 5, "with_payload": true}' | jq '.result.points[].payload.content'
```

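For scripted checks, the point-count query above can also be done from Python with only the standard library. This is a sketch under the assumption that the collection-info response carries the count at `result.points_count`, as the `jq` filter above implies; `points_count` and `fetch_points_count` are hypothetical helper names.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def points_count(collection_info: Dict[str, Any]) -> int:
    """Read the point count out of a parsed collection-info response."""
    return int(collection_info["result"]["points_count"])


def fetch_points_count(base_url: str, collection: str, timeout: float = 5.0) -> int:
    """GET {base_url}/collections/{collection} and return its point count."""
    with urlopen(f"{base_url}/collections/{collection}", timeout=timeout) as resp:
        return points_count(json.load(resp))
```

A call such as `fetch_points_count(qdrant_url, "memories_tr")` would need `qdrant_url` set to the real Qdrant address in place of the `<QDRANT_IP>` placeholder.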
### Check Services

```bash
# Watcher status
sudo systemctl status mem-qdrant-watcher

# Watcher logs
sudo journalctl -u mem-qdrant-watcher -n 20

# OpenClaw status
openclaw status
```

---

## Troubleshooting

### Issue: Watcher Not Capturing

**Check:**
1. Service running? `systemctl status mem-qdrant-watcher`
2. Logs: `journalctl -u mem-qdrant-watcher -f`
3. Qdrant accessible? `curl http://<QDRANT_IP>:6333/`
4. Ollama accessible? `curl http://<OLLAMA_IP>:11434/api/tags`

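Steps 3 and 4 of the checklist above can be combined into one small reachability check. The sketch below uses only the standard library; `http_ok` and `check_endpoints` are hypothetical helpers, and treating any 2xx answer as "accessible" is an assumption.

```python
import http.client
from typing import Callable, Dict
from urllib.request import urlopen


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status, False on any failure."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, HTTP error statuses and malformed URLs.
        return False


def check_endpoints(
    endpoints: Dict[str, str],
    probe: Callable[[str], bool] = http_ok,
) -> Dict[str, bool]:
    """Probe each named endpoint and report which ones responded."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Passing the probe as a parameter keeps the function testable without a network. In real use the dictionary would map names such as `"qdrant"` and `"ollama"` to the two URLs from the checklist, with the placeholders replaced by actual addresses.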
### Issue: Cleaner Fails

**Common causes:**
- Qdrant connection timeout (add `time.sleep(0.1)` between batches)
- Nested content dicts (handled in updated script)
- Type errors (non-string content, also handled)

### Issue: Plugin Not Loading

**Check:**
1. `openclaw.json` syntax valid? `openclaw config validate`
2. Plugin compiled? `cd /root/.openclaw/extensions/memory-qdrant && npx tsc`
3. Gateway logs: `tail /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log`

---

## Cron Schedule (Updated v2.2)

| Time | Job | Script | Status |
|------|-----|--------|--------|
| Every 30 min | Timer curator | `tr-continuous/curator_timer.py` | ✅ Active |
| Per turn | Capture | `mem-qdrant-watcher` | ✅ Daemon |
| Per turn | Injection | `memory-qdrant` plugin | ✅ Active |

**Removed:**
- ❌ 2:45 AM daily curator
- ❌ Every-minute turn curator check

---

## Next Steps

### Immediate
- ⏳ Monitor first timer run (logs: `/var/log/true-recall-timer.log`)
- ⏳ Validate gem extraction quality from timer curator
- ⏳ Archive old curator scripts if timer works

### Completed ✅
- ✅ **Compactor config**: minimal overhead with `mode: default`, `reserveTokensFloor: 0`, `memoryFlush: false`

### Future
- ⏳ Curator tuning based on timer results
- ⏳ Silent compacting (requires OpenClaw core changes)

### Planned Features (Backlog)
- ⏳ **Interactive install script**: prompts for embedding model, timer interval, batch size, endpoints
- ⏳ **Single embedding model option**: use one model for both collections
- ⏳ **Configurable thresholds**: per-user customization via prompts

**Compactor Settings (Applied):**
```json5
{
  agents: {
    defaults: {
      compaction: {
        mode: "default",
        reserveTokensFloor: 0,
        memoryFlush: { enabled: false }
      }
    }
  }
}
```

**Note:** Only `mode`, `reserveTokensFloor`, and `memoryFlush` are valid under `agents.defaults.compaction`. Other settings are Pi runtime parameters.

**Install script prompts:**
1. Embedding model (snowflake vs mxbai)
2. Timer interval (5 min / 30 min / hourly)
3. Batch size (50 / 100 / 500)
4. Qdrant/Ollama URLs
5. User ID

---

## Session Recovery

If starting fresh:
1. Read `README.md` for architecture overview
2. Check service status: `sudo systemctl status mem-qdrant-watcher`
3. Check timer curator: `tail /var/log/true-recall-timer.log`
4. Verify collections: `curl http://<QDRANT_IP>:6333/collections`

---

*Last Verified: 2026-02-24 19:29 CST*
*Version: v2.2 (30b curator, install script planned)*

@@ -32,7 +32,7 @@ SCRIPT_DIR = Path(__file__).parent
 DEFAULT_CONFIG = SCRIPT_DIR / "curator_config.json"
 
 # Curator prompt path
-CURATOR_PROMPT_PATH = Path("/root/.openclaw/workspace/.local_projects/true-recall-v2/curator-prompt.md")
+CURATOR_PROMPT_PATH = Path("~/.openclaw/workspace/.local_projects/true-recall-v2/curator-prompt.md")
 
 
 def load_curator_prompt() -> str:
@@ -295,8 +295,8 @@ def main():
 
     config = load_config(args.config)
 
-    qdrant_url = os.getenv("QDRANT_URL", "http://10.0.0.40:6333")
-    ollama_url = os.getenv("OLLAMA_URL", "http://10.0.0.10:11434")
+    qdrant_url = os.getenv("QDRANT_URL", "http://<QDRANT_IP>:6333")
+    ollama_url = os.getenv("OLLAMA_URL", "http://<OLLAMA_IP>:11434")
 
     user_id = config.get("user_id", "rob")
     source_collection = config.get("source_collection", "memories_tr")

19
tr-worker/mem-qdrant-watcher.service
Normal file
@@ -0,0 +1,19 @@
[Unit]
Description=OpenClaw Real-Time Qdrant Memory Watcher
After=network.target

[Service]
Type=simple
User=<USER>
WorkingDirectory=<INSTALL_PATH>/tr-worker
Environment="QDRANT_URL=http://<QDRANT_IP>:6333"
Environment="QDRANT_COLLECTION=memories_tr"
Environment="OLLAMA_URL=http://<OLLAMA_IP>:11434"
Environment="EMBEDDING_MODEL=snowflake-arctic-embed2"
Environment="USER_ID=<USER_ID>"
ExecStart=/usr/bin/python3 <INSTALL_PATH>/tr-worker/realtime_qdrant_watcher.py --daemon
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

332
tr-worker/realtime_qdrant_watcher.py
Normal file
@@ -0,0 +1,332 @@
#!/usr/bin/env python3
"""
Real-time Qdrant Watcher: Monitors OpenClaw session JSONL and stores to Qdrant instantly.

This daemon watches the active session file, embeds each conversation turn,
and stores directly to Qdrant memories_tr collection (real-time, no Redis).

Usage:
    # Run as daemon
    python3 realtime_qdrant_watcher.py --daemon

    # Run once (process current session then exit)
    python3 realtime_qdrant_watcher.py --once

    # Test mode (print to stdout, don't write to Qdrant)
    python3 realtime_qdrant_watcher.py --dry-run

Systemd service:
    # Copy to /etc/systemd/system/mem-qdrant-watcher.service
    # systemctl enable --now mem-qdrant-watcher
"""

import os
import sys
import json
import time
import signal
import hashlib
import argparse
import requests
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Any, Optional, List

# Config - Set via environment variables or use placeholders
QDRANT_URL = os.getenv("QDRANT_URL", "http://<QDRANT_IP>:6333")
QDRANT_COLLECTION = os.getenv("QDRANT_COLLECTION", "memories_tr")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://<OLLAMA_IP>:11434")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "snowflake-arctic-embed2")
USER_ID = os.getenv("USER_ID", "<USER_ID>")

# Paths
SESSIONS_DIR = Path(os.getenv("SESSIONS_DIR", "~/.openclaw/agents/main/sessions")).expanduser()

# State
running = True
last_position = 0
current_file = None
turn_counter = 0


def signal_handler(signum, frame):
    """Handle shutdown gracefully."""
    global running
    print(f"\nReceived signal {signum}, shutting down...", file=sys.stderr)
    running = False


def get_embedding(text: str) -> Optional[List[float]]:
    """Get embedding vector from Ollama, or None if the request fails."""
    try:
        response = requests.post(
            f"{OLLAMA_URL}/api/embeddings",
            json={"model": EMBEDDING_MODEL, "prompt": text},
            timeout=30
        )
        response.raise_for_status()
        return response.json()["embedding"]
    except Exception as e:
        print(f"Error getting embedding: {e}", file=sys.stderr)
        return None


def clean_content(text: str) -> str:
    """Clean content - remove metadata, markdown, keep only plain text."""
    import re

    # Remove metadata JSON blocks
    text = re.sub(r'Conversation info \(untrusted metadata\):\s*```json\s*\{[\s\S]*?\}\s*```', '', text)

    # Remove thinking tags
    text = re.sub(r'\[thinking:[^\]]*\]', '', text)

    # Remove timestamp lines
    text = re.sub(r'\[\w{3} \d{4}-\d{2}-\d{2} \d{2}:\d{2} [A-Z]{3}\]', '', text)

    # Remove markdown tables
    text = re.sub(r'\|[^\n]*\|', '', text)  # Table rows
    text = re.sub(r'\|[-:]+\|', '', text)  # Table separators

    # Remove markdown formatting
    # Code blocks go first: the inline-code rule below would otherwise strip
    # one backtick from each fence and leave the block contents in place.
    text = re.sub(r'```[\s\S]*?```', '', text)  # Code blocks
    text = re.sub(r'\*\*([^*]+)\*\*', r'\1', text)  # Bold **text**
    text = re.sub(r'\*([^*]+)\*', r'\1', text)  # Italic *text*
    text = re.sub(r'`([^`]+)`', r'\1', text)  # Inline code `text`

    # Remove horizontal rules
    text = re.sub(r'---+', '', text)
    text = re.sub(r'\*\*\*+', '', text)

    # Remove excess whitespace
    text = re.sub(r'\n{3,}', '\n', text)
    text = re.sub(r'[ \t]+', ' ', text)  # Multiple spaces -> single

    return text.strip()


def store_to_qdrant(turn: Dict[str, Any], dry_run: bool = False) -> bool:
    """Store a single turn to Qdrant with embedding."""
    if dry_run:
        print(f"[DRY RUN] Would store turn {turn['turn']} ({turn['role']}): {turn['content'][:60]}...")
        return True

    # Get embedding
    vector = get_embedding(turn['content'])
    if vector is None:
        print(f"Failed to get embedding for turn {turn['turn']}", file=sys.stderr)
        return False

    # Prepare payload
    payload = {
        "user_id": turn.get('user_id', USER_ID),
        "role": turn['role'],
        "content": turn['content'],
        "turn": turn['turn'],
        "timestamp": turn.get('timestamp', datetime.now(timezone.utc).isoformat()),
        "date": datetime.now(timezone.utc).strftime('%Y-%m-%d'),
        "source": "realtime_watcher",
        "curated": False
    }

    # Generate point ID from user, turn number, and current wall-clock time.
    # Note: because the time is part of the hash, the ID is not stable across
    # re-runs, so re-processing the same turn creates a new point.
    turn_id = turn.get('turn', 0)
    hash_bytes = hashlib.sha256(f"{USER_ID}:turn:{turn_id}:{datetime.now().strftime('%H%M%S')}".encode()).digest()[:8]
    point_id = int.from_bytes(hash_bytes, byteorder='big') % (2**63)

    # Store to Qdrant
    try:
        response = requests.put(
            f"{QDRANT_URL}/collections/{QDRANT_COLLECTION}/points",
            json={
                "points": [{
                    "id": abs(point_id),
                    "vector": vector,
                    "payload": payload
                }]
            },
            timeout=30
        )
        response.raise_for_status()
        return True
    except Exception as e:
        print(f"Error writing to Qdrant: {e}", file=sys.stderr)
        return False


def get_current_session_file():
    """Find the most recently modified session JSONL file."""
    if not SESSIONS_DIR.exists():
        return None

    files = list(SESSIONS_DIR.glob("*.jsonl"))
    if not files:
        return None

    return max(files, key=lambda p: p.stat().st_mtime)


def parse_turn(line: str, session_name: str) -> Optional[Dict[str, Any]]:
    """Parse a single JSONL line into a turn dict."""
    global turn_counter

    try:
        entry = json.loads(line.strip())
    except json.JSONDecodeError:
        return None

    # OpenClaw format: {"type": "message", "message": {...}}
    if entry.get('type') != 'message' or 'message' not in entry:
        return None

    msg = entry['message']
    role = msg.get('role')

    # Skip tool results, system, developer messages
    if role in ('toolResult', 'system', 'developer'):
        return None

    if role not in ('user', 'assistant'):
        return None

    # Extract content
    content = ""
    if isinstance(msg.get('content'), list):
        for item in msg['content']:
            if isinstance(item, dict) and 'text' in item:
                content += item['text']
    elif isinstance(msg.get('content'), str):
        content = msg['content']

    if not content:
        return None

    # Clean content
    content = clean_content(content)
    if not content or len(content) < 5:
        return None

    turn_counter += 1

    return {
        'turn': turn_counter,
        'role': role,
        'content': content[:2000],  # Limit size
        'timestamp': entry.get('timestamp', datetime.now(timezone.utc).isoformat()),
        'user_id': USER_ID
    }


def process_new_lines(f, session_name: str, dry_run: bool = False):
    """Process any new lines added to the file."""
    global last_position

    f.seek(last_position)

    for line in f:
        line = line.strip()
        if not line:
            continue

        turn = parse_turn(line, session_name)
        if turn:
            if store_to_qdrant(turn, dry_run):
                print(f"✅ Turn {turn['turn']} ({turn['role']}) → Qdrant")

    last_position = f.tell()


def watch_session(session_file: Path, dry_run: bool = False):
    """Watch a specific session file for new lines."""
    global last_position, turn_counter

    session_name = session_file.name.replace('.jsonl', '')
    print(f"Watching session: {session_file.name}")

    try:
        with open(session_file, 'r') as f:
            for line in f:
                turn_counter += 1
        last_position = session_file.stat().st_size
        print(f"Session has {turn_counter} existing turns, starting from position {last_position}")
    except Exception as e:
        print(f"Warning: Could not read existing turns: {e}", file=sys.stderr)
        last_position = 0

    with open(session_file, 'r') as f:
        while running:
            if not session_file.exists():
                print("Session file removed, looking for new session...")
                return None

            process_new_lines(f, session_name, dry_run)
            time.sleep(0.1)

    return session_file


def watch_loop(dry_run: bool = False):
    """Main watch loop - handles session rotation."""
    # last_position must be declared global here; otherwise the reset below
    # only creates a local variable and never touches the module state.
    global current_file, turn_counter, last_position

    while running:
        session_file = get_current_session_file()

        if session_file is None:
            print("No active session found, waiting...")
            time.sleep(1)
            continue

        if current_file != session_file:
            print(f"\nNew session detected: {session_file.name}")
            current_file = session_file
            turn_counter = 0
            last_position = 0

        result = watch_session(session_file, dry_run)

        if result is None:
            current_file = None
            time.sleep(0.5)


def main():
    global USER_ID

    parser = argparse.ArgumentParser(
        description="Real-time OpenClaw session watcher → Qdrant"
    )
    parser.add_argument("--daemon", "-d", action="store_true", help="Run as daemon")
    parser.add_argument("--once", "-o", action="store_true", help="Process once then exit")
    parser.add_argument("--dry-run", "-n", action="store_true", help="Don't write to Qdrant")
    parser.add_argument("--user-id", "-u", default=USER_ID, help=f"User ID (default: {USER_ID})")

    args = parser.parse_args()

    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    if args.user_id:
        USER_ID = args.user_id

    print(f"🔍 Real-time Qdrant Watcher")
    print(f"📍 Qdrant: {QDRANT_URL}/{QDRANT_COLLECTION}")
    print(f"🧠 Ollama: {OLLAMA_URL}/{EMBEDDING_MODEL}")
    print(f"👤 User: {USER_ID}")
    print(f"📝 Sessions: {SESSIONS_DIR}")
    print()

    if args.once:
        print("Running once...")
        session_file = get_current_session_file()
        if session_file:
            watch_session(session_file, args.dry_run)
        else:
            print("No session found")
    else:
        print("Running as daemon (Ctrl+C to stop)...")
        watch_loop(args.dry_run)


if __name__ == "__main__":
    main()