Compare commits
25 Commits
c780a24847
...
master
| SHA1 |
|---|
| 2069949a8d |
| caebfb4b25 |
| 1bb4b1daaa |
| 87aab95ba8 |
| 8bf10326db |
| 4e1d432b02 |
| 402c03f647 |
| eb9bddab69 |
| f238812ce1 |
| ead210b565 |
| a194702a8a |
| cdcfe2f51a |
| b954064502 |
| ba4a5fd63d |
| 29a1ade004 |
| 2f2d93ce7f |
| 62251f5566 |
| 501872d46e |
| 6a32cedb5a |
| 04953bc38b |
| d943b9d87e |
| 808c021d15 |
| 5ea614b212 |
| e1962887a5 |
| d88ff6cea3 |
33
.gitignore
vendored
Normal file
@@ -0,0 +1,33 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/

# Environment
.env
.env.*
.venv/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# OS
.DS_Store
Thumbs.db

# Session notes (local only)
session.md
*.session.md

# Logs
*.log
logs/
761
.local_projects/openclaw-true-recall-base/README.md
Normal file
@@ -0,0 +1,761 @@
# TrueRecall Base

**Purpose:** Real-time memory capture → Qdrant `memories_tr`

**Status:** ✅ Standalone capture system

---

## Overview

TrueRecall Base is the **foundation**. It watches OpenClaw sessions in real-time and stores every turn to Qdrant's `memories_tr` collection.

This is **required** for both addons: **Gems** and **Blocks**.

**Base does NOT include:**
- ❌ Curation (gem extraction)
- ❌ Topic clustering (blocks)
- ❌ Injection (context recall)

**For those features, install an addon after base.**

---

## Requirements

**Vector Database**

TrueRecall Base requires a vector database to store conversation embeddings. This can be:
- **Local** - Self-hosted Qdrant (recommended for privacy)
- **Cloud** - Managed Qdrant Cloud or similar service
- **Any IP-accessible** Qdrant instance

In this version, we use a **local Qdrant database** (`http://<QDRANT_IP>:6333`). The database must be reachable from the machine running the watcher daemon.

**Additional Requirements:**
- **Ollama** - For generating text embeddings (local or remote)
- **OpenClaw** - The session files to monitor
- **Linux systemd** - For running the watcher as a service

---

## Gotchas & Known Limitations

> ⚠️ **Embedding Dimensions:** `snowflake-arctic-embed2` outputs **1024 dimensions**, not 768. Ensure your Qdrant collection is configured with `"size": 1024`.

> ⚠️ **Hardcoded Sessions Path:** `SESSIONS_DIR` is hardcoded to `/root/.openclaw/agents/main/sessions`. To use a different path, modify `realtime_qdrant_watcher.py` to read from an environment variable:
> ```python
> SESSIONS_DIR = Path(os.getenv("OPENCLAW_SESSIONS_DIR", "/root/.openclaw/agents/main/sessions"))
> ```

---

## Three-Tier Architecture

```
true-recall-base (REQUIRED)
├── Core: Watcher daemon
└── Stores: memories_tr
    │
    ├──▶ true-recall-gems (ADDON)
    │    ├── Curator extracts gems → gems_tr
    │    └── Plugin injects gems into prompts
    │
    └──▶ true-recall-blocks (ADDON)
         ├── Topic clustering → topic_blocks_tr
         └── Contextual block retrieval

Note: Gems and Blocks are INDEPENDENT addons.
They both require Base, but don't work together.
Choose one: Gems OR Blocks (not both).
```

---

## Quick Start

### Option 1: Quick Install (Recommended)

```bash
cd /path/to/true-recall-base
./install.sh
```

#### What the Installer Does (Step-by-Step)

The `install.sh` script automates the entire setup process. Here's exactly what happens:

**Step 1: Interactive Configuration**
```
Configuration (press Enter for defaults):

Examples:
Qdrant: 10.0.0.40:6333 (remote) or localhost:6333 (local)
Ollama: 10.0.0.10:11434 (remote) or localhost:11434 (local)

Qdrant host:port [localhost:6333]: _
Ollama host:port [localhost:11434]: _
User ID [user]: _
```
- Prompts for Qdrant host:port (default: `localhost:6333`)
- Prompts for Ollama host:port (default: `localhost:11434`)
- Prompts for User ID (default: `user`)
- Press Enter to accept defaults, or type custom values

**Step 2: Configuration Confirmation**
```
Configuration:
Qdrant: http://localhost:6333
Ollama: http://localhost:11434
User ID: user

Proceed? [Y/n]: _
```
- Shows the complete configuration
- Asks for confirmation (type `n` to cancel, Enter or `Y` to proceed)
- Exits cleanly if cancelled, no changes made

**Step 3: Systemd Service Generation**
- Creates a temporary service file at `/tmp/mem-qdrant-watcher.service`
- Inserts your configuration values (IPs, ports, user ID)
- Uses absolute path for the script location (handles spaces in paths)
- Sets up automatic restart on failure

**Step 4: Service Installation**
```bash
sudo cp /tmp/mem-qdrant-watcher.service /etc/systemd/system/
sudo systemctl daemon-reload
```
- Copies the service file to systemd directory
- Reloads systemd to recognize the new service

**Step 5: Service Activation**
```bash
sudo systemctl enable --now mem-qdrant-watcher
```
- Enables the service to start on boot (`enable`)
- Starts the service immediately (`--now`)

**Step 6: Verification**
```
==========================================
Installation Complete!
==========================================

Status:
● mem-qdrant-watcher.service - TrueRecall Base...
Active: active (running)
```
- Displays the service status
- Shows it's active and running
- Provides commands to verify and monitor

**Post-Installation Commands:**
```bash
# Check service status anytime
sudo systemctl status mem-qdrant-watcher

# View live logs
sudo journalctl -u mem-qdrant-watcher -f

# Verify Qdrant collection
curl -s http://localhost:6333/collections/memories_tr | jq '.result.points_count'
```

#### Installer Requirements
- Must run as root or with sudo (for systemd operations)
- Must have execute permissions (`chmod +x install.sh`)
- Script must be run from the true-recall-base directory

### Option 2: Manual Install

```bash
cd /path/to/true-recall-base

# Copy service file
sudo cp watcher/mem-qdrant-watcher.service /etc/systemd/system/

# Edit the service file to set your IPs and user
sudo nano /etc/systemd/system/mem-qdrant-watcher.service

# Reload and start
sudo systemctl daemon-reload
sudo systemctl enable --now mem-qdrant-watcher
```

### Verify Installation

```bash
# Check service status
sudo systemctl status mem-qdrant-watcher

# Check collection
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
```

---

## Files

| File | Purpose |
|------|---------|
| `watcher/realtime_qdrant_watcher.py` | Capture daemon |
| `watcher/mem-qdrant-watcher.service` | Systemd service |
| `config.json` | Configuration template |

---

## Configuration

Edit `config.json` or set environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `QDRANT_URL` | `http://<QDRANT_IP>:6333` | Qdrant endpoint |
| `OLLAMA_URL` | `http://<OLLAMA_IP>:11434` | Ollama endpoint |
| `EMBEDDING_MODEL` | `snowflake-arctic-embed2` | Embedding model |
| `USER_ID` | `<USER_ID>` | User identifier |
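
As a rough illustration of the table above, the four variables could be read once at startup. This is a minimal sketch, not the watcher's confirmed code: the `WatcherConfig` class and `load_config` function are hypothetical names, and the fallback values mirror the installer prompts rather than any verified built-in defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WatcherConfig:
    """Hypothetical container for the four documented settings."""
    qdrant_url: str
    ollama_url: str
    embedding_model: str
    user_id: str


def load_config(env: Mapping[str, str] = os.environ) -> WatcherConfig:
    """Read the documented variables, falling back to assumed defaults."""
    return WatcherConfig(
        # Assumed defaults (taken from the installer prompts), with any
        # trailing slash removed so URLs can be joined with f-strings.
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333").rstrip("/"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434").rstrip("/"),
        embedding_model=env.get("EMBEDDING_MODEL", "snowflake-arctic-embed2"),
        user_id=env.get("USER_ID", "user"),
    )
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dict instead of the real process environment.
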

---

## How It Works

### Architecture Overview

```
┌─────────────────┐     ┌─────────────────────┐     ┌─────────────────┐
│  OpenClaw Chat  │────▶│    Session JSONL    │────▶│  Base Watcher   │
│  (You talking)  │     │ (/sessions/*.jsonl) │     │  (This daemon)  │
└─────────────────┘     └─────────────────────┘     └────────┬────────┘
                                                             │
                                                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         PROCESSING PIPELINE                         │
│ ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌───────────┐ │
│ │  Watch File  │─▶│  Parse Turn  │─▶│  Clean Text  │─▶│   Embed   │ │
│ │  (inotify)   │  │ (JSON→dict)  │  │  (strip md)  │  │ (Ollama)  │ │
│ └──────────────┘  └──────────────┘  └──────────────┘  └─────┬─────┘ │
│                                                             │       │
│        ┌────────────────────────────────────────────────────┘       │
│        │                                                            │
│        ▼                                                            │
│ ┌──────────────┐  ┌──────────────┐                                  │
│ │   Store to   │─▶│    Qdrant    │                                  │
│ │ memories_tr  │  │ (vector DB)  │                                  │
│ └──────────────┘  └──────────────┘                                  │
└─────────────────────────────────────────────────────────────────────┘
```

### Step-by-Step Process

#### Step 1: File Watching

The watcher monitors OpenClaw session files in real-time:

```python
# From realtime_qdrant_watcher.py
SESSIONS_DIR = Path("/root/.openclaw/agents/main/sessions")
```

> ⚠️ **Known Limitation:** `SESSIONS_DIR` is currently hardcoded. To use a different path, patch the watcher script to read from an environment variable (e.g., `os.getenv("OPENCLAW_SESSIONS_DIR", "/root/.openclaw/agents/main/sessions")`).

**What happens:**
- Uses `inotify` or polling to watch the sessions directory
- Automatically detects the most recently modified `.jsonl` file
- Handles session rotation (when OpenClaw starts a new session)
- Maintains position in file to avoid re-processing old lines
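
The polling variant of these steps can be sketched as follows. This is an illustration under stated assumptions, not the watcher's actual code: `latest_session_file` and `read_new_lines` are hypothetical helper names, and the sketch tracks a byte offset rather than using `inotify`.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def latest_session_file(sessions_dir: Path) -> Optional[Path]:
    """Return the most recently modified .jsonl file, or None if there is none."""
    candidates = list(sessions_dir.glob("*.jsonl"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def read_new_lines(path: Path, offset: int) -> Tuple[List[str], int]:
    """Read complete lines appended after byte `offset`.

    Returns the non-empty lines and the offset to resume from next time.
    """
    with path.open("rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume only up to the last newline, so a half-written line is
    # picked up on a later call instead of being parsed while incomplete.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").split("\n")
    return [line for line in lines if line.strip()], offset + end
```

A caller would keep the returned offset per file and reset it to 0 when `latest_session_file` starts returning a different path, which corresponds to the session rotation case.
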

#### Step 2: Turn Parsing

Each conversation turn is extracted from the JSONL file. An example session file entry:

```json
{
  "type": "message",
  "message": {
    "role": "user",
    "content": "Hello, can you help me?",
    "timestamp": "2026-02-27T09:30:00Z"
  }
}
```

**What happens:**
- Reads new lines appended to the session file
- Parses JSON to extract role (user/assistant/system)
- Extracts content text
- Captures timestamp
- Generates unique turn ID from content hash + timestamp

**Code flow:**
```python
import hashlib
import json
import os
from typing import Dict, Optional


def parse_turn(line: str) -> Optional[Dict]:
    """Parse one JSONL line into a turn dict, or return None to skip it."""
    data = json.loads(line)
    if data.get("type") != "message":
        return None  # Skip non-message entries

    # Field layout follows the example entry above
    message = data.get("message", {})
    role = message.get("role")
    content = message.get("content", "")
    timestamp = message.get("timestamp", "")

    return {
        "id": hashlib.md5(f"{content}{timestamp}".encode()).hexdigest()[:16],
        "role": role,
        "content": content,
        "timestamp": timestamp,
        "user_id": os.getenv("USER_ID", "default"),
    }
```
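
One detail worth checking when the turn ID is reused as the Qdrant point ID: Qdrant documents point IDs as either unsigned 64-bit integers or UUIDs, so a bare 16-character hex string may be rejected. Whether the watcher converts the ID before storing is not shown in this README. The sketch below is one possible approach, with `turn_point_id` as a hypothetical helper: it keeps the same content + timestamp hash but formats the full MD5 digest as a UUID.

```python
import hashlib
import uuid


def turn_point_id(content: str, timestamp: str) -> str:
    """Derive a deterministic UUID string from the content + timestamp hash."""
    # An MD5 hex digest is 32 hex characters, exactly the size of a UUID.
    digest = hashlib.md5(f"{content}{timestamp}".encode()).hexdigest()
    return str(uuid.UUID(hex=digest))
```

Because the ID is derived only from the turn's content and timestamp, storing the same turn twice would overwrite the existing point rather than create a duplicate.
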

#### Step 3: Content Cleaning

Before storage, content is normalized:

**Strips:**
- Markdown tables (`| column | column |`)
- Bold/italic markers (`**text**`, `*text*`)
- Inline code (`` `code` ``)
- Code blocks (```code```)
- Multiple consecutive spaces
- Leading/trailing whitespace

**Example:**
```
Input: "Check this **important** table: | col1 | col2 |"
Output: "Check this important table"
```

**Why:** Clean text improves embedding quality and searchability.
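
The stripping rules listed above could be approximated with a handful of regular expressions. This is a sketch under assumptions, not the watcher's actual cleaner: `clean_content` is a hypothetical name, inline code is assumed to keep its text while losing the backticks, and punctuation such as the colon in the README example is left in place, so the output is only approximately the same.

```python
import re


def clean_content(text: str) -> str:
    """Strip common Markdown markup and normalize whitespace."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                 # inline code: drop backticks
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)             # bold markers
    text = re.sub(r"\*(.+?)\*", r"\1", text)                 # italic markers
    text = re.sub(r"\|[^\n]*\|", " ", text)                  # table rows, first pipe to last
    text = re.sub(r"[ \t]+", " ", text)                      # runs of spaces and tabs
    return text.strip()
```

The order matters: code fences are removed before inline code so that triple backticks are not mistaken for inline markers, and bold is handled before italic for the same reason.
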

#### Step 4: Embedding Generation

The cleaned content is converted to a vector embedding:

```python
import requests
from typing import List

# OLLAMA_URL and EMBEDDING_MODEL come from the configuration above
def get_embedding(text: str) -> List[float]:
    response = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": EMBEDDING_MODEL, "prompt": text}
    )
    return response.json()["embedding"]
```

**What happens:**
- Sends text to the Ollama API at `OLLAMA_URL` (for example `http://10.0.0.10:11434`)
- Uses `snowflake-arctic-embed2` model
- Returns **1024-dimensional vector** (not 768)
- Logs the error and skips the turn if Ollama is unavailable (see Error Handling)
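
The snippet above has no timeout or error path, so the behavior when Ollama is unreachable is not visible in it. One way that behavior might look is sketched below, assuming the same `/api/embeddings` endpoint; the function name `get_embedding_or_none`, its parameters, and the dimension check are illustrative assumptions, and it uses only the standard library rather than `requests`.

```python
import json
import logging
import urllib.request
from typing import List, Optional


def get_embedding_or_none(
    text: str,
    ollama_url: str = "http://localhost:11434",  # assumed default
    model: str = "snowflake-arctic-embed2",
    expected_dim: int = 1024,
    timeout: float = 10.0,
) -> Optional[List[float]]:
    """Return the embedding, or None if Ollama is unreachable or replies oddly."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    try:
        request = urllib.request.Request(
            f"{ollama_url}/api/embeddings",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            vector = json.load(response)["embedding"]
    except (OSError, ValueError, KeyError) as exc:
        # OSError covers connection failures, HTTP errors and timeouts;
        # ValueError covers bad URLs and invalid JSON.
        logging.warning("embedding failed, skipping turn: %s", exc)
        return None
    if len(vector) != expected_dim:
        logging.warning("unexpected embedding size %d", len(vector))
        return None
    return vector
```

Returning `None` lets the caller skip the turn and keep watching, which matches the "logs error, skips turn" behavior described under Error Handling.
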

#### Step 5: Qdrant Storage

The complete turn data is stored to Qdrant:

```python
payload = {
    "user_id": user_id,
    "role": turn["role"],
    "content": cleaned_content[:2000],  # Size limit
    "timestamp": turn["timestamp"],
    "session_id": session_id,
    "source": "true-recall-base"
}

requests.put(
    f"{QDRANT_URL}/collections/memories_tr/points",
    json={"points": [{"id": turn_id, "vector": embedding, "payload": payload}]}
)
```

**Storage format:**
| Field | Type | Description |
|-------|------|-------------|
| `user_id` | string | User identifier |
| `role` | string | user/assistant/system |
| `content` | string | Cleaned text (max 2000 chars) |
| `timestamp` | string | ISO 8601 timestamp |
| `session_id` | string | Source session file |
| `source` | string | "true-recall-base" |

### Real-Time Performance

| Metric | Target | Actual |
|--------|--------|--------|
| Latency | < 500ms | ~100-200ms |
| Throughput | > 10 turns/sec | > 50 turns/sec |
| Embedding time | < 300ms | ~50-100ms |
| Qdrant write | < 100ms | ~10-50ms |

### Session Rotation Handling

When OpenClaw starts a new session:

1. New `.jsonl` file created in sessions directory
2. Watcher detects file change via `inotify`
3. Identifies most recently modified file
4. Switches to watching new file
5. Continues from position 0 of new file
6. Turns from the old file remain in `memories_tr` (already captured)

### Error Handling

**Qdrant unavailable:**
- Retries with exponential backoff
- Logs error, continues watching
- Next turn attempts storage again

**Ollama unavailable:**
- Cannot generate embeddings
- Logs error, skips turn
- Continues watching (the session file itself is untouched, so no data is lost there)

**File access errors:**
- Handles permission issues gracefully
- Retries on temporary failures
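
The "retries with exponential backoff" behavior can be sketched generically. This is an illustration rather than the watcher's implementation: `retry_with_backoff`, the attempt count, and the base delay are all assumed values chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `operation` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:  # deliberately broad: this is only a sketch
            if attempt == attempts - 1:
                print(f"giving up after {attempts} attempts: {exc}")
                return None
            # Waits of base_delay, 2x, 4x, ... between attempts.
            sleep(base_delay * (2 ** attempt))
    return None
```

A storage call could then be wrapped as `retry_with_backoff(lambda: store_turn(turn))`, where `store_turn` stands for whatever function performs the Qdrant write. The `sleep` parameter exists so the delays can be observed without actually waiting.
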

### Collection Schema

**Qdrant collection: `memories_tr`**

```python
{
    "name": "memories_tr",
    "vectors": {
        "size": 1024,           # snowflake-arctic-embed2 dimension (1024, not 768)
        "distance": "Cosine"    # Similarity metric
    },
    "payload_schema": {
        "user_id": "keyword",       # Filterable
        "role": "keyword",          # Filterable
        "timestamp": "datetime",    # Range filterable
        "content": "text"           # Full-text searchable
    }
}
```
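
The schema above is a summary rather than a single API request. In Qdrant's REST API, the collection is created with a `PUT` to `/collections/{name}`, and each payload index is created separately with a `PUT` to `/collections/{name}/index`. The sketch below only builds those request descriptions so they can be inspected or sent with any HTTP client; `collection_setup_requests` is a hypothetical helper, and whether the watcher or installer creates the collection this way is not stated in this README.

```python
from typing import Any, Dict, List, Tuple

# (HTTP method, URL path relative to QDRANT_URL, JSON body)
Request = Tuple[str, str, Dict[str, Any]]

PAYLOAD_INDEXES = [
    ("user_id", "keyword"),
    ("role", "keyword"),
    ("timestamp", "datetime"),
    ("content", "text"),
]


def collection_setup_requests(name: str = "memories_tr", size: int = 1024) -> List[Request]:
    """Describe the requests that create the collection and its payload indexes."""
    setup: List[Request] = [
        ("PUT", f"/collections/{name}", {"vectors": {"size": size, "distance": "Cosine"}}),
    ]
    for field_name, field_schema in PAYLOAD_INDEXES:
        setup.append(
            ("PUT", f"/collections/{name}/index",
             {"field_name": field_name, "field_schema": field_schema})
        )
    return setup
```

Keeping `size` at 1024 matches the `snowflake-arctic-embed2` dimension noted under Gotchas; a collection created with a different size would reject the watcher's vectors.
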

### Security Notes

- **No credential storage** in code
- All sensitive values via environment variables
- `USER_ID` isolates memories per user
- Content cleaning strips Markdown formatting only; it is not a PII filter, so review your data
- HTTPS recommended for production Qdrant/Ollama

---

## Using Memories with OpenClaw

### The "q" Command

**"q"** refers to your Qdrant memory system (`memories_tr` collection).

When interacting with OpenClaw agents, you can search your stored memories using:
- `search q <topic>` - Semantic search for past conversations
- `q <topic>` - Shortcut for the same
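
How OpenClaw wires `search q` to Qdrant is not shown in this README. As a rough sketch of what such a lookup involves, the topic would be embedded with the same model and sent to Qdrant's `POST /collections/memories_tr/points/search` endpoint, filtered by `user_id`. The helper below, `build_search_request`, is a hypothetical name and only builds the request body.

```python
from typing import Any, Dict, List


def build_search_request(query_vector: List[float], user_id: str, limit: int = 5) -> Dict[str, Any]:
    """Build a Qdrant search body restricted to one user's memories."""
    return {
        "vector": query_vector,   # embedding of the search topic
        "limit": limit,           # number of closest turns to return
        "with_payload": True,     # include role, content, timestamp, ...
        "filter": {
            "must": [{"key": "user_id", "match": {"value": user_id}}],
        },
    }
```

The query vector has to come from the same embedding model used at capture time (1024 dimensions for `snowflake-arctic-embed2`), otherwise the similarity scores are meaningless or the request is rejected.
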

### Context Injection Instructions

**For OpenClaw System Prompt:**

Add these lines to your agent's system context to enable memory-aware responses:

````
## Memory System (q)

**"q" = Qdrant collection `memories_tr`**: your conversation history database.

### Memory Retrieval Rules

**Before saying "I don't know" or "I can't do that":**
1. **ALWAYS search q first** using the topic/keywords from the user's request
2. Incorporate findings INTO your response (not as footnotes)
3. Reference specific dates/details: "Based on our Feb 27th discussion..."

**Example workflow:**
```
User asks about X → Search q for X → Use retrieved memories → Answer
```

**WRONG:**
> "I searched Qdrant and found X. [Generic answer unrelated to X]"

**RIGHT:**
> "You asked me to fix this on Feb 27th. Do you want me to apply the fix now?"

### When to Search q

**ALWAYS search automatically when:**
- Question references past events, conversations, or details
- User asks "remember when...", "what did we discuss...", "what did I tell you..."
- You're unsure if you have relevant context
- ANY question about configuration, memories, or past interactions

**DO NOT search for:**
- General knowledge questions you can answer directly
- Current time, weather, or factual queries
- Simple requests like "check my email" or "run a command"
- When you already have sufficient context in the conversation
````

### Search Priority

| Order | Source | When to Use |
|-------|--------|-------------|
| 1 | **q (Qdrant)** | First - semantic search of all conversations |
| 2 | `memory/` files | Fallback if q yields no results |
| 3 | Web search | Last resort |
| 4 | "I don't know" | Only after all above |

---

## Next Step

### ✅ Base is Complete

**You don't need to upgrade.** TrueRecall Base is a **fully functional, standalone memory system**. If you're happy with real-time capture and manual search via the `q` command, you can stop here.

Base gives you:
- ✅ Complete conversation history in Qdrant
- ✅ Semantic search via `search q <topic>`
- ✅ Full-text search capabilities
- ✅ Permanent storage of all conversations

**Upgrade only if** you want automatic context injection into prompts.

---

### Optional Addons

Install an **addon** for automatic curation and injection:

| Addon | Purpose | Status |
|-------|---------|--------|
| **Gems** | Extracts atomic gems from memories, injects into context | 🚧 Coming Soon |
| **Blocks** | Topic clustering, contextual block retrieval | 🚧 Coming Soon |

### Upgrade Paths

Once Base is running, you have two upgrade options:

#### Option 1: Gems (Atomic Memory)
**Best for:** Conversational context, quick recall

- **Curator** extracts "gems" (key insights) from `memories_tr`
- Stores curated gems in `gems_tr` collection
- **Injection plugin** recalls relevant gems into prompts automatically
- Optimized for: Chat assistants, help bots, personal memory

**Workflow:**
```
memories_tr → Curator → gems_tr → Injection → Context
```

#### Option 2: Blocks (Topic Clustering)
**Best for:** Document organization, topic-based retrieval

- Clusters conversations by topic automatically
- Creates `topic_blocks_tr` collection
- Retrieves entire contextual blocks on query
- Optimized for: Knowledge bases, document systems

**Workflow:**
```
memories_tr → Topic Engine → topic_blocks_tr → Retrieval → Context
```

**Note:** Gems and Blocks are **independent** addons. They both require Base, but you choose one based on your use case.

---

## Updating / Patching

If you already have TrueRecall Base installed and need to apply a bug fix or update:

### Quick Update (v1.2 Patch)

**Applies to:** Session file detection fix (picks wrong file when multiple sessions active)

```bash
# 1. Backup current watcher
cp /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
   /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py.bak.$(date +%Y%m%d)

# 2. Download latest watcher (choose one source)

# Option A: From GitHub
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
   https://raw.githubusercontent.com/speedyfoxai/openclaw-true-recall-base/master/watcher/realtime_qdrant_watcher.py

# Option B: From GitLab
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
   https://gitlab.com/mdkrush/true-recall-base/-/raw/master/watcher/realtime_qdrant_watcher.py

# Option C: From local git (if cloned)
cp /path/to/true-recall-base/watcher/realtime_qdrant_watcher.py \
   /root/.openclaw/workspace/skills/qdrant-memory/scripts/

# 3. Stop old watcher
pkill -f realtime_qdrant_watcher

# 4. Start new watcher
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon

# 5. Verify
ps aux | grep watcher
lsof -p $(pgrep -f realtime_qdrant_watcher) | grep jsonl
```

### Update with Git (If Cloned)

```bash
cd /path/to/true-recall-base
git pull origin master

# Copy updated files
cp watcher/realtime_qdrant_watcher.py \
   /root/.openclaw/workspace/skills/qdrant-memory/scripts/

# Copy optional: backfill script
cp scripts/backfill_memory_to_q.py \
   /root/.openclaw/workspace/skills/qdrant-memory/scripts/ 2>/dev/null || true

# Restart watcher
sudo systemctl restart mem-qdrant-watcher
# OR manually:
pkill -f realtime_qdrant_watcher
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon
```

### Verify Update Applied

```bash
# Check version in file
grep "v1.2" /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py

# Verify watcher is running
ps aux | grep realtime_qdrant_watcher

# Confirm watching main session (not subagent)
lsof -p $(pgrep -f realtime_qdrant_watcher) | grep jsonl

# Check recent captures in Qdrant
curl -s "http://10.0.0.40:6333/collections/memories_tr/points/scroll" \
   -H "Content-Type: application/json" \
   -d '{"limit": 3, "with_payload": true}' | jq -r '.result.points[].payload.timestamp'
```

### What's New in v1.2

| Feature | Benefit |
|---------|---------|
| **Priority-based session detection** | Always picks `agent:main:main` first |
| **Lock file validation** | Ignores stale/crashed session locks via PID check |
| **Inactive subagent filtering** | Skips sessions with `sessionFile=null` |
| **Backfill script** | Import historical memories from markdown files |

**No config changes required** - existing `config.json` works unchanged.

---

**Prerequisite for:** TrueRecall Gems, TrueRecall Blocks

---

## Upgrading from Older Versions

This section covers full upgrades from older TrueRecall Base installations to the current version.

### Version History

| Version | Key Changes |
|---------|-------------|
| **v1.0** | Initial release - basic watcher |
| **v1.1** | Session detection improvements |
| **v1.2** | Priority-based session detection, lock file validation, backfill script |
| **v1.3** | Offset persistence (resumes from last position), fixes duplicate processing |
| **v1.4** | Current version - Memory backfill fix (Qdrant ids field), improved error handling |

### Upgrade Paths

#### From v1.0/v1.1/v1.2 → v1.4 (Current)

If you have an older installation, follow these steps:

```bash
# Step 1: Backup existing configuration
cp /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py.bak.$(date +%Y%m%d)

cp /root/.openclaw/workspace/skills/qdrant-memory/scripts/config.json /root/.openclaw/workspace/skills/qdrant-memory/scripts/config.json.bak.$(date +%Y%m%d)
```

```bash
# Step 2: Stop the watcher
pkill -f realtime_qdrant_watcher
# Verify stopped
ps aux | grep realtime_qdrant_watcher
```

```bash
# Step 3: Download latest files (choose one source)

# Option A: From GitLab (recommended)
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py https://gitlab.com/mdkrush/openclaw-true-recall-base/-/raw/master/watcher/realtime_qdrant_watcher.py

# Option B: From Gitea
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py http://10.0.0.61:3000/SpeedyFoxAi/openclaw-true-recall-base/raw/branch/master/watcher/realtime_qdrant_watcher.py

# Option C: From local clone (if you cloned the repo)
cp /path/to/openclaw-true-recall-base/watcher/realtime_qdrant_watcher.py /root/.openclaw/workspace/skills/qdrant-memory/scripts/
```

```bash
# Step 4: Start the watcher
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon
```

```bash
# Step 5: Verify installation
ps aux | grep realtime_qdrant_watcher
curl -s "http://10.0.0.40:6333/collections/memories_tr/points/scroll" -H "Content-Type: application/json" -d '{"limit": 3}' | jq '.result.points[0].payload.timestamp'
```

### Upgrading with Git (If You Cloned the Repository)

```bash
# Navigate to your clone
cd /path/to/openclaw-true-recall-base
git pull origin master

# Stop current watcher
pkill -f realtime_qdrant_watcher

# Copy updated files to OpenClaw
cp watcher/realtime_qdrant_watcher.py /root/.openclaw/workspace/skills/qdrant-memory/scripts/
cp scripts/backfill_memory.py /root/.openclaw/workspace/skills/qdrant-memory/scripts/

# Restart the watcher
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon

# Verify
ps aux | grep realtime_qdrant_watcher
```

### Backfilling Historical Memories (Optional)

```bash
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/backfill_memory.py
```

### Verifying Your Upgrade

```bash
# 1. Check watcher is running
ps aux | grep realtime_qdrant_watcher

# 2. Verify source is "true-recall-base"
curl -s "http://10.0.0.40:6333/collections/memories_tr/points/scroll" -H "Content-Type: application/json" -d '{"limit": 1}' | jq '.result.points[0].payload.source'

# 3. Check date coverage
curl -s "http://10.0.0.40:6333/collections/memories_tr/points/scroll" -H "Content-Type: application/json" -d '{"limit": 10000}' | jq '[.result.points[].payload.date] | unique | sort'
```

Expected output:
- Source: `"true-recall-base"`
- Dates: Array from oldest to newest memory
127
.local_projects/openclaw-true-recall-base/install.sh
Normal file
@@ -0,0 +1,127 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
# TrueRecall Base - Simple Installer
|
||||||
|
# Usage: ./install.sh
|
||||||
|
|
||||||
|
set -e
|
||||||
|
|
||||||
|
echo "=========================================="
|
||||||
|
echo "TrueRecall Base - Installer"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Default values
|
||||||
|
DEFAULT_QDRANT_IP="localhost:6333"
|
||||||
|
DEFAULT_OLLAMA_IP="localhost:11434"
|
||||||
|
DEFAULT_USER_ID="user"
|
||||||
|
|
||||||
|
# Get user input with defaults
|
||||||
|
echo "Configuration (press Enter for defaults):"
|
||||||
|
echo ""
|
||||||
|
echo "Examples:"
|
||||||
|
echo " Qdrant: 10.0.0.40:6333 (remote) or localhost:6333 (local)"
|
||||||
|
echo " Ollama: 10.0.0.10:11434 (remote) or localhost:11434 (local)"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
read -p "Qdrant host:port [$DEFAULT_QDRANT_IP]: " QDRANT_IP
|
||||||
|
QDRANT_IP=${QDRANT_IP:-$DEFAULT_QDRANT_IP}
|
||||||
|
|
||||||
|
read -p "Ollama host:port [$DEFAULT_OLLAMA_IP]: " OLLAMA_IP
|
||||||
|
OLLAMA_IP=${OLLAMA_IP:-$DEFAULT_OLLAMA_IP}
|
||||||
|
|
||||||
|
read -p "User ID [$DEFAULT_USER_ID]: " USER_ID
|
||||||
|
USER_ID=${USER_ID:-$DEFAULT_USER_ID}
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Configuration:"
|
||||||
|
echo " Qdrant: http://$QDRANT_IP"
|
||||||
|
echo " Ollama: http://$OLLAMA_IP"
|
||||||
|
echo " User ID: $USER_ID"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
read -p "Proceed? [Y/n]: " CONFIRM
|
||||||
|
if [[ $CONFIRM =~ ^[Nn]$ ]]; then
|
||||||
|
echo "Installation cancelled."
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Create service file
|
||||||
|
echo ""
|
||||||
|
echo "Creating systemd service..."
|
||||||
|
|
||||||
|
# Get absolute path (handles spaces)
|
||||||
|
INSTALL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
|
||||||
|
cat > /tmp/mem-qdrant-watcher.service << EOF
|
||||||
|
[Unit]
|
||||||
|
Description=TrueRecall Base - Real-Time Memory Watcher
|
||||||
|
After=network.target
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=simple
|
||||||
|
User=$USER
|
||||||
|
WorkingDirectory=$INSTALL_DIR/watcher
|
||||||
|
Environment="QDRANT_URL=http://$QDRANT_IP"
|
||||||
|
Environment="QDRANT_COLLECTION=memories_tr"
|
||||||
|
Environment="OLLAMA_URL=http://$OLLAMA_IP"
|
||||||
|
Environment="EMBEDDING_MODEL=snowflake-arctic-embed2"
|
||||||
|
Environment="USER_ID=$USER_ID"
|
||||||
|
ExecStart=/usr/bin/python3 $INSTALL_DIR/watcher/realtime_qdrant_watcher.py --daemon
|
||||||
|
Restart=always
|
||||||
|
RestartSec=5
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
|
EOF
|
||||||
|
|
||||||
|
# Install service
|
||||||
|
sudo cp /tmp/mem-qdrant-watcher.service /etc/systemd/system/
|
||||||
|
sudo systemctl daemon-reload
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Starting service..."
|
||||||
|
sudo systemctl enable --now mem-qdrant-watcher
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "=========================================="
|
||||||
|
echo "Installation Complete!"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
echo "Status:"
|
||||||
|
sudo systemctl status mem-qdrant-watcher --no-pager
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Verify collection:"
|
||||||
|
echo " curl -s http://$QDRANT_IP/collections/memories_tr | jq '.result.points_count'"
|
||||||
|
echo ""
|
||||||
|
echo "View logs:"
|
||||||
|
echo " sudo journalctl -u mem-qdrant-watcher -f"
|
||||||
|
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "=========================================="
|
||||||
|
echo "UPGRADING FROM OLDER VERSION"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
echo "If you already have TrueRecall Base installed:"
|
||||||
|
echo ""
|
||||||
|
echo "1. Stop the watcher:"
|
||||||
|
echo " pkill -f realtime_qdrant_watcher"
|
||||||
|
echo ""
|
||||||
|
echo "2. Backup current files:"
|
||||||
|
echo " cp /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \\"
echo " /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py.bak"
echo ""
echo "3. Copy updated files:"
echo " cp watcher/realtime_qdrant_watcher.py \\"
echo " /root/.openclaw/workspace/skills/qdrant-memory/scripts/"
echo " cp scripts/backfill_memory.py \\"
echo " /root/.openclaw/workspace/skills/qdrant-memory/scripts/"
|
||||||
|
echo ""
|
||||||
|
echo "4. Restart watcher:"
|
||||||
|
echo " python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon"
|
||||||
|
echo ""
|
||||||
|
echo "5. Verify:"
|
||||||
|
echo " ps aux | grep realtime_qdrant_watcher"
|
||||||
|
echo ""
|
||||||
|
echo "For full upgrade instructions, see README.md"
|
||||||
656
README.md
Normal file
@@ -0,0 +1,656 @@
|
|||||||
|
# TrueRecall Base
|
||||||
|
|
||||||
|
**Purpose:** Real-time memory capture → Qdrant `memories_tr`
|
||||||
|
|
||||||
|
**Status:** ✅ Standalone capture system
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
TrueRecall Base is the **foundation**. It watches OpenClaw sessions in real-time and stores every turn to Qdrant's `memories_tr` collection.
|
||||||
|
|
||||||
|
This is **required** for both addons: **Gems** and **Blocks**.
|
||||||
|
|
||||||
|
**Base does NOT include:**
- ❌ Curation (gem extraction)
- ❌ Topic clustering (blocks)
- ❌ Injection (context recall)
|
||||||
|
|
||||||
|
**For those features, install an addon after base.**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Requirements
|
||||||
|
|
||||||
|
**Vector Database**

TrueRecall Base requires a vector database to store conversation embeddings. This can be:
- **Local** - Self-hosted Qdrant (recommended for privacy)
- **Cloud** - Managed Qdrant Cloud or similar service
- **Any IP-accessible** Qdrant instance

In this version, we use a **local Qdrant database** (`http://<QDRANT_IP>:6333`). The database must be reachable from the machine running the watcher daemon.

**Additional Requirements:**
- **Ollama** - For generating text embeddings (local or remote)
- **OpenClaw** - The session files to monitor
- **Linux systemd** - For running the watcher as a service
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Gotchas & Known Limitations
|
||||||
|
|
||||||
|
> ⚠️ **Embedding Dimensions:** `snowflake-arctic-embed2` outputs **1024 dimensions**, not 768. Ensure your Qdrant collection is configured with `"size": 1024`.

> ⚠️ **Hardcoded Sessions Path:** `SESSIONS_DIR` is hardcoded to `/root/.openclaw/agents/main/sessions`. To use a different path, modify `realtime_qdrant_watcher.py` to read from an environment variable:
> ```python
> SESSIONS_DIR = Path(os.getenv("OPENCLAW_SESSIONS_DIR", "/root/.openclaw/agents/main/sessions"))
> ```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Three-Tier Architecture
|
||||||
|
|
||||||
|
```
true-recall-base (REQUIRED)
├── Core: Watcher daemon
└── Stores: memories_tr
    │
    ├──▶ true-recall-gems (ADDON)
    │    ├── Curator extracts gems → gems_tr
    │    └── Plugin injects gems into prompts
    │
    └──▶ true-recall-blocks (ADDON)
         ├── Topic clustering → topic_blocks_tr
         └── Contextual block retrieval

Note: Gems and Blocks are INDEPENDENT addons.
They both require Base, but don't work together.
Choose one: Gems OR Blocks (not both).
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
### Option 1: Quick Install (Recommended)

```bash
cd /path/to/true-recall-base
./install.sh
```
|
||||||
|
|
||||||
|
#### What the Installer Does (Step-by-Step)

The `install.sh` script automates the entire setup process. Here's exactly what happens:

**Step 1: Interactive Configuration**
```
Configuration (press Enter for defaults):

Examples:
  Qdrant: 10.0.0.40:6333 (remote) or localhost:6333 (local)
  Ollama: 10.0.0.10:11434 (remote) or localhost:11434 (local)

Qdrant host:port [localhost:6333]: _
Ollama host:port [localhost:11434]: _
User ID [user]: _
```
- Prompts for Qdrant host:port (default: `localhost:6333`)
- Prompts for Ollama host:port (default: `localhost:11434`)
- Prompts for User ID (default: `user`)
- Press Enter to accept defaults, or type custom values
|
||||||
|
|
||||||
|
**Step 2: Configuration Confirmation**
```
Configuration:
  Qdrant: http://localhost:6333
  Ollama: http://localhost:11434
  User ID: user

Proceed? [Y/n]: _
```
- Shows the complete configuration
- Asks for confirmation (type `n` to cancel, Enter or `Y` to proceed)
- Exits cleanly if cancelled, no changes made
|
||||||
|
|
||||||
|
**Step 3: Systemd Service Generation**
- Creates a temporary service file at `/tmp/mem-qdrant-watcher.service`
- Inserts your configuration values (IPs, ports, user ID)
- Uses absolute path for the script location (handles spaces in paths)
- Sets up automatic restart on failure
|
||||||
|
|
||||||
|
**Step 4: Service Installation**
```bash
sudo cp /tmp/mem-qdrant-watcher.service /etc/systemd/system/
sudo systemctl daemon-reload
```
- Copies the service file to systemd directory
- Reloads systemd to recognize the new service
|
||||||
|
|
||||||
|
**Step 5: Service Activation**
```bash
sudo systemctl enable --now mem-qdrant-watcher
```
- Enables the service to start on boot (`enable`)
- Starts the service immediately (`--now`)
|
||||||
|
|
||||||
|
**Step 6: Verification**
```
==========================================
Installation Complete!
==========================================

Status:
● mem-qdrant-watcher.service - TrueRecall Base...
   Active: active (running)
```
- Displays the service status
- Shows it's active and running
- Provides commands to verify and monitor
|
||||||
|
|
||||||
|
**Post-Installation Commands:**
```bash
# Check service status anytime
sudo systemctl status mem-qdrant-watcher

# View live logs
sudo journalctl -u mem-qdrant-watcher -f

# Verify Qdrant collection
curl -s http://localhost:6333/collections/memories_tr | jq '.result.points_count'
```
|
||||||
|
|
||||||
|
#### Installer Requirements
- Must run as root or with sudo (for systemd operations)
- Must have execute permissions (`chmod +x install.sh`)
- Script should stay inside the true-recall-base directory: it resolves its own location and points the service at the `watcher/` folder next to it
|
||||||
|
|
||||||
|
### Option 2: Manual Install

```bash
cd /path/to/true-recall-base

# Copy service file
sudo cp watcher/mem-qdrant-watcher.service /etc/systemd/system/

# Edit the service file to set your IPs and user
sudo nano /etc/systemd/system/mem-qdrant-watcher.service

# Reload and start
sudo systemctl daemon-reload
sudo systemctl enable --now mem-qdrant-watcher
```
|
||||||
|
|
||||||
|
### Verify Installation

```bash
# Check service status
sudo systemctl status mem-qdrant-watcher

# Check collection
curl -s http://<QDRANT_IP>:6333/collections/memories_tr | jq '.result.points_count'
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Files

| File | Purpose |
|------|---------|
| `watcher/realtime_qdrant_watcher.py` | Capture daemon |
| `watcher/mem-qdrant-watcher.service` | Systemd service |
| `config.json` | Configuration template |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Configuration

Edit `config.json` or set environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `QDRANT_URL` | `http://<QDRANT_IP>:6333` | Qdrant endpoint |
| `OLLAMA_URL` | `http://<OLLAMA_IP>:11434` | Ollama endpoint |
| `EMBEDDING_MODEL` | `snowflake-arctic-embed2` | Embedding model |
| `USER_ID` | `<USER_ID>` | User identifier |
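
How the watcher merges `config.json` with the environment is not shown here. As an illustration only, the sketch below assumes environment variables take precedence over the file; `load_settings` and `ENV_TO_KEY` are hypothetical names, while the variable names and JSON keys come from the table and the `config.json` template in this repository.

```python
import json
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Environment variable -> config.json key
ENV_TO_KEY = {
    "QDRANT_URL": "qdrant_url",
    "OLLAMA_URL": "ollama_url",
    "EMBEDDING_MODEL": "embedding_model",
    "USER_ID": "user_id",
}


def load_settings(config_path: Path, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return settings, preferring environment variables over config.json (assumed precedence)."""
    env = os.environ if env is None else env
    file_values = json.loads(config_path.read_text()) if config_path.exists() else {}
    settings: Dict[str, str] = {}
    for env_name, key in ENV_TO_KEY.items():
        value = env.get(env_name) or file_values.get(key)
        if value is not None:
            settings[key] = value
    return settings
```

With this precedence, the `Environment=` lines in the systemd unit would override whatever the template file contains.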
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How It Works
|
||||||
|
|
||||||
|
### Architecture Overview
|
||||||
|
|
||||||
|
```
┌─────────────────┐     ┌─────────────────────┐     ┌─────────────────┐
│  OpenClaw Chat  │────▶│    Session JSONL    │────▶│  Base Watcher   │
│  (You talking)  │     │ (/sessions/*.jsonl) │     │  (This daemon)  │
└─────────────────┘     └─────────────────────┘     └────────┬────────┘
                                                             │
                                                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         PROCESSING PIPELINE                         │
│ ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌───────────┐ │
│ │  Watch File  │─▶│  Parse Turn  │─▶│  Clean Text  │─▶│   Embed   │ │
│ │  (inotify)   │  │ (JSON→dict)  │  │  (strip md)  │  │ (Ollama)  │ │
│ └──────────────┘  └──────────────┘  └──────────────┘  └─────┬─────┘ │
│                                                             │       │
│        ┌────────────────────────────────────────────────────┘       │
│        │                                                            │
│        ▼                                                            │
│ ┌──────────────┐  ┌──────────────┐                                  │
│ │   Store to   │─▶│    Qdrant    │                                  │
│ │ memories_tr  │  │ (vector DB)  │                                  │
│ └──────────────┘  └──────────────┘                                  │
└─────────────────────────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
### Step-by-Step Process
|
||||||
|
|
||||||
|
#### Step 1: File Watching

The watcher monitors OpenClaw session files in real-time:

```python
# From realtime_qdrant_watcher.py
SESSIONS_DIR = Path("/root/.openclaw/agents/main/sessions")
```

> ⚠️ **Known Limitation:** `SESSIONS_DIR` is currently hardcoded. To use a different path, patch the watcher script to read from an environment variable (e.g., `os.getenv("OPENCLAW_SESSIONS_DIR", "/root/.openclaw/agents/main/sessions")`).

**What happens:**
- Uses `inotify` or polling to watch the sessions directory
- Automatically detects the most recently modified `.jsonl` file
- Handles session rotation (when OpenClaw starts a new session)
- Maintains position in file to avoid re-processing old lines
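
The position tracking in the last bullet can be pictured with a small polling sketch. This is not the watcher's actual code; `read_new_lines` is a hypothetical helper that reads from a byte offset and only consumes complete lines, so a turn that is still being written is picked up on the next poll.

```python
from pathlib import Path
from typing import List, Tuple


def read_new_lines(path: Path, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after `offset`, plus the new offset."""
    with path.open("rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Only consume up to the last newline; a partial trailing line stays unread.
    end = chunk.rfind(b"\n") + 1
    complete = chunk[:end]
    lines = [
        raw.decode("utf-8", errors="replace")
        for raw in complete.split(b"\n")
        if raw.strip()
    ]
    return lines, offset + end
```

A polling loop would call this repeatedly with the returned offset, sleeping briefly between calls when no new lines arrive.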
|
||||||
|
|
||||||
|
#### Step 2: Turn Parsing

Each conversation turn is extracted from the JSONL file:

```json
// Example session file entry
{
  "type": "message",
  "message": {
    "role": "user",
    "content": "Hello, can you help me?",
    "timestamp": "2026-02-27T09:30:00Z"
  }
}
```

**What happens:**
- Reads new lines appended to the session file
- Parses JSON to extract role (user/assistant/system)
- Extracts content text
- Captures timestamp
- Generates unique turn ID from content hash + timestamp
|
||||||
|
|
||||||
|
**Code flow:**
```python
import hashlib
import json
import os
from typing import Dict, Optional

def parse_turn(line: str) -> Optional[Dict]:
    data = json.loads(line)
    if data.get("type") != "message":
        return None  # Skip non-message entries

    message = data.get("message", {})  # layout as in the example entry
    role = message.get("role")
    content = message.get("content", "")
    timestamp = message.get("timestamp", "")

    return {
        "id": hashlib.md5(f"{content}{timestamp}".encode()).hexdigest()[:16],
        "role": role,
        "content": content,
        "timestamp": timestamp,
        "user_id": os.getenv("USER_ID", "default"),
    }
```
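
Qdrant accepts only unsigned integers or UUIDs as point ids, so a 16-character hex digest would need converting before it is used as the `id` in an upsert. How the watcher handles this is not shown here; one possible approach, sketched below under that assumption, is to format the full MD5 digest as a UUID. `turn_point_id` is a hypothetical helper name.

```python
import hashlib
import uuid


def turn_point_id(content: str, timestamp: str) -> str:
    """Derive a deterministic UUID string from content + timestamp."""
    digest = hashlib.md5(f"{content}{timestamp}".encode()).hexdigest()  # 32 hex chars
    return str(uuid.UUID(hex=digest))
```

Because the id is derived only from the turn itself, re-processing the same line overwrites the existing point instead of creating a duplicate.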
|
||||||
|
|
||||||
|
#### Step 3: Content Cleaning

Before storage, content is normalized:

**Strips:**
- Markdown tables (`| column | column |`)
- Bold/italic markers (`**text**`, `*text*`)
- Inline code (`` `code` ``)
- Code blocks (```code```)
- Multiple consecutive spaces
- Leading/trailing whitespace

**Example:**
```
Input:  "Check this **important** table: | col1 | col2 |"
Output: "Check this important table"
```

**Why:** Clean text improves embedding quality and searchability.
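
The stripping rules listed in this step could be approximated with a few regular expressions. The sketch below is an illustration, not the watcher's implementation; `clean_text` is a hypothetical name, and keeping the inner text of inline code (rather than dropping it) is an assumption. Its handling of punctuation may differ from the example shown.

```python
import re


def clean_text(text: str) -> str:
    """Strip common Markdown markup before embedding (illustrative rules only)."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                 # inline code: keep inner text
    text = re.sub(r"\|[^\n]*\|", " ", text)                  # table rows such as | a | b |
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)             # bold markers
    text = re.sub(r"\*(.+?)\*", r"\1", text)                 # italic markers
    text = re.sub(r"\s+", " ", text)                         # collapse whitespace runs
    return text.strip()
```

The order matters: fenced blocks are removed before inline code so that triple backticks are not misread as inline spans.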
|
||||||
|
|
||||||
|
#### Step 4: Embedding Generation

The cleaned content is converted to a vector embedding:

```python
def get_embedding(text: str) -> List[float]:
    response = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": EMBEDDING_MODEL, "prompt": text}
    )
    return response.json()["embedding"]
```

**What happens:**
- Sends text to the Ollama API at `OLLAMA_URL` (for example `10.0.0.10:11434`)
- Uses `snowflake-arctic-embed2` model
- Returns **1024-dimensional vector** (not 768)
- If Ollama is unavailable, logs the error and skips the turn (see Error Handling)
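
A defensive variant of the embedding call might validate the vector size and return nothing on failure, so the caller can skip the turn. This is a sketch under stated assumptions: `embed_or_none` and `is_valid_embedding` are hypothetical names, `OLLAMA_URL` is a placeholder, and the standard library's `urllib` stands in for `requests` only to keep the example dependency-free.

```python
import json
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # assumption: replace with your Ollama endpoint
EMBEDDING_MODEL = "snowflake-arctic-embed2"
EXPECTED_DIM = 1024  # dimension stated for this model in this README


def is_valid_embedding(vector: object, dim: int = EXPECTED_DIM) -> bool:
    """True if `vector` is a list of `dim` numbers."""
    return (
        isinstance(vector, list)
        and len(vector) == dim
        and all(isinstance(x, (int, float)) for x in vector)
    )


def embed_or_none(text: str) -> Optional[List[float]]:
    """Return the embedding, or None if Ollama is unreachable or the reply is malformed."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=json.dumps({"model": EMBEDDING_MODEL, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            vector = json.load(resp).get("embedding")
    except (OSError, ValueError) as exc:  # connection errors, timeouts, bad JSON
        print(f"embedding failed, skipping turn: {exc}")
        return None
    return vector if is_valid_embedding(vector) else None
```

Checking the length up front surfaces a model mismatch (for example a 768-dimensional model) before Qdrant rejects the write.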
|
||||||
|
|
||||||
|
#### Step 5: Qdrant Storage

The complete turn data is stored to Qdrant:

```python
payload = {
    "user_id": user_id,
    "role": turn["role"],
    "content": cleaned_content[:2000],  # Size limit
    "timestamp": turn["timestamp"],
    "session_id": session_id,
    "source": "true-recall-base"
}

requests.put(
    f"{QDRANT_URL}/collections/memories_tr/points",
    json={"points": [{"id": turn_id, "vector": embedding, "payload": payload}]}
)
```

**Storage format:**
| Field | Type | Description |
|-------|------|-------------|
| `user_id` | string | User identifier |
| `role` | string | user/assistant/system |
| `content` | string | Cleaned text (max 2000 chars) |
| `timestamp` | string | ISO 8601 timestamp |
| `session_id` | string | Source session file |
| `source` | string | "true-recall-base" |
|
||||||
|
|
||||||
|
### Real-Time Performance

| Metric | Target | Actual |
|--------|--------|--------|
| Latency | < 500ms | ~100-200ms |
| Throughput | > 10 turns/sec | > 50 turns/sec |
| Embedding time | < 300ms | ~50-100ms |
| Qdrant write | < 100ms | ~10-50ms |
|
||||||
|
|
||||||
|
### Session Rotation Handling

When OpenClaw starts a new session:

1. New `.jsonl` file created in sessions directory
2. Watcher detects file change via `inotify`
3. Identifies most recently modified file
4. Switches to watching new file
5. Continues from position 0 of new file
6. Turns already captured from the old file remain in `memories_tr`
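
Steps 3 to 5 can be sketched as two small functions. This shows only the "most recently modified" rule described here, not any additional priority logic the watcher may apply; `latest_session_file` and `switch_if_rotated` are hypothetical names.

```python
from pathlib import Path
from typing import Optional, Tuple


def latest_session_file(sessions_dir: Path) -> Optional[Path]:
    """Return the most recently modified .jsonl file, or None if there is none."""
    candidates = list(sessions_dir.glob("*.jsonl"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def switch_if_rotated(
    current: Optional[Path], position: int, latest: Optional[Path]
) -> Tuple[Optional[Path], int]:
    """Switch to the newer file at position 0, or keep the current file and position."""
    if latest is not None and latest != current:
        return latest, 0
    return current, position
```

Resetting the position to 0 on a switch ensures the first turns of the new session are not skipped.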
|
||||||
|
|
||||||
|
### Error Handling

**Qdrant unavailable:**
- Retries with exponential backoff
- Logs error, continues watching
- Next turn attempts storage again
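
Retrying with exponential backoff can be pictured with a generic helper like the one below. The attempt count and delays are illustrative assumptions, not the watcher's actual settings, and `retry_with_backoff` is a hypothetical name. A real caller would likely catch only network errors rather than every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `op`, doubling the wait after each failure; re-raise after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller log and move on
            sleep(min(max_delay, base_delay * (2 ** attempt)))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the delay schedule can be checked without actually waiting.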
|
||||||
|
|
||||||
|
**Ollama unavailable:**
- Cannot generate embeddings
- Logs error, skips turn
- Continues watching (no data loss in file)

**File access errors:**
- Handles permission issues gracefully
- Retries on temporary failures
|
||||||
|
|
||||||
|
### Collection Schema

**Qdrant collection: `memories_tr`**

```python
{
    "name": "memories_tr",
    "vectors": {
        "size": 1024,          # snowflake-arctic-embed2 dimension (1024, not 768)
        "distance": "Cosine"   # Similarity metric
    },
    "payload_schema": {
        "user_id": "keyword",     # Filterable
        "role": "keyword",        # Filterable
        "timestamp": "datetime",  # Range filterable
        "content": "text"         # Full-text searchable
    }
}
```
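
If the collection does not exist yet, it could be created through Qdrant's REST API with request bodies matching this schema. The sketch below assumes a placeholder `QDRANT_URL` and uses hypothetical helper names (`collection_body`, `index_body`, `put_json`, `create_collection`); whether the watcher creates the collection itself is not stated here.

```python
import json
import urllib.request
from typing import Dict

QDRANT_URL = "http://localhost:6333"  # assumption: replace with your Qdrant endpoint
COLLECTION = "memories_tr"

# Payload fields to index, mirroring the schema above
PAYLOAD_INDEXES = {
    "user_id": "keyword",
    "role": "keyword",
    "timestamp": "datetime",
    "content": "text",
}


def collection_body(size: int = 1024, distance: str = "Cosine") -> Dict:
    """Body for PUT /collections/{name}."""
    return {"vectors": {"size": size, "distance": distance}}


def index_body(field: str, schema: str) -> Dict:
    """Body for PUT /collections/{name}/index."""
    return {"field_name": field, "field_schema": schema}


def put_json(path: str, body: Dict) -> None:
    req = urllib.request.Request(
        f"{QDRANT_URL}{path}",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        resp.read()


def create_collection() -> None:
    """Create the collection and its payload indexes (raises if the collection already exists)."""
    put_json(f"/collections/{COLLECTION}", collection_body())
    for field, schema in PAYLOAD_INDEXES.items():
        put_json(f"/collections/{COLLECTION}/index", index_body(field, schema))
```

The vector size must match the embedding model's output, which is why 1024 is the default here.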
|
||||||
|
|
||||||
|
### Security Notes

- **No credential storage** in code
- All sensitive values via environment variables
- `USER_ID` isolates memories per user
- Cleaned content removes PII markers (but review your data)
- HTTPS recommended for production Qdrant/Ollama
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Using Memories with OpenClaw
|
||||||
|
|
||||||
|
### The "q" Command

**"q"** refers to your Qdrant memory system (`memories_tr` collection).

When interacting with OpenClaw agents, you can search your stored memories using:
- `search q <topic>` - Semantic search for past conversations
- `q <topic>` - Shortcut for the same
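
How OpenClaw resolves `search q <topic>` internally is not documented here. As a rough illustration of what a semantic search against `memories_tr` involves, the sketch below builds the body for Qdrant's `POST /collections/memories_tr/points/search` endpoint and formats the hits. The query vector would come from embedding the topic with the same model used for capture; `build_search_body` and `summarize_hits` are hypothetical names.

```python
from typing import Dict, List


def build_search_body(query_vector: List[float], user_id: str, limit: int = 5) -> Dict:
    """Search request body: nearest neighbours restricted to one user's memories."""
    return {
        "vector": query_vector,
        "limit": limit,
        "with_payload": True,
        "filter": {"must": [{"key": "user_id", "match": {"value": user_id}}]},
    }


def summarize_hits(hits: List[Dict]) -> List[str]:
    """Render each hit as 'timestamp [role] content' for display."""
    lines = []
    for hit in hits:
        payload = hit.get("payload") or {}
        lines.append(
            f'{payload.get("timestamp", "?")} [{payload.get("role", "?")}] {payload.get("content", "")}'
        )
    return lines
```

Filtering on `user_id` relies on that field being present in every stored payload, as in the storage format described in this README.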
|
||||||
|
|
||||||
|
### Context Injection Instructions

**For OpenClaw System Prompt:**

Add these lines to your agent's system context to enable memory-aware responses:

````
## Memory System (q)

**"q" = Qdrant collection `memories_tr`**: your conversation history database.

### Memory Retrieval Rules

**Before saying "I don't know" or "I can't do that":**
1. **ALWAYS search q first** using the topic/keywords from the user's request
2. Incorporate findings INTO your response (not as footnotes)
3. Reference specific dates/details: "Based on our Feb 27th discussion..."

**Example workflow:**
```
User asks about X → Search q for X → Use retrieved memories → Answer
```

**WRONG:**
> "I searched Qdrant and found X. [Generic answer unrelated to X]"

**RIGHT:**
> "You asked me to fix this on Feb 27th. Do you want me to apply the fix now?"

### When to Search q

**ALWAYS search automatically when:**
- Question references past events, conversations, or details
- User asks "remember when...", "what did we discuss...", "what did I tell you..."
- You're unsure if you have relevant context
- ANY question about configuration, memories, or past interactions

**DO NOT search for:**
- General knowledge questions you can answer directly
- Current time, weather, or factual queries
- Simple requests like "check my email" or "run a command"
- When you already have sufficient context in the conversation
````
|
||||||
|
|
||||||
|
### Search Priority

| Order | Source | When to Use |
|-------|--------|-------------|
| 1 | **q (Qdrant)** | First - semantic search of all conversations |
| 2 | `memory/` files | Fallback if q yields no results |
| 3 | Web search | Last resort |
| 4 | "I don't know" | Only after all above |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Next Step
|
||||||
|
|
||||||
|
### ✅ Base is Complete

**You don't need to upgrade.** TrueRecall Base is a **fully functional, standalone memory system**. If you're happy with real-time capture and manual search via the `q` command, you can stop here.

Base gives you:
- ✅ Complete conversation history in Qdrant
- ✅ Semantic search via `search q <topic>`
- ✅ Full-text search capabilities
- ✅ Permanent storage of all conversations

**Upgrade only if** you want automatic context injection into prompts.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Optional Addons

Install an **addon** for automatic curation and injection:

| Addon | Purpose | Status |
|-------|---------|--------|
| **Gems** | Extracts atomic gems from memories, injects into context | 🚧 Coming Soon |
| **Blocks** | Topic clustering, contextual block retrieval | 🚧 Coming Soon |
|
||||||
|
|
||||||
|
### Upgrade Paths

Once Base is running, you have two upgrade options:

#### Option 1: Gems (Atomic Memory)
**Best for:** Conversational context, quick recall

- **Curator** extracts "gems" (key insights) from `memories_tr`
- Stores curated gems in `gems_tr` collection
- **Injection plugin** recalls relevant gems into prompts automatically
- Optimized for: Chat assistants, help bots, personal memory

**Workflow:**
```
memories_tr → Curator → gems_tr → Injection → Context
```

#### Option 2: Blocks (Topic Clustering)
**Best for:** Document organization, topic-based retrieval

- Clusters conversations by topic automatically
- Creates `topic_blocks_tr` collection
- Retrieves entire contextual blocks on query
- Optimized for: Knowledge bases, document systems

**Workflow:**
```
memories_tr → Topic Engine → topic_blocks_tr → Retrieval → Context
```

**Note:** Gems and Blocks are **independent** addons. They both require Base, but you choose one based on your use case.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Updating / Patching

If you already have TrueRecall Base installed and need to apply a bug fix or update:

### Quick Update (v1.2 Patch)

**Applies to:** Installations affected by the session file detection bug, where the watcher could pick the wrong file when multiple sessions were active.

```bash
# 1. Backup current watcher
cp /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
  /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py.bak.$(date +%Y%m%d)

# 2. Download latest watcher (choose one source)

# Option A: From GitHub
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
  https://raw.githubusercontent.com/speedyfoxai/openclaw-true-recall-base/master/watcher/realtime_qdrant_watcher.py

# Option B: From GitLab
curl -o /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py \
  https://gitlab.com/mdkrush/true-recall-base/-/raw/master/watcher/realtime_qdrant_watcher.py

# Option C: From local git (if cloned)
cp /path/to/true-recall-base/watcher/realtime_qdrant_watcher.py \
  /root/.openclaw/workspace/skills/qdrant-memory/scripts/

# 3. Stop old watcher
pkill -f realtime_qdrant_watcher

# 4. Start new watcher
python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon

# 5. Verify
ps aux | grep watcher
lsof -p $(pgrep -f realtime_qdrant_watcher) | grep jsonl
```
|
||||||
|
|
||||||
|
### Update with Git (If Cloned)

```bash
cd /path/to/true-recall-base
git pull origin master

# Copy updated files
cp watcher/realtime_qdrant_watcher.py \
  /root/.openclaw/workspace/skills/qdrant-memory/scripts/

# Optional: copy the backfill script
cp scripts/backfill_memory_to_q.py \
  /root/.openclaw/workspace/skills/qdrant-memory/scripts/ 2>/dev/null || true

# Restart watcher (systemd install)
sudo systemctl restart mem-qdrant-watcher

# OR, if the watcher was started manually:
# pkill -f realtime_qdrant_watcher
# python3 /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py --daemon
```
|
||||||
|
|
||||||
|
### Verify Update Applied

```bash
# Check version in file
grep "v1.2" /root/.openclaw/workspace/skills/qdrant-memory/scripts/realtime_qdrant_watcher.py

# Verify watcher is running
ps aux | grep realtime_qdrant_watcher

# Confirm watching main session (not subagent)
lsof -p $(pgrep -f realtime_qdrant_watcher) | grep jsonl

# Check recent captures in Qdrant
curl -s "http://10.0.0.40:6333/collections/memories_tr/points/scroll" \
  -H "Content-Type: application/json" \
  -d '{"limit": 3, "with_payload": true}' | jq -r '.result.points[].payload.timestamp'
```
|
||||||
|
|
||||||
|
### What's New in v1.2

| Feature | Benefit |
|---------|---------|
| **Priority-based session detection** | Always picks `agent:main:main` first |
| **Lock file validation** | Ignores stale/crashed session locks via PID check |
| **Inactive subagent filtering** | Skips sessions with `sessionFile=null` |
| **Backfill script** | Import historical memories from markdown files |

**No config changes required** - existing `config.json` works unchanged.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Prerequisite for:** TrueRecall Gems, TrueRecall Blocks
|
||||||
12
config.json
Normal file
@@ -0,0 +1,12 @@
|
|||||||
|
{
  "version": "1.0",
  "description": "TrueRecall Base - Memory capture",
  "components": ["watcher"],
  "collections": {
    "memories": "memories_tr"
  },
  "qdrant_url": "http://<QDRANT_IP>:6333",
  "ollama_url": "http://<OLLAMA_IP>:11434",
  "embedding_model": "snowflake-arctic-embed2",
  "user_id": "<USER_ID>"
}
|
||||||
98
install.sh
Normal file
@@ -0,0 +1,98 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
# TrueRecall Base - Simple Installer
|
||||||
|
# Usage: ./install.sh
|
||||||
|
|
||||||
|
set -e
|
||||||
|
|
||||||
|
echo "=========================================="
|
||||||
|
echo "TrueRecall Base - Installer"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Default values
|
||||||
|
DEFAULT_QDRANT_IP="localhost:6333"
|
||||||
|
DEFAULT_OLLAMA_IP="localhost:11434"
|
||||||
|
DEFAULT_USER_ID="user"
|
||||||
|
|
||||||
|
# Get user input with defaults
|
||||||
|
echo "Configuration (press Enter for defaults):"
|
||||||
|
echo ""
|
||||||
|
echo "Examples:"
|
||||||
|
echo " Qdrant: 10.0.0.40:6333 (remote) or localhost:6333 (local)"
|
||||||
|
echo " Ollama: 10.0.0.10:11434 (remote) or localhost:11434 (local)"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
read -p "Qdrant host:port [$DEFAULT_QDRANT_IP]: " QDRANT_IP
|
||||||
|
QDRANT_IP=${QDRANT_IP:-$DEFAULT_QDRANT_IP}
|
||||||
|
|
||||||
|
read -p "Ollama host:port [$DEFAULT_OLLAMA_IP]: " OLLAMA_IP
|
||||||
|
OLLAMA_IP=${OLLAMA_IP:-$DEFAULT_OLLAMA_IP}
|
||||||
|
|
||||||
|
read -p "User ID [$DEFAULT_USER_ID]: " USER_ID
|
||||||
|
USER_ID=${USER_ID:-$DEFAULT_USER_ID}
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Configuration:"
|
||||||
|
echo " Qdrant: http://$QDRANT_IP"
|
||||||
|
echo " Ollama: http://$OLLAMA_IP"
|
||||||
|
echo " User ID: $USER_ID"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
read -p "Proceed? [Y/n]: " CONFIRM
|
||||||
|
if [[ $CONFIRM =~ ^[Nn]$ ]]; then
|
||||||
|
echo "Installation cancelled."
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Create service file
|
||||||
|
echo ""
|
||||||
|
echo "Creating systemd service..."
|
||||||
|
|
||||||
|
# Get absolute path (handles spaces)
|
||||||
|
INSTALL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
|
||||||
|
cat > /tmp/mem-qdrant-watcher.service << EOF
|
||||||
|
[Unit]
|
||||||
|
Description=TrueRecall Base - Real-Time Memory Watcher
|
||||||
|
After=network.target
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=simple
|
||||||
|
User=$USER
|
||||||
|
WorkingDirectory=$INSTALL_DIR/watcher
|
||||||
|
Environment="QDRANT_URL=http://$QDRANT_IP"
|
||||||
|
Environment="QDRANT_COLLECTION=memories_tr"
|
||||||
|
Environment="OLLAMA_URL=http://$OLLAMA_IP"
|
||||||
|
Environment="EMBEDDING_MODEL=snowflake-arctic-embed2"
|
||||||
|
Environment="USER_ID=$USER_ID"
|
||||||
|
ExecStart=/usr/bin/python3 $INSTALL_DIR/watcher/realtime_qdrant_watcher.py --daemon
|
||||||
|
Restart=always
|
||||||
|
RestartSec=5
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
|
EOF
|
||||||
|
|
||||||
|
# Install service
|
||||||
|
sudo cp /tmp/mem-qdrant-watcher.service /etc/systemd/system/
|
||||||
|
sudo systemctl daemon-reload
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Starting service..."
|
||||||
|
sudo systemctl enable --now mem-qdrant-watcher
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "=========================================="
|
||||||
|
echo "Installation Complete!"
|
||||||
|
echo "=========================================="
|
||||||
|
echo ""
|
||||||
|
echo "Status:"
|
||||||
|
sudo systemctl status mem-qdrant-watcher --no-pager
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Verify collection:"
|
||||||
|
echo " curl -s http://$QDRANT_IP/collections/memories_tr | jq '.result.points_count'"
|
||||||
|
echo ""
|
||||||
|
echo "View logs:"
|
||||||
|
echo " sudo journalctl -u mem-qdrant-watcher -f"
|
||||||
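
The installer accepts any text at the `host:port` prompts and writes it straight into the unit file. A small validation step could reject malformed input before the service is created. The following is a sketch only: `parse_host_port` is a hypothetical helper that is not part of this change set, and it deliberately ignores bracketed IPv6 literals.

```python
from typing import Tuple


def parse_host_port(value: str, default: str) -> Tuple[str, int]:
    """Split 'host:port', falling back to `default` when `value` is blank."""
    text = value.strip() or default
    host, sep, port = text.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {text!r}")
    port_number = int(port)  # raises ValueError for non-numeric ports
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return host, port_number


# An empty answer (user pressed Enter) falls back to the default.
print(parse_host_port("", "localhost:6333"))
print(parse_host_port("10.0.0.40:6333", "localhost:6333"))
```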
198
scripts/backfill_memory_to_q.py
Normal file
@@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""
Backfill memories_tr collection from memory markdown files.

Processes all .md files in /root/.openclaw/workspace/memory/
and stores them to Qdrant memories_tr collection.

Usage:
    python3 backfill_memory_to_q.py [--dry-run]
"""

import argparse
import hashlib
import json
import os
import re
import sys
from pathlib import Path
from datetime import datetime, timezone
from typing import List, Optional, Dict, Any

import requests

# Config
QDRANT_URL = os.getenv("QDRANT_URL", "http://10.0.0.40:6333")
COLLECTION_NAME = "memories_tr"
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://10.0.0.10:11434")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "snowflake-arctic-embed2")
MEMORY_DIR = Path("/root/.openclaw/workspace/memory")
USER_ID = os.getenv("USER_ID", "rob")

def get_embedding(text: str) -> Optional[List[float]]:
    """Generate embedding using Ollama"""
    try:
        response = requests.post(
            f"{OLLAMA_URL}/api/embeddings",
            json={"model": EMBEDDING_MODEL, "prompt": text[:4000]},
            timeout=30
        )
        response.raise_for_status()
        return response.json()["embedding"]
    except Exception as e:
        print(f"Error getting embedding: {e}", file=sys.stderr)
        return None

def clean_content(text: str) -> str:
    """Clean markdown content for storage"""
    # Remove markdown formatting (fenced blocks first, so the inline-code rule cannot split them)
    text = re.sub(r'```[\s\S]*?```', '', text)
    text = re.sub(r'\*\*([^*]+)\*\*', r'\1', text)
    text = re.sub(r'\*([^*]+)\*', r'\1', text)
    text = re.sub(r'`([^`]+)`', r'\1', text)
    # Remove headers
    text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)
    # Remove excess whitespace
    text = re.sub(r'\n{3,}', '\n\n', text)
    return text.strip()

def parse_memory_file(file_path: Path) -> List[Dict[str, Any]]:
    """Parse a memory markdown file into entries"""
    entries = []

    try:
        content = file_path.read_text(encoding='utf-8')
    except Exception as e:
        print(f"Error reading {file_path}: {e}", file=sys.stderr)
        return entries

    # Extract date from filename
    date_match = re.search(r'(\d{4}-\d{2}-\d{2})', file_path.name)
    date_str = date_match.group(1) if date_match else datetime.now().strftime('%Y-%m-%d')

    # Split by session headers (## Session: or ## Update:)
    sessions = re.split(r'\n## ', content)

    for i, session in enumerate(sessions):
        if not session.strip():
            continue

        # Extract session title if present
        title_match = re.match(r'Session:\s*(.+)', session, re.MULTILINE)
        if not title_match:
            title_match = re.match(r'Update:\s*(.+)', session, re.MULTILINE)
        session_title = title_match.group(1).strip() if title_match else f"Session {i}"

        # Extract key events, decisions, and content
        # Look for bullet points and content
        sections = session.split('\n### ')

        for section in sections:
            if not section.strip():
                continue

            # Clean the content
            cleaned = clean_content(section)
            if len(cleaned) < 20:  # Skip very short sections
                continue

            entry = {
                'content': cleaned[:2000],
                'role': 'assistant',  # These are summaries
                'date': date_str,
                'session_title': session_title,
                'file': file_path.name,
                'source': 'memory-backfill'
            }
            entries.append(entry)

    return entries

def store_to_qdrant(entry: Dict[str, Any], dry_run: bool = False) -> bool:
    """Store a memory entry to Qdrant"""
    content = entry['content']

    if dry_run:
        print(f"[DRY RUN] Would store: {content[:60]}...")
        return True

    vector = get_embedding(content)
    if vector is None:
        return False

    # Generate deterministic ID
    hash_content = f"{USER_ID}:{entry['date']}:{content[:100]}"
    hash_bytes = hashlib.sha256(hash_content.encode()).digest()[:8]
    point_id = abs(int.from_bytes(hash_bytes, byteorder='big') % (2**63))

    payload = {
        'user_id': USER_ID,
        'role': entry.get('role', 'assistant'),
        'content': content,
        'date': entry['date'],
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'source': entry.get('source', 'memory-backfill'),
        'file': entry.get('file', ''),
        'session_title': entry.get('session_title', ''),
        'curated': True  # Mark as curated since these are processed
    }

    try:
        response = requests.put(
            f"{QDRANT_URL}/collections/{COLLECTION_NAME}/points",
            json={'points': [{'id': point_id, 'vector': vector, 'payload': payload}]},
            timeout=30
        )
        response.raise_for_status()
        return True
    except Exception as e:
        print(f"Error storing to Qdrant: {e}", file=sys.stderr)
        return False

def main():
    parser = argparse.ArgumentParser(description='Backfill memory files to Qdrant')
    parser.add_argument('--dry-run', '-n', action='store_true', help='Dry run - do not write to Qdrant')
    parser.add_argument('--limit', '-l', type=int, default=None, help='Limit number of files to process')
    args = parser.parse_args()

    if not MEMORY_DIR.exists():
        print(f"Memory directory not found: {MEMORY_DIR}", file=sys.stderr)
        sys.exit(1)

    # Get all markdown files
    md_files = sorted(MEMORY_DIR.glob('*.md'))

    if args.limit:
        md_files = md_files[:args.limit]

    print(f"Found {len(md_files)} memory files to process")
    print(f"Target collection: {COLLECTION_NAME}")
    print(f"Qdrant URL: {QDRANT_URL}")
    print(f"Ollama URL: {OLLAMA_URL}")
    print()

    total_entries = 0
    stored = 0
    failed = 0

    for file_path in md_files:
        print(f"Processing: {file_path.name}")
        entries = parse_memory_file(file_path)

        for entry in entries:
            total_entries += 1
            if store_to_qdrant(entry, args.dry_run):
                stored += 1
                print(f"  ✅ Stored entry {stored}")
            else:
                failed += 1
                print(f"  ❌ Failed entry {failed}")

    print()
    print(f"Done! Processed {len(md_files)} files")
    print(f"Total entries: {total_entries}")
    print(f"Stored: {stored}")
    print(f"Failed: {failed}")

if __name__ == '__main__':
    main()
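
The backfill script derives each point ID from the user, the date, and the first 100 characters of content, so re-running it over the same files should upsert the same points instead of creating duplicates. The sketch below isolates that derivation; `point_id` is a name chosen here for illustration, and the sample values are made up.

```python
import hashlib


def point_id(user_id: str, date: str, content: str) -> int:
    """Derive a stable, non-negative 63-bit integer ID from an entry's identity."""
    key = f"{user_id}:{date}:{content[:100]}"
    digest = hashlib.sha256(key.encode()).digest()[:8]
    # The modulo keeps the value below 2**63, which fits a signed 64-bit integer.
    return int.from_bytes(digest, byteorder="big") % (2**63)


first = point_id("rob", "2026-01-15", "Decided to store every turn in memories_tr.")
second = point_id("rob", "2026-01-15", "Decided to store every turn in memories_tr.")
other_day = point_id("rob", "2026-01-16", "Decided to store every turn in memories_tr.")
print(first == second, first == other_day)
```

One consequence of hashing only the first 100 characters is that two entries from the same day that share a long common prefix map to the same ID, and the later one overwrites the earlier one.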
87
scripts/search_q.sh
Executable file
@@ -0,0 +1,87 @@
#!/bin/bash

# search_q.sh - Search memories with chronological sorting
# Usage: ./search_q.sh "search query"
# Returns: Results sorted by timestamp (newest first)

set -e

QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"
COLLECTION="${QDRANT_COLLECTION:-memories_tr}"
LIMIT="${SEARCH_LIMIT:-10}"

if [ -z "$1" ]; then
    echo "Usage: ./search_q.sh 'your search query'"
    echo ""
    echo "Environment variables:"
    echo " QDRANT_URL - Qdrant endpoint (default: http://localhost:6333)"
    echo " SEARCH_LIMIT - Number of results (default: 10)"
    exit 1
fi

QUERY="$1"

echo "=========================================="
echo "Searching: '$QUERY'"
echo "=========================================="
echo ""

# Query the scroll API with a text filter, then sort by timestamp
# Only the first page (up to $LIMIT points) is fetched
SCROLL_ID="null"
ALL_RESULTS="[]"

while true; do
    if [ "$SCROLL_ID" = "null" ]; then
        RESPONSE=$(curl -s -X POST "$QDRANT_URL/collections/$COLLECTION/points/scroll" \
            -H "Content-Type: application/json" \
            -d "{
                \"limit\": $LIMIT,
                \"with_payload\": true,
                \"filter\": {
                    \"must\": [
                        {
                            \"key\": \"content\",
                            \"match\": {
                                \"text\": \"$QUERY\"
                            }
                        }
                    ]
                }
            }" 2>/dev/null || echo '{"result": {"points": []}}')
    else
        break  # For text search, we get results in first call
    fi

    # Extract results
    POINTS=$(echo "$RESPONSE" | jq -r '.result.points // []')

    if [ "$POINTS" = "[]" ] || [ "$POINTS" = "null" ]; then
        break
    fi

    ALL_RESULTS="$POINTS"
    break
done

# Sort by timestamp (newest first) and format output
echo "$ALL_RESULTS" | jq -r '
    sort_by(.payload.timestamp) | reverse |
    .[] |
    "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n" +
    "📅 " + (.payload.timestamp | split("T") | join(" ")) + "\n" +
    "👤 " + .payload.role + " | User: " + .payload.user_id + "\n" +
    "📝 " + (.payload.content | if length > 250 then .[0:250] + "..." else . end) + "\n"
' 2>/dev/null | tee /tmp/search_results.txt

# Count results (grep -c already prints 0 when nothing matches)
RESULT_COUNT=$(grep -c "━━━━━━━━" /tmp/search_results.txt 2>/dev/null || true)

echo ""
echo "=========================================="
if [ "$RESULT_COUNT" -gt 0 ]; then
    echo "Found $RESULT_COUNT result(s). Most recent shown first."
else
    echo "No results found for '$QUERY'"
fi
echo "=========================================="
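
`search_q.sh` splices the query into a hand-written JSON string, so a query containing a double quote or backslash produces an invalid request body. A minimal sketch of the same request built with a JSON serializer, plus the newest-first ordering, is shown below in Python. The helper names `build_scroll_body` and `newest_first` are hypothetical; the body fields mirror the ones the script sends to the scroll endpoint.

```python
import json
from typing import Any, Dict, List


def build_scroll_body(query: str, limit: int = 10) -> str:
    """Serialize the scroll request; json.dumps escapes quotes in the query."""
    body = {
        "limit": limit,
        "with_payload": True,
        "filter": {"must": [{"key": "content", "match": {"text": query}}]},
    }
    return json.dumps(body)


def newest_first(points: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order points by payload timestamp, most recent first."""
    return sorted(
        points,
        key=lambda p: p.get("payload", {}).get("timestamp", ""),
        reverse=True,
    )


body = build_scroll_body('notes about "session rotation"', limit=5)
points = [
    {"id": 1, "payload": {"timestamp": "2026-01-10T09:00:00+00:00"}},
    {"id": 2, "payload": {"timestamp": "2026-02-01T18:30:00+00:00"}},
]
print(body)
print([p["id"] for p in newest_first(points)])
```

Sorting the ISO 8601 strings lexicographically matches chronological order only when all timestamps share one format and UTC offset, which is assumed here.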
19
watcher/mem-qdrant-watcher.service
Normal file
@@ -0,0 +1,19 @@
[Unit]
Description=TrueRecall Base - Real-Time Memory Watcher
After=network.target

[Service]
Type=simple
User=<USER>
WorkingDirectory=<INSTALL_PATH>/true-recall-base/watcher
Environment="QDRANT_URL=http://<QDRANT_IP>:6333"
Environment="QDRANT_COLLECTION=memories_tr"
Environment="OLLAMA_URL=http://<OLLAMA_IP>:11434"
Environment="EMBEDDING_MODEL=snowflake-arctic-embed2"
Environment="USER_ID=<USER_ID>"
ExecStart=/usr/bin/python3 <INSTALL_PATH>/true-recall-base/watcher/realtime_qdrant_watcher.py --daemon
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
445
watcher/realtime_qdrant_watcher.py
Normal file
@@ -0,0 +1,445 @@
#!/usr/bin/env python3
"""
TrueRecall v1.2 - Real-time Qdrant Watcher
Monitors OpenClaw sessions and stores to memories_tr instantly.

This is the CAPTURE component. For curation and injection, install v2.

Changelog:
- v1.2: Fixed session rotation bug - added inactivity detection (30s threshold)
        and improved file scoring to properly detect new sessions on /new or /reset
- v1.1: Added 1-second mtime polling for session rotation
- v1.0: Initial release
"""

import os
import sys
import json
import time
import signal
import hashlib
import argparse
import requests
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Any, Optional, List

# Config
QDRANT_URL = os.getenv("QDRANT_URL", "http://10.0.0.40:6333")
QDRANT_COLLECTION = os.getenv("QDRANT_COLLECTION", "memories_tr")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://10.0.0.10:11434")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "snowflake-arctic-embed2")
USER_ID = os.getenv("USER_ID", "rob")

# Paths
SESSIONS_DIR = Path(os.getenv("OPENCLAW_SESSIONS_DIR", "/root/.openclaw/agents/main/sessions"))

# State
running = True
last_position = 0
current_file = None
turn_counter = 0


def signal_handler(signum, frame):
    global running
    print(f"\nReceived signal {signum}, shutting down...", file=sys.stderr)
    running = False


def get_embedding(text: str) -> Optional[List[float]]:
    try:
        response = requests.post(
            f"{OLLAMA_URL}/api/embeddings",
            json={"model": EMBEDDING_MODEL, "prompt": text},
            timeout=30
        )
        response.raise_for_status()
        return response.json()["embedding"]
    except Exception as e:
        print(f"Error getting embedding: {e}", file=sys.stderr)
        return None


def clean_content(text: str) -> str:
    import re

    # Remove metadata JSON blocks
    text = re.sub(r'Conversation info \(untrusted metadata\):\s*```json\s*\{[\s\S]*?\}\s*```', '', text)

    # Remove thinking tags
    text = re.sub(r'\[thinking:[^\]]*\]', '', text)

    # Remove timestamp lines
    text = re.sub(r'\[\w{3} \d{4}-\d{2}-\d{2} \d{2}:\d{2} [A-Z]{3}\]', '', text)

    # Remove markdown tables
    text = re.sub(r'\|[^\n]*\|', '', text)
    text = re.sub(r'\|[-:]+\|', '', text)

    # Remove markdown formatting (fenced blocks first, so the inline-code rule cannot split them)
    text = re.sub(r'```[\s\S]*?```', '', text)
    text = re.sub(r'\*\*([^*]+)\*\*', r'\1', text)
    text = re.sub(r'\*([^*]+)\*', r'\1', text)
    text = re.sub(r'`([^`]+)`', r'\1', text)

    # Remove horizontal rules
    text = re.sub(r'---+', '', text)
    text = re.sub(r'\*\*\*+', '', text)

    # Remove excess whitespace
    text = re.sub(r'\n{3,}', '\n', text)
    text = re.sub(r'[ \t]+', ' ', text)

    return text.strip()


def store_to_qdrant(turn: Dict[str, Any], dry_run: bool = False) -> bool:
    if dry_run:
        print(f"[DRY RUN] Would store turn {turn['turn']} ({turn['role']}): {turn['content'][:60]}...")
        return True

    vector = get_embedding(turn['content'])
    if vector is None:
        print(f"Failed to get embedding for turn {turn['turn']}", file=sys.stderr)
        return False

    payload = {
        "user_id": turn.get('user_id', USER_ID),
        "role": turn['role'],
        "content": turn['content'],
        "turn": turn['turn'],
        "timestamp": turn.get('timestamp', datetime.now(timezone.utc).isoformat()),
        "date": datetime.now(timezone.utc).strftime('%Y-%m-%d'),
        "source": "true-recall-base",
        "curated": False
    }

    # Generate point ID (includes the wall-clock time, so it is not stable across re-runs)
    turn_id = turn.get('turn', 0)
    hash_bytes = hashlib.sha256(f"{USER_ID}:turn:{turn_id}:{datetime.now().strftime('%H%M%S')}".encode()).digest()[:8]
    point_id = int.from_bytes(hash_bytes, byteorder='big') % (2**63)

    try:
        response = requests.put(
            f"{QDRANT_URL}/collections/{QDRANT_COLLECTION}/points",
            json={
                "points": [{
                    "id": abs(point_id),
                    "vector": vector,
                    "payload": payload
                }]
            },
            timeout=30
        )
        response.raise_for_status()
        return True
    except Exception as e:
        print(f"Error writing to Qdrant: {e}", file=sys.stderr)
        return False


def is_lock_valid(lock_path: Path, max_age_seconds: int = 1800) -> bool:
    """Check if lock file is valid (not stale, PID exists)."""
    try:
        with open(lock_path, 'r') as f:
            data = json.load(f)

        # Check lock file age
        created = datetime.fromisoformat(data['createdAt'].replace('Z', '+00:00'))
        if (datetime.now(timezone.utc) - created).total_seconds() > max_age_seconds:
            return False

        # Check PID exists
        pid = data.get('pid')
        if pid and not os.path.exists(f"/proc/{pid}"):
            return False

        return True
    except Exception:
        return False


def get_current_session_file():
    """Find the most recently active session file.

    Priority (per subagent analysis consensus):
    1. Explicit agent:main:main lookup from sessions.json (highest priority)
    2. Lock files with valid PID + recent timestamp
    3. Parse sessions.json for other active sessions
    4. File scoring by mtime + size (fallback)
    """
    if not SESSIONS_DIR.exists():
        return None

    sessions_json = SESSIONS_DIR / "sessions.json"

    # PRIORITY 1: Explicit main session lookup
    if sessions_json.exists():
        try:
            with open(sessions_json, 'r') as f:
                sessions_data = json.load(f)

            # Look up agent:main:main explicitly
            main_session = sessions_data.get("agent:main:main", {})
            main_session_id = main_session.get('sessionId')

            if main_session_id:
                main_file = SESSIONS_DIR / f"{main_session_id}.jsonl"
                if main_file.exists():
                    return main_file
        except Exception as e:
            print(f"Warning: Failed to parse sessions.json for main session: {e}", file=sys.stderr)

    # PRIORITY 2: Lock files with PID validation
    lock_files = list(SESSIONS_DIR.glob("*.jsonl.lock"))
    valid_locks = [lf for lf in lock_files if is_lock_valid(lf)]

    if valid_locks:
        # Get the most recent valid lock file
        newest_lock = max(valid_locks, key=lambda p: p.stat().st_mtime)
        session_file = SESSIONS_DIR / newest_lock.name.replace('.jsonl.lock', '.jsonl')
        if session_file.exists():
            return session_file

    # PRIORITY 3: Parse sessions.json for other sessions with sessionFile
    if sessions_json.exists():
        try:
            with open(sessions_json, 'r') as f:
                sessions_data = json.load(f)

            active_session = None
            active_mtime = 0

            for session_key, session_info in sessions_data.items():
                # Skip if no sessionFile (inactive subagents have null)
                session_file_path = session_info.get('sessionFile')
                if not session_file_path:
                    continue

                session_file = Path(session_file_path)
                if session_file.exists():
                    mtime = session_file.stat().st_mtime
                    if mtime > active_mtime:
                        active_mtime = mtime
                        active_session = session_file

            if active_session:
                return active_session
        except Exception as e:
            print(f"Warning: Failed to parse sessions.json: {e}", file=sys.stderr)

    # PRIORITY 4: Score files by recency (mtime) + size
    files = list(SESSIONS_DIR.glob("*.jsonl"))
    if not files:
        return None

    def file_score(p: Path) -> float:
        try:
            stat = p.stat()
            mtime = stat.st_mtime
            size = stat.st_size
            return mtime + (size / 1e9)
        except Exception:
            return 0

    return max(files, key=file_score)


def parse_turn(line: str, session_name: str) -> Optional[Dict[str, Any]]:
    global turn_counter

    try:
        entry = json.loads(line.strip())
    except json.JSONDecodeError:
        return None

    if entry.get('type') != 'message' or 'message' not in entry:
        return None

    msg = entry['message']
    role = msg.get('role')

    if role in ('toolResult', 'system', 'developer'):
        return None

    if role not in ('user', 'assistant'):
        return None

    content = ""
    if isinstance(msg.get('content'), list):
        for item in msg['content']:
            if isinstance(item, dict) and 'text' in item:
                content += item['text']
    elif isinstance(msg.get('content'), str):
        content = msg['content']

    if not content:
        return None

    content = clean_content(content)
    if not content or len(content) < 5:
        return None

    turn_counter += 1

    return {
        'turn': turn_counter,
        'role': role,
        'content': content[:2000],
        'timestamp': entry.get('timestamp', datetime.now(timezone.utc).isoformat()),
        'user_id': USER_ID
    }


def process_new_lines(f, session_name: str, dry_run: bool = False):
    global last_position

    f.seek(last_position)

    for line in f:
        line = line.strip()
        if not line:
            continue

        turn = parse_turn(line, session_name)
        if turn:
            if store_to_qdrant(turn, dry_run):
                print(f"✅ Turn {turn['turn']} ({turn['role']}) → Qdrant")

    last_position = f.tell()


def watch_session(session_file: Path, dry_run: bool = False):
    global last_position, turn_counter

    session_name = session_file.name.replace('.jsonl', '')
    print(f"Watching session: {session_file.name}")

    try:
        with open(session_file, 'r') as f:
            for line in f:
                turn_counter += 1
        last_position = session_file.stat().st_size
        print(f"Session has {turn_counter} existing turns, starting from position {last_position}")
    except Exception as e:
        print(f"Warning: Could not read existing turns: {e}", file=sys.stderr)
        last_position = 0

    last_session_check = time.time()
    last_data_time = time.time()  # Track when we last saw new data
    last_file_size = session_file.stat().st_size if session_file.exists() else 0

    INACTIVITY_THRESHOLD = 30  # seconds - if no data for 30s, check for new session

    with open(session_file, 'r') as f:
        while running:
            if not session_file.exists():
                print("Session file removed, looking for new session...")
                return None

            current_time = time.time()

            # Check for newer session every 1 second
            if current_time - last_session_check > 1.0:
                last_session_check = current_time
                newest_session = get_current_session_file()
                if newest_session and newest_session != session_file:
                    print(f"Newer session detected: {newest_session.name}")
                    return newest_session

            # Check if current file is stale (no new data for threshold)
            if current_time - last_data_time > INACTIVITY_THRESHOLD:
                try:
                    current_size = session_file.stat().st_size
                    # If file hasn't grown, check if another session is active
                    if current_size == last_file_size:
                        newest_session = get_current_session_file()
                        if newest_session and newest_session != session_file:
                            print(f"Current session inactive, switching to: {newest_session.name}")
                            return newest_session
                    else:
                        # File grew, update tracking
                        last_file_size = current_size
                        last_data_time = current_time
                except Exception:
                    pass

            # Process new lines and update activity tracking
            old_position = last_position
            process_new_lines(f, session_name, dry_run)

            # If we processed new data, update activity timestamp
            if last_position > old_position:
                last_data_time = current_time
                try:
                    last_file_size = session_file.stat().st_size
                except Exception:
                    pass

            time.sleep(0.1)

    return session_file


def watch_loop(dry_run: bool = False):
    global current_file, turn_counter, last_position

    while running:
        session_file = get_current_session_file()

        if session_file is None:
            print("No active session found, waiting...")
            time.sleep(1)
            continue

        if current_file != session_file:
            print(f"\nNew session detected: {session_file.name}")
            current_file = session_file
            turn_counter = 0
            last_position = 0

        result = watch_session(session_file, dry_run)

        if result is None:
            current_file = None
            time.sleep(0.5)


def main():
    global USER_ID

    parser = argparse.ArgumentParser(description="TrueRecall v1.2 - Real-time Memory Capture")
    parser.add_argument("--daemon", "-d", action="store_true", help="Run as daemon")
    parser.add_argument("--once", "-o", action="store_true", help="Process once then exit")
    parser.add_argument("--dry-run", "-n", action="store_true", help="Don't write to Qdrant")
    parser.add_argument("--user-id", "-u", default=USER_ID, help=f"User ID (default: {USER_ID})")

    args = parser.parse_args()

    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    if args.user_id:
        USER_ID = args.user_id

    print(f"🔍 TrueRecall v1.2 - Real-time Memory Capture")
    print(f"📍 Qdrant: {QDRANT_URL}/{QDRANT_COLLECTION}")
    print(f"🧠 Ollama: {OLLAMA_URL}/{EMBEDDING_MODEL}")
    print(f"👤 User: {USER_ID}")
    print()

    if args.once:
        print("Running once...")
        session_file = get_current_session_file()
        if session_file:
            watch_session(session_file, args.dry_run)
        else:
            print("No session found")
    else:
        print("Running as daemon (Ctrl+C to stop)...")
        watch_loop(args.dry_run)


if __name__ == "__main__":
    main()
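
The watcher follows a growing `.jsonl` file by remembering a file offset and reading only what was appended since the last pass. The sketch below shows that technique in isolation, under a few assumptions: `read_new_messages` is a hypothetical helper rather than a function from the watcher, the file is opened in binary mode so the offset is an exact byte count, and a trailing line without a newline is treated as still being written and left for the next pass.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Tuple


def read_new_messages(path: Path, position: int) -> Tuple[List[Dict[str, Any]], int]:
    """Return message entries appended after `position`, plus the new offset."""
    messages: List[Dict[str, Any]] = []
    with open(path, "rb") as f:
        f.seek(position)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line that is still being written
            position = f.tell()
            try:
                entry = json.loads(line)
            except ValueError:
                continue  # skip blank or malformed lines
            if isinstance(entry, dict) and entry.get("type") == "message":
                messages.append(entry)
    return messages, position


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "session.jsonl"
    # Two complete lines followed by the first half of a third one.
    log.write_bytes(
        b'{"type": "message", "message": {"role": "user"}}\n'
        b'{"type": "other"}\n'
        b'{"type": "mess'
    )
    first, offset = read_new_messages(log, 0)
    with open(log, "ab") as f:
        f.write(b'age", "message": {"role": "assistant"}}\n')
    second, offset2 = read_new_messages(log, offset)

print(len(first), len(second))
```

Advancing the offset only past complete lines means a turn that is flushed in two writes is parsed once, after its newline arrives.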