SpeedyFoxAi e8854cd959 Initial commit: Jarvis Memory system

2026-02-23 12:13:04 -06:00

Memory System Architecture Diagrams

Created: February 18, 2026
Updated: February 18, 2026 (v2.0 - Added QMD, Task Queue, Session Harvesting, Email Integration)
Purpose: Complete backup of memory system architecture for Google Slides presentations

Part 1: Built-in Memory System (OpenClaw Default)
Part 2: Custom Memory System (What We Built)
Part 3: Comparison — Built-in vs Custom
Part 4: QMD (Query Markdown) — OpenClaw Experimental
Part 5: Task Queue System
Part 6: Session Harvesting
Part 7: Email Integration
Part 8: PROJECTNAME.md Workflow
Part 9: Complete Infrastructure Reference

Part 1: Built-in Memory System (OpenClaw Default)

Architecture Diagram

┌─────────────────────────────────────┐
│     OpenClaw Gateway Service        │
│  (Manages session state & routing)  │
└──────────────┬──────────────────────┘
               │
        ┌──────▼──────┐
        │   Session   │
        │   Context   │
        │ (In-Memory) │
        └──────┬──────┘
               │
        ┌──────▼──────────────────┐
        │  Message History Buffer │
        │     (Last N messages)   │
        │  Default: 8k-32k tokens │
        └──────┬──────────────────┘
               │
        ┌──────▼────────┐
        │   Model Input │
        │   (LLM Call)  │
        └───────────────┘

How Built-in Memory Works

Process Flow:

User sends message → Added to session context
Context accumulates in memory (not persistent)
Model receives last N messages as context
Session ends → Context is LOST
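The process flow above can be sketched as a small in-memory buffer. This is a minimal illustration, not OpenClaw's actual implementation: the `SessionContext` class, the `estimate_tokens` heuristic (about 4 characters per token), and the 8000-token default are assumptions chosen to mirror the "last N messages" behaviour described here.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


class SessionContext:
    """Hypothetical in-memory buffer that keeps only the most recent messages."""

    def __init__(self, max_tokens: int = 8000):
        self.max_tokens = max_tokens
        self.messages = deque()  # (role, text) pairs, oldest first

    def total_tokens(self) -> int:
        return sum(estimate_tokens(text) for _, text in self.messages)

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))
        # Drop the oldest messages once the token budget is exceeded.
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.popleft()

    def model_input(self) -> list:
        """What the model would receive: the surviving tail of the history."""
        return list(self.messages)

    def reset(self) -> None:
        """Equivalent of /new or /reset: nothing is written anywhere first."""
        self.messages.clear()
```

Nothing in this sketch touches disk, which is why a reset or an expired session leaves no trace of the conversation.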

Key Characteristics:

✅ Works automatically (no setup)
✅ Fast (in-memory)
❌ Lost on /new or /reset
❌ Lost when session expires
❌ No cross-session memory
❌ Limited context window (~8k-32k tokens)

Built-in Limitations

Feature	Status
Session Persistence	❌ NO
Cross-Session Memory	❌ NO
User-Centric Storage	❌ NO
Long-Term Memory	❌ NO
Semantic Search	❌ NO
Conversation Threading	❌ NO
Automatic Backup	❌ NO

Part 2: Custom Memory System (What We Built)

Complete Architecture Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                      MULTI-LAYER MEMORY SYSTEM                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   LAYER 0: Real-Time Session Context (OpenClaw Gateway)             │
│   ┌─────────────────────────────────────────────────────────────┐ │
│   │  Session JSONL → Live context (temporary only)              │ │
│   └─────────────────────────────────────────────────────────────┘ │
│                              │                                       │
│   ┌──────────────────────────▼──────────────────────────────────┐  │
│   │  LAYER 1: Redis Buffer (Fast Short-Term)                     │  │
│   │  ├─ Key: mem:rob                                            │  │
│   │  ├─ Accumulates new turns since last check                 │  │
│   │  ├─ Heartbeat: Append-only (hb_append.py)                  │  │
│   │  ├─ Manual: Full dump (save_mem.py)                        │  │
│   │  └─ Flush: Daily 3:00 AM → Qdrant                          │  │
│   └──────────────────────────┬──────────────────────────────────┘  │
│                              │                                      │
│   ┌──────────────────────────▼──────────────────────────────────┐  │
│   │  LAYER 2: Daily File Logs (.md)                              │  │
│   │  ├─ Location: memory/YYYY-MM-DD.md                         │  │
│   │  ├─ Format: Human-readable Markdown                         │  │
│   │  ├─ Backup: 3:30 AM sliding_backup.sh                       │  │
│   │  └─ Retention: Permanent (git-tracked)                     │  │
│   └──────────────────────────┬──────────────────────────────────┘  │
│                              │                                      │
│   ┌──────────────────────────▼──────────────────────────────────┐  │
│   │  LAYER 3: Qdrant Vector DB (Semantic Long-Term)              │  │
│   │  ├─ Host: 10.0.0.40:6333                                    │  │
│   │  ├─ Embeddings: snowflake-arctic-embed2 (1024-dim)         │  │
│   │  ├─ Collections:                                            │  │
│   │  │   • kimi_memories (conversations)                        │  │
│   │  │   • kimi_kb (knowledge base)                            │  │
│   │  │   • private_court_docs (legal)                          │  │
│   │  ├─ Deduplication: Content hash per user                   │  │
│   │  └─ User-centric: user_id: "rob"                            │  │
│   └─────────────────────────────────────────────────────────────┘  │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  CROSS-CUTTING: Task Queue (Redis)                          │   │
│   │  ├─ tasks:pending → tasks:active → tasks:completed         │   │
│   │  └─ Heartbeat worker for background jobs                   │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  CROSS-CUTTING: Email Integration (Gmail)                   │   │
│   │  ├─ hb_check_email.py (Heartbeat)                          │   │
│   │  └─ Authorized senders: your_email@gmail.com               │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Detailed Component Breakdown

Component 1: Daily File Logs

Location: /root/.openclaw/workspace/memory/YYYY-MM-DD.md
Format: Markdown with timestamps
Content: Full conversation history
Access: Direct file read
Retention: Permanent (until deleted)
Auto-created: Yes, every session
Backup: sliding_backup.sh at 3:30 AM

Component 2: Redis Buffer (mem-redis skill)

Host: 10.0.0.36:6379
Key: mem:rob
Type: List (LPUSH; newest turn at the head)
Purpose: Fast access, multi-session accumulation
Flush: Daily at 3:00 AM to Qdrant
No TTL: Data persists until successfully backed up
Fail-safe: If cron fails, data stays in Redis
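The "no TTL" and fail-safe properties above can be sketched as a flush routine that only trims the buffer after the downstream write succeeds. This is a minimal sketch, not the code in `cron_backup.py`: `FakeRedis`, `flush_buffer`, and the `store_batch` callback are hypothetical names, and `FakeRedis` is an in-memory stand-in that mimics only the `lpush`, `lrange`, and `ltrim` commands so the example runs without a server.

```python
def _redis_slice(items, start, end):
    """Apply Redis-style inclusive indices (negatives count from the tail)."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = n + end
    if end < 0 or start > end:
        return []
    return items[start:end + 1]


class FakeRedis:
    """In-memory stand-in for the three list commands used below."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        self.lists.setdefault(key, []).insert(0, value)  # newest at the head

    def lrange(self, key, start, end):
        return list(_redis_slice(self.lists.get(key, []), start, end))

    def ltrim(self, key, start, end):
        self.lists[key] = _redis_slice(self.lists.get(key, []), start, end)


def flush_buffer(client, key, store_batch):
    """Copy buffered turns downstream; clear them only if the write succeeds."""
    turns = client.lrange(key, 0, -1)
    if not turns:
        return 0
    chronological = list(reversed(turns))  # LPUSH keeps the newest turn first
    try:
        store_batch(chronological)
    except Exception:
        return 0  # write failed: leave the buffer untouched for the next run
    # Trim only the entries that were read; turns pushed meanwhile stay at the head.
    client.ltrim(key, 0, -(len(turns) + 1))
    return len(turns)
```

Trimming by count rather than deleting the key is a design choice in this sketch: turns appended while the flush is running are not lost.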

Scripts:

Script	Purpose
`hb_append.py`	Heartbeat: Add NEW turns only
`save_mem.py`	Manual: Save ALL turns (with --reset option)
`cron_backup.py`	Daily: Process Redis → Qdrant → Clear Redis
`mem_retrieve.py`	Manual: Retrieve recent turns from Redis
`search_mem.py`	Search both Redis (exact) + Qdrant (semantic)

Component 3: Qdrant Vector Database

Host: http://10.0.0.40:6333
Embeddings Model: snowflake-arctic-embed2 at 10.0.0.10:11434
Vector Dimensions: 1024
User-Centric: All memories tagged with user_id: "rob"
Cross-Chat Search: Find info from ANY past conversation

Collections:

Collection	Purpose	Content
`kimi_memories`	Personal conversations	User + AI messages
`kimi_kb`	Knowledge base	Web data, docs, tutorials
`private_court_docs`	Legal documents	Court files, legal research

Component 4: Full Context Mode (Mem0-Style)

3 Embeddings Per Turn:

User message embedding
AI response embedding
Combined summary embedding

Threading Metadata:

user_id: "rob" (persistent identifier)
conversation_id: Groups related turns
session_id: Which chat instance
turn_number: Sequential ordering
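One way to picture the three-records-per-turn layout with its threading metadata is shown below. This is a sketch under assumptions: `TurnMetadata` and `build_turn_records` are hypothetical names, the "summary" text is a plain concatenation (the real system may summarise differently), and the embedding call itself is left out so the example stays self-contained.

```python
from dataclasses import dataclass, asdict


@dataclass
class TurnMetadata:
    user_id: str          # persistent identifier, e.g. "rob"
    conversation_id: str  # groups related turns
    session_id: str       # which chat instance
    turn_number: int      # sequential ordering


def build_turn_records(meta: TurnMetadata, user_msg: str, ai_msg: str) -> list:
    """Return the three texts to embed for one turn, each carrying the metadata."""
    texts = {
        "user": user_msg,
        "assistant": ai_msg,
        # Assumption: a simple concatenation stands in for the combined summary.
        "summary": f"User: {user_msg}\nAssistant: {ai_msg}",
    }
    return [{"kind": kind, "text": text, **asdict(meta)} for kind, text in texts.items()]
```

Because every record carries the same `conversation_id` and `turn_number`, a thread can be rebuilt later by filtering on the first and sorting on the second.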

Deduplication System

What It Is: A content-based duplicate detection system that prevents storing the exact same information multiple times for the same user.

How It Works:

Content Hash Generation: Each memory generates a SHA-256 hash of its content
Per-User Scope: Deduplication is per-user (same content from different users = allowed)
Pre-Storage Check: Before storing to Qdrant, check if hash exists for this user
Skip if Duplicate: If hash exists → skip storage, return "already exists"
Store if New: If hash doesn't exist → generate embeddings and store
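The five steps above can be sketched with a per-user set of SHA-256 hashes. This is an illustration only: `DedupStore` is a hypothetical in-memory class, whereas the real check would query Qdrant for an existing hash before generating embeddings.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the memory content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class DedupStore:
    """Hypothetical in-memory stand-in for the pre-storage hash check."""

    def __init__(self):
        self._seen = set()  # (user_id, content_hash) pairs
        self.stored = []

    def store(self, user_id: str, text: str) -> str:
        digest = content_hash(text)
        key = (user_id, digest)  # per-user scope: the same text from another user is allowed
        if key in self._seen:
            return "already exists"
        self._seen.add(key)
        # Embedding generation would happen here, only for new content.
        self.stored.append({"user_id": user_id, "text": text, "content_hash": digest})
        return "stored"
```

Checking the hash before embedding matters for cost: duplicates are rejected without ever calling the embedding model.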

Deduplication by Layer:

Layer	Deduplication	Behavior
Daily Files	❌ No	All turns appended (intentional — audit trail)
Redis Buffer	❌ No	All turns stored (temporary, flushed daily)
Qdrant (kimi_memories)	✅ Yes	Per-user content hash check
Qdrant (kimi_kb)	✅ Yes	Per-collection content hash check

Complete Script Reference

/root/.openclaw/workspace/
├── memory/
│   └── YYYY-MM-DD.md (daily logs)
│
├── skills/
│   ├── mem-redis/
│   │   └── scripts/
│   │       ├── hb_append.py (heartbeat: new turns only)
│   │       ├── save_mem.py (manual: all turns)
│   │       ├── cron_backup.py (daily flush to Qdrant)
│   │       ├── mem_retrieve.py (read from Redis)
│   │       └── search_mem.py (search Redis + Qdrant)
│   │
│   ├── qdrant-memory/
│   │   └── scripts/
│   │       ├── auto_store.py (immediate Qdrant storage)
│   │       ├── background_store.py (async storage)
│   │       ├── q_save.py (quick save trigger)
│   │       ├── daily_conversation_backup.py (file → Qdrant)
│   │       ├── get_conversation_context.py (retrieve threads)
│   │       ├── search_memories.py (semantic search)
│   │       ├── harvest_sessions.py (bulk import old sessions)
│   │       ├── harvest_newest.py (specific sessions)
│   │       ├── hb_check_email.py (email integration)
│   │       ├── sliding_backup.sh (file backup)
│   │       ├── kb_store.py / kb_search.py (knowledge base)
│   │       └── court_store.py / court_search.py (legal docs)
│   │
│   └── task-queue/
│       └── scripts/
│           ├── heartbeat_worker.py (process tasks)
│           ├── add_task.py (add background task)
│           └── list_tasks.py (view queue status)
│
└── MEMORY_DEF/
    ├── README.md
    ├── daily-backup.md
    └── agent-messaging.md

Technical Flow

Real-Time (Every Message)

User Input → AI Response
     ↓
Redis Buffer (fast append)
     ↓
File Log (persistent)
     ↓
[Optional: "save q"] → Qdrant (semantic)

Heartbeat (Every ~30-60 min)

hb_append.py → Check for new turns → Append to Redis
hb_check_email.py → Check Gmail → Process new emails
heartbeat_worker.py → Check task queue → Execute tasks

Daily Backup (3:00 AM & 3:30 AM)

3:00 AM: Redis Buffer → Flush → Qdrant (kimi_memories)
         └─> Clear Redis after successful write

3:30 AM: Daily Files → sliding_backup.sh → Archive
         └─> daily_conversation_backup.py → Qdrant

On Retrieval ("search q <topic>" or "q <topic>")

Search Query
     ↓
search_mem.py
     ├──► Redis (exact text match, recent)
     └──► Qdrant (semantic similarity, long-term)
     ↓
Combined Results (Redis first, then Qdrant)
     ↓
Return context-enriched response
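The merge step in the flow above (Redis first, then Qdrant) can be sketched as follows. This is not the code in `search_mem.py`: `search_redis_exact` and `combine_results` are hypothetical helpers, and the Qdrant hits are assumed to arrive as plain strings already ordered by similarity.

```python
def search_redis_exact(query: str, recent_turns: list) -> list:
    """Case-insensitive substring match over recent turns (the exact-text side)."""
    needle = query.lower()
    return [turn for turn in recent_turns if needle in turn.lower()]


def combine_results(redis_hits: list, qdrant_hits: list, limit: int = 5) -> list:
    """Merge results, Redis first, dropping texts that appear in both sources."""
    combined, seen = [], set()
    for source, hits in (("redis", redis_hits), ("qdrant", qdrant_hits)):
        for text in hits:
            if text not in seen:
                seen.add(text)
                combined.append((source, text))
    return combined[:limit]
```

Putting Redis first favours the most recent turns, which have not yet been flushed to Qdrant and would otherwise be invisible to a semantic search.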

Part 3: Comparison — Built-in vs Custom

Feature Comparison Table

Feature	Built-in	Custom System
Session Persistence	❌ Lost on reset	✅ Survives forever
Cross-Session Memory	❌ None	✅ All sessions linked
User-Centric	❌ Session-based	✅ User-based (Mem0-style)
Semantic Search	❌ None	✅ Full semantic retrieval
Conversation Threading	❌ Linear only	✅ Thread-aware
Long-Term Storage	❌ Hours only	✅ Permanent (disk + vector)
Backup & Recovery	❌ None	✅ Multi-layer redundancy
Privacy	⚠️ Cloud dependent	✅ Fully local/self-hosted
Speed	✅ Fast (RAM)	✅ Fast (Redis) + Deep (Qdrant)
Cost	❌ OpenAI API tokens	✅ Free (local infrastructure)
Embeddings	❌ None	✅ 1024-dim (snowflake)
Cross-Reference	❌ None	✅ Links related memories
Task Queue	❌ None	✅ Background job processing
Email Integration	❌ None	✅ Gmail via heartbeat polling
Deduplication	❌ None	✅ Content hash-based

Why It's Better — Key Advantages

1. Mem0-Style Architecture

Memories follow the USER, not the session
Ask "what did I say about X?" → finds from ANY past conversation
Persistent identity across all chats

2. Hybrid Storage Strategy

Redis: Speed (real-time access)
Files: Durability (never lost, human-readable)
Qdrant: Intelligence (semantic search, similarity)

3. Multi-Modal Retrieval

Exact match: File grep, exact text search
Semantic search: Vector similarity, conceptual matching
Thread reconstruction: Conversation_id grouping

4. Local-First Design

No cloud dependencies for memory storage or search
No API costs (except initial setup)
Full privacy control
Works offline
Self-hosted infrastructure

5. Triple Redundancy

Layer	Purpose	Persistence
Redis	Speed	Temporary (daily flush)
Files	Durability	Permanent
Qdrant	Intelligence	Permanent

Part 4: QMD (Query Markdown) — OpenClaw Experimental

What is QMD?

QMD = Query Markdown — OpenClaw's experimental local-first memory backend that replaces the built-in SQLite indexer.

Key Difference:

Built-in indexer (current OpenClaw default): SQLite + vector embeddings
QMD: BM25 + vectors + reranking in a standalone binary

QMD Architecture

┌─────────────────────────────────────────────┐
│  QMD Sidecar (Experimental)                 │
│  ├─ BM25 (exact token matching)            │
│  ├─ Vector similarity (semantic)           │
│  └─ Reranking (smart result ordering)      │
└──────────────────┬──────────────────────────┘
                   │
        ┌──────────▼──────────┐
        │  Markdown Source    │
        │  memory/*.md        │
        │  MEMORY.md          │
        └─────────────────────┘
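To show why BM25 helps with exact tokens such as IDs and error codes, here is a minimal scorer using the standard Okapi BM25 formula. It is a textbook sketch, not QMD's implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the function names are assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase whitespace tokenizer (an assumption; real tokenizers differ)."""
    return text.lower().split()


def bm25_scores(query: str, docs: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score every document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A query for a rare token scores only the documents that literally contain it, which is the behaviour a pure vector search tends to blur.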

QMD vs Current System

Feature	Current (Qdrant)	QMD (Experimental)
Storage	Qdrant server (10.0.0.40)	Local SQLite + files
Network	Requires network	Fully offline
Search	Vector only	Hybrid (BM25 + vector)
Exact tokens	Weak	Strong (BM25)
Embeddings	snowflake-arctic-embed2	Local GGUF models
Git-friendly	❌ Opaque vectors	✅ Markdown source
Explainable	Partial	Full (file.md#L12 citations)
Status	Production	Experimental

When QMD Might Be Better

✅ Use QMD if:

You want full offline operation (no 10.0.0.40 dependency)
You frequently search for exact tokens (IDs, function names, error codes)
You want human-editable memory files
You want git-tracked memory that survives system rebuilds

❌ Stick with Qdrant if:

Your current system is stable
You need multi-device access to same memory
You're happy with semantic-only search
You need production reliability

QMD Configuration (OpenClaw)

memory: {
  backend: "qmd",
  citations: "auto",
  qmd: {
    includeDefaultMemory: true,
    update: { interval: "5m", debounceMs: 15000 },
    limits: { maxResults: 6, timeoutMs: 4000 },
    paths: [
      { name: "docs", path: "~/notes", pattern: "**/*.md" }
    ]
  }
}

QMD Prerequisites

# Install QMD binary
bun install -g https://github.com/tobi/qmd

# Install SQLite with extensions (macOS)
brew install sqlite

# QMD auto-downloads GGUF models on first run (~0.6GB)

Part 5: Task Queue System

Architecture

┌─────────────────────────────────────────────┐
│  Redis Task Queue                          │
│  ├─ tasks:pending (FIFO)                 │
│  ├─ tasks:active (currently running)     │
│  ├─ tasks:completed (history)              │
│  └─ task:{id} (hash with details)         │
└──────────────────┬────────────────────────┘
                   │
        ┌──────────▼──────────┐
        │  Heartbeat Worker   │
        │  heartbeat_worker.py│
        └─────────────────────┘

Task Fields

id - Unique task ID
description - What to do
status - pending/active/completed/failed
created_at - Timestamp
created_by - Who created the task
result - Output from execution

Usage

# Add a task
python3 skills/task-queue/scripts/add_task.py "Check server disk space"

# List tasks
python3 skills/task-queue/scripts/list_tasks.py

# Heartbeat auto-executes pending tasks
python3 skills/task-queue/scripts/heartbeat_worker.py

Part 6: Session Harvesting

What is Session Harvesting?

Bulk import of historical OpenClaw session JSONL files into Qdrant memory.

When to Use

After setting up new memory system → backfill existing sessions
After discovering missed backups → recover data
Periodically → if cron jobs missed data

Scripts

Script	Purpose
`harvest_sessions.py`	Auto-harvest (limited by memory)
`harvest_newest.py`	Specific sessions (recommended)

Usage

# Harvest specific sessions (recommended)
python3 harvest_newest.py --user-id rob session-1.jsonl session-2.jsonl

# Find newest sessions to harvest
ls -t /root/.openclaw/agents/main/sessions/*.jsonl | head -20

# Auto-harvest with limit
python3 harvest_sessions.py --user-id rob --limit 10

How It Works

Parse → Reads JSONL session file
Pair → Matches user message with AI response
Embed → Generates 3 embeddings (user, AI, summary)
Deduplicate → Checks content_hash before storing
Store → Upserts to Qdrant with user_id, conversation_id
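The Parse and Pair steps above can be sketched as follows. The record layout is an assumption: each JSONL line is taken to be an object with `role` and `content` string fields, which may not match the real OpenClaw session schema, and `parse_session` and `pair_turns` are hypothetical helpers rather than functions from the harvest scripts.

```python
import json


def parse_session(lines) -> list:
    """Read JSONL lines, keeping only well-formed user/assistant messages."""
    messages = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt lines instead of aborting the harvest
        if (
            isinstance(record, dict)
            and record.get("role") in ("user", "assistant")
            and isinstance(record.get("content"), str)
        ):
            messages.append(record)
    return messages


def pair_turns(messages: list) -> list:
    """Match each user message with the next assistant response."""
    turns, pending_user = [], None
    for message in messages:
        if message["role"] == "user":
            pending_user = message["content"]
        elif pending_user is not None:
            turns.append({
                "turn_number": len(turns) + 1,
                "user": pending_user,
                "assistant": message["content"],
            })
            pending_user = None
    return turns
```

Each paired turn would then go through the embed, deduplicate, and store steps; an assistant message with no preceding user message is ignored in this sketch.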

Part 7: Email Integration

Architecture

┌─────────────────────────────────────────────┐
│  Gmail Inbox                               │
│  (your_email@gmail.com)                    │
└──────────────────┬──────────────────────────┘
                   │
        ┌──────────▼──────────┐
        │  hb_check_email.py  │
        │  (Heartbeat)         │
        └─────────────────────┘

Authorized Senders

your_email@gmail.com (Configure in hb_check_email.py)
Add more as needed

Usage

# Check emails (runs automatically in heartbeat)
python3 skills/qdrant-memory/scripts/hb_check_email.py

How It Works

Polls Gmail for new messages
Filters by authorized senders
Reads subject and body
Searches Qdrant for context
Responds with helpful reply
Stores email + response to Qdrant
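The sender filter in the second step can be sketched with the standard library. This is not the code in `hb_check_email.py`: `is_authorized` and `filter_messages` are hypothetical helpers, and messages are assumed to be dictionaries with a `from` field holding the raw From header.

```python
from email.utils import parseaddr

# Placeholder address taken from this document; configure real senders here.
AUTHORIZED_SENDERS = {"your_email@gmail.com"}


def is_authorized(from_header: str, allowed=AUTHORIZED_SENDERS) -> bool:
    """Compare the bare address in a From header against the allow-list."""
    _, address = parseaddr(from_header)
    return address.lower() in {entry.lower() for entry in allowed}


def filter_messages(messages: list, allowed=AUTHORIZED_SENDERS) -> list:
    """Keep only messages whose sender is on the allow-list."""
    return [m for m in messages if is_authorized(m.get("from", ""), allowed)]
```

Parsing the header first means a display name such as `Rob <your_email@gmail.com>` still matches, while a missing header is rejected.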

Part 8: PROJECTNAME.md Workflow

See original document for full details — this is a summary reference.

Purpose

Preserve context, decisions, and progress across sessions.

The Golden Rule — Append Only

NEVER Overwrite. ALWAYS Append.
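In code, the append-only rule comes down to opening the file in append mode rather than write mode. A minimal sketch, assuming a hypothetical `append_entry` helper and a dated-heading entry format that is not prescribed by this document:

```python
from datetime import date
from pathlib import Path


def append_entry(path, heading: str, body: str, today=None) -> None:
    """Add a dated section to the end of a project file without touching earlier text."""
    today = today or date.today()
    entry = f"\n## {heading} ({today.isoformat()})\n{body.rstrip()}\n"
    # Mode "a" only ever writes at the end of the file; mode "w" would truncate it.
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
```

Earlier decisions stay in place even when they are superseded, so the file doubles as a history of how the project got to its current state.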

File Structure Template

# PROJECTNAME.md

## Project Overview
- **Goal:** What we're achieving
- **Scope:** What's in/out
- **Success Criteria:** How we know it's done

## Current Status
- [x] Completed tasks
- [ ] In progress
- [ ] Upcoming

## Decisions Log
| Date | Decision | Rationale |
|------|----------|-----------|
| 2026-02-18 | Use X over Y | Because of Z |

## Technical Details
- Infrastructure specs
- Code snippets
- Configuration

## Blockers & Risks
- What's blocking progress
- Known issues

## Next Steps
- Immediate actions
- Questions to resolve

Real Examples

File	Project	Status
`MEM_DIAGRAM.md`	Memory system documentation	✅ Active
`AUDIT-PLAN.md`	OpenClaw infrastructure audit	✅ Completed
`YOUTUBE_UPDATE.md`	Video description optimization	🔄 Ongoing

Part 9: Complete Infrastructure Reference

Hardware/Network Topology

┌────────────────────────────────────────────────────────────────┐
│                     PROXMOX CLUSTER                            │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │  Ollama      │  │  Qdrant      │  │  Redis       │        │
│  │  10.0.0.10   │  │  10.0.0.40   │  │  10.0.0.36   │        │
│  │  GPU Node     │  │  LXC          │  │  LXC          │        │
│  │  Embeddings   │  │  Vector DB    │  │  Task Queue   │        │
│  │  11434        │  │  6333         │  │  6379         │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │  SearXNG     │  │  Kokoro TTS  │  │  OpenClaw    │        │
│  │  10.0.0.8    │  │  10.0.0.228  │  │  Workspace   │        │
│  │  Search       │  │  Voice        │  │  Kimi         │        │
│  │  8888         │  │  8880         │  │               │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

Service Reference

Service	Purpose	Address	Model/Version
Qdrant	Vector database	10.0.0.40:6333	v1.x
Redis	Buffer + tasks	10.0.0.36:6379	v7.x
Ollama	Embeddings	10.0.0.10:11434	snowflake-arctic-embed2
SearXNG	Search	10.0.0.8:8888	Local
Kokoro TTS	Voice	10.0.0.228:8880	TTS

Daily Automation Schedule

Time	Task	Script
3:00 AM	Redis → Qdrant flush	`cron_backup.py`
3:30 AM	File-based sliding backup	`sliding_backup.sh`
Every 30-60 min	Heartbeat checks	`hb_append.py`, `hb_check_email.py`, `heartbeat_worker.py`

Manual Triggers

Command	What It Does
`"save mem"`	Save ALL context to Redis + File
`"save q"`	Immediate Qdrant storage
`"q <topic>"`	Semantic search
`"search q <topic>"`	Full semantic search
`"remember this"`	Quick note to daily file
`"check messages"`	Check Redis for agent messages
`"send to Max"`	Send message to Max via Redis

Environment Variables

# Qdrant
QDRANT_URL=http://10.0.0.40:6333

# Redis
REDIS_HOST=10.0.0.36
REDIS_PORT=6379

# Ollama
OLLAMA_URL=http://10.0.0.10:11434

# User
DEFAULT_USER_ID=rob
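A script might read these variables as sketched below, falling back to the addresses listed in this document when a variable is unset. `MemoryConfig` and `load_config` are hypothetical names; the actual scripts may load their settings differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    qdrant_url: str
    redis_host: str
    redis_port: int
    ollama_url: str
    default_user_id: str


def load_config(env=None) -> MemoryConfig:
    """Build the config from environment variables, with this document's defaults."""
    env = os.environ if env is None else env
    return MemoryConfig(
        qdrant_url=env.get("QDRANT_URL", "http://10.0.0.40:6333"),
        redis_host=env.get("REDIS_HOST", "10.0.0.36"),
        redis_port=int(env.get("REDIS_PORT", "6379")),  # env values are strings
        ollama_url=env.get("OLLAMA_URL", "http://10.0.0.10:11434"),
        default_user_id=env.get("DEFAULT_USER_ID", "rob"),
    )
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without modifying the real environment.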

Version History

Date	Version	Changes
2026-02-18	1.0	Initial documentation
2026-02-18	2.0	Added QMD, Task Queue, Session Harvesting, Email Integration, complete script reference

Quick Reference Card

Memory Commands

save mem      → Redis + File (all turns)
save q        → Qdrant (semantic, embeddings)
q <topic>     → Search Qdrant
remember this → Quick note to file

Architecture Layers

Layer 0: Session Context (temporary)
Layer 1: Redis Buffer (fast, 3:00 AM flush)
Layer 2: File Logs (permanent, human-readable)
Layer 3: Qdrant (semantic, searchable)

Key Files

memory/YYYY-MM-DD.md     → Daily conversation logs
MEMORY.md                → Curated long-term memory
MEMORY_DEF/*.md          → System documentation
skills/*/scripts/*.py    → Automation scripts

Infrastructure

10.0.0.40:6333 → Qdrant (vectors)
10.0.0.36:6379 → Redis (buffer + tasks)
10.0.0.10:11434 → Ollama (embeddings)

This document serves as the complete specification for the memory system. For questions or updates, see MEMORY.md or the SKILL.md files in each skill directory.
