# NeuralStream

**Neural streaming memory for OpenClaw with gem-based context injection.**

## Overview

NeuralStream extracts high-value insights ("gems") from conversation batches using qwen3, stores them in Qdrant, and injects relevant gems into context on each new turn. This creates **infinite effective context**: the active window stays small, but semantically relevant gems from all past conversations are always retrievable.

## Core Concept

| Traditional Memory | NeuralStream |
|--------------------|--------------|
| Context lost on `/new` | Gems persist in Qdrant |
| Full history or generic summary | Semantic gem retrieval |
| Static context window | Dynamic injection |
| Survives compaction only | Survives session reset |
| **Limited context** | **Infinite effective context** |

## How It Works

### Capture → Extract → Store → Retrieve

1. **Capture:** Every turn is buffered to Redis (reuses mem-redis-watcher).
2. **Extract:** A batch of 5 turns goes to qwen3 (with 256k context), which extracts structured gems.
3. **Store:** Gems are embedded and stored in the Qdrant `neuralstream` collection.
4. **Retrieve:** Each new turn runs a semantic search and injects the top-10 gems.

### Hybrid Triggers (Three-Way)

| Trigger | Condition | Purpose |
|---------|-----------|---------|
| Batch | Every 5 turns | Normal extraction |
| Context | 50% usage (`ctx.getContextUsage()`) | Proactive pre-compaction |
| Timer | 15 min idle | Safety net |

**Context awareness:** qwen3 receives up to 256k tokens of history for understanding, but only extracts gems from the last N turns (avoiding the current context). All gems survive `/new`, `/reset`, and compaction via Qdrant persistence.
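The three-way trigger logic above can be sketched as a single pure decision function. This is a minimal illustration, not the implementation: the names `shouldExtract`, `TriggerState`, and `TriggerReason` are hypothetical; only the thresholds (batch of 5 turns, 50% context usage, 15 min idle) come from the design.

```typescript
// Hypothetical sketch of the hybrid trigger decision. Only the default
// thresholds come from the NeuralStream design; all names are illustrative.

type TriggerReason = "batch" | "context" | "timer" | null;

interface TriggerState {
  turnsBuffered: number;  // turns currently waiting in the Redis buffer
  contextPercent: number; // ctx.getContextUsage().percent, 0-100
  idleMinutes: number;    // minutes since the last buffered turn
}

function shouldExtract(
  s: TriggerState,
  batchSize = 5,
  contextThreshold = 50,
  idleTimeout = 15,
): TriggerReason {
  if (s.turnsBuffered === 0) return null;                      // nothing to extract
  if (s.turnsBuffered >= batchSize) return "batch";            // normal extraction
  if (s.contextPercent >= contextThreshold) return "context";  // pre-compaction
  if (s.idleMinutes >= idleTimeout) return "timer";            // safety net
  return null;
}
```

Checking the triggers in this order makes a full batch win over the context and timer conditions when several fire at once; the actual precedence is an open design choice.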
## Architecture

NeuralStream is the **middle layer**: extraction intelligence on top of existing infrastructure.

```
┌─────────────────────────────────────────┐
│ EXISTING: mem-redis-watcher             │
│ Every turn → Redis buffer               │
└──────────────────┬──────────────────────┘
                   │
        ┌──────────▼──────────┐
        │    NeuralStream     │
        │  - Batch reader     │
        │  - Gem extractor    │
        │  - Qdrant store     │
        └──────────┬──────────┘
                   │
        ┌──────────▼──────────┐
        │      EXISTING:      │
        │    qdrant-memory    │
        │   Semantic search   │
        │  Context injection  │
        └─────────────────────┘
```

## Technical Reference

### Native Context Monitoring

```typescript
// In turn_end hook
const usage = ctx.getContextUsage();
// usage.tokens, usage.contextWindow, usage.percent
// Trigger extraction when usage.percent >= threshold
```

### Primary Hook: turn_end

```typescript
pi.on("turn_end", async (event, ctx) => {
  const { turnIndex, message, toolResults } = event;
  // Buffer turn to Redis
  // Check ctx.getContextUsage().percent
  // If batch >= 5 OR percent >= 50%: extract
});
```

### Timer Fallback

```bash
# Cron every 10 min
# Check neuralstream:buffer age > 15 min
# If yes: extract from partial batch
```

### Context-Aware Extraction

- Feed qwen3: up to 256k tokens (full history for context)
- Extract from: last `batch_size` turns only
- Benefit: rich understanding without gemming the current context

## Gem Format

```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```

## Configuration (All Tunable)

| Setting | Default | Description |
|---------|---------|-------------|
| batch_size | 5 | Turns per extraction |
| context_threshold | 50% | Token % trigger (40-80% range) |
| idle_timeout | 15 min | Timer trigger threshold |
| gem_model | qwen3 | Extraction LLM (256k context) |
| max_gems_injected | 10 | Per-turn limit |
| embedding | snowflake-arctic-embed2 | Same as kimi_memories |
| collection | neuralstream | Qdrant (1024 dims, Cosine) |

## Qdrant Schema

**Collection:** `neuralstream`

- Vector size: 1024
- Distance: Cosine
- On-disk payload: true

## Project Structure

```
.projects/neuralstream/
├── README.md          # This file
├── session.md         # Development log & state
├── prompt.md          # (TBD) qwen3 extraction prompt
└── src/               # (TBD) Implementation
    ├── extract.ts     # Gem extraction logic
    ├── store.ts       # Qdrant storage
    └── inject.ts      # Context injection
```

## Status

- [x] Architecture defined (v2.2, context-aware)
- [x] Native context monitoring validated (`ctx.getContextUsage`)
- [x] Naming finalized (NeuralStream, alias: ns)
- [x] Hook research completed
- [x] Qdrant collection created (`neuralstream`)
- [x] Gem format proposed
- [x] Infrastructure decision (reuse Redis/Qdrant)
- [ ] Extraction prompt design
- [ ] Implementation
- [ ] Testing

## Backups

- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored

## Related Projects

- **True Recall:** Gem extraction inspiration
- **OpenClaw:** Host platform
- **kimi_memories:** Shared Qdrant infrastructure
- **mem-redis-watcher:** Existing capture layer

---

**Created:** 2026-02-23
**Alias:** ns
**Purpose:** Infinite context for LLMs