# NeuralStream
**Neural streaming memory for OpenClaw with gem-based context injection.**
## Overview
NeuralStream extracts high-value insights ("gems") from conversation batches using qwen3, stores them in Qdrant, and injects relevant gems into context on each new turn. This creates **infinite effective context** — the active window stays small, but semantically relevant gems from all past conversations are always retrievable.
## Core Concept
| Traditional Memory | NeuralStream |
|-------------------|--------------|
| Context lost on `/new` | Gems persist in Qdrant |
| Full history or generic summary | Semantic gem retrieval |
| Static context window | Dynamic injection |
| Survives compaction only | Survives session reset |
| **Limited context** | **Infinite effective context** |
## How It Works
### Capture → Extract → Store → Retrieve
1. **Capture:** Every turn buffered to Redis (reuses mem-redis-watcher)
2. **Extract:** Batch of 5 turns → qwen3 (with 256k context) extracts structured gems
3. **Store:** Gems embedded + stored in the Qdrant `neuralstream` collection
4. **Retrieve:** Each new turn → semantic search → inject top-10 gems
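The four stages can be sketched as below. Everything here is illustrative: an in-memory array stands in for the Redis buffer, and `flushBatch`/`retrieveGems` are hypothetical stand-ins for the qwen3 extraction call and Qdrant semantic search.

```typescript
// Sketch only; names and stores are stand-ins, not the real APIs.
type Turn = { index: number; text: string };
type Gem = { content: string; topics: string[] };

const buffer: Turn[] = [];   // stand-in for the Redis turn buffer
const gemStore: Gem[] = [];  // stand-in for the Qdrant collection

// 1. Capture: every turn lands in the buffer
function captureTurn(turn: Turn): void {
  buffer.push(turn);
}

// 2 + 3. Extract + Store: batch of 5 turns -> distilled gem -> store
function flushBatch(batchSize = 5): void {
  if (buffer.length < batchSize) return;
  const batch = buffer.splice(0, batchSize);
  // Stand-in for the qwen3 structured-gem extraction call
  gemStore.push({ content: batch.map((t) => t.text).join(" | "), topics: [] });
}

// 4. Retrieve: naive substring match stands in for semantic search
function retrieveGems(query: string, limit = 10): Gem[] {
  return gemStore.filter((g) => g.content.includes(query)).slice(0, limit);
}
```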
### Hybrid Triggers (Three-way)
| Trigger | Condition | Purpose |
|---------|-----------|---------|
| Batch | Every 5 turns | Normal extraction |
| Context | 50% usage (`ctx.getContextUsage()`) | Proactive pre-compaction |
| Timer | 15 min idle | Safety net |
**Context Awareness:** qwen3 receives up to 256k tokens of history for understanding, but only extracts gems from the last N turns (avoiding current context).
All gems survive `/new`, `/reset`, and compaction via Qdrant persistence.
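The three-way trigger above reduces to a single decision function. This is a hypothetical sketch (the function and `TriggerState` shape are not part of any existing API); the thresholds mirror the table's defaults.

```typescript
// Hypothetical trigger check; thresholds mirror the defaults in the table.
interface TriggerState {
  bufferedTurns: number;   // turns waiting in the Redis buffer
  contextPercent: number;  // from ctx.getContextUsage().percent
  idleMinutes: number;     // minutes since the buffer was last written
}

function shouldExtract(s: TriggerState): "batch" | "context" | "timer" | null {
  if (s.bufferedTurns >= 5) return "batch";      // normal extraction
  if (s.contextPercent >= 50) return "context";  // proactive pre-compaction
  if (s.idleMinutes >= 15) return "timer";       // safety net
  return null;                                   // keep buffering
}
```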
## Architecture
NeuralStream is the **middle layer** — extraction intelligence on top of existing infrastructure:
```
┌─────────────────────────────────────────┐
│ EXISTING: mem-redis-watcher             │
│ Every turn → Redis buffer               │
└────────────────────┬────────────────────┘
          ┌──────────▼──────────┐
          │ NeuralStream        │
          │ - Batch reader      │
          │ - Gem extractor     │
          │ - Qdrant store      │
          └──────────┬──────────┘
          ┌──────────▼──────────┐
          │ EXISTING:           │
          │ qdrant-memory       │
          │ Semantic search     │
          │ Context injection   │
          └─────────────────────┘
```
## Technical Reference
### Native Context Monitoring
```typescript
// In the turn_end hook: ctx exposes native context monitoring
const usage = ctx.getContextUsage();
// usage.tokens, usage.contextWindow, usage.percent

// Trigger proactive extraction once the window fills past the threshold
if (usage.percent >= CONTEXT_THRESHOLD) {
  await extractGems(); // hypothetical helper; implementation TBD
}
```
### Primary Hook: turn_end
```typescript
pi.on("turn_end", async (event, ctx) => {
  const { turnIndex, message, toolResults } = event;
  // Buffer turn to Redis
  // Check ctx.getContextUsage().percent
  // If batch >= 5 OR percent >= 50%: extract
});
```
### Timer Fallback
```bash
# Cron entry, every 10 min (path is illustrative):
# */10 * * * * /path/to/neuralstream-idle-check.sh
IDLE=$(redis-cli OBJECT IDLETIME neuralstream:buffer)  # secs since last touch
if [ "${IDLE:-0}" -gt 900 ]; then
  node src/extract.js --partial  # hypothetical: extract the partial batch
fi
```
### Context-Aware Extraction
- Feed qwen3: Up to 256k tokens (full history for context)
- Extract from: Last `batch_size` turns only
- Benefit: Rich understanding without gemming current context
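Splitting the prompt input this way can be sketched as follows; `buildExtractionInput` is a hypothetical helper, not existing code. The full history is passed for background understanding, but only the trailing `batchSize` turns are marked as the extraction target.

```typescript
// Sketch: separate background context from the turns to be gemmed.
// With fewer than batchSize turns, everything becomes the target.
function buildExtractionInput(history: string[], batchSize = 5) {
  const target = history.slice(-batchSize);      // turns qwen3 should gem
  const context = history.slice(0, -batchSize);  // background only, not gemmed
  return { context, target };
}
```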
## Gem Format
```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```
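For the TypeScript implementation, the payload above maps naturally onto an interface. Field names and types here are read off the JSON sketch, not a finalized schema:

```typescript
// TypeScript mirror of the gem payload; types assumed from the JSON sketch.
interface GemSource {
  session_id: string;
  date: string;        // YYYY-MM-DD
  turn_range: string;  // e.g. "15-20"
}

interface Gem {
  gem_id: string;
  content: string;     // distilled insight/fact/decision
  summary: string;     // one-liner for quick scanning
  topics: string[];
  importance: number;  // 0..1
  source: GemSource;
  tags: string[];      // e.g. "decision", "fact", "preference", "todo", "code"
  created_at: string;  // ISO 8601
}

const example: Gem = {
  gem_id: "00000000-0000-0000-0000-000000000000",
  content: "Distilled insight/fact/decision",
  summary: "One-line for quick scanning",
  topics: ["docker", "redis", "architecture"],
  importance: 0.9,
  source: { session_id: "uuid", date: "2026-02-23", turn_range: "15-20" },
  tags: ["decision"],
  created_at: "2026-02-23T15:26:00Z",
};
```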
## Configuration (All Tunable)
| Setting | Default | Description |
|---------|---------|-------------|
| batch_size | 5 | Turns per extraction |
| context_threshold | 50% | Token % trigger (40-80% range) |
| idle_timeout | 15 min | Timer trigger threshold |
| gem_model | qwen3 | Extraction LLM (256k context) |
| max_gems_injected | 10 | Per-turn limit |
| embedding | snowflake-arctic-embed2 | Same as kimi_memories |
| collection | neuralstream | Qdrant (1024 dims, Cosine) |
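The table's defaults could be collected into a single config object; the object name and key names below are illustrative, not an existing API.

```typescript
// Hypothetical config object mirroring the defaults table.
const neuralstreamConfig = {
  batchSize: 5,                // turns per extraction
  contextThreshold: 0.5,       // trigger at 50% window usage (tune 0.4-0.8)
  idleTimeoutMin: 15,          // timer trigger threshold
  gemModel: "qwen3",           // 256k-context extraction LLM
  maxGemsInjected: 10,         // per-turn injection limit
  embeddingModel: "snowflake-arctic-embed2",
  collection: "neuralstream",  // Qdrant: 1024 dims, Cosine
};
```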
## Qdrant Schema
**Collection:** `neuralstream`
- Vector size: 1024
- Distance: Cosine
- On-disk payload: true
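A collection with this schema can be created via Qdrant's REST API (`PUT /collections/{name}`); the request body below mirrors the settings above, while the base URL and the fact that no API key is needed are assumptions for a local setup.

```typescript
// Request body for PUT /collections/neuralstream (Qdrant REST API).
const createBody = {
  vectors: { size: 1024, distance: "Cosine" },
  on_disk_payload: true,
};

// Sketch: create the collection on a local Qdrant (URL is an assumption).
async function createCollection(baseUrl = "http://localhost:6333") {
  await fetch(`${baseUrl}/collections/neuralstream`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(createBody),
  });
}
```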
## Project Structure
```
.projects/neuralstream/
├── README.md        # This file
├── session.md       # Development log & state
├── prompt.md        # (TBD) qwen3 extraction prompt
└── src/             # (TBD) Implementation
    ├── extract.ts   # Gem extraction logic
    ├── store.ts     # Qdrant storage
    └── inject.ts    # Context injection
```
## Status
- [x] Architecture defined (v2.2 context-aware)
- [x] Native context monitoring validated (ctx.getContextUsage)
- [x] Naming finalized (NeuralStream, alias: ns)
- [x] Hook research completed
- [x] Qdrant collection created (`neuralstream`)
- [x] Gem format proposed
- [x] Infrastructure decision (reuse Redis/Qdrant)
- [ ] Extraction prompt design
- [ ] Implementation
- [ ] Testing
## Backups
- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored
## Related Projects
- **True Recall:** Gem extraction inspiration
- **OpenClaw:** Host platform
- **kimi_memories:** Shared Qdrant infrastructure
- **mem-redis-watcher:** Existing capture layer
---
**Created:** 2026-02-23
**Alias:** ns
**Purpose:** Infinite context for LLMs