# NeuralStream

**Neural streaming memory for OpenClaw with gem-based context injection.**

## Overview

NeuralStream extracts high-value insights ("gems") from conversation batches using qwen3, stores them in Qdrant, and injects relevant gems into context on each new turn. This creates **infinite effective context** — the active window stays small, but semantically relevant gems from all past conversations are always retrievable.

## Core Concept

| Traditional Memory | NeuralStream |
|--------------------|--------------|
| Context lost on `/new` | Gems persist in Qdrant |
| Full history or generic summary | Semantic gem retrieval |
| Static context window | Dynamic injection |
| Survives compaction only | Survives session reset |
| **Limited context** | **Infinite effective context** |

## How It Works

### Capture → Extract → Store → Retrieve

1. **Capture:** Every turn buffered to Redis (reuses mem-redis-watcher)
2. **Extract:** Batch of 5 turns → qwen3 (with 256k context) extracts structured gems
3. **Store:** Gems embedded + stored in Qdrant `neuralstream`
4. **Retrieve:** Each new turn → semantic search → inject top-10 gems

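The retrieve step boils down to a top-k similarity search. In the real pipeline Qdrant performs this; the following is only a minimal in-memory sketch of the ranking, assuming gems already carry embedding vectors (`StoredGem`, `topGems`, and `cosineSimilarity` are illustrative names, not an existing API):

```typescript
// Sketch of top-k gem retrieval. Qdrant does this server-side in practice;
// names here are illustrative only.
type StoredGem = { content: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank all gems against the query embedding, keep the top k (default 10).
function topGems(query: number[], gems: StoredGem[], k = 10): StoredGem[] {
  return [...gems]
    .sort((g1, g2) =>
      cosineSimilarity(query, g2.embedding) - cosineSimilarity(query, g1.embedding))
    .slice(0, k);
}
```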
### Hybrid Triggers (Three-way)

| Trigger | Condition | Purpose |
|---------|-----------|---------|
| Batch | Every 5 turns | Normal extraction |
| Context | 50% usage (`ctx.getContextUsage()`) | Proactive pre-compaction |
| Timer | 15 min idle | Safety net |

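The three triggers fold into one decision per turn. A minimal sketch, with illustrative names and the default thresholds from the configuration table:

```typescript
// Hypothetical three-way trigger check; function and field names are
// illustrative, defaults mirror the tunables (batch_size, context_threshold,
// idle_timeout) documented below.
interface TriggerState {
  bufferedTurns: number;   // turns accumulated since last extraction
  contextPercent: number;  // from ctx.getContextUsage().percent
  idleMinutes: number;     // time since the last buffered turn
}

function shouldExtract(
  s: TriggerState,
  batchSize = 5,
  contextThreshold = 50,
  idleTimeout = 15,
): "batch" | "context" | "timer" | null {
  if (s.bufferedTurns >= batchSize) return "batch";
  if (s.contextPercent >= contextThreshold) return "context";
  if (s.bufferedTurns > 0 && s.idleMinutes >= idleTimeout) return "timer";
  return null; // nothing to do this turn
}
```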
**Context Awareness:** qwen3 receives up to 256k tokens of history for understanding, but only extracts gems from the last N turns (avoiding current context).

All gems survive `/new`, `/reset`, and compaction via Qdrant persistence.

## Architecture

NeuralStream is the **middle layer** — extraction intelligence on top of existing infrastructure:

```
┌─────────────────────────────────────────────────────────┐
│  EXISTING: mem-redis-watcher                            │
│  Every turn → Redis buffer                              │
└──────────────────┬──────────────────────────────────────┘
                   │
        ┌──────────▼──────────┐
        │    NeuralStream     │
        │  - Batch reader     │
        │  - Gem extractor    │
        │  - Qdrant store     │
        └──────────┬──────────┘
                   │
        ┌──────────▼──────────┐
        │  EXISTING:          │
        │  qdrant-memory      │
        │  Semantic search    │
        │  Context injection  │
        └─────────────────────┘
```

## Technical Reference

### Native Context Monitoring

```typescript
// In turn_end hook
const usage = ctx.getContextUsage();
// usage.tokens, usage.contextWindow, usage.percent
// Trigger extraction when usage.percent >= threshold
```

### Primary Hook: turn_end

```typescript
pi.on("turn_end", async (event, ctx) => {
  const { turnIndex, message, toolResults } = event;

  // Buffer turn to Redis
  // Check ctx.getContextUsage().percent
  // If batch >= 5 OR percent >= 50%: extract
});
```

### Timer Fallback

```bash
# Cron every 10 min
# Check neuralstream:buffer age > 15 min
# If yes: extract from partial batch
```
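The staleness check itself is just timestamp arithmetic. A minimal sketch, assuming the buffer's last-write time is kept as a Unix timestamp (how it is read from Redis is left out; the key name would be project-specific):

```shell
#!/bin/sh
# Sketch of the cron-side staleness check. Reading the last-write timestamp
# from Redis is omitted; only the age comparison is shown.
IDLE_LIMIT=900   # 15 minutes in seconds, matching idle_timeout

# Succeeds (exit 0) if the buffer is older than IDLE_LIMIT seconds.
buffer_is_stale() {
  last_write="$1"                 # Unix timestamp of last buffered turn
  now=$(date +%s)
  [ $(( now - last_write )) -gt "$IDLE_LIMIT" ]
}

# Example: a buffer last written 20 minutes ago should be flushed
if buffer_is_stale "$(( $(date +%s) - 1200 ))"; then
  echo "flush partial batch"
fi
```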

### Context-Aware Extraction

- Feed qwen3: Up to 256k tokens (full history for context)
- Extract from: Last `batch_size` turns only
- Benefit: Rich understanding without gemming current context

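Splitting the prompt input this way is a simple slice over the turn history. A sketch with illustrative names:

```typescript
// Hypothetical split of the turn history for context-aware extraction:
// the model sees everything (up to its window), but is instructed to
// extract gems only from the trailing batch.
interface Turn { index: number; text: string }

function splitForExtraction(history: Turn[], batchSize: number) {
  const extractFrom = history.slice(-batchSize);  // gem only these turns
  const context = history.slice(0, -batchSize);   // background understanding
  return { context, extractFrom };
}
```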
## Gem Format
```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```
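The payload above maps directly onto a TypeScript type, which the planned `src/` modules could share. This is a sketch derived from the JSON, not a generated schema:

```typescript
// Type-level sketch of the gem format; field names mirror the JSON above.
interface GemSource {
  session_id: string;
  date: string;        // YYYY-MM-DD
  turn_range: string;  // e.g. "15-20"
}

interface Gem {
  gem_id: string;
  content: string;
  summary: string;
  topics: string[];
  importance: number;  // 0.0 - 1.0
  source: GemSource;
  tags: string[];
  created_at: string;  // ISO 8601
}

// Example: parse a stored payload back into a Gem
const gem: Gem = JSON.parse(
  '{"gem_id":"uuid","content":"c","summary":"s","topics":[],' +
  '"importance":0.9,"source":{"session_id":"uuid","date":"2026-02-23",' +
  '"turn_range":"15-20"},"tags":["fact"],"created_at":"2026-02-23T15:26:00Z"}'
);
```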

## Configuration (All Tunable)

| Setting | Default | Description |
|---------|---------|-------------|
| batch_size | 5 | Turns per extraction |
| context_threshold | 50% | Token % trigger (40-80% range) |
| idle_timeout | 15 min | Timer trigger threshold |
| gem_model | qwen3 | Extraction LLM (256k context) |
| max_gems_injected | 10 | Per-turn limit |
| embedding | snowflake-arctic-embed2 | Same as kimi_memories |
| collection | neuralstream | Qdrant (1024 dims, Cosine) |


## Qdrant Schema

**Collection:** `neuralstream`

- Vector size: 1024
- Distance: Cosine
- On-disk payload: true

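For reference, the schema above corresponds to a create-collection request body for Qdrant's REST API (`PUT /collections/neuralstream`). A sketch of just the body, with no client code:

```typescript
// Request body matching the schema above for Qdrant's
// PUT /collections/neuralstream endpoint (body only; sending it is omitted).
const createNeuralstreamBody = {
  vectors: { size: 1024, distance: "Cosine" },
  on_disk_payload: true,
};

console.log(JSON.stringify(createNeuralstreamBody));
```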
## Project Structure
```
.projects/neuralstream/
├── README.md       # This file
├── session.md      # Development log & state
├── prompt.md       # (TBD) qwen3 extraction prompt
└── src/            # (TBD) Implementation
    ├── extract.ts  # Gem extraction logic
    ├── store.ts    # Qdrant storage
    └── inject.ts   # Context injection
```

## Status

- [x] Architecture defined (v2.2 context-aware)
- [x] Native context monitoring validated (`ctx.getContextUsage`)
- [x] Naming finalized (NeuralStream, alias: ns)
- [x] Hook research completed
- [x] Qdrant collection created (`neuralstream`)
- [x] Gem format proposed
- [x] Infrastructure decision (reuse Redis/Qdrant)
- [ ] Extraction prompt design
- [ ] Implementation
- [ ] Testing

## Backups

- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored

## Related Projects

- **True Recall:** Gem extraction inspiration
- **OpenClaw:** Host platform
- **kimi_memories:** Shared Qdrant infrastructure
- **mem-redis-watcher:** Existing capture layer

---

**Created:** 2026-02-23

**Alias:** ns

**Purpose:** Infinite context for LLMs