# NeuralStream Session State

**Date:** 2026-02-23
**Status:** Architecture v2.2 - Context-aware hybrid triggers
**Alias:** ns

---

## Architecture v2.2 (Current)

**Decision:** Three hybrid extraction triggers with full context awareness

| Trigger | When | Purpose |
|---------|------|---------|
| `turn_end` (N=5) | Every 5 turns | Normal batch extraction |
| Timer (15 min idle) | No new turn for 15 min | Catch partial batches |
| Context (50% threshold) | `ctx.getContextUsage().percent >= threshold` | Proactive pre-compaction |

**Context Awareness:**
- qwen3 gets **up to 256k tokens** of full conversation history for understanding
- Only extracts **last N turns** (oldest in batch) to avoid gemming current context
- Uses `ctx.getContextUsage()` native API for token monitoring

**Why Hybrid:**
- Batch extraction = better quality gems (more context)
- Timer safety = never lose important turns if user walks away
- Context trigger = proactive extraction before system forces compaction
- All gems survive `/new` and `/reset` via Qdrant

**Infrastructure:** Reuse existing Redis/Qdrant; NeuralStream is the "middle layer" only

---

## Core Insight

NeuralStream enables **infinite effective context**: the active window stays small, but semantically relevant gems from all past conversations are queryable and injectable.
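
The three triggers from Architecture v2.2 could be combined into a single decision function. The sketch below is only an illustration under stated assumptions: `checkTrigger`, `TriggerConfig`, and `TriggerReason` are hypothetical names, `percent` is assumed to be on a 0-100 scale, and the priority order (context, then batch, then idle) is a guess rather than a settled decision. Only the `{tokens, contextWindow, percent}` shape comes from these notes.

```typescript
// Shape reported by ctx.getContextUsage() according to these notes.
interface ContextUsage {
  tokens: number;
  contextWindow: number;
  percent: number; // ASSUMPTION: 0-100 scale, not 0-1
}

// Hypothetical config object mirroring the three documented defaults.
interface TriggerConfig {
  batchSize: number; // turns per batch
  idleTimeoutMs: number; // idle time before flushing a partial batch
  contextThreshold: number; // percent of the context window
}

type TriggerReason = "context" | "batch" | "idle" | null;

const DEFAULT_CONFIG: TriggerConfig = {
  batchSize: 5,
  idleTimeoutMs: 15 * 60 * 1000,
  contextThreshold: 50,
};

// Pure decision function: returns which trigger fired, or null if none did.
// ASSUMPTION: the context trigger takes priority because it guards against
// forced compaction; the notes do not specify an ordering.
function checkTrigger(
  bufferedTurns: number,
  lastTurnAtMs: number,
  nowMs: number,
  usage: ContextUsage,
  config: TriggerConfig = DEFAULT_CONFIG,
): TriggerReason {
  if (bufferedTurns === 0) return null; // nothing buffered, nothing to extract
  if (usage.percent >= config.contextThreshold) return "context";
  if (bufferedTurns >= config.batchSize) return "batch";
  if (nowMs - lastTurnAtMs >= config.idleTimeoutMs) return "idle";
  return null;
}
```

In practice the context and batch checks would presumably run inside the `turn_end` hook, while the idle check would run from a separate timer or cron job. Keeping the decision in one pure function makes all three paths easy to unit test without Redis or Qdrant.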

---

## Technical Decisions 2026-02-23

### Triggers (Three-way Hybrid)

| Trigger | Config | Default |
|---------|--------|---------|
| Batch size | `batch_size` | 5 turns |
| Idle timeout | `idle_timeout` | 15 minutes |
| Context threshold | `context_threshold` | 50% |

### Context Monitoring (Native API)

- `ctx.getContextUsage()` → `{tokens, contextWindow, percent}`
- Checked in `turn_end` hook
- Triggers extraction when `percent >= context_threshold`

### Extraction Context Window

- **Feed to qwen3:** Up to 256k tokens (full history for understanding)
- **Extract from:** Last `batch_size` turns only
- **Benefit:** Rich context awareness without gemming current conversation

### Storage

- **Buffer:** Redis (`neuralstream:buffer` key)
- **Gems:** Qdrant `neuralstream` collection (1024 dims, Cosine)
- **Existing infra:** Reuse mem-redis-watcher + qdrant-memory

### Gem Format (Proposed)

```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```

### Extraction Model

- **qwen3** for gem extraction (256k context, cheap)
- **Dedicated prompt** (to be designed) for extracting high-value items

---

## Architecture Layers

| Layer | Status | Description |
|-------|--------|-------------|
| Capture | ✅ Existing | Every turn → Redis (mem-redis-watcher) |
| **Extract** | ⏳ NeuralStream | Batch → qwen3 → gems → Qdrant |
| Retrieve | ✅ Existing | Semantic search → inject context |

NeuralStream = Smart extraction layer on top of existing infra.

---

## Open Questions

- Gem extraction prompt design (deferred)
- Importance scoring: auto vs manual?
- Injection: `turn_start` hook or modify system prompt?
- Semantic search threshold tuning

---

## Next Steps

| Task | Status |
|------|--------|
| Architecture v2.2 finalized | ✅ |
| Native context monitoring validated | ✅ |
| Gem JSON schema | ✅ Proposed |
| Implement turn_end hook | ⏳ |
| Implement timer/cron check | ⏳ |
| Implement context trigger | ⏳ |
| Create extraction prompt | ⏳ |
| Test gem extraction with qwen3 | ⏳ |
| Implement injection mechanism | ⏳ |

---

## Decisions Log

| Date | Decision |
|------|----------|
| 2026-02-23 | Switch to turn_end hook (v2) |
| 2026-02-23 | Hybrid triggers with timer (v2.1) |
| 2026-02-23 | Context-aware extraction (v2.2) |
| 2026-02-23 | Native API: ctx.getContextUsage() |
| 2026-02-23 | Full context feed to qwen3 (256k) |
| 2026-02-23 | Reuse existing Redis/Qdrant infrastructure |
| 2026-02-23 | Batch N=5 turns |
| 2026-02-23 | Context threshold = 50% |
| 2026-02-23 | Inactivity timer = 15 min |
| 2026-02-23 | Dedicated qwen3 extraction prompt (deferred) |

---

## Backups

- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored

---

**Key Insight:** Session resets wipe context but NOT Qdrant. NeuralStream = "Context insurance policy" for infinite LLM memory.
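
Since the gem JSON schema is still only proposed, a runtime check on whatever qwen3 returns may be useful before anything is written to Qdrant. The following is a minimal sketch, not the project's actual validator: the field names mirror the proposed gem format, while `isGem`, `GEM_TAGS`, the 0-1 range for `importance`, and the reading of the example `tags` array as the full set of allowed values are all assumptions.

```typescript
// ASSUMPTION: the tags in the proposed example are the complete allowed set.
const GEM_TAGS = ["decision", "fact", "preference", "todo", "code"] as const;
type GemTag = (typeof GEM_TAGS)[number];

// Field names taken from the proposed gem JSON.
interface Gem {
  gem_id: string;
  content: string;
  summary: string;
  topics: string[];
  importance: number; // ASSUMPTION: normalized to the range 0..1
  source: { session_id: string; date: string; turn_range: string };
  tags: GemTag[];
  created_at: string; // ISO 8601 timestamp
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Hypothetical type guard: true only if `value` matches the proposed shape.
function isGem(value: unknown): value is Gem {
  if (typeof value !== "object" || value === null) return false;
  const g = value as Record<string, unknown>;
  const { importance, topics, tags, source, created_at } = g;

  if (typeof g.gem_id !== "string") return false;
  if (typeof g.content !== "string") return false;
  if (typeof g.summary !== "string") return false;
  if (!isStringArray(topics)) return false;
  if (typeof importance !== "number" || importance < 0 || importance > 1) return false;

  if (typeof source !== "object" || source === null) return false;
  const src = source as Record<string, unknown>;
  if (typeof src.session_id !== "string") return false;
  if (typeof src.date !== "string") return false;
  if (typeof src.turn_range !== "string") return false;

  if (!isStringArray(tags)) return false;
  const allowed: readonly string[] = GEM_TAGS;
  if (!tags.every((t) => allowed.indexOf(t) !== -1)) return false;

  // created_at must be a string that Date.parse can interpret.
  return typeof created_at === "string" && !isNaN(Date.parse(created_at));
}
```

A guard like this would let the extraction step drop or log malformed model output instead of storing it, which matters more once the importance-scoring question above is settled and the range is fixed.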