# NeuralStream Session State

**Date:** 2026-02-23
**Status:** Architecture v2.2 - Context-aware hybrid triggers
**Alias:** ns

---

## Architecture v2.2 (Current)

**Decision:** Three hybrid extraction triggers with full context awareness

| Trigger | When | Purpose |
|---------|------|---------|
| `turn_end` (N=5) | Every 5 turns | Normal batch extraction |
| Timer (15 min idle) | No new turn for 15 min | Catch partial batches |
| Context (50% threshold) | `ctx.getContextUsage().percent >= threshold` | Proactive pre-compaction |

**Context Awareness:**
- qwen3 gets **up to 256k tokens** of full conversation history for understanding
- Extracts gems only from the **last N turns** (oldest in batch) to avoid gemming the current context
- Uses the native `ctx.getContextUsage()` API for token monitoring

**Why Hybrid:**
- Batch extraction = better quality gems (more context)
- Timer safety = never lose important turns if user walks away
- Context trigger = proactive extraction before system forces compaction
- All gems survive `/new` and `/reset` via Qdrant

**Infrastructure:** Reuse existing Redis/Qdrant; NeuralStream is the "middle layer" only

---

## Core Insight

NeuralStream enables **infinite effective context**: the active window stays small, but semantically relevant gems from all past conversations remain queryable and injectable.

---

## Technical Decisions 2026-02-23

### Triggers (Three-way Hybrid)
| Trigger | Config | Default |
|---------|--------|---------|
| Batch size | `batch_size` | 5 turns |
| Idle timeout | `idle_timeout` | 15 minutes |
| Context threshold | `context_threshold` | 50% |

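
A minimal sketch of how the three triggers might be evaluated together. Only the config keys and defaults come from the table above; `BufferState`, `pickTrigger`, the 0-100 percent scale, and the priority order (context, then batch, then idle) are assumptions for illustration, not settled design.

```typescript
// Config keys and defaults mirror the table; everything else is hypothetical.
interface TriggerConfig {
  batch_size: number;        // turns per batch
  idle_timeout: number;      // minutes without a new turn
  context_threshold: number; // assumed to be on a 0-100 percent scale
}

const DEFAULT_CONFIG: TriggerConfig = {
  batch_size: 5,
  idle_timeout: 15,
  context_threshold: 50,
};

// Hypothetical snapshot of the buffer at the moment of the check.
interface BufferState {
  pendingTurns: number;         // turns buffered since the last extraction
  minutesSinceLastTurn: number; // idle time
  contextPercent: number;       // same scale as context_threshold
}

type TriggerReason = "context" | "batch" | "idle" | null;

// Returns which trigger fires, or null if extraction should wait.
// The priority order is an assumption: context pressure is checked first
// because it is the only trigger with a hard deadline (forced compaction).
function pickTrigger(
  state: BufferState,
  config: TriggerConfig = DEFAULT_CONFIG
): TriggerReason {
  if (state.pendingTurns === 0) return null; // nothing buffered
  if (state.contextPercent >= config.context_threshold) return "context";
  if (state.pendingTurns >= config.batch_size) return "batch";
  if (state.minutesSinceLastTurn >= config.idle_timeout) return "idle";
  return null;
}
```

With the defaults, a buffer holding 2 turns that has been idle for 20 minutes yields `"idle"`, which is the "catch partial batches" case.
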
### Context Monitoring (Native API)
- `ctx.getContextUsage()` → `{tokens, contextWindow, percent}`
- Checked in `turn_end` hook
- Triggers extraction when `percent >= context_threshold`

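
The check above could be wired into a `turn_end` handler roughly as follows. This is a sketch under assumptions: the notes only give the return shape of `ctx.getContextUsage()`, so the `HookContext` interface, the numeric field types, and the `makeTurnEndHook` factory are hypothetical, and the real hook registration API is not shown.

```typescript
// Return shape taken from the notes; numeric field types are assumed.
interface ContextUsage {
  tokens: number;
  contextWindow: number;
  percent: number;
}

// Hypothetical minimal view of the hook context: only the one method we need.
interface HookContext {
  getContextUsage(): ContextUsage;
}

// Builds a turn_end handler that calls onThreshold when usage crosses the
// configured threshold. Returns true when extraction was requested.
function makeTurnEndHook(
  contextThreshold: number,
  onThreshold: (usage: ContextUsage) => void
): (ctx: HookContext) => boolean {
  return function onTurnEnd(ctx: HookContext): boolean {
    const usage = ctx.getContextUsage();
    if (usage.percent >= contextThreshold) {
      onThreshold(usage);
      return true;
    }
    return false;
  };
}
```

Keeping the threshold and the callback outside the handler makes the check easy to exercise with a stub `ctx` before the real extraction path exists.
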
### Extraction Context Window
- **Feed to qwen3:** Up to 256k tokens (full history for understanding)
- **Extract from:** Last `batch_size` turns only
- **Benefit:** Rich context awareness without gemming current conversation

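
One way to read the split between "feed" and "extract from" is sketched below. The `Turn` shape, the per-turn token counts, and the choice to drop the oldest turns first when the history exceeds the budget are assumptions; the notes only fix the 256k budget and the `batch_size` target window.

```typescript
// Hypothetical turn record; the token count is assumed to be precomputed.
interface Turn {
  index: number;
  role: string;
  text: string;
  tokens: number;
}

interface ExtractionInput {
  context: Turn[]; // what the model reads for understanding
  targets: Turn[]; // the only turns gems are extracted from
}

function buildExtractionInput(
  history: Turn[],
  batchSize: number = 5,
  maxContextTokens: number = 256000
): ExtractionInput {
  // Targets: the last batchSize turns (empty when batchSize is not positive).
  const targets = batchSize > 0 ? history.slice(-batchSize) : [];

  // Context: walk from newest to oldest and keep turns while they fit,
  // so the oldest turns are the first to be dropped (an assumption).
  const context: Turn[] = [];
  let used = 0;
  for (const turn of [...history].reverse()) {
    if (used + turn.tokens > maxContextTokens) break;
    context.unshift(turn);
    used += turn.tokens;
  }
  return { context, targets };
}
```

The model would then be prompted with `context` as background and asked to emit gems for `targets` only.
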
### Storage
- **Buffer:** Redis (`neuralstream:buffer` key)
- **Gems:** Qdrant `neuralstream` collection (1024 dims, Cosine)
- **Existing infra:** Reuse mem-redis-watcher + qdrant-memory

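
For reference, the collection settings above correspond to a Qdrant create-collection request like the one built here. The sketch only constructs the request; it does not send it. The helper name and the base URL used in the usage note are assumptions, while the `PUT /collections/{name}` route and the `vectors.size` / `vectors.distance` fields follow Qdrant's REST API.

```typescript
// Redis key from the notes.
const BUFFER_KEY = "neuralstream:buffer";

interface CreateCollectionRequest {
  method: "PUT";
  url: string;
  body: { vectors: { size: number; distance: "Cosine" } };
}

// Hypothetical helper: builds (but does not send) the request that would
// create the gems collection with 1024-dimensional cosine vectors.
function buildCreateCollectionRequest(
  baseUrl: string,
  name: string = "neuralstream",
  size: number = 1024
): CreateCollectionRequest {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate trailing slashes
  return {
    method: "PUT",
    url: root + "/collections/" + encodeURIComponent(name),
    body: { vectors: { size: size, distance: "Cosine" } },
  };
}
```

Assuming a local Qdrant at `http://localhost:6333`, the result can be sent with any HTTP client using `JSON.stringify(request.body)` as the payload and a `Content-Type: application/json` header.
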
### Gem Format (Proposed)
```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```

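
The proposed format could be checked before a gem is written to Qdrant. The validator below is a sketch under assumptions: it treats the five `tags` values in the example as the full allowed set, assumes `importance` is in the range 0 to 1, and assumes `turn_range` is always `start-end`. None of these constraints are stated in the notes.

```typescript
// Assumed allowed tag set, taken from the example values.
const GEM_TAGS: string[] = ["decision", "fact", "preference", "todo", "code"];

// Returns a list of problems; an empty list means the gem looks well-formed.
function validateGem(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["gem must be an object"];
  }
  const gem = value as Record<string, unknown>;
  const problems: string[] = [];

  for (const key of ["gem_id", "content", "summary", "created_at"]) {
    const field = gem[key];
    if (typeof field !== "string" || field === "") {
      problems.push(key + " must be a non-empty string");
    }
  }

  const importance = gem["importance"];
  if (typeof importance !== "number" || importance < 0 || importance > 1) {
    problems.push("importance must be a number in [0, 1]");
  }

  const topics = gem["topics"];
  if (!Array.isArray(topics) || !topics.every((t) => typeof t === "string")) {
    problems.push("topics must be an array of strings");
  }

  const tags = gem["tags"];
  if (!Array.isArray(tags) || !tags.every((t) => GEM_TAGS.indexOf(t) !== -1)) {
    problems.push("tags must come from the known tag set");
  }

  const source = gem["source"];
  if (typeof source !== "object" || source === null) {
    problems.push("source must be an object");
  } else {
    const range = (source as Record<string, unknown>)["turn_range"];
    if (typeof range !== "string" || !/^\d+-\d+$/.test(range)) {
      problems.push("source.turn_range must look like 15-20");
    }
  }
  return problems;
}
```

Returning a list of problems rather than throwing lets the extractor log and skip a malformed gem without aborting the whole batch.
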
### Extraction Model
- **qwen3** for gem extraction (256k context, cheap)
- **Dedicated prompt** (to be designed) for extracting high-value items

---

## Architecture Layers

| Layer | Status | Description |
|-------|--------|-------------|
| Capture | ✅ Existing | Every turn → Redis (mem-redis-watcher) |
| **Extract** | ⏳ NeuralStream | Batch → qwen3 → gems → Qdrant |
| Retrieve | ✅ Existing | Semantic search → inject context |

NeuralStream = Smart extraction layer on top of existing infra.

---

## Open Questions

- Gem extraction prompt design (deferred)
- Importance scoring: auto vs manual?
- Injection: `turn_start` hook or modify system prompt?
- Semantic search threshold tuning

---

## Next Steps

| Task | Status |
|------|--------|
| Architecture v2.2 finalized | ✅ |
| Native context monitoring validated | ✅ |
| Gem JSON schema | ✅ Proposed |
| Implement `turn_end` hook | ⏳ |
| Implement timer/cron check | ⏳ |
| Implement context trigger | ⏳ |
| Create extraction prompt | ⏳ |
| Test gem extraction with qwen3 | ⏳ |
| Implement injection mechanism | ⏳ |

---

## Decisions Log

| Date | Decision |
|------|----------|
| 2026-02-23 | Switch to `turn_end` hook (v2) |
| 2026-02-23 | Hybrid triggers with timer (v2.1) |
| 2026-02-23 | Context-aware extraction (v2.2) |
| 2026-02-23 | Native API: `ctx.getContextUsage()` |
| 2026-02-23 | Full context feed to qwen3 (256k) |
| 2026-02-23 | Reuse existing Redis/Qdrant infrastructure |
| 2026-02-23 | Batch N=5 turns |
| 2026-02-23 | Context threshold = 50% |
| 2026-02-23 | Inactivity timer = 15 min |
| 2026-02-23 | Dedicated qwen3 extraction prompt (deferred) |

---

## Backups

- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored

---

**Key Insight:** Session resets wipe context but NOT Qdrant. NeuralStream = "Context insurance policy" for infinite LLM memory.