# NeuralStream Session State
**Date:** 2026-02-23
**Status:** Architecture v2.2 - Context-aware hybrid triggers
**Alias:** ns

---
## Architecture v2.2 (Current)
**Decision:** Three hybrid extraction triggers with full context awareness
| Trigger | When | Purpose |
|---------|------|---------|
| `turn_end` (N=5) | Every 5 turns | Normal batch extraction |
| Timer (15 min idle) | No new turn for 15 min | Catch partial batches |
| Context (50% threshold) | `ctx.getContextUsage().percent >= threshold` | Proactive pre-compaction |

**Context Awareness:**
- qwen3 gets **up to 256k tokens** of full conversation history for understanding
- Extracts gems only from the **last N turns** (the pending batch); the rest of the history is context only and is not gemmed
- Uses `ctx.getContextUsage()` native API for token monitoring
**Why Hybrid:**
- Batch extraction = better quality gems (more context)
- Timer safety = never lose important turns if user walks away
- Context trigger = proactive extraction before system forces compaction
- All gems survive `/new` and `/reset` via Qdrant
**Infrastructure:** Reuse existing Redis/Qdrant; NeuralStream is the "middle layer" only

---
## Core Insight
NeuralStream enables **infinite effective context**: the active window stays small, but semantically relevant gems from all past conversations remain queryable and injectable.

---
## Technical Decisions 2026-02-23
### Triggers (Three-way Hybrid)
| Trigger | Config | Default |
|---------|--------|---------|
| Batch size | `batch_size` | 5 turns |
| Idle timeout | `idle_timeout` | 15 minutes |
| Context threshold | `context_threshold` | 50% |
### Context Monitoring (Native API)
- `ctx.getContextUsage()` → `{tokens, contextWindow, percent}`
- Checked in `turn_end` hook
- Triggers extraction when `percent >= context_threshold`
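
The three triggers and the context check above could be combined along these lines. This is a minimal sketch, not the actual hook: `TriggerConfig` and `pick_trigger` are hypothetical names, the priority order (context, then batch, then timer) is an assumption, and `percent` is assumed to be on a 0-100 scale to match the 50% default. In practice the value would come from `ctx.getContextUsage().percent` inside the `turn_end` hook.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriggerConfig:
    """Hypothetical config holder mirroring the defaults in the table."""
    batch_size: int = 5               # turns per normal batch
    idle_timeout: float = 15 * 60     # seconds without a new turn
    context_threshold: float = 50.0   # percent of context window (assumed 0-100)


def pick_trigger(
    pending_turns: int,
    seconds_since_last_turn: float,
    context_percent: float,
    cfg: TriggerConfig = TriggerConfig(),
) -> Optional[str]:
    """Return the name of the trigger that fires, or None if none does."""
    if pending_turns == 0:
        return None  # nothing buffered, so there is nothing to extract
    if context_percent >= cfg.context_threshold:
        return "context"   # proactive extraction before forced compaction
    if pending_turns >= cfg.batch_size:
        return "turn_end"  # normal batch extraction
    if seconds_since_last_turn >= cfg.idle_timeout:
        return "timer"     # catch a partial batch after inactivity
    return None
```

For example, `pick_trigger(2, 16 * 60, 10.0)` selects the timer trigger, because a partial batch of two turns has been idle for longer than 15 minutes.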
### Extraction Context Window
- **Feed to qwen3:** Up to 256k tokens (full history for understanding)
- **Extract from:** Last `batch_size` turns only
- **Benefit:** Rich context awareness without gemming current conversation
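
One way to picture the split between what the model reads and what it extracts from is sketched below. The function names are hypothetical, and `estimate_tokens` is a crude placeholder (about 4 characters per token) rather than the real qwen3 tokenizer. Older history is trimmed first so the most recent context survives when the 256k budget is tight.

```python
from typing import Callable, List, Tuple


def estimate_tokens(text: str) -> int:
    """Crude placeholder: roughly 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def split_for_extraction(
    turns: List[str],
    batch_size: int = 5,
    max_context_tokens: int = 256_000,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> Tuple[List[str], List[str]]:
    """Return (context_turns, target_turns).

    target_turns are the last `batch_size` turns, the only ones gems are
    extracted from. context_turns are earlier turns fed for understanding,
    trimmed from the oldest side to fit the remaining token budget.
    """
    targets = turns[-batch_size:] if batch_size > 0 else []
    earlier = turns[: len(turns) - len(targets)]

    budget = max_context_tokens - sum(count_tokens(t) for t in targets)
    context: List[str] = []
    # Walk backwards so the most recent history is kept when trimming.
    for turn in reversed(earlier):
        cost = count_tokens(turn)
        if cost > budget:
            break
        context.append(turn)
        budget -= cost
    context.reverse()  # restore chronological order
    return context, targets
```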
### Storage
- **Buffer:** Redis (`neuralstream:buffer` key)
- **Gems:** Qdrant `neuralstream` collection (1024 dims, Cosine)
- **Existing infra:** Reuse mem-redis-watcher + qdrant-memory
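
To make the "1024 dims, Cosine" setting concrete, here is a small in-memory stand-in for how cosine similarity ranks stored gems against a query vector. It only illustrates the metric and does not use the Qdrant client; `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

VECTOR_SIZE = 1024  # dimensionality from the collection settings above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    gems: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank (gem_id, vector) pairs by similarity to the query, best first."""
    scored = [(gem_id, cosine_similarity(query, vec)) for gem_id, vec in gems]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```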
### Gem Format (Proposed)
```json
{
"gem_id": "uuid",
"content": "Distilled insight/fact/decision",
"summary": "One-line for quick scanning",
"topics": ["docker", "redis", "architecture"],
"importance": 0.9,
"source": {
"session_id": "uuid",
"date": "2026-02-23",
"turn_range": "15-20"
},
"tags": ["decision", "fact", "preference", "todo", "code"],
"created_at": "2026-02-23T15:26:00Z"
}
```
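
A typed view of the proposed format can help catch malformed gems before they reach Qdrant. The sketch below is illustrative only: the class and function names are hypothetical, and two rules are assumptions rather than part of the proposal, namely that `importance` lies in [0, 1] and that the five tags in the example form a closed vocabulary.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

# Assumption: the tags listed in the example are the full vocabulary.
ALLOWED_TAGS = {"decision", "fact", "preference", "todo", "code"}


@dataclass
class GemSource:
    session_id: str
    date: str
    turn_range: str


@dataclass
class Gem:
    gem_id: str
    content: str
    summary: str
    topics: List[str]
    importance: float
    source: GemSource
    tags: List[str]
    created_at: str

    def __post_init__(self) -> None:
        # Assumption: importance is a score between 0 and 1 inclusive.
        if not 0.0 <= self.importance <= 1.0:
            raise ValueError("importance must be within [0, 1]")
        unknown = set(self.tags) - ALLOWED_TAGS
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")


def gem_from_json(raw: str) -> Gem:
    """Parse a JSON string in the proposed format into a validated Gem."""
    data = json.loads(raw)
    data["source"] = GemSource(**data["source"])
    return Gem(**data)


def gem_to_json(gem: Gem) -> str:
    """Serialize a Gem back to the proposed JSON layout."""
    return json.dumps(asdict(gem))
```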
### Extraction Model
- **qwen3** for gem extraction (256k context, cheap)
- **Dedicated prompt** (to be designed) for extracting high-value items
---
## Architecture Layers
| Layer | Status | Description |
|-------|--------|-------------|
| Capture | ✅ Existing | Every turn → Redis (mem-redis-watcher) |
| **Extract** | ⏳ NeuralStream | Batch → qwen3 → gems → Qdrant |
| Retrieve | ✅ Existing | Semantic search → inject context |

NeuralStream = Smart extraction layer on top of existing infra.

---
## Open Questions
- Gem extraction prompt design (deferred)
- Importance scoring: auto vs manual?
- Injection: `turn_start` hook or modify system prompt?
- Semantic search threshold tuning
---
## Next Steps
| Task | Status |
|------|--------|
| Architecture v2.2 finalized | ✅ |
| Native context monitoring validated | ✅ |
| Gem JSON schema | ✅ Proposed |
| Implement turn_end hook | ⏳ |
| Implement timer/cron check | ⏳ |
| Implement context trigger | ⏳ |
| Create extraction prompt | ⏳ |
| Test gem extraction with qwen3 | ⏳ |
| Implement injection mechanism | ⏳ |
---
## Decisions Log
| Date | Decision |
|------|----------|
| 2026-02-23 | Switch to turn_end hook (v2) |
| 2026-02-23 | Hybrid triggers with timer (v2.1) |
| 2026-02-23 | Context-aware extraction (v2.2) |
| 2026-02-23 | Native API: ctx.getContextUsage() |
| 2026-02-23 | Full context feed to qwen3 (256k) |
| 2026-02-23 | Reuse existing Redis/Qdrant infrastructure |
| 2026-02-23 | Batch N=5 turns |
| 2026-02-23 | Context threshold = 50% |
| 2026-02-23 | Inactivity timer = 15 min |
| 2026-02-23 | Dedicated qwen3 extraction prompt (deferred) |
---
## Backups
- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored
---
**Key Insight:** Session resets wipe context but NOT Qdrant. NeuralStream = "Context insurance policy" for infinite LLM memory.