Initial commit: TrueRecall v2.2 with 30b curator and timer-based curation
session.md.neuralstream.bak
@@ -0,0 +1,150 @@
# NeuralStream Session State

**Date:** 2026-02-23
**Status:** Architecture v2.2 - Context-aware hybrid triggers
**Alias:** ns

---

## Architecture v2.2 (Current)

**Decision:** Three hybrid extraction triggers with full context awareness

| Trigger | When | Purpose |
|---------|------|---------|
| `turn_end` (N=5) | Every 5 turns | Normal batch extraction |
| Timer (15 min idle) | No new turn for 15 min | Catch partial batches |
| Context (50% threshold) | `ctx.getContextUsage().percent >= threshold` | Proactive pre-compaction |

**Context Awareness:**
- qwen3 gets **up to 256k tokens** of full conversation history for understanding
- Only extracts **last N turns** (oldest in batch) to avoid extracting gems from the current context
- Uses the native `ctx.getContextUsage()` API for token monitoring

**Why Hybrid:**
- Batch extraction = better quality gems (more context)
- Timer safety = never lose important turns if user walks away
- Context trigger = proactive extraction before system forces compaction
- All gems survive `/new` and `/reset` via Qdrant

**Infrastructure:** Reuse existing Redis/Qdrant; NeuralStream is the "middle layer" only

---

## Core Insight

NeuralStream enables **infinite effective context**: the active window stays small, but semantically relevant gems from all past conversations are queryable and injectable.

---

## Technical Decisions 2026-02-23

### Triggers (Three-way Hybrid)
| Trigger | Config | Default |
|---------|--------|---------|
| Batch size | `batch_size` | 5 turns |
| Idle timeout | `idle_timeout` | 15 minutes |
| Context threshold | `context_threshold` | 50% |

### Context Monitoring (Native API)
- `ctx.getContextUsage()` → `{tokens, contextWindow, percent}`
- Checked in `turn_end` hook
- Triggers extraction when `percent >= context_threshold`
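
The three triggers and the context check above could be wired together roughly as follows. This is a minimal sketch, not the project's implementation: `TriggerConfig`, `on_turn_end`, and `on_timer_tick` are hypothetical names, the usage dict only mirrors the `{tokens, contextWindow, percent}` shape noted above, and `percent` is assumed to be on a 0-100 scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerConfig:
    """Defaults taken from the trigger table; the field units are assumptions."""
    batch_size: int = 5                # turns per batch
    idle_timeout: float = 15 * 60      # seconds without a new turn
    context_threshold: float = 50.0    # percent of the context window (0-100 assumed)


def on_turn_end(pending_turns, usage, cfg=TriggerConfig()):
    """Return the triggers that fire at the end of a turn.

    `usage` is a dict shaped like the documented result of
    ctx.getContextUsage(): {"tokens", "contextWindow", "percent"}.
    """
    fired = []
    if pending_turns >= cfg.batch_size:
        fired.append("batch")
    if usage["percent"] >= cfg.context_threshold:
        fired.append("context")
    return fired


def on_timer_tick(pending_turns, idle_seconds, cfg=TriggerConfig()):
    """Return True when a partial batch has been idle long enough to flush."""
    return pending_turns > 0 and idle_seconds >= cfg.idle_timeout
```

Keeping the timer check separate from `on_turn_end` reflects the note that context usage is checked in the `turn_end` hook, while the idle case has to be polled because no turn arrives to trigger it.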

### Extraction Context Window
- **Feed to qwen3:** Up to 256k tokens (full history for understanding)
- **Extract from:** Last `batch_size` turns only
- **Benefit:** Rich context awareness without extracting gems from the current conversation
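
One way to read the split between "feed" and "extract from" is sketched below. The function name, the per-turn `tokens` field, and the interpretation of 256k as 256,000 tokens are all assumptions for illustration; the notes do not specify how turns are stored or counted.

```python
def build_extraction_window(turns, batch_size=5, max_context_tokens=256_000):
    """Split history into (context_turns, target_turns).

    context_turns: the most recent turns that fit in the token budget,
                   fed to the model for understanding only.
    target_turns:  the last `batch_size` turns, the only ones gems
                   are extracted from.
    Each turn is assumed to be a dict with a precomputed "tokens" count.
    """
    targets = turns[-batch_size:] if batch_size > 0 else []

    context = []
    used = 0
    # Walk backwards from the newest turn so the budget keeps recent history.
    for turn in reversed(turns):
        if used + turn["tokens"] > max_context_tokens:
            break
        context.append(turn)
        used += turn["tokens"]
    context.reverse()  # restore chronological order
    return context, targets
```

When the budget is smaller than the history, the oldest turns are dropped first, so the target turns remain inside the context that is fed to the model.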

### Storage
- **Buffer:** Redis (`neuralstream:buffer` key)
- **Gems:** Qdrant `neuralstream` collection (1024 dims, Cosine)
- **Existing infra:** Reuse mem-redis-watcher + qdrant-memory
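
For reference, the collection settings above map onto Qdrant's REST create-collection call (`PUT /collections/{name}` with a `vectors` block). The sketch below only builds that request and sends nothing; the helper name and constants are illustrative, and whether the existing qdrant-memory setup creates the collection this way is not stated in these notes.

```python
import json

REDIS_BUFFER_KEY = "neuralstream:buffer"   # buffer key from the notes
QDRANT_COLLECTION = "neuralstream"         # gem collection from the notes


def create_collection_request(name=QDRANT_COLLECTION, dims=1024):
    """Build (method, path, json_body) for creating the gem collection.

    The body follows Qdrant's create-collection format: a vector size
    and a distance metric ("Cosine" here, as decided above).
    """
    body = {"vectors": {"size": dims, "distance": "Cosine"}}
    return "PUT", f"/collections/{name}", json.dumps(body)
```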

### Gem Format (Proposed)
```json
{
  "gem_id": "uuid",
  "content": "Distilled insight/fact/decision",
  "summary": "One-line for quick scanning",
  "topics": ["docker", "redis", "architecture"],
  "importance": 0.9,
  "source": {
    "session_id": "uuid",
    "date": "2026-02-23",
    "turn_range": "15-20"
  },
  "tags": ["decision", "fact", "preference", "todo", "code"],
  "created_at": "2026-02-23T15:26:00Z"
}
```
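
Because the gems come from model output, a small validation step before writing to Qdrant may be useful. The sketch below checks a gem against the proposed format; it assumes that `importance` lies in [0, 1], that the five tags in the example are the full allowed vocabulary, and that `turn_range` is `start-end`. None of these constraints are confirmed by the notes.

```python
import re

# Assumed vocabulary: the five tags shown in the proposed format.
ALLOWED_TAGS = {"decision", "fact", "preference", "todo", "code"}
REQUIRED_FIELDS = (
    "gem_id", "content", "summary", "topics",
    "importance", "source", "tags", "created_at",
)


def validate_gem(gem):
    """Return a list of problems; an empty list means the gem looks valid."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in gem]
    if problems:
        return problems  # skip deeper checks on incomplete gems

    if not 0.0 <= gem["importance"] <= 1.0:
        problems.append("importance outside [0, 1]")

    unknown = set(gem["tags"]) - ALLOWED_TAGS
    if unknown:
        problems.append(f"unknown tags: {sorted(unknown)}")

    match = re.fullmatch(r"(\d+)-(\d+)", gem["source"].get("turn_range", ""))
    if not match or int(match.group(1)) > int(match.group(2)):
        problems.append("turn_range must look like 'start-end' with start <= end")

    return problems
```

Returning a list of problems instead of raising lets the caller log and skip a malformed gem without dropping the rest of the batch.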

### Extraction Model
- **qwen3** for gem extraction (256k context, cheap)
- **Dedicated prompt** (to be designed) for extracting high-value items

---

## Architecture Layers

| Layer | Status | Description |
|-------|--------|-------------|
| Capture | ✅ Existing | Every turn → Redis (mem-redis-watcher) |
| **Extract** | ⏳ NeuralStream | Batch → qwen3 → gems → Qdrant |
| Retrieve | ✅ Existing | Semantic search → inject context |

NeuralStream = Smart extraction layer on top of existing infra.

---

## Open Questions

- Gem extraction prompt design (deferred)
- Importance scoring: auto vs manual?
- Injection: `turn_start` hook or modify system prompt?
- Semantic search threshold tuning

---

## Next Steps

| Task | Status |
|------|--------|
| Architecture v2.2 finalized | ✅ |
| Native context monitoring validated | ✅ |
| Gem JSON schema | ✅ Proposed |
| Implement turn_end hook | ⏳ |
| Implement timer/cron check | ⏳ |
| Implement context trigger | ⏳ |
| Create extraction prompt | ⏳ |
| Test gem extraction with qwen3 | ⏳ |
| Implement injection mechanism | ⏳ |

---

## Decisions Log

| Date | Decision |
|------|----------|
| 2026-02-23 | Switch to turn_end hook (v2) |
| 2026-02-23 | Hybrid triggers with timer (v2.1) |
| 2026-02-23 | Context-aware extraction (v2.2) |
| 2026-02-23 | Native API: ctx.getContextUsage() |
| 2026-02-23 | Full context feed to qwen3 (256k) |
| 2026-02-23 | Reuse existing Redis/Qdrant infrastructure |
| 2026-02-23 | Batch N=5 turns |
| 2026-02-23 | Context threshold = 50% |
| 2026-02-23 | Inactivity timer = 15 min |
| 2026-02-23 | Dedicated qwen3 extraction prompt (deferred) |

---

## Backups

- Local: `/root/.openclaw/workspace/.projects/neuralstream/`
- Remote: `deb2:/root/.projects/neuralstream/` (build/test only)
- kimi_kb: Research entries stored

---

**Key Insight:** Session resets wipe context but NOT Qdrant. NeuralStream = "Context insurance policy" for infinite LLM memory.