Collection: knowledge_base

Metadata Schema:
{
  "subject": "Machine Learning",         // Primary topic/theme
  "subjects": ["AI", "NLP"],            // Related subjects for cross-linking
  "category": "reference",               // reference | code | notes | documentation
  "path": "AI/ML/Transformers",          // Hierarchical location (like filesystem)
  "level": 2,                            // Depth: 0=root, 1=section, 2=chunk
  "parent_id": "abc-123",                // Parent document ID (for chunks/children)
  
  "content_type": "web_page",            // web_page | pdf | code | markdown | note
  "language": "python",                  // For code/docs (optional)
  "project": "llm-research",             // Optional project tag
  
  "checksum": "sha256:abc...",           // For duplicate detection
  "source_url": "https://...",           // Optional provenance link (not the primary organizing key)
  
  "title": "Understanding Transformers", // Display name
  "concepts": ["attention", "bert"],     // Auto-extracted key concepts
  "date_added": "2026-02-05",
  "date_updated": "2026-02-05"
}
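
Sketch: the schema above as a typed Python record with a small validator. This is one possible reading, not a definitive implementation. The names KnowledgeMetadata, REQUIRED_FIELDS, validate_metadata and EXAMPLE are hypothetical, and treating every field not marked optional as required is an assumption.

```python
from typing import List, TypedDict

# Allowed values, copied from the comments in the schema above.
CATEGORIES = {"reference", "code", "notes", "documentation"}
CONTENT_TYPES = {"web_page", "pdf", "code", "markdown", "note"}

# Assumption: every field the schema does not mark as optional is required.
REQUIRED_FIELDS = (
    "subject", "subjects", "category", "path", "level",
    "content_type", "checksum", "title", "concepts",
    "date_added", "date_updated",
)


class KnowledgeMetadata(TypedDict, total=False):
    subject: str
    subjects: List[str]
    category: str
    path: str
    level: int
    parent_id: str      # only for chunks/children
    content_type: str
    language: str       # optional
    project: str        # optional
    checksum: str
    source_url: str     # optional
    title: str
    concepts: List[str]
    date_added: str
    date_updated: str


def validate_metadata(meta: KnowledgeMetadata) -> List[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in meta]
    if "category" in meta and meta["category"] not in CATEGORIES:
        problems.append(f"unknown category: {meta['category']}")
    if "content_type" in meta and meta["content_type"] not in CONTENT_TYPES:
        problems.append(f"unknown content_type: {meta['content_type']}")
    if "level" in meta and meta["level"] not in (0, 1, 2):
        problems.append(f"level out of range: {meta['level']}")
    return problems


# The example record from the schema, placeholders kept as written.
EXAMPLE: KnowledgeMetadata = {
    "subject": "Machine Learning",
    "subjects": ["AI", "NLP"],
    "category": "reference",
    "path": "AI/ML/Transformers",
    "level": 2,
    "parent_id": "abc-123",
    "content_type": "web_page",
    "language": "python",
    "project": "llm-research",
    "checksum": "sha256:abc...",
    "source_url": "https://...",
    "title": "Understanding Transformers",
    "concepts": ["attention", "bert"],
    "date_added": "2026-02-05",
    "date_updated": "2026-02-05",
}
```

Calling validate_metadata(EXAMPLE) returns an empty list. A record with a missing title or an unlisted category returns one message per problem.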

Key Design Decisions:
- Subject-first: Organize by topic, not by where it came from
- Path-based hierarchy: Navigate "AI/ML/Transformers" or "Projects/HomeLab/Docker"
- Separate from memories: knowledge_base and openclaw_memories don't mix
- Duplicate handling: Compare checksums → overwrite if the content changed, skip if identical
- No retention limits: entries are never expired automatically
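
Sketch: the duplicate-handling rule above, using an in-memory dict in place of the real collection. The function names and the return labels are hypothetical. Keying documents by path is an assumption, since these notes do not say which field identifies a document.

```python
import hashlib
from typing import Dict


def compute_checksum(content: str) -> str:
    """Return a checksum in the "sha256:<hex>" form used by the schema."""
    return "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()


def upsert(store: Dict[str, dict], key: str, content: str) -> str:
    """Insert, overwrite, or skip a document based on its checksum.

    `store` is a stand-in for the knowledge_base collection and `key`
    is assumed to be the document path.
    """
    checksum = compute_checksum(content)
    existing = store.get(key)
    if existing is not None and existing["checksum"] == checksum:
        return "skipped"          # same content: leave the entry alone
    outcome = "inserted" if existing is None else "overwritten"
    store[key] = {"checksum": checksum, "content": content}
    return outcome
```

Storing the same content twice under one key returns "inserted" and then "skipped". Storing different content under that key returns "overwritten".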

Use Cases:
- Web scrape → path: "Research/Web/<topic>", subject: extracted topic
- Project docs → path: "Projects/<project-name>/<doc>", project tag
- Code reference → path: "Code/<language>/<topic>", language field
- Personal notes → path: "Notes/<category>/<note>"
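
Sketch: building the paths used in the cases above and checking whether an entry sits under a given branch of the hierarchy. build_path and is_under are hypothetical helper names. Matching on whole path segments is an assumption about how navigation should behave.

```python
def build_path(*segments: str) -> str:
    """Join segments into a path such as "Research/Web/<topic>"."""
    cleaned = [s.strip().strip("/") for s in segments]
    if not cleaned or any(not s for s in cleaned):
        raise ValueError("path segments must be non-empty")
    return "/".join(cleaned)


def is_under(path: str, prefix: str) -> bool:
    """True if `path` equals `prefix` or lies below it in the hierarchy.

    Compares whole segments, so "AI/MLOps" is not under "AI/ML".
    """
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


# Example paths following the use cases above
web_path = build_path("Research", "Web", "transformers")
code_path = build_path("Code", "python", "decorators")
```

Here web_path is "Research/Web/transformers", and is_under("AI/ML/Transformers", "AI/ML") is True.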