Collection: knowledge_base

Metadata Schema:
{
  "subject": "Machine Learning",          // Primary topic/theme
  "subjects": ["AI", "NLP"],              // Related subjects for cross-linking
  "category": "reference",                // reference | code | notes | documentation
  "path": "AI/ML/Transformers",           // Hierarchical location (like filesystem)
  "level": 2,                             // Depth: 0=root, 1=section, 2=chunk
  "parent_id": "abc-123",                 // Parent document ID (for chunks/children)
  "content_type": "web_page",             // web_page | pdf | code | markdown | note
  "language": "python",                   // For code/docs (optional)
  "project": "llm-research",              // Optional project tag
  "checksum": "sha256:abc...",            // For duplicate detection
  "source_url": "https://...",            // Optional reference (not primary org)
  "title": "Understanding Transformers",  // Display name
  "concepts": ["attention", "bert"],      // Auto-extracted key concepts
  "date_added": "2026-02-05",
  "date_updated": "2026-02-05"
}

Key Design Decisions:
- Subject-first: Organize by topic, not by where it came from
- Path-based hierarchy: Navigate "AI/ML/Transformers" or "Projects/HomeLab/Docker"
- Separate from memories: knowledge_base and openclaw_memories don't mix
- Duplicate handling: Checksum comparison → overwrite if changed, skip if same
- No retention limits

Use Cases:
- Web scrape → path: "Research/Web/", subject: extracted topic
- Project docs → path: "Projects//", project tag
- Code reference → path: "Code//", language field
- Personal notes → path: "Notes//"
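
Sketch: duplicate handling and path navigation

The duplicate-handling rule (overwrite if the checksum changed, skip if it is the same) and the path-based hierarchy could look roughly like the Python below. This is a minimal sketch under assumptions, not the actual implementation: a plain dict stands in for the knowledge_base collection, and doc_id, upsert, under_path and checksum are hypothetical names introduced only for illustration. The notes do not say how a document's identity is determined, so the caller supplies doc_id here.

```python
import hashlib
from datetime import date


def checksum(content: str) -> str:
    """Return a checksum in the 'sha256:<hex>' form shown in the schema."""
    return "sha256:" + hashlib.sha256(content.encode("utf-8")).hexdigest()


def upsert(store: dict, doc_id: str, content: str, metadata: dict) -> str:
    """Insert, overwrite, or skip a document based on its checksum.

    `store` is a plain dict standing in for the knowledge_base collection;
    `doc_id` is a hypothetical stable identifier chosen by the caller.
    """
    new_sum = checksum(content)
    today = date.today().isoformat()
    existing = store.get(doc_id)

    if existing is not None and existing["metadata"]["checksum"] == new_sum:
        return "skipped"  # same content: leave the stored entry untouched

    entry_meta = dict(metadata, checksum=new_sum, date_updated=today)
    # Keep the original date_added when overwriting a changed document.
    entry_meta["date_added"] = existing["metadata"]["date_added"] if existing else today
    store[doc_id] = {"content": content, "metadata": entry_meta}
    return "overwritten" if existing else "inserted"


def under_path(store: dict, prefix: str) -> list:
    """Return sorted ids of documents whose path is `prefix` or lies beneath it."""
    prefix = prefix.rstrip("/")
    return sorted(
        doc_id
        for doc_id, doc in store.items()
        if doc["metadata"]["path"].rstrip("/") == prefix
        or doc["metadata"]["path"].startswith(prefix + "/")
    )


# Example usage with an in-memory store (assumed, for illustration only).
store = {}
meta = {"subject": "Machine Learning", "path": "AI/ML/Transformers"}
print(upsert(store, "doc-1", "first draft", meta))   # inserted
print(upsert(store, "doc-1", "first draft", meta))   # skipped
print(upsert(store, "doc-1", "second draft", meta))  # overwritten
print(under_path(store, "AI/ML"))                    # ['doc-1']
```

Matching on prefix + "/" rather than on the bare prefix keeps sibling paths apart, so a query for "AI/ML" would not pick up a document stored under "AI/MLOps".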