Expose a navigation layer to the agent and harden RAG, after analyzing the OKF note and testing on the real KB.

- memory_search.search(): dedupe best-chunk-per-file (a relevant note can no longer be buried by another file's chunks) + keyword fallback tagged degraded:True when Ollama is unreachable (no more hard crash).
- update_notes_index.py: emit per-folder index.md + root router; prune empty folders; fix latent subcategory->project bug.
- Exclude generated index.md from RAG rglob (reindex/incremental) + indexer scans + heartbeat freshness check (prevents self-pollution / reindex thrash).
- CLAUDE.md: reframe memory as hybrid (navigation first, RAG for fuzzy recall).
- Delete stale orphan kb/youtube/index.json; correct the OKF source note.
- Tests: dedup, keyword fallback, index.md exclusion.

Plan + review in docs/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

# Plan: Navigation layer for the agent's memory (OKF-inspired)

Source: analysis of the note `memory/kb/youtube/2026-06-27_google-open-knowledge-format.md`
plus an empirical test on the real KB (151 youtube notes, 581 notes total).

## Context / problem

The Echo agent searches its memory ONLY through RAG (`src/memory_search.py`: Ollama
`all-minilm` 384-dim embeddings + a cosine scan in SQLite). CLAUDE.md declares it the
"single source of truth". Empirical test: RAG misses the relevant note when the query is
a conceptual paraphrase (e.g. the Romanian query "cum organizez un KB pt agenți să
folosească mai puțini tokens", i.e. "how do I organize a KB so that agents use fewer
tokens" → the OKF note does not appear in the top-8). `memory/kb/index.json` exists
(581 notes, regenerated today) but is consumed ONLY by the web dashboard (`notes-data/`
paths), not by the agent, and weighs 84k tokens. There is also a stale orphan,
`kb/youtube/index.json` (8/151 notes, 5 months old).

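For orientation, the current retrieval path (embed the query, then scan every stored vector) can be sketched as below. This is a minimal illustration, not the code in `src/memory_search.py`: the `chunks(file, embedding)` table, the float32 BLOB encoding, and the helper names `pack_vec` and `cosine_scan` are all assumptions.

```python
import math
import sqlite3
import struct


def pack_vec(vec):
    """Serialize a vector as little-endian float32 bytes (assumed storage format)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cosine_scan(conn, query_vec, top_k=8):
    """Linear scan over a hypothetical chunks(file, embedding) table."""
    scored = []
    for file, blob in conn.execute("SELECT file, embedding FROM chunks"):
        vec = struct.unpack(f"<{len(blob) // 4}f", blob)
        scored.append((cosine(query_vec, vec), file))
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_k]
```

Nothing in this path looks at titles, tags, or folder structure. A note is found only if its chunk vectors land near the query vector, which is why a conceptual paraphrase can miss.
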
## Objective

Give the agent a cheap, robust navigation layer that complements RAG (does not
replace it), catches the paraphrases that embeddings miss, and works as a fallback
when the remote Ollama goes down.

## Recommendations (proposed scope)

### R1: Delete the orphan `kb/youtube/index.json`
Stale since Jan 30 (8/151 notes). This is the "a stale index is worse than no index" trap.
Effort: trivial.

### R2: Auto-generate a slim per-folder `index.md`
Extend `tools/update_notes_index.py` to emit, alongside `index.json`, an
`index.md` per kb/ subfolder (title + one-line description + tags). Proven by a pilot:
the youtube/ index.md is 11k tokens vs 259k (reading everything, 24×) vs 84k (global
index, 7.7×). Pitfall: the script scans `*.md` recursively, so it must explicitly
exclude `index.md` to avoid treating it as a note (which would pollute index.json).
Regenerated from heartbeat.py on every new note.

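A minimal sketch of what R2 could look like, assuming notes carry a `# Title` heading and a `**Tags:**` line. The function names (`iter_notes`, `index_line`, `write_folder_index`) and the exact entry layout are hypothetical, not taken from `tools/update_notes_index.py`.

```python
from pathlib import Path

INDEX_NAME = "index.md"  # generated file; must never be treated as a note


def iter_notes(folder: Path):
    """Yield note files under `folder`, skipping any generated index."""
    for path in sorted(folder.rglob("*.md")):
        if path.name != INDEX_NAME:
            yield path


def index_line(path: Path, folder: Path) -> str:
    """Build one slim entry: title, one-line description, tags."""
    title, desc, tags = None, "", ""
    for line in path.read_text(encoding="utf-8").splitlines():
        text = line.strip()
        if text.startswith("# ") and title is None:
            title = text[2:].strip()
        elif text.startswith("**Tags:**"):
            tags = text[len("**Tags:**"):].strip()
        elif text and not text.startswith(("#", "**")) and not desc:
            desc = text  # first plain paragraph line doubles as the description
    rel = path.relative_to(folder).as_posix()
    return f"- [{title or path.stem}]({rel}): {desc} [{tags}]"


def write_folder_index(folder: Path) -> Path:
    """Write (or overwrite) the folder's index.md and return its path."""
    lines = [f"# Index: {folder.name}", ""]
    lines += [index_line(p, folder) for p in iter_notes(folder)]
    out = folder / INDEX_NAME
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

Running `write_folder_index` twice on the same folder yields the same file, because `iter_notes` filters out the index written by the first run. That is the exclusion the pitfall above calls for.
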
### R3: Expose navigation to the agent (hybrid with RAG)
In `memory_search`, first load the target folder's slim index.md alongside the
RAG top-k, and combine the two. This catches both the paraphrase and the keyword.
Add an instruction to CLAUDE.md on how to use the index.

### R4: Treat the remote Ollama as a SPOF
RAG depends on a remote host (`10.0.20.161:11434`). If it goes down, `search()` raises
ConnectionError → the agent's memory disappears. The per-folder index.md is a fallback
that does not need Ollama. Add graceful degradation to memory_search.search().

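One way the graceful degradation in R4 might look is sketched below. It assumes the embedding path is passed in as a callable (`semantic_search`, hypothetical) and that results are plain dicts; the real `memory_search.search()` signature may differ. The fallback here is a naive term-count match, chosen only to keep the sketch self-contained.

```python
from pathlib import Path


def keyword_search(root: Path, query: str, top_k: int = 8) -> list[dict]:
    """Rank notes by how many query terms they contain (no embeddings needed)."""
    terms = query.lower().split()
    hits = []
    for path in sorted(root.rglob("*.md")):
        if path.name == "index.md":  # generated index is not a note
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        score = sum(1 for t in terms if t in text)
        if score:
            # Every fallback hit is tagged so callers can tell the user.
            hits.append({"path": str(path), "score": score, "degraded": True})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]


def search(query, root, semantic_search, top_k=8):
    """Try the embedding path first; degrade to keywords if the host is down."""
    try:
        return semantic_search(query, top_k)
    except OSError:
        # OSError covers the builtin ConnectionError; which exception the real
        # HTTP client raises is an assumption here.
        return keyword_search(root, query, top_k)
```

The `degraded` flag makes the outage visible in the result itself, so a caller can warn the user rather than present shallow matches as a normal answer.
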
### R5: Do NOT do a big-bang conversion to YAML front matter
Only 6/586 notes have YAML; update_notes_index.py already extracts metadata from
the `**Tags:**`/`**Data:**` convention. Standardize only from now on, in the
new-note template.

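To illustrate why the existing convention is enough, the sketch below reads the `**Tags:**` and `**Data:**` lines with one regular expression. It assumes comma-separated tags and one field per line; `extract_metadata` is a hypothetical name, not the extractor in `update_notes_index.py`.

```python
import re

# One bold-label field per line, e.g. "**Tags:** kb, agents".
# [ \t]* (not \s*) keeps an empty field from swallowing the next line.
FIELD_RE = re.compile(r"^\*\*(Tags|Data):\*\*[ \t]*(.*)$", re.MULTILINE)


def extract_metadata(text: str) -> dict:
    """Return {'tags': [...], 'date': str | None} from a note's body."""
    meta = {"tags": [], "date": None}
    for key, value in FIELD_RE.findall(text):
        if key == "Tags":
            # Assumption: tags are comma-separated, optionally prefixed with '#'.
            meta["tags"] = [t.strip().lstrip("#") for t in value.split(",") if t.strip()]
        else:
            meta["date"] = value.strip() or None
    return meta
```

For a note containing `**Data:** 2026-06-27` and `**Tags:** kb, agents`, this returns the date string and a two-element tag list, with no YAML parser involved.
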
### R6: Correct the OKF note
Mark "Google launched OKF" as unverified (a single YouTube source; easily confused
with the Open Knowledge Foundation). Update the "Relevanță" (Relevance) section: what
is missing is not indexes but a navigable index EXPOSED to the agent.

## NOT in scope
- HTML graph visualization of the KB (deprioritized: high effort, low value).
- Replacing RAG with pure navigation (hybrid, not substitution).
- ANN/vector-extension migration for RAG speed (separate).

---
<!-- /autoplan review report -->
# GSTACK REVIEW REPORT (/autoplan)

Voices: Claude subagent only; **codex missing** on this host (all phases `[subagent-only]`).
Phases run: CEO, Eng, DX. Design **skipped** (no UI scope; HTML viz is out of scope).

## Cross-phase themes (flagged independently in 2-3 phases = high confidence)

| Theme | Phases | Severity |
|---|---|---|
| **T1: R3 routing is undefined.** "Load the target folder's index.md" requires already knowing the folder, and that IS the navigation problem. The 11k figure holds only for youtube alone; loading all 13 folders ≈ 43-84k, erasing the win. | CEO, Eng, DX | CRITICAL |
| **T2: Wrong consumer.** The autonomous agent (Claude CLI in heartbeat.py) has filesystem access and never calls `search()`. Wiring R3 into `memory_search.search()` only changes the human `/search` command, not the agent. | Eng, DX | HIGH |
| **T3: Staleness trap recreated.** R1 deletes a stale index (proof these rot). R2 creates 13+ new generated artifacts triggered only on *new note*, not edits → silent drift. | CEO, Eng, DX | HIGH |
| **T4: Self-pollution into RAG.** `memory_search.reindex()/incremental_index()` do `rglob("*.md")` with no exclusion → index.md gets embedded and returned as fake "notes" in top-k. (Plan only flagged the index.json pollution, missed the RAG DB one.) | Eng | HIGH |
| **T5: Token win vs strawman baseline.** Comparison is against "read all 259k" (nobody does that). Real baseline = RAG top-k (~1-3k tokens). Against that, index.md is *more* tokens, justified only by recall. | CEO | HIGH |
| **T6: Cheaper alternatives unexamined.** `init_config` already supports `ollama.model`/`embedding_dim` → swapping all-minilm(384) for nomic/bge + reindex is a one-line change. Plus likely chunk-dedup recall bug, plus SQLite FTS5 hybrid (no new infra). All target "RAG misses paraphrases" directly. | CEO | CRITICAL |
| **T7: R4 is the one sound, decoupled item.** `search()` raises ConnectionError on Ollama outage with no fallback (real SPOF). Ship independently. BUT it's a breaking contract change (existing tests assert it raises). | CEO, Eng, DX | keep |

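T3 and T4 both come down to how generated files relate to the notes they summarize. A hedged sketch of a freshness check that reacts to edits (not only new files) and ignores the generated index itself is shown below; `index_is_stale` is a hypothetical helper, not an existing function in heartbeat.py.

```python
from pathlib import Path


def index_is_stale(folder: Path, index_name: str = "index.md") -> bool:
    """True if the index is missing or any note was modified after it."""
    index = folder / index_name
    if not index.exists():
        return True
    index_mtime = index.stat().st_mtime
    return any(
        p.stat().st_mtime > index_mtime
        for p in folder.rglob("*.md")
        if p.name != index_name  # never compare the index against itself
    )
```

Comparing modification times catches the edit-without-new-file case. It does not detect deleted notes, which would need a separate check such as comparing the listed paths against the files on disk.
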
## CEO consensus (subagent-only)
- Right problem? **DISAGREE w/ plan**: likely weak embedding model + chunk-dedup bug, not missing navigation.
- Premises stated? **No**: one query is not enough evidence; token win is vanity baseline.
- 6-month regret: 3 parallel stale metadata copies (SQLite, index.json, index.md).
- Alternatives explored? **No**: BM25/FTS5 hybrid, reranker, better embedder never compared.
- Prior art: OKF unverified/possibly nonexistent; bespoke format = zero portability gain.

## Eng consensus (subagent-only)
- Architecture: R3 unbuildable as written (no folder signal into `search()`). R2-in-update_notes_index acceptable reuse but keep separated from `notes-data/` rewriting.
- Edge cases: T4 self-pollution; heartbeat mtime thrash; `projects/` (236 notes, nested) breaks flat per-folder assumption.
- Tests: R4 breaks `search()` contract (existing tests assert raise); need rewrite + new coverage for R2/R3/T4.

## DX consensus (subagent-only)
- Discoverability: CLAUDE.md:138 calls RAG "single source of truth", so a soft new instruction loses to it; agent keeps defaulting to RAG.
- Human workflow: edit-without-new-file → silent index.md drift.
- Degradation signal (R4): must return `mode="degraded_navigation_only"` + tell user, never silent.
- Latent bug to fix first: `update_notes_index.py:244` references `n['subcategory']`, a key never set (extractor sets `project`).

## Decision Audit Trail

| # | Phase | Decision | Class | Principle | Rationale |
|---|---|---|---|---|---|
| 1 | Eng | Add `index.md` exclusion to BOTH update_notes_index scan AND memory_search rglob (reindex/incremental) | Mechanical | P1 completeness | T4 is silent corruption; non-negotiable IF R2 ships |
| 2 | Eng | R4 split from R2/R3, shipped standalone | Mechanical | P6 action | Highest value/lowest risk, no dependency |
| 3 | DX | R4 returns structured degraded mode + user signal, not silent | Mechanical | P1 | Silent shallow results worse than error |
| 4 | CEO/DX | R3 (hybrid into search()) deferred until routing + consumer resolved (T1/T2) | Taste | P5 explicit | Unbuildable as written |
| 5 | CEO | Add "fix RAG first" track (model test + chunk-dedup + FTS5) before bespoke index | USER CHALLENGE | P3/P4 | Cheaper, reuses infra, targets same symptom, but user's call |
| 6 | all | R1 (delete orphan) + R6 (fix note) ship anytime | Mechanical | P6 | Trivial, independent |

## REVISED scope (post-review)
- **Ship now (safe, independent):** R1 delete orphan, R6 fix note, R4 graceful degradation (with explicit signal + test rewrite), fix latent bug update_notes_index.py:244, chunk-dedup in search().
- **Test before building (cheap, reversible):** swap embedding model (nomic-embed-text/bge-m3) + reindex; re-run the failing paraphrase query; prototype SQLite FTS5 hybrid.
- **Build only if the above doesn't fix recall:** R2 index.md (with T3/T4 lifecycle + exclusion fixes, per-category granularity for projects/), R3 hybrid (after routing + consumer T1/T2 designed).
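
The chunk-dedup item in the "ship now" list can be sketched as keeping only the best-scoring chunk per file before cutting to top-k. The hit shape (`file`, `score`) and the function name are assumptions for illustration; the actual result rows in `memory_search.search()` may carry different fields.

```python
def dedupe_best_chunk_per_file(hits, top_k=8):
    """Keep each file's highest-scoring chunk, then return the top_k files."""
    best = {}
    for hit in hits:
        current = best.get(hit["file"])
        if current is None or hit["score"] > current["score"]:
            best[hit["file"]] = hit
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return ranked[:top_k]
```

Without this step, a long note with several moderately similar chunks can occupy most of the top-k slots and push a different, more relevant file out of the results.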