Files
echo-core/docs/okf-navigation-plan.md
Marius Mutu 5c9748ffb4 feat(memory): hybrid retrieval — navigation index.md + RAG hardening
Expose a navigation layer to the agent and harden RAG, after analyzing the
OKF note and testing on the real KB.

- memory_search.search(): dedupe best-chunk-per-file (a relevant note can no
  longer be buried by another file's chunks) + keyword fallback tagged
  degraded:True when Ollama is unreachable (no more hard crash).
- update_notes_index.py: emit per-folder index.md + root router; prune empty
  folders; fix latent subcategory->project bug.
- Exclude generated index.md from RAG rglob (reindex/incremental) + indexer
  scans + heartbeat freshness check (prevents self-pollution / reindex thrash).
- CLAUDE.md: reframe memory as hybrid (navigation first, RAG for fuzzy recall).
- Delete stale orphan kb/youtube/index.json; correct the OKF source note.
- Tests: dedup, keyword fallback, index.md exclusion. Plan + review in docs/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:52:27 +00:00

114 lines
7.9 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Plan: Navigation layer pentru memoria agentului (OKF-inspired)
Sursă: analiza notei `memory/kb/youtube/2026-06-27_google-open-knowledge-format.md`
+ test empiric pe KB-ul real (151 note youtube, 581 note total).
## Context / problemă
Agentul Echo caută în memorie DOAR prin RAG (`src/memory_search.py`: Ollama
`all-minilm` 384-dim + cosine scan în SQLite). CLAUDE.md îl declară "single
source of truth". Test empiric: RAG ratează nota relevantă când query-ul e
parafrazat conceptual (ex. "cum organizez un KB pt agenți să folosească mai
puțini tokens" → nota OKF nu apare în top-8). `memory/kb/index.json` există
(581 note, regenerat azi) dar e consumat DOAR de dashboard-ul web (căi
`notes-data/`), nu de agent, și are 84k tokens. Există un orfan stale
`kb/youtube/index.json` (8/151 note, 5 luni vechime).
## Obiectiv
Dă agentului un strat de navigare ieftin și robust care completează RAG-ul
(nu îl înlocuiește), prinde parafrazele pe care embeddings le ratează, și
merge ca fallback când Ollama remote pică.
## Recomandări (scope propus)
### R1 — Șterge orfanul `kb/youtube/index.json`
Stale din 30 ian (8/151 note). Capcana "index învechit > lipsă index".
Efort: trivial.
### R2 — Generează `index.md` slim per-folder, auto
Extinde `tools/update_notes_index.py` să emită, pe lângă `index.json`, un
`index.md` per subfolder kb/ (title + descriere 1 rând + tags). Pilot dovedit:
youtube/ index.md = 11k tokens vs 259k (citit tot, 24×) vs 84k (index global,
7.7×). Capcană: scriptul scanează `*.md` recursiv → trebuie să excludă
explicit `index.md` ca să nu-l trateze ca notă (poluează index.json).
Regenerat din heartbeat.py la fiecare notă nouă.
### R3 — Expune navigarea agentului (hibrid cu RAG)
La `memory_search`, încarcă întâi index.md slim al folderului-țintă pe lângă
top-k din RAG, și combină. Prinde și parafraza, și keyword-ul. Instrucțiune în
CLAUDE.md cum să folosească indexul.
### R4 — Tratează Ollama remote ca SPOF
RAG depinde de host remote (`10.0.20.161:11434`). Dacă pică, `search()` aruncă
ConnectionError → memoria agentului dispare. index.md per-folder = fallback
fără Ollama. Adaugă degradare grațioasă în memory_search.search().
### R5 — NU face conversie big-bang la YAML front matter
Doar 6/586 note au YAML; update_notes_index.py extrage deja metadata din
convenția `**Tags:**`/`**Data:**`. Standardizează doar de-acum în template-ul
de notă nouă.
### R6 — Corectează nota OKF
Marchează "Google a lansat OKF" ca neverificat (o sursă YouTube; se confundă
cu Open Knowledge Foundation). Actualizează "Relevanță": nu lipsesc indexuri,
lipsește un index navigabil EXPUS agentului.
## NU în scope
- Vizualizare HTML graph a KB-ului (deprioritizat, efort mare/valoare mică).
- Înlocuirea RAG cu navigare pură (hibrid, nu substituție).
- Migrare ANN/vector-ext pentru viteza RAG (separat).
---
<!-- /autoplan review report -->
# GSTACK REVIEW REPORT (/autoplan)
Voices: Claude subagent only — **codex missing** on this host (all phases `[subagent-only]`).
Phases run: CEO, Eng, DX. Design **skipped** (no UI scope — HTML viz is out of scope).
## Cross-phase themes (flagged independently in 2-3 phases = high confidence)
| Theme | Phases | Severity |
|---|---|---|
| **T1 — R3 routing is undefined.** "Load the target folder's index.md" requires already knowing the folder — that IS the navigation problem. The 11k figure holds only for youtube alone; loading all 13 folders ≈ 43-84k, erasing the win. | CEO, Eng, DX | CRITICAL |
| **T2 — Wrong consumer.** The autonomous agent (Claude CLI in heartbeat.py) has filesystem access and never calls `search()`. Wiring R3 into `memory_search.search()` only changes the human `/search` command, not the agent. | Eng, DX | HIGH |
| **T3 — Staleness trap recreated.** R1 deletes a stale index (proof these rot). R2 creates 13+ new generated artifacts triggered only on *new note*, not edits → silent drift. | CEO, Eng, DX | HIGH |
| **T4 — Self-pollution into RAG.** `memory_search.reindex()/incremental_index()` do `rglob("*.md")` with no exclusion → index.md gets embedded and returned as fake "notes" in top-k. (Plan only flagged the index.json pollution, missed the RAG DB one.) | Eng | HIGH |
| **T5 — Token win vs strawman baseline.** Comparison is against "read all 259k" (nobody does that). Real baseline = RAG top-k (~1-3k tokens). Against that, index.md is *more* tokens, justified only by recall. | CEO | HIGH |
| **T6 — Cheaper alternatives unexamined.** `init_config` already supports `ollama.model`/`embedding_dim` → swapping all-minilm(384) for nomic/bge + reindex is a one-line change. Plus likely chunk-dedup recall bug, plus SQLite FTS5 hybrid (no new infra). All target "RAG misses paraphrases" directly. | CEO | CRITICAL |
| **T7 — R4 is the one sound, decoupled item.** `search()` raises ConnectionError on Ollama outage with no fallback (real SPOF). Ship independently. BUT it's a breaking contract change (existing tests assert it raises). | CEO, Eng, DX | keep |
## CEO consensus (subagent-only)
- Right problem? **DISAGREE w/ plan** — likely weak embedding model + chunk-dedup bug, not missing navigation.
- Premises stated? **No** — one query is not enough evidence; token win is vanity baseline.
- 6-month regret: 3 parallel stale metadata copies (SQLite, index.json, index.md).
- Alternatives explored? **No** — BM25/FTS5 hybrid, reranker, better embedder never compared.
- Prior art: OKF unverified/possibly nonexistent; bespoke format = zero portability gain.
## Eng consensus (subagent-only)
- Architecture: R3 unbuildable as written (no folder signal into `search()`). R2-in-update_notes_index acceptable reuse but keep separated from `notes-data/` rewriting.
- Edge cases: T4 self-pollution; heartbeat mtime thrash; `projects/` (236 notes, nested) breaks flat per-folder assumption.
- Tests: R4 breaks `search()` contract — existing tests assert raise; need rewrite + new coverage for R2/R3/T4.
## DX consensus (subagent-only)
- Discoverability: CLAUDE.md:138 calls RAG "single source of truth" — a soft new instruction loses to it; agent keeps defaulting to RAG.
- Human workflow: edit-without-new-file → silent index.md drift.
- Degradation signal (R4): must return `mode="degraded_navigation_only"` + tell user, never silent.
- Latent bug to fix first: `update_notes_index.py:244` references `n['subcategory']`, a key never set (extractor sets `project`).
## Decision Audit Trail
| # | Phase | Decision | Class | Principle | Rationale |
|---|---|---|---|---|---|
| 1 | Eng | Add `index.md` exclusion to BOTH update_notes_index scan AND memory_search rglob (reindex/incremental) | Mechanical | P1 completeness | T4 is silent corruption; non-negotiable IF R2 ships |
| 2 | Eng | R4 split from R2/R3, shipped standalone | Mechanical | P6 action | Highest value/lowest risk, no dependency |
| 3 | DX | R4 returns structured degraded mode + user signal, not silent | Mechanical | P1 | Silent shallow results worse than error |
| 4 | CEO/DX | R3 (hybrid into search()) deferred until routing + consumer resolved (T1/T2) | Taste | P5 explicit | Unbuildable as written |
| 5 | CEO | Add "fix RAG first" track (model test + chunk-dedup + FTS5) before bespoke index | USER CHALLENGE | P3/P4 | Cheaper, reuses infra, targets same symptom — but user's call |
| 6 | all | R1 (delete orphan) + R6 (fix note) ship anytime | Mechanical | P6 | Trivial, independent |
## REVISED scope (post-review)
- **Ship now (safe, independent):** R1 delete orphan, R6 fix note, R4 graceful degradation (with explicit signal + test rewrite), fix latent bug update_notes_index.py:244, chunk-dedup in search().
- **Test before building (cheap, reversible):** swap embedding model (nomic-embed-text/bge-m3) + reindex; re-run the failing paraphrase query; prototype SQLite FTS5 hybrid.
- **Build only if the above doesn't fix recall:** R2 index.md (with T3/T4 lifecycle + exclusion fixes, per-category granularity for projects/), R3 hybrid (after routing + consumer T1/T2 designed).