feat(5.18): k-NN corpus of labeled examples + real Haiku seed (17181 ops)

Seed app/data/operatii-etichetate.json regenerated with Haiku subagents over ALL
17181 distinct operations (frequency order, 100%), replacing the Groq seed
(3758). Haiku vs Groq validation on 157 labeled ops: on disagreements Haiku is
correct ~22/30, Groq ~0. Haiku catches the junk Groq missed (ITP, tire rental,
part names with no action): NUL 2200 (12.8%) vs ~7.6% for Groq; electronic
adaptation is OE-7 (not OE-5), worn brake pads are OE-1 (not OE-F, damage).

US-001..006: deterministic NUL prefilter, offline labeler, seed generator,
mapping_suggestions seeder (in init_db, gated by seed_operatii_enabled),
embeddings index the labeled corpus, enrich NUL+kNN. Seed distribution: OE-1
80.1%, NUL 12.8%, OE-2 3.5%, the rest rare (OE-4/3/7/8/R/I/5, AITLV, R-ODO).
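
To illustrate the k-NN step over the labeled corpus, here is a minimal sketch
of a majority vote among the nearest labeled examples. It is not the project's
implementation: `cosine`, `knn_suggest`, the toy 2-D vectors, and the choice of
cosine similarity with k=3 are all assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_suggest(query_vec, corpus, k=3):
    """Return the majority label among the k corpus entries closest to query_vec.

    corpus is a list of (vector, label) pairs; returns None for an empty corpus.
    Hypothetical helper: the real code may rank and vote differently.
    """
    if not corpus:
        return None
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy corpus: made-up embeddings, labels taken from the seed distribution above.
corpus = [
    ([1.0, 0.0], "OE-1"),
    ([0.9, 0.1], "OE-1"),
    ([0.0, 1.0], "NUL"),
    ([0.1, 0.9], "NUL"),
]
suggestion = knn_suggest([0.8, 0.2], corpus, k=3)
```

Like the rest of this layer, the result is meant as a suggestion only; nothing
in the sketch sends or resolves anything automatically.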

config: seed_operatii_enabled=True + embeddings_enabled=True by default (SILVER
populated + semantic suggestions; both suggestion-only, can be disabled via env).
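
A rough sketch of how an env override for these two flags behaves. The real
project reads them through `BaseSettings`; the helper `env_flag` below is a
hypothetical stand-in written with the standard library only, and the accepted
spellings ("false", "0", "no", "off", ...) are an assumption for the example.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_flag(name, default, environ=None):
    """Read a boolean flag from the environment, falling back to default.

    Hypothetical helper for illustration; raises ValueError on unknown values
    instead of guessing.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name}: cannot interpret {raw!r} as a boolean")


# Both flags default to True; an explicit env value turns them off.
fake_env = {"AUTOPASS_EMBEDDINGS_ENABLED": "false"}
embeddings_on = env_flag("AUTOPASS_EMBEDDINGS_ENABLED", True, fake_env)
seed_on = env_flag("AUTOPASS_SEED_OPERATII_ENABLED", True, fake_env)
```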

Suite: 1387 passed, 1 deselected (live).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude Agent
2026-06-29 06:59:15 +00:00
parent c05fa00007
commit 756f77730f
17 changed files with 139308 additions and 44 deletions


@@ -117,11 +117,21 @@ class Settings(BaseSettings):
     enforce_plans: bool = True
     # --- Embeddings (mapping suggestion, Layer 2 PRD 5.14) ---
-    # DISABLED by default: first use lazy-loads the fastembed/ONNX model
-    # (~230MB on disk) synchronously in the request thread -> hang on the first /mapari request.
-    # Enable it explicitly in production (start.sh/Docker/.env) when you want semantic suggestions.
-    # OFF keeps the test suite fast and /mapari instant (falls back to GOLD/SILVER+fuzzy).
-    embeddings_enabled: bool = False
+    # ENABLED by default: the mapping editor offers semantic suggestions (fastembed/ONNX model).
+    # Cost: first use lazy-loads the model (~230MB on disk) synchronously in the request
+    # thread -> the first /mapari request can take 30-120s until the model is in memory; later
+    # requests are instant. SUGGESTION-ONLY: it does not feed resolve_prestatii (no auto-send).
+    # Set it to False (start.sh/Docker/.env: AUTOPASS_EMBEDDINGS_ENABLED=false) when you want
+    # /mapari instant on the first request or a fast test suite (falls back to GOLD/SILVER+fuzzy).
+    embeddings_enabled: bool = True
+    # --- Labeled operations corpus seed (SILVER, PRD 5.18 US-004) ---
+    # ENABLED by default: at init_db, populates mapping_suggestions from the committed artifact
+    # `app/data/operatii-etichetate.json` (INSERT OR IGNORE). This way SILVER is no longer empty
+    # in production -> real exact-match suggestions + k-NN corpus. SUGGESTION-ONLY.
+    # Set it to False (AUTOPASS_SEED_OPERATII_ENABLED=false) when you want an empty SILVER;
+    # conftest disables it globally, and tests that need it turn it on individually.
+    seed_operatii_enabled: bool = True
     @property
     def rar_base_url(self) -> str:
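
The `INSERT OR IGNORE` seeding mentioned in the comment above can be sketched
as follows. This is an illustration under stated assumptions, not the seeder
from this commit: the function `seed_suggestions`, the two-column schema
(`operatie`, `cod`), and the sample rows are hypothetical. The point is that
re-running the seed at every `init_db` is idempotent and never overwrites a
row that already exists.

```python
import sqlite3


def seed_suggestions(conn, rows):
    """Insert (operatie, cod) rows, skipping keys already present.

    Returns the number of rows actually added. Schema is hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mapping_suggestions ("
        "operatie TEXT PRIMARY KEY, cod TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM mapping_suggestions").fetchone()[0]
    # OR IGNORE: a row whose primary key already exists is silently skipped.
    conn.executemany(
        "INSERT OR IGNORE INTO mapping_suggestions (operatie, cod) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM mapping_suggestions").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
seed_rows = [("placute frana uzura", "OE-1"), ("ITP", "NUL")]
first_run = seed_suggestions(conn, seed_rows)   # fresh table: both rows inserted
second_run = seed_suggestions(conn, seed_rows)  # same keys again: nothing added
```

Because conflicting keys are ignored rather than replaced, a value already in
the table survives later seed runs, which fits a suggestion-only data source.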