feat(5.18): k-NN corpus of labeled examples + real Haiku seed (17181 ops)

Seed app/data/operatii-etichetate.json regenerated with Haiku subagents over ALL
17181 distinct operations (frequency order, 100%), replacing the Groq seed
(3758). Haiku vs Groq validation on 157 labeled ops: on disagreements Haiku is
correct in ~22/30 cases, Groq in ~0. Haiku catches the junk Groq missed (ITP,
tire rental, part names with no action): NUL 2200 (12.8%) vs ~7.6% for Groq;
electronic adaptation is OE-7 (not OE-5), brake pad wear is OE-1 (not OE-F, damage).

US-001..006: deterministic NUL prefilter, offline labeler, seed generator,
mapping_suggestions seeder (in init_db, gated by seed_operatii_enabled),
embeddings index the labeled corpus, enrich NUL+kNN. Seed distribution: OE-1
80.1%, NUL 12.8%, OE-2 3.5%, the rest rare (OE-4/3/7/8/R/I/5, AITLV, R-ODO).
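
A deterministic NUL prefilter of the kind listed under US-001 might look roughly like the sketch below. The patterns and the function name `is_nul_candidate` are invented for illustration, based only on the junk examples named above (ITP, "chirie anvelope"); the actual rules are not shown in this commit.

```python
import re

# Hypothetical patterns, invented for illustration only; the real prefilter
# rules are not part of the visible diff.
_NUL_PATTERNS = [
    re.compile(r"\bitp\b", re.IGNORECASE),               # periodic inspection entry
    re.compile(r"\bchirie\s+anvelope\b", re.IGNORECASE),  # tire rental entry
]


def is_nul_candidate(denumire: str) -> bool:
    """Return True when the description is empty or matches a known non-operation pattern."""
    text = denumire.strip()
    if not text:
        return True
    return any(pattern.search(text) for pattern in _NUL_PATTERNS)
```

Because the check is a fixed pattern list, the same input always gets the same answer, which is what lets it run before any model-based labeling.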

config: seed_operatii_enabled=True + embeddings_enabled=True by default (SILVER
populated + semantic suggestions; both suggestion-only, can be disabled via env).
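
As a minimal sketch of env-based toggling for the two flags, assuming plain `os.environ` lookups: the variable names `SEED_OPERATII_ENABLED` and `EMBEDDINGS_ENABLED` and the helper `env_flag` are assumptions, since the project's real settings loader is not shown here.

```python
import os


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment, falling back to `default` when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Assumed variable names; both flags default to True, matching the commit message.
seed_operatii_enabled = env_flag("SEED_OPERATII_ENABLED", True)
embeddings_enabled = env_flag("EMBEDDINGS_ENABLED", True)
```

Under this sketch, exporting `EMBEDDINGS_ENABLED=false` before startup would switch off semantic suggestions while leaving the seeder on.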

Test suite: 1387 passed, 1 deselected (live).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude Agent
2026-06-29 06:59:15 +00:00
parent c05fa00007
commit 756f77730f
17 changed files with 139308 additions and 44 deletions

@@ -135,10 +135,12 @@ class EmbeddingEngine:
         denumire: str,
         top_k: int = 3,
     ) -> list[dict]:
-        """Returns top_k cosine neighbors [{cod, similaritate}].
+        """Returns top_k cosine neighbors [{cod, is_nul, similaritate}].
 
-        Returns [] if the backend is missing, the corpus is empty, or
-        any exception occurs (graceful degradation -- does not block ingestion).
+        `is_nul` (PRD 5.18 US-005): when the corpus includes NUL examples (non-operations),
+        a NUL neighbor is a SUPPRESSION signal, not a code. Defaults to False on older corpora
+        without `is_nul` in their items. Returns [] if the backend is missing, the corpus is
+        empty, or any exception occurs (graceful degradation -- does not block ingestion).
         """
         if not self.is_available() or not self._corpus_items:
             return []
@@ -149,6 +151,7 @@ class EmbeddingEngine:
             scored = [
                 {
                     "cod": item["cod"],
+                    "is_nul": bool(item.get("is_nul", False)),
                     "similaritate": _cosine_similarity(query_vec, vec),
                 }
                 for item, vec in zip(self._corpus_items, self._corpus_vecs)
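
A self-contained sketch of the scoring in this hunk, plus one possible way for a caller to treat a NUL neighbor as a suppression signal. The names `top_k_neighbors` and `suggest_code`, the `min_similarity` threshold, and the `"NUL"` code value in the usage example are assumptions for illustration; only the `cod` / `is_nul` / `similaritate` item shape and the `False` default come from the diff.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 when either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_neighbors(query_vec, corpus_items, corpus_vecs, top_k=3):
    """Score every corpus item against the query and keep the top_k most similar."""
    scored = [
        {
            "cod": item["cod"],
            # Older corpus items carry no `is_nul` key, so default to False.
            "is_nul": bool(item.get("is_nul", False)),
            "similaritate": cosine_similarity(query_vec, vec),
        }
        for item, vec in zip(corpus_items, corpus_vecs)
    ]
    scored.sort(key=lambda n: n["similaritate"], reverse=True)
    return scored[:top_k]


def suggest_code(neighbors, min_similarity=0.8):
    """Hypothetical consumer rule: no suggestion when the best neighbor is NUL or too far."""
    if not neighbors:
        return None
    best = neighbors[0]
    if best["is_nul"] or best["similaritate"] < min_similarity:
        return None
    return best["cod"]
```

Usage, with toy two-dimensional vectors:

```python
items = [{"cod": "OE-1"}, {"cod": "NUL", "is_nul": True}]
vecs = [[1.0, 0.0], [0.0, 1.0]]
print(suggest_code(top_k_neighbors([1.0, 0.1], items, vecs)))  # nearest is OE-1
print(suggest_code(top_k_neighbors([0.1, 1.0], items, vecs)))  # nearest is NUL, suppressed
```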