feat(5.18): corpus k-NN exemple etichetate + seed real Haiku (17181 op)

Seed app/data/operatii-etichetate.json regenerat cu subagenti Haiku pe TOATE
cele 17181 operatii distincte (ordine frecventa, 100%), inlocuind seed-ul Groq
(3758). Validare Haiku vs Groq pe 157 op etichetate: la dezacorduri Haiku corect
~22/30, Groq ~0. Haiku prinde gunoiul ratat de Groq (ITP, chirie anvelope, nume
piese fara actiune): NUL 2200 (12.8%) vs ~7.6% Groq; adaptare electronica OE-7
(nu OE-5), placute frana uzura OE-1 (nu OE-F avarie).

US-001..006: prefiltru NUL determinist, etichetator offline, generator seed,
seeder mapping_suggestions (in init_db, gated seed_operatii_enabled), embeddings
indexeaza corpus etichetat, enrich NUL+kNN. Distributie seed: OE-1 80.1%, NUL
12.8%, OE-2 3.5%, restul rar (OE-4/3/7/8/R/I/5, AITLV, R-ODO).

config: seed_operatii_enabled=True + embeddings_enabled=True implicit (SILVER
populat + sugestii semantice; ambele suggestion-only, dezactivabile prin env).

Suita: 1387 passed, 1 deselected (live).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Claude Agent
2026-06-29 06:59:15 +00:00
parent c05fa00007
commit 756f77730f
17 changed files with 139308 additions and 44 deletions

View File

@@ -272,14 +272,18 @@ def test_embeddings_functional_cand_flag_activ(conn, monkeypatch):
get_settings.cache_clear()
monkeypatch.setattr(emb_mod, "_engine", EmbeddingEngine(backend=_FakeEmbedBackend()))
# Nomenclatorul (din fixtura conn) are OE-1..OE-4; adaug coduri cu denumiri keyword.
# Corpusul sursa = mapping_suggestions (SILVER) -- PRD 5.18 US-005.
# (Inainte era nomenclator_rar; migrat la mapping_suggestions ca k-NN sa
# opereze pe exemple reale etichetate, nu pe categorii generice RAR.)
conn.execute(
"INSERT OR REPLACE INTO nomenclator_rar (cod_prestatie, nume_prestatie) VALUES (?, ?)",
("UL-1", "Schimb ulei"),
"INSERT OR REPLACE INTO mapping_suggestions "
"(denumire_normalizata, cod_prestatie, is_nul, source, confidence) VALUES (?, ?, ?, ?, ?)",
("Schimb ulei", "UL-1", 0, "llm", 0.95),
)
conn.execute(
"INSERT OR REPLACE INTO nomenclator_rar (cod_prestatie, nume_prestatie) VALUES (?, ?)",
("FR-1", "Placute frana"),
"INSERT OR REPLACE INTO mapping_suggestions "
"(denumire_normalizata, cod_prestatie, is_nul, source, confidence) VALUES (?, ?, ?, ?, ?)",
("Placute frana", "FR-1", 0, "llm", 0.95),
)
conn.commit()