feat(voice): improve Romanian STT — hallucination gate + finetuned model

Gemma 4 cloud audio was infeasible (31b-cloud has no audio; E4B broken
upstream, no deploy host), so improve faster-whisper instead.

- Pin temperature=0.0 to disable the fallback ladder that re-decoded unclear
  audio up to 6x (source of the 16-24s latency outliers); reject hallucinated
  segments via avg_logprob/compression_ratio in the new pure _filter_segments.
- Adopt mikr/whisper-small-ro-cv11 (CT2 int8) via configurable voice.stt_model:
  spike showed WER 24%->10%, numbers fixed at source, +0.33s p50 (in budget).
- Add tools/voice_stt_mine.py (log mining) + tools/voice_stt_spike.py (model
  eval with diacritic scoring) + tests for the gate and miner.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
2026-06-27 18:16:16 +00:00
parent ec23d188ec
commit ce273d14db
9 changed files with 664 additions and 16 deletions

1
.gitignore vendored
View File

@@ -29,3 +29,4 @@ memory.bak/
approved-tasks.json
dashboard/status.json
tools/anaf-monitor/monitor.log
models/