feat(voice): improve Romanian STT — hallucination gate + finetuned model · ce273d14db - echo-core

feat(voice): improve Romanian STT — hallucination gate + finetuned model

Gemma 4 cloud audio was infeasible (31b-cloud has no audio; E4B broken
upstream, no deploy host), so improve faster-whisper instead.

- Pin temperature=0.0 to disable the fallback ladder that re-decoded unclear
  audio up to 6x (the source of the 16-24s latency outliers), and reject
  hallucinated segments via avg_logprob/compression_ratio in the new pure
  function _filter_segments.
- Adopt mikr/whisper-small-ro-cv11 (CT2 int8) via configurable voice.stt_model:
  spike showed WER 24%->10%, numbers fixed at source, +0.33s p50 (in budget).
- Add tools/voice_stt_mine.py (log mining) + tools/voice_stt_spike.py (model
  eval with diacritic scoring) + tests for the gate and miner.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
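
The hallucination gate from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the commit's actual _filter_segments: the Segment dataclass is a hypothetical stand-in for the segment objects returned by the STT backend, and the two thresholds are assumptions borrowed from Whisper's default decode thresholds (-1.0 for average log-probability, 2.4 for compression ratio), not values taken from this change.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a decoded STT segment."""
    text: str
    avg_logprob: float        # mean token log-probability of the segment
    compression_ratio: float  # gzip ratio of the text; high means repetitive


# Assumed thresholds (Whisper's defaults), not the values used in the commit.
MIN_AVG_LOGPROB = -1.0
MAX_COMPRESSION_RATIO = 2.4


def filter_segments(
    segments: Iterable[Segment],
    min_avg_logprob: float = MIN_AVG_LOGPROB,
    max_compression_ratio: float = MAX_COMPRESSION_RATIO,
) -> List[Segment]:
    """Pure function: keep only segments that pass both confidence checks."""
    kept = []
    for seg in segments:
        if seg.avg_logprob < min_avg_logprob:
            continue  # decoder was unsure: likely a hallucination
        if seg.compression_ratio > max_compression_ratio:
            continue  # highly repetitive text: likely a decoding loop
        kept.append(seg)
    return kept


segments = [
    Segment("Bună ziua", avg_logprob=-0.25, compression_ratio=1.1),
    Segment("mulțumesc mulțumesc mulțumesc", avg_logprob=-0.4, compression_ratio=3.2),
    Segment("...", avg_logprob=-1.6, compression_ratio=0.9),
]
kept_texts = [s.text for s in filter_segments(segments)]
```

Keeping the gate a pure function over already-decoded segments means it can be unit-tested without loading a model. With temperature pinned to a single value there is no re-decode on failure, so a post-hoc filter like this is the only place low-confidence segments get rejected.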

Marius Mutu

2026-06-27 18:16:16 +00:00

parent ec23d188ec

commit ce273d14db

9 changed files with 664 additions and 16 deletions

.gitignore (vendored): 1 line changed

@@ -29,3 +29,4 @@ memory.bak/
 approved-tasks.json
 dashboard/status.json
 tools/anaf-monitor/monitor.log
+models/
