Gemma 4 cloud audio proved infeasible (31b-cloud has no audio support;
E4B is broken upstream, with no deploy host), so improve faster-whisper
instead.
- Pin temperature=0.0 to disable the temperature fallback ladder, which
re-decoded unclear audio up to 6x and caused the 16-24s latency
outliers. Reject hallucinated segments by avg_logprob and
compression_ratio in the new pure function _filter_segments.
- Adopt mikr/whisper-small-ro-cv11 (CT2 int8) via the configurable
voice.stt_model setting. The spike showed WER 24% -> 10%, numbers fixed
at source, and +0.33s p50 latency (within budget).
- Add tools/voice_stt_mine.py (log mining) and tools/voice_stt_spike.py
(model eval with diacritic scoring), plus tests for the gate and miner.
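
For reviewers, a minimal sketch of the kind of gate the first bullet
describes. This is not the committed _filter_segments: the Seg type, the
function name, and both thresholds are assumptions for illustration
(-1.0 and 2.4 are borrowed from Whisper's default log-prob and
compression-ratio thresholds, not taken from this change).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Seg:
    """Hypothetical stand-in for a decoded segment (illustration only)."""
    text: str
    avg_logprob: float        # mean token log-probability of the segment
    compression_ratio: float  # high values indicate repetitive output


def filter_segments_sketch(
    segments: Iterable[Seg],
    min_avg_logprob: float = -1.0,       # assumed threshold
    max_compression_ratio: float = 2.4,  # assumed threshold
) -> List[Seg]:
    """Pure filter: keep confident, non-repetitive segments.

    No I/O and no mutation, so it can be unit-tested without a model.
    """
    return [
        s
        for s in segments
        if s.avg_logprob >= min_avg_logprob
        and s.compression_ratio <= max_compression_ratio
    ]


if __name__ == "__main__":
    demo = [
        Seg("bună ziua", -0.3, 1.2),    # confident, normal text
        Seg("la la la la", -0.4, 3.1),  # repetitive: ratio too high
        Seg("???", -1.7, 1.0),          # low confidence
    ]
    for seg in filter_segments_sketch(demo):
        print(seg.text)
```

With temperature pinned to a single value there is no re-decode to fall
back on, so a gate like this is what drops low-confidence or repetitive
output after the one decoding pass.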
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>