feat(voice): improve Romanian STT — hallucination gate + finetuned model

Gemma 4 cloud audio was infeasible (31b-cloud has no audio; E4B broken
upstream, no deploy host), so improve faster-whisper instead.

- Pin temperature=0.0 to disable the fallback ladder that re-decoded
  unclear audio up to 6x (source of the 16-24s latency outliers);
  reject hallucinated segments via avg_logprob/compression_ratio in
  the new pure _filter_segments.
- Adopt mikr/whisper-small-ro-cv11 (CT2 int8) via configurable
  voice.stt_model: spike showed WER 24%->10%, numbers fixed at source,
  +0.33s p50 (in budget).
- Add tools/voice_stt_mine.py (log mining) + tools/voice_stt_spike.py
  (model eval with diacritic scoring) + tests for the gate and miner.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
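
The hallucination gate described in the message could look roughly like the sketch below. It is not the `_filter_segments` from this commit, whose body is not shown here. The `Segment` dataclass is a hypothetical stand-in for the objects faster-whisper yields (which expose `text`, `avg_logprob` and `compression_ratio`), and the thresholds are assumptions borrowed from Whisper's usual defaults (-1.0 and 2.4), not values confirmed by this repository.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a faster-whisper segment."""
    text: str
    avg_logprob: float
    compression_ratio: float


# Assumed thresholds (Whisper's common defaults), not taken from the commit.
MIN_AVG_LOGPROB = -1.0
MAX_COMPRESSION_RATIO = 2.4


def filter_segments_sketch(
    segments: Iterable[Segment],
    min_avg_logprob: float = MIN_AVG_LOGPROB,
    max_compression_ratio: float = MAX_COMPRESSION_RATIO,
) -> List[Segment]:
    """Pure function: keep only segments that pass both gates."""
    kept = []
    for seg in segments:
        # Very low average log-probability: the decoder was guessing.
        if seg.avg_logprob < min_avg_logprob:
            continue
        # Very high compression ratio: repetitive, loop-like output.
        if seg.compression_ratio > max_compression_ratio:
            continue
        kept.append(seg)
    return kept


if __name__ == "__main__":
    sample = [
        Segment("bună ziua", -0.3, 1.2),
        Segment("mulțumesc mulțumesc mulțumesc", -0.4, 3.1),
        Segment("...", -1.6, 1.0),
    ]
    for seg in filter_segments_sketch(sample):
        print(seg.text)
```

Keeping the gate free of I/O and model state is what makes it easy to unit-test against mined log samples. With `temperature=0.0` pinned, the decoder no longer retries at higher temperatures, so a gate of this kind is what rejects low-quality segments.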
@@ -110,7 +110,8 @@
     ],
     "user_name": "Marius",
     "default_voice": "F1",
-    "auto_leave_minutes": 5
+    "auto_leave_minutes": 5,
+    "stt_model": "/home/moltbot/echo-core/models/whisper-small-ro-cv11-int8"
   },
   "paths": {
     "personality": "personality/",
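
As a rough illustration of how the new `stt_model` key might be consumed, here is a minimal sketch. It assumes the key lives under a top-level `voice` object (as `voice.stt_model` in the commit message suggests). `resolve_stt_model` and the `"small"` fallback are hypothetical names chosen for this example, not code from the repository.

```python
from typing import Any, Dict

# Hypothetical fallback used when no model is configured.
DEFAULT_STT_MODEL = "small"


def resolve_stt_model(config: Dict[str, Any],
                      default: str = DEFAULT_STT_MODEL) -> str:
    """Return voice.stt_model if it is a non-empty string, else the default."""
    voice = config.get("voice") or {}
    model = voice.get("stt_model")
    if isinstance(model, str) and model.strip():
        return model.strip()
    return default


if __name__ == "__main__":
    cfg = {
        "voice": {
            "auto_leave_minutes": 5,
            "stt_model": "/home/moltbot/echo-core/models/whisper-small-ro-cv11-int8",
        }
    }
    print(resolve_stt_model(cfg))
    print(resolve_stt_model({}))
```

The returned string could then be handed to the model loader; faster-whisper's `WhisperModel` accepts either a model size name or a path to a converted CTranslate2 model directory.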