feat: anti-hallucination params + retranscribe script for fixing broken transcripts

- transcribe.py: add --max-context 0, --entropy-thold 2.4, --max-len 60,
  --suppress-nst, --no-fallback to whisper.cpp to prevent hallucination loops
- transcribe.py: remove interactive quality gate (runs unattended now)
- run.bat: remove pause prompts for unattended operation
- retranscribe_tail.py: new script that detects hallucination bursts in SRT
  files, extracts and re-transcribes only the affected audio segments, then
  splices the result back together. Drops segments that re-hallucinate
  (silence/music). Backs up originals to transcripts/backup/.
- fix_hallucinations.bat: Windows wrapper for retranscribe_tail.py
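
Illustrative sketch of the burst detection described for retranscribe_tail.py. The real heuristic is not shown in this commit view, so this assumes a "burst" means the same cue text repeated several times in a row; `parse_srt`, `find_bursts` and `min_repeats` are hypothetical names, not taken from the script.

```python
import re

# Matches SRT timestamps such as 00:01:02,345 (a dot separator is tolerated too).
TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp string to seconds as a float."""
    h, m, s, ms = (int(x) for x in TIME.match(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Return a list of (start_s, end_s, text) cues from SRT content."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Locate the timing line; the numeric index line before it is optional.
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start, end = lines[idx].split("-->")
        body = " ".join(lines[idx + 1:]).strip()
        cues.append((to_seconds(start), to_seconds(end), body))
    return cues


def find_bursts(cues, min_repeats=4):
    """Return (start_s, end_s) spans where identical text repeats in a row.

    ASSUMPTION: repetition of identical text is the hallucination signal;
    the threshold of 4 is arbitrary and chosen only for illustration.
    """
    bursts = []
    i = 0
    while i < len(cues):
        key = cues[i][2].lower()
        j = i
        # Extend the run while the next cue carries the same text.
        while j + 1 < len(cues) and cues[j + 1][2].lower() == key:
            j += 1
        if key and j - i + 1 >= min_repeats:
            bursts.append((cues[i][0], cues[j][1]))
        i = j + 1
    return bursts
```

Each returned span is the audio range that would be cut out, re-transcribed, and spliced back in; a span that produces another burst on the second pass would be dropped as silence or music.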

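Illustrative sketch of how the whisper.cpp flags listed above might be assembled in transcribe.py. The flag names and values come from this commit message; the function name, binary name, and model and audio paths are hypothetical.

```python
def build_whisper_args(binary, model, wav_path):
    """Return the argv list for one whisper.cpp run with the anti-hallucination flags.

    ASSUMPTION: binary, model and wav_path are placeholders supplied by the
    caller; only the five flags below are taken from the commit message.
    """
    return [
        binary,
        "-m", model,
        "-f", wav_path,
        "--output-srt",
        "--max-context", "0",      # keep no prior text context between segments
        "--entropy-thold", "2.4",  # entropy threshold for treating a decode as failed
        "--max-len", "60",         # cap segment length at 60 characters
        "--suppress-nst",          # suppress non-speech tokens
        "--no-fallback",           # do not retry with temperature fallback
    ]


# Example call with placeholder paths; pass the result to subprocess.run to execute.
args = build_whisper_args("whisper-cli", "models/ggml-base.bin", "audio/talk.wav")
```

Building the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.
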
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 21:17:14 +02:00
parent 56e676618f
commit 763999f3a9
4 changed files with 497 additions and 13 deletions


@@ -274,8 +274,7 @@ if "%~1"=="" (
 if errorlevel 1 (
 echo.
 echo WARNING: Some downloads failed. Check download_errors.log
-echo Press any key to continue to transcription anyway, or Ctrl+C to abort.
-pause >nul
+echo Continuing to transcription automatically...
 )
 :: ============================================================
@@ -312,4 +311,3 @@ echo.
 echo Next step: generate summaries from WSL2 with Claude Code
 echo python summarize.py
 echo ============================================================
-pause