feat(practitioner): per-module structure + source PDFs + 2-PC split

- audio/Modul {N}/filename.mp3: each module lives in its own subdirectory,
  for copying to a phone and transferring between PCs.
- PDFs are kept as source files in summaries/pdf/ (no text extraction).
- transcribe_status="pdf_source_only" for PDF lectures → summarize.py
  filters them out automatically.
- Fix transcript_path collision in the manifest (stem-based, no longer
  preserving the prior value).
- One .bat per module (M2-M8) + dispatchers run_pc1_all (M2-M5) and
  run_pc2_all (M6-M8) to split the work across 2 PCs.
- prepare_pc2_bundle.py: zip with scripts + manifest + .env + PDFs for
  PC2 (self-installs whisper.cpp/model/ffmpeg on first run).
- M1 whisper complete (49/49 audio+vimeo transcribed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 08:48:58 +03:00
parent 2e4bb88624
commit 6ee53133b7
132 changed files with 28904 additions and 74 deletions
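
The commit message says summarize.py filters out lectures marked transcribe_status="pdf_source_only". That script is not shown here, so the following is only a minimal sketch of what such a filter might look like. The function name lectures_to_summarize and the top-level "modules" manifest key are assumptions made for illustration; only "lectures", "type", "title" and "transcribe_status" appear in the diff.

```python
# Hypothetical sketch: the real summarize.py is not part of the visible diff.
PDF_SOURCE_ONLY = "pdf_source_only"


def lectures_to_summarize(manifest):
    """Yield lectures with a finished transcript, skipping source-only PDFs."""
    # "modules" is an assumed key name for the list of modules.
    for mod in manifest.get("modules", []):
        for lec in mod.get("lectures", []):
            status = lec.get("transcribe_status")
            if status == PDF_SOURCE_ONLY:
                # PDFs are kept as source files only; there is no transcript.
                continue
            if status == "complete":
                yield lec


# Small made-up manifest to exercise the filter.
manifest = {
    "modules": [
        {
            "lectures": [
                {"title": "Intro", "type": "video", "transcribe_status": "complete"},
                {"title": "Handout", "type": "pdf", "transcribe_status": PDF_SOURCE_ONLY},
                {"title": "Notes", "type": "text", "transcribe_status": "complete"},
                {"title": "Pending", "type": "audio"},
            ]
        }
    ]
}

titles = [lec["title"] for lec in lectures_to_summarize(manifest)]
```

Under this sketch the explicit check on pdf_source_only is redundant with the "complete" check, but it keeps the intent visible: PDF lectures are skipped on purpose, not because they are unfinished.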


@@ -211,11 +211,18 @@ def main():
         for lec in mod["lectures"]:
             total += 1
-            # Text and PDF lectures bypass whisper — transcript written by download.py.
-            if lec.get("type") in ("text", "pdf"):
+            # Text lectures have transcript written by download.py.
+            if lec.get("type") == "text":
                 lec["transcribe_status"] = "complete"
                 skipped += 1
-                log.info(f" Skipping {lec.get('type')}: {lec['title']}")
+                log.info(f" Skipping text: {lec['title']}")
                 continue
+            # PDF lectures are source-only (no transcript, no whisper). Preserve
+            # download.py's 'pdf_source_only' state so summarize.py filters them out.
+            if lec.get("type") == "pdf":
+                skipped += 1
+                log.info(f" Skipping pdf (source-only): {lec['title']}")
+                continue
             if lec.get("download_status") != "complete":