nlp-master/.gitignore
Marius Mutu d22038d002 refactor: parametrize pipeline with --course flag + Vimeo/text support
A single set of scripts now runs on any course configured in
courses.py. Master stays at the repo root (backward compatible with M1-M6);
new courses (e.g. practitioner at shop.cursnlp.ro) get a dedicated
root (nlp-practitioner/) with their own artifacts.

- courses.py: config dict (master, practitioner) + course_paths() +
  validate_manifest_course() (a manifest without course_key = master).
- download.py: --course + --modules; three lesson types (HTTP audio,
  Vimeo iframe via yt-dlp audio-only, text-only with HTML capture);
  merges with the existing manifest instead of replacing it; strips
  [Audio] for backward-compatible paths.
- transcribe.py: --course + --modules; skips type==text; paths via
  course_paths(); course_key validation.
- summarize.py: --course + --compile; the prompt template uses
  course['name']; writes SUPORT_CURS.md with explicit LF (WSL2 baseline).
- md_to_pdf.py: --course resolves summaries_dir / pdf_dir per course.
- run.bat: detects master|practitioner as the first argument,
  propagates --course to the sub-scripts; backward compatible with
  run.bat [modules].
- requirements.txt: + yt-dlp.
- .gitignore: nlp-practitioner/audio/, audio_wav/, scratch_recon.py, tmp_recon/.
- tests/test_regression.sh: 5 read-only gates (import, schema,
  disk coherence, byte-identical SUPORT_CURS, cross-course isolation).
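
The courses.py entry above names a config dict, course_paths() and validate_manifest_course() but does not show them. A minimal sketch of how such a module might look follows. The dict layout, the "root" and "name" fields, the display names, and the returned keys are assumptions for illustration, not the repository's actual code; only the two course keys, the nlp-practitioner/ root, and the "no course_key means master" rule come from the commit message.

```python
from pathlib import Path

# Hypothetical config layout: the real courses.py may use other fields.
# The display names below are placeholders.
COURSES = {
    "master": {"name": "NLP Master", "root": "."},
    "practitioner": {"name": "NLP Practitioner", "root": "nlp-practitioner"},
}


def course_paths(course_key, repo_root="."):
    """Return per-course artifact paths (key names are illustrative)."""
    if course_key not in COURSES:
        raise KeyError(f"unknown course: {course_key!r}")
    root = Path(repo_root) / COURSES[course_key]["root"]
    return {
        "root": root,
        "audio": root / "audio",
        "audio_wav": root / "audio_wav",
        "manifest": root / "manifest.json",
    }


def validate_manifest_course(manifest, course_key):
    """Raise if the manifest belongs to a different course.

    A manifest without course_key is treated as master, as the commit states.
    """
    found = manifest.get("course_key", "master")
    if found != course_key:
        raise ValueError(f"manifest is for {found!r}, expected {course_key!r}")
```

Under these assumptions, course_paths("practitioner")["audio"] resolves to nlp-practitioner/audio, which matches one of the directories this .gitignore excludes, while the master paths stay at the repo root.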

Master course regression: PASS (manifest + SUPORT_CURS.md hash
identical to the baseline /tmp/suport_before.md).
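
The byte-identical check can be illustrated as below. The actual gate lives in tests/test_regression.sh (a shell script that is not shown here); this Python sketch only demonstrates the idea of comparing digests, and the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex digest of the raw bytes, so line-ending changes are detected too."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_byte_identical(current, baseline):
    """True when both files hash to the same digest."""
    return sha256_of(current) == sha256_of(baseline)
```

Hashing raw bytes rather than decoded text is what makes the explicit-LF requirement matter: a file rewritten with CRLF line endings would fail this comparison even if its visible content were unchanged.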

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 14:33:19 +03:00

49 lines
635 B
Plaintext

# Audio files
audio/
*.mp3
*.wav
# Whisper models
models/
*.bin
# Manifest (machine-specific state)
manifest.json
# Credentials
.env
# SRT subtitles (generated, can be re-derived from transcripts)
*.srt
# Binaries (downloaded by setup_whisper.py)
whisper-bin/
ffmpeg-bin/
# Temp files
.whisper_bin_path
.ffmpeg_bin_path
# WAV cache (converted from MP3)
audio_wav/
# Python
__pycache__/
*.pyc
.venv/
.venv_pdf/
# Claude Code local state
.claude/
# Logs
*.log
# Second course (practitioner): artifacts only, scripts are shared
nlp-practitioner/audio/
nlp-practitioner/audio_wav/
# Recon scratch
scratch_recon.py
tmp_recon/