Commit Graph

2 Commits

Author SHA1 Message Date
Claude Agent
62f86250cc refactor(docs): consolidate and cleanup documentation
- Delete 9 deprecated/obsolete docs (~6,300 lines removed)
- Move test PDFs to tests/fixtures/ocr-samples/
- Create docs/DEPLOYMENT.md as principal guide
- Create tests/ocr-validation/README.md
- Update all refs for ultrathin monolith architecture
- Update OCR tests to use relative paths

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-22 09:14:51 +00:00
495790411f feat(ocr): Add docTR OCR engine with metrics infrastructure
Add docTR as primary OCR engine with 2-tier sequential processing,
OCR metrics tracking, and simplified engine selection.

Features:
- docTR OCR engine with light+medium preprocessing tiers
- doctr_plus mode with early exit optimization (~65% fast path)
- OCR metrics dashboard with per-engine statistics
- User OCR preference persistence
- Parallel worker pool for OCR processing
- Cross-validation for extraction quality

Engine options: tesseract, doctr, doctr_plus (recommended), paddleocr

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 05:37:16 +02:00