feat(ocr): Add docTR OCR engine with metrics infrastructure
Add docTR as primary OCR engine with 2-tier sequential processing, OCR metrics tracking, and simplified engine selection. Features: - docTR OCR engine with light+medium preprocessing tiers - doctr_plus mode with early exit optimization (~65% fast path) - OCR metrics dashboard with per-engine statistics - User OCR preference persistence - Parallel worker pool for OCR processing - Cross-validation for extraction quality Engine options: tesseract, doctr, doctr_plus (recommended), paddleocr 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -38,12 +38,20 @@ from backend.modules.reports.routers import create_reports_router
|
||||
from backend.modules.data_entry.routers import create_data_entry_router
|
||||
from backend.modules.telegram.routers import create_telegram_router
|
||||
|
||||
# Configure logging
|
||||
# Configure logging (level from env: DEBUG, INFO, WARNING, ERROR)
|
||||
log_level = os.getenv('LOG_LEVEL', 'INFO').upper()
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
level=getattr(logging, log_level, logging.INFO),
|
||||
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
|
||||
datefmt='%H:%M:%S'
|
||||
)
|
||||
# Reduce noise from third-party libraries
|
||||
logging.getLogger('httpcore').setLevel(logging.WARNING)
|
||||
logging.getLogger('httpx').setLevel(logging.WARNING)
|
||||
logging.getLogger('multipart').setLevel(logging.WARNING)
|
||||
logging.getLogger('doctr').setLevel(logging.WARNING)
|
||||
logging.getLogger('tensorflow').setLevel(logging.WARNING)
|
||||
logging.getLogger('PIL').setLevel(logging.WARNING)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Global variables for background tasks
|
||||
|
||||
Reference in New Issue
Block a user