feat: Add OCR integration for automatic receipt data extraction

Implement Tesseract-based OCR to automatically extract vendor name, date, total amount, and VAT from uploaded receipt images/PDFs, reducing manual data entry and improving accuracy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
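
For illustration only, here is a minimal sketch of the kind of field extraction described above. It is not the code in this commit: `parse_receipt_text`, `ocr_image`, and the regular expressions are hypothetical, and the sketch assumes English-language receipts where the vendor is the first non-empty line and amounts appear at the end of their line. Only `pytesseract.image_to_string` and Pillow's `Image.open` are real library calls.

```python
import re
from typing import Optional

# Assumed amount format: digits, a decimal separator, two decimals (12.00 or 12,00).
AMOUNT = r"(\d+[.,]\d{2})\s*$"


def _find_amount(pattern: str, text: str) -> Optional[float]:
    """Return the first amount matched by `pattern`, or None if absent."""
    match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
    return float(match.group(1).replace(",", ".")) if match else None


def parse_receipt_text(text: str) -> dict:
    """Pull vendor, date, total, and VAT out of raw OCR text (hypothetical helper)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    date = re.search(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})\b", text)
    return {
        # Assumption: the vendor name is the first non-empty line.
        "vendor": lines[0] if lines else None,
        "date": date.group(1) if date else None,
        # Anchored at line start so "Subtotal" is not mistaken for "Total".
        "total": _find_amount(r"^\s*total\b.*?" + AMOUNT, text),
        "vat": _find_amount(r"\bvat\b.*?" + AMOUNT, text),
    }


def ocr_image(path: str) -> str:
    """Run Tesseract on one image; imports are local so parsing works without OCR installed."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(path))
```

Typical use would be `parse_receipt_text(ocr_image("receipt.png"))`. Keeping the parsing step separate from the OCR call means it can be unit-tested on plain strings without Tesseract installed.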
2025-12-12 11:48:29 +02:00
parent 5960154094
commit 41ae97180e
16 changed files with 2773 additions and 32 deletions
--- a/data-entry-app/backend/requirements.txt
+++ b/data-entry-app/backend/requirements.txt
@@ -30,3 +30,11 @@ httpx>=0.26.0
 # Testing
 pytest>=8.0.0
 pytest-asyncio>=0.23.3
+
+# OCR Dependencies
+paddleocr>=2.7.0
+paddlepaddle>=2.5.0
+opencv-python>=4.8.0
+pytesseract>=0.3.10
+pdf2image>=1.16.0
+numpy>=1.24.0