# Plan: Receipt Scanning Workflow Improvements

> **Context Handover Document** - Created for session continuity
> **Date**: 2025-12-15
> **Status**: Ready for implementation

## Overview

Improve the data-entry-app receipt scanning to:

1. Save supplier name, CUI, and OCR text in drafts
2. Make supplier validation assistive (not blocking)
3. Unify create/edit forms with OCR rescan capability
4. Fix image resize bug (>4000px)
5. **NEW: Extract payment methods (CARD/NUMERAR) from OCR**

## Requirements Summary

- **Drafts**: Save `cui` + `partner_name` + `ocr_raw_text` + `payment_methods` from OCR
- **Supplier match**: Auto-fill but editable (for assistance, not validation)
- **No match**: Show warning only, allow saving draft
- **Edit mode**: Allow OCR rescan on existing drafts
- **Approval**: Requires valid `cui` only (NOT partner_id) - ROA has stored procedure for supplier lookup
- **Image resize**: Cap at 4000px BEFORE upscaling
- **Payment methods**: Extract CARD/NUMERAR amounts (after TOTAL LEI, before TOTAL TVA)

---

## Part 1: Backend Model & Database

### 1.1 Add Fields to Receipt Model

**File**: `data-entry-app/backend/app/db/models/receipt.py`

Add after line 66 (after `partner_name`):

```python
cui: Optional[str] = Field(default=None, max_length=20)  # Fiscal code from OCR
ocr_raw_text: Optional[str] = Field(default=None)  # Raw OCR text for debugging
payment_methods: Optional[str] = Field(default=None, max_length=500)  # JSON: [{"method":"CARD","amount":"50.00"}]
```

### 1.2 Create Alembic Migration

**File**: `data-entry-app/backend/migrations/versions/XXXX_add_ocr_fields.py`

```python
def upgrade():
    with op.batch_alter_table('receipts') as batch_op:
        batch_op.add_column(sa.Column('cui', sa.String(20), nullable=True))
        batch_op.add_column(sa.Column('ocr_raw_text', sa.Text(), nullable=True))
        batch_op.add_column(sa.Column('payment_methods', sa.String(500), nullable=True))
```

### 1.3 Update Pydantic Schemas

**File**: `data-entry-app/backend/app/schemas/receipt.py`

**Add PaymentMethodSchema** (after TvaEntrySchema ~line 75):

```python
class PaymentMethodSchema(BaseModel):
    """Payment method entry (CARD/NUMERAR)."""
    method: str = Field(description="Payment method: CARD or NUMERAR")
    amount: Decimal = Field(description="Amount paid with this method")
```

**ReceiptBase** (after line 97):

```python
cui: Optional[str] = Field(default=None, max_length=20)
ocr_raw_text: Optional[str] = Field(default=None)
payment_methods: Optional[List[PaymentMethodSchema]] = Field(default=None, description="Payment methods from OCR")
```

**ReceiptUpdate** (after line 125):

```python
cui: Optional[str] = Field(default=None, max_length=20)
ocr_raw_text: Optional[str] = Field(default=None)
payment_methods: Optional[List[PaymentMethodSchema]] = Field(default=None)
```

**ReceiptResponse**: Add validator to parse `payment_methods` from JSON (similar to `parse_tva_breakdown`)

---

## Part 2: Fix Image Resize Bug

**File**: `data-entry-app/backend/app/services/image_preprocessor.py`

### 2.1 Update `preprocess_light()` (after line 55)

Add downscale BEFORE upscale:

```python
# 2a. Scale DOWN if any side exceeds 4000px (PaddleOCR limit)
height, width = gray.shape
max_side = max(height, width)
if max_side > 4000:
    scale = 4000 / max_side
    gray = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    height, width = gray.shape

# 2b. Scale UP if too small
if width < 1500:
    scale = 1500 / width
    gray = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
```

### 2.2 Update `preprocess_heavy()` (after line 82)

Same downscale logic before the existing upscale at lines 85-88.

---

## Part 3: Backend OCR Endpoint - Return Raw Text

**File**: `data-entry-app/backend/app/routers/ocr.py`

Ensure the OCR extraction endpoint returns `raw_text` in the response (verify this is already included in the OCR service output).
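
The resize ordering from Part 2 can be checked without OpenCV by computing only the resulting dimensions. The sketch below is a rough model, not the preprocessor itself: `planned_size`, `MAX_SIDE`, and `MIN_WIDTH` are hypothetical names, and the rounding may differ slightly from what `cv2.resize` produces. It also caps the upscale factor, which is an assumption of this sketch: a tall, narrow receipt that was just scaled down to 4000px would otherwise be scaled up by width past the limit again.

```python
from typing import Tuple

MAX_SIDE = 4000   # PaddleOCR limit stated in the plan
MIN_WIDTH = 1500  # upscale threshold used in the plan's snippet


def planned_size(width: int, height: int) -> Tuple[int, int]:
    """Return (width, height) after downscale-then-upscale, without touching pixels."""
    # 2a. Scale down when the longer side exceeds the limit.
    max_side = max(width, height)
    if max_side > MAX_SIDE:
        scale = MAX_SIDE / max_side
        width, height = round(width * scale), round(height * scale)

    # 2b. Scale up narrow images.
    # Assumption of this sketch: cap the factor so the longer side
    # never grows back past MAX_SIDE.
    if width < MIN_WIDTH:
        scale = min(MIN_WIDTH / width, MAX_SIDE / max(width, height))
        if scale > 1:
            width, height = round(width * scale), round(height * scale)

    return width, height


if __name__ == "__main__":
    for size in [(6000, 3000), (1000, 6000), (800, 1200)]:
        print(size, "->", planned_size(*size))
```

The `(1000, 6000)` case is the one worth testing against real images, since receipt photos tend to be much taller than they are wide.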

---

## Part 4: Frontend Form Unification

### 4.1 Unify OCR Zone for Create & Edit

**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`

**Change line 19** from:

```vue
```

to:

```vue
```

**Update header text** (around line 23):

```vue

{{ isEditMode ? 'Re-scanare OCR (opțional)' : 'Poză Bon (obligatoriu)' }}

```

### 4.2 Add CUI Field to Form State

**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`

Add to form ref initialization:

```javascript
cui: '',
ocr_raw_text: '',
```

### 4.3 Add CUI Display Field

Add after Furnizor dropdown (around line 210):

```vue
CUI negăsit în nomenclator
```

### 4.4 Change Supplier Dialog to Warning Banner

**Current behavior** (lines 555-563): When CUI not found, opens blocking dialog.

**New behavior**: Show non-blocking warning message.

Replace the `else` block in `applyOCRData()`:

```javascript
} else {
  // Not found - show warning but allow continuing
  supplierWarning.value = {
    show: true,
    cui: data.cui,
    name: data.partner_name || ''
  }
  // Still set form values from OCR
  form.value.cui = data.cui
  form.value.partner_name = data.partner_name || ''
  toast.add({
    severity: 'warn',
    summary: 'Furnizor negăsit',
    detail: `CUI ${data.cui} nu a fost găsit în nomenclator`,
    life: 5000
  })
}
```

Add ref for warning state:

```javascript
const supplierWarning = ref({ show: false, cui: '', name: '' })
```

### 4.5 Update `applyOCRData()` to Save Raw Text

Add to the function:

```javascript
if (data.cui) form.value.cui = data.cui
if (data.raw_text) form.value.ocr_raw_text = data.raw_text
```

### 4.6 Update `loadReceipt()` for Edit Mode

Add to existing field mapping:

```javascript
cui: receipt.value.cui || '',
ocr_raw_text: receipt.value.ocr_raw_text || '',
```

---

## Part 5: Backend Approval Validation

**File**: `data-entry-app/backend/app/services/receipt_service.py`

In `approve_receipt()` method, add validation:

```python
if not receipt.cui:
    return False, "Trebuie completat codul fiscal (CUI) pentru aprobare", None
```

**Note**: At approval, only `cui` (fiscal code) is required, NOT `partner_id`. The ROA ERP has a stored procedure that searches/creates suppliers based on `cui`. The `partner_id` is only populated later during Oracle import phase.
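
The approval rule above can be exercised in isolation. This is a minimal sketch under assumptions: `DraftReceipt` is a hypothetical stand-in for the real `Receipt` model, `can_approve` is a hypothetical helper (the plan puts the check inside `approve_receipt()`), and treating a whitespace-only `cui` as missing is an extra not stated in the plan.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DraftReceipt:
    """Hypothetical stand-in for the Receipt model; only the relevant fields."""
    cui: Optional[str] = None
    partner_id: Optional[int] = None


def can_approve(receipt: DraftReceipt) -> Tuple[bool, Optional[str]]:
    """Return (ok, error_message). partner_id is deliberately not consulted."""
    # Assumption: a blank or whitespace-only CUI counts as missing.
    if not (receipt.cui or "").strip():
        return False, "Trebuie completat codul fiscal (CUI) pentru aprobare"
    return True, None


if __name__ == "__main__":
    print(can_approve(DraftReceipt(cui="RO12345678")))  # no partner_id needed
    print(can_approve(DraftReceipt(partner_id=42)))     # rejected: no CUI
```

Keeping the check independent of `partner_id` matches the note above: the supplier record is resolved later, during the Oracle import.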

---

## Part 6: OCR Payment Methods Extraction

### 6.1 Update ExtractionResult Dataclass

**File**: `data-entry-app/backend/app/services/ocr_extractor.py`

Add to `ExtractionResult` (after line 24, after `items_count`):

```python
payment_methods: List[dict] = field(default_factory=list)  # [{"method":"CARD","amount":Decimal}]
```

### 6.2 Add Payment Method Patterns

**File**: `data-entry-app/backend/app/services/ocr_extractor.py`

Add new patterns (after TVA_PATTERNS ~line 184):

```python
# Payment method patterns - appears after TOTAL LEI, before TOTAL TVA
# Format: "CARD: 50.00" or "NUMERAR 100.00" or "PLATA CARD: 50.00"
PAYMENT_METHOD_PATTERNS = [
    # CARD with amount
    (r'(?:PLATA\s+)?CARD\s*:?\s*([\d\s.,]+)', 'CARD', 0.95),
    # NUMERAR (cash) with amount
    (r'NUMERAR\s*:?\s*([\d\s.,]+)', 'NUMERAR', 0.95),
    # CASH alternative spelling
    (r'CASH\s*:?\s*([\d\s.,]+)', 'NUMERAR', 0.90),
]
```

### 6.3 Add Extraction Method

**File**: `data-entry-app/backend/app/services/ocr_extractor.py`

Add new method `_extract_payment_methods()` (after `_extract_address` ~line 996):

```python
def _extract_payment_methods(self, text: str) -> List[dict]:
    """
    Extract payment methods (CARD/NUMERAR) from receipt.
    These appear after TOTAL LEI and before TOTAL TVA section.

    Returns list of: {'method': 'CARD'/'NUMERAR', 'amount': Decimal}
    """
    payment_methods = []
    seen_methods = set()

    # Normalize spaces in numbers
    normalized_text = re.sub(r'(\d+)[.,]\s+(\d{2})', r'\1.\2', text)

    # Find the region between TOTAL LEI and TOTAL TVA
    total_lei_match = re.search(r'TOTAL\s+LEI\s*([\d\s.,]+)', normalized_text, re.IGNORECASE)
    total_tva_match = re.search(r'TOTAL\s+T[VU][AR]', normalized_text, re.IGNORECASE)

    # Define search region (after TOTAL LEI, before TOTAL TVA if exists)
    if total_lei_match:
        start_pos = total_lei_match.end()
        end_pos = total_tva_match.start() if total_tva_match else len(normalized_text)
        search_region = normalized_text[start_pos:end_pos]
    else:
        search_region = normalized_text  # Fallback to full text

    for pattern, method, confidence in self.PAYMENT_METHOD_PATTERNS:
        for match in re.finditer(pattern, search_region, re.IGNORECASE):
            try:
                amount_str = match.group(1).replace(' ', '')
                amount_str = self._normalize_number(re.sub(r'[^\d.,]', '', amount_str))
                amount = Decimal(amount_str)
                if amount > 0 and method not in seen_methods:
                    payment_methods.append({
                        'method': method,
                        'amount': amount
                    })
                    seen_methods.add(method)
            except (InvalidOperation, ValueError):
                continue

    return payment_methods
```

### 6.4 Call Extraction in `extract()` Method

**File**: `data-entry-app/backend/app/services/ocr_extractor.py`

Add to `extract()` method (after line 255, after `result.address = ...`):

```python
result.payment_methods = self._extract_payment_methods(text_upper)
```

### 6.5 Frontend - Add Payment Methods Display

**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`

Add to form ref:

```javascript
payment_methods: [],
```

Add to `applyOCRData()`:

```javascript
if (data.payment_methods) form.value.payment_methods = data.payment_methods
```

Add UI display (after TVA breakdown section):

```vue
```

---

## Implementation Order

| Step | Task | Files |
|------|------|-------|
| 1 | Add `cui`, `ocr_raw_text`, `payment_methods` to model | `models/receipt.py` |
| 2 | Create migration | `migrations/versions/...` |
| 3 | Update schemas | `schemas/receipt.py` |
| 4 | Fix image resize | `services/image_preprocessor.py` |
| 5 | Add payment methods extraction to OCR | `services/ocr_extractor.py` |
| 6 | Unify frontend form + add new fields | `views/receipts/ReceiptCreateView.vue` |
| 7 | Add approval validation | `services/receipt_service.py` |
| 8 | Test full workflow | Manual testing |

---

## Files to Modify

### Backend

- `data-entry-app/backend/app/db/models/receipt.py` - Add cui, ocr_raw_text, payment_methods fields
- `data-entry-app/backend/app/schemas/receipt.py` - Add PaymentMethodSchema, update schemas
- `data-entry-app/backend/app/services/image_preprocessor.py` - Fix resize bug (cap at 4000px)
- `data-entry-app/backend/app/services/ocr_extractor.py` - Add payment methods extraction
- `data-entry-app/backend/app/services/receipt_service.py` - Add approval validation
- `data-entry-app/backend/migrations/versions/` - New migration

### Frontend

- `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue` - Unify form, add CUI + payment methods fields, change dialog to warning

---

## Expected Behavior After Implementation

1. **OCR Scan**: Extracts supplier name, CUI, raw text, payment methods → all saved to draft
2. **Payment Methods**: CARD/NUMERAR amounts extracted (after TOTAL LEI, before TOTAL TVA)
3. **CUI Match**: Auto-fills supplier name from ROA, user can edit
4. **CUI No Match**: Shows warning toast, allows saving draft with OCR data
5. **Edit Mode**: Can re-scan OCR to update extracted data
6. **Approval**: Requires valid `cui` (fiscal code) - NOT partner_id
7. **Oracle Import** (later): Uses `cui` to find/create supplier via ROA stored procedure
8. **Large Images**: Automatically resized to max 4000px before OCR

---

## Romanian Receipt Structure Reference

```
NUME FIRMA S.R.L.
CIF: RO12345678
STR. EXEMPLU NR. 1

[Product lines...]

TOTAL LEI        150.00   ← Total amount
CARD              50.00   ← Payment method 1 (NEW)
NUMERAR          100.00   ← Payment method 2 (NEW)
TOTAL TVA A-19%   23.95   ← TVA breakdown
```
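
The receipt layout above makes a convenient fixture for trying the region-limited extraction outside the extractor class. The following is a standalone sketch, assuming a simplified `normalize_number` in place of the extractor's `_normalize_number` (whose real behavior is not shown in this plan) and dropping the confidence values from the pattern tuples. The function and constant names are hypothetical.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Dict, List

# Same regexes as PAYMENT_METHOD_PATTERNS in 6.2, without the confidence column.
PATTERNS = [
    (r'(?:PLATA\s+)?CARD\s*:?\s*([\d\s.,]+)', 'CARD'),
    (r'NUMERAR\s*:?\s*([\d\s.,]+)', 'NUMERAR'),
    (r'CASH\s*:?\s*([\d\s.,]+)', 'NUMERAR'),
]


def normalize_number(raw: str) -> str:
    """Simplified stand-in (assumption): keep digits and separators,
    treat a comma as the decimal point. Thousands separators are not handled."""
    return re.sub(r'[^\d.,]', '', raw).replace(',', '.')


def extract_payment_methods(text: str) -> List[Dict[str, object]]:
    """Collect one amount per method from the TOTAL LEI .. TOTAL TVA region."""
    start = re.search(r'TOTAL\s+LEI\s*([\d\s.,]+)', text, re.IGNORECASE)
    stop = re.search(r'TOTAL\s+T[VU][AR]', text, re.IGNORECASE)
    if start:
        end_pos = stop.start() if stop else len(text)
        region = text[start.end():end_pos]
    else:
        region = text  # fall back to the full text, as in 6.3

    found: List[Dict[str, object]] = []
    seen = set()
    for pattern, method in PATTERNS:
        for match in re.finditer(pattern, region, re.IGNORECASE):
            try:
                amount = Decimal(normalize_number(match.group(1)))
            except InvalidOperation:
                continue  # e.g. the keyword appeared without a usable number
            if amount > 0 and method not in seen:
                found.append({'method': method, 'amount': amount})
                seen.add(method)
    return found


SAMPLE = """NUME FIRMA S.R.L.
CIF: RO12345678
STR. EXEMPLU NR. 1
TOTAL LEI 150.00
CARD 50.00
NUMERAR 100.00
TOTAL TVA A-19% 23.95
"""

if __name__ == "__main__":
    for entry in extract_payment_methods(SAMPLE):
        print(entry['method'], entry['amount'])
```

A useful sanity check during manual testing is that the extracted amounts add up to the TOTAL LEI value. On receipts where change is given on a cash payment, that sum may legitimately differ, so it is better treated as a warning than as a validation rule.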