roa2web-service-auto/IMPLEMENTATION_PLAN_RECEIPT_OCR.md
Marius Mutu 1a6e9b17d2 feat: Add shared components, refactor stores, improve data-entry workflow
Shared Components:
- Add CompanySelector.vue and PeriodSelector.vue components
- Add AppHeader.vue and SlideMenu.vue layout components
- Add shared stores factories (companies.js, accountingPeriod.js)
- Add shared routes factories (companies.py, calendar.py)
- Add shared models (company.py, calendar.py)
- Add shared layout styles (header.css, navigation.css)

Data Entry App:
- Update CLAUDE.md with prod/test server documentation
- Improve nomenclature sync service with better error handling
- Update receipts router and CRUD operations
- Add company/period stores using shared factories
- Update App.vue layout with shared components
- Fix OCRUploadZone file handling

Reports App:
- Refactor stores to use shared factories
- Update App.vue to use shared layout components

Infrastructure:
- Replace start-data-entry.sh with separate dev/test scripts
- Add .claude/rules for authentication, backend patterns, etc.
- Add implementation plan for OCR receipt improvements
- Clean up old documentation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 15:00:45 +02:00


# Plan: Receipt Scanning Workflow Improvements
> **Context Handover Document** - Created for session continuity
> **Date**: 2025-12-15
> **Status**: Ready for implementation
## Overview
Improve the data-entry-app receipt scanning to:
1. Save supplier name, CUI, and OCR text in drafts
2. Make supplier validation assistive (not blocking)
3. Unify create/edit forms with OCR rescan capability
4. Fix image resize bug (>4000px)
5. **NEW: Extract payment methods (CARD/NUMERAR) from OCR**
## Requirements Summary
- **Drafts**: Save `cui` + `partner_name` + `ocr_raw_text` + `payment_methods` from OCR
- **Supplier match**: Auto-fill but editable (for assistance, not validation)
- **No match**: Show warning only, allow saving draft
- **Edit mode**: Allow OCR rescan on existing drafts
- **Approval**: Requires a valid `cui` only (NOT `partner_id`) - ROA has a stored procedure for supplier lookup
- **Image resize**: Cap at 4000px BEFORE upscaling
- **Payment methods**: Extract CARD/NUMERAR amounts (after TOTAL LEI, before TOTAL TVA)
---
## Part 1: Backend Model & Database
### 1.1 Add Fields to Receipt Model
**File**: `data-entry-app/backend/app/db/models/receipt.py`
Add after line 66 (after `partner_name`):
```python
cui: Optional[str] = Field(default=None, max_length=20) # Fiscal code from OCR
ocr_raw_text: Optional[str] = Field(default=None) # Raw OCR text for debugging
payment_methods: Optional[str] = Field(default=None, max_length=500) # JSON: [{"method":"CARD","amount":"50.00"}]
```
### 1.2 Create Alembic Migration
**File**: `data-entry-app/backend/migrations/versions/XXXX_add_ocr_fields.py`
```python
from alembic import op
import sqlalchemy as sa


def upgrade():
    # Add the three nullable OCR columns to the receipts table
    with op.batch_alter_table('receipts') as batch_op:
        batch_op.add_column(sa.Column('cui', sa.String(20), nullable=True))
        batch_op.add_column(sa.Column('ocr_raw_text', sa.Text(), nullable=True))
        batch_op.add_column(sa.Column('payment_methods', sa.String(500), nullable=True))
```
### 1.3 Update Pydantic Schemas
**File**: `data-entry-app/backend/app/schemas/receipt.py`
**Add PaymentMethodSchema** (after TvaEntrySchema ~line 75):
```python
class PaymentMethodSchema(BaseModel):
    """Payment method entry (CARD/NUMERAR)."""
    method: str = Field(description="Payment method: CARD or NUMERAR")
    amount: Decimal = Field(description="Amount paid with this method")
```
**ReceiptBase** (after line 97):
```python
cui: Optional[str] = Field(default=None, max_length=20)
ocr_raw_text: Optional[str] = Field(default=None)
payment_methods: Optional[List[PaymentMethodSchema]] = Field(default=None, description="Payment methods from OCR")
```
**ReceiptUpdate** (after line 125):
```python
cui: Optional[str] = Field(default=None, max_length=20)
ocr_raw_text: Optional[str] = Field(default=None)
payment_methods: Optional[List[PaymentMethodSchema]] = Field(default=None)
```
**ReceiptResponse**: Add a validator that parses `payment_methods` from the stored JSON string into a list (similar to `parse_tva_breakdown`).
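
Because the model stores `payment_methods` as a JSON string while the schemas expose a list, both directions need a conversion, and `json.dumps` cannot serialize `Decimal` values directly. The following is a minimal sketch of that round trip using only the standard library. The helper names `dump_payment_methods` and `parse_payment_methods` are hypothetical (they are not in the existing codebase); the parse helper is the kind of logic the `ReceiptResponse` validator could delegate to.

```python
import json
from decimal import Decimal
from typing import List, Optional, Union


def dump_payment_methods(methods: Optional[List[dict]]) -> Optional[str]:
    """Serialize payment methods for the string column; amounts become strings."""
    if not methods:
        return None
    return json.dumps(
        [{"method": m["method"], "amount": str(m["amount"])} for m in methods]
    )


def parse_payment_methods(value: Union[str, list, None]) -> Optional[List[dict]]:
    """Accept the stored JSON string (or an already-parsed list) and restore Decimal amounts."""
    if value is None or value == "":
        return None
    items = json.loads(value) if isinstance(value, str) else value
    return [{"method": i["method"], "amount": Decimal(str(i["amount"]))} for i in items]


# Round trip: extractor output -> database string -> response payload
stored = dump_payment_methods([{"method": "CARD", "amount": Decimal("50.00")}])
restored = parse_payment_methods(stored)
```

Storing amounts as strings matches the `"amount":"50.00"` format shown in the model comment and avoids float rounding on the way back to `Decimal`.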
---
## Part 2: Fix Image Resize Bug
**File**: `data-entry-app/backend/app/services/image_preprocessor.py`
### 2.1 Update `preprocess_light()` (after line 55)
Add downscale BEFORE upscale:
```python
# 2a. Scale DOWN if any side exceeds 4000px (PaddleOCR limit)
height, width = gray.shape
max_side = max(height, width)
if max_side > 4000:
    scale = 4000 / max_side
    gray = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    height, width = gray.shape

# 2b. Scale UP if too small, without pushing the longer side back over 4000px
if width < 1500:
    scale = min(1500 / width, 4000 / max(height, width))
    if scale > 1:
        gray = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
```
### 2.2 Update `preprocess_heavy()` (after line 82)
Same downscale logic before the existing upscale at lines 85-88.
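
One case worth checking when applying this order: a tall, narrow receipt photo (for example 1200x5000) is downscaled to 960x4000, and an uncapped upscale to 1500px width would then push the height back past 4000px. The sketch below computes the two scale factors as plain arithmetic, with the upscale capped so the longer side stays within the limit. The function names `plan_resize` and `final_size` are hypothetical helpers for illustration; only the 4000 and 1500 thresholds come from the plan.

```python
from typing import Tuple

MAX_SIDE = 4000   # upper bound from the plan (PaddleOCR limit)
MIN_WIDTH = 1500  # upscale threshold from the plan


def plan_resize(width: int, height: int) -> Tuple[float, float]:
    """Return (downscale, upscale) factors; 1.0 means no resize at that step."""
    down = min(1.0, MAX_SIDE / max(width, height))
    w, h = width * down, height * down
    up = 1.0
    if w < MIN_WIDTH:
        # Cap the upscale so the longer side stays within MAX_SIDE.
        up = max(1.0, min(MIN_WIDTH / w, MAX_SIDE / max(w, h)))
    return down, up


def final_size(width: int, height: int) -> Tuple[int, int]:
    """Apply both factors and return the resulting (width, height)."""
    down, up = plan_resize(width, height)
    return round(width * down * up), round(height * down * up)


small = final_size(1000, 1200)    # only upscaled
large = final_size(6000, 8000)    # only downscaled
narrow = final_size(1200, 5000)   # downscaled, upscale suppressed by the cap
```

Keeping the arithmetic separate from the `cv2.resize` calls makes the edge cases easy to unit test without loading images.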
---
## Part 3: Backend OCR Endpoint - Return Raw Text
**File**: `data-entry-app/backend/app/routers/ocr.py`
Ensure the OCR extraction endpoint returns `raw_text` in its response. Check first whether the OCR service output already includes it.
---
## Part 4: Frontend Form Unification
### 4.1 Unify OCR Zone for Create & Edit
**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`
**Change line 19** from:
```vue
<div class="upload-section" v-if="!isEditMode">
```
to:
```vue
<div class="upload-section">
```
**Update header text** (around line 23):
```vue
<h3>
  <i class="pi pi-camera"></i>
  {{ isEditMode ? 'Re-scanare OCR (opțional)' : 'Poză Bon (obligatoriu)' }}
</h3>
```
### 4.2 Add CUI Field to Form State
**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`
Add to form ref initialization:
```javascript
cui: '',
ocr_raw_text: '',
```
### 4.3 Add CUI Display Field
Add after Furnizor dropdown (around line 210):
```vue
<div class="form-field">
  <label>CUI (Cod Fiscal)</label>
  <InputText v-model="form.cui" placeholder="Ex: RO12345678" />
  <small v-if="form.cui && !form.partner_id" class="p-text-warning">
    <i class="pi pi-exclamation-triangle"></i>
    CUI negăsit în nomenclator
  </small>
</div>
```
### 4.4 Change Supplier Dialog to Warning Banner
**Current behavior** (lines 555-563): When CUI not found, opens blocking dialog.
**New behavior**: Show non-blocking warning message.
Replace the `else` block in `applyOCRData()`:
```javascript
} else {
  // Not found - show warning but allow continuing
  supplierWarning.value = {
    show: true,
    cui: data.cui,
    name: data.partner_name || ''
  }
  // Still set form values from OCR
  form.value.cui = data.cui
  form.value.partner_name = data.partner_name || ''
  toast.add({
    severity: 'warn',
    summary: 'Furnizor negăsit',
    detail: `CUI ${data.cui} nu a fost găsit în nomenclator`,
    life: 5000
  })
}
```
Add ref for warning state:
```javascript
const supplierWarning = ref({ show: false, cui: '', name: '' })
```
### 4.5 Update `applyOCRData()` to Save Raw Text
Add to the function:
```javascript
if (data.cui) form.value.cui = data.cui
if (data.raw_text) form.value.ocr_raw_text = data.raw_text
```
### 4.6 Update `loadReceipt()` for Edit Mode
Add to existing field mapping:
```javascript
cui: receipt.value.cui || '',
ocr_raw_text: receipt.value.ocr_raw_text || '',
```
---
## Part 5: Backend Approval Validation
**File**: `data-entry-app/backend/app/services/receipt_service.py`
In `approve_receipt()` method, add validation:
```python
if not receipt.cui:
    return False, "Trebuie completat codul fiscal (CUI) pentru aprobare", None
```
**Note**: At approval, only `cui` (fiscal code) is required, NOT `partner_id`.
The ROA ERP has a stored procedure that searches for or creates suppliers based on `cui`.
The `partner_id` is only populated later, during the Oracle import phase.
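
The check above only requires `cui` to be non-empty. If "valid" is also meant to cover format, a stricter check could normalize the value and verify the check digit. The sketch below is an assumption, not part of the plan: `normalize_cui` and `is_valid_cui` are hypothetical helpers, and the weighting scheme is the Romanian CUI check-digit rule as commonly described, which should be verified against the official specification before it is relied on.

```python
import re

# Weights applied to the nine digits that precede the check digit (assumed scheme).
_CUI_WEIGHTS = (7, 5, 3, 2, 1, 7, 5, 3, 2)


def normalize_cui(raw: str) -> str:
    """Upper-case, drop whitespace and an optional 'RO' prefix."""
    cleaned = re.sub(r"\s+", "", raw or "").upper()
    return cleaned[2:] if cleaned.startswith("RO") else cleaned


def is_valid_cui(raw: str) -> bool:
    """Return True if the value is 2-10 digits and its last digit matches the checksum."""
    digits = normalize_cui(raw)
    if not re.fullmatch(r"[0-9]{2,10}", digits):
        return False
    body, control = digits[:-1].rjust(9, "0"), int(digits[-1])
    total = sum(int(d) * w for d, w in zip(body, _CUI_WEIGHTS))
    # (total * 10) mod 11, where a result of 10 maps to check digit 0
    return (total * 10) % 11 % 10 == control
```

A check like this would reject many OCR misreads (a single wrong digit usually breaks the checksum) while still leaving `partner_id` out of the approval requirement.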
---
## Part 6: OCR Payment Methods Extraction
### 6.1 Update ExtractionResult Dataclass
**File**: `data-entry-app/backend/app/services/ocr_extractor.py`
Add to `ExtractionResult` (after line 24, after `items_count`):
```python
payment_methods: List[dict] = field(default_factory=list) # [{"method":"CARD","amount":Decimal}]
```
### 6.2 Add Payment Method Patterns
**File**: `data-entry-app/backend/app/services/ocr_extractor.py`
Add new patterns (after TVA_PATTERNS ~line 184):
```python
# Payment method patterns - appears after TOTAL LEI, before TOTAL TVA
# Format: "CARD: 50.00" or "NUMERAR 100.00" or "PLATA CARD: 50.00"
PAYMENT_METHOD_PATTERNS = [
    # CARD with amount
    (r'(?:PLATA\s+)?CARD\s*:?\s*([\d\s.,]+)', 'CARD', 0.95),
    # NUMERAR (cash) with amount
    (r'NUMERAR\s*:?\s*([\d\s.,]+)', 'NUMERAR', 0.95),
    # CASH alternative spelling
    (r'CASH\s*:?\s*([\d\s.,]+)', 'NUMERAR', 0.90),
]
```
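
One property of these patterns worth testing against real receipts: `\s` inside the amount class also matches newlines, so if the line after a payment method starts with digits, those digits are captured as part of the amount. The sketch below demonstrates this on a made-up two-line input and compares it with a variant restricted to spaces and tabs. The `SAME_LINE` pattern is a suggested alternative, not something the plan specifies.

```python
import re

# Pattern shape used in the plan: \s may span line breaks.
LOOSE = re.compile(r'CARD\s*:?\s*([\d\s.,]+)', re.IGNORECASE)
# Variant that only allows spaces and tabs, so the match stays on one line.
SAME_LINE = re.compile(r'CARD[ \t]*:?[ \t]*([\d \t.,]+)', re.IGNORECASE)

# Hypothetical OCR output where the next line starts with digits.
text = "CARD 50.00\n1234 REST"


def digits_only(captured: str) -> str:
    """Strip everything except digits and separators, as the extractor does."""
    return re.sub(r'[^\d.,]', '', captured)


loose_amount = digits_only(LOOSE.search(text).group(1))
same_line_amount = digits_only(SAME_LINE.search(text).group(1))
```

Here `loose_amount` absorbs the digits from the second line, while `same_line_amount` stops at the line break. Whether this matters depends on what typically follows the payment lines in scanned receipts.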
### 6.3 Add Extraction Method
**File**: `data-entry-app/backend/app/services/ocr_extractor.py`
Add new method `_extract_payment_methods()` (after `_extract_address` ~line 996):
```python
def _extract_payment_methods(self, text: str) -> List[dict]:
    """
    Extract payment methods (CARD/NUMERAR) from receipt.
    These appear after TOTAL LEI and before TOTAL TVA section.
    Returns list of: {'method': 'CARD'/'NUMERAR', 'amount': Decimal}
    """
    payment_methods = []
    seen_methods = set()
    # Normalize spaces in numbers
    normalized_text = re.sub(r'(\d+)[.,]\s+(\d{2})', r'\1.\2', text)
    # Find the region between TOTAL LEI and TOTAL TVA
    total_lei_match = re.search(r'TOTAL\s+LEI\s*([\d\s.,]+)', normalized_text, re.IGNORECASE)
    total_tva_match = re.search(r'TOTAL\s+T[VU][AR]', normalized_text, re.IGNORECASE)
    # Define search region (after TOTAL LEI, before TOTAL TVA if exists)
    if total_lei_match:
        start_pos = total_lei_match.end()
        end_pos = total_tva_match.start() if total_tva_match else len(normalized_text)
        search_region = normalized_text[start_pos:end_pos]
    else:
        search_region = normalized_text  # Fallback to full text
    for pattern, method, confidence in self.PAYMENT_METHOD_PATTERNS:
        for match in re.finditer(pattern, search_region, re.IGNORECASE):
            try:
                amount_str = match.group(1).replace(' ', '')
                amount_str = self._normalize_number(re.sub(r'[^\d.,]', '', amount_str))
                amount = Decimal(amount_str)
                if amount > 0 and method not in seen_methods:
                    payment_methods.append({
                        'method': method,
                        'amount': amount
                    })
                    seen_methods.add(method)
            except (InvalidOperation, ValueError):
                continue
    return payment_methods
```
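
To try the region-slicing logic outside the extractor class, the following standalone sketch runs the same idea against the sample layout from the receipt structure reference. It is an approximation under stated assumptions: `normalize_number` is a hypothetical stand-in for the existing `_normalize_number` (whose real behavior is not shown in this plan), and the search for TOTAL TVA starts only after TOTAL LEI, which differs slightly from the method above.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import List

PATTERNS = [
    (r'(?:PLATA\s+)?CARD\s*:?\s*([\d\s.,]+)', 'CARD'),
    (r'NUMERAR\s*:?\s*([\d\s.,]+)', 'NUMERAR'),
    (r'CASH\s*:?\s*([\d\s.,]+)', 'NUMERAR'),
]
TOTAL_TVA = re.compile(r'TOTAL\s+T[VU][AR]', re.IGNORECASE)


def normalize_number(raw: str) -> str:
    """Hypothetical stand-in: treat the last separator as the decimal point."""
    head, sep, tail = raw.replace(',', '.').rpartition('.')
    return head.replace('.', '') + sep + tail


def extract_payment_methods(text: str) -> List[dict]:
    start = re.search(r'TOTAL\s+LEI\s*([\d\s.,]+)', text, re.IGNORECASE)
    region = text  # fallback: scan the full text
    if start:
        # Search for TOTAL TVA only after TOTAL LEI so an earlier TVA line
        # cannot produce an empty region.
        end = TOTAL_TVA.search(text, start.end())
        region = text[start.end():end.start() if end else len(text)]
    found, seen = [], set()
    for pattern, method in PATTERNS:
        for match in re.finditer(pattern, region, re.IGNORECASE):
            try:
                cleaned = re.sub(r'[^\d.,]', '', match.group(1))
                amount = Decimal(normalize_number(cleaned))
            except InvalidOperation:
                continue
            if amount > 0 and method not in seen:
                found.append({'method': method, 'amount': amount})
                seen.add(method)
    return found


SAMPLE = """TOTAL LEI 150.00
CARD 50.00
NUMERAR 100.00
TOTAL TVA A-19% 23.95"""

result = extract_payment_methods(SAMPLE)
```

A useful sanity check on real data is that the extracted amounts sum to the receipt total, as they do for this sample (50.00 + 100.00 = 150.00).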
### 6.4 Call Extraction in `extract()` Method
**File**: `data-entry-app/backend/app/services/ocr_extractor.py`
Add to `extract()` method (after line 255, after `result.address = ...`):
```python
result.payment_methods = self._extract_payment_methods(text_upper)
```
### 6.5 Frontend - Add Payment Methods Display
**File**: `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue`
Add to form ref:
```javascript
payment_methods: [],
```
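Add to `applyOCRData()` (the same field should probably also be mapped in `loadReceipt()`, as in 4.6, so saved values appear in edit mode):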
```javascript
if (data.payment_methods) form.value.payment_methods = data.payment_methods
```
Add UI display (after TVA breakdown section):
```vue
<!-- Payment Methods (from OCR) -->
<div class="form-field" v-if="form.payment_methods && form.payment_methods.length > 0">
  <label>Modalități Plată</label>
  <div class="payment-methods-display">
    <Tag v-for="pm in form.payment_methods" :key="pm.method"
         :severity="pm.method === 'CARD' ? 'info' : 'success'"
         :value="`${pm.method}: ${formatCurrency(pm.amount)}`" />
  </div>
</div>
```
---
## Implementation Order
| Step | Task | Files |
|------|------|-------|
| 1 | Add `cui`, `ocr_raw_text`, `payment_methods` to model | `models/receipt.py` |
| 2 | Create migration | `migrations/versions/...` |
| 3 | Update schemas | `schemas/receipt.py` |
| 4 | Fix image resize | `services/image_preprocessor.py` |
| 5 | Add payment methods extraction to OCR; confirm `raw_text` is returned (Part 3) | `services/ocr_extractor.py`, `routers/ocr.py` |
| 6 | Unify frontend form + add new fields | `views/receipts/ReceiptCreateView.vue` |
| 7 | Add approval validation | `services/receipt_service.py` |
| 8 | Test full workflow | Manual testing |
---
## Files to Modify
### Backend
- `data-entry-app/backend/app/db/models/receipt.py` - Add cui, ocr_raw_text, payment_methods fields
- `data-entry-app/backend/app/schemas/receipt.py` - Add PaymentMethodSchema, update schemas
- `data-entry-app/backend/app/services/image_preprocessor.py` - Fix resize bug (cap at 4000px)
- `data-entry-app/backend/app/services/ocr_extractor.py` - Add payment methods extraction
- `data-entry-app/backend/app/services/receipt_service.py` - Add approval validation
- `data-entry-app/backend/migrations/versions/` - New migration
### Frontend
- `data-entry-app/frontend/src/views/receipts/ReceiptCreateView.vue` - Unify form, add CUI + payment methods fields, change dialog to warning
---
## Expected Behavior After Implementation
1. **OCR Scan**: Extracts supplier name, CUI, raw text, payment methods → all saved to draft
2. **Payment Methods**: CARD/NUMERAR amounts extracted (after TOTAL LEI, before TOTAL TVA)
3. **CUI Match**: Auto-fills supplier name from ROA, user can edit
4. **CUI No Match**: Shows warning toast, allows saving draft with OCR data
5. **Edit Mode**: Can re-scan OCR to update extracted data
6. **Approval**: Requires valid `cui` (fiscal code) - NOT partner_id
7. **Oracle Import** (later): Uses `cui` to find/create supplier via ROA stored procedure
8. **Large Images**: Automatically resized to max 4000px before OCR
---
## Romanian Receipt Structure Reference
```
NUME FIRMA S.R.L.
CIF: RO12345678
STR. EXEMPLU NR. 1
[Product lines...]
TOTAL LEI 150.00 ← Total amount
CARD 50.00 ← Payment method 1 (NEW)
NUMERAR 100.00 ← Payment method 2 (NEW)
TOTAL TVA A-19% 23.95 ← TVA breakdown
```