- Delete data-entry-app/ (1.6GB), reports-app/ (447MB), .auto-build-data/
- Saved ~1.4GB disk space (64% reduction: 2.2GB → 845MB)
Updated references across 38 files:
- .claude/rules/ paths: backend/modules/, src/modules/
- .claude/commands/validate.md: all validation paths
- docs/ (13 files): data-entry, telegram, README, CLAUDE.md
- scripts/ (3 files): backup-secrets, restore-secrets, test-docker
- security/ (2 files): git_cleanup, SECURITY_PROCEDURES
- deployment/ & shared/: updated all stale comments
All paths now reflect ultrathin monolith architecture:
- Backend: backend/modules/{reports,data_entry,telegram}/
- Frontend: src/modules/{reports,data-entry}/
- Shared: shared/{auth,database,routes}/
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Architecture: Data Entry App
Overview
A separate application for entering data into the ERP, with an approval workflow and a staging area before synchronization to Oracle.
Architectural Decisions
1. SQLModel + Alembic
Choice: SQLModel (Pydantic + SQLAlchemy) with Alembic for migrations
Rationale:
- Created by the author of FastAPI, so the two integrate cleanly
- One model serves as both Pydantic schema and SQLAlchemy table, so definitions are not duplicated
- Native async support
- Alembic is the industry standard for migrations
- Automatic validation: Pydantic validates input, SQLAlchemy handles the database
Alternatives considered:
- Plain SQLAlchemy: more verbose, requires separate Pydantic schemas
- Tortoise ORM: natively async, but a smaller community
- Peewee: simple, but no async support
2. Separation from Reports-App
Choice: Separate application in backend/modules/data_entry/
Rationale:
- Different responsibilities: reports = read-only, data-entry = write
- Different lifecycle: data-entry can ship releases more often
- Isolated risk: a bug in data-entry does not affect reporting
- Independent scaling
Shared Components:
- shared/database/oracle_pool.py - Oracle connection for nomenclatures and authentication
- shared/auth/ - common JWT authentication (middleware, routes factory, auth service)
- shared/frontend/components/LoginView.vue - shared login UI
- shared/frontend/stores/auth.js - Pinia auth store factory
- shared/frontend/styles/login.css - login styles
3. Workflow with Staging Area
Choice: Local SQLite as staging, then sync to Oracle
Rationale:
- Allows offline work (the user can fill in receipts)
- Accountant review before data reaches Oracle
- Simple rollback (delete from SQLite)
- Complete audit trail
Flow:
User Input → SQLite (staging) → Accountant Review → Oracle (final)
4. File Storage
Choice: Local filesystem with references in the DB
Rationale:
- Simple to implement and back up
- Good performance for images
- Can migrate to S3/Azure Blob if needed
Structure:
data/uploads/
  {year}/
    {month}/
      {uuid}.{ext}
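The layout above can be sketched as a small path builder. This is a minimal sketch, not the project's implementation: `build_upload_path` and its parameters are hypothetical names, and the zero-padded month is an assumption (the source only shows `{month}`).

```python
import uuid
from datetime import date
from pathlib import PurePosixPath


def build_upload_path(original_name: str, when: date,
                      root: str = "data/uploads") -> PurePosixPath:
    """Return root/{year}/{month}/{uuid}.{ext} for a new upload."""
    # Only the extension is taken from the user-supplied name; the stored
    # name itself is a random UUID, so the original name never reaches disk.
    ext = PurePosixPath(original_name).suffix.lower().lstrip(".")
    stored_name = f"{uuid.uuid4()}.{ext}" if ext else str(uuid.uuid4())
    # Zero-padding the month is an assumption made for this sketch.
    return PurePosixPath(root) / str(when.year) / f"{when.month:02d}" / stored_name
```

Because the directory components come from the date and the file name from a UUID, nothing in the resulting path is controlled by the uploader except the (lower-cased) extension.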
Component Diagram
┌─────────────────────────────────────────────────────────────────┐
│                         data-entry-app                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │   Frontend   │────▶│   Backend    │────▶│    SQLite    │     │
│  │    Vue.js    │     │   FastAPI    │     │  (staging)   │     │
│  │    :3010     │     │    :8003     │     │              │     │
│  └──────────────┘     └──────┬───────┘     └──────────────┘     │
│                              │                 │                │
│                              │ OCR Upload      │ Nomenclatures  │
│                              ▼                 ▼                │
│                       ┌──────────────┐  ┌──────────────┐        │
│                       │ OCR Service  │  │    Oracle    │        │
│                       │  PaddleOCR   │  │ (read-only)  │        │
│                       │  +Tesseract  │  └──────────────┘        │
│                       └──────────────┘                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Data Model
Receipt (Fiscal Receipt)
receipts
├── id (PK)
├── receipt_type: enum(bon_fiscal, chitanta)
├── direction: enum(cheltuiala, incasare)
├── receipt_number, receipt_series
├── receipt_date, amount, description
├── company_id, partner_id, partner_name
├── cash_register_id, cash_register_name
├── expense_type_code
├── status: enum(draft, pending_review, approved, rejected, synced)
├── created_by, created_at, updated_at
├── submitted_at, reviewed_by, reviewed_at
├── rejection_reason
└── oracle_synced_at, oracle_act_id, oracle_error
ReceiptAttachment (Attachments)
receipt_attachments
├── id (PK)
├── receipt_id (FK)
├── filename, stored_filename
├── file_path, file_size, mime_type
└── uploaded_at
AccountingEntry (Accounting Entries)
accounting_entries
├── id (PK)
├── receipt_id (FK)
├── entry_type: enum(debit, credit)
├── account_code, account_name
├── amount
├── partner_id, cost_center_id
├── is_auto_generated
└── modified_by, modified_at
Workflow States
       ┌─────────┐
       │  DRAFT  │◀────────────────────┐
       └────┬────┘                     │
            │ submit()                 │ (edit after reject)
            ▼                          │
     ┌──────────────┐                  │
     │PENDING_REVIEW│──────────────────┤
     └──────┬───────┘                  │
            │                          │
      ┌─────┴─────┐                    │
      ▼           ▼                    │
  ┌────────┐   ┌────────┐              │
  │APPROVED│   │REJECTED│──────────────┘
  └────┬───┘   └────────┘    resubmit()
       │
       │ (Phase 2)
       ▼
    ┌──────┐
    │SYNCED│
    └──────┘
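The transitions in the diagram can be expressed as a lookup table. This is a minimal sketch, assuming the status values from the `receipts.status` enum; `ALLOWED` and `transition` are hypothetical names. The diagram also draws a line from PENDING_REVIEW to the right-hand rail back to DRAFT; whether that is a real transition is unclear from the source, so the sketch leaves it out.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    REJECTED = "rejected"
    SYNCED = "synced"


# Allowed transitions as read off the diagram (an interpretation, see above).
ALLOWED = {
    Status.DRAFT: {Status.PENDING_REVIEW},                      # submit()
    Status.PENDING_REVIEW: {Status.APPROVED, Status.REJECTED},  # accountant review
    Status.REJECTED: {Status.DRAFT},                            # edit after reject
    Status.APPROVED: {Status.SYNCED},                           # Phase 2 sync
    Status.SYNCED: set(),                                       # terminal state
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the rules in one table means the service layer and the tests can share a single definition of what counts as a legal move.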
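Accounting Entry Generation
Algorithm
from decimal import Decimal, ROUND_HALF_UP

def generate_entries(receipt):
    # EXPENSE_TYPES, Entry, DEBIT and CREDIT are assumed to be defined elsewhere.
    expense_type = EXPENSE_TYPES[receipt.expense_type_code]
    entries = []
    if expense_type.has_vat:
        # The receipt amount includes 19% VAT. Round the net amount to
        # 2 decimals (the rounding mode is an assumption).
        net_amount = (receipt.amount / Decimal('1.19')).quantize(
            Decimal('0.01'), rounding=ROUND_HALF_UP)
        vat_amount = receipt.amount - net_amount
        # Expense (debit)
        entries.append(Entry(DEBIT, expense_type.account, net_amount))
        # VAT (debit)
        entries.append(Entry(DEBIT, "4426", vat_amount))
    else:
        entries.append(Entry(DEBIT, expense_type.account, receipt.amount))
    # Credit cash register/bank
    entries.append(Entry(CREDIT, receipt.cash_register_account, receipt.amount))
    return entries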
Example: Fuel Receipt, 200 RON
Debit   6022  Fuel expenses    168.07
Debit   4426  Deductible VAT    31.93
Credit  5311  Cash in lei      200.00
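The figures in this example can be checked with a few lines of `Decimal` arithmetic. This is a minimal sketch: `split_vat` is a hypothetical helper, and rounding the net amount half-up to two decimals is an assumption consistent with the amounts shown.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def split_vat(gross: Decimal, rate: Decimal = Decimal("0.19")) -> tuple[Decimal, Decimal]:
    """Split a VAT-inclusive amount into (net, vat)."""
    # Round the net amount first, then derive VAT by subtraction so that
    # net + vat always equals the gross amount exactly.
    net = (gross / (1 + rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return net, gross - net


net, vat = split_vat(Decimal("200.00"))
print(net, vat)
```

Deriving the VAT by subtraction rather than rounding it independently keeps debits and credits balanced.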
Oracle Integration (Phase 2)
Stored Procedures
-- 1. Initialization
pack_contafin.init_scriere_act_rul_local()
-- 2. Insert lines
INSERT INTO ACT_TEMP (
    ID_ACT, DATAIREG, DATAACT, SCD, ASCD, SCC, ASCC,
    SUMA, ID_CTR, ID_PARTD, EXPLICATIA, ...
)
-- 3. Finalization
pack_contafin.finalizeaza_scriere_act_rul()
    → SCRIE_IN_ACT()
    → SCRIE_IN_RUL()
    → Update of reports (BV, BP, VAT)
Security
Authentication
- JWT tokens from shared auth
- Middleware validates the token and injects the user
Authorization
- Permissions are checked in services
- A user can edit only their own receipts while in DRAFT
- Only the accountant can approve/reject
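The authorization rules above might look like this at the service layer. This is a minimal sketch under stated assumptions: `User`, `Receipt`, `is_accountant`, `can_edit` and `can_review` are hypothetical names, and limiting review to receipts in `pending_review` is inferred from the workflow states rather than stated in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    is_accountant: bool = False  # hypothetical role flag


@dataclass(frozen=True)
class Receipt:
    created_by: str
    status: str  # one of the receipts.status enum values


def can_edit(user: User, receipt: Receipt) -> bool:
    """Users may edit only their own receipts, and only while in draft."""
    return receipt.created_by == user.username and receipt.status == "draft"


def can_review(user: User, receipt: Receipt) -> bool:
    """Only an accountant may approve or reject a submitted receipt."""
    return user.is_accountant and receipt.status == "pending_review"
```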
File Upload
- MIME type validation (whitelist)
- File name sanitization
- Size limit (10MB)
- Storage under a UUID name (prevents path traversal)
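The first three checks can be sketched as a single validation function. This is a minimal sketch: the whitelist contents, the `validate_upload` name and the character set kept during sanitization are assumptions, not taken from the codebase.

```python
import re

# Assumed whitelist: MIME type -> extension used for the stored file.
ALLOWED_MIME = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "application/pdf": "pdf",
}
MAX_UPLOAD_SIZE = 10 * 1024 * 1024  # 10MB


def validate_upload(filename: str, mime_type: str, size: int) -> tuple[str, str]:
    """Return (sanitized display name, extension) or raise ValueError."""
    if mime_type not in ALLOWED_MIME:
        raise ValueError(f"MIME type not allowed: {mime_type}")
    if size > MAX_UPLOAD_SIZE:
        raise ValueError(f"file too large: {size} bytes")
    # Drop any directory components, then replace everything outside a
    # conservative character set and strip leading dots.
    base = filename.replace("\\", "/").rsplit("/", 1)[-1]
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".") or "upload"
    return safe_name, ALLOWED_MIME[mime_type]
```

The sanitized name is only suitable for display and for the `filename` column; the file on disk would still be stored under a UUID, so a malicious name cannot influence the path.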
Configuration
Environment Variables
# SQLite Database
SQLITE_DATABASE_PATH=data/receipts.db
# File Storage
UPLOAD_PATH=data/uploads
MAX_UPLOAD_SIZE=10485760 # 10MB
# Oracle (for nomenclatures)
ORACLE_USER=CONTAFIN_ORACLE
ORACLE_PASSWORD=***
ORACLE_HOST=localhost
ORACLE_PORT=1526
ORACLE_SID=ROA
# JWT (shared)
JWT_SECRET_KEY=***
JWT_ALGORITHM=HS256
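One way to read these variables is a small typed settings object. This is a minimal sketch using only the standard library: `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the values listed above, and treating the credentials and the JWT secret as required (no default) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    sqlite_database_path: str
    upload_path: str
    max_upload_size: int
    oracle_user: str
    oracle_password: str
    oracle_host: str
    oracle_port: int
    oracle_sid: str
    jwt_secret_key: str
    jwt_algorithm: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        sqlite_database_path=env.get("SQLITE_DATABASE_PATH", "data/receipts.db"),
        upload_path=env.get("UPLOAD_PATH", "data/uploads"),
        max_upload_size=int(env.get("MAX_UPLOAD_SIZE", "10485760")),
        # Secrets have no default: a missing value raises KeyError at startup.
        oracle_user=env["ORACLE_USER"],
        oracle_password=env["ORACLE_PASSWORD"],
        oracle_host=env.get("ORACLE_HOST", "localhost"),
        oracle_port=int(env.get("ORACLE_PORT", "1526")),
        oracle_sid=env.get("ORACLE_SID", "ROA"),
        jwt_secret_key=env["JWT_SECRET_KEY"],
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in unit tests.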
OCR Processing Pipeline
5. OCR Architecture
Choice: PaddleOCR (primary) + Tesseract (fallback), 100% local processing
Rationale:
- Zero external costs (no cloud APIs)
- On-premise processing (sensitive data stays local)
- PaddleOCR: high accuracy, CPU-friendly
- Tesseract: excellent support for Romanian diacritics
OCR Stack:
┌─────────────────────────────────────────────────────┐
│                    OCR Pipeline                     │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Image Upload → ImagePreprocessor → OCREngine       │
│       │                 │               │           │
│       │                 ▼               ▼           │
│       │            ┌─────────┐   ┌──────────────┐   │
│       │            │ OpenCV  │   │  PaddleOCR   │   │
│       │            │ Pipeline│   │  (primary)   │   │
│       │            └─────────┘   └──────┬───────┘   │
│       │                 │               │           │
│       │                 │       fallback│           │
│       │                 │               ▼           │
│       │                 │        ┌──────────────┐   │
│       │                 │        │  Tesseract   │   │
│       │                 │        │  (ron+eng)   │   │
│       │                 │        └──────────────┘   │
│       │                 │               │           │
│       ▼                 ▼               ▼           │
│  ┌──────────────────────────────────────────────┐   │
│  │           ReceiptExtractor (Regex)           │   │
│  │  - Amount patterns (TOTAL, DE PLATA)         │   │
│  │  - Date patterns (DD.MM.YYYY)                │   │
│  │  - CUI patterns (C.U.I., C.I.F.)             │   │
│  │  - Vendor extraction (first lines)           │   │
│  └──────────────────────────────────────────────┘   │
│                         │                           │
│                         ▼                           │
│           ExtractionResult + Confidence             │
│                                                     │
└─────────────────────────────────────────────────────┘
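The primary/fallback step can be sketched independently of the OCR libraries by treating each engine as a callable. This is a minimal sketch: `run_with_fallback` is a hypothetical name, and the trigger for the fallback (an exception or empty output from the primary engine) is an assumption, since the source does not say when Tesseract takes over.

```python
from typing import Callable

# An engine takes raw image bytes and returns the recognized text.
OcrEngine = Callable[[bytes], str]


def run_with_fallback(image: bytes, primary: OcrEngine,
                      fallback: OcrEngine) -> tuple[str, str]:
    """Return (engine_used, text), trying the primary engine first."""
    try:
        text = primary(image)
        if text.strip():
            return "primary", text
    except Exception:
        # Assumed policy: any primary failure falls through to the fallback.
        pass
    return "fallback", fallback(image)
```

In the real pipeline the two callables would wrap PaddleOCR and Tesseract respectively; here they can be any functions, which keeps the selection logic testable without either library installed.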
Image Preprocessing Pipeline
def preprocess(image):
    # 1. Convert to grayscale
    # 2. Resize if width < 1000px (upscale for better OCR)
    # 3. Deskew using Hough lines (straighten rotated text)
    # 4. Denoise (non-local means denoising)
    # 5. Adaptive thresholding (binarization)
    # 6. Morphological close (connect broken characters)
    return processed_image
Extraction Patterns (Romanian Receipts)
| Pattern Type | Regex Examples | Confidence |
|---|---|---|
| Amount | `TOTAL\s*:?\s*([\d.,]+)` | 0.95 |
| Date | `(\d{2}[./]\d{2}[./]\d{4})` | 0.90 |
| CUI | `C\.?U\.?I\.?\s*:?\s*(\d{6,10})` | 0.95 |
| Receipt Number | `NR\.?\s*BON\s*:?\s*(\d+)` | 0.95 |
| Vendor | First 5 non-keyword lines | 0.70 |
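The regex rows of the table can be applied with the standard `re` module. This is a minimal sketch: `PATTERNS` and `extract_fields` are hypothetical names, the patterns and confidence values are copied from the table, and vendor extraction (which is not regex-based) is left out.

```python
import re

# field name -> (compiled pattern, confidence), taken from the table above
PATTERNS = {
    "amount": (re.compile(r"TOTAL\s*:?\s*([\d.,]+)"), 0.95),
    "date": (re.compile(r"(\d{2}[./]\d{2}[./]\d{4})"), 0.90),
    "cui": (re.compile(r"C\.?U\.?I\.?\s*:?\s*(\d{6,10})"), 0.95),
    "receipt_number": (re.compile(r"NR\.?\s*BON\s*:?\s*(\d+)"), 0.95),
}


def extract_fields(text: str) -> dict[str, tuple[str, float]]:
    """Return {field: (raw matched value, confidence)} for each pattern that matches."""
    result = {}
    for name, (pattern, confidence) in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[name] = (match.group(1), confidence)
    return result
```

Values are returned as raw strings (for example `200,00`); converting them to `Decimal` or `date` objects is a separate step, because decimal and thousands separators on receipts are ambiguous.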
OCR API Endpoints
GET /api/ocr/status # Check OCR availability
POST /api/ocr/extract # Extract from uploaded image
POST /api/ocr/extract-attachment/{id} # Re-process existing attachment
System Dependencies
# Ubuntu/Debian
apt-get install -y \
tesseract-ocr tesseract-ocr-ron tesseract-ocr-eng \
poppler-utils libgl1-mesa-glx libglib2.0-0
Testing Strategy
Unit Tests
- CRUD operations
- Workflow transitions
- Entry generation logic
- OCR extraction patterns
Integration Tests
- API endpoints
- File upload/download
- Oracle nomenclature fetch
- OCR endpoint with sample receipts
E2E Tests
- Complete workflow: create → submit → approve
- File upload with preview
- OCR extraction → form auto-fill