Files
roa2web-service-auto/scripts/ralph/final-plan.md
Marius Mutu e257fa5d5f feat(telegram): bot bonuri fiscale — OCR → preview → Oracle write
- US-001: mută queue_client.py în data_entry/services/ocr/
- US-002/003/004: oracle_receipt_writer + oracle_server_id în DB
- US-005: receipt_handlers.py (PDF/photo/callback flow)
- US-006: wire handlers în main.py, per-schema connect, seq_cod.nextval
- US-007: .gitignore secrets/*.oracle_pass
- US-008/009/010: teste unit + integration + E2E
- setup-secrets.sh helper + template
- docs/telegram/README.md actualizat cu arhitectura nouă

Testat E2E pe DB live (MARIUSM_AUTO). COD din seq_cod.nextval.
pypdfium2 fallback pentru PDF decode (fără poppler).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:26:58 +00:00

29 KiB
Raw Blame History

Final plan — roa2web-telegram-bonuri

Recovered from planning session JSONL b740c3df-d2b9-447b-bf8e-0100e0dd9fbf after the approved status was accidentally reset and a new planning session was started. This file contains the longest structured assistant turn from that session (the CEO review).


Running plan-eng-review on the design doc. CEO review just completed — now focusing on what eng review uniquely covers: code patterns, test coverage diagram, performance specifics, and parallelization.


/plan-eng-review — Telegram Bonuri Fiscale

Design doc: ~/.gstack/projects/romfast-echo-core/moltbot-master-design-20260504-221523.md. Reviewed by /office-hours (8/10), /plan-ceo-review just completed (5 findings folded back into plan).


Step 0 — Scope Challenge

Existing code already solves:

Sub-problem Existing Plan reuses?
OCR async queue submit + poll backend/scripts/whatsapp_import/ocr_client.py YES (direct import)
Oracle write via PACK_CONTAFIN backend/scripts/whatsapp_import/process_v2.py:60-135 Adapts pattern (not import)
Telegram auth check backend/modules/telegram/auth/linking.py::get_user_auth_data YES
File download from Telegram python-telegram-bot::Bot.get_file().download_to_drive() YES
Date parse / nul-safe field handling process_v2.py:64-67 (date), process_v2.py:84-87 (nrract) Plan must replicate, NOT importable as-is
CUI → cont mapping process_v2.py::get_cont_cheltuiala (hardcoded Romfast) Adapted with warning fallback

Rebuilding: _save_to_oracle is being adapted from process_v2.py, not imported. Reason: process_v2.py is a CLI script that reads from a result object with .attribute access. Telegram handler will receive a dict. Different shape, can't import. The Oracle write logic should be extracted to a shared helperbackend/modules/data_entry/services/oracle_receipt_writer.py — so the next channel (WhatsApp, email) doesn't trigger a third copy. See Architecture finding #1.

Minimum set of changes — design doc proposes 5 file touches. Eng-review verdict: 5 is the minimum.

Complexity check — 5 files. Under threshold. No new classes (flat functions per existing pattern). Clean.

Search check (Layer 1/2/3):

  • python-telegram-bot async file download → built-in bot.get_file().download_to_drive(). Plan uses it. [Layer 1]
  • concurrent_updates=True for parallel users → built-in flag in Application.builder(). Plan should set it (CEO finding FINDING-ARCH-1). [Layer 1]
  • Async wrapper for sync DB driver → asyncio.to_thread() (Python 3.9+). Plan uses it. [Layer 1]
  • Per-user state → context.user_data (PTB built-in). Plan uses it. [Layer 1]

No reinvention. No L2/L3 patterns introduced.

TODOS cross-reference: No TODOS.md exists in roa2web. Recommend adding one as part of this PR with the deferred items from the CEO review (server_id DB persistence, "view in ERP" link, unify-via-receipt-service).

Distribution check: No new artifact. Code lives inside the existing roa2web bot process. N/A.

Step 0 verdict: Scope is right. Proceed to sections.


Section 1 — Architecture Review

                       ROA2WEB BOT PROCESS (single worker)
┌─────────────────────────────────────────────────────────────────────┐
│                                                                     │
│  bot_main.py::create_telegram_application()                         │
│    │                                                                │
│    ├─ Application.builder().token(T).build()                        │
│    │   └─ ⚠ MISSING: .concurrent_updates(True)                       │
│    │                                                                │
│    ├─ CommandHandler(...) × N                                       │
│    ├─ MessageHandler(TEXT & ~COMMAND, handle_text_message)          │
│    │                                                                │
│    ├─ ⓢ MessageHandler(Document.PDF | Document.IMAGE, doc_handler) │  NEW
│    ├─ ⓢ MessageHandler(PHOTO, photo_handler)                       │  NEW
│    ├─ ⓢ CallbackQueryHandler(receipt_cb, pattern=r"^receipt:")     │  NEW
│    │                                                                │
│    └─ CallbackQueryHandler(button_callback)         ← catch-all     │
│       (receipt:* MUST be registered before this — handler order)    │
│                                                                     │
│  receipt_handlers.py (NEW)                                          │
│    handle_document_message  ─┐                                      │
│    handle_photo_message      ├─→ _submit_ocr_and_preview            │
│                              │     │                                │
│                              │     ├─ submit_ocr_job (whatsapp_imp.)│
│                              │     ├─ poll wait_for_result          │
│                              │     └─ edit_message + keyboard       │
│                              │                                      │
│    handle_receipt_callback ──┘     │                                │
│      ├─ confirm: asyncio.to_thread(_save_to_oracle)                 │
│      └─ cancel:  unlink temp + edit_message                         │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
              │                    │
              ▼                    ▼
       SQLite OCR queue       Oracle DB
       data/ocr_queue/        ACT_TEMP via PACK_CONTAFIN
       ⚠ hardcoded path        cred: secrets/<server_id>_write.oracle_pass

FINDING-ARCH-1 [P2] (confidence: 8/10) _save_to_oracle duplication risk. The plan adapts process_v2.py's PACK_CONTAFIN logic into receipt_handlers.py::_save_to_oracle. This makes 2 copies of the 3-step Oracle write. WhatsApp adapter is the third channel waiting to land — that becomes 3 copies. DRY violation.

Recommendation: extract to backend/modules/data_entry/services/oracle_receipt_writer.py::write_receipt(receipt_dict, oracle_cfg) -> tuple[int, str]. Both process_v2.py and receipt_handlers.py import it. Effort: ~20 min (CC) for the extraction + adapt both callers.

This connects to engineering preference: "DRY is important — flag repetition aggressively."

FINDING-ARCH-2 [P3] (confidence: 9/10) bot/__init__.py import boundary. The plan imports from backend.scripts.whatsapp_import.ocr_client. Importing from scripts/ into modules/ is backwards — scripts should depend on modules, not the reverse.

Recommendation: move ocr_client.py to backend/modules/data_entry/services/ocr/queue_client.py. The whatsapp scripts then import FROM modules. Then receipt_handlers imports from a proper module path. Bonus: solves the hardcoded /workspace/roa2web/... path natively, since the path becomes relative to Path(__file__).parents[3] from inside the modules tree.

Effort: ~15 min (CC) for the move + 3 import updates.

FINDING-ARCH-3 [P1] (confidence: 9/10) Already in CEO review as FINDING-ARCH-1: concurrent_updates=False default. One-line fix: Application.builder().token(T).concurrent_updates(True).build(). Without it, the 120s OCR poll blocks every other user. Critical for any team with ≥2 active users.

Production failure scenarios:

Scenario Plan accounts for?
OCR worker process down ✗ — plan polls until 120s timeout, generic error. Should health-check first.
Oracle DB unreachable ✓ — try/except around _save_to_oracle + finally cleanup
Telegram file download fails (40MB+) ✗ — no try/except around download_to_drive()
User auth revoked between submit and confirm ✗ — no re-check at confirm time
Bot restart between submit and confirm ✓ — pending lost (in-memory), pending file orphaned, startup cleanup scan handles it
secrets/ file missing Partial — returns empty password, ORA-01017, generic error

Coupling concerns: receipt_handlerswhatsapp_import (BACKWARDS). FINDING-ARCH-2 fixes it.

Scaling: OCR poll dominates. With concurrent_updates=True (FINDING-ARCH-3), 10 simultaneous users × 120s each is bounded by the OCR worker throughput, not handler concurrency. Worker is data_entry/services/ocr/job_worker.py — single worker, queue-based — so OCR is the real serialization point. Acceptable for a 4-accountant team.

SPOF: OCR worker. Documented above.

Rollback: git revert + restart. If cherry-pick #3 (server_id DB column) is implemented, additive nullable column → safe rollback.

STOP — 3 architecture findings. Going with recommendations: ACCEPT all three.


Section 2 — Code Quality Review

FINDING-CQ-1 [P2] (confidence: 8/10) _format_receipt_preview will be a string concat for ~10 fields with None handling. The plan describes it as 1 function returning 1 multi-line string. With the cherry-pick #1 (low-confidence warning) added, this function does 2 things: format + warn.

Recommendation: split into _format_receipt_preview(ocr_result) and _confidence_warning(confidence) returning Optional[str]. Compose them in the caller. Keeps each function under 10 lines, testable in isolation.

Effort: trivial. Maps to preference: "explicit over clever."

FINDING-CQ-2 [P2] (confidence: 9/10) Magic numbers in plan: 120 (OCR timeout seconds), 0.3 (poll interval), 600 (TTL seconds), 0.60 (low-confidence threshold). Spread across the file as hardcoded literals.

Recommendation: top of receipt_handlers.py:

OCR_TIMEOUT_S = 120
OCR_POLL_INTERVAL_S = 0.3
PENDING_TTL_S = 600
LOW_CONFIDENCE_THRESHOLD = 0.60
TEMP_FILE_PREFIX = "receipt_"

Effort: 5 min. Maps to preference: "bias toward explicit."

FINDING-CQ-3 [P2] (confidence: 8/10) Error handling in plan uses except Exception as e (line 172 of design doc). This is the catch-all anti-pattern flagged in the skill's prime directives.

Recommendation: catch the specific exceptions:

except oracledb.DatabaseError as e:
    msg = _format_oracle_error(e)  # CEO finding FINDING-ERR-3
    await query.edit_message_text(f"❌ Eroare bază de date: {msg}")
    logger.exception("receipt: oracle_write_failed user=%s", oracle_username)
except (asyncio.TimeoutError, OSError) as e:
    await query.edit_message_text(f"❌ Eroare sistem: {e}")
    logger.exception("receipt: system_error")

Don't catch Exception — let unexpected ones crash the handler so PTB's error_handler logs them with full traceback. This is how bot_main.py:148 already does it (add_error_handler(error_handler)).

FINDING-CQ-4 [P3] (confidence: 7/10) pending_receipt dict is structural — created in 1 place, consumed in 1 place. With 6 keys including a TTL field, this is on the edge of needing a dataclass. Going with explicit dict for now is fine (matches existing handlers.py style, avoids new abstraction). Flag: if a third consumer appears, refactor.

Maps to preference: "engineered enough — not over-engineered."

STOP — 4 code quality findings. Going with recommendations: ACCEPT all four.


Section 3 — Test Review (MANDATORY)

Framework detection:

[ -f roa2web/pytest.ini ] || [ -f roa2web/pyproject.toml ]
ls roa2web/tests/

roa2web uses pytest + pytest-asyncio. Test fixtures already exist at tests/fixtures/ocr-samples/.

Coverage diagram:

CODE PATHS                                              USER FLOWS
[+] receipt_handlers.py                                 [+] PDF receipt flow
  ├── handle_document_message()                          ├── [GAP] [→E2E] Send PDF → preview → confirm → Oracle
  │   ├── [GAP] auth check → not linked                  ├── [GAP] [→E2E] Send PDF → preview → cancel
  │   ├── [GAP] auth check → linked, MIME PDF            └── [GAP] Send wrong file type (e.g. .docx)
  │   ├── [GAP] auth check → linked, MIME PNG
  │   ├── [GAP] auth check → linked, MIME unknown      [+] Photo receipt flow
  │   └── [GAP] [cherry-pick #2] pending_receipt exists   ├── [GAP] [→E2E] Send photo → preview → confirm
  │                                                      └── [GAP] Send photo while pending exists
  ├── handle_photo_message()
  │   ├── [GAP] auth check → not linked                [+] Error states
  │   ├── [GAP] auth check → linked                      ├── [GAP] OCR failed → "OCR eșuat" with retry suggestion
  │   └── [GAP] [cherry-pick #2] pending exists          ├── [GAP] OCR timeout → "OCR timeout" with retry
  │                                                      ├── [GAP] Oracle ORA-01017 → translated message
  ├── _submit_ocr_and_preview()                          ├── [GAP] Oracle ORA-00001 (duplicate) → translated
  │   ├── [GAP] OSError on download (FINDING-ERR-1)      ├── [GAP] Sesiune expirată după 10 min → cancel msg
  │   ├── [GAP] submit_ocr_job aiosqlite.Error           └── [GAP] Double-press Confirm → "Sesiune expirată"
  │   ├── [GAP] poll → timeout
  │   ├── [GAP] poll → success                         [+] Boundary states
  │   ├── [GAP] poll → success but low confidence        ├── [GAP] OCR confidence 0.0 → warning shown
  │   └── [GAP] poll → result has None fields            ├── [GAP] OCR all fields None → N/A in preview
  │                                                      ├── [GAP] amount = 0 → still saves (or rejects?)
  ├── _format_receipt_preview()                          ├── [GAP] receipt_date None → defaults to today
  │   ├── [GAP] all fields populated                     └── [GAP] CUI not in CUI_TO_CONT → fallback "6028"
  │   ├── [GAP] confidence < 0.60 → ⚠ warning
  │   └── [GAP] all fields None → N/A everywhere
  │
  ├── _confidence_warning()                            [+] Concurrency
  │   ├── [GAP] confidence 0.50 → returns warning string  └── [GAP] [→E2E] 2 users send receipts simultaneously
  │   └── [GAP] confidence 0.85 → returns None              (validates concurrent_updates=True)
  │
  ├── _build_oracle_write_config()
  │   ├── [GAP] server_id present, write_user in config
  │   ├── [GAP] server_id present, no write_user → derive _AUTO
  │   ├── [GAP] server_id None → fallback default server
  │   ├── [GAP] [FINDING-ERR-2] server_id unknown → fallback to default
  │   └── [GAP] secrets file missing → empty password
  │
  ├── _save_to_oracle()                                  [→E2E with real Oracle dev DB:]
  │   ├── [GAP] happy path                               └── tests/e2e/test_receipt_oracle_write.py
  │   ├── [GAP] CUI not in mapping → log + fallback
  │   ├── [GAP] receipt_date None → datetime.now()
  │   ├── [GAP] receipt_date malformed → try/except
  │   ├── [GAP] amount/tva None → 0
  │   └── [GAP] PACK_CONTAFIN raises → DatabaseError up
  │
  ├── _format_oracle_error() [FINDING-ERR-3]
  │   ├── [GAP] ORA-01017 → "Credențiale invalide"
  │   ├── [GAP] ORA-00001 → "Bon duplicat"
  │   ├── [GAP] ORA-12541 → "Bază de date inaccesibilă"
  │   └── [GAP] unknown ORA → original message
  │
  └── handle_receipt_callback()
      ├── [GAP] confirm → save success
      ├── [GAP] confirm → save fail (DatabaseError)
      ├── [GAP] confirm → pending None (expired)
      ├── [GAP] confirm → TTL > 600s → expired
      ├── [GAP] cancel → file unlinked + msg
      ├── [GAP] double-press confirm → 2nd press gets None
      └── [GAP] file already deleted at confirm → missing_ok=True

[+] bot_main.py
  └── concurrent_updates=True wired                  [GAP] verify via integration test

[+] config.py::OracleServerConfig
  └── write_user: Optional[str] field added          [GAP] unit test: load config with/without write_user

COVERAGE: 0/47 paths tested (0%) — entire feature is new
QUALITY: ★★★:0 ★★:0 ★:0
GAPS: 47 total (5 E2E, 0 eval)

Critical gap: Every path is a gap because receipt_handlers.py is a new file. The plan must include all of these tests as part of the implementation step, not deferred.

Test plan to add to design doc Assignment:

  1. Unit teststests/modules/telegram/test_receipt_handlers.py:

    • test_format_receipt_preview_* (3 cases: full, all None, low confidence)
    • test_confidence_warning_* (2 cases)
    • test_build_oracle_write_config_* (5 cases incl. unknown server_id fallback)
    • test_format_oracle_error_* (4 ORA codes)
    • test_save_to_oracle_* (mock oracledb, 6 cases)
    • test_handle_receipt_callback_* (7 cases incl. expired, double-press, cancel)
    • test_handle_document_message_* (5 cases)
    • test_handle_photo_message_* (3 cases)
  2. E2E teststests/e2e/test_receipt_telegram_flow.py (using PTB test bot or mocks):

    • PDF send → preview → confirm
    • Photo send → preview → cancel
    • 2 simultaneous users (validates concurrent_updates=True)
  3. Oracle integration testtests/integration/test_oracle_receipt_writer.py (after extracting per FINDING-ARCH-1):

    • Real dev Oracle DB write + rollback (or use a transaction wrap)

Test plan artifact:

# Test Plan: Telegram Bonuri Fiscale
Generated by /plan-eng-review on 2026-05-05
Branch: feature/telegram-bonuri-fiscale (roa2web)
Repo: romfast/roa2web

## Affected Pages/Routes
- Telegram bot: send PDF/JPG/PNG document → preview message
- Telegram bot: send photo → preview message
- Telegram inline button: ✅ Confirmă → Oracle write
- Telegram inline button: ❌ Anulează → cleanup

## Key Interactions to Verify
- Send PDF receipt → OCR processing message → preview with extracted data → Confirmă button works → Oracle write succeeds → success message with COD
- Send photo receipt → same flow with JPEG
- Send unsupported file (.docx) → reject message
- Press Anulează → cleanup message, file deleted from /tmp

## Edge Cases
- Send 2nd file while pending exists → "Ai un bon în așteptare" message
- Wait > 10 minutes after preview, then press Confirmă → "Sesiune expirată"
- Double-press Confirmă rapidly → 1 save, 2nd press shows "Sesiune expirată"
- OCR confidence < 60%   warning at top of preview
- OCR returns all-None fields  preview shows N/A everywhere, no crash
- Oracle ORA-01017 (bad password)  user sees "Credențiale invalide"

## Critical Paths
- Auth user  submit PDF  confirm  verify ACT_TEMP row in Oracle dev DB
- 2 users submit photo simultaneously  both get preview within 30s (validates concurrent_updates)

I'll write this to ~/.gstack/projects/romfast-echo-core/moltbot-master-eng-review-test-plan-20260505.md after the review completes.

STOP — Test review complete. 47 gaps, all mandatory. Going with recommendation: write all 47 tests as part of implementation (not deferred).


Section 4 — Performance Review

FINDING-PERF-1 [P3] (confidence: 7/10) Oracle connection per save. _save_to_oracle calls oracledb.connect() on every confirm. roa2web has an existing Oracle connection pool — backend/modules/data_entry/db/oracle_pool.py per the recent commits.

Recommendation: use the existing pool. from backend.modules.data_entry.db.oracle_pool import get_pool. If the pool is per-server-id, pass the server_id. Saves ~100-300ms per receipt (TCP handshake + auth round-trip). For a single user at human pace this is barely noticeable, but it's the existing pattern — don't reinvent.

[Layer 1] — use the built-in pool, don't roll your own.

FINDING-PERF-2 [P4] (confidence: 6/10) asyncio.sleep(0.3) × 400 = 400 wakeups for a single OCR job. Each wakeup re-queries SQLite. With 4 simultaneous users × 400 wakeups = 1600 SQLite queries per minute. Not catastrophic, but wait_for_result could use a longer poll interval (1.0s) with no perceptible UX delta. 120 wakeups instead of 400.

Recommendation: change OCR_POLL_INTERVAL_S = 1.0. User won't notice — OCR results don't arrive faster than ~5s anyway.

FINDING-PERF-3 [P4] (confidence: 5/10) Memory: pending_receipt dict in context.user_data holds the full OCR result + file path. For 100 active users this is ~50KB. Negligible. No issue.

FINDING-PERF-4 [P3] (confidence: 7/10) Caching: _build_oracle_write_config reads the secrets file from disk on every confirm. Cache it in module-level dict at first read.

Recommendation:

_PASSWORD_CACHE: dict[str, str] = {}

def _read_oracle_password(server_id: str) -> str:
    if server_id not in _PASSWORD_CACHE:
        pass_file = PROJECT_ROOT / "secrets" / f"{server_id}_write.oracle_pass"
        _PASSWORD_CACHE[server_id] = pass_file.read_text().strip() if pass_file.exists() else ""
    return _PASSWORD_CACHE[server_id]

Saves a disk read per confirm. Trivial. Maps to preference: "explicit > clever, but cache cheap reads."

STOP — 4 perf findings. ACCEPT FINDING-PERF-1 (use pool) and FINDING-PERF-2 (1s poll). FINDING-PERF-3 = no action. FINDING-PERF-4 = ACCEPT (trivial).


Outside Voice — skip

Codex availability check + 2-minute outside voice run is valuable for big architecture decisions. This plan is contained: 1 new file, 4 file touches, well-trodden patterns. Skipping.


Failure Modes Summary

Codepath Failure Test? Error handling? Silent?
_download_file OSError (/tmp full) unlikely but real ✗ (FINDING-ERR-1) YES — CRITICAL GAP
_build_oracle_write_config unknown server_id possible after server delete ✗ (FINDING-ERR-2) YES — CRITICAL GAP
_save_to_oracle ORA-01017 wrong password file ✓ (raw msg) NO (visible to user)
OCR worker process down bot deployed without worker ✓ (timeout) NO (timeout msg)
pending_receipt race (rapid double-confirm) reasonable scenario ✓ (pop returns None) NO
secrets/ file missing first deploy mistake Partial (empty pwd → ORA-01017) NO (visible)
receipt_date malformed in OCR result model output drift ✗ (FINDING-DATA-1) YES — CRITICAL GAP

3 critical gaps. All 3 are addressed by CEO findings already accepted into the plan.


Worktree Parallelization

Sequential implementation, no parallelization opportunity. The plan is one feature in one file with cascading dependencies (tests depend on impl, bot_main wiring depends on receipt_handlers.py existing). No independent workstreams. Single worktree, single agent.


NOT in Scope

Item Rationale
WhatsApp parity for the same flow Separate channel, separate PR. Architecture #1 (extract oracle_receipt_writer.py) makes this easier later.
oracle_receipt_writer.py extraction RECOMMEND DOING IT — see FINDING-ARCH-1. If user defers: explicitly list as TODO.
ocr_client.py move from scripts/ to modules/ RECOMMEND DOING IT — FINDING-ARCH-2. If user defers: TODO.
"View in ERP" deep link after save CEO cherry-pick #4 — needs URL pattern confirmation, defer post-validation
Unify write path through receipts.py API CEO cherry-pick #5 — too much scope (JWT auth for service account, SQLAlchemy session)
Multi-company CUI_TO_CONT mapping in DB Deferred from existing whatsapp_import. Separate refactor.
Web push notification when receipt saved Out of scope — would couple to web SSE infra

What Already Exists

Already in repo Plan reuses correctly?
ocr_client.py::submit_ocr_job / wait_for_result YES (with path-fix preconditions)
process_v2.py PACK_CONTAFIN logic Adapts pattern (recommend extracting per ARCH-1)
get_user_auth_data() YES
OracleServerConfig model Extended with write_user field
secrets/<server_id>.oracle_pass convention Followed for _write.oracle_pass
tests/fixtures/ocr-samples/ (pre-loaded PDFs) Available for E2E tests
oracle_pool.py connection pool NOT in plan — recommend adding (FINDING-PERF-1)
Existing handler order pattern in bot_main.py YES — receipt callbacks before catch-all

TODOS.md Proposed Items

(Since no TODOS.md exists in roa2web — these are deferral notes for the design doc Open Questions section.)

  1. Persist oracle_server_id in telegram_users table — Already accepted as cherry-pick #3 in CEO review, included in plan.
  2. Extract oracle_receipt_writer.py shared helper — FINDING-ARCH-1. Going with recommendation: include in this PR.
  3. Move ocr_client.py from scripts/ to modules/ — FINDING-ARCH-2. Going with recommendation: include in this PR.
  4. "View in ERP" deep link after save — CEO cherry-pick #4 deferred.
  5. Unify Telegram write through receipts.py API endpoint — CEO cherry-pick #5 deferred.
  6. Multi-company CUI→cont mapping in DB or config — already noted in design doc's Open Question #4.

Updated Assignment (revised checklist for implementation)

1.  git checkout -b feature/telegram-bonuri-fiscale (in roa2web)
2.  Move ocr_client.py: scripts/whatsapp_import/ → modules/data_entry/services/ocr/queue_client.py
    Update imports in process_v2.py + test_ocr_simple.py
    Replace hardcoded QUEUE_DIR with Path(__file__).parents[3] / "data/ocr_queue"
3.  Extract oracle_receipt_writer.py:
    Create modules/data_entry/services/oracle_receipt_writer.py with write_receipt(receipt_dict, oracle_cfg) -> tuple[int, str]
    Adapt process_v2.py to call it
4.  Add backend/config.py: write_user: Optional[str] = None to OracleServerConfig
5.  Migration: alembic revision --autogenerate -m "add oracle_server_id to telegram_users"
    Modify link_user_to_oracle to accept + store server_id
    Modify get_user_auth_data to return server_id
6.  Implement receipt_handlers.py:
    - Module constants (FINDING-CQ-2)
    - _format_receipt_preview + _confidence_warning (FINDING-CQ-1)
    - _build_oracle_write_config with None-server fallback (FINDING-ERR-2)
    - _format_oracle_error (FINDING-ERR-3)
    - Caching for password file (FINDING-PERF-4)
    - _save_to_oracle delegates to oracle_receipt_writer
    - handle_document/photo/callback with proper error handling (FINDING-CQ-3)
    - Date parse with try/except (FINDING-DATA-1)
    - OSError handling around download (FINDING-ERR-1)
    - Duplicate pending guard (cherry-pick #2)
    - Low-confidence warning in preview (cherry-pick #1)
    - Structured logging at each step (FINDING-OBS-1)
7.  Wire bot_main.py:
    - Add .concurrent_updates(True) to Application.builder() (FINDING-ARCH-3)
    - Register 3 new handlers BEFORE catch-all CallbackQueryHandler
    - Add startup cleanup scan: glob /tmp/receipt_*.* and unlink at start
8.  Create secrets/<server_id>_write.oracle_pass on server (deploy step, not in git)
9.  Verify .gitignore covers secrets/*.oracle_pass (FINDING-SEC-1)
10. Write all 47 unit tests per Section 3 coverage diagram
11. Write 3 E2E tests (PDF flow, photo flow, concurrent users)
12. Write Oracle integration test against dev DB
13. End-to-end manual test with real bon → confirm in Oracle ACT_TEMP
14. Verify on production-like setup that OCR worker is running before deploy

15 steps. Up from 9. Every addition is grounded in a finding.


Completion Summary

  • Step 0: Scope Challenge — Scope accepted as-is (5 file touches, no over-build)
  • Architecture Review — 3 issues found (1 critical: concurrent_updates; 2 cleanups: extract oracle writer + move ocr_client)
  • Code Quality Review — 4 issues found (split format/warn, magic numbers, except Exception, dataclass deferred)
  • Test Review — diagram produced, 47 gaps identified (entire feature is new)
  • Performance Review — 4 issues found (use existing pool, slower poll, password cache, memory N/A)
  • NOT in scope — written (7 items)
  • What already exists — written (8 items)
  • TODOS.md updates — 6 items, 3 included in PR, 3 deferred
  • Failure modes — 3 critical gaps flagged, all addressed by accepted findings
  • Outside voice — skipped (contained scope, no architectural ambiguity)
  • Parallelization — sequential, no opportunity
  • Lake Score — 11/13 recommendations chose complete option (2 deferred to follow-up PRs, both with explicit rationale)

Unresolved decisions: None. User's "merge cu recomandările" applies to all findings. All ACCEPTed unless explicitly deferred.


Review Log