- US-001: mută queue_client.py în data_entry/services/ocr/ - US-002/003/004: oracle_receipt_writer + oracle_server_id în DB - US-005: receipt_handlers.py (PDF/photo/callback flow) - US-006: wire handlers în main.py, per-schema connect, seq_cod.nextval - US-007: .gitignore secrets/*.oracle_pass - US-008/009/010: teste unit + integration + E2E - setup-secrets.sh helper + template - docs/telegram/README.md actualizat cu arhitectura nouă Testat E2E pe DB live (MARIUSM_AUTO). COD din seq_cod.nextval. pypdfium2 fallback pentru PDF decode (fără poppler). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
490 lines
29 KiB
Markdown
490 lines
29 KiB
Markdown
# Final plan — roa2web-telegram-bonuri
|
||
|
||
> Recovered from planning session JSONL `b740c3df-d2b9-447b-bf8e-0100e0dd9fbf` after the
|
||
> approved status was accidentally reset and a new planning session was started. This file
|
||
> contains the longest structured assistant turn from that session (the CEO review).
|
||
|
||
---
|
||
|
||
Running plan-eng-review on the design doc. CEO review just completed — now focusing on what eng review uniquely covers: code patterns, test coverage diagram, performance specifics, and parallelization.
|
||
|
||
---
|
||
|
||
## /plan-eng-review — Telegram Bonuri Fiscale
|
||
|
||
Design doc: `~/.gstack/projects/romfast-echo-core/moltbot-master-design-20260504-221523.md`. Reviewed by /office-hours (8/10), /plan-ceo-review just completed (5 findings folded back into plan).
|
||
|
||
---
|
||
|
||
### Step 0 — Scope Challenge
|
||
|
||
**Existing code already solves:**
|
||
|
||
| Sub-problem | Existing | Plan reuses? |
|
||
|---|---|---|
|
||
| OCR async queue submit + poll | `backend/scripts/whatsapp_import/ocr_client.py` | YES (direct import) ✅ |
|
||
| Oracle write via PACK_CONTAFIN | `backend/scripts/whatsapp_import/process_v2.py:60-135` | Adapts pattern (not import) ✅ |
|
||
| Telegram auth check | `backend/modules/telegram/auth/linking.py::get_user_auth_data` | YES ✅ |
|
||
| File download from Telegram | `python-telegram-bot::Bot.get_file().download_to_drive()` | YES ✅ |
|
||
| Date parse / nul-safe field handling | `process_v2.py:64-67` (date), `process_v2.py:84-87` (nrract) | Plan must replicate, NOT importable as-is |
|
||
| CUI → cont mapping | `process_v2.py::get_cont_cheltuiala` (hardcoded Romfast) | Adapted with warning fallback |
|
||
|
||
**Rebuilding:** `_save_to_oracle` is being adapted from `process_v2.py`, not imported. Reason: `process_v2.py` is a CLI script that reads from a `result` object with `.attribute` access. Telegram handler will receive a `dict`. Different shape, can't import. **The Oracle write logic should be extracted to a shared helper** — `backend/modules/data_entry/services/oracle_receipt_writer.py` — so the next channel (WhatsApp, email) doesn't trigger a third copy. See Architecture finding #1.
|
||
|
||
**Minimum set of changes** — design doc proposes 5 file touches. Eng-review verdict: 5 is the minimum.
|
||
|
||
**Complexity check** — 5 files. Under threshold. No new classes (flat functions per existing pattern). Clean.
|
||
|
||
**Search check (Layer 1/2/3):**
|
||
- python-telegram-bot async file download → built-in `bot.get_file().download_to_drive()`. Plan uses it. ✅ [Layer 1]
|
||
- `concurrent_updates=True` for parallel users → built-in flag in `Application.builder()`. Plan should set it (CEO finding FINDING-ARCH-1). [Layer 1]
|
||
- Async wrapper for sync DB driver → `asyncio.to_thread()` (Python 3.9+). Plan uses it. ✅ [Layer 1]
|
||
- Per-user state → `context.user_data` (PTB built-in). Plan uses it. ✅ [Layer 1]
|
||
|
||
No reinvention. No L2/L3 patterns introduced.
|
||
|
||
**TODOS cross-reference:** No `TODOS.md` exists in roa2web. Recommend adding one as part of this PR with the deferred items from the CEO review (server_id DB persistence, "view in ERP" link, unify-via-receipt-service).
|
||
|
||
**Distribution check:** No new artifact. Code lives inside the existing roa2web bot process. N/A.
|
||
|
||
**Step 0 verdict:** Scope is right. Proceed to sections.
|
||
|
||
---
|
||
|
||
### Section 1 — Architecture Review
|
||
|
||
```
|
||
ROA2WEB BOT PROCESS (single worker)
|
||
┌─────────────────────────────────────────────────────────────────────┐
|
||
│ │
|
||
│ bot_main.py::create_telegram_application() │
|
||
│ │ │
|
||
│ ├─ Application.builder().token(T).build() │
|
||
│ │ └─ ⚠ MISSING: .concurrent_updates(True) │
|
||
│ │ │
|
||
│ ├─ CommandHandler(...) × N │
|
||
│ ├─ MessageHandler(TEXT & ~COMMAND, handle_text_message) │
|
||
│ │ │
|
||
│ ├─ ⓢ MessageHandler(Document.PDF | Document.IMAGE, doc_handler) │ NEW
|
||
│ ├─ ⓢ MessageHandler(PHOTO, photo_handler) │ NEW
|
||
│ ├─ ⓢ CallbackQueryHandler(receipt_cb, pattern=r"^receipt:") │ NEW
|
||
│ │ │
|
||
│ └─ CallbackQueryHandler(button_callback) ← catch-all │
|
||
│ (receipt:* MUST be registered before this — handler order) │
|
||
│ │
|
||
│ receipt_handlers.py (NEW) │
|
||
│ handle_document_message ─┐ │
|
||
│ handle_photo_message ├─→ _submit_ocr_and_preview │
|
||
│ │ │ │
|
||
│ │ ├─ submit_ocr_job (whatsapp_imp.)│
|
||
│ │ ├─ poll wait_for_result │
|
||
│ │ └─ edit_message + keyboard │
|
||
│ │ │
|
||
│ handle_receipt_callback ──┘ │ │
|
||
│ ├─ confirm: asyncio.to_thread(_save_to_oracle) │
|
||
│ └─ cancel: unlink temp + edit_message │
|
||
│ │
|
||
└─────────────────────────────────────────────────────────────────────┘
|
||
│ │
|
||
▼ ▼
|
||
SQLite OCR queue Oracle DB
|
||
data/ocr_queue/ ACT_TEMP via PACK_CONTAFIN
|
||
⚠ hardcoded path cred: secrets/<server_id>_write.oracle_pass
|
||
```
|
||
|
||
**FINDING-ARCH-1** `[P2] (confidence: 8/10)` `_save_to_oracle` duplication risk. The plan adapts `process_v2.py`'s PACK_CONTAFIN logic into `receipt_handlers.py::_save_to_oracle`. This makes 2 copies of the 3-step Oracle write. WhatsApp adapter is the third channel waiting to land — that becomes 3 copies. DRY violation.
|
||
|
||
Recommendation: extract to `backend/modules/data_entry/services/oracle_receipt_writer.py::write_receipt(receipt_dict, oracle_cfg) -> tuple[int, str]`. Both `process_v2.py` and `receipt_handlers.py` import it. Effort: ~20 min (CC) for the extraction + adapt both callers.
|
||
|
||
This connects to engineering preference: "DRY is important — flag repetition aggressively."
|
||
|
||
**FINDING-ARCH-2** `[P3] (confidence: 9/10)` `bot/__init__.py` import boundary. The plan imports from `backend.scripts.whatsapp_import.ocr_client`. Importing from `scripts/` into `modules/` is backwards — scripts should depend on modules, not the reverse.
|
||
|
||
Recommendation: move `ocr_client.py` to `backend/modules/data_entry/services/ocr/queue_client.py`. The whatsapp scripts then import FROM modules. Then receipt_handlers imports from a proper module path. Bonus: solves the hardcoded `/workspace/roa2web/...` path natively, since the path becomes relative to `Path(__file__).parents[3]` from inside the modules tree.
|
||
|
||
Effort: ~15 min (CC) for the move + 3 import updates.
|
||
|
||
**FINDING-ARCH-3** `[P1] (confidence: 9/10)` Already in CEO review as FINDING-ARCH-1: `concurrent_updates=False` default. One-line fix: `Application.builder().token(T).concurrent_updates(True).build()`. Without it, the 120s OCR poll blocks every other user. Critical for any team with ≥2 active users.
|
||
|
||
**Production failure scenarios:**
|
||
|
||
| Scenario | Plan accounts for? |
|
||
|---|---|
|
||
| OCR worker process down | ✗ — plan polls until 120s timeout, generic error. Should health-check first. |
|
||
| Oracle DB unreachable | ✓ — `try/except` around `_save_to_oracle` + `finally` cleanup |
|
||
| Telegram file download fails (40MB+) | ✗ — no try/except around `download_to_drive()` |
|
||
| User auth revoked between submit and confirm | ✗ — no re-check at confirm time |
|
||
| Bot restart between submit and confirm | ✓ — pending lost (in-memory), pending file orphaned, startup cleanup scan handles it |
|
||
| `secrets/` file missing | Partial — returns empty password, ORA-01017, generic error |
|
||
|
||
**Coupling concerns:** `receipt_handlers` → `whatsapp_import` (BACKWARDS). FINDING-ARCH-2 fixes it.
|
||
|
||
**Scaling:** OCR poll dominates. With concurrent_updates=True (FINDING-ARCH-3), 10 simultaneous users × 120s each is bounded by the OCR worker throughput, not handler concurrency. Worker is `data_entry/services/ocr/job_worker.py` — single worker, queue-based — so OCR is the real serialization point. Acceptable for a 4-accountant team.
|
||
|
||
**SPOF:** OCR worker. Documented above.
|
||
|
||
**Rollback:** `git revert + restart`. If cherry-pick #3 (server_id DB column) is implemented, additive nullable column → safe rollback.
|
||
|
||
**STOP** — 3 architecture findings. Going with recommendations: ACCEPT all three.
|
||
|
||
---
|
||
|
||
### Section 2 — Code Quality Review
|
||
|
||
**FINDING-CQ-1** `[P2] (confidence: 8/10)` `_format_receipt_preview` will be a string concat for ~10 fields with None handling. The plan describes it as 1 function returning 1 multi-line string. With the cherry-pick #1 (low-confidence warning) added, this function does 2 things: format + warn.
|
||
|
||
Recommendation: split into `_format_receipt_preview(ocr_result)` and `_confidence_warning(confidence)` returning `Optional[str]`. Compose them in the caller. Keeps each function under 10 lines, testable in isolation.
|
||
|
||
Effort: trivial. Maps to preference: "explicit over clever."
|
||
|
||
**FINDING-CQ-2** `[P2] (confidence: 9/10)` Magic numbers in plan: `120` (OCR timeout seconds), `0.3` (poll interval), `600` (TTL seconds), `0.60` (low-confidence threshold). Spread across the file as hardcoded literals.
|
||
|
||
Recommendation: top of `receipt_handlers.py`:
|
||
```python
|
||
OCR_TIMEOUT_S = 120
|
||
OCR_POLL_INTERVAL_S = 0.3
|
||
PENDING_TTL_S = 600
|
||
LOW_CONFIDENCE_THRESHOLD = 0.60
|
||
TEMP_FILE_PREFIX = "receipt_"
|
||
```
|
||
|
||
Effort: 5 min. Maps to preference: "bias toward explicit."
|
||
|
||
**FINDING-CQ-3** `[P2] (confidence: 8/10)` Error handling in plan uses `except Exception as e` (line 172 of design doc). This is the catch-all anti-pattern flagged in the skill's prime directives.
|
||
|
||
Recommendation: catch the specific exceptions:
|
||
```python
|
||
except oracledb.DatabaseError as e:
|
||
msg = _format_oracle_error(e) # CEO finding FINDING-ERR-3
|
||
await query.edit_message_text(f"❌ Eroare bază de date: {msg}")
|
||
logger.exception("receipt: oracle_write_failed user=%s", oracle_username)
|
||
except (asyncio.TimeoutError, OSError) as e:
|
||
await query.edit_message_text(f"❌ Eroare sistem: {e}")
|
||
logger.exception("receipt: system_error")
|
||
```
|
||
|
||
Don't catch `Exception` — let unexpected ones crash the handler so PTB's error_handler logs them with full traceback. This is how `bot_main.py:148` already does it (`add_error_handler(error_handler)`).
|
||
|
||
**FINDING-CQ-4** `[P3] (confidence: 7/10)` `pending_receipt` dict is structural — created in 1 place, consumed in 1 place. With 6 keys including a TTL field, this is on the edge of needing a `dataclass`. Going with explicit dict for now is fine (matches existing handlers.py style, avoids new abstraction). Flag: if a third consumer appears, refactor.
|
||
|
||
Maps to preference: "engineered enough — not over-engineered."
|
||
|
||
**STOP** — 4 code quality findings. Going with recommendations: ACCEPT all four.
|
||
|
||
---
|
||
|
||
### Section 3 — Test Review (MANDATORY)
|
||
|
||
**Framework detection:**
|
||
|
||
```bash
|
||
[ -f roa2web/pytest.ini ] || [ -f roa2web/pyproject.toml ]
|
||
ls roa2web/tests/
|
||
```
|
||
|
||
roa2web uses pytest + pytest-asyncio. Test fixtures already exist at `tests/fixtures/ocr-samples/`.
|
||
|
||
**Coverage diagram:**
|
||
|
||
```
|
||
CODE PATHS USER FLOWS
|
||
[+] receipt_handlers.py [+] PDF receipt flow
|
||
├── handle_document_message() ├── [GAP] [→E2E] Send PDF → preview → confirm → Oracle
|
||
│ ├── [GAP] auth check → not linked ├── [GAP] [→E2E] Send PDF → preview → cancel
|
||
│ ├── [GAP] auth check → linked, MIME PDF └── [GAP] Send wrong file type (e.g. .docx)
|
||
│ ├── [GAP] auth check → linked, MIME PNG
|
||
│ ├── [GAP] auth check → linked, MIME unknown [+] Photo receipt flow
|
||
│ └── [GAP] [cherry-pick #2] pending_receipt exists ├── [GAP] [→E2E] Send photo → preview → confirm
|
||
│ └── [GAP] Send photo while pending exists
|
||
├── handle_photo_message()
|
||
│ ├── [GAP] auth check → not linked [+] Error states
|
||
│ ├── [GAP] auth check → linked ├── [GAP] OCR failed → "OCR eșuat" with retry suggestion
|
||
│ └── [GAP] [cherry-pick #2] pending exists ├── [GAP] OCR timeout → "OCR timeout" with retry
|
||
│ ├── [GAP] Oracle ORA-01017 → translated message
|
||
├── _submit_ocr_and_preview() ├── [GAP] Oracle ORA-00001 (duplicate) → translated
|
||
│ ├── [GAP] OSError on download (FINDING-ERR-1) ├── [GAP] Sesiune expirată după 10 min → cancel msg
|
||
│ ├── [GAP] submit_ocr_job aiosqlite.Error └── [GAP] Double-press Confirm → "Sesiune expirată"
|
||
│ ├── [GAP] poll → timeout
|
||
│ ├── [GAP] poll → success [+] Boundary states
|
||
│ ├── [GAP] poll → success but low confidence ├── [GAP] OCR confidence 0.0 → warning shown
|
||
│ └── [GAP] poll → result has None fields ├── [GAP] OCR all fields None → N/A in preview
|
||
│ ├── [GAP] amount = 0 → still saves (or rejects?)
|
||
├── _format_receipt_preview() ├── [GAP] receipt_date None → defaults to today
|
||
│ ├── [GAP] all fields populated └── [GAP] CUI not in CUI_TO_CONT → fallback "6028"
|
||
│ ├── [GAP] confidence < 0.60 → ⚠ warning
|
||
│ └── [GAP] all fields None → N/A everywhere
|
||
│
|
||
├── _confidence_warning() [+] Concurrency
|
||
│ ├── [GAP] confidence 0.50 → returns warning string └── [GAP] [→E2E] 2 users send receipts simultaneously
|
||
│ └── [GAP] confidence 0.85 → returns None (validates concurrent_updates=True)
|
||
│
|
||
├── _build_oracle_write_config()
|
||
│ ├── [GAP] server_id present, write_user in config
|
||
│ ├── [GAP] server_id present, no write_user → derive _AUTO
|
||
│ ├── [GAP] server_id None → fallback default server
|
||
│ ├── [GAP] [FINDING-ERR-2] server_id unknown → fallback to default
|
||
│ └── [GAP] secrets file missing → empty password
|
||
│
|
||
├── _save_to_oracle() [→E2E with real Oracle dev DB:]
|
||
│ ├── [GAP] happy path └── tests/e2e/test_receipt_oracle_write.py
|
||
│ ├── [GAP] CUI not in mapping → log + fallback
|
||
│ ├── [GAP] receipt_date None → datetime.now()
|
||
│ ├── [GAP] receipt_date malformed → try/except
|
||
│ ├── [GAP] amount/tva None → 0
|
||
│ └── [GAP] PACK_CONTAFIN raises → DatabaseError up
|
||
│
|
||
├── _format_oracle_error() [FINDING-ERR-3]
|
||
│ ├── [GAP] ORA-01017 → "Credențiale invalide"
|
||
│ ├── [GAP] ORA-00001 → "Bon duplicat"
|
||
│ ├── [GAP] ORA-12541 → "Bază de date inaccesibilă"
|
||
│ └── [GAP] unknown ORA → original message
|
||
│
|
||
└── handle_receipt_callback()
|
||
├── [GAP] confirm → save success
|
||
├── [GAP] confirm → save fail (DatabaseError)
|
||
├── [GAP] confirm → pending None (expired)
|
||
├── [GAP] confirm → TTL > 600s → expired
|
||
├── [GAP] cancel → file unlinked + msg
|
||
├── [GAP] double-press confirm → 2nd press gets None
|
||
└── [GAP] file already deleted at confirm → missing_ok=True
|
||
|
||
[+] bot_main.py
|
||
└── concurrent_updates=True wired [GAP] verify via integration test
|
||
|
||
[+] config.py::OracleServerConfig
|
||
└── write_user: Optional[str] field added [GAP] unit test: load config with/without write_user
|
||
|
||
COVERAGE: 0/47 paths tested (0%) — entire feature is new
|
||
QUALITY: ★★★:0 ★★:0 ★:0
|
||
GAPS: 47 total (5 E2E, 0 eval)
|
||
```
|
||
|
||
**Critical gap:** Every path is a gap because `receipt_handlers.py` is a new file. The plan must include all of these tests as part of the implementation step, not deferred.
|
||
|
||
**Test plan to add to design doc Assignment:**
|
||
|
||
1. **Unit tests** — `tests/modules/telegram/test_receipt_handlers.py`:
|
||
- `test_format_receipt_preview_*` (3 cases: full, all None, low confidence)
|
||
- `test_confidence_warning_*` (2 cases)
|
||
- `test_build_oracle_write_config_*` (5 cases incl. unknown server_id fallback)
|
||
- `test_format_oracle_error_*` (4 ORA codes)
|
||
- `test_save_to_oracle_*` (mock oracledb, 6 cases)
|
||
- `test_handle_receipt_callback_*` (7 cases incl. expired, double-press, cancel)
|
||
- `test_handle_document_message_*` (5 cases)
|
||
- `test_handle_photo_message_*` (3 cases)
|
||
|
||
2. **E2E tests** — `tests/e2e/test_receipt_telegram_flow.py` (using PTB test bot or mocks):
|
||
- PDF send → preview → confirm
|
||
- Photo send → preview → cancel
|
||
- 2 simultaneous users (validates concurrent_updates=True)
|
||
|
||
3. **Oracle integration test** — `tests/integration/test_oracle_receipt_writer.py` (after extracting per FINDING-ARCH-1):
|
||
- Real dev Oracle DB write + rollback (or use a transaction wrap)
|
||
|
||
**Test plan artifact:**
|
||
|
||
```markdown
|
||
# Test Plan: Telegram Bonuri Fiscale
|
||
Generated by /plan-eng-review on 2026-05-05
|
||
Branch: feature/telegram-bonuri-fiscale (roa2web)
|
||
Repo: romfast/roa2web
|
||
|
||
## Affected Pages/Routes
|
||
- Telegram bot: send PDF/JPG/PNG document → preview message
|
||
- Telegram bot: send photo → preview message
|
||
- Telegram inline button: ✅ Confirmă → Oracle write
|
||
- Telegram inline button: ❌ Anulează → cleanup
|
||
|
||
## Key Interactions to Verify
|
||
- Send PDF receipt → OCR processing message → preview with extracted data → Confirmă button works → Oracle write succeeds → success message with COD
|
||
- Send photo receipt → same flow with JPEG
|
||
- Send unsupported file (.docx) → reject message
|
||
- Press Anulează → cleanup message, file deleted from /tmp
|
||
|
||
## Edge Cases
|
||
- Send 2nd file while pending exists → "Ai un bon în așteptare" message
|
||
- Wait > 10 minutes after preview, then press Confirmă → "Sesiune expirată"
|
||
- Double-press Confirmă rapidly → 1 save, 2nd press shows "Sesiune expirată"
|
||
- OCR confidence < 60% → ⚠ warning at top of preview
|
||
- OCR returns all-None fields → preview shows N/A everywhere, no crash
|
||
- Oracle ORA-01017 (bad password) → user sees "Credențiale invalide"
|
||
|
||
## Critical Paths
|
||
- Auth user → submit PDF → confirm → verify ACT_TEMP row in Oracle dev DB
|
||
- 2 users submit photo simultaneously → both get preview within 30s (validates concurrent_updates)
|
||
```
|
||
|
||
I'll write this to `~/.gstack/projects/romfast-echo-core/moltbot-master-eng-review-test-plan-20260505.md` after the review completes.
|
||
|
||
**STOP** — Test review complete. 47 gaps, all mandatory. Going with recommendation: write all 47 tests as part of implementation (not deferred).
|
||
|
||
---
|
||
|
||
### Section 4 — Performance Review
|
||
|
||
**FINDING-PERF-1** `[P3] (confidence: 7/10)` Oracle connection per save. `_save_to_oracle` calls `oracledb.connect()` on every confirm. roa2web has an existing Oracle connection pool — `backend/modules/data_entry/db/oracle_pool.py` per the recent commits.
|
||
|
||
Recommendation: use the existing pool. `from backend.modules.data_entry.db.oracle_pool import get_pool`. If the pool is per-server-id, pass the server_id. Saves ~100-300ms per receipt (TCP handshake + auth round-trip). For a single user at human pace this is barely noticeable, but it's the existing pattern — don't reinvent.
|
||
|
||
[Layer 1] — use the built-in pool, don't roll your own.
|
||
|
||
**FINDING-PERF-2** `[P4] (confidence: 6/10)` `asyncio.sleep(0.3)` × 400 = 400 wakeups for a single OCR job. Each wakeup re-queries SQLite. With 4 simultaneous users × 400 wakeups = 1600 SQLite queries per minute. Not catastrophic, but `wait_for_result` could use a longer poll interval (1.0s) with no perceptible UX delta. 120 wakeups instead of 400.
|
||
|
||
Recommendation: change `OCR_POLL_INTERVAL_S = 1.0`. User won't notice — OCR results don't arrive faster than ~5s anyway.
|
||
|
||
**FINDING-PERF-3** `[P4] (confidence: 5/10)` Memory: `pending_receipt` dict in `context.user_data` holds the full OCR result + file path. For 100 active users this is ~50KB. Negligible. No issue.
|
||
|
||
**FINDING-PERF-4** `[P3] (confidence: 7/10)` Caching: `_build_oracle_write_config` reads the secrets file from disk on every confirm. Cache it in module-level dict at first read.
|
||
|
||
Recommendation:
|
||
```python
|
||
_PASSWORD_CACHE: dict[str, str] = {}
|
||
|
||
def _read_oracle_password(server_id: str) -> str:
|
||
if server_id not in _PASSWORD_CACHE:
|
||
pass_file = PROJECT_ROOT / "secrets" / f"{server_id}_write.oracle_pass"
|
||
_PASSWORD_CACHE[server_id] = pass_file.read_text().strip() if pass_file.exists() else ""
|
||
return _PASSWORD_CACHE[server_id]
|
||
```
|
||
|
||
Saves a disk read per confirm. Trivial. Maps to preference: "explicit > clever, but cache cheap reads."
|
||
|
||
**STOP** — 4 perf findings. ACCEPT FINDING-PERF-1 (use pool) and FINDING-PERF-2 (1s poll). FINDING-PERF-3 = no action. FINDING-PERF-4 = ACCEPT (trivial).
|
||
|
||
---
|
||
|
||
### Outside Voice — skip
|
||
|
||
Codex availability check + 2-minute outside voice run is valuable for big architecture decisions. This plan is contained: 1 new file, 4 file touches, well-trodden patterns. Skipping.
|
||
|
||
---
|
||
|
||
### Failure Modes Summary
|
||
|
||
| Codepath | Failure | Test? | Error handling? | Silent? |
|
||
|---|---|---|---|---|
|
||
| `_download_file` OSError (/tmp full) | unlikely but real | ✗ | ✗ (FINDING-ERR-1) | YES — **CRITICAL GAP** |
|
||
| `_build_oracle_write_config` unknown server_id | possible after server delete | ✗ | ✗ (FINDING-ERR-2) | YES — **CRITICAL GAP** |
|
||
| `_save_to_oracle` ORA-01017 | wrong password file | ✗ | ✓ (raw msg) | NO (visible to user) |
|
||
| OCR worker process down | bot deployed without worker | ✗ | ✓ (timeout) | NO (timeout msg) |
|
||
| `pending_receipt` race (rapid double-confirm) | reasonable scenario | ✗ | ✓ (pop returns None) | NO ✅ |
|
||
| `secrets/` file missing | first deploy mistake | ✗ | Partial (empty pwd → ORA-01017) | NO (visible) |
|
||
| `receipt_date` malformed in OCR result | model output drift | ✗ | ✗ (FINDING-DATA-1) | YES — **CRITICAL GAP** |
|
||
|
||
3 critical gaps. All 3 are addressed by CEO findings already accepted into the plan.
|
||
|
||
---
|
||
|
||
### Worktree Parallelization
|
||
|
||
Sequential implementation, no parallelization opportunity. The plan is one feature in one file with cascading dependencies (tests depend on impl, bot_main wiring depends on receipt_handlers.py existing). No independent workstreams. Single worktree, single agent.
|
||
|
||
---
|
||
|
||
### NOT in Scope
|
||
|
||
| Item | Rationale |
|
||
|---|---|
|
||
| WhatsApp parity for the same flow | Separate channel, separate PR. Architecture #1 (extract `oracle_receipt_writer.py`) makes this easier later. |
|
||
| `oracle_receipt_writer.py` extraction | RECOMMEND DOING IT — see FINDING-ARCH-1. If user defers: explicitly list as TODO. |
|
||
| `ocr_client.py` move from scripts/ to modules/ | RECOMMEND DOING IT — FINDING-ARCH-2. If user defers: TODO. |
|
||
| "View in ERP" deep link after save | CEO cherry-pick #4 — needs URL pattern confirmation, defer post-validation |
|
||
| Unify write path through `receipts.py` API | CEO cherry-pick #5 — too much scope (JWT auth for service account, SQLAlchemy session) |
|
||
| Multi-company `CUI_TO_CONT` mapping in DB | Deferred from existing whatsapp_import. Separate refactor. |
|
||
| Web push notification when receipt saved | Out of scope — would couple to web SSE infra |
|
||
|
||
---
|
||
|
||
### What Already Exists
|
||
|
||
| Already in repo | Plan reuses correctly? |
|
||
|---|---|
|
||
| `ocr_client.py::submit_ocr_job` / `wait_for_result` | YES (with path-fix preconditions) |
|
||
| `process_v2.py` PACK_CONTAFIN logic | Adapts pattern (recommend extracting per ARCH-1) |
|
||
| `get_user_auth_data()` | YES |
|
||
| `OracleServerConfig` model | Extended with `write_user` field |
|
||
| `secrets/<server_id>.oracle_pass` convention | Followed for `_write.oracle_pass` |
|
||
| `tests/fixtures/ocr-samples/` (pre-loaded PDFs) | Available for E2E tests |
|
||
| `oracle_pool.py` connection pool | NOT in plan — recommend adding (FINDING-PERF-1) |
|
||
| Existing handler order pattern in `bot_main.py` | YES — receipt callbacks before catch-all |
|
||
|
||
---
|
||
|
||
### TODOS.md Proposed Items
|
||
|
||
(Since no TODOS.md exists in roa2web — these are deferral notes for the design doc Open Questions section.)
|
||
|
||
1. **Persist `oracle_server_id` in `telegram_users` table** — Already accepted as cherry-pick #3 in CEO review, included in plan.
|
||
2. **Extract `oracle_receipt_writer.py` shared helper** — FINDING-ARCH-1. Going with recommendation: include in this PR.
|
||
3. **Move `ocr_client.py` from scripts/ to modules/** — FINDING-ARCH-2. Going with recommendation: include in this PR.
|
||
4. **"View in ERP" deep link after save** — CEO cherry-pick #4 deferred.
|
||
5. **Unify Telegram write through `receipts.py` API endpoint** — CEO cherry-pick #5 deferred.
|
||
6. **Multi-company CUI→cont mapping in DB or config** — already noted in design doc's Open Question #4.
|
||
|
||
---
|
||
|
||
### Updated Assignment (revised checklist for implementation)
|
||
|
||
```
|
||
1. git checkout -b feature/telegram-bonuri-fiscale (in roa2web)
|
||
2. Move ocr_client.py: scripts/whatsapp_import/ → modules/data_entry/services/ocr/queue_client.py
|
||
Update imports in process_v2.py + test_ocr_simple.py
|
||
Replace hardcoded QUEUE_DIR with Path(__file__).parents[3] / "data/ocr_queue"
|
||
3. Extract oracle_receipt_writer.py:
|
||
Create modules/data_entry/services/oracle_receipt_writer.py with write_receipt(receipt_dict, oracle_cfg) -> tuple[int, str]
|
||
Adapt process_v2.py to call it
|
||
4. Add backend/config.py: write_user: Optional[str] = None to OracleServerConfig
|
||
5. Migration: alembic revision --autogenerate -m "add oracle_server_id to telegram_users"
|
||
Modify link_user_to_oracle to accept + store server_id
|
||
Modify get_user_auth_data to return server_id
|
||
6. Implement receipt_handlers.py:
|
||
- Module constants (FINDING-CQ-2)
|
||
- _format_receipt_preview + _confidence_warning (FINDING-CQ-1)
|
||
- _build_oracle_write_config with None-server fallback (FINDING-ERR-2)
|
||
- _format_oracle_error (FINDING-ERR-3)
|
||
- Caching for password file (FINDING-PERF-4)
|
||
- _save_to_oracle delegates to oracle_receipt_writer
|
||
- handle_document/photo/callback with proper error handling (FINDING-CQ-3)
|
||
- Date parse with try/except (FINDING-DATA-1)
|
||
- OSError handling around download (FINDING-ERR-1)
|
||
- Duplicate pending guard (cherry-pick #2)
|
||
- Low-confidence warning in preview (cherry-pick #1)
|
||
- Structured logging at each step (FINDING-OBS-1)
|
||
7. Wire bot_main.py:
|
||
- Add .concurrent_updates(True) to Application.builder() (FINDING-ARCH-3)
|
||
- Register 3 new handlers BEFORE catch-all CallbackQueryHandler
|
||
- Add startup cleanup scan: glob /tmp/receipt_*.* and unlink at start
|
||
8. Create secrets/<server_id>_write.oracle_pass on server (deploy step, not in git)
|
||
9. Verify .gitignore covers secrets/*.oracle_pass (FINDING-SEC-1)
|
||
10. Write all 47 unit tests per Section 3 coverage diagram
|
||
11. Write 3 E2E tests (PDF flow, photo flow, concurrent users)
|
||
12. Write Oracle integration test against dev DB
|
||
13. End-to-end manual test with real bon → confirm in Oracle ACT_TEMP
|
||
14. Verify on production-like setup that OCR worker is running before deploy
|
||
```
|
||
|
||
15 steps. Up from 9. Every addition is grounded in a finding.
|
||
|
||
---
|
||
|
||
### Completion Summary
|
||
|
||
- **Step 0: Scope Challenge** — Scope accepted as-is (5 file touches, no over-build)
|
||
- **Architecture Review** — 3 issues found (1 critical: concurrent_updates; 2 cleanups: extract oracle writer + move ocr_client)
|
||
- **Code Quality Review** — 4 issues found (split format/warn, magic numbers, except Exception, dataclass deferred)
|
||
- **Test Review** — diagram produced, 47 gaps identified (entire feature is new)
|
||
- **Performance Review** — 4 issues found (use existing pool, slower poll, password cache, memory N/A)
|
||
- **NOT in scope** — written (7 items)
|
||
- **What already exists** — written (8 items)
|
||
- **TODOS.md updates** — 6 items, 3 included in PR, 3 deferred
|
||
- **Failure modes** — 3 critical gaps flagged, all addressed by accepted findings
|
||
- **Outside voice** — skipped (contained scope, no architectural ambiguity)
|
||
- **Parallelization** — sequential, no opportunity
|
||
- **Lake Score** — 11/13 recommendations chose complete option (2 deferred to follow-up PRs, both with explicit rationale)
|
||
|
||
**Unresolved decisions:** None. User's "merge cu recomandările" applies to all findings. All ACCEPTed unless explicitly deferred.
|
||
|
||
---
|
||
|
||
### Review Log
|