feat(5.6): observability + application journal + blocked-submission lifecycle

Implements PRD 5.6 in full (14 stories, TDD). Two axes:

Blocked-submission lifecycle (Wave A):
- submissions_admin.py: scoped delete/requeue (cross-account 404 before state 409)
- dedup reactivation over `error` with CAS (WHERE id=? AND status='error'), new creds in
  submissions + accounts.rar_creds_enc; worker invalidates the RAR session on fresh creds
  (a stale 30h JWT no longer sends with the wrong password); additive field `reactivated:true`
- 30d retention for blocked rows; purge_expired excludes queued/sending; purge_after cleared
  on reactivation/requeue
- API DELETE /v1/prezentari/{id} + /repune (200+JSON); UI buttons + bulk + actionable banner

Observability:
- app/observ.py log_event: dual channel app_events (DB) + per-process RotatingFileHandler,
  creds/PII redaction at write time (redact_pii/vin_partial)
- request_id middleware + X-Request-ID on all responses
- global exception handler -> 500 6-key envelope + request_id (traceback only in the journal)
- API request audit (api_prezentari/api_auth_esuat) + worker audit (rar_login/transitions)
- filterable, scoped "Jurnal" tab (non-admins see only their own account); 90d journal retention
- rar_error exposed in GET /v1/prezentari/{id} (observable recovery)

pytest -q: 741 passed, 0 failed. Docs: PRD VERIFY report, contract for the new endpoints, ROADMAP.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude Agent
2026-06-23 18:45:39 +00:00
parent f48346de5c
commit c842e3352a
40 changed files with 2851 additions and 64 deletions



@@ -409,6 +409,27 @@ New endpoints:
- `POST /v1/mapari` `{account_id?, cod_op_service, cod_prestatie, auto_send}`: upsert a mapping + re-resolve. Rejects a `cod_prestatie` that does not exist in the nomenclature (422).
- Web: `GET /_fragments/mapari` (HTMX editor), `POST /mapari` (form, saves + re-renders).
### Blocked-submission lifecycle (PRD 5.6)
`POST /v1/prezentari`: **additive** field in every `SubmissionResult`: `reactivated: bool`.
On a resubmit with the same content key over an `error` row (e.g. a corrected RAR password),
the row is RE-ACTIVATED (re-classified + creds updated) and the response carries
`reactivated: true` + the new state. `deduped` keeps its current semantics (old clients
that test `deduped` do not break). For `sent`/`queued`/`sending`/`needs_*` ->
`deduped: true` (unchanged).
- `DELETE /v1/prezentari/{id}`: deletes a blocked submission belonging to the API key's account.
**200 + JSON body** `{ok, submission_id, status_anterior}` (NOT 204: VFP clients string-parse).
Scope is evaluated BEFORE state: cross-account / nonexistent -> **404** (same message, B3);
own-account `sent`/`sending` -> **409** (state conflict).
- `POST /v1/prezentari/{id}/repune`: re-queues the row (`error -> queued`, re-runs classify).
**200 + JSON body** `{ok, submission_id, status_anterior, status_nou}`. Same scope-before-state rule.
- `GET /v1/prezentari/{id}` NOW also exposes `rar_error` (T9): recovery is observable through the API
(why it failed); it contains only RAR validation codes/messages, never creds.
Web (dashboard, session-scoped + CSRF): `POST /trimitere/{id}/sterge`,
`POST /trimitere/{id}/repune`, `POST /trimiteri/sterge-bulk` (multi-select, blocked rows only).
Fuzzy: `rapidfuzz.token_sort_ratio` on the normalized name (no diacritics, upper-cased).
The nomenclature is fetched **live** from RAR (worker upsert on every login); an 18-code
seed fallback at boot (`app/nomenclator_seed.py`) lets the editor work offline.


@@ -1,6 +1,7 @@
<!-- /autoplan restore point: /home/claude/.gstack/projects/romfast-rar-autopass/main-autoplan-restore-20260623-165442.md -->
# PRD 5.6: Observability, application journal & blocked-submission lifecycle
**Status**: approved (§5 decisions resolved 2026-06-23)
**Status**: approved + full /autoplan review (4 taste decisions resolved 2026-06-23; see the /autoplan Appendix)
> Full process: `docs/ROADMAP.md` §5. RAR contract (source of truth): `docs/api-rar-contract.md`.
> Error catalog (source of truth for codes): `app/errors.py` (PRD 5.4). Creds redaction: `app/security.py`.
@@ -262,8 +263,11 @@ the format, redaction and dual channel (DB + file) be consistent and impossib
- **Test first (RED)**: `tests/test_api_lifecycle.py`: `test_delete_scoped_pe_cheie`,
`test_delete_sent_403`, `test_repune_error_queued`, `test_repune_inexistent_404`
- **Acceptance criteria**:
- [ ] `DELETE /v1/prezentari/{id}` → 200/204 on non-sent rows of the key's account;
403 on `sent`/`sending`; 404 cross-account/nonexistent (same message, as in B3).
- [ ] `DELETE /v1/prezentari/{id}` → **200 + JSON body** `{ok, submission_id, status_anterior}`
(NOT 204; VFP clients string-parse) on non-sent rows of the key's account.
- [ ] **Scope evaluated BEFORE state** (/autoplan decision #20): cross-account / nonexistent
→ **404** (same message, B3: we do not confirm existence); own-account `sent`/`sending`
→ **409** (state conflict). Test `test_delete_cross_account_sent_404`.
- [ ] `POST /v1/prezentari/{id}/repune` → the row becomes `queued` (via the US-009 helper).
- [ ] Strictly scoped to the API key's account (another account cannot be touched).
- **E2E verification**: with account 2's key, `POST .../15/repune` → 200; the worker re-sends it (correct creds).
@@ -302,8 +306,14 @@ password) be accepted **because** today an `error` row with the same key bl
- **Acceptance criteria**:
- [ ] On enqueue, if the existing row with the same `idempotency_key` is `error`:
the same row is RE-ACTIVATED (re-runs `classify`, **updates `rar_creds_enc`**
with the new creds from the request, resets `retry_count`/`next_attempt_at`), and the response
is NO longer `deduped: true` but the new state (e.g. `queued`).
with the new creds from the request, resets `retry_count`/`next_attempt_at`, **`purge_after=NULL`**),
and the response carries the **additive field `reactivated: true`** + the new state (e.g. `queued`);
`deduped` keeps its current semantics (/autoplan decision #19, it is NOT repurposed).
- [ ] **Reactivation is a compare-and-swap UPDATE** (`WHERE id=? AND status='error'`); if
`rowcount==0` (another POST/requeue changed the state in the meantime) -> dedup response on the current state.
The worker **invalidates the account's cached RAR session** when the claimed row carries
`rar_creds_enc != NULL` (otherwise a stale 30h JWT sends with the wrong password; see T1 in the appendix).
- [ ] New creds also propagate to **`accounts.rar_creds_enc`** (durable web channel, decision #17).
- [ ] For `sent`/`queued`/`sending`: behavior unchanged → `deduped: true`
(we create no duplicates and do not disturb in-flight/sent rows).
- [ ] `needs_data`/`needs_mapping`: stay `deduped` on resubmit (§5 decision) - the correction
@@ -413,7 +423,157 @@ Wave B: [US-010 API lifecycle] [US-011 UI lifecycle] [US-014 actionable banner]
---
## /autoplan Appendix: review report (2026-06-23)
> Generated by `/autoplan` (CEO -> Design -> Eng -> DX), commit `f48346d`, branch `main`.
> Voices: one Claude subagent per phase + Codex. **Codex UNAVAILABLE** (usage limit at runtime)
> -> all phases run `[subagent-only]`. The "app_events table + Jurnal tab" premise
> was confirmed by the user at the premise gate (vs the stdout-first alternative).
> Restore point: see the HTML comment at the top of the file.
### Consensus tables (Codex = N/A, subagent-only)
```
CEO:    1 premise flagged (substrate, CONFIRMED keep) · 3 right-problem/scope · 4 uncompared alternatives
DESIGN: 3 high (poll vs select, missing deep-link, banner->panel) · missing states
ENG:    2 CRITICAL (US-012 race+JWT stale, purge_after) · 0 concurrency tests · WAL contention
DX:     5 high (500 envelope 6 keys, 403/404 oracle, deduped breaking, docs, rar_error allowlist)
```
### Diagrams
US-012 `error` reactivation: state machine + race (required fix T1):
```
POST /v1/prezentari (same payload, corrected password)
      |
      v
SELECT status WHERE idempotency_key=? ---- error ----> UPDATE ... SET status='queued',
      |                                                  rar_creds_enc=<new>, retry=0,
      | sent/queued/sending/needs_*                      next_attempt_at=NULL,
      v                                                  purge_after=NULL
deduped:true (unchanged)                                        |
                                                                v
RACE (without CAS): worker.claim_one (BEGIN IMMEDIATE) queued->sending
RACE (JWT): AccountSessions[account_id] holds a stale token (30h) from WRONG creds
            -> sends with the old password, ignores the correction   <-- CENTRAL BUG
FIX: UPDATE ... WHERE id=? AND status='error' (CAS; rowcount 0 -> deduped) +
     on claim, if the row carries rar_creds_enc != NULL -> sessions.invalidate(account_id)
```
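The CAS step in the diagram above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, not the project's code: the table layout, the function name `reactivate_if_error` and the `demo` helper are assumptions; only the `WHERE id=? AND status='error'` guard and the reset columns come from the PRD.

```python
import sqlite3


def reactivate_if_error(conn, submission_id, new_creds_enc):
    """Compare-and-swap: flip the row to 'queued' only if it is still 'error'.

    Returns True when this call won the race (rowcount == 1); False means the
    state changed in the meantime and the caller should answer with a dedup
    response on the current state.
    """
    cur = conn.execute(
        """
        UPDATE submissions
           SET status = 'queued',
               rar_creds_enc = ?,
               retry_count = 0,
               next_attempt_at = NULL,
               purge_after = NULL
         WHERE id = ? AND status = 'error'
        """,
        (new_creds_enc, submission_id),
    )
    conn.commit()
    return cur.rowcount == 1


def demo():
    # Hypothetical, simplified schema used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE submissions (id INTEGER PRIMARY KEY, status TEXT, "
        "rar_creds_enc TEXT, retry_count INTEGER, next_attempt_at TEXT, "
        "purge_after TEXT)"
    )
    conn.execute(
        "INSERT INTO submissions VALUES "
        "(1, 'error', 'old', 3, '2026-06-24', '2026-07-23')"
    )
    first = reactivate_if_error(conn, 1, "new")     # wins: row was 'error'
    second = reactivate_if_error(conn, 1, "newer")  # loses: row is 'queued'
    row = conn.execute(
        "SELECT status, rar_creds_enc, retry_count, purge_after "
        "FROM submissions WHERE id = 1"
    ).fetchone()
    return first, second, row
```

The second call leaves the row untouched, which is the behavior the PRD asks for when two POSTs race: only one of them reactivates, the other falls back to `deduped`.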
Retention / purge (fix T2):
```
mark(sent)    -> purge_after = now + 90d   (existing)
mark(blocked) -> purge_after = now + 30d   (US-013 new; error/needs_data/needs_mapping)
reactivation/ -> purge_after = NULL        (US-009/012; OTHERWISE purged before claim)
requeue
purge_expired WHERE purge_after<now AND status IN ('sent','error','needs_data','needs_mapping')
              explicitly EXCLUDES 'queued'/'sending'
```
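A minimal sketch of the purge rule above, again with `sqlite3`. The schema, the ISO-8601 string timestamps and the `demo` helper are assumptions made for the example; the status allowlist is the one from the diagram.

```python
import sqlite3

# Only these states are purgeable; 'queued'/'sending' are excluded by omission.
PURGEABLE = ("sent", "error", "needs_data", "needs_mapping")


def purge_expired(conn, now):
    """Delete expired rows, never touching queued/sending ones."""
    placeholders = ",".join("?" for _ in PURGEABLE)
    cur = conn.execute(
        "DELETE FROM submissions "
        f"WHERE purge_after < ? AND status IN ({placeholders})",
        (now, *PURGEABLE),
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical, simplified schema; timestamps are ISO-8601 strings so that
    # plain string comparison orders them correctly.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE submissions "
        "(id INTEGER PRIMARY KEY, status TEXT, purge_after TEXT)"
    )
    conn.executemany(
        "INSERT INTO submissions VALUES (?, ?, ?)",
        [
            (1, "sent", "2026-07-01T00:00:00"),    # expired, purgeable
            (2, "error", "2026-07-15T00:00:00"),   # expired, purgeable
            (3, "queued", "2026-07-01T00:00:00"),  # stale purge_after, but queued
            (4, "error", "2026-09-01T00:00:00"),   # not yet expired
            (5, "queued", None),                   # reactivated: purge_after NULL
        ],
    )
    deleted = purge_expired(conn, "2026-08-01T00:00:00")
    remaining = [
        r[0] for r in conn.execute("SELECT id FROM submissions ORDER BY id")
    ]
    return deleted, remaining
```

Row 3 shows why the explicit status filter matters even when reactivation clears `purge_after`: a queued row with a leftover timestamp still survives the purge.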
### Failure Modes Registry (new, from review)
```
CODEPATH                          | FAILURE MODE                    | RESCUED? | TEST?  | USER SEES         | LOGGED?
----------------------------------|---------------------------------|----------|--------|-------------------|--------
create_prezentari reactivation    | race with claim_one / 2x POST   | FIX T1   | FIX T3 | queued det.       | US-004
worker JWT cache after new creds  | sends with the old password     | FIX T1   | FIX T3 | stays error       | US-005 <- CRITICAL
reactivation w/o purge_after=NULL | purged before claim             | FIX T2   | FIX T3 | silently vanishes | US-005 <- CRITICAL
log_event own-conn on hot path    | WAL write-lock up to 15s        | FIX T4   | yes    | POST latency      | -
RotatingFileHandler 2 processes   | rotation rename race            | FIX T5   | n/a    | corrupt log       | -
500 envelope 4 keys               | client parser breaks on 5xx     | FIX T7   | yes    | client KeyError   | US-001
403 sent vs 404 cross-acct        | existence oracle                | FIX TD2  | yes    | leak              | US-004
bulk select vs poll 15s           | selection wiped mid-action      | FIX T12  | yes    | frustration       | -
nonexistent status deep-link      | banner leads to unfiltered list | FIX T13  | yes    | dead-end          | -
```
### Decision Audit Trail (auto-decided with the 6 principles)
| # | Phase | Decision | Classification | Principle | Rationale |
|---|------|---------|-------------|-----------|-------------|
| 1 | CEO | Premise: app_events table + tab | GATE (user) | - | Confirmed by the user: web visibility is a product goal (operator without SSH) |
| 2 | Eng | US-012 = guarded CAS + worker session invalidation on new creds (T1) | Mechanical | P1 completeness | Central bug; without it US-012 does not achieve its goal |
| 3 | Eng | reactivation/requeue sets purge_after=NULL; purge excludes queued/sending (T2) | Mechanical | P1 | Otherwise the reactivated row is silently purged |
| 4 | Eng | concurrency + purge-before-claim tests (T3) | Mechanical | P1 well-tested | The US-012 test list was single-threaded |
| 5 | Eng | log_event(conn opt) reuse on hot path (T4) | Mechanical | P3 pragmatic | Avoids WAL contention |
| 6 | Eng | per-process logs api.log/worker.log (T5) | Mechanical | P5 explicit | RotatingFileHandler is not multiprocess-safe |
| 7 | Eng | vin_partial() + clean context (T6) | Mechanical | P1 | scrub() does not cover VIN (US-007) |
| 8 | DX | EROARE_INTERNA in CATALOG; 500 = 6 keys + request_id (T7) | Mechanical | P1 | 6-key contract (PRD 5.4) |
| 9 | DX | X-Request-ID on ALL responses (T8) | Mechanical | P1 | Correlation on 422/401/404 too |
| 10 | DX | rar_error in _PREZENTARE_FIELDS (T9) | Mechanical | P6 action | API recovery observable without the dashboard |
| 11 | DX | update api-rar-contract.md + reconcile de-scope (T10) | Mechanical | P1 | The source of truth must include the new endpoints |
| 12 | DX | DELETE -> 200+JSON, not 204 (T11) | Mechanical | P5 | Consistent with the rest of v1; VFP clients string-parse |
| 13 | Design | poll vs bulk-select resolved (T12) | Mechanical | P1 | Selection wiped every 15s = defect |
| 14 | Design | status deep-link plumbing (T13) | Mechanical | P1 | The US-014 destination does not exist today |
| 15 | Design | banner -> detail panel (T14) | Mechanical | P3 | Leads straight to the action button |
| 16 | Design | empty/loading/partial states + checkbox collision (T15) | Mechanical | P1 | State coverage = scope, not an afterthought |
| 17 | CEO | **RESOLVED: YES**, resubmit/requeue with new creds also refreshes `accounts.rar_creds_enc` (T16) | Taste | P1 | User: both channels converge on the corrected password |
| 18 | CEO | **RESOLVED: keep bundled, lifecycle (Wave A) FIRST** | Taste | P6 | User: §6 already isolates the waves; minimal overhead on an approved PRD |
| 19 | DX | **RESOLVED: additive field `reactivated:true`** (do NOT repurpose deduped) | Taste | P5 | User: backward compatibility for clients that test `deduped` |
| 20 | DX | **RESOLVED: cross-account 404 BEFORE the status check**; own-account sent/sending -> 409 | Taste(sec) | P1 | User: closes the existence oracle (B3) |
### /autoplan decisions resolved at the final gate (2026-06-23, binding for execution)
- **Bundling [#18]**: PRD 5.6 stays a single unit; the execution order puts **Wave A (lifecycle: US-009/012/013/011/014) before** observability. A single VERIFY.
- **US-012 response [#19]**: reactivating an `error` row returns the **additive field `reactivated: true`** on `SubmissionResult` (`deduped` is NOT repurposed). `deduped` keeps its current semantics; old clients do not break. Update `app/models.py` + the contract.
- **US-010 codes [#20]**: scope (cross-account) is evaluated **before** state. Cross-account / nonexistent -> **404** (same message, B3). Own-account `sent`/`sending` -> **409** (state conflict, not 403). New test `test_delete_cross_account_sent_404`.
- **US-009/012 creds [#17]**: when a resubmit/requeue brings new creds, `accounts.rar_creds_enc` (the durable web channel) is refreshed too, not only `submissions.rar_creds_enc`. Combined with the worker session invalidation (T1).
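The scope-before-state ordering from decision #20 can be illustrated as a pure function. This is a sketch under assumptions: `delete_status_code` and the dict-shaped row are hypothetical, and states other than `sent`/`sending` are treated as deletable only because the decision above does not specify them.

```python
from typing import Optional


def delete_status_code(row: Optional[dict], key_account_id: int) -> int:
    """Return the HTTP status for a DELETE, checking scope before state."""
    # 1) Scope first: a missing row and a row owned by another account are
    #    indistinguishable to the caller, so existence is never confirmed.
    if row is None or row["account_id"] != key_account_id:
        return 404
    # 2) Only then look at state: own-account rows already sent or in flight
    #    are a state conflict.
    if row["status"] in ("sent", "sending"):
        return 409
    # 3) Assumption of this sketch: every other own-account state is deletable.
    return 200
```

Swapping steps 1 and 2 would return 409 for another account's `sent` row and so reveal that the id exists, which is the oracle that `test_delete_cross_account_sent_404` guards against.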
### Implementation Tasks (auto-generated, see JSONL ~/.gstack/projects/romfast-rar-autopass/)
P1 (blocks ship): T1 (US-012 CAS+session), T2 (purge_after), T3 (concurrency tests), T4 (log_event conn),
T5 (per-process logs), T7 (500 6-key), T8 (global X-Request-ID), T9 (rar_error allowlist), T10 (contract docs),
T12 (poll vs select), T13 (deep-link).
P2 (same branch): T6 (vin_partial), T11 (DELETE 200+body), T14 (banner->panel), T15 (UI states), T16 (web creds).
### Completion Summaries
```
CEO    | premise 1 (confirmed keep) · right-problem OK (lifecycle=10x) · 1 challenge bundling · F6 web creds
DESIGN | 3 high (poll/select, deep-link, banner) · missing states · checkbox collision · AA to verify
ENG    | 2 CRITICAL (race+JWT, purge) · 0 concurrency tests · WAL contention · IDOR 404 ordering
DX     | 5 high (500 envelope, oracle, deduped, docs, rar_error) · per-state recovery matrix to document
Lake   | all auto-decisions chose the complete option (16/16 mechanical = ADD/full fix)
```
## VERIFY report
> Filled in by the verifier subagent (clean context) in the VERIFY phase; see ROADMAP §5.6.
> PASS/FAIL per criterion, with evidence (quoted pytest output, E2E on RAR test). Missing until VERIFY.
> Full execution 2026-06-23 (TDD, RED->GREEN per story). All 14 stories delivered.
### Test results
`python3 -m pytest -q` -> **741 passed, 0 failed** (~64s). Baseline before 5.6: 561 tests
(the remaining 114 "failures" at startup were an environment artifact: the live-testing `.env` has
`AUTOPASS_REQUIRE_API_KEY=true`; run with the standard test overrides, the baseline is green).
New tests added (all green):
- US-001 `tests/test_error_handler.py` (5): structured 6-key 500 + request_id, no traceback/creds.
- US-002 `tests/test_request_id.py` (4): X-Request-ID on all responses, contextvar.
- US-003 `tests/test_observ.py` (4): dual channel DB+file, redaction, level from env, best-effort.
- US-004 `tests/test_audit_api.py` (3): `api_prezentari` (count+distribution), `api_auth_esuat` (IP+prefix).
- US-005 `tests/test_worker_observ.py` (3): `rar_login` ok/failed without the password, sent/error transitions.
- US-007 `tests/test_jurnal_redactare.py` (4): password/token/VIN never in full; fuzz over sensitive keys.
- US-006 `tests/test_web_jurnal.py` (5): non-admin/admin scope, type/level/account filter, tab deep-link.
- US-008 `tests/test_jurnal_retentie.py` (5): purge_after on app_events, purging, RotatingFileHandler.
- US-009 `tests/test_submissions_admin.py` (6): scoped delete/requeue, cross-account 404, classify on requeue.
- US-010 `tests/test_api_lifecycle.py` (7): DELETE/repune 200+JSON, scope-before-state (404 vs 409).
- US-011 `tests/test_web_lifecycle.py` (7): buttons only on blocked rows, CSRF, scoped bulk.
- US-012 `tests/test_dedup_error.py` (5): reactivation over `error` + `reactivated:true`, new creds; sent/queued/needs_* stay deduped.
- US-013 `tests/test_purge_blocate.py` (5): purge_after on blocked rows (30d), purge excludes queued/sending.
- US-014 `tests/test_web_status_fragment.py` (+3): category links to the filtered list, partial identifier, scope.
### Key technical fixes (from /autoplan)
- **T1 (CRITICAL)**: reactivation is a compare-and-swap UPDATE (`WHERE id=? AND status='error'`);
the worker invalidates the cached RAR session when the claimed row carries `rar_creds_enc != NULL`
(a stale 30h JWT from the wrong password no longer sends). New creds also propagate to `accounts.rar_creds_enc`.
- **T2**: reactivation/requeue sets `purge_after=NULL`; `purge_expired` explicitly excludes `queued`/`sending`.
- **T7**: 500 = 6-key envelope (catalog) + `request_id`. **T8**: X-Request-ID on ALL responses (middleware).
- **T9**: `rar_error` in the `GET /v1/prezentari/{id}` allowlist (observable recovery; old test updated).
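To illustrate the request-id correlation in T8, here is a framework-free sketch built on the standard-library `contextvars` and `uuid` modules. The helper names `begin_request` and `with_request_id` are hypothetical, and the actual middleware in the project may be wired differently; the only detail taken from the report is that the id lives in a contextvar and is echoed as `X-Request-ID` on every response.

```python
import contextvars
import uuid

# Holds the id of the request currently being served in this context.
_request_id = contextvars.ContextVar("request_id", default=None)


def begin_request() -> str:
    """Generate a fresh id at the start of a request and remember it."""
    rid = uuid.uuid4().hex
    _request_id.set(rid)
    return rid


def with_request_id(headers: dict) -> dict:
    """Return a copy of the response headers with X-Request-ID attached.

    Runs for every response (2xx, 4xx and 5xx alike), so a client can quote
    the id and the operator can find the matching journal entries.
    """
    out = dict(headers)
    out["X-Request-ID"] = _request_id.get() or uuid.uuid4().hex
    return out
```

Because the id sits in a context variable, code deeper in the call stack, such as a logging helper, can read it without the id being passed through every function signature.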
### Notes
- Tests changed intentionally (behavior changed by the PRD): `test_t16_purjare` (error now gets
purge_after, US-013), `test_get_scope_prezentari` (`rar_error` now exposed, T9).
- Live E2E on RAR test: NOT EXERCISED in this session (requires RAR test creds + `--send`). The sending
backend's logic is untouched; the worker changes are additive (events + session invalidation on new creds).
Recommended at deploy: one `--send` submission to confirm `rar_login` ok + `submission_sent` in the journal.