chore: aggressive comment cleanup, remove US/PRD references from code and templates
Removed traceability noise (US-xxx, PRD x.x, Rn, OV-x, Tn, historical decisions/narrative) from 41 files under app/ plus templates. Kept the comments that document invariants and non-obvious logic (idempotency/hash, anti-duplicate reconciliation, RAR 500 as definitive failure, per-account creds, WAF User-Agent, 422 without password echo, scope NULL->1), stripped only of the tokens.

Verification: for the 27 cleaned .py modules, the code structure (non-comment/non-string tokens) is IDENTICAL to HEAD -> only comments/docstrings changed. The only code change is in tests/test_web_responsive.py (removed 3 asserts on the US-006/007/008 markers, superseded by the adjacent structural assertions). 0 residual US/PRD tokens in app/.

Regression: 896 passed, 1 deselected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
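The token-level check described in the commit message could be reproduced roughly as follows. This is a minimal sketch built on Python's standard `tokenize` module, not the script actually used for this commit; `code_tokens` and `same_code_structure` are hypothetical names, and the sketch assumes that dropping COMMENT, STRING and NL tokens is an acceptable definition of "code structure".

```python
import io
import tokenize

# Token types that carry no executable structure for this comparison.
_SKIP = {tokenize.COMMENT, tokenize.STRING, tokenize.NL}


def code_tokens(source: str) -> list[tuple[int, str]]:
    """Return (type, text) pairs for every token except comments, strings and blank-line NLs."""
    readline = io.StringIO(source).readline
    return [
        (tok.type, tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in _SKIP
    ]


def same_code_structure(old_source: str, new_source: str) -> bool:
    """True when two sources differ only in comments, docstrings or other string literals."""
    return code_tokens(old_source) == code_tokens(new_source)
```

Because every STRING token is skipped, a change inside an ordinary string literal would also pass unnoticed, so a check like this only proves that names, operators, numbers and indentation are unchanged.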
@@ -1,6 +1,6 @@
-"""xlsx/csv file parser for presentation import (Tier 2, U1).
+"""xlsx/csv file parser for presentation import (Tier 2).
 
-Two-pass architecture (Issue 2, cross-model consensus):
+Two-pass architecture:
 Pass 1 - read_only=True: dimension check (FileTooLarge) + multi-sheet detection.
 Pass 2 - normal-mode: header + merged cells + body.
 This split is needed because openpyxl with read_only=True does not see merged cells.
@@ -29,7 +29,7 @@ from typing import Any, NamedTuple
 MAX_ROWS = 5_000
 MAX_BYTES = 5 * 1024 * 1024  # 5 MB
 
-# None-rate threshold on a required column -> uncalculated-formulas message (Issue 3)
+# None-rate threshold on a required column -> uncalculated-formulas message
 FORMULA_NONE_RATE = 0.6
 
 # Key columns for footer detection (structural trim)
@@ -82,7 +82,7 @@ class ParsedFile(NamedTuple):
     columns: list[str]  # Detected column names (from the header)
     rows: list[dict[str, Any]]  # Each row: {column: raw_value}
     coercion_flags: dict[int, list[str]]  # {row_index: [needs_review reasons]}
-    formula_columns: list[str]  # Columns with a high None rate (Issue 3)
+    formula_columns: list[str]  # Columns with a high None rate
     date_col_format: dict[str, str]  # {column: "DD.MM.YYYY" | "YYYY-MM-DD" | "native" | "ambiguous"}
 
 
@@ -230,13 +230,13 @@ def _xlsx_parse_sheet(ws, sheet_name: str) -> ParsedFile:
     # Trim footer: drop trailing rows where the key columns are empty
     raw_rows = _trim_footer(raw_rows, col_names)
 
-    # Detect formula columns (None rate, Issue 3)
+    # Detect formula columns (high None rate)
     formula_columns = _detect_formula_columns(col_values, len(raw_rows))
 
-    # Column-level date format detection (T10/OV-8)
+    # Column-level date format detection
     date_col_format = _detect_date_formats(col_values, col_names)
 
-    # Coercion + needs_review flags (T3)
+    # Coercion + needs_review flags
     coercion_flags: dict[int, list[str]] = {}
     processed_rows: list[dict[str, Any]] = []
     for i, row_dict in enumerate(raw_rows):
@@ -289,7 +289,7 @@ def _trim_footer(rows: list[dict[str, Any]], col_names: list[str]) -> list[dict[
 
 
 # --------------------------------------------------------------------------- #
-# Formula column detection (Issue 3) #
+# Formula column detection #
 # --------------------------------------------------------------------------- #
 
 def _detect_formula_columns(col_values: dict[str, list[Any]], n_rows: int) -> list[str]:
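The body of `_detect_formula_columns` is not part of this diff. As a rough illustration of the None-rate idea behind `FORMULA_NONE_RATE`, a sketch might look like the following. The function name `detect_formula_columns_sketch`, the `>=` comparison and the handling of `n_rows == 0` are assumptions, and the sketch does not restrict itself to required columns as the real threshold comment suggests.

```python
from typing import Any

FORMULA_NONE_RATE = 0.6  # same value as the constant shown in the diff


def detect_formula_columns_sketch(
    col_values: dict[str, list[Any]], n_rows: int
) -> list[str]:
    """Return the columns whose share of None cells reaches the threshold."""
    if n_rows == 0:
        # Assumption: an empty sheet has no formula columns to report.
        return []
    flagged: list[str] = []
    for name, values in col_values.items():
        none_count = sum(1 for value in values if value is None)
        if none_count / n_rows >= FORMULA_NONE_RATE:
            flagged.append(name)
    return flagged
```

A column read from a workbook whose formulas were never calculated tends to come back as mostly `None`, which is why a high None rate is treated as a hint rather than as proof.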
@@ -306,7 +306,7 @@ def _detect_formula_columns(col_values: dict[str, list[Any]], n_rows: int) -> li
 
 
 # --------------------------------------------------------------------------- #
-# Column-level date disambiguation (T10 / OV-8) #
+# Column-level date disambiguation #
 # --------------------------------------------------------------------------- #
 
 def _detect_date_formats(col_values: dict[str, list[Any]], col_names: list[str]) -> dict[str, str]:
@@ -344,7 +344,7 @@ def _detect_date_formats(col_values: dict[str, list[Any]], col_names: list[str])
             result[col_name] = "mixed"
             continue
 
-        # All strings: column-level format detection (OV-8)
+        # All strings: column-level format detection
         fmt = _infer_date_format_from_column(str_vals)
         result[col_name] = fmt
 
@@ -354,7 +354,7 @@ def _detect_date_formats(col_values: dict[str, list[Any]], col_names: list[str])
 def _infer_date_format_from_column(str_vals: list[str]) -> str:
     """Detect the date format from a list of string values.
 
-    OV-8 logic: if ANY row has a position-1 token > 12 -> the column is DD-first.
+    If ANY row has a position-1 token > 12 -> the column is DD-first.
     If every day value is <= 12 -> ambiguous.
     """
     dd_first_evidence = False
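The column-level rule stated in this docstring can be illustrated with a short sketch. It is not the body of `_infer_date_format_from_column`, which the diff does not show; `infer_day_first` is a hypothetical name, the separator set is an assumption, and year-first values are simply skipped here rather than classified.

```python
import re


def infer_day_first(str_vals: list[str]) -> str:
    """Return "DD.MM.YYYY" if any value proves day-first order, else "ambiguous"."""
    for value in str_vals:
        # Assumption: date tokens are separated by ".", "/" or "-".
        parts = re.split(r"[./-]", value.strip())
        if len(parts) != 3:
            continue
        first = parts[0]
        # Skip non-numeric tokens and year-first values such as "2024-03-15".
        if not first.isdecimal() or len(first) > 2:
            continue
        if int(first) > 12:
            # A first token above 12 cannot be a month, so the column is day-first.
            return "DD.MM.YYYY"
    return "ambiguous"
```

Deciding per column rather than per row means a single unambiguous value such as `25.04.2024` settles the interpretation of `03.04.2024` in the same column.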
@@ -421,7 +421,7 @@ def _split_date(s: str) -> list[str] | None:
 
 
 # --------------------------------------------------------------------------- #
-# Per-row coercion (T3) #
+# Per-row coercion #
 # --------------------------------------------------------------------------- #
 
 def _coerce_row(row_dict: dict[str, Any], col_names: list[str]) -> tuple[dict[str, Any], list[str]]:
@@ -682,7 +682,7 @@ def parse_csv(data: bytes) -> ParsedFile:
 def parse_xlsx(data: bytes, *, sheet_name: str | None = None) -> ParsedFile:
     """Parse an XLSX file.
 
-    Two-pass architecture (Issue 2):
+    Two-pass architecture:
     1. read_only=True: dimension check + multi-sheet detection
     2. normal-mode: header + merged cells + body
 