diff --git a/tests/ocr-validation/BENCHMARK_RESULTS.md b/tests/ocr-validation/BENCHMARK_RESULTS.md
deleted file mode 100644
index 3caf300..0000000
--- a/tests/ocr-validation/BENCHMARK_RESULTS.md
+++ /dev/null
@@ -1,106 +0,0 @@
-# OCR Test Results - docTR+ Engine
-
-**Date:** 2026-01-02 | **Receipts:** 26 | **Test:** Sequential
-
-## Summary Comparison
-
-| Workers | Avg | Total | Mem Used | Mem Avail |
-|---------|-----|-------|----------|-----------|
-| 1 | 6.8s | 176s | 3.2GB | 4.1GB |
-| 2 | 7.2s | 187s | 3.1GB | 4.1GB |
-| 3 | 6.8s | 176s | 3.9GB | 3.3GB |
-
-**Success Rate:** 80.8% (21/26) - same for all configs
-
-**Note:** For sequential tests, 1 worker ≈ 3 workers speed!
-Multiple workers only help with parallel requests.
-
-## Detailed Results (1 Worker)
-
-| # | Receipt | Time | Tier | Result | Notes |
-|---|---------|------|------|--------|-------|
-| 01 | abonament kineterra | 6.8s | T1 | ✓ | 97% |
-| 02 | benzina 14 august | 6.0s | T1 | ✓ | 83% |
-| 03 | benzina 27 octombrie | 5.9s | T1 | ✓ | 83% |
-| 04 | igiena 11 octombrie | 7.7s | T1 | ✓ | 97% |
-| 05 | igiena 14 dec five-holding | 11.5s | T1+T2 | ✗ | TOTAL ±1 |
-| 06 | rechizite 12 dec pictus | 5.9s | T1 | ✓ | 97% |
-| 07 | benzina 10 mai 2025 | 5.1s | T1 | ✓ | 83% |
-| 08 | brick consumabil 604 50% | 4.8s | T1 | ✓ | 97% |
-| 09 | benzina 13 septembrie | 4.9s | T1 | ✓ | 83% |
-| 10 | brick consumabile 604 | 5.3s | T1 | ✓ | 97% |
-| 11 | benzina 20 dec | 5.8s | T1 | ✓ | 79% |
-| 12 | bon fiscal Dedeman | 5.7s | T1 | ✓ | 90% |
-| 13 | factura Dedeman | 6.8s | T1 | ✓ | 97% |
-| 14 | benzina 13 iulie | 5.7s | T1 | ✓ | 95% |
-| 15 | best print stampila | 4.5s | T1 | ✓ | 94% |
-| 16 | electrobering telecomanda | 4.8s | T1 | ✓ | 97% |
-| 17 | brick igiena 8 oct | 11.9s | T1+T2 | ✗ | TOTAL/CUI |
-| 18 | gama ink refill toner | 5.9s | T1 | ✓ | 94% |
-| 19 | kineterra fizioterapie | 4.6s | T1 | ✓ | 97% |
-| 20 | brick igiena 1 sept | 12.5s | T1+T2 | ✗ | ALL None |
-| 21 | kineterra abonament | 5.6s | T1 | ✓ | 97% |
-| 22 | brick igiena electrice | 15.9s | T1+T2 | ✗ | DATE None |
-| 23 | electrobering igiena | 4.4s | T1 | ✓ | 97% |
-| 24 | Lidl papetarie 604 | 5.8s | T1 | ✓ | 87% |
-| 25 | brick igiena 604 | 6.8s | T1 | ✗ | DATE ±1 |
-| 26 | unlimited duplicat | 4.8s | T1 | ✓ | 86% |
-
-## Time Comparison by Receipt
-
-| # | Receipt | 1W | 2W | 3W |
-|---|---------|----|----|-----|
-| 01 | abonament kineterra | 6.8s | 6.7s | 5.8s |
-| 02 | benzina 14 august | 6.0s | 5.5s | 5.8s |
-| 03 | benzina 27 octombrie | 5.9s | 5.9s | 5.7s |
-| 04 | igiena 11 octombrie | 7.7s | 8.9s | 7.4s |
-| 05 | igiena 14 dec (FAIL) | 11.5s | 12.3s | 11.9s |
-| 06 | rechizite pictus | 5.9s | 5.9s | 5.7s |
-| 07 | benzina 10 mai | 5.1s | 6.0s | 5.8s |
-| 08 | brick 50% | 4.8s | 5.9s | 5.5s |
-| 09 | benzina 13 sept | 4.9s | 5.9s | 5.3s |
-| 10 | brick consumabile | 5.3s | 5.7s | 5.7s |
-| 11 | benzina 20 dec | 5.8s | 5.4s | 5.8s |
-| 12 | bon Dedeman | 5.7s | 5.9s | 5.8s |
-| 13 | factura Dedeman | 6.8s | 6.9s | 6.8s |
-| 14 | benzina 13 iulie | 5.7s | 6.1s | 5.4s |
-| 15 | best print | 4.5s | 5.8s | 4.8s |
-| 16 | electrobering | 4.8s | 4.2s | 4.7s |
-| 17 | brick 8 oct (FAIL) | 11.9s | 13.1s | 12.0s |
-| 18 | gama ink | 5.9s | 5.9s | 4.7s |
-| 19 | kineterra fizioterapie | 4.6s | 5.9s | 4.8s |
-| 20 | brick 1 sept (FAIL) | 12.5s | 13.2s | 13.1s |
-| 21 | kineterra abonament | 5.6s | 4.9s | 4.8s |
-| 22 | brick electrice (FAIL) | 15.9s | 17.0s | 15.5s |
-| 23 | electrobering igiena | 4.4s | 5.4s | 5.0s |
-| 24 | Lidl papetarie | 5.8s | 6.9s | 5.8s |
-| 25 | brick 604 (FAIL) | 6.8s | 6.5s | 6.9s |
-| 26 | unlimited duplicat | 4.8s | 5.8s | 5.0s |
-|---|---------|----|----|-----|
-| **AVG** | | **6.8s** | **7.2s** | **6.8s** |
-| **TOTAL** | | **176s** | **187s** | **176s** |
-
-## Tier Analysis
-
-- **T1 only (early exit):** 21 receipts (~5-6s)
-- **T1+T2 (full):** 5 receipts (~12-16s)
-
-## Failures (5)
-
-| Receipt | Issue | Fixable |
-|---------|-------|---------|
-| igiena 14 dec | TOTAL ±1 | No |
-| brick 8 oct | TOTAL/CUI | Maybe |
-| brick 1 sept | ALL None | No (bad doc) |
-| brick electrice | DATE None | Maybe |
-| brick 604 | DATE ±1 | No |
-
-## Recommendation
-
-```
-OCR_WORKERS=1 # Best for sequential, saves RAM
-OCR_WORKERS=2 # For parallel requests (production)
-OCR_MAX_TASKS_PER_CHILD=0 # No restart
-```
-
-**For 8GB RAM:** Use 1-2 workers max