refactor(tests): move OCR benchmark results to tests directory

- Move docs/OCR_TEST_RESULTS.md → tests/ocr-validation/BENCHMARK_RESULTS.md
- Keep docs/OCR_MEMORY_SOLUTIONS_RESEARCH.md as technical reference

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude Agent
2026-01-22 09:15:25 +00:00
parent 62f86250cc
commit 3baac5f03e


@@ -0,0 +1,106 @@
# OCR Test Results - docTR+ Engine
**Date:** 2026-01-02 | **Receipts:** 26 | **Test:** Sequential
## Summary Comparison
| Workers | Avg time / receipt | Total time (26 receipts) | Memory used | Memory available |
|---------|--------------------|--------------------------|-------------|------------------|
| 1 | 6.8s | 176s | 3.2GB | 4.1GB |
| 2 | 7.2s | 187s | 3.1GB | 4.1GB |
| 3 | 6.8s | 176s | 3.9GB | 3.3GB |
**Success Rate:** 80.8% (21/26) - same for all configs
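**Note:** In sequential tests, 1 worker is as fast as 3 workers (6.8s average, 176s total in both cases), because only one request is in flight at a time.
Additional workers help only when requests arrive in parallel.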
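
The note above can be illustrated with a minimal scheduling sketch. It is an idealized model, not the benchmark harness: `wall_time` and its parameters are hypothetical names, and the model assumes every worker is equally fast and ignores memory or CPU contention.

```python
import heapq

def wall_time(durations, workers, concurrent_requests):
    """Estimate wall-clock time when the client keeps `concurrent_requests` in flight."""
    # Only as many workers can be busy as there are requests in flight.
    lanes = max(1, min(workers, concurrent_requests))
    finish = [0.0] * lanes  # time at which each lane becomes free
    heapq.heapify(finish)
    for duration in durations:
        start = heapq.heappop(finish)  # earliest free lane takes the next receipt
        heapq.heappush(finish, start + duration)
    return max(finish)

# First four 1-worker timings from the Detailed Results table.
times = [6.8, 6.0, 5.9, 7.7]

sequential_1w = wall_time(times, workers=1, concurrent_requests=1)
sequential_3w = wall_time(times, workers=3, concurrent_requests=1)
parallel_3w = wall_time(times, workers=3, concurrent_requests=3)

print(f"sequential, 1 worker:  {sequential_1w:.1f}s")
print(f"sequential, 3 workers: {sequential_3w:.1f}s")
print(f"parallel,   3 workers: {parallel_3w:.1f}s")
```

With one request in flight, the extra workers sit idle, so both sequential runs take the same time; only the parallel run finishes sooner.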
## Detailed Results (1 Worker)
| # | Receipt | Time | Tier | Result | Notes |
|---|---------|------|------|--------|-------|
| 01 | abonament kineterra | 6.8s | T1 | ✓ | 97% |
| 02 | benzina 14 august | 6.0s | T1 | ✓ | 83% |
| 03 | benzina 27 octombrie | 5.9s | T1 | ✓ | 83% |
| 04 | igiena 11 octombrie | 7.7s | T1 | ✓ | 97% |
| 05 | igiena 14 dec five-holding | 11.5s | T1+T2 | ✗ | TOTAL ±1 |
| 06 | rechizite 12 dec pictus | 5.9s | T1 | ✓ | 97% |
| 07 | benzina 10 mai 2025 | 5.1s | T1 | ✓ | 83% |
| 08 | brick consumabil 604 50% | 4.8s | T1 | ✓ | 97% |
| 09 | benzina 13 septembrie | 4.9s | T1 | ✓ | 83% |
| 10 | brick consumabile 604 | 5.3s | T1 | ✓ | 97% |
| 11 | benzina 20 dec | 5.8s | T1 | ✓ | 79% |
| 12 | bon fiscal Dedeman | 5.7s | T1 | ✓ | 90% |
| 13 | factura Dedeman | 6.8s | T1 | ✓ | 97% |
| 14 | benzina 13 iulie | 5.7s | T1 | ✓ | 95% |
| 15 | best print stampila | 4.5s | T1 | ✓ | 94% |
| 16 | electrobering telecomanda | 4.8s | T1 | ✓ | 97% |
| 17 | brick igiena 8 oct | 11.9s | T1+T2 | ✗ | TOTAL/CUI |
| 18 | gama ink refill toner | 5.9s | T1 | ✓ | 94% |
| 19 | kineterra fizioterapie | 4.6s | T1 | ✓ | 97% |
| 20 | brick igiena 1 sept | 12.5s | T1+T2 | ✗ | ALL None |
| 21 | kineterra abonament | 5.6s | T1 | ✓ | 97% |
| 22 | brick igiena electrice | 15.9s | T1+T2 | ✗ | DATE None |
| 23 | electrobering igiena | 4.4s | T1 | ✓ | 97% |
| 24 | Lidl papetarie 604 | 5.8s | T1 | ✓ | 87% |
| 25 | brick igiena 604 | 6.8s | T1 | ✗ | DATE ±1 |
| 26 | unlimited duplicat | 4.8s | T1 | ✓ | 86% |
## Time Comparison by Receipt
| # | Receipt | 1 worker | 2 workers | 3 workers |
|---|---------|----|----|-----|
| 01 | abonament kineterra | 6.8s | 6.7s | 5.8s |
| 02 | benzina 14 august | 6.0s | 5.5s | 5.8s |
| 03 | benzina 27 octombrie | 5.9s | 5.9s | 5.7s |
| 04 | igiena 11 octombrie | 7.7s | 8.9s | 7.4s |
| 05 | igiena 14 dec (FAIL) | 11.5s | 12.3s | 11.9s |
| 06 | rechizite pictus | 5.9s | 5.9s | 5.7s |
| 07 | benzina 10 mai | 5.1s | 6.0s | 5.8s |
| 08 | brick 50% | 4.8s | 5.9s | 5.5s |
| 09 | benzina 13 sept | 4.9s | 5.9s | 5.3s |
| 10 | brick consumabile | 5.3s | 5.7s | 5.7s |
| 11 | benzina 20 dec | 5.8s | 5.4s | 5.8s |
| 12 | bon Dedeman | 5.7s | 5.9s | 5.8s |
| 13 | factura Dedeman | 6.8s | 6.9s | 6.8s |
| 14 | benzina 13 iulie | 5.7s | 6.1s | 5.4s |
| 15 | best print | 4.5s | 5.8s | 4.8s |
| 16 | electrobering | 4.8s | 4.2s | 4.7s |
| 17 | brick 8 oct (FAIL) | 11.9s | 13.1s | 12.0s |
| 18 | gama ink | 5.9s | 5.9s | 4.7s |
| 19 | kineterra fizioterapie | 4.6s | 5.9s | 4.8s |
| 20 | brick 1 sept (FAIL) | 12.5s | 13.2s | 13.1s |
| 21 | kineterra abonament | 5.6s | 4.9s | 4.8s |
| 22 | brick electrice (FAIL) | 15.9s | 17.0s | 15.5s |
| 23 | electrobering igiena | 4.4s | 5.4s | 5.0s |
| 24 | Lidl papetarie | 5.8s | 6.9s | 5.8s |
| 25 | brick 604 (FAIL) | 6.8s | 6.5s | 6.9s |
| 26 | unlimited duplicat | 4.8s | 5.8s | 5.0s |
| **AVG** | | **6.8s** | **7.2s** | **6.8s** |
| **TOTAL** | | **176s** | **187s** | **176s** |
## Tier Analysis
- **T1 only (early exit):** 22 receipts (4.4-7.7s), including the failed receipt 25
- **T1+T2 (full):** 4 receipts (11.5-15.9s), all of them failures
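
The tier counts can be recomputed from the Detailed Results (1 Worker) table. The sketch below copies the tier, time, and result of each row; `summarize_by_tier` is a hypothetical helper, not part of the test suite. Note that receipt 25 fails while staying in T1, so the failed receipts and the T1+T2 escalations are not the same set.

```python
from collections import defaultdict

# (receipt number, tier, seconds, passed) copied from the Detailed Results table.
ROWS = [
    (1, "T1", 6.8, True), (2, "T1", 6.0, True), (3, "T1", 5.9, True),
    (4, "T1", 7.7, True), (5, "T1+T2", 11.5, False), (6, "T1", 5.9, True),
    (7, "T1", 5.1, True), (8, "T1", 4.8, True), (9, "T1", 4.9, True),
    (10, "T1", 5.3, True), (11, "T1", 5.8, True), (12, "T1", 5.7, True),
    (13, "T1", 6.8, True), (14, "T1", 5.7, True), (15, "T1", 4.5, True),
    (16, "T1", 4.8, True), (17, "T1+T2", 11.9, False), (18, "T1", 5.9, True),
    (19, "T1", 4.6, True), (20, "T1+T2", 12.5, False), (21, "T1", 5.6, True),
    (22, "T1+T2", 15.9, False), (23, "T1", 4.4, True), (24, "T1", 5.8, True),
    (25, "T1", 6.8, False), (26, "T1", 4.8, True),
]

def summarize_by_tier(rows):
    """Return {tier: (receipt count, fastest time, slowest time)}."""
    groups = defaultdict(list)
    for _, tier, seconds, _ in rows:
        groups[tier].append(seconds)
    return {tier: (len(ts), min(ts), max(ts)) for tier, ts in groups.items()}

summary = summarize_by_tier(ROWS)
passed = sum(1 for row in ROWS if row[3])

for tier, (count, fastest, slowest) in summary.items():
    print(f"{tier}: {count} receipts, {fastest}s to {slowest}s")
print(f"success rate: {passed}/{len(ROWS)} = {passed / len(ROWS):.1%}")
```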
## Failures (5)
| Receipt | Issue | Fixable |
|---------|-------|---------|
| igiena 14 dec | TOTAL ±1 | No |
| brick 8 oct | TOTAL/CUI | Maybe |
| brick 1 sept | ALL None | No (bad doc) |
| brick electrice | DATE None | Maybe |
| brick 604 | DATE ±1 | No |
## Recommendation
```
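# Sequential workloads: as fast as 3 workers, with less RAM (3.2GB vs 3.9GB used)
OCR_WORKERS=1
# Parallel requests (production): use this value instead of the line above
# OCR_WORKERS=2
# 0 = no worker restart
OCR_MAX_TASKS_PER_CHILD=0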
```
**For 8GB RAM:** Use at most 2 workers; with 3 workers, available memory dropped to 3.3GB.
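
As a rough illustration of how these two settings might be consumed, here is a minimal sketch. The variable names come from the recommendation above; `load_ocr_settings`, its defaults, and the mapping of `0` to `None` are assumptions, not the project's actual loader. The `None` convention matches `multiprocessing.Pool(maxtasksperchild=None)`, where worker processes live as long as the pool, but whether the OCR service uses that pool is not stated here.

```python
import os

def load_ocr_settings(env=os.environ):
    """Parse the worker settings; 0 tasks-per-child becomes None (never recycle)."""
    workers = int(env.get("OCR_WORKERS", "1"))  # assumed default: 1 worker
    if workers < 1:
        raise ValueError("OCR_WORKERS must be at least 1")
    max_tasks = int(env.get("OCR_MAX_TASKS_PER_CHILD", "0"))
    if max_tasks < 0:
        raise ValueError("OCR_MAX_TASKS_PER_CHILD must not be negative")
    return {"workers": workers, "max_tasks_per_child": max_tasks or None}

# Example with the production values recommended above.
settings = load_ocr_settings({"OCR_WORKERS": "2", "OCR_MAX_TASKS_PER_CHILD": "0"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the parser easy to test without touching the real environment.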