Implement Dashboard consolidation + Performance logging

Features:
- Add unified "Dashboard Complet" sheet (Excel) with all 9 sections
- Add unified "Dashboard Complet" page (PDF) with key metrics
- Fix VALOARE_ANTERIOARA NULL bug (use sumar_executiv_yoy directly)
- Add PerformanceLogger class for timing analysis
- Remove redundant consolidated sheets (keep only Dashboard Complet)

Bug fixes:
- Fix Excel formula error (=== interpreted as formula, changed to >>>)
- Fix args.output → args.output_dir in perf.summary()

Performance analysis:
- Add PERFORMANCE_ANALYSIS.md with detailed breakdown
- SQL queries take 94% of runtime (31 min), Excel/PDF only 1%
- Identified slow queries for optimization

Documentation:
- Update CLAUDE.md with new structure
- Add context handover for query optimization task

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 13:33:02 +02:00
parent a2ad4c7ed2
commit 9e9ddec014
20 changed files with 2400 additions and 959 deletions


@@ -0,0 +1,162 @@
# Context Handover - Query Optimization (11 Dec 2025 - v2)
## Session Summary
This session accomplished:
1. ✅ Fixed VALOARE_ANTERIOARA NULL bug (used `sumar_executiv_yoy` directly)
2. ✅ Created unified "Dashboard Complet" sheet/page
3. ✅ Added PerformanceLogger for timing analysis
4. ✅ Fixed Excel formula error (`===` → `>>>`)
5. ✅ Removed redundant consolidated sheets/pages
6. ✅ Created PERFORMANCE_ANALYSIS.md with findings
## Critical Finding: SQL Queries Are The Bottleneck
**Total runtime: ~33 minutes**
- SQL Queries: 31 min (94%)
- Excel/PDF: 15 sec (1%)
### Top Slow Queries (all 60-130 seconds for tiny results):
| Query | Duration | Rows | Issue |
|-------|----------|------|-------|
| `clienti_sub_medie` | 130.63s | 100 | Uses complex views |
| `sumar_executiv_yoy` | 129.05s | 5 | YoY 24-month scan |
| `vanzari_lunare` | 129.90s | 25 | Monthly aggregation |
| `indicatori_agregati_venituri_yoy` | 129.31s | 3 | YoY comparison |
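
Per-query timings like those in the table can be collected with a small context manager wrapped around each query. This is a minimal sketch only: `QueryTimer`, `track` and `summary` are hypothetical names chosen for illustration and do not reproduce the actual `PerformanceLogger` in `main.py`. The summary sorts steps by duration and shows each step's share of the total, which is how a "94% in SQL" figure can be derived.

```python
import time
from contextlib import contextmanager


class QueryTimer:
    """Hypothetical stand-in for PerformanceLogger: wall-clock time per named step."""

    def __init__(self):
        self.timings = []  # list of (name, seconds, row_count)

    @contextmanager
    def track(self, name):
        info = {"rows": None}  # the caller may fill in the row count
        start = time.perf_counter()
        try:
            yield info
        finally:
            elapsed = time.perf_counter() - start
            self.timings.append((name, elapsed, info["rows"]))

    def summary(self):
        """Return one line per step, slowest first, with its share of total time."""
        total = sum(sec for _, sec, _ in self.timings) or 1.0
        ordered = sorted(self.timings, key=lambda t: t[1], reverse=True)
        return "\n".join(
            f"{name:<40} {sec:>8.2f}s {100 * sec / total:>5.1f}%  rows={rows}"
            for name, sec, rows in ordered
        )


timer = QueryTimer()
with timer.track("demo_query") as info:
    rows = [1, 2, 3]  # placeholder for cursor.fetchall()
    info["rows"] = len(rows)

print(timer.summary())
```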
---
## Root Cause: Views vs Base Tables
The current queries use complex views like `fact_vfacturi2`, `fact_vfacturi_detalii`, `vnom_articole`, `vnom_parteneri`.
**These views likely contain:**
- Multiple nested JOINs
- Calculated columns
- No index utilization

**Solution:** Use base tables directly: `VANZARI`, `VANZARI_DETALII`, `NOM_PARTENERI`, etc.

---
## Example Optimization: CLIENTI_SUB_MEDIE
### Current Query (uses views - 130 seconds):
Located in `queries.py`, around lines 600-650.
### Optimized Query (uses base tables - should be <5 seconds):
```sql
WITH preturi_medii AS (
    -- Average VAT-exclusive price per article over the last 24 months
    SELECT
        d.id_articol,
        AVG(CASE WHEN d.pret_cu_tva = 1 THEN d.pret / (1 + d.proc_tvav / 100) ELSE d.pret END) AS pret_mediu
    FROM VANZARI f
    JOIN VANZARI_DETALII d ON d.id_vanzare = f.id_vanzare
    WHERE f.sters = 0 AND d.sters = 0
      AND f.tip > 0 AND f.tip NOT IN (7, 8, 9, 24)
      AND f.data_act >= ADD_MONTHS(TRUNC(SYSDATE), -24)
      AND d.pret > 0
    GROUP BY d.id_articol
),
preturi_client AS (
    -- Average VAT-exclusive price and total quantity per article and client
    SELECT
        d.id_articol,
        f.id_part,
        p.denumire AS client,
        AVG(CASE WHEN d.pret_cu_tva = 1 THEN d.pret / (1 + d.proc_tvav / 100) ELSE d.pret END) AS pret_client,
        SUM(d.cantitate) AS cantitate_totala
    FROM VANZARI f
    JOIN VANZARI_DETALII d ON d.id_vanzare = f.id_vanzare
    JOIN NOM_PARTENERI p ON p.id_part = f.id_part
    WHERE f.sters = 0 AND d.sters = 0
      AND f.tip > 0 AND f.tip NOT IN (7, 8, 9, 24)
      AND f.data_act >= ADD_MONTHS(TRUNC(SYSDATE), -24)
      AND d.pret > 0
    GROUP BY d.id_articol, f.id_part, p.denumire
)
SELECT
    a.denumire AS produs,
    pc.client,
    ROUND(pc.pret_client, 2) AS pret_platit,
    ROUND(pm.pret_mediu, 2) AS pret_mediu,
    ROUND((pm.pret_mediu - pc.pret_client) * 100.0 / pm.pret_mediu, 2) AS discount_vs_medie,
    pc.cantitate_totala
FROM preturi_client pc
JOIN preturi_medii pm ON pm.id_articol = pc.id_articol
-- vnom_articole is still a view; its base table (NOM_ARTICOLE?) is not yet confirmed
JOIN vnom_articole a ON a.id_articol = pc.id_articol
WHERE pc.pret_client < pm.pret_mediu * 0.85  -- client pays more than 15% below the average
ORDER BY discount_vs_medie DESC
FETCH FIRST 100 ROWS ONLY
```
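The net-price and discount arithmetic in the query above can be checked outside the database. This is a minimal Python sketch with made-up prices and an assumed 20% VAT rate; the function names are illustrative and not taken from `queries.py`.

```python
def net_price(price, vat_included, vat_percent):
    """Mirror of the CASE expression: strip VAT only when the stored price includes it."""
    if vat_included:
        return price / (1 + vat_percent / 100)
    return price


def discount_vs_mean(client_price, mean_price):
    """Mirror of discount_vs_medie: percent by which the client pays below the average."""
    return round((mean_price - client_price) * 100.0 / mean_price, 2)


# Made-up inputs: the client paid 96.00 with 20% VAT included,
# and the article's average net price is 100.00.
client_price = round(net_price(96.0, vat_included=True, vat_percent=20), 2)
mean_price = 100.0

discount = discount_vs_mean(client_price, mean_price)
is_flagged = client_price < mean_price * 0.85  # same filter as the WHERE clause

print(client_price, discount, is_flagged)
```

With these inputs the net client price is 80.00, the discount against the average is 20%, and the row passes the filter because 80 is below 85% of the average.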
### Key Differences:
1. Uses `VANZARI` instead of `fact_vfacturi2`
2. Uses `VANZARI_DETALII` instead of `fact_vfacturi_detalii`
3. Uses `NOM_PARTENERI` instead of `vnom_parteneri`
4. Column names differ: `id_vanzare` vs `nrfactura`, `data_act` vs `data`
5. Direct JOIN on IDs instead of view abstractions
---
## Task for Next Session: Optimize All Slow Queries
### Priority 1 - Rewrite using base tables:
1. `clienti_sub_medie` (130s) - example above
2. `sumar_executiv` (130s)
3. `sumar_executiv_yoy` (129s)
4. `vanzari_lunare` (130s)
5. `indicatori_agregati_venituri_yoy` (129s)
### Priority 2 - YoY optimization:
- Pre-calculate previous-year metrics in a single CTE
- Avoid scanning the same data twice
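
One common way to avoid the second scan is conditional aggregation: read the 24-month range once and let a `CASE` expression route each row to the current or previous window. The sketch below demonstrates the idea on an in-memory SQLite table so it can run anywhere. The table `sales`, its columns and the fixed window dates are assumptions for illustration, not the real schema; an Oracle version would compute the boundaries with `ADD_MONTHS` instead of bind values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2024-03-10", 100.0),  # previous 12-month window
        ("2024-11-05", 50.0),   # previous 12-month window
        ("2025-02-01", 200.0),  # current 12-month window
        ("2025-10-20", 40.0),   # current 12-month window
        ("2022-01-01", 999.0),  # older than 24 months, filtered out
    ],
)

# One pass over the 24-month range; CASE assigns each row to its window.
YOY_SINGLE_SCAN = """
SELECT
    SUM(CASE WHEN sale_date >= :window_split THEN amount ELSE 0 END) AS current_value,
    SUM(CASE WHEN sale_date <  :window_split THEN amount ELSE 0 END) AS previous_value
FROM sales
WHERE sale_date >= :window_start AND sale_date < :window_end
"""

params = {
    "window_start": "2023-12-01",
    "window_split": "2024-12-01",
    "window_end": "2025-12-01",
}
current_value, previous_value = conn.execute(YOY_SINGLE_SCAN, params).fetchone()
print(current_value, previous_value)
```

Here the current window sums to 240.0 and the previous window to 150.0, both obtained from a single read of the table.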
### Steps:
1. Read current query in `queries.py`
2. Identify view → base table mappings
3. Rewrite with base tables
4. Test performance improvement
5. Repeat for all slow queries
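
Step 4 is safer when each rewrite is checked for identical results as well as speed. The following is a minimal sketch of such a check against any DB-API connection; `run_timed` and `compare_queries` are hypothetical helpers, and the SQLite table and view exist only to make the example runnable. On Oracle, each query is worth running more than once, since the first run can be skewed by cold caches.

```python
import sqlite3
import time


def run_timed(conn, sql):
    """Execute sql on a DB-API connection; return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    cur = conn.cursor()
    cur.execute(sql)
    rows = cur.fetchall()
    return rows, time.perf_counter() - start


def compare_queries(conn, old_sql, new_sql):
    """Run both queries and report whether they return the same set of rows."""
    old_rows, old_sec = run_timed(conn, old_sql)
    new_rows, new_sec = run_timed(conn, new_sql)
    return {
        # Sort by repr so rows containing NULLs can still be ordered.
        "same_result": sorted(old_rows, key=repr) == sorted(new_rows, key=repr),
        "old_seconds": old_sec,
        "new_seconds": new_sec,
    }


# Toy schema: a view that hides soft-deleted rows, and its base table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_sales (id INTEGER, amount REAL, deleted INTEGER);
    INSERT INTO base_sales VALUES (1, 10.0, 0), (2, 20.0, 0), (3, 30.0, 1);
    CREATE VIEW v_sales AS SELECT id, amount FROM base_sales WHERE deleted = 0;
""")

report = compare_queries(
    conn,
    "SELECT id, amount FROM v_sales",                       # view-based query
    "SELECT id, amount FROM base_sales WHERE deleted = 0",  # base-table rewrite
)
print(report["same_result"])
```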
---
## Key Files
| File | Purpose |
|------|---------|
| `queries.py` | All SQL queries - constants like `CLIENTI_SUB_MEDIE` |
| `main.py` | Execution with PerformanceLogger |
| `PERFORMANCE_ANALYSIS.md` | Detailed timing analysis |
---
## Base Table → View Mapping (to discover)
The exact mappings still need to be confirmed against the Oracle schema:
- `VANZARI` → `fact_vfacturi2`?
- `VANZARI_DETALII` → `fact_vfacturi_detalii`?
- `NOM_PARTENERI` → `vnom_parteneri`?
- `NOM_ARTICOLE` → `vnom_articole`?

Column mappings:
- `id_vanzare` → `nrfactura`?
- `data_act` → `data`?
- `id_part` → `id_partener`?
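
One way to start confirming the table mappings is Oracle's `ALL_DEPENDENCIES` data-dictionary view, which lists the objects each view references directly. The sketch below only builds the query and its bind values; `dependency_query` is a hypothetical helper, and running it needs a real connection (for example with python-oracledb, which is not imported here). Nested views show up as `VIEW` rows and would need the same lookup repeated, and column-level mappings still require reading the view text.

```python
def dependency_query(view_names):
    """Build a query listing the objects each of the given views references directly."""
    placeholders = ", ".join(f":{i + 1}" for i in range(len(view_names)))
    sql = (
        "SELECT name, referenced_owner, referenced_name, referenced_type\n"
        "FROM all_dependencies\n"
        "WHERE type = 'VIEW'\n"
        f"  AND name IN ({placeholders})\n"
        "ORDER BY name, referenced_type, referenced_name"
    )
    # Unquoted Oracle identifiers are stored in upper case in the dictionary.
    binds = [name.upper() for name in view_names]
    return sql, binds


sql, binds = dependency_query(
    ["fact_vfacturi2", "fact_vfacturi_detalii", "vnom_articole", "vnom_parteneri"]
)
print(sql)
print(binds)
# With an open cursor this would run as: cursor.execute(sql, binds)
```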
---
## Test Command
```bash
cd /mnt/e/proiecte/vending/data_intelligence_report
# .\run.bat is Windows syntax; from a WSL shell, invoke it through cmd.exe
cmd.exe /c run.bat
# Check output/performance_log.txt for timing
```
---
## Success Criteria
Reduce total query time from 31 minutes to <5 minutes by using base tables instead of views.