vending_data_intelligence_r…/CONTEXT_HANDOVER_20251211_v2.md
Marius Mutu 9e9ddec014 Implement Dashboard consolidation + Performance logging
Features:
- Add unified "Dashboard Complet" sheet (Excel) with all 9 sections
- Add unified "Dashboard Complet" page (PDF) with key metrics
- Fix VALOARE_ANTERIOARA NULL bug (use sumar_executiv_yoy directly)
- Add PerformanceLogger class for timing analysis
- Remove redundant consolidated sheets (keep only Dashboard Complet)

Bug fixes:
- Fix Excel formula error (=== interpreted as formula, changed to >>>)
- Fix args.output → args.output_dir in perf.summary()

Performance analysis:
- Add PERFORMANCE_ANALYSIS.md with detailed breakdown
- SQL queries take 94% of runtime (31 min), Excel/PDF only 1%
- Identified slow queries for optimization

Documentation:
- Update CLAUDE.md with new structure
- Add context handover for query optimization task

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 13:33:02 +02:00


Context Handover - Query Optimization (11 Dec 2025 - v2)

Session Summary

This session accomplished:

  1. Fixed VALOARE_ANTERIOARA NULL bug (used sumar_executiv_yoy directly)
  2. Created unified "Dashboard Complet" sheet/page
  3. Added PerformanceLogger for timing analysis
  4. Fixed Excel formula error (=== → >>>, because a leading === was interpreted as a formula)
  5. Removed redundant consolidated sheets/pages
  6. Created PERFORMANCE_ANALYSIS.md with findings
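
The source names the PerformanceLogger class and a perf.summary() call that takes args.output_dir, but does not show the implementation. The following is only a minimal sketch of what such a timer could look like. The `track` method and the tab-separated log layout are assumptions made for illustration; the file name performance_log.txt is taken from the test command later in this document.

```python
import time
from contextlib import contextmanager
from pathlib import Path


class PerformanceLogger:
    """Hypothetical sketch: collects (name, seconds, rows) per timed step."""

    def __init__(self):
        self.records = []

    @contextmanager
    def track(self, name):
        # The caller may set info["rows"] inside the with-block.
        info = {"rows": None}
        start = time.perf_counter()
        try:
            yield info
        finally:
            elapsed = time.perf_counter() - start
            self.records.append((name, elapsed, info["rows"]))

    def summary(self, output_dir):
        # Slowest steps first, each with its share of the total timed duration.
        total = sum(sec for _, sec, _ in self.records) or 1.0
        ordered = sorted(self.records, key=lambda r: r[1], reverse=True)
        lines = [
            f"{name}\t{sec:.2f}s\t{100 * sec / total:.0f}%\trows={rows}"
            for name, sec, rows in ordered
        ]
        out = Path(output_dir)
        out.mkdir(parents=True, exist_ok=True)
        (out / "performance_log.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
        return lines
```

Wrapping each query as `with perf.track("clienti_sub_medie") as info:` and setting `info["rows"]` after the fetch would yield the kind of duration/rows breakdown reported in the findings below.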

Critical Finding: SQL Queries Are The Bottleneck

Total runtime: ~33 minutes

  • SQL Queries: 31 min (94%)
  • Excel/PDF: 15 sec (1%)

Top Slow Queries (all 60-130 seconds for tiny results):

| Query | Duration | Rows | Issue |
|---|---|---|---|
| clienti_sub_medie | 130.63s | 100 | Uses complex views |
| sumar_executiv_yoy | 129.05s | 5 | YoY 24-month scan |
| vanzari_lunare | 129.90s | 25 | Monthly aggregation |
| indicatori_agregati_venituri_yoy | 129.31s | 3 | YoY comparison |

Root Cause: Views vs Base Tables

The current queries use complex views like fact_vfacturi2, fact_vfacturi_detalii, vnom_articole, vnom_parteneri.

These views likely contain:

  • Multiple nested JOINs
  • Calculated columns
  • No index utilization

Solution: Use base tables directly: VANZARI, VANZARI_DETALII, NOM_PARTENERI, etc.


Example Optimization: CLIENTI_SUB_MEDIE

Current Query (uses views - 130 seconds):

Located in queries.py, around lines 600-650.

Optimized Query (uses base tables - should be <5 seconds):

WITH preturi_medii AS (
    SELECT
        d.id_articol,
        AVG(CASE WHEN d.pret_cu_tva = 1 THEN d.pret / (1 + d.proc_tvav/100) ELSE d.pret END) AS pret_mediu
    FROM VANZARI f
    JOIN VANZARI_DETALII d ON d.id_vanzare = f.id_vanzare
    WHERE f.sters = 0 AND d.sters = 0
      AND f.tip > 0 AND f.tip NOT IN (7, 8, 9, 24)
      AND f.data_act >= ADD_MONTHS(TRUNC(SYSDATE), -24)
      AND d.pret > 0
    GROUP BY d.id_articol
),
preturi_client AS (
    SELECT
        d.id_articol,
        f.id_part,
        p.denumire AS client,
        AVG(CASE WHEN d.pret_cu_tva = 1 THEN d.pret / (1 + d.proc_tvav/100) ELSE d.pret END) AS pret_client,
        SUM(d.cantitate) AS cantitate_totala
    FROM VANZARI f
    JOIN VANZARI_DETALII d ON d.id_vanzare = f.id_vanzare
    JOIN NOM_PARTENERI p ON p.id_part = f.id_part
    WHERE f.sters = 0 AND d.sters = 0
      AND f.tip > 0 AND f.tip NOT IN (7, 8, 9, 24)
      AND f.data_act >= ADD_MONTHS(TRUNC(SYSDATE), -24)
      AND d.pret > 0
    GROUP BY d.id_articol, f.id_part, p.denumire
)
SELECT
    a.denumire AS produs,
    pc.client,
    ROUND(pc.pret_client, 2) AS pret_platit,
    ROUND(pm.pret_mediu, 2) AS pret_mediu,
    ROUND((pm.pret_mediu - pc.pret_client) * 100.0 / pm.pret_mediu, 2) AS discount_vs_medie,
    pc.cantitate_totala
FROM preturi_client pc
JOIN preturi_medii pm ON pm.id_articol = pc.id_articol
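-- NOTE: this join still goes through the vnom_articole view; switch to NOM_ARTICOLE once that mapping is confirmed
JOIN vnom_articole a ON a.id_articol = pc.id_articol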
WHERE pc.pret_client < pm.pret_mediu * 0.85
ORDER BY discount_vs_medie DESC
FETCH FIRST 100 ROWS ONLY

Key Differences:

  1. Uses VANZARI instead of fact_vfacturi2
  2. Uses VANZARI_DETALII instead of fact_vfacturi_detalii
  3. Uses NOM_PARTENERI instead of vnom_parteneri
  4. Column names differ: id_vanzare vs nrfactura, data_act vs data
  5. Direct JOIN on IDs instead of view abstractions

Task for Next Session: Optimize All Slow Queries

Priority 1 - Rewrite using base tables:

  1. clienti_sub_medie (130s) - example above
  2. sumar_executiv (130s)
  3. sumar_executiv_yoy (129s)
  4. vanzari_lunare (130s)
  5. indicatori_agregati_venituri_yoy (129s)

Priority 2 - YoY optimization:

  • Pre-calculate previous year metrics in single CTE
  • Avoid scanning same data twice
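
The two bullets above can be sketched with conditional aggregation: a single pass over the 24-month window, with CASE expressions routing each row into the current-year or previous-year bucket. The sketch below runs against an in-memory SQLite database so that it is self-contained. The table vanzari_demo, its columns, and the literal cutoff dates are invented for illustration; on Oracle the cutoffs would presumably be expressions such as ADD_MONTHS(TRUNC(SYSDATE), -12) and ADD_MONTHS(TRUNC(SYSDATE), -24).

```python
import sqlite3

# One scan of the 24-month window; CASE splits rows into the two periods.
YOY_SINGLE_SCAN = """
WITH perioade AS (
    SELECT
        SUM(CASE WHEN data_act >= :cutoff_12 THEN valoare ELSE 0 END) AS valoare_curenta,
        SUM(CASE WHEN data_act <  :cutoff_12 THEN valoare ELSE 0 END) AS valoare_anterioara
    FROM vanzari_demo
    WHERE data_act >= :cutoff_24
)
SELECT
    valoare_curenta,
    valoare_anterioara,
    ROUND((valoare_curenta - valoare_anterioara) * 100.0
          / NULLIF(valoare_anterioara, 0), 2) AS variatie_pct
FROM perioade
"""


def yoy_demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical demo table; ISO date strings compare correctly as text.
    conn.execute("CREATE TABLE vanzari_demo (data_act TEXT, valoare REAL)")
    conn.executemany(
        "INSERT INTO vanzari_demo VALUES (?, ?)",
        [
            ("2023-06-01", 999.0),  # older than 24 months: excluded by WHERE
            ("2024-03-15", 100.0),  # previous period
            ("2024-09-01", 100.0),  # previous period
            ("2025-02-10", 150.0),  # current period
            ("2025-10-05", 100.0),  # current period
        ],
    )
    row = conn.execute(
        YOY_SINGLE_SCAN, {"cutoff_12": "2024-12-11", "cutoff_24": "2023-12-11"}
    ).fetchone()
    conn.close()
    return row
```

NULLIF guards the percentage against a previous period with no sales, which yields NULL instead of a division error.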

Steps:

  1. Read current query in queries.py
  2. Identify view → base table mappings
  3. Rewrite with base tables
  4. Test performance improvement
  5. Repeat for all slow queries
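
Step 4 can be sketched as a small helper that runs the old and the rewritten query on the same DB-API connection, times both, and checks that they return the same rows before the rewrite is accepted. This is a sketch under assumptions: the demo uses in-memory SQLite, and the table vanzari_demo and view v_facturi_demo are invented stand-ins rather than the real Oracle objects.

```python
import sqlite3
import time


def compare_queries(conn, old_sql, new_sql):
    """Run both queries, time them, and check they return the same rows."""
    results = {}
    for label, sql in (("old", old_sql), ("new", new_sql)):
        cur = conn.cursor()
        start = time.perf_counter()
        cur.execute(sql)
        rows = cur.fetchall()
        results[label] = (time.perf_counter() - start, rows)
        cur.close()
    old_sec, old_rows = results["old"]
    new_sec, new_rows = results["new"]
    return {
        "old_sec": old_sec,
        "new_sec": new_sec,
        # Sort by repr so row order and NULL values cannot cause a false mismatch.
        "same_rows": sorted(old_rows, key=repr) == sorted(new_rows, key=repr),
    }


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vanzari_demo (id_vanzare INTEGER, sters INTEGER, total REAL);
        INSERT INTO vanzari_demo VALUES (1, 0, 10.0), (2, 1, 99.0), (3, 0, 5.0);
        CREATE VIEW v_facturi_demo AS
            SELECT id_vanzare AS nrfactura, total FROM vanzari_demo WHERE sters = 0;
    """)
    report = compare_queries(
        conn,
        "SELECT SUM(total) FROM v_facturi_demo",
        "SELECT SUM(total) FROM vanzari_demo WHERE sters = 0",
    )
    conn.close()
    return report
```

Comparing results matters as much as timing here: a view may apply filters (deleted rows, document types) that a rewrite against base tables has to reproduce explicitly.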

Key Files

| File | Purpose |
|---|---|
| queries.py | All SQL queries, stored as constants such as CLIENTI_SUB_MEDIE |
| main.py | Execution with PerformanceLogger |
| PERFORMANCE_ANALYSIS.md | Detailed timing analysis |

Base Table → View Mapping (to discover)

Need to examine Oracle schema to find exact mappings:

  • VANZARI → fact_vfacturi2?
  • VANZARI_DETALII → fact_vfacturi_detalii?
  • NOM_PARTENERI → vnom_parteneri?
  • NOM_ARTICOLE → vnom_articole?

Column mappings:

  • id_vanzare → nrfactura?
  • data_act → data?
  • id_part → id_partener?
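
While the mappings are still unconfirmed, a small check can at least report which of the listed views a rewritten query still references. The sketch below is an illustration only: the dictionary copies the candidate pairs from the list above (all still marked with a question mark), and the helper name `views_still_referenced` is invented.

```python
import re

# Candidate view -> base table pairs copied from the list above; all unconfirmed.
CANDIDATE_MAPPING = {
    "fact_vfacturi2": "VANZARI",
    "fact_vfacturi_detalii": "VANZARI_DETALII",
    "vnom_parteneri": "NOM_PARTENERI",
    "vnom_articole": "NOM_ARTICOLE",
}


def views_still_referenced(sql, mapping=CANDIDATE_MAPPING):
    """Return the mapped views that still appear as whole identifiers in sql."""
    found = []
    for view in mapping:
        # Lookarounds stop fact_vfacturi2 from matching inside a longer name.
        pattern = r"(?<![A-Za-z0-9_$#])" + re.escape(view) + r"(?![A-Za-z0-9_$#])"
        if re.search(pattern, sql, flags=re.IGNORECASE):
            found.append(view)
    return found
```

Applied to the optimized CLIENTI_SUB_MEDIE query in this document, the check would report vnom_articole, because the final join still goes through that view. It is a plain text match, so a view name appearing inside an SQL comment would also be reported.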

Test Command

cd /mnt/e/proiecte/vending/data_intelligence_report
.\run.bat
# Check output/performance_log.txt for timing

Success Criteria

Reduce total query time from 31 minutes to <5 minutes by using base tables instead of views.