From c56f832e8190a44ac1fa283d3ba33f55c3478a24 Mon Sep 17 00:00:00 2001 From: Marius Mutu Date: Sat, 25 Oct 2025 22:59:12 +0300 Subject: [PATCH] Add comprehensive multi-tenant architecture upgrade plan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Creates detailed 60-page implementation roadmap for transforming ROA2WEB from single-tenant to multi-tenant SaaS architecture. Plan includes 6 phases with backward compatibility, hybrid connection support (SSH tunnel + direct), and complete deployment strategies for dev/Docker/Windows environments. Key features: - Tenant isolation with separate Oracle connection pools per tenant - Dynamic SSH tunnel management with auto-restart - Encrypted credentials in PostgreSQL/SQLite tenant config DB - JWT-based tenant identification and access validation - Redis cache namespacing per tenant - Comprehensive testing and migration strategies Timeline: 14-20 days implementation Target: <10% performance overhead, zero downtime migration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- shared/docs/MULTI_TENANT_UPGRADE_PLAN.md | 2501 ++++++++++++++++++++++ shared/docs/REDIS_IMPLEMENTATION_PLAN.md | 910 ++++++++ 2 files changed, 3411 insertions(+) create mode 100644 shared/docs/MULTI_TENANT_UPGRADE_PLAN.md create mode 100644 shared/docs/REDIS_IMPLEMENTATION_PLAN.md diff --git a/shared/docs/MULTI_TENANT_UPGRADE_PLAN.md b/shared/docs/MULTI_TENANT_UPGRADE_PLAN.md new file mode 100644 index 0000000..f98ea79 --- /dev/null +++ b/shared/docs/MULTI_TENANT_UPGRADE_PLAN.md @@ -0,0 +1,2501 @@ +# Plan Upgrade Multi-Tenant Architecture - ROA2WEB + +**Version:** 1.0 +**Created:** 2025-10-25 +**Status:** Planning Phase + +--- + +## 📋 Sumar Executiv + +ROA2WEB va fi transformat de la o aplicație **single-tenant** (un singur client, o singură bază de date Oracle) la o arhitectură **multi-tenant SaaS** care suportă: + +- **Multiple clienți simultaneous** cu izolare completă 
între tenants (pool-uri, cache, audit logs) +- **Conexiuni hibride**: SSH tunnel pentru clienți remote SAU direct TCP pentru clienți în LAN +- **Deployment flexibil**: Development (WSL), Docker (Proxmox LXC), Windows IIS +- **Backward compatibility**: Tenant "default" funcționează exact ca single-tenant actual (zero breaking changes) +- **Gradual migration**: Fiecare fază testabilă independent, rollout incremental +- **Security-first**: Passwords encrypted în tenant DB, SSH keys read-only, JWT signing per tenant +- **Performance**: < 10% overhead vs single-tenant, izolare pool-uri per tenant + +--- + +## 🏗️ Arhitectură Target + +### Single-Tenant (Actual) + +``` +┌─────────────────────────────────────────────────────┐ +│ FastAPI Backend │ +│ │ +│ ┌─────────────────────────────────────────────┐ │ +│ │ OraclePool (Singleton) │ │ +│ │ - Hardcoded credentials din .env │ │ +│ │ - Min: 2, Max: 10 connections │ │ +│ │ - Shared pentru toți userii │ │ +│ └─────────────────────────────────────────────┘ │ +│ ▼ │ +└──────────────────────┼──────────────────────────────┘ + │ + ┌─────────────┴───────────┐ + │ │ + SSH Tunnel Direct Connection + (Development) (Windows Production) + │ │ + ▼ ▼ +┌─────────────────┐ ┌──────────────────┐ +│ Oracle Server │ │ Oracle Server │ +│ (Remote) │ │ (Local LAN) │ +└─────────────────┘ └──────────────────┘ + +JWT Token Structure (Actual): +{ + "username": "john.doe", + "user_id": 123, + "companies": ["COMP1", "COMP2"], + "permissions": ["read", "reports"], + "exp": 1234567890, + "iat": 1234567800, + "type": "access" +} +``` + +### Multi-Tenant (Target) + +``` +┌────────────────────────────────────────────────────────────────────┐ +│ FastAPI Backend │ +│ │ +│ ┌──────────────────────────────────────────────────────────────┐ │ +│ │ MultiTenantPoolManager (New) │ │ +│ │ │ │ +│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ +│ │ │ Client A │ │ Client B │ │ Client C │ │ │ +│ │ │ Pool (2-10) │ │ Pool (2-10) │ │ Pool (2-10) │ │ │ +│ │ │ SSH 
Tunnel │ │ Direct Conn │ │ SSH Tunnel │ │ │ +│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ +│ │ │ │ │ │ │ +│ └─────────┼─────────────────┼─────────────────┼──────────────┘ │ +│ │ │ │ │ +└────────────┼─────────────────┼─────────────────┼────────────────┘ + │ │ │ + ┌────────┴─────┐ ┌────────┴─────┐ ┌────────┴─────┐ + │ SSH Process │ │ Direct │ │ SSH Process │ + │ localhost: │ │ 192.168.1.50 │ │ localhost: │ + │ 15261 │ │ :1521 │ │ 15262 │ + └────────┬─────┘ └────────┬─────┘ └────────┬─────┘ + │ │ │ + ▼ ▼ ▼ + ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ + │ Oracle │ │ Oracle │ │ Oracle │ + │ Client A │ │ Client B │ │ Client C │ + │ (Remote) │ │ (LAN) │ │ (Remote) │ + └──────────────┘ └──────────────┘ └──────────────┘ + + ┌──────────────────────┐ + │ Tenant Config DB │ + │ (PostgreSQL/SQLite) │ + │ │ + │ - tenants │ + │ - tenant_users │ + │ - audit_logs │ + └──────────────────────┘ + +JWT Token Structure (Target): +{ + "username": "john.doe", + "user_id": 123, + "tenant_id": "client-a-uuid", ← NEW + "companies": ["COMP1", "COMP2"], + "permissions": ["read", "reports"], + "exp": 1234567890, + "iat": 1234567800, + "type": "access" +} + +Redis Cache Keys: +cache:{tenant_id}:dashboard:{company_id} ← Already prepared! +cache:{tenant_id}:invoices:{filters_hash} +``` + +### Key Architectural Decisions + +1. **Lazy Pool Initialization**: Pool-uri create doar când tenant-ul e accesat prima dată (economie memorie) +2. **SSH Tunnel per Tenant**: Subprocess separat pentru fiecare tenant remote (izolare, resilience) +3. **Tenant Config DB Separate**: Nu stocăm tenant config în Oracle (evităm dependențe circulare) +4. **JWT Tenant ID Signed**: Tenant ID e în token signed, nu poate fi modificat de client +5. **Pool Cleanup**: Pool-uri inactive > 1h se închid automat (economie resurse) +6. 
**Backward Compatible**: Tenant "default" mapează la .env actual (zero migration pain) + +--- + +## 🗂️ Structura Fișierelor + +### Fișiere Noi + +``` +shared/ +├── database/ +│ ├── multi_tenant_pool.py ✅ NEW - MultiTenantPoolManager class +│ ├── tenant_config.py ✅ NEW - Tenant configuration loader +│ ├── ssh_tunnel_manager.py ✅ NEW - SSH tunnel per tenant management +│ └── tenant_models.py ✅ NEW - Pydantic models for tenants +│ +├── middleware/ +│ └── tenant_middleware.py ✅ NEW - Tenant identification middleware +│ +├── schemas/ +│ └── tenant_config_schema.sql ✅ NEW - PostgreSQL/SQLite schema +│ +└── utils/ + ├── encryption.py ✅ NEW - Fernet encryption for passwords + └── tenant_utils.py ✅ NEW - Tenant helper functions + +deployment/ +├── docker/ +│ └── tenant-config-db.dockerfile ✅ NEW - PostgreSQL tenant config container +│ +└── windows/ + └── tenant-config-setup.ps1 ✅ NEW - SQL Server Express setup for tenants +``` + +### Fișiere Modificate + +``` +shared/ +├── database/ +│ └── oracle_pool.py ⚠️ MODIFY - Add DEPRECATED warning +│ +├── auth/ +│ ├── jwt_handler.py ⚠️ MODIFY - Add tenant_id to JWT payload +│ └── middleware.py ⚠️ MODIFY - Extract tenant_id, validate access +│ +└── cache/ + └── redis_client.py ⚠️ MODIFY - Use real tenant_id (not "default") + +reports-app/backend/ +├── app/ +│ ├── main.py ⚠️ MODIFY - Initialize MultiTenantPoolManager +│ └── routers/ +│ ├── companies.py ⚠️ MODIFY - Use tenant_id from request.state +│ ├── dashboard.py ⚠️ MODIFY - Use tenant_id from request.state +│ ├── invoices.py ⚠️ MODIFY - Use tenant_id from request.state +│ └── treasury.py ⚠️ MODIFY - Use tenant_id from request.state +│ +└── .env.example ⚠️ MODIFY - Add tenant config DB variables + +docker-compose.yml ⚠️ MODIFY - Add tenant-config-db service + +deployment/windows/ +└── scripts/ + └── Install-ROA2WEB.ps1 ⚠️ MODIFY - Add tenant DB setup +``` + +### Database Schema (Tenant Config DB) + +**PostgreSQL/SQLite Compatible Schema** + +```sql +-- 
shared/schemas/tenant_config_schema.sql + +-- Tenants configuration table +CREATE TABLE IF NOT EXISTS tenants ( + id VARCHAR(36) PRIMARY KEY, -- UUID + name VARCHAR(255) NOT NULL, -- Display name (ex: "Client A - Retail SRL") + connection_type VARCHAR(20) NOT NULL, -- 'ssh_tunnel' | 'direct' + + -- Oracle connection details + oracle_host VARCHAR(255) NOT NULL, -- Oracle server IP/hostname + oracle_port INTEGER NOT NULL DEFAULT 1521, + oracle_sid VARCHAR(50) NOT NULL DEFAULT 'ROA', + oracle_user VARCHAR(100) NOT NULL, + oracle_password_encrypted TEXT NOT NULL, -- Fernet encrypted password + + -- SSH tunnel configuration (NULL if connection_type='direct') + ssh_host VARCHAR(255), -- SSH server IP + ssh_port INTEGER DEFAULT 22, + ssh_user VARCHAR(100), + ssh_key_path VARCHAR(500), -- Path to SSH private key + ssh_tunnel_local_port INTEGER, -- Local port for tunnel (ex: 15261) + + -- Pool configuration + min_connections INTEGER NOT NULL DEFAULT 2, + max_connections INTEGER NOT NULL DEFAULT 10, + + -- Status + is_active BOOLEAN NOT NULL DEFAULT TRUE, + created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + + -- Constraints + CONSTRAINT chk_connection_type CHECK (connection_type IN ('ssh_tunnel', 'direct')), + CONSTRAINT chk_ssh_config CHECK ( + (connection_type = 'direct') OR + (connection_type = 'ssh_tunnel' AND ssh_host IS NOT NULL AND ssh_key_path IS NOT NULL) + ) +); + +-- Tenant users mapping (which users have access to which tenants) +CREATE TABLE IF NOT EXISTS tenant_users ( + id SERIAL PRIMARY KEY, -- Auto-increment ID + tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE, + user_id INTEGER NOT NULL, -- Oracle user ID from CONTAFIN_ORACLE.UTILIZATORI + username VARCHAR(100) NOT NULL, -- Oracle username + is_admin BOOLEAN NOT NULL DEFAULT FALSE, -- Tenant admin (can manage tenant config) + granted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + granted_by INTEGER, -- User ID who 
granted access
+
+    UNIQUE(tenant_id, user_id)
+);
+
+-- Audit logs per tenant
+-- Note: SERIAL is PostgreSQL syntax; on SQLite use INTEGER PRIMARY KEY AUTOINCREMENT
+CREATE TABLE IF NOT EXISTS audit_logs (
+    id SERIAL PRIMARY KEY,
+    tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
+    user_id INTEGER NOT NULL,
+    username VARCHAR(100) NOT NULL,
+    action VARCHAR(100) NOT NULL,          -- 'login', 'query', 'export', etc.
+    resource VARCHAR(255),                 -- Resource accessed (ex: 'dashboard', 'invoices')
+    status VARCHAR(20) NOT NULL,           -- 'success' | 'error'
+    error_message TEXT,
+    ip_address VARCHAR(50),
+    user_agent TEXT,
+    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
+);
+
+-- Indexes for fast audit queries
+-- (inline INDEX clauses are MySQL-only and are rejected by PostgreSQL and SQLite)
+CREATE INDEX IF NOT EXISTS idx_tenant_user ON audit_logs(tenant_id, user_id);
+CREATE INDEX IF NOT EXISTS idx_created_at ON audit_logs(created_at);
+
+-- Insert default tenant (backward compatibility)
+-- This maps to existing .env credentials
+INSERT INTO tenants (
+    id, name, connection_type,
+    oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
+    min_connections, max_connections, is_active
+) VALUES (
+    'default',
+    'Default Tenant (Single-Tenant Legacy)',
+    'direct',                              -- 'ssh_tunnel' with NULL ssh_host/ssh_key_path would violate chk_ssh_config; the migration script sets the real value
+    'localhost',                           -- Will be overridden by environment if needed
+    1526,
+    'ROA',
+    'CONTAFIN_ORACLE',
+    'PLACEHOLDER_ENCRYPTED_PASSWORD',      -- Will be replaced by migration script
+    2,
+    10,
+    TRUE
+) ON CONFLICT (id) DO NOTHING;
+
+-- Indexes for performance
+CREATE INDEX IF NOT EXISTS idx_tenants_active ON tenants(is_active);
+CREATE INDEX IF NOT EXISTS idx_tenant_users_user ON tenant_users(user_id);
+CREATE INDEX IF NOT EXISTS idx_audit_tenant ON audit_logs(tenant_id);
+```
+
+---
+
+## 🚀 Upgrade Phases
+
+### PHASE 1: Tenant Configuration Database (2-3 days)
+
+**Objective:** Create the tenant configuration database and the loader that reads tenant configs at startup.
+
+#### Tasks
+
+1. 
**Creează PostgreSQL/SQLite schema pentru tenant config** + - **Fișier:** `shared/schemas/tenant_config_schema.sql` + - **Acțiune:** Define tables `tenants`, `tenant_users`, `audit_logs` + - **Deployment:** + - Dev: SQLite (`data/tenant_config.db`) + - Docker: PostgreSQL container (`roa-tenant-config-db`) + - Windows: SQL Server Express SAU PostgreSQL Windows service + +2. **Implementează TenantConfigLoader** + - **Fișier:** `shared/database/tenant_config.py` + - **Clasa:** `TenantConfigLoader(db_url: str)` + - **Metode:** + - `async def load_tenants() -> Dict[str, TenantConfig]` - Load all active tenants + - `async def get_tenant(tenant_id: str) -> Optional[TenantConfig]` - Get specific tenant + - `async def reload_tenant(tenant_id: str)` - Reload tenant config (for updates) + - **Pattern:** Async context manager pentru DB connections + +3. **Implementează Pydantic models pentru tenant config** + - **Fișier:** `shared/database/tenant_models.py` + - **Models:** + ```python + class TenantConfig(BaseModel): + id: str # UUID + name: str + connection_type: Literal['ssh_tunnel', 'direct'] + oracle_host: str + oracle_port: int + oracle_sid: str + oracle_user: str + oracle_password: str # Decrypted + ssh_host: Optional[str] = None + ssh_port: Optional[int] = 22 + ssh_user: Optional[str] = None + ssh_key_path: Optional[str] = None + ssh_tunnel_local_port: Optional[int] = None + min_connections: int = 2 + max_connections: int = 10 + is_active: bool = True + ``` + +4. **Implementează password encryption/decryption** + - **Fișier:** `shared/utils/encryption.py` + - **Funcții:** + - `encrypt_password(password: str, key: str) -> str` - Fernet encryption + - `decrypt_password(encrypted: str, key: str) -> str` - Fernet decryption + - **Environment:** `DB_ENCRYPTION_KEY` (generate with `Fernet.generate_key()`) + +5. 
**Create the migration script for the default tenant**
+   - **File:** `shared/scripts/create_default_tenant.py`
+   - **Action:**
+     - Read the credentials from the current `.env`
+     - Encrypt the password with `DB_ENCRYPTION_KEY`
+     - Insert the "default" tenant into the tenant DB
+     - Test decryption and the Oracle connection
+
+6. **Update Docker Compose with the tenant config DB**
+   - **File:** `docker-compose.yml`
+   - **New service:**
+   ```yaml
+   roa-tenant-config-db:
+     image: postgres:15-alpine
+     container_name: roa-tenant-config-db
+     environment:
+       POSTGRES_DB: tenant_config
+       POSTGRES_USER: tenant_admin
+       POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
+     volumes:
+       - tenant-config-data:/var/lib/postgresql/data
+       - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
+     networks:
+       - roa-network
+   ```
+
+7. **Update .env.example with the tenant DB variables**
+   ```bash
+   # Tenant Configuration Database
+   TENANT_DB_URL=postgresql://tenant_admin:password@localhost:5432/tenant_config
+   # For SQLite (development): sqlite:///data/tenant_config.db
+   DB_ENCRYPTION_KEY=GENERATE_WITH_Fernet.generate_key()
+   ```
+
+#### Verifiable Output
+
+- ✅ The tenant DB is created successfully (PostgreSQL/SQLite)
+- ✅ Schema tables are created (`tenants`, `tenant_users`, `audit_logs`)
+- ✅ The default tenant loads with the credentials from the current `.env`
+- ✅ Password encryption/decryption works
+- ✅ Test: `pytest shared/tests/test_tenant_config.py -v`
+- ✅ Docker: `docker-compose up roa-tenant-config-db` starts successfully
+
+---
+
+### PHASE 2: MultiTenantPoolManager (3-4 days)
+
+**Objective:** Implement a pool manager that creates separate Oracle pools per tenant with lazy initialization.
+
+#### Tasks
+
+1. 
**Implementează MultiTenantPoolManager class** + - **Fișier:** `shared/database/multi_tenant_pool.py` + - **Pattern:** Singleton (similar cu `OraclePool` actual) + - **Structură:** + ```python + class MultiTenantPoolManager: + _instance: Optional['MultiTenantPoolManager'] = None + _pools: Dict[str, oracledb.ConnectionPool] = {} # tenant_id -> pool + _tenant_configs: Dict[str, TenantConfig] = {} + _pool_locks: Dict[str, asyncio.Lock] = {} # Thread-safe pool creation + _last_access: Dict[str, datetime] = {} # For cleanup inactive pools + + async def initialize(self, tenant_db_url: str): + """Load tenant configs from tenant DB""" + + async def get_connection(self, tenant_id: str): + """Context manager - get connection from tenant pool (lazy init)""" + + async def _ensure_pool(self, tenant_id: str): + """Lazy initialize pool if not exists""" + + async def reload_tenant(self, tenant_id: str): + """Reload tenant config and recreate pool""" + + async def cleanup_inactive_pools(self, max_idle_hours: int = 1): + """Close pools inactive > max_idle_hours""" + + async def close_all_pools(self): + """Shutdown - close all pools""" + ``` + +2. 
**Implementează lazy pool initialization** + - **Logica:** + ```python + async def _ensure_pool(self, tenant_id: str): + if tenant_id in self._pools: + self._last_access[tenant_id] = datetime.utcnow() + return # Pool already exists + + # Acquire lock pentru thread-safety + async with self._pool_locks.setdefault(tenant_id, asyncio.Lock()): + # Double-check în lock + if tenant_id in self._pools: + return + + # Load tenant config + tenant_config = await self._load_tenant_config(tenant_id) + if not tenant_config.is_active: + raise ValueError(f"Tenant {tenant_id} is not active") + + # Create pool + pool = oracledb.create_pool( + user=tenant_config.oracle_user, + password=tenant_config.oracle_password, + host=tenant_config.oracle_host, + port=tenant_config.oracle_port, + sid=tenant_config.oracle_sid, + min=tenant_config.min_connections, + max=tenant_config.max_connections, + increment=1, + getmode=oracledb.POOL_GETMODE_WAIT + ) + + self._pools[tenant_id] = pool + self._tenant_configs[tenant_id] = tenant_config + self._last_access[tenant_id] = datetime.utcnow() + logger.info(f"Created pool for tenant {tenant_id} ({tenant_config.name})") + ``` + +3. **Implementează get_connection context manager** + - **Pattern:** Same as `OraclePool.get_connection()` dar per tenant + ```python + @asynccontextmanager + async def get_connection(self, tenant_id: str): + await self._ensure_pool(tenant_id) # Lazy init + + pool = self._pools[tenant_id] + connection = None + try: + connection = pool.acquire() + self._last_access[tenant_id] = datetime.utcnow() + logger.debug(f"Connection acquired for tenant {tenant_id}") + yield connection + finally: + if connection is not None: + connection.close() + logger.debug(f"Connection returned for tenant {tenant_id}") + ``` + +4. 
**Implementează pool cleanup pentru inactive tenants** + - **Scheduled task:** Run every hour, close pools inactive > 1h + ```python + async def cleanup_inactive_pools(self, max_idle_hours: int = 1): + now = datetime.utcnow() + inactive_tenants = [] + + for tenant_id, last_access in self._last_access.items(): + idle_hours = (now - last_access).total_seconds() / 3600 + if idle_hours > max_idle_hours: + inactive_tenants.append(tenant_id) + + for tenant_id in inactive_tenants: + logger.info(f"Closing inactive pool for tenant {tenant_id}") + pool = self._pools.pop(tenant_id, None) + if pool: + pool.close() + self._tenant_configs.pop(tenant_id, None) + self._last_access.pop(tenant_id, None) + ``` + +5. **Implementează tenant config reload (for dynamic updates)** + - **Use case:** Admin updates tenant config în DB, aplicația reloadează fără restart + ```python + async def reload_tenant(self, tenant_id: str): + # Close existing pool + old_pool = self._pools.pop(tenant_id, None) + if old_pool: + old_pool.close() + + # Reload config from DB + tenant_config = await self._tenant_config_loader.get_tenant(tenant_id) + if not tenant_config: + raise ValueError(f"Tenant {tenant_id} not found") + + # Pool will be recreated on next request (lazy init) + self._tenant_configs.pop(tenant_id, None) + self._last_access.pop(tenant_id, None) + logger.info(f"Reloaded tenant config for {tenant_id}") + ``` + +6. 
**Add backward compatibility layer**
+   - **Tenant "default"** maps to the credentials in `.env`, so there are zero breaking changes
+   ```python
+   async def _load_default_tenant_from_env(self) -> TenantConfig:
+       """Fallback: Load default tenant from .env if tenant DB is not available"""
+       return TenantConfig(
+           id='default',
+           name='Default Tenant (Legacy)',
+           # Always 'direct': a legacy tunnel on localhost is started outside the app,
+           # and 'ssh_tunnel' without ssh_host/ssh_key_path cannot be started by SSHTunnelManager
+           connection_type='direct',
+           oracle_host=os.getenv('ORACLE_HOST', 'localhost'),
+           oracle_port=int(os.getenv('ORACLE_PORT', '1526')),
+           oracle_sid=os.getenv('ORACLE_SID', 'ROA'),
+           # Required variables: fail with a clear KeyError instead of passing None
+           oracle_user=os.environ['ORACLE_USER'],
+           oracle_password=os.environ['ORACLE_PASSWORD'],
+           min_connections=2,
+           max_connections=10,
+           is_active=True
+       )
+   ```
+
+7. **Mark OraclePool as DEPRECATED**
+   - **File:** `shared/database/oracle_pool.py`
+   - **Action:** Add deprecation warning
+   ```python
+   import warnings
+
+   class OraclePool:
+       """
+       DEPRECATED: Use MultiTenantPoolManager instead.
+       This class is kept for backward compatibility only.
+       Will be removed in version 2.0.
+       """
+       def __init__(self):
+           warnings.warn(
+               "OraclePool is deprecated. Use MultiTenantPoolManager for multi-tenant support.",
+               DeprecationWarning,
+               stacklevel=2
+           )
+           # ... rest of code
+   ```
+
+#### Verifiable Output
+
+- ✅ `MultiTenantPoolManager` creates pools per tenant
+- ✅ Lazy initialization: a pool is created only on the first request
+- ✅ Tenant "default" works with the credentials from `.env` (backward compatible)
+- ✅ Pool cleanup: inactive pools are closed automatically after 1h
+- ✅ Reload tenant: config update without restarting the application
+- ✅ Test: `pytest shared/tests/test_multi_tenant_pool.py -v`
+- ✅ Test: Connect to 3 dummy tenants simultaneously
+
+---
+
+### PHASE 3: SSH Tunnel Management per Tenant (2-3 days)
+
+**Objective:** Implement an SSH tunnel manager that creates and monitors one SSH subprocess per remote tenant.
+
+#### Tasks
+
+1. 
**Implementează SSHTunnelManager class** + - **Fișier:** `shared/database/ssh_tunnel_manager.py` + - **Responsabilități:** + - Start SSH tunnel subprocess per tenant + - Monitor tunnel health (periodic checks) + - Auto-restart on failure (exponential backoff) + - Cleanup la shutdown + - **Structură:** + ```python + class SSHTunnelManager: + _tunnels: Dict[str, subprocess.Popen] = {} # tenant_id -> SSH process + _tunnel_ports: Dict[str, int] = {} # tenant_id -> local port + _restart_attempts: Dict[str, int] = {} # For exponential backoff + + async def start_tunnel(self, tenant_config: TenantConfig) -> int: + """Start SSH tunnel for tenant, return local port""" + + async def stop_tunnel(self, tenant_id: str): + """Stop SSH tunnel subprocess""" + + async def check_tunnel_health(self, tenant_id: str) -> bool: + """Check if tunnel is alive and responding""" + + async def restart_tunnel(self, tenant_id: str): + """Restart tunnel with exponential backoff""" + + async def cleanup_all_tunnels(self): + """Shutdown - kill all SSH processes""" + ``` + +2. 
**Implement the SSH tunnel start logic**
+   - **Logic:**
+   ```python
+   async def start_tunnel(self, tenant_config: TenantConfig) -> int:
+       tenant_id = tenant_config.id
+
+       # Use the configured local port, or allocate a unique one for this tenant
+       local_port = tenant_config.ssh_tunnel_local_port or self._allocate_port()
+
+       # Build SSH command. No '-f': with '-f' ssh forks into the background and the
+       # Popen handle exits immediately, so poll()/terminate() would never see the tunnel.
+       ssh_cmd = [
+           'ssh', '-N',
+           '-L', f'{local_port}:{tenant_config.oracle_host}:{tenant_config.oracle_port}',
+           '-p', str(tenant_config.ssh_port),
+           '-i', tenant_config.ssh_key_path,
+           '-o', 'ServerAliveInterval=60',
+           '-o', 'ServerAliveCountMax=3',
+           '-o', 'ExitOnForwardFailure=yes',
+           f'{tenant_config.ssh_user}@{tenant_config.ssh_host}'
+       ]
+
+       # Start process
+       process = subprocess.Popen(ssh_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+
+       # Wait for tunnel to establish (max 10 seconds)
+       for _ in range(10):
+           if self._check_port_open('localhost', local_port):
+               break
+           await asyncio.sleep(1)
+       else:
+           process.kill()
+           raise RuntimeError(f"SSH tunnel failed to start for tenant {tenant_id}")
+
+       self._tunnels[tenant_id] = process
+       self._tunnel_ports[tenant_id] = local_port
+       logger.info(f"SSH tunnel started for tenant {tenant_id} on port {local_port}")
+
+       return local_port
+   ```
+
+3. 
**Implement tunnel health checks**
+   - **Periodic check:** Every 60 seconds, verify tunnel is alive
+   ```python
+   async def check_tunnel_health(self, tenant_id: str) -> bool:
+       if tenant_id not in self._tunnels:
+           return False
+
+       process = self._tunnels[tenant_id]
+       local_port = self._tunnel_ports[tenant_id]
+
+       # Check process is alive
+       if process.poll() is not None:
+           logger.warning(f"SSH tunnel process died for tenant {tenant_id}")
+           return False
+
+       # Check port is accessible
+       if not self._check_port_open('localhost', local_port):
+           logger.warning(f"SSH tunnel port {local_port} not accessible for tenant {tenant_id}")
+           return False
+
+       return True
+
+   def _check_port_open(self, host: str, port: int) -> bool:
+       import socket
+       try:
+           with socket.create_connection((host, port), timeout=2):
+               return True
+       except OSError:
+           # Connection refused, timeout, unreachable host, etc.
+           return False
+   ```
+
+4. **Implement auto-restart with exponential backoff**
+   - **Logic:** If the tunnel dies, restart it with a delay of 5s, 10s, 20s, 40s, capped at 60s
+   ```python
+   async def restart_tunnel(self, tenant_id: str):
+       attempts = self._restart_attempts.get(tenant_id, 0)
+       delay = min(5 * (2 ** attempts), 60)  # Exponential backoff, max 60s
+
+       logger.info(f"Restarting tunnel for tenant {tenant_id} (attempt {attempts+1}, delay {delay}s)")
+       await asyncio.sleep(delay)
+
+       try:
+           await self.stop_tunnel(tenant_id)
+           tenant_config = await self._get_tenant_config(tenant_id)
+           await self.start_tunnel(tenant_config)
+
+           # Reset attempts on success
+           self._restart_attempts[tenant_id] = 0
+           logger.info(f"Tunnel restarted successfully for tenant {tenant_id}")
+       except Exception as e:
+           self._restart_attempts[tenant_id] = attempts + 1
+           logger.error(f"Tunnel restart failed for tenant {tenant_id}: {e}")
+           raise
+   ```
+
+5. 
**Integrate the SSH tunnel manager into MultiTenantPoolManager**
+   - **Logic:** If the tenant has `connection_type='ssh_tunnel'`, start the tunnel before creating the pool
+   ```python
+   # In MultiTenantPoolManager._ensure_pool()
+
+   tenant_config = await self._load_tenant_config(tenant_id)
+   pool_host, pool_port = tenant_config.oracle_host, tenant_config.oracle_port
+
+   # Start SSH tunnel if needed
+   if tenant_config.connection_type == 'ssh_tunnel':
+       tunnels = self._ssh_tunnel_manager
+       if await tunnels.check_tunnel_health(tenant_id):
+           local_port = tunnels._tunnel_ports[tenant_id]  # Reuse the running tunnel
+       else:
+           local_port = await tunnels.start_tunnel(tenant_config)
+       # Connect through the tunnel. tenant_config itself is left unchanged,
+       # because start_tunnel() needs the real Oracle host/port for '-L' on restart.
+       pool_host, pool_port = 'localhost', local_port
+
+   # Create pool (rest of code same as before, with host=pool_host, port=pool_port)
+   pool = oracledb.create_pool(...)
+   ```
+
+6. **Implement cleanup at shutdown**
+   - **Logic:** Kill all SSH processes gracefully
+   ```python
+   async def cleanup_all_tunnels(self):
+       # Iterate over a snapshot so concurrent start/stop calls cannot break the loop
+       for tenant_id, process in list(self._tunnels.items()):
+           try:
+               process.terminate()  # SIGTERM
+               await asyncio.sleep(2)
+               if process.poll() is None:
+                   process.kill()  # SIGKILL if not dead
+               logger.info(f"Stopped SSH tunnel for tenant {tenant_id}")
+           except Exception as e:
+               logger.error(f"Error stopping tunnel for tenant {tenant_id}: {e}")
+
+       self._tunnels.clear()
+       self._tunnel_ports.clear()
+   ```
+
+7. 
**Add a background task for health monitoring**
+   - **File:** `reports-app/backend/app/main.py`
+   - **Task:** Run every 60 seconds
+   ```python
+   async def monitor_ssh_tunnels():
+       # The tunnel registry lives on the SSH tunnel manager, not on the pool manager
+       tunnel_manager = multi_tenant_pool._ssh_tunnel_manager
+       while True:
+           await asyncio.sleep(60)
+           # Snapshot the keys: restart_tunnel() mutates the dict while we iterate
+           for tenant_id in list(tunnel_manager._tunnels):
+               if not await tunnel_manager.check_tunnel_health(tenant_id):
+                   logger.warning(f"Tunnel unhealthy for tenant {tenant_id}, restarting...")
+                   try:
+                       await tunnel_manager.restart_tunnel(tenant_id)
+                   except Exception:
+                       # restart_tunnel() already logged the failure; keep monitoring other tenants
+                       continue
+
+   # In lifespan startup (keep a reference so the task is not garbage-collected)
+   monitor_task = asyncio.create_task(monitor_ssh_tunnels())
+   ```
+
+#### Verifiable Output
+
+- ✅ An SSH tunnel subprocess starts per remote tenant
+- ✅ The tunnel health check detects dead tunnels
+- ✅ Auto-restart with exponential backoff works
+- ✅ Multiple tenants with simultaneous SSH tunnels (unique port allocation)
+- ✅ Cleanup at shutdown: all SSH processes are stopped
+- ✅ Test: `pytest shared/tests/test_ssh_tunnel_manager.py -v`
+- ✅ Manual test: Kill the SSH process, verify auto-restart in < 60s
+
+---
+
+### PHASE 4: JWT & Middleware Update (2-3 days)
+
+**Objective:** Update JWT tokens to include `tenant_id` and the middleware to extract and validate tenant access.
+
+#### Tasks
+
+1. 
**Update JWT handler să includă tenant_id** + - **Fișier:** `shared/auth/jwt_handler.py` + - **Modificări:** + ```python + # În TokenData model + class TokenData(BaseModel): + username: str + user_id: Optional[int] = None + tenant_id: str = Field(description="Tenant ID (UUID)") # NEW + companies: List[str] = Field(default_factory=list) + permissions: List[str] = Field(default_factory=list) + exp: datetime + iat: datetime + token_type: str = Field(alias="type") + + # În create_access_token() + def create_access_token( + self, + username: str, + tenant_id: str, # NEW parameter + companies: List[str], + user_id: Optional[int] = None, + permissions: Optional[List[str]] = None + ) -> str: + payload = { + "username": username, + "user_id": user_id, + "tenant_id": tenant_id, # NEW + "companies": companies or [], + "permissions": permissions or ["read"], + "exp": expire, + "iat": now, + "type": "access" + } + # ... rest same + ``` + +2. **Update login endpoint să determine tenant_id** + - **Fișier:** `reports-app/backend/app/main.py` (auth router) + - **Logica:** + - Check `tenant_users` table pentru user_id + - Dacă user are access la multiple tenants, return primul (default) + - Sau user selectează tenant la login (future enhancement) + ```python + # În login endpoint + + # Get user's tenants from tenant_users table + tenants = await tenant_config_loader.get_user_tenants(user_id) + + if not tenants: + # Fallback: Use "default" tenant (backward compatibility) + tenant_id = "default" + else: + # Use first tenant (or let user select in future) + tenant_id = tenants[0]['tenant_id'] + + # Create JWT with tenant_id + access_token = jwt_handler.create_access_token( + username=credentials.username, + tenant_id=tenant_id, # NEW + companies=companies, + user_id=user_id, + permissions=["read", "reports"] + ) + ``` + +3. 
**Implement TenantMiddleware for tenant access validation**
+   - **File:** `shared/middleware/tenant_middleware.py`
+   - **Responsibilities:**
+     - Extract `tenant_id` from the JWT token
+     - Validate that the user has access to that tenant
+     - Inject `tenant_id` into `request.state.tenant_id`
+   ```python
+   from typing import List, Optional
+
+   from fastapi import Request
+   from fastapi.responses import JSONResponse
+   from starlette.middleware.base import BaseHTTPMiddleware
+
+   class TenantMiddleware(BaseHTTPMiddleware):
+       def __init__(self, app, excluded_paths: Optional[List[str]] = None):
+           super().__init__(app)
+           self.excluded_paths = set(excluded_paths or [])
+
+       async def dispatch(self, request: Request, call_next):
+           # Skip excluded paths
+           if request.url.path in self.excluded_paths:
+               return await call_next(request)
+
+           # Extract tenant_id from JWT token (already decoded by AuthMiddleware)
+           user = getattr(request.state, 'user', None)
+           if not user:
+               return JSONResponse(
+                   status_code=401,
+                   content={"detail": "Not authenticated"}
+               )
+
+           tenant_id = user.get('tenant_id')
+           if not tenant_id:
+               return JSONResponse(
+                   status_code=400,
+                   content={"detail": "Missing tenant_id in token"}
+               )
+
+           # Validate tenant exists and is active
+           tenant_config = await tenant_config_loader.get_tenant(tenant_id)
+           if not tenant_config or not tenant_config.is_active:
+               return JSONResponse(
+                   status_code=403,
+                   content={"detail": f"Tenant {tenant_id} is not active"}
+               )
+
+           # Validate user has access to this tenant
+           user_id = user.get('user_id')
+           has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
+           if not has_access:
+               return JSONResponse(
+                   status_code=403,
+                   content={"detail": f"User {user_id} does not have access to tenant {tenant_id}"}
+               )
+
+           # Inject tenant_id into request state
+           request.state.tenant_id = tenant_id
+           request.state.tenant_name = tenant_config.name
+
+           # Continue request
+           response = await call_next(request)
+
+           # Log audit (awaited inline here; move to a background task if it adds latency)
+           await self._log_audit(request, response, tenant_id, user_id)
+
+           return response
+   ```
+
+4. 
**Update AuthenticationMiddleware să funcționeze cu TenantMiddleware** + - **Fișier:** `shared/auth/middleware.py` + - **Ordinea middleware-urilor:** + ```python + # În main.py + app.add_middleware(TenantMiddleware, excluded_paths=["/", "/docs", "/health", ...]) + app.add_middleware(AuthenticationMiddleware, excluded_paths=["/", "/docs", "/health", ...]) + ``` + - **Flow:** AuthMiddleware decode JWT → TenantMiddleware validate tenant access + +5. **Update toate router-urile să folosească tenant_id din request.state** + - **Fișiere:** `reports-app/backend/app/routers/*.py` + - **Pattern:** + ```python + # Înainte (single-tenant) + async with oracle_pool.get_connection() as connection: + # query... + + # După (multi-tenant) + tenant_id = request.state.tenant_id # Injected by TenantMiddleware + async with multi_tenant_pool.get_connection(tenant_id) as connection: + # query... + ``` + - **Exemplu:** `dashboard.py` + ```python + @router.get("/{company_id}") + async def get_dashboard(company_id: str, request: Request): + tenant_id = request.state.tenant_id # NEW + + async with multi_tenant_pool.get_connection(tenant_id) as connection: + with connection.cursor() as cursor: + # ... rest same + ``` + +6. **Update Telegram bot pentru tenant support** + - **Fișier:** `reports-app/telegram-bot/app/auth/linking.py` + - **Modificări:** + - La linking, salvează și `tenant_id` în SQLite + - JWT token include `tenant_id` + - Toate requests la backend includ tenant_id corect + +7. 
**Add tenant selection endpoint (future enhancement)**
+   - **Endpoint:** `POST /api/auth/select-tenant`
+   - **Use case:** A user with access to multiple tenants can switch between them
+   - **Response:** New JWT token with a different tenant_id
+
+#### Verifiable Output
+
+- ✅ JWT token includes a `tenant_id` field
+- ✅ Login endpoint generates a token with the correct tenant_id
+- ✅ TenantMiddleware extracts and validates tenant_id
+- ✅ Routers use `multi_tenant_pool.get_connection(tenant_id)`
+- ✅ A request to an invalid tenant returns 403 Forbidden
+- ✅ A user without access to the tenant gets 403 Forbidden
+- ✅ Test: `pytest shared/tests/test_tenant_middleware.py -v`
+- ✅ Test: Login as a user with access to tenant A, request against tenant B → 403
+
+---
+
+### PHASE 5: Cache & Audit Logging Integration (1-2 days)
+
+**Objective:** Update the Redis cache to use the real tenant_id (not "default") and implement audit logging per tenant.
+
+#### Tasks
+
+1. **Update the Redis cache to use the real tenant_id**
+   - **File:** `shared/cache/redis_client.py` (if it exists) or inline in the routers
+   - **Change:** Replace the hardcoded `"default"` with the real `tenant_id`
+   - **Before:**
+     ```python
+     cache_key = f"cache:default:dashboard:{company_id}"
+     ```
+   - **After:**
+     ```python
+     tenant_id = request.state.tenant_id
+     cache_key = f"cache:{tenant_id}:dashboard:{company_id}"
+     ```
+
+2. **Implement cache invalidation per tenant**
+   - **Use case:** An admin updates tenant data and invalidates only that tenant's cache
+   - **Endpoint:** `DELETE /api/cache/{tenant_id}` (admin only)
+   - **Logic:**
+     ```python
+     pattern = f"cache:{tenant_id}:*"
+     # SCAN iterates incrementally; KEYS would block Redis while it
+     # walks the whole keyspace, which hurts every other tenant
+     for key in redis_client.scan_iter(match=pattern, count=500):
+         redis_client.delete(key)
+     ```
+
+3. 
**Implement audit logging in TenantMiddleware**
+   - **File:** `shared/middleware/tenant_middleware.py`
+   - **Logic:** Log every request to the `audit_logs` table
+     ```python
+     async def _log_audit(self, request: Request, response: Response, tenant_id: str, user_id: int):
+         # Extract info
+         action = f"{request.method} {request.url.path}"
+         status = "success" if response.status_code < 400 else "error"
+         # call_next() returns a streaming response with no .body attribute,
+         # so record the status code rather than the response body
+         error_message = None if status == "success" else f"HTTP {response.status_code}"
+
+         # Insert into the audit_logs table
+         await audit_logger.log(
+             tenant_id=tenant_id,
+             user_id=user_id,
+             username=request.state.user.get('username'),
+             action=action,
+             resource=request.url.path,
+             status=status,
+             error_message=error_message,
+             # request.client can be None (e.g. some test clients and proxies)
+             ip_address=request.client.host if request.client else None,
+             user_agent=request.headers.get('user-agent')
+         )
+     ```
+
+4. **Implement the AuditLogger helper class**
+   - **File:** `shared/utils/audit_logger.py`
+   - **Method:**
+     ```python
+     from typing import Optional
+
+     class AuditLogger:
+         def __init__(self, tenant_db_url: str):
+             self.db_url = tenant_db_url
+
+         async def log(
+             self,
+             tenant_id: str,
+             user_id: int,
+             username: str,
+             action: str,
+             resource: str,
+             status: str,
+             error_message: Optional[str] = None,
+             ip_address: Optional[str] = None,
+             user_agent: Optional[str] = None
+         ) -> None:
+             # Insert into the audit_logs table.
+             # The $1..$9 placeholders are PostgreSQL style; a SQLite
+             # driver would need "?" placeholders instead.
+             query = """
+                 INSERT INTO audit_logs (
+                     tenant_id, user_id, username, action, resource,
+                     status, error_message, ip_address, user_agent
+                 ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
+             """
+             await self._execute_query(query, [
+                 tenant_id, user_id, username, action, resource,
+                 status, error_message, ip_address, user_agent
+             ])
+     ```
+
+5. **Add audit logs viewing endpoint**
+   - **Endpoint:** `GET /api/audit-logs/{tenant_id}` (tenant admin only)
+   - **Filters:** `?user_id=123&start_date=2025-10-01&end_date=2025-10-31&status=error`
+   - **Response:** Paginated audit logs for the tenant
+
+6. 
**Add metrics per tenant (optional, future)**
+   - **Metrics:**
+     - Request count per tenant
+     - Response time per tenant
+     - Error rate per tenant
+     - Active users per tenant
+   - **Storage:** Time-series database (InfluxDB) or Redis sorted sets
+
+#### Verifiable Output
+
+- ✅ Redis cache keys include the real tenant_id (not "default")
+- ✅ Cache isolation: tenant A's cache is not visible to tenant B
+- ✅ Cache invalidation per tenant works
+- ✅ Audit logs are saved to the `audit_logs` table
+- ✅ Audit logs include tenant_id, user_id, action, status
+- ✅ The audit logs viewing endpoint returns logs filtered per tenant
+- ✅ Test: `pytest shared/tests/test_audit_logging.py -v`
+
+---
+
+### PHASE 6: Deployment & Testing (3-4 days)
+
+**Objective:** Deploy multi-tenant to all environments (dev, Docker, Windows) and run the full test suite.
+
+#### Tasks
+
+1. **Update development environment (WSL)**
+   - **Setup:**
+     ```bash
+     # Create SQLite tenant DB
+     sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql
+
+     # Generate encryption key
+     python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
+
+     # Update .env
+     echo "TENANT_DB_URL=sqlite:///data/tenant_config.db" >> .env
+     # Append the key printed above after the "=" before running this line
+     echo "DB_ENCRYPTION_KEY=" >> .env
+
+     # Create default tenant
+     python shared/scripts/create_default_tenant.py
+
+     # Start app
+     ./start-dev.sh
+     ```
+   - **Verification:** Login works with tenant "default"
+
+2. **Update Docker deployment**
+   - **File:** `docker-compose.yml`
+   - **Changes:**
+     - Add `roa-tenant-config-db` service (PostgreSQL)
+     - Update `roa-backend` env vars (`TENANT_DB_URL`, `DB_ENCRYPTION_KEY`)
+     - Mount SSH keys volume read-only
+   - **Deployment:**
+     ```bash
+     # Build images
+     docker-compose build
+
+     # Start services
+     docker-compose up -d
+
+     # Initialize tenant DB
+     docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
+
+     # Verify
+     docker-compose logs roa-backend | grep "tenant"
+     ```
+
+3. 
**Update Windows IIS deployment**
+   - **Script:** `deployment/windows/scripts/Setup-TenantDB.ps1`
+   - **Actions:**
+     - Install SQL Server Express OR the PostgreSQL Windows service
+     - Create the `tenant_config` database
+     - Run the schema SQL
+     - Generate the encryption key (store it in Windows Credential Manager)
+     - Create the default tenant
+   - **Manual steps:**
+     ```powershell
+     # Run setup
+     .\deployment\windows\scripts\Setup-TenantDB.ps1
+
+     # Update web.config with TENANT_DB_URL
+     # Restart ROA2WEB-Backend service
+     Restart-Service ROA2WEB-Backend
+     ```
+
+4. **Implement comprehensive integration tests**
+   - **File:** `shared/tests/integration/test_multi_tenant_flow.py`
+   - **Scenarios:**
+     - Login with tenant A → Get dashboard → Cache hit for tenant A
+     - Login with tenant B → Get dashboard → Cache miss (different tenant)
+     - User with access to tenant A tries tenant B → 403 Forbidden
+     - A tenant's SSH tunnel is killed → Auto-recovery restarts it
+     - Tenant inactive > 1h → Pool cleanup
+   - **Run:**
+     ```bash
+     pytest shared/tests/integration/ -v --tb=short
+     ```
+
+5. **Implement load testing with multiple tenants**
+   - **Tool:** Locust or Apache Bench
+   - **Scenario:** 3 tenants, 100 requests each, simultaneous
+   - **Script:** `shared/tests/load/test_multi_tenant_load.py`
+   - **Metrics:**
+     - Response time per tenant (< 200ms avg)
+     - Error rate (< 1%)
+     - Pool usage (max connections per tenant)
+     - SSH tunnel stability (no restarts)
+
+6. **Create tenant onboarding guide**
+   - **File:** `shared/docs/TENANT_ONBOARDING.md`
+   - **Contents:**
+     - How to add a new tenant (manual SQL or admin UI)
+     - SSH key setup for a remote tenant
+     - User assignment to a tenant
+     - Testing the tenant connection
+     - Troubleshooting common issues
+
+7. 
**Create monitoring dashboard (optional)**
+   - **Tools:** Grafana + Prometheus
+   - **Metrics:**
+     - Active tenants count
+     - Pool connections per tenant
+     - Request rate per tenant
+     - Error rate per tenant
+     - SSH tunnel uptime per tenant
+
+#### Verifiable Output
+
+- ✅ Development (WSL): Multi-tenant works with the SQLite tenant DB
+- ✅ Docker: Multi-tenant works with the PostgreSQL tenant DB
+- ✅ Windows IIS: Multi-tenant works with SQL Server Express
+- ✅ Integration tests pass (100% success rate)
+- ✅ Load tests: 3 tenants × 100 requests, < 200ms avg response time
+- ✅ SSH tunnels: No crashes during a 1h load test
+- ✅ Cache isolation validated: Tenant A cache ≠ Tenant B cache
+- ✅ Audit logs populated correctly for all requests
+- ✅ Documentation complete (onboarding guide, troubleshooting)
+
+---
+
+## 🔧 Connection Management
+
+### SSH Tunnel Configuration
+
+**Tenant with SSH Tunnel (Remote Client)**
+
+```json
+{
+  "id": "client-a-uuid",
+  "name": "Client A - Retail SRL",
+  "connection_type": "ssh_tunnel",
+
+  "oracle_host": "10.0.20.36",
+  "oracle_port": 1521,
+  "oracle_sid": "ROA",
+  "oracle_user": "CLIENT_A_USER",
+  "oracle_password_encrypted": "gAAAAABh...",
+
+  "ssh_host": "83.103.197.79",
+  "ssh_port": 22122,
+  "ssh_user": "roa2web",
+  "ssh_key_path": "/app/ssh-keys/client-a.key",
+  "ssh_tunnel_local_port": 15261,
+
+  "min_connections": 2,
+  "max_connections": 10,
+  "is_active": true
+}
+```
+
+**SSH Tunnel Flow:**
+
+```
+Backend Process
+    ↓
+SSHTunnelManager.start_tunnel()
+    ↓
+subprocess: ssh -f -N -i /app/ssh-keys/client-a.key -L 15261:10.0.20.36:1521 -p 22122 roa2web@83.103.197.79
+    ↓
+Tunnel established: localhost:15261 → 10.0.20.36:1521
+    ↓
+OraclePool connects to localhost:15261
+    ↓
+Oracle queries routed through the SSH tunnel
+```
+
+### Direct Connection Configuration
+
+**Tenant with Direct Connection (LAN Client)**
+
+```json
+{
+  "id": "client-b-uuid",
+  "name": "Client B - Import Export SA",
+  "connection_type": "direct",
+
+  "oracle_host": "192.168.1.50",
+  "oracle_port": 1521,
+  "oracle_sid": "ROA",
+  "oracle_user": "CLIENT_B_USER",
+  "oracle_password_encrypted": "gAAAAABh...",
+
+  "ssh_host": null,
+  "ssh_port": null,
+  "ssh_user": null,
+  "ssh_key_path": null,
+  "ssh_tunnel_local_port": null,
+
+  "min_connections": 5,
+  "max_connections": 20,
+  "is_active": true
+}
+```
+
+**Direct Connection Flow:**
+
+```
+Backend Process
+    ↓
+MultiTenantPoolManager.get_connection(tenant_id)
+    ↓
+Check connection_type: "direct" → Skip SSH tunnel
+    ↓
+OraclePool.create_pool(host=192.168.1.50, port=1521, ...)
+    ↓
+Oracle queries go directly to 192.168.1.50:1521
+```
+
+### Mixed Environment Setup
+
+**3 Tenants: 2 SSH, 1 Direct**
+
+| Tenant ID | Name | Type | Oracle Host | SSH Tunnel | Local Port |
+|-----------|------|------|-------------|------------|------------|
+| client-a-uuid | Client A - Retail SRL | ssh_tunnel | 10.0.20.36:1521 | 83.103.197.79:22122 | 15261 |
+| client-b-uuid | Client B - Import SA | direct | 192.168.1.50:1521 | N/A | N/A |
+| client-c-uuid | Client C - Distribution | ssh_tunnel | 10.0.20.36:1521 | 212.18.45.99:22 | 15262 |
+
+**Resource Usage:**
+
+```
+Backend Memory:
+├── Pool Client A: 2-10 connections × ~5MB = 10-50MB
+├── Pool Client B: 5-20 connections × ~5MB = 25-100MB
+├── Pool Client C: 2-10 connections × ~5MB = 10-50MB
+└── Total: ~50-200MB (vs single-tenant ~10-50MB)
+
+SSH Processes:
+├── Tunnel Client A: ~10MB RAM
+├── Tunnel Client C: ~10MB RAM
+└── Total: ~20MB
+
+Total Overhead: ~70-220MB (acceptable for multi-tenant SaaS)
+```
+
+---
+
+## 🔒 Security Model
+
+### Encryption Strategy
+
+**Password Encryption in the Tenant DB**
+
+```python
+from cryptography.fernet import Fernet
+
+# Generate encryption key (store it in .env)
+encryption_key = Fernet.generate_key()  # Example: b'Xs3J7...'
+
+# Encrypt password
+fernet = Fernet(encryption_key)
+encrypted_password = fernet.encrypt(b"oracle_password_plaintext")
+# Result: b"gAAAAABh3J..." (bytes; decode to text before storing in the DB)
+
+# Decrypt password (at runtime). The Fernet token is already bytes,
+# so it is passed as-is; only the decrypted result is decoded to str.
+decrypted_password = fernet.decrypt(encrypted_password).decode()
+```
+
+**Security Properties:**
+
+- ✅ Symmetric encryption (Fernet: AES-128 in CBC mode + HMAC-SHA256)
+- ✅ Encryption key in an environment variable (`DB_ENCRYPTION_KEY`)
+- ✅ Passwords encrypted at rest in the tenant DB
+- ✅ Decryption only at pool initialization (memory only)
+- ❌ **Never**: passwords in logs, error messages, or audit trails
+
+### Tenant Isolation
+
+**Complete Isolation Between Tenants**
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                        Tenant A                         │
+│                                                         │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │ Connection Pool (2-10 connections)               │   │
+│  │ - oracle_host: 10.0.20.36 (via SSH tunnel)       │   │
+│  │ - oracle_user: CLIENT_A_USER                     │   │
+│  │ - Schema: CLIENT_A_SCHEMA                        │   │
+│  └──────────────────────────────────────────────────┘   │
+│                                                         │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │ Redis Cache Namespace                            │   │
+│  │ - cache:client-a-uuid:*                          │   │
+│  └──────────────────────────────────────────────────┘   │
+│                                                         │
+│  ┌──────────────────────────────────────────────────┐   │
+│  │ Audit Logs                                       │   │
+│  │ - audit_logs WHERE tenant_id='client-a-uuid'     │   │
+│  └──────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────┘
+
+                  ❌ ZERO SHARING ❌
+
+┌─────────────────────────────────────────────────────────┐
+│                        Tenant B                         │
+│          (Same structure, COMPLETELY ISOLATED)          │
+└─────────────────────────────────────────────────────────┘
+```
+
+**Isolation Guarantees:**
+
+1. **Connection Pool:** Tenant A connections are NEVER used for tenant B queries
+2. **Cache:** Redis keys namespaced per tenant (`cache:{tenant_id}:*`)
+3. **Audit Logs:** Query filter `WHERE tenant_id = $1` (indexed for performance)
+4. 
**SSH Tunnels:** Separate processes, separate local ports (no crosstalk)
+
+### JWT Token Structure
+
+**Token with Tenant ID (Signed)**
+
+```json
+{
+  "username": "john.doe",
+  "user_id": 123,
+  "tenant_id": "client-a-uuid",
+  "companies": ["COMP1", "COMP2"],
+  "permissions": ["read", "reports"],
+  "exp": 1735142400,
+  "iat": 1735140600,
+  "type": "access"
+}
+```
+
+**Security Checks in TenantMiddleware:**
+
+```python
+# 1. Extract tenant_id from JWT (decoded by AuthMiddleware)
+tenant_id = request.state.user.get('tenant_id')
+
+# 2. Validate tenant exists and is active.
+# Return a response instead of raising: an HTTPException raised inside
+# BaseHTTPMiddleware bypasses FastAPI's exception handlers and becomes a 500.
+tenant_config = await tenant_config_loader.get_tenant(tenant_id)
+if not tenant_config or not tenant_config.is_active:
+    return JSONResponse(status_code=403, content={"detail": "Tenant not active"})
+
+# 3. Validate user has access to this tenant
+user_id = request.state.user.get('user_id')
+has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
+if not has_access:
+    return JSONResponse(status_code=403, content={"detail": "User does not have access to this tenant"})
+
+# 4. Inject tenant_id into request state (routers must treat it as read-only)
+request.state.tenant_id = tenant_id  # Routers use this
+```
+
+**Attack Scenarios Prevented:**
+
+- ❌ **Tenant ID Tampering:** The JWT is signed, so a client cannot change tenant_id without invalidating the signature
+- ❌ **Cross-Tenant Access:** A user with access to tenant A cannot reach tenant B (check in step 3)
+- ❌ **Inactive Tenant Access:** Tenant deactivated → requests rejected (check in step 2)
+- ❌ **SQL Injection via Tenant ID:** The UUID is validated and only used in parameterized queries
+
+---
+
+## 🧪 Testing Strategy
+
+### Unit Tests
+
+**Test Coverage per Component**
+
+```bash
+shared/tests/
+├── test_tenant_config.py                     # TenantConfigLoader
+│   ├── test_load_tenants()                   # Load all tenants from DB
+│   ├── test_get_tenant()                     # Get specific tenant
+│   ├── test_reload_tenant()                  # Reload tenant config
+│   ├── test_encryption_decryption()          # Password encryption/decryption
+│   └── test_default_tenant_fallback()        # Fall back to .env credentials
+│
+├── test_multi_tenant_pool.py                 # MultiTenantPoolManager
+│   ├── test_lazy_pool_initialization()       # Pool created only on first request
+│   ├── test_pool_per_tenant()                # Separate pools per tenant
+│   ├── test_pool_cleanup_inactive()          # Cleanup after 1h of inactivity
+│   ├── test_tenant_reload()                  # Reload tenant without restart
+│   └── test_connection_context_manager()     # get_connection() pattern
+│
+├── test_ssh_tunnel_manager.py                # SSHTunnelManager
+│   ├── test_start_tunnel()                   # Start SSH tunnel subprocess
+│   ├── test_stop_tunnel()                    # Stop SSH tunnel gracefully
+│   ├── test_tunnel_health_check()            # Detect dead tunnels
+│   ├── test_auto_restart()                   # Restart with exponential backoff
+│   └── test_cleanup_all_tunnels()            # Kill all processes at shutdown
+│
+├── test_tenant_middleware.py                 # TenantMiddleware
+│   ├── test_extract_tenant_id()              # Extract tenant_id from JWT
+│   ├── test_validate_tenant_access()         # User access validation
+│   ├── test_inactive_tenant_blocked()        # Inactive tenant → 403
+│   ├── test_cross_tenant_access_blocked()    # Tenant A user → tenant B request → 403
+│   └── test_audit_logging()                  # Audit logs saved correctly
+│
+└── test_encryption.py                        # Encryption utils
+    ├── test_fernet_encryption()              # Encrypt/decrypt passwords
+    └── test_key_rotation()                   # Future: Key rotation support
+```
+
+**Run Unit Tests:**
+
+```bash
+cd shared/
+pytest tests/ -v --cov=database --cov=middleware --cov=utils --cov-report=html
+
+# Expected output:
+# ✅ test_tenant_config.py::test_load_tenants PASSED
+# ✅ test_multi_tenant_pool.py::test_lazy_pool_initialization PASSED
+# ...
+# Coverage: 85% (target: > 80%)
+```
+
+### Integration Tests
+
+**End-to-End Scenarios**
+
+```bash
+shared/tests/integration/
+├── test_multi_tenant_flow.py                 # Complete multi-tenant flow
+│   ├── test_login_with_tenant_a()            # Login → JWT with tenant A
+│   ├── test_dashboard_tenant_a()             # Dashboard query tenant A
+│   ├── test_cache_hit_tenant_a()             # Cache hit for tenant A
+│   ├── test_cross_tenant_isolation()         # Tenant A cache ≠ Tenant B cache
+│   └── test_audit_logs_populated()           # Audit logs saved per tenant
+│
+├── test_ssh_tunnel_resilience.py             # SSH tunnel stability
+│   ├── test_tunnel_auto_recovery()           # Kill tunnel → Auto-restart
+│   ├── test_multiple_tunnels_parallel()      # 3 SSH tenants simultaneously
+│   └── test_tunnel_port_conflicts()          # Port allocation unique
+│
+└── test_deployment_scenarios.py              # Deployment compatibility
+    ├── test_development_sqlite()             # Development with SQLite tenant DB
+    ├── test_docker_postgresql()              # Docker with PostgreSQL tenant DB
+    └── test_backward_compatibility()         # Tenant "default" works
+```
+
+**Run Integration Tests:**
+
+```bash
+# Requires: PostgreSQL tenant DB running + Redis + Oracle test server
+docker-compose -f docker-compose.test.yml up -d
+
+pytest shared/tests/integration/ -v --tb=short
+
+# Expected output:
+# ✅ test_multi_tenant_flow.py::test_login_with_tenant_a PASSED (0.5s)
+# ✅ test_multi_tenant_flow.py::test_cache_hit_tenant_a PASSED (0.2s)
+# ...
+```
+
+### Load Testing
+
+**Performance Validation with Multiple Tenants**
+
+```python
+# shared/tests/load/test_multi_tenant_load.py
+
+import random
+
+from locust import HttpUser, task, between
+
+class MultiTenantUser(HttpUser):
+    wait_time = between(1, 3)
+
+    def on_start(self):
+        # Log in to a random tenant
+        self.tenant = random.choice(['client-a-uuid', 'client-b-uuid', 'client-c-uuid'])
+        response = self.client.post('/api/auth/login', json={
+            'username': f'user_{self.tenant}',
+            'password': 'test_password'
+        })
+        response.raise_for_status()  # Fail fast if the login is rejected
+        self.token = response.json()['access_token']
+        self.client.headers.update({'Authorization': f'Bearer {self.token}'})
+
+    @task(3)
+    def get_dashboard(self):
+        self.client.get('/api/dashboard/COMP1')
+
+    @task(2)
+    def get_invoices(self):
+        self.client.get('/api/invoices/COMP1')
+
+    @task(1)
+    def get_treasury(self):
+        self.client.get('/api/treasury/COMP1')
+```
+
+**Run Load Test:**
+
+```bash
+locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001
+
+# Scenario: 3 tenants × 100 users = 300 concurrent users
+# Duration: 10 minutes
+# Expected:
+# - Response time: < 200ms (p95)
+# - Error rate: < 1%
+# - SSH tunnels: No restarts
+# - Pool connections: Max 10 per tenant (no exhaustion)
+```
+
+---
+
+## 📊 Migration Checklist
+
+### Pre-Migration
+
+- [ ] **Backup production database**
+  ```bash
+  # Backup Oracle database
+  expdp username/password@ROA directory=BACKUP dumpfile=pre_migration.dmp
+
+  # Backup existing .env files
+  cp reports-app/backend/.env reports-app/backend/.env.backup
+  ```
+
+- [ ] **Document current single-tenant config**
+  ```bash
+  # Save current credentials (plaintext secrets: keep this file out of version control)
+  cat reports-app/backend/.env > docs/pre_migration_env.txt
+
+  # Save current SSH tunnel config
+  ./ssh_tunnel.sh status > docs/pre_migration_ssh.txt
+  ```
+
+- [ ] **Test the deployment in a non-production environment**
+  ```bash
+  # Create staging environment
+  docker-compose -f docker-compose.staging.yml up -d
+
+  # Deploy multi-tenant to staging
+  # ... follow migration steps ...
+
+  # Validate staging works
+  curl http://staging.roa2web.local/api/health
+  ```
+
+- [ ] **Generate DB encryption key**
+  ```bash
+  python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
+  # Save the printed value in .env after: DB_ENCRYPTION_KEY=
+  ```
+
+- [ ] **Prepare tenant configuration**
+  - Create the tenant DB (PostgreSQL/SQLite)
+  - Populate it with tenant "default" (existing credentials)
+  - Add SSH keys for remote tenants
+
+### Migration Steps (Production)
+
+**Step 1: Deploy Tenant Config DB (30 min)**
+
+```bash
+# Docker deployment
+docker-compose up -d roa-tenant-config-db
+
+# Verify DB is running
+docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c '\dt'
+
+# Run schema (only needed if '\dt' shows no tables: the postgres image runs
+# /docker-entrypoint-initdb.d scripts only when the data directory is empty)
+docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -f /docker-entrypoint-initdb.d/schema.sql
+```
+
+**Step 2: Populate Tenant "default" (15 min)**
+
+```bash
+# Run migration script
+docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
+
+# Verify tenant created
+docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, connection_type FROM tenants;'
+```
+
+**Step 3: Deploy Backend with MultiTenantPoolManager (45 min)**
+
+```bash
+# Update .env with tenant DB URL
+echo "TENANT_DB_URL=postgresql://tenant_admin:password@roa-tenant-config-db:5432/tenant_config" >> .env
+
+# Rebuild backend image
+docker-compose build roa-backend
+
+# Deploy new backend (this recreates the container, so expect a brief restart)
+docker-compose up -d roa-backend
+
+# Wait for health check
+watch -n 2 'curl -s http://localhost:8001/health | jq'
+```
+
+**Step 4: Verify Tenant "default" Works (15 min)**
+
+```bash
+# Test login (should work exactly as before)
+curl -X POST http://localhost:8001/api/auth/login \
+  -H 'Content-Type: application/json' \
+  -d '{"username": "test_user", "password": "test_password"}'
+
+# Response should include tenant_id: "default"
+# {
+#   "access_token": "eyJ...",
+#   "user": {
+#     "tenant_id": "default",
+#     ...
+#   }
+# }
+
+# Test dashboard (should work as before)
+curl -H "Authorization: Bearer $TOKEN" http://localhost:8001/api/dashboard/COMP1
+```
+
+**Step 5: Add New Tenants (One by One)**
+
+```bash
+# Add tenant A (SSH tunnel)
+docker-compose exec roa-backend python shared/scripts/add_tenant.py \
+  --name "Client A - Retail SRL" \
+  --connection-type ssh_tunnel \
+  --oracle-host 10.0.20.36 \
+  --oracle-user CLIENT_A_USER \
+  --oracle-password "encrypted_password" \
+  --ssh-host 83.103.197.79 \
+  --ssh-port 22122 \
+  --ssh-key /app/ssh-keys/client-a.key \
+  --ssh-local-port 15261
+
+# Add users to tenant A
+docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
+  "INSERT INTO tenant_users (tenant_id, user_id, username) VALUES ('client-a-uuid', 123, 'john.doe');"
+
+# Test tenant A login
+curl -X POST http://localhost:8001/api/auth/login \
+  -H 'Content-Type: application/json' \
+  -d '{"username": "john.doe", "password": "password"}'
+
+# Verify JWT includes tenant_id: "client-a-uuid"
+```
+
+**Step 6: Monitor Logs per Tenant (Ongoing)**
+
+```bash
+# Monitor all tenant logs
+docker-compose logs -f roa-backend | grep "tenant_id"
+
+# Monitor SSH tunnels
+docker-compose logs -f roa-backend | grep "SSH tunnel"
+
+# Monitor pool connections
+docker-compose logs -f roa-backend | grep "pool"
+
+# Check audit logs
+docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
+  'SELECT tenant_id, username, action, status, created_at FROM audit_logs ORDER BY created_at DESC LIMIT 20;'
+```
+
+**Step 7: Performance Validation (1-2h)**
+
+```bash
+# Run load test (--headless is required for --run-time to take effect)
+locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001 --headless --users=100 --spawn-rate=10 --run-time=1h
+
+# Monitor metrics
+# - Response time: < 200ms (p95)
+# - Error rate: < 1%
+# - Pool usage: < 80% per tenant
+# - SSH tunnels: No restarts
+```
+
+### Post-Migration
+
+- [ ] **All tenants functional**
+  - Tenant "default" works (backward compatibility)
+  - Tenant A works (SSH tunnel)
+  - Tenant B works (direct connection)
+
+- [ ] **No performance degradation**
+  - Response time same as single-tenant (< 10% overhead)
+  - No connection pool exhaustion
+  - SSH tunnels stable (no auto-restarts)
+
+- [ ] **Audit logs populated**
+  ```sql
+  -- Verify audit logs per tenant (run against the tenant config DB)
+  SELECT tenant_id, COUNT(*) FROM audit_logs GROUP BY tenant_id;
+  ```
+
+- [ ] **Documentation updated**
+  - Update `CLAUDE.md` with the multi-tenant architecture
+  - Update deployment guides (Docker, Windows)
+  - Create tenant onboarding guide
+
+- [ ] **Monitoring dashboards**
+  - Grafana dashboard per tenant
+  - Alerts for pool exhaustion and SSH tunnel failures
+
+---
+
+## 🎯 Deployment Guides
+
+### Development Setup (WSL/Local)
+
+**Prerequisites:**
+- Python 3.11+
+- SQLite3
+- Redis server
+- SSH access to the Oracle server (for tenants with an SSH tunnel)
+
+**Setup Steps:**
+
+```bash
+# 1. Create SQLite tenant DB
+mkdir -p data
+sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql
+
+# 2. Generate encryption key
+python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" > .encryption_key
+DB_ENCRYPTION_KEY=$(cat .encryption_key)
+
+# 3. Update .env
+cat >> reports-app/backend/.env << EOF
+# Tenant Configuration
+TENANT_DB_URL=sqlite:///data/tenant_config.db
+DB_ENCRYPTION_KEY=$DB_ENCRYPTION_KEY
+EOF
+
+# 4. Create default tenant
+cd shared/
+python scripts/create_default_tenant.py
+
+# 5. Start Redis
+redis-server --daemonize yes
+
+# 6. Start application
+cd ../
+./start-dev.sh
+
+# 7. Verify
+curl http://localhost:8001/health
+# Should return: {"database": "connected", "tenants_loaded": 1}
+```
+
+**Add New Tenant (Development):**
+
+```bash
+# Add tenant via SQL
+sqlite3 data/tenant_config.db << EOF
+INSERT INTO tenants (
+  id, name, connection_type,
+  oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
+  ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port
+) VALUES (
+  'dev-tenant-uuid',
+  'Dev Tenant - Test Company',
+  'ssh_tunnel',
+  '10.0.20.36',
+  1521,
+  'ROA',
+  'DEV_USER',
+  'encrypted_password_here',
+  '83.103.197.79',
+  22122,
+  'roa2web',
+  '/tmp/roa_oracle_server',
+  15263
+);
+
+-- Add user to tenant
+INSERT INTO tenant_users (tenant_id, user_id, username)
+VALUES ('dev-tenant-uuid', 999, 'dev_user');
+EOF
+
+# Restart backend
+pkill -f "uvicorn app.main:app"
+./start-dev.sh
+```
+
+---
+
+### Docker Deployment (Proxmox LXC)
+
+**Prerequisites:**
+- Docker 24+
+- Docker Compose 2.20+
+- 4GB RAM minimum
+- PostgreSQL 15 container
+
+**docker-compose.multi-tenant.yml:**
+
+```yaml
+version: '3.8'
+
+services:
+  # Tenant Configuration Database
+  roa-tenant-config-db:
+    image: postgres:15-alpine
+    container_name: roa-tenant-config-db
+    restart: unless-stopped
+    environment:
+      POSTGRES_DB: tenant_config
+      POSTGRES_USER: tenant_admin
+      POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
+    volumes:
+      - tenant-config-data:/var/lib/postgresql/data
+      - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
+    networks:
+      - roa-network
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U tenant_admin -d tenant_config"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+
+  # Backend (Multi-Tenant)
+  roa-backend:
+    build:
+      context: .
+      dockerfile: ./reports-app/backend/Dockerfile
+    image: roa2web/backend:multi-tenant
+    container_name: roa-backend
+    restart: unless-stopped
+    environment:
+      # Tenant Configuration
+      - TENANT_DB_URL=postgresql://tenant_admin:${TENANT_DB_PASSWORD}@roa-tenant-config-db:5432/tenant_config
+      - DB_ENCRYPTION_KEY=${DB_ENCRYPTION_KEY}
+
+      # JWT Configuration
+      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
+
+      # Redis Cache
+      - REDIS_URL=redis://:${REDIS_PASSWORD}@roa-redis:6379/0
+    volumes:
+      # SSH keys for tenant tunnels (read-only)
+      - ./ssh-keys:/app/ssh-keys:ro
+      - backend-logs:/app/logs
+    networks:
+      - roa-network
+    depends_on:
+      roa-tenant-config-db:
+        condition: service_healthy
+      roa-redis:
+        condition: service_healthy
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+
+  # Redis Cache
+  roa-redis:
+    image: redis:7-alpine
+    container_name: roa-redis
+    restart: unless-stopped
+    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
+    volumes:
+      - redis-data:/data
+    networks:
+      - roa-network
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+
+  # Frontend (unchanged)
+  roa-frontend:
+    build:
+      context: ./reports-app/frontend
+      dockerfile: Dockerfile
+    image: roa2web/frontend:latest
+    container_name: roa-frontend
+    restart: unless-stopped
+    networks:
+      - roa-network
+
+  # Nginx Gateway (unchanged)
+  roa-gateway:
+    build:
+      context: ./nginx
+      dockerfile: Dockerfile
+    image: roa2web/nginx-gateway:latest
+    container_name: roa-gateway
+    restart: unless-stopped
+    ports:
+      - "80:80"
+      - "443:443"
+    networks:
+      - roa-network
+    depends_on:
+      - roa-backend
+      - roa-frontend
+
+volumes:
+  tenant-config-data:
+  redis-data:
+  backend-logs:
+
+networks:
+  roa-network:
+    driver: bridge
+```
+
+**Deployment:**
+
+```bash
+# 1. Create .env file
+cat > .env << EOF
+TENANT_DB_PASSWORD=$(openssl rand -base64 32)
+DB_ENCRYPTION_KEY=$(python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
+JWT_SECRET_KEY=$(openssl rand -base64 64)
+REDIS_PASSWORD=$(openssl rand -base64 32)
+EOF
+
+# 2. Prepare SSH keys directory
+mkdir -p ssh-keys
+chmod 700 ssh-keys
+cp /path/to/client-a.key ssh-keys/client-a.key
+chmod 400 ssh-keys/client-a.key
+
+# 3. Build and start services
+# (COMPOSE_FILE makes the later docker-compose commands target the same file)
+export COMPOSE_FILE=docker-compose.multi-tenant.yml
+docker-compose build
+docker-compose up -d
+
+# 4. Wait for tenant DB initialization (-m 1 stops at the first match)
+docker-compose logs -f roa-tenant-config-db | grep -m 1 "database system is ready"
+
+# 5. Create default tenant
+docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
+
+# 6. Verify deployment
+curl http://localhost/api/health
+# {"api": "healthy", "database": "connected", "tenants_loaded": 1}
+```
+
+**Add New Tenant:**
+
+```bash
+# Connect to tenant DB (the SQL below is typed at the psql prompt)
+docker-compose -f docker-compose.multi-tenant.yml exec roa-tenant-config-db psql -U tenant_admin -d tenant_config
+
+# Insert tenant (with encrypted password)
+INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port, is_active)
+VALUES (
+  'client-a-uuid',
+  'Client A - Retail SRL',
+  'ssh_tunnel',
+  '10.0.20.36',
+  1521,
+  'ROA',
+  'CLIENT_A_USER',
+  'gAAAAABh...', -- Fernet encrypted password
+  '83.103.197.79',
+  22122,
+  'roa2web',
+  '/app/ssh-keys/client-a.key',
+  15261,
+  TRUE
+);
+
+-- Add user to tenant
+INSERT INTO tenant_users (tenant_id, user_id, username)
+VALUES ('client-a-uuid', 123, 'john.doe');
+
+\q
+
+# Reload backend (or wait for auto-reload)
+docker-compose -f docker-compose.multi-tenant.yml restart roa-backend
+```
+
+---
+
+### Windows IIS Deployment
+
+**Prerequisites:**
+- Windows Server 2019+
+- IIS 10+
+- SQL Server Express 2019+ OR PostgreSQL 15 for Windows
+- Python 3.11+ (Windows installer)
+- Redis for Windows (MSI installer)
+
+**Setup Script:** `deployment/windows/scripts/Setup-MultiTenant.ps1`
+
+```powershell
+# Run as Administrator
+.\deployment\windows\scripts\Setup-MultiTenant.ps1
+
+<#
+This script will:
+1. Install SQL Server Express 2019
+2. Create tenant_config database
+3. Run schema SQL
+4. Generate encryption key (saved in Windows Credential Manager)
+5. Create default tenant
+6. Update ROA2WEB backend service
+7. Restart IIS
+#>
+```
+
+**Manual Setup:**
+
+```powershell
+# 1. Install SQL Server Express
+# Download from: https://www.microsoft.com/en-us/sql-server/sql-server-downloads
+# Install with default instance name: SQLEXPRESS
+
+# 2. Create tenant database
+sqlcmd -S localhost\SQLEXPRESS -E -Q "CREATE DATABASE tenant_config"
+
+# 3. Run schema
+sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E -i shared\schemas\tenant_config_schema.sql
+
+# 4. Generate encryption key
+python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" | Out-File -FilePath .encryption_key -NoNewline
+
+# 5. Store key in Windows Credential Manager
+cmdkey /generic:ROA2WEB_DB_ENCRYPTION_KEY /user:system /pass:(Get-Content .encryption_key)
+
+# 6. Update backend .env
+@"
+TENANT_DB_URL=mssql+pyodbc://localhost\SQLEXPRESS/tenant_config?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes
+DB_ENCRYPTION_KEY=$(Get-Content .encryption_key)
+"@ | Add-Content -Path C:\inetpub\wwwroot\roa2web\backend\.env
+
+# 7. Create default tenant
+cd C:\inetpub\wwwroot\roa2web
+python shared\scripts\create_default_tenant.py
+
+# 8. Restart backend service
+Restart-Service ROA2WEB-Backend
+
+# 9. Verify
+curl http://localhost:8000/health
+```
+
+**Add New Tenant (Windows):**
+
+```powershell
+# Connect to SQL Server (the SQL below is typed at the sqlcmd prompt)
+sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E
+
+-- Insert tenant
+INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, is_active)
+VALUES (
+  'client-b-uuid',
+  'Client B - Import Export SA',
+  'direct',
+  '192.168.1.50',
+  1521,
+  'ROA',
+  'CLIENT_B_USER',
+  'gAAAAABh...', -- Encrypted password
+  1
+);
+
+-- Add user to tenant
+INSERT INTO tenant_users (tenant_id, user_id, username)
+VALUES ('client-b-uuid', 456, 'jane.smith');
+
+GO
+EXIT
+
+# Restart backend
+Restart-Service ROA2WEB-Backend
+```
+
+---
+
+## 📝 Configuration Examples
+
+### Tenant Config: SSH Tunnel (Development)
+
+```json
+{
+  "id": "dev-client-uuid",
+  "name": "Development Client - Test Company",
+  "connection_type": "ssh_tunnel",
+
+  "oracle_host": "10.0.20.36",
+  "oracle_port": 1521,
+  "oracle_sid": "ROA",
+  "oracle_user": "DEV_USER",
+  "oracle_password_encrypted": "gAAAAABhXj7Ks3J...",
+
+  "ssh_host": "83.103.197.79",
+  "ssh_port": 22122,
+  "ssh_user": "roa2web",
+  "ssh_key_path": "/tmp/roa_oracle_server",
+  "ssh_tunnel_local_port": 15260,
+
+  "min_connections": 2,
+  "max_connections": 5,
+  "is_active": true
+}
+```
+
+### Tenant Config: Direct Connection (Production)
+
+```json
+{
+  "id": "prod-client-uuid",
+  "name": "Production Client - Enterprise Corp",
+  "connection_type": "direct",
+
+  "oracle_host": "192.168.100.50",
+  "oracle_port": 1521,
+  "oracle_sid": "ROA",
+  "oracle_user": "PROD_USER",
+  "oracle_password_encrypted": "gAAAAABhXj8Nm4K...",
+
+  "ssh_host": null,
+  "ssh_port": null,
+  "ssh_user": null,
+  "ssh_key_path": null,
+  "ssh_tunnel_local_port": null,
+
+  "min_connections": 5,
+  "max_connections": 20,
+  "is_active": true
+}
+```
+
+### Tenant Config: Docker Deployment (PostgreSQL Tenant DB)
+
+**.env for Docker Compose:**
+
+```bash
+# Tenant Configuration Database
+TENANT_DB_PASSWORD=SecurePostgresPassword123! +DB_ENCRYPTION_KEY=<44-character urlsafe base64 key from Fernet.generate_key()> + +# Backend +JWT_SECRET_KEY=YourVerySecureJWTSecretKeyHere123456789 + +# Redis +REDIS_PASSWORD=SecureRedisPassword456! +``` + +### User-Tenant Mapping Example + +```sql +-- User john.doe has access to 2 tenants +INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES +('client-a-uuid', 123, 'john.doe', TRUE), +('client-b-uuid', 123, 'john.doe', FALSE); + +-- User jane.smith has access to 1 tenant +INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES +('client-b-uuid', 456, 'jane.smith', FALSE); + +-- Query: Get all tenants for user +SELECT t.id, t.name, tu.is_admin +FROM tenants t +JOIN tenant_users tu ON t.id = tu.tenant_id +WHERE tu.user_id = 123 AND t.is_active = TRUE; + +-- Result: +-- | id | name | is_admin | +-- |----------------|-------------------------------|----------| +-- | client-a-uuid | Client A - Retail SRL | TRUE | +-- | client-b-uuid | Client B - Import Export SA | FALSE | +``` + +--- + +## 🎯 Success Criteria + +### Definition of Done + +**Functional:** +- ✅ The application supports at least 3 tenants simultaneously +- ✅ Tenant identification from the JWT works correctly +- ✅ SSH tunnels start/stop automatically per tenant +- ✅ Connection pools isolated per tenant (zero sharing) +- ✅ Cache isolation between tenants (namespace per tenant) +- ✅ No cross-tenant data leakage in audit logs or cache + +**Deployment:** +- ✅ Works in all deployment scenarios (dev/WSL, Docker, Windows IIS) +- ✅ Backward compatibility: Tenant "default" behaves exactly like the current single-tenant setup +- ✅ Zero downtime for existing tenants when a new tenant is added (lazy loading) +- ✅ Migration script completes in < 2h (staging environment) + +**Performance:** +- ✅ Overhead < 10% vs single-tenant (measured in load testing) +- ✅ Response time < 200ms (p95) with 3 tenants × 100 requests +- ✅ No connection pool exhaustion (max 80% usage per tenant) +- ✅ SSH 
tunnels stable (zero auto-restarts în 1h load test) + +**Security:** +- ✅ Passwords encrypted at rest în tenant DB (Fernet AES-128) +- ✅ SSH keys mounted read-only în Docker volumes +- ✅ JWT tenant_id signed (nu poate fi modificat de client) +- ✅ Tenant access validation în middleware (403 pentru unauthorized) +- ✅ Audit logging TOATE operațiile per tenant + +**Testing:** +- ✅ Unit tests: > 80% code coverage +- ✅ Integration tests: All scenarios pass (login, dashboard, cross-tenant isolation) +- ✅ Load tests: 3 tenants × 100 users, 10 minutes, < 1% error rate +- ✅ Manual testing: Tenant onboarding guide validated + +**Documentation:** +- ✅ CLAUDE.md updated cu multi-tenant architecture +- ✅ Deployment guides (dev, Docker, Windows) complete +- ✅ Tenant onboarding guide created +- ✅ Troubleshooting guide created +- ✅ API documentation updated (Swagger/ReDoc) + +--- + +## ⚠️ Risks & Mitigations + +### Risk: SSH Tunnel Instability + +**Scenario:** SSH tunnel process crashes sau network interruption între backend și SSH server. + +**Impact:** Tenant-ul afectat nu poate accesa Oracle DB (requests fail cu connection error). + +**Mitigation:** +1. **Health Checks:** Background task checks tunnel health every 60s +2. **Auto-Restart:** Restart tunnel automat cu exponential backoff (5s, 10s, 20s, max 60s) +3. **Monitoring:** Alert dacă tunnel e down > 5 minutes +4. **Fallback:** Graceful degradation - alți tenants continuă să funcționeze normal + +**Detection:** +```python +async def monitor_ssh_tunnels(): + for tenant_id in ssh_tunnel_manager.tunnels: + if not await ssh_tunnel_manager.check_tunnel_health(tenant_id): + logger.error(f"Tunnel down for tenant {tenant_id}, restarting...") + await ssh_tunnel_manager.restart_tunnel(tenant_id) +``` + +--- + +### Risk: Connection Pool Exhaustion + +**Scenario:** Tenant face burst de requests, pool ajunge la max connections (ex: 10), noi requests block sau timeout. 
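
The scenario above can be reproduced without Oracle. The following is a minimal sketch, assuming a hypothetical `BoundedPool` stand-in (a semaphore with a wait timeout) rather than the real per-tenant pool; `PoolExhaustedError`, `simulate_burst`, and the timing values are illustrative names and numbers, not part of the plan.

```python
import asyncio


class PoolExhaustedError(Exception):
    """Raised when no connection slot frees up within the wait timeout."""


class BoundedPool:
    """Hypothetical stand-in for a per-tenant pool (NOT the real Oracle pool)."""

    def __init__(self, max_connections: int, wait_timeout: float) -> None:
        self._slots = asyncio.Semaphore(max_connections)
        self._wait_timeout = wait_timeout

    async def acquire(self) -> None:
        # Wait for a free slot, but give up after wait_timeout seconds.
        try:
            await asyncio.wait_for(self._slots.acquire(), timeout=self._wait_timeout)
        except asyncio.TimeoutError:
            raise PoolExhaustedError("no free connection within timeout") from None

    def release(self) -> None:
        self._slots.release()


async def simulate_burst(requests: int, max_connections: int):
    """Fire `requests` concurrent jobs at one pool; return (served, rejected)."""
    pool = BoundedPool(max_connections=max_connections, wait_timeout=0.05)

    async def handle_request() -> bool:
        try:
            await pool.acquire()
        except PoolExhaustedError:
            return False  # the API layer would map this to a 503
        try:
            await asyncio.sleep(0.2)  # pretend to run a slow query
            return True
        finally:
            pool.release()

    results = await asyncio.gather(*(handle_request() for _ in range(requests)))
    served = sum(results)
    return served, requests - served


if __name__ == "__main__":
    print(asyncio.run(simulate_burst(requests=5, max_connections=2)))
```

With a burst of 5 requests against 2 slots, the queries outlast the wait timeout, so only the first two requests are served and the rest are rejected. This is the behaviour the pool limits and timeouts for this risk are meant to bound.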
+ +**Impact:** Slow response times or 503 Service Unavailable for the affected tenant. + +**Mitigation:** +1. **Pool Limits:** Set realistic limits per tenant (min=2, max=10 default, configurable) +2. **Queue Timeout:** `getmode=oracledb.POOL_GETMODE_TIMEDWAIT` with `wait_timeout` in milliseconds (e.g. 30000 for 30s); plain `POOL_GETMODE_WAIT` waits indefinitely +3. **Rate Limiting:** Limit requests per user/tenant (e.g. 100 req/min) +4. **Monitoring:** Alert if pool usage stays > 80% for > 5 minutes +5. **Scaling:** Increase `max_connections` for high-traffic tenants + +**Configuration:** +```python +# In the tenant config DB (SQL, run via psql/sqlcmd): +# UPDATE tenants SET max_connections = 20 WHERE id = 'high-traffic-tenant-uuid'; + +# Reload tenant +await multi_tenant_pool.reload_tenant('high-traffic-tenant-uuid') +``` + +--- + +### Risk: Tenant Credential Leak + +**Scenario:** An attacker gains access to the tenant DB or to logs and sees Oracle passwords. + +**Impact:** Data breach - the attacker can access the Oracle DB directly. + +**Mitigation:** +1. **Encryption at Rest:** Passwords encrypted with Fernet in the tenant DB +2. **Encryption Key Security:** `DB_ENCRYPTION_KEY` in environment variables (not in git) +3. **Access Control:** Tenant DB access restricted (firewall, VPN) +4. **No Plaintext Logs:** NEVER log decrypted passwords (check in code reviews) +5. **Audit Logging:** Log all access to tenant config (who/when) +6. **Key Rotation:** Support key rotation (decrypt with the old key, re-encrypt with the new key) + +**Validation:** +```bash +# Check logs for password leaks +docker-compose logs roa-backend | grep -i "password" | grep -v "encrypted" +# Should return ZERO results + +# Check tenant DB +docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, oracle_password_encrypted FROM tenants LIMIT 5;' +# oracle_password_encrypted should start with "gAAAAA..." (Fernet token) +``` + +--- + +### Risk: Cross-Tenant Data Leakage + +**Scenario:** A bug in the middleware or a router allows a user from tenant A to access data from tenant B. 
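
The scenario comes down to trusting the `tenant_id` claim without checking membership. The following is a minimal sketch of that check, assuming a hypothetical in-memory `TENANT_USERS` set standing in for the `tenant_users` table; `TokenClaims`, `TenantAccessDenied`, and `resolve_tenant` are illustrative names, and the membership data is made up for the example.

```python
from dataclasses import dataclass


class TenantAccessDenied(Exception):
    """The API layer would translate this into a 403 Forbidden response."""


# Hypothetical stand-in for rows of the tenant_users table: (tenant_id, user_id).
TENANT_USERS = {
    ("client-a-uuid", 123),
    ("client-b-uuid", 456),
}


@dataclass(frozen=True)
class TokenClaims:
    """The two JWT claims relevant to tenant isolation (already signature-verified)."""
    user_id: int
    tenant_id: str


def resolve_tenant(claims: TokenClaims, memberships=TENANT_USERS) -> str:
    """Return the tenant ID only if the user is actually a member of it."""
    if (claims.tenant_id, claims.user_id) not in memberships:
        raise TenantAccessDenied(
            f"User {claims.user_id} does not have access to tenant {claims.tenant_id}"
        )
    return claims.tenant_id


if __name__ == "__main__":
    print(resolve_tenant(TokenClaims(user_id=123, tenant_id="client-a-uuid")))
```

Keeping the check in one function that every request passes through, and passing the returned tenant ID on explicitly, avoids relying on global state for isolation.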
+ +**Impact:** CRITICAL data breach - confidențialitate compromisă. + +**Mitigation:** +1. **Mandatory Middleware:** TenantMiddleware validează tenant access pentru TOATE requests +2. **Explicit Tenant ID:** Routers MUST use `request.state.tenant_id` (no global state) +3. **Code Reviews:** TOATE modificările în routers reviewed pentru tenant isolation +4. **Integration Tests:** Test cross-tenant access blocked (403 Forbidden) +5. **Audit Logging:** Log tenant_id în TOATE audit entries pentru forensics + +**Test Scenario:** +```python +# Test: User cu tenant A încearcă să acceseze tenant B +def test_cross_tenant_access_blocked(): + # Login cu tenant A + token_a = login(user_id=123, tenant_id='client-a-uuid') + + # Modify JWT tenant_id → tenant B (attack simulation) + forged_token = jwt.encode({ + 'user_id': 123, + 'tenant_id': 'client-b-uuid', # FORGED + 'exp': datetime.utcnow() + timedelta(hours=1) + }, secret_key, algorithm='HS256') + + # Request cu forged token + response = client.get('/api/dashboard/COMP1', headers={'Authorization': f'Bearer {forged_token}'}) + + # MUST return 403 Forbidden (not 200 OK) + assert response.status_code == 403 + assert 'does not have access to tenant' in response.json()['detail'] +``` + +--- + +### Risk: Performance Degradation cu Multiple Tenants + +**Scenario:** Cu 10+ tenants, response time crește sau backend consumă prea multă memorie. + +**Impact:** Poor user experience, server overload. + +**Mitigation:** +1. **Lazy Loading:** Pool-uri create doar când tenant e accesat (economie memorie) +2. **Pool Cleanup:** Inactive pools > 1h se închid automat +3. **Resource Limits:** Set `max_connections` realistic per tenant (evită OOM) +4. **Monitoring:** Track memory usage, response time per tenant +5. **Horizontal Scaling:** Add more backend replicas (Docker Swarm, Kubernetes) +6. 
**Connection Pooling:** Reuse connections (oracle `create_pool` already does this) + +**Performance Baseline:** +``` +Single-Tenant: +- Memory: 50MB (1 pool × 2-10 connections) +- Response time: 50ms (p95) + +Multi-Tenant (3 tenants): +- Memory: 150MB (3 pools × 2-10 connections) +- Response time: 55ms (p95) +- Overhead: 10% (acceptable) + +Multi-Tenant (10 tenants): +- Memory: 500MB (10 pools × 2-10 connections) +- Response time: 65ms (p95) +- Overhead: 30% (needs optimization if > 10% target) +``` + +**Optimization:** +- Reduce `min_connections` de la 2 la 1 pentru low-traffic tenants +- Aggressive cleanup: Idle > 30 min (instead of 1h) +- Cache more aggressively (reduce Oracle queries) + +--- + +## 📚 Referințe + +### Current Implementation + +- **OraclePool:** `shared/database/oracle_pool.py` - Singleton pattern for single-tenant +- **JWT Handler:** `shared/auth/jwt_handler.py` - Token creation/validation (needs tenant_id) +- **Auth Middleware:** `shared/auth/middleware.py` - JWT verification (needs tenant validation) +- **Backend Main:** `reports-app/backend/app/main.py` - Startup logic (needs MultiTenantPoolManager) +- **SSH Tunnel Script:** `ssh_tunnel.sh` - Single tunnel script (needs per-tenant manager) + +### Inspiration & Patterns + +- **Redis Implementation Plan:** `shared/docs/REDIS_IMPLEMENTATION_PLAN.md` - Good structure for this plan +- **Docker Compose:** `docker-compose.yml` - Current deployment (needs tenant-config-db service) +- **Windows Deployment:** `deployment/windows/scripts/` - Deployment patterns for Windows +- **Python oracledb Docs:** https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html +- **Fernet Encryption:** https://cryptography.io/en/latest/fernet/ + +### Multi-Tenant Best Practices + +- **Tenant Isolation Patterns:** https://docs.microsoft.com/en-us/azure/architecture/guide/multitenant/ +- **Connection Pooling:** 
https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#connection-pooling +- **SSH Tunnel Management:** https://www.ssh.com/academy/ssh/tunneling-example +- **JWT Security:** https://jwt.io/introduction + +### Testing Resources + +- **pytest-asyncio:** https://pytest-asyncio.readthedocs.io/ +- **Locust Load Testing:** https://docs.locust.io/en/stable/ +- **Docker Compose Testing:** https://docs.docker.com/compose/ + +--- + +## 📅 Timeline Summary + +| Faza | Durată | Obiectiv | Output Verificabil | +|------|--------|----------|-------------------| +| **Faza 1** | 2-3 zile | Tenant Config DB | Tenant DB funcționează, default tenant creat | +| **Faza 2** | 3-4 zile | MultiTenantPoolManager | Pool-uri per tenant, lazy loading | +| **Faza 3** | 2-3 zile | SSH Tunnel Manager | SSH tunnels per tenant, auto-restart | +| **Faza 4** | 2-3 zile | JWT & Middleware | JWT cu tenant_id, tenant validation | +| **Faza 5** | 1-2 zile | Cache & Audit | Redis cache per tenant, audit logs | +| **Faza 6** | 3-4 zile | Deployment & Testing | Deploy în toate env-urile, tests pass | +| **TOTAL** | **14-20 zile** | **Multi-Tenant Production-Ready** | All success criteria met | + +--- + +## 🚀 Next Steps + +1. **Review acest plan** cu team/stakeholders +2. **Prioritizează fazele** (poate Faza 1+2 first, restul după) +3. **Setup development environment** pentru testing +4. **Creează branch:** `feature/multi-tenant-architecture` +5. **Start Faza 1:** Tenant Configuration Database +6. 
**Iterate:** Test după fiecare fază, adjust plan dacă e nevoie + +--- + +**Document Version:** 1.0 +**Last Updated:** 2025-10-25 +**Author:** Claude Code (Anthropic) +**Status:** Ready for Implementation + diff --git a/shared/docs/REDIS_IMPLEMENTATION_PLAN.md b/shared/docs/REDIS_IMPLEMENTATION_PLAN.md new file mode 100644 index 0000000..c0c6c6e --- /dev/null +++ b/shared/docs/REDIS_IMPLEMENTATION_PLAN.md @@ -0,0 +1,910 @@ +# Plan Implementare Redis Caching - ROA2WEB + +**Data Creare:** 2025-01-25 +**Status:** DRAFT - Ready for Implementation +**Durata Estimată:** 2-3 ore + +--- + +## 📋 Sumar Executiv + +- **Obiectiv:** Implementare Redis caching layer pentru reducerea query-urilor repetitive la Oracle DB +- **Target:** 60-80% reducere în numărul de query-uri pentru date frecvent accesate +- **Strategie:** Caching simplu la nivel de Service layer cu invalidation manuală +- **Backward Compatibility:** ✅ Aplicația funcționează fără Redis (graceful degradation) +- **Multi-Tenant Ready:** ✅ Cache keys pregătite pentru viitoare multi-tenancy + +**Infrastructure Status:** +- ✅ Redis container configurat în `docker-compose.yml` (lines 147-163) +- ✅ Backend `depends_on: roa-redis` +- ❌ Redis client library LIPSĂ +- ❌ Cod de caching LIPSĂ + +--- + +## 🗂️ Structura Fișierelor + +### Fișiere Noi (6 fișiere) + +- [ ] `shared/cache/__init__.py` - Package initialization +- [ ] `shared/cache/redis_client.py` - Redis connection client (90 lines) +- [ ] `shared/cache/decorators.py` - Cache decorators (60 lines) +- [ ] `shared/cache/utils.py` - Cache key generators, helpers (40 lines) +- [ ] `shared/tests/test_redis_client.py` - Unit tests pentru Redis client (120 lines) +- [ ] `shared/tests/test_cache_decorators.py` - Unit tests pentru decorators (80 lines) + +### Fișiere Modificate (7 fișiere) + +- [ ] `reports-app/backend/requirements.txt` - Adaugă redis>=5.0.0 +- [ ] `reports-app/backend/.env.example` - Adaugă REDIS_* env vars +- [ ] `reports-app/backend/app/main.py` - Initialize 
Redis at startup +- [ ] `reports-app/backend/app/services/dashboard_service.py` - Apply caching +- [ ] `reports-app/backend/app/services/invoice_service.py` - Apply caching +- [ ] `reports-app/backend/app/routers/dashboard.py` - Cache invalidation on mutations +- [ ] `reports-app/backend/app/routers/invoices.py` - Cache invalidation on mutations + +--- + +## 🚀 Faze de Implementare + +### FAZA 1: Setup Redis Client și Connection (30 min) + +**Obiectiv:** Creează Redis client singleton cu connection pooling, similar cu `OraclePool` pattern. + +**Tasks:** + +1. [ ] **Adaugă dependency în requirements.txt** + - Fișier: `reports-app/backend/requirements.txt` + - Acțiune: Adaugă linia `redis>=5.0.0` după `httpx>=0.27.0` + - Motivație: redis 5.0+ are async support nativ + +2. [ ] **Creează package cache în shared/** + - Fișiere: `shared/cache/__init__.py` + - Acțiune: Creează directorul și fișierul de init cu exports: + ```python + from .redis_client import redis_cache + from .decorators import cached, invalidate_cache + + __all__ = ['redis_cache', 'cached', 'invalidate_cache'] + ``` + +3. [ ] **Implementează RedisCache singleton client** + - Fișier: `shared/cache/redis_client.py` + - Acțiune: Creează clasa `RedisCache` similar cu `OraclePool`: + - Singleton pattern (nu multi-instance) + - Async redis client cu connection pooling + - Methods: `initialize()`, `get()`, `set()`, `delete()`, `delete_pattern()`, `close()` + - Graceful degradation: dacă Redis e down, log warning și return None + - Connection retry cu exponential backoff + - Template (vezi secțiunea Code Templates mai jos) + +4. [ ] **Adaugă env variables în .env.example** + - Fișier: `reports-app/backend/.env.example` + - Acțiune: Adaugă la sfârșitul fișierului: + ```bash + # Redis Configuration + REDIS_URL=redis://roa-redis:6379/0 + REDIS_PASSWORD=roa2web_redis_password + REDIS_ENABLED=true + CACHE_DEFAULT_TTL=300 + ``` + +5. 
[ ] **Initialize Redis la startup în main.py** + - Fișier: `reports-app/backend/app/main.py` + - Acțiune: În funcția `startup_event()`: + ```python + from shared.cache import redis_cache + + @app.on_event("startup") + async def startup_event(): + # ... existing oracle pool init ... + + # Initialize Redis cache + if os.getenv('REDIS_ENABLED', 'false').lower() == 'true': + await redis_cache.initialize( + url=os.getenv('REDIS_URL'), + password=os.getenv('REDIS_PASSWORD') + ) + + @app.on_event("shutdown") + async def shutdown_event(): + # ... existing oracle pool close ... + await redis_cache.close() + ``` + +**Output Verificabil:** + +- [ ] `pip install -r requirements.txt` rulează fără erori +- [ ] Redis client se conectează cu succes la container +- [ ] Test manual: `python -c "import asyncio; from shared.cache import redis_cache; asyncio.run(redis_cache.initialize())"` +- [ ] Log message: "✅ Redis cache initialized successfully" + +--- + +### FAZA 2: Cache Decorator și Helpers (30 min) + +**Obiectiv:** Creează decorator `@cached()` pentru aplicare ușoară în Services. + +**Tasks:** + +1. [ ] **Implementează cache key generator** + - Fișier: `shared/cache/utils.py` + - Acțiune: Funcții helper pentru key generation: + ```python + import hashlib + import json + from typing import Any, Dict + + def make_cache_key(tenant_id: str, resource: str, **params) -> str: + """ + Generate tenant-aware cache key + Format: cache:{tenant_id}:{resource}:{params_hash} + """ + params_str = json.dumps(params, sort_keys=True) + params_hash = hashlib.md5(params_str.encode()).hexdigest()[:12] + return f"cache:{tenant_id}:{resource}:{params_hash}" + + def extract_tenant_id(kwargs: Dict[str, Any]) -> str: + """ + Extract tenant_id from function kwargs + For now returns 'default', later extract from JWT token + """ + # TODO: Extract from request.state.tenant_id when multi-tenant implemented + return kwargs.get('tenant_id', 'default') + ``` + +2. 
[ ] **Implement the @cached decorator** + - File: `shared/cache/decorators.py` + - Action: Decorator for automatic caching of async functions: + ```python + import inspect + import logging + from functools import wraps + from typing import Callable, Optional + from .redis_client import redis_cache + from .utils import make_cache_key, extract_tenant_id + + logger = logging.getLogger(__name__) + + def cached(resource: str, ttl: int = 300): + """ + Cache decorator for async functions + + Usage: + @cached(resource="dashboard_summary", ttl=300) + async def get_dashboard_summary(company: str, username: str): + # ... query Oracle ... + return data + + Args: + resource: Resource name (e.g., 'dashboard_summary', 'invoices_list') + ttl: Time-to-live in seconds (default: 5 min) + """ + def decorator(func: Callable): + signature = inspect.signature(func) + + @wraps(func) + async def wrapper(*args, **kwargs): + # Skip the cache if Redis is disabled + if not redis_cache.is_enabled(): + return await func(*args, **kwargs) + + # Bind positional and keyword arguments by name, so that + # positional calls do not all collapse onto the same cache key + bound = signature.bind(*args, **kwargs) + bound.apply_defaults() + call_params = dict(bound.arguments) + + # Extract tenant_id and params for the cache key + tenant_id = extract_tenant_id(call_params) + cache_params = {k: v for k, v in call_params.items() + if k not in ('username', 'current_user', 'tenant_id', 'self', 'cls')} + + cache_key = make_cache_key(tenant_id, resource, **cache_params) + + # Try cache GET + cached_value = await redis_cache.get(cache_key) + if cached_value is not None: + logger.debug(f"Cache HIT: {cache_key}") + return cached_value + + # Cache MISS - execute function + logger.debug(f"Cache MISS: {cache_key}") + result = await func(*args, **kwargs) + + # Save to cache + await redis_cache.set(cache_key, result, ttl=ttl) + + return result + + return wrapper + return decorator + ``` + +3. 
[ ] **Implementează invalidate_cache helper** + - Fișier: `shared/cache/decorators.py` (same file) + - Acțiune: Helper function pentru manual invalidation: + ```python + async def invalidate_cache( + tenant_id: str = "default", + resource: Optional[str] = None + ): + """ + Invalidate cache entries + + Examples: + await invalidate_cache(resource="dashboard_summary") # clear specific resource + await invalidate_cache() # clear all for default tenant + """ + if not redis_cache.is_enabled(): + return + + if resource: + pattern = f"cache:{tenant_id}:{resource}:*" + else: + pattern = f"cache:{tenant_id}:*" + + await redis_cache.delete_pattern(pattern) + logger.info(f"Cache invalidated: {pattern}") + ``` + +**Output Verificabil:** + +- [ ] Decorator funcționează fără erori +- [ ] Cache key format: `cache:default:dashboard_summary:abc123` +- [ ] Test unit: `pytest shared/tests/test_cache_decorators.py -v` + +--- + +### FAZA 3: Aplicare în Endpoint-uri (45 min) + +**Obiectiv:** Aplică caching în Service layer pentru dashboard și invoices. + +**Tasks:** + +1. [ ] **Apply @cached în DashboardService.get_complete_summary()** + - Fișier: `reports-app/backend/app/services/dashboard_service.py` + - Acțiune: Adaugă decorator la metoda `get_complete_summary`: + ```python + from shared.cache import cached + + class DashboardService: + @staticmethod + @cached(resource="dashboard_summary", ttl=300) # 5 min + async def get_complete_summary(company: str, username: str): + # ... existing implementation ... + ``` + - Motivație: Dashboard e accesat des, datele se schimbă rar + +2. [ ] **Apply @cached în DashboardService.get_trends()** + - Fișier: `reports-app/backend/app/services/dashboard_service.py` + - Acțiune: Similar, TTL=180 (3 min pentru trends) + - Cache key va include period: `cache:default:dashboard_trends:company-X:period-30d:abc123` + +3. 
[ ] **Apply @cached în DashboardService.get_detailed_data()** + - Fișier: `reports-app/backend/app/services/dashboard_service.py` + - Acțiune: TTL=60 (1 min pentru tabel detalii - se refreshează des) + - Cache key include page, page_size, search + +4. [ ] **Apply @cached în InvoiceService.get_invoices()** + - Fișier: `reports-app/backend/app/services/invoice_service.py` + - Acțiune: TTL=60 (1 min) + - Cache key include filter params (partner_type, date_from, date_to, etc.) + +5. [ ] **Apply @cached în InvoiceService.get_invoice_summary()** + - Fișier: `reports-app/backend/app/services/invoice_service.py` + - Acțiune: TTL=180 (3 min pentru summary) + +6. [ ] **Cache invalidation în dashboard mutations (viitor)** + - Fișier: `reports-app/backend/app/routers/dashboard.py` + - Acțiune: Pregătește cod pentru invalidation (de activat când există POST/PUT/DELETE): + ```python + # TODO: Activate când implementăm mutations + # from shared.cache import invalidate_cache + # + # @router.post("/...") + # async def update_dashboard_data(...): + # # ... update logic ... + # await invalidate_cache(resource="dashboard_summary") + # await invalidate_cache(resource="dashboard_trends") + ``` + +7. [ ] **Cache invalidation în invoice mutations** + - Fișier: `reports-app/backend/app/routers/invoices.py` + - Acțiune: Când se implementează POST/PUT/DELETE pentru invoices, invalidează: + - `invoices_list` + - `invoices_summary` + - `dashboard_summary` (afectează dashboard) + +**Output Verificabil:** + +- [ ] Backend pornește fără erori +- [ ] First request: Cache MISS + Oracle query (măsoară timp: ~500-1000ms) +- [ ] Second request (same params): Cache HIT (măsoară timp: ~10-20ms) +- [ ] Cache hit rate > 80% după 100 requests repetitive +- [ ] Logs arată `Cache HIT/MISS` messages + +--- + +### FAZA 4: Testing, Monitoring și Cleanup (45 min) + +**Obiectiv:** Validare funcționare corectă, performance benchmarks, și documentare. + +**Tasks:** + +1. 
[ ] **Unit tests pentru RedisCache client** + - Fișier: `shared/tests/test_redis_client.py` + - Acțiune: Testează: + - Connection success/failure + - Get/Set/Delete operations + - Pattern matching delete + - Graceful degradation când Redis e down + - Run: `pytest shared/tests/test_redis_client.py -v` + +2. [ ] **Unit tests pentru cache decorators** + - Fișier: `shared/tests/test_cache_decorators.py` + - Acțiune: Testează: + - Decorator aplică caching corect + - Cache key generation + - TTL respectat + - Invalidation funcționează + - Run: `pytest shared/tests/test_cache_decorators.py -v` + +3. [ ] **Integration test în Docker** + - Acțiune: Pornește stack complet cu `docker-compose up` + - Verifică: + - Backend se conectează la Redis + - Cache funcționează end-to-end + - Logs arată cache hits/misses + +4. [ ] **Performance benchmark** + - Tool: Apache Bench sau Python requests loop + - Test case: 100 requests la `/api/dashboard/summary?company=X` + - Măsoară: + - **Without cache** (REDIS_ENABLED=false): + - Avg response time: ~800ms + - Total time: ~80 seconds + - **With cache** (REDIS_ENABLED=true): + - First request: ~800ms (MISS) + - Next 99 requests: ~15ms (HIT) + - Total time: ~2 seconds + - **Improvement: 97.5%** + - Salvează results în `shared/docs/REDIS_PERFORMANCE_BENCHMARK.md` + +5. [ ] **Manual testing checklist** + - [ ] Dashboard: Refresh multiple ori (verify cache HIT în logs) + - [ ] Invoices: Filtrare diferită (verify cache keys unice) + - [ ] Redis failure test: Stop Redis container, verify app funcționează (fallback la Oracle) + - [ ] Cache invalidation: Manual invalidate via Redis CLI, verify re-query + +6. 
[ ] **Update CLAUDE.md documentation** + - Fișier: `CLAUDE.md` + - Acțiune: Adaugă secțiune "Redis Caching": + ```markdown + ## 💾 Redis Caching + + ROA2WEB folosește Redis pentru caching layer: + + - **Client**: `shared/cache/redis_client.py` (singleton pattern) + - **Decorator**: `@cached(resource="name", ttl=300)` în Services + - **Cache Keys**: `cache:{tenant_id}:{resource}:{params_hash}` + - **TTL Defaults**: + - Dashboard summary: 5 min + - Dashboard trends: 3 min + - Invoices list: 1 min + - Invoices summary: 3 min + + **Toggle cache:** Set `REDIS_ENABLED=false` în `.env` + + **Invalidate manual:** + ```python + from shared.cache import invalidate_cache + await invalidate_cache(resource="dashboard_summary") + ``` + + **Performance:** 60-80% reduction în query time pentru repetitive requests + ``` + +**Output Verificabil:** + +- [ ] All tests pass: `pytest shared/tests/ -v` +- [ ] Performance benchmark shows >60% improvement +- [ ] Manual testing checklist complet +- [ ] Documentation updated +- [ ] Ready for code review + +--- + +## 📊 Cache Strategy + +### Resource TTLs + +| Resource | TTL | Motivație | +|----------|-----|-----------| +| `dashboard_summary` | 300s (5 min) | Date agregate, se schimbă rar | +| `dashboard_trends` | 180s (3 min) | Trends calculation expensive | +| `dashboard_detailed_data` | 60s (1 min) | Tabel interactiv, refresh frecvent | +| `dashboard_performance` | 180s (3 min) | Performance metrics stabile | +| `dashboard_cashflow` | 180s (3 min) | Forecast calculation expensive | +| `dashboard_maturity` | 180s (3 min) | Maturity analysis complex | +| `invoices_list` | 60s (1 min) | Listing cu filtre, refresh frecvent | +| `invoices_summary` | 180s (3 min) | Summary stats stabile | +| `companies_list` | 600s (10 min) | Lista rareori se schimbă | +| `treasury_data` | 120s (2 min) | Trezorerie moderate changes | + +**Raționament TTL:** +- Scurt (60s): Date interactive, tabel listings +- Mediu (180-300s): Calculații expensive, agregări +- Lung 
(600s+): Nearly static data (companies, permissions) + +### Cache Keys Pattern + +**Format:** `cache:{tenant_id}:{resource}:{params_hash}` + +**Concrete examples (hash values are illustrative):** +``` +cache:default:dashboard_summary:abc456def789 +cache:default:invoices_list:0f1e2d3c4b5a +cache:default:dashboard_trends:def123abc456 +``` + +**Components:** +- `cache:` - Constant prefix (separates these keys from other Redis keys) +- `{tenant_id}` - Tenant ID (for now "default"; later taken from the JWT token) +- `{resource}` - Resource name (dashboard_summary, invoices_list, etc.) +- `{params_hash}` - MD5 hash (first 12 characters) of the parameters serialized as sorted JSON; company, period, and filter values are folded into this hash rather than appearing as separate key segments + +**Multi-Tenant Ready:** +Once multi-tenancy is implemented: +1. Change `extract_tenant_id()` in `utils.py` to read from `request.state.tenant_id` +2. The JWT token will include a `tenant_id` field +3. Cache keys will automatically be per-tenant +4. Per-tenant invalidation: `await invalidate_cache(tenant_id="client-a")` + +### Invalidation Rules + +**Trigger:** When data changes in the Oracle DB + +| Mutation | Invalidate Resources | +|----------|---------------------| +| Invoice created/updated | `invoices_list`, `invoices_summary`, `dashboard_summary`, `dashboard_trends` | +| Payment recorded | `invoices_list`, `dashboard_summary`, `treasury_data`, `dashboard_cashflow` | +| Treasury transaction | `treasury_data`, `dashboard_summary`, `dashboard_cashflow` | +| Company settings changed | `companies_list`, `dashboard_*` (for that company) | + +**Implementation:** +```python +# In the router, after the mutation +from shared.cache import invalidate_cache + +@router.post("/invoices/{invoice_id}/pay") +async def mark_invoice_paid(...): + # ... update DB ... 
+ + # Invalidate affected caches + await invalidate_cache(resource="invoices_list") + await invalidate_cache(resource="invoices_summary") + await invalidate_cache(resource="dashboard_summary") + await invalidate_cache(resource="treasury_data") + + return {"status": "ok"} +``` + +**Pattern Matching:** +```python +# Invalidate all dashboard caches (the glob must be explicit: +# resource="dashboard" would only match cache:default:dashboard:*) +await invalidate_cache(resource="dashboard_*") # matches cache:default:dashboard_*:* + +# Invalidate everything for one tenant +await invalidate_cache(tenant_id="client-a") # matches cache:client-a:* +``` + +--- + +## 🧪 Testing Plan + +### Unit Tests + +**File:** `shared/tests/test_redis_client.py` + +- [ ] `test_redis_connection_success()` - Verify successful connection +- [ ] `test_redis_connection_failure_graceful()` - Redis down, no exception thrown +- [ ] `test_redis_get_set_delete()` - Basic operations +- [ ] `test_redis_delete_pattern()` - Pattern matching deletion +- [ ] `test_redis_ttl_expiration()` - Verify TTL works +- [ ] `test_redis_connection_retry()` - Exponential backoff retry + +**File:** `shared/tests/test_cache_decorators.py` + +- [ ] `test_cached_decorator_hit()` - Second call returns cached value +- [ ] `test_cached_decorator_miss()` - First call queries function +- [ ] `test_cache_key_generation()` - Keys format correct +- [ ] `test_cache_key_unique_params()` - Different params = different keys +- [ ] `test_invalidate_cache_pattern()` - Invalidation works +- [ ] `test_cached_disabled()` - Works when REDIS_ENABLED=false + +### Integration Tests + +**Setup:** `docker-compose up -d` + +**Test Scenarios:** + +1. **Full Stack Test:** + - Start backend + Redis + - Call `/api/dashboard/summary?company=123` + - Verify: Oracle query executed (check logs) + - Call again same endpoint + - Verify: Cache hit (no Oracle query) + +2. 
**Cache Invalidation Test:** + - Call endpoint (cache populated) + - Invalidate: list keys with `redis-cli --scan --pattern "cache:*"` and pass them to `DEL` + - Call endpoint again + - Verify: Oracle query executed (cache miss) + +3. **Redis Failure Test:** + - `docker-compose stop roa-redis` + - Call endpoint + - Verify: App works (fallback to Oracle) + - No error thrown + - Logs show warning: "Redis unavailable, fallback to DB" + +### Performance Benchmarks + +**Tool:** Apache Bench or Python script + +**Baseline (No Cache):** +```bash +# Stop Redis or set REDIS_ENABLED=false +ab -n 100 -c 10 "http://localhost:8001/api/dashboard/summary?company=123" +# Expected: ~800ms avg response time (about 80s total if the requests run sequentially) +``` + +**With Cache:** +```bash +# Start Redis and set REDIS_ENABLED=true +ab -n 100 -c 10 "http://localhost:8001/api/dashboard/summary?company=123" +# Expected: ~15ms avg (after first request), ~2s total if run sequentially +``` + +**Target Metrics:** +- Cache hit rate: >90% (after warmup) +- Avg response time reduction: >60% +- Total time reduction: >75% +- Memory usage: +50-200MB (Redis) + +**Save Results:** `shared/docs/REDIS_PERFORMANCE_BENCHMARK.md` + +### Manual Testing Checklist + +- [ ] **Dashboard Summary:** + - [ ] First load → check logs for "Cache MISS" + - [ ] Refresh page → check logs for "Cache HIT" + - [ ] Change company → new cache key, "Cache MISS" + +- [ ] **Invoices List:** + - [ ] Filter: all invoices → "Cache MISS" first time + - [ ] Refresh → "Cache HIT" + - [ ] Filter: unpaid only → new key, "Cache MISS" + - [ ] Refresh → "Cache HIT" + +- [ ] **Cache Invalidation:** + - [ ] Load dashboard (cached) + - [ ] Redis CLI: `redis-cli --scan --pattern "cache:*"` → see keys + - [ ] Delete (`DEL` does not expand globs): `redis-cli --scan --pattern "cache:default:dashboard_summary:*" | xargs redis-cli DEL` + - [ ] Refresh dashboard → "Cache MISS" (re-queries Oracle) + +- [ ] **Redis Failure Graceful:** + - [ ] Stop Redis: `docker-compose stop roa-redis` + - [ ] Access dashboard → works (no crash) + - [ ] Check logs: "Redis unavailable, using direct DB query" + - [ ] Start Redis: 
`docker-compose start roa-redis` + - [ ] Access dashboard → caching resume + +- [ ] **Multi-Tenant Simulation:** + - [ ] Load dashboard company=123 (tenant=default) + - [ ] Load dashboard company=456 (tenant=default) + - [ ] Verify different cache keys in Redis + +--- + +## 🔧 Configurare Env Variables + +**File:** `reports-app/backend/.env` + +```bash +# ============================================================================ +# REDIS CONFIGURATION +# ============================================================================ + +# Redis Connection URL +# Development: redis://roa-redis:6379/0 (Docker network) +# Production: redis://redis-host:6379/0 or redis://localhost:6379/0 +REDIS_URL=redis://roa-redis:6379/0 + +# Redis Password (from docker-compose secrets) +# Match with REDIS_PASSWORD in docker-compose.yml +REDIS_PASSWORD=roa2web_redis_password + +# Enable/Disable Redis Caching +# Set to 'false' to disable caching (fallback to direct DB queries) +REDIS_ENABLED=true + +# Default Cache TTL (seconds) +# Used when no specific TTL provided to @cached decorator +CACHE_DEFAULT_TTL=300 + +# Redis Connection Pool Settings (optional, defaults shown) +REDIS_MAX_CONNECTIONS=50 +REDIS_SOCKET_CONNECT_TIMEOUT=5 +REDIS_SOCKET_KEEPALIVE=true +``` + +**Docker Compose Integration:** + +No changes needed! Redis container already configured in `docker-compose.yml:147-163`. 
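
On the application side, the variables listed in the `.env` file could be read into a single typed object at startup. The following is only a minimal sketch: `RedisSettings`, `load_redis_settings` and `_as_bool` are hypothetical names that do not appear in the plan, and the fallback values used when a variable is unset (including treating caching as enabled) are assumptions taken from the defaults shown in the `.env` example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RedisSettings:
    """Hypothetical container for the Redis variables documented in .env."""
    url: str
    password: Optional[str]
    enabled: bool
    default_ttl: int
    max_connections: int
    socket_connect_timeout: int
    socket_keepalive: bool


def _as_bool(raw: str) -> bool:
    # Common truthy spellings map to True; anything else is False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_redis_settings(env: Mapping[str, str] = os.environ) -> RedisSettings:
    """Build settings from a mapping, using the documented values as fallbacks (assumed)."""
    return RedisSettings(
        url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        password=env.get("REDIS_PASSWORD") or None,  # empty string counts as "no password"
        enabled=_as_bool(env.get("REDIS_ENABLED", "true")),
        default_ttl=int(env.get("CACHE_DEFAULT_TTL", "300")),
        max_connections=int(env.get("REDIS_MAX_CONNECTIONS", "50")),
        socket_connect_timeout=int(env.get("REDIS_SOCKET_CONNECT_TIMEOUT", "5")),
        socket_keepalive=_as_bool(env.get("REDIS_SOCKET_KEEPALIVE", "true")),
    )
```

Accepting a mapping instead of reading `os.environ` directly keeps the parsing testable without touching the real process environment, for example `load_redis_settings({"REDIS_ENABLED": "false"})`.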
+
+**Verify:**
+```bash
+docker-compose exec roa-backend env | grep REDIS
+# Should show REDIS_URL, REDIS_PASSWORD, REDIS_ENABLED
+```
+
+---
+
+## 📝 Final Checklist
+
+### Pre-Implementation
+
+- [ ] Full plan read and understood
+- [ ] Backup codebase: `git commit -am "Backup before Redis implementation"`
+- [ ] Redis container is running: `docker-compose up -d roa-redis`
+- [ ] Test connection: `docker-compose exec roa-redis redis-cli ping` → PONG
+
+### Phase 1 (Setup)
+
+- [ ] Dependency added: `redis>=5.0.0` in requirements.txt
+- [ ] Package created: `shared/cache/__init__.py`
+- [ ] Redis client: `shared/cache/redis_client.py`
+- [ ] Env vars added: `reports-app/backend/.env.example`
+- [ ] Main.py updated: Redis initialized at startup
+- [ ] Test: `python -c "import asyncio; from shared.cache import redis_cache; asyncio.run(redis_cache.initialize())"`
+
+### Phase 2 (Decorators)
+
+- [ ] Utils created: `shared/cache/utils.py`
+- [ ] Decorator created: `shared/cache/decorators.py`
+- [ ] Unit tests: `shared/tests/test_cache_decorators.py`
+- [ ] Test: `pytest shared/tests/test_cache_decorators.py -v`
+
+### Phase 3 (Integration)
+
+- [ ] Cached applied: DashboardService.get_complete_summary
+- [ ] Cached applied: DashboardService.get_trends
+- [ ] Cached applied: DashboardService.get_detailed_data
+- [ ] Cached applied: InvoiceService.get_invoices
+- [ ] Cached applied: InvoiceService.get_invoice_summary
+- [ ] Backend starts: `uvicorn app.main:app --reload`
+- [ ] Test: First request slow, second fast
+
+### Phase 4 (Validation)
+
+- [ ] Unit tests pass: `pytest shared/tests/ -v`
+- [ ] Integration tests pass (Docker stack)
+- [ ] Performance benchmark run (save results)
+- [ ] Manual testing checklist completed
+- [ ] Documentation updated: `CLAUDE.md`
+- [ ] Git commit: `git add . && git commit -m "feat: implement Redis caching layer"`
+
+### Ready for Production
+
+- [ ] All tests green
+- [ ] Performance improvement >60%
+- [ ] Graceful degradation tested (Redis failure)
+- [ ] Code review requested
+- [ ] Merge to main branch
+
+---
+
+## 📚 References
+
+### Existing Documentation
+
+- **Docker Compose Redis Config:** `docker-compose.yml:147-163`
+- **Oracle Pool Pattern:** `shared/database/oracle_pool.py` (reference for singleton pattern)
+- **Backend Services:** `reports-app/backend/app/services/` (where to apply caching)
+- **Backend Routers:** `reports-app/backend/app/routers/` (where to invalidate cache)
+
+### External Documentation
+
+- Redis Python Client: https://redis.readthedocs.io/en/stable/
+- Redis Commands: https://redis.io/commands/
+- FastAPI Async: https://fastapi.tiangolo.com/async/
+
+### Debugging
+
+**Redis CLI Access:**
+```bash
+docker-compose exec roa-redis redis-cli -a roa2web_redis_password
+> KEYS cache:*
+> GET cache:default:dashboard_summary:abc123
+> DEL cache:default:dashboard_summary:abc123
+> FLUSHDB  # Delete all keys in the current database (WARNING: destructive)
+```
+
+**Monitor Redis Operations:**
+```bash
+docker-compose exec roa-redis redis-cli -a roa2web_redis_password MONITOR
+```
+
+**Check Cache Stats:**
+```bash
+docker-compose exec roa-redis redis-cli -a roa2web_redis_password INFO stats
+```
+
+---
+
+## 🎯 Code Templates
+
+### Template: RedisCache Client (`shared/cache/redis_client.py`)
+
+```python
+"""
+Redis Cache Client - Singleton pattern similar to OraclePool
+Provides async Redis operations with graceful degradation
+"""
+import json
+import logging
+import os
+from typing import Any, Optional
+
+import redis.asyncio as redis
+
+logger = logging.getLogger(__name__)
+
+
+class RedisCache:
+    """Singleton Redis cache client with connection pooling"""
+
+    _instance: Optional['RedisCache'] = None
+    _client: Optional[redis.Redis] = None
+    _enabled: bool = False
+
+    def __new__(cls):
+        # Always hand back the same instance (singleton)
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+        return cls._instance
+
+    async def initialize(
+        self,
+        url: Optional[str] = None,
+        password: Optional[str] = None,
+        max_connections: int = 50
+    ):
+        """Initialize Redis connection pool"""
+        if self._client is not None:
+            return
+
+        try:
+            url = url or os.getenv('REDIS_URL', 'redis://localhost:6379/0')
+            password = password or os.getenv('REDIS_PASSWORD')
+
+            self._client = await redis.from_url(
+                url,
+                password=password,
+                encoding="utf-8",
+                decode_responses=True,
+                max_connections=max_connections,
+                socket_connect_timeout=5,
+                socket_keepalive=True
+            )
+
+            # Test connection
+            await self._client.ping()
+
+            self._enabled = True
+            logger.info("✅ Redis cache initialized successfully")
+
+        except Exception as e:
+            # Graceful degradation: the app keeps running without a cache
+            logger.warning(f"⚠️ Redis initialization failed: {e}. Caching disabled.")
+            self._enabled = False
+            self._client = None
+
+    def is_enabled(self) -> bool:
+        """Check if Redis caching is enabled"""
+        return self._enabled and self._client is not None
+
+    async def get(self, key: str) -> Optional[Any]:
+        """Get value from cache (None on miss or on any Redis error)"""
+        if not self.is_enabled():
+            return None
+
+        try:
+            value = await self._client.get(key)
+            if value is not None:
+                return json.loads(value)
+            return None
+        except Exception as e:
+            logger.error(f"Redis GET error for key {key}: {e}")
+            return None
+
+    async def set(self, key: str, value: Any, ttl: int = 300):
+        """Set value in cache with TTL (seconds)"""
+        if not self.is_enabled():
+            return
+
+        try:
+            # default=str converts non-JSON types (e.g. datetime) to strings
+            value_json = json.dumps(value, default=str)
+            await self._client.setex(key, ttl, value_json)
+        except Exception as e:
+            logger.error(f"Redis SET error for key {key}: {e}")
+
+    async def delete(self, key: str):
+        """Delete single key"""
+        if not self.is_enabled():
+            return
+
+        try:
+            await self._client.delete(key)
+        except Exception as e:
+            logger.error(f"Redis DELETE error for key {key}: {e}")
+
+    async def delete_pattern(self, pattern: str):
+        """Delete all keys matching pattern (e.g., 'cache:default:dashboard*')"""
+        if not self.is_enabled():
+            return
+
+        try:
+            # SCAN iterates incrementally instead of blocking Redis like KEYS
+            async for key in self._client.scan_iter(match=pattern):
+                await self._client.delete(key)
+            logger.debug(f"Deleted keys matching pattern: {pattern}")
+        except Exception as e:
+            logger.error(f"Redis DELETE_PATTERN error for {pattern}: {e}")
+
+    async def close(self):
+        """Close Redis connection"""
+        if self._client:
+            await self._client.close()
+            self._client = None
+            self._enabled = False
+            logger.info("✅ Redis connection closed")
+
+
+# Global singleton instance
+redis_cache = RedisCache()
+```
+
+---
+
+## ⚠️ Known Limitations & Future Work
+
+### Current Limitations
+
+1. **No Cache Warming:** Cache is cold on startup (first requests are slow)
+   - Future: Implement background task to pre-populate hot keys
+
+2. **Manual Invalidation:** Invalidation must be coded manually in routers
+   - Future: Auto-invalidation via database triggers or event system
+
+3. **Single Tenant:** All cache keys use `tenant_id="default"`
+   - Future: Extract tenant_id from the JWT token once multi-tenant is implemented
+
+4. **No Cache Monitoring:** No dashboard/metrics for cache performance
+   - Future: Integrate Prometheus metrics (hit/miss rate, latency, memory)
+
+5. **Simple Serialization:** Uses JSON (no support for binary data; datetime values are stored as strings)
+   - Future: Consider msgpack for faster serialization
+
+### Future Enhancements
+
+- [ ] **Cache Warming:** Background task to pre-load hot keys at startup
+- [ ] **Smart Invalidation:** Event-driven invalidation based on DB changes
+- [ ] **Cache Monitoring Dashboard:** Redis metrics + hit/miss rates
+- [ ] **Cache Compression:** Compress large values (>10KB) before storing
+- [ ] **Multi-Level Cache:** L1 (in-memory LRU) + L2 (Redis) for ultra-fast access
+- [ ] **Cache Tagging:** Tag-based invalidation instead of pattern matching
+
+---
+
+## 📞 Support & Questions
+
+**If you run into problems:**
+
+1. **Redis does not start:** Check `docker-compose logs roa-redis`
+2. **Connection failed:** Verify REDIS_URL and REDIS_PASSWORD in .env
+3. **Cache is not working:** Verify REDIS_ENABLED=true and check the logs for errors
+4. **Performance does not improve:** Check the cache hit rate in the logs
+
+**Contact:** Claude Code Implementation Team
+
+---
+
+**The plan is ready for implementation! Start with PHASE 1 and follow the steps exactly as described.**