- Delete data-entry-app/ (1.6GB), reports-app/ (447MB), .auto-build-data/
- Saved ~1.4GB disk space (64% reduction: 2.2GB → 845MB)
Updated references across 38 files:
- .claude/rules/ paths: backend/modules/, src/modules/
- .claude/commands/validate.md: all validation paths
- docs/ (13 files): data-entry, telegram, README, CLAUDE.md
- scripts/ (3 files): backup-secrets, restore-secrets, test-docker
- security/ (2 files): git_cleanup, SECURITY_PROCEDURES
- deployment/ & shared/: updated all stale comments
All paths now reflect ultrathin monolith architecture:
- Backend: backend/modules/{reports,data_entry,telegram}/
- Frontend: src/modules/{reports,data-entry}/
- Shared: shared/{auth,database,routes}/
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Plan Upgrade Multi-Tenant Architecture - ROA2WEB
Version: 1.0 Created: 2025-10-25 Status: Planning Phase
📋 Executive Summary
ROA2WEB will be transformed from a single-tenant application (one client, one Oracle database) into a multi-tenant SaaS architecture that supports:
- Multiple simultaneous clients with complete isolation between tenants (pools, cache, audit logs)
- Hybrid connections: SSH tunnel for remote clients OR direct TCP for clients on the LAN
- Flexible deployment: Development (WSL), Docker (Proxmox LXC), Windows IIS
- Backward compatibility: The "default" tenant works exactly like the current single-tenant setup (zero breaking changes)
- Gradual migration: Each phase is independently testable, with an incremental rollout
- Security-first: Passwords encrypted in the tenant DB, read-only SSH keys, JWT signing per tenant
- Performance: < 10% overhead vs single-tenant, with pool isolation per tenant
🏗️ Target Architecture
Single-Tenant (Current)
┌─────────────────────────────────────────────────────┐
│                   FastAPI Backend                   │
│                                                     │
│   ┌─────────────────────────────────────────────┐   │
│   │           OraclePool (Singleton)            │   │
│   │  - Hardcoded credentials from .env          │   │
│   │  - Min: 2, Max: 10 connections              │   │
│   │  - Shared by all users                      │   │
│   └─────────────────────────────────────────────┘   │
│                      ▼                              │
└──────────────────────┼──────────────────────────────┘
                       │
         ┌─────────────┴───────────┐
         │                         │
    SSH Tunnel             Direct Connection
   (Development)         (Windows Production)
         │                         │
         ▼                         ▼
┌─────────────────┐      ┌──────────────────┐
│  Oracle Server  │      │  Oracle Server   │
│    (Remote)     │      │   (Local LAN)    │
└─────────────────┘      └──────────────────┘
JWT Token Structure (Current):
{
"username": "john.doe",
"user_id": 123,
"companies": ["COMP1", "COMP2"],
"permissions": ["read", "reports"],
"exp": 1234567890,
"iat": 1234567800,
"type": "access"
}
Multi-Tenant (Target)
┌──────────────────────────────────────────────────────────────────┐
│                         FastAPI Backend                          │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                MultiTenantPoolManager (New)                │  │
│  │                                                            │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │  │
│  │  │ Client A     │  │ Client B     │  │ Client C     │      │  │
│  │  │ Pool (2-10)  │  │ Pool (2-10)  │  │ Pool (2-10)  │      │  │
│  │  │ SSH Tunnel   │  │ Direct Conn  │  │ SSH Tunnel   │      │  │
│  │  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘      │  │
│  │         │                 │                 │              │  │
│  └─────────┼─────────────────┼─────────────────┼──────────────┘  │
│            │                 │                 │                 │
└────────────┼─────────────────┼─────────────────┼─────────────────┘
             │                 │                 │
    ┌────────┴─────┐  ┌────────┴─────┐  ┌────────┴─────┐
    │ SSH Process  │  │ Direct       │  │ SSH Process  │
    │ localhost:   │  │ 192.168.1.50 │  │ localhost:   │
    │ 15261        │  │ :1521        │  │ 15262        │
    └────────┬─────┘  └────────┬─────┘  └────────┬─────┘
             │                 │                 │
             ▼                 ▼                 ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │ Oracle       │  │ Oracle       │  │ Oracle       │
    │ Client A     │  │ Client B     │  │ Client C     │
    │ (Remote)     │  │ (LAN)        │  │ (Remote)     │
    └──────────────┘  └──────────────┘  └──────────────┘

                    ┌──────────────────────┐
                    │ Tenant Config DB     │
                    │ (PostgreSQL/SQLite)  │
                    │                      │
                    │ - tenants            │
                    │ - tenant_users       │
                    │ - audit_logs         │
                    └──────────────────────┘
JWT Token Structure (Target):
{
"username": "john.doe",
"user_id": 123,
"tenant_id": "client-a-uuid", ← NEW
"companies": ["COMP1", "COMP2"],
"permissions": ["read", "reports"],
"exp": 1234567890,
"iat": 1234567800,
"type": "access"
}
Redis Cache Keys:
cache:{tenant_id}:dashboard:{company_id} ← Already prepared!
cache:{tenant_id}:invoices:{filters_hash}
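The key layout above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the function names `dashboard_key` and `invoices_key` are hypothetical, and deriving `filters_hash` from a truncated SHA-256 of the sorted-key JSON is one possible choice, not something the plan specifies.

```python
import hashlib
import json


def dashboard_key(tenant_id: str, company_id: str) -> str:
    """Build the tenant-scoped dashboard cache key."""
    return f"cache:{tenant_id}:dashboard:{company_id}"


def invoices_key(tenant_id: str, filters: dict) -> str:
    """Build the tenant-scoped invoices cache key.

    Assumption: filters_hash is a truncated SHA-256 of the filters
    serialized as JSON with sorted keys, so equal filters hash equally
    regardless of dict ordering.
    """
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    filters_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"cache:{tenant_id}:invoices:{filters_hash}"
```

Because the tenant ID is the second key segment, two tenants issuing the same query with the same filters still get distinct keys.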
Key Architectural Decisions
- Lazy Pool Initialization: Pools are created only when a tenant is first accessed (saves memory)
- SSH Tunnel per Tenant: A separate subprocess for each remote tenant (isolation, resilience)
- Separate Tenant Config DB: Tenant config is not stored in Oracle (avoids circular dependencies)
- Signed JWT Tenant ID: The tenant ID lives inside the signed token, so the client cannot modify it
- Pool Cleanup: Pools inactive for > 1h are closed automatically (saves resources)
- Backward Compatible: The "default" tenant maps to the current .env (zero migration pain)
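To illustrate why a signed tenant ID cannot be changed by the client, here is a minimal sketch using only the standard library. It is not a JWT implementation and not the project's `jwt_handler`: the `sign` and `verify` helpers, the token layout (`payload.signature`), and the demo secret are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret-not-for-production"  # hypothetical signing key


def _b64(data: bytes) -> str:
    """Base64url-encode without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(payload: dict, secret: bytes = SECRET) -> str:
    """Return 'payload.signature', both base64url encoded."""
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    sig = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).digest()
    return f"{body}.{_b64(sig)}"


def verify(token: str, secret: bytes = SECRET) -> Optional[dict]:
    """Return the payload if the signature matches, else None."""
    body, _, sig = token.partition(".")
    expected = _b64(hmac.new(secret, body.encode("utf-8"), hashlib.sha256).digest())
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    padded = body + "=" * (-len(body) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


token = sign({"username": "john.doe", "tenant_id": "client-a"})

# A client swaps the payload for another tenant but cannot recompute the signature.
forged_body = _b64(
    json.dumps({"username": "john.doe", "tenant_id": "client-b"}, sort_keys=True).encode("utf-8")
)
forged = forged_body + "." + token.split(".")[1]
```

Here `verify(token)` returns the original payload, while `verify(forged)` returns `None` because the signature no longer matches the altered body.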
🗂️ File Structure
New Files
shared/
├── database/
│   ├── multi_tenant_pool.py        ✅ NEW - MultiTenantPoolManager class
│   ├── tenant_config.py            ✅ NEW - Tenant configuration loader
│   ├── ssh_tunnel_manager.py       ✅ NEW - SSH tunnel per tenant management
│   └── tenant_models.py            ✅ NEW - Pydantic models for tenants
│
├── middleware/
│   └── tenant_middleware.py        ✅ NEW - Tenant identification middleware
│
├── schemas/
│   └── tenant_config_schema.sql    ✅ NEW - PostgreSQL/SQLite schema
│
└── utils/
    ├── encryption.py               ✅ NEW - Fernet encryption for passwords
    └── tenant_utils.py             ✅ NEW - Tenant helper functions
deployment/
├── docker/
│   └── tenant-config-db.dockerfile ✅ NEW - PostgreSQL tenant config container
│
└── windows/
    └── tenant-config-setup.ps1     ✅ NEW - SQL Server Express setup for tenants
Modified Files
shared/
├── database/
│   └── oracle_pool.py              ⚠️ MODIFY - Add DEPRECATED warning
│
├── auth/
│   ├── jwt_handler.py              ⚠️ MODIFY - Add tenant_id to JWT payload
│   └── middleware.py               ⚠️ MODIFY - Extract tenant_id, validate access
│
└── cache/
    └── redis_client.py             ⚠️ MODIFY - Use real tenant_id (not "default")
backend/
├── app/
│   ├── main.py                     ⚠️ MODIFY - Initialize MultiTenantPoolManager
│   └── routers/
│       ├── companies.py            ⚠️ MODIFY - Use tenant_id from request.state
│       ├── dashboard.py            ⚠️ MODIFY - Use tenant_id from request.state
│       ├── invoices.py             ⚠️ MODIFY - Use tenant_id from request.state
│       └── treasury.py             ⚠️ MODIFY - Use tenant_id from request.state
│
└── .env.example                    ⚠️ MODIFY - Add tenant config DB variables
docker-compose.yml                  ⚠️ MODIFY - Add tenant-config-db service
deployment/windows/
└── scripts/
    └── Install-ROA2WEB.ps1         ⚠️ MODIFY - Add tenant DB setup
Database Schema (Tenant Config DB)
PostgreSQL/SQLite Compatible Schema
-- shared/schemas/tenant_config_schema.sql
-- Tenants configuration table
CREATE TABLE IF NOT EXISTS tenants (
id VARCHAR(36) PRIMARY KEY, -- UUID
name VARCHAR(255) NOT NULL, -- Display name (ex: "Client A - Retail SRL")
connection_type VARCHAR(20) NOT NULL, -- 'ssh_tunnel' | 'direct'
-- Oracle connection details
oracle_host VARCHAR(255) NOT NULL, -- Oracle server IP/hostname
oracle_port INTEGER NOT NULL DEFAULT 1521,
oracle_sid VARCHAR(50) NOT NULL DEFAULT 'ROA',
oracle_user VARCHAR(100) NOT NULL,
oracle_password_encrypted TEXT NOT NULL, -- Fernet encrypted password
-- SSH tunnel configuration (NULL if connection_type='direct')
ssh_host VARCHAR(255), -- SSH server IP
ssh_port INTEGER DEFAULT 22,
ssh_user VARCHAR(100),
ssh_key_path VARCHAR(500), -- Path to SSH private key
ssh_tunnel_local_port INTEGER, -- Local port for tunnel (ex: 15261)
-- Pool configuration
min_connections INTEGER NOT NULL DEFAULT 2,
max_connections INTEGER NOT NULL DEFAULT 10,
-- Status
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
-- Constraints
CONSTRAINT chk_connection_type CHECK (connection_type IN ('ssh_tunnel', 'direct')),
CONSTRAINT chk_ssh_config CHECK (
(connection_type = 'direct') OR
(connection_type = 'ssh_tunnel' AND ssh_host IS NOT NULL AND ssh_key_path IS NOT NULL)
)
);
-- Tenant users mapping (which users have access to which tenants)
CREATE TABLE IF NOT EXISTS tenant_users (
id SERIAL PRIMARY KEY, -- Auto-increment ID (PostgreSQL; SQLite needs INTEGER PRIMARY KEY to auto-assign IDs)
tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
user_id INTEGER NOT NULL, -- Oracle user ID from CONTAFIN_ORACLE.UTILIZATORI
username VARCHAR(100) NOT NULL, -- Oracle username
is_admin BOOLEAN NOT NULL DEFAULT FALSE, -- Tenant admin (can manage tenant config)
granted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
granted_by INTEGER, -- User ID who granted access
UNIQUE(tenant_id, user_id)
);
-- Audit logs per tenant
CREATE TABLE IF NOT EXISTS audit_logs (
id SERIAL PRIMARY KEY,
tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
user_id INTEGER NOT NULL,
username VARCHAR(100) NOT NULL,
action VARCHAR(100) NOT NULL, -- 'login', 'query', 'export', etc.
resource VARCHAR(255), -- Resource accessed (ex: 'dashboard', 'invoices')
status VARCHAR(20) NOT NULL, -- 'success' | 'error'
error_message TEXT,
ip_address VARCHAR(50),
user_agent TEXT,
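created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Indexes for fast queries (inline INDEX clauses are MySQL syntax and fail on PostgreSQL/SQLite)
CREATE INDEX IF NOT EXISTS idx_audit_tenant_user ON audit_logs(tenant_id, user_id);
CREATE INDEX IF NOT EXISTS idx_audit_created_at ON audit_logs(created_at);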
-- Insert default tenant (backward compatibility)
-- This maps to existing .env credentials
INSERT INTO tenants (
id, name, connection_type,
oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
min_connections, max_connections, is_active
) VALUES (
'default',
'Default Tenant (Single-Tenant Legacy)',
'direct', -- Placeholder, read from environment; 'ssh_tunnel' without ssh_host/ssh_key_path would violate chk_ssh_config
'localhost', -- Will be overridden by environment if needed
1526,
'ROA',
'CONTAFIN_ORACLE',
'PLACEHOLDER_ENCRYPTED_PASSWORD', -- Will be replaced by migration script
2,
10,
TRUE
) ON CONFLICT (id) DO NOTHING;
-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_tenants_active ON tenants(is_active);
CREATE INDEX IF NOT EXISTS idx_tenant_users_user ON tenant_users(user_id);
CREATE INDEX IF NOT EXISTS idx_audit_tenant ON audit_logs(tenant_id);
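The effect of the `chk_connection_type` and `chk_ssh_config` constraints can be checked quickly against an in-memory SQLite database. The sketch below uses a trimmed copy of the `tenants` table (only the columns the constraints touch); the `try_insert` helper and the sample values are hypothetical.

```python
import sqlite3

# Trimmed copy of the tenants table: only the columns the constraints need.
SCHEMA = """
CREATE TABLE tenants (
    id VARCHAR(36) PRIMARY KEY,
    connection_type VARCHAR(20) NOT NULL,
    ssh_host VARCHAR(255),
    ssh_key_path VARCHAR(500),
    CONSTRAINT chk_connection_type CHECK (connection_type IN ('ssh_tunnel', 'direct')),
    CONSTRAINT chk_ssh_config CHECK (
        (connection_type = 'direct') OR
        (connection_type = 'ssh_tunnel' AND ssh_host IS NOT NULL AND ssh_key_path IS NOT NULL)
    )
);
"""


def try_insert(conn, tenant_id, connection_type, ssh_host=None, ssh_key_path=None) -> bool:
    """Insert one tenant row; return False if a CHECK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tenants (id, connection_type, ssh_host, ssh_key_path) "
                "VALUES (?, ?, ?, ?)",
                (tenant_id, connection_type, ssh_host, ssh_key_path),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
results = {
    "direct": try_insert(conn, "t-direct", "direct"),
    "tunnel_complete": try_insert(conn, "t-ssh", "ssh_tunnel", "10.0.0.5", "/keys/t-ssh"),
    "tunnel_missing_host": try_insert(conn, "t-bad", "ssh_tunnel"),
    "unknown_type": try_insert(conn, "t-odd", "vpn"),
}
```

A row with `connection_type='ssh_tunnel'` is accepted only when both `ssh_host` and `ssh_key_path` are set, which is worth keeping in mind for any seed row inserted by the schema file.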
🚀 Upgrade Phases
PHASE 1: Tenant Configuration Database (2-3 days)
Objective: Create the tenant configuration database and a loader that reads tenant configs at startup.
Tasks
- Create the PostgreSQL/SQLite schema for tenant config
  - File: shared/schemas/tenant_config_schema.sql
  - Action: Define tables tenants, tenant_users, audit_logs
  - Deployment:
    - Dev: SQLite (data/tenant_config.db)
    - Docker: PostgreSQL container (roa-tenant-config-db)
    - Windows: SQL Server Express OR PostgreSQL Windows service
- Implement TenantConfigLoader
  - File: shared/database/tenant_config.py
  - Class: TenantConfigLoader(db_url: str)
  - Methods:
    - async def load_tenants() -> Dict[str, TenantConfig] - Load all active tenants
    - async def get_tenant(tenant_id: str) -> Optional[TenantConfig] - Get specific tenant
    - async def reload_tenant(tenant_id: str) - Reload tenant config (for updates)
  - Pattern: Async context manager for DB connections
- Implement Pydantic models for tenant config
  - File: shared/database/tenant_models.py
  - Models:
      from typing import Literal, Optional
      from pydantic import BaseModel

      class TenantConfig(BaseModel):
          id: str  # UUID
          name: str
          connection_type: Literal['ssh_tunnel', 'direct']
          oracle_host: str
          oracle_port: int
          oracle_sid: str
          oracle_user: str
          oracle_password: str  # Decrypted
          ssh_host: Optional[str] = None
          ssh_port: Optional[int] = 22
          ssh_user: Optional[str] = None
          ssh_key_path: Optional[str] = None
          ssh_tunnel_local_port: Optional[int] = None
          min_connections: int = 2
          max_connections: int = 10
          is_active: bool = True
- Implement password encryption/decryption
  - File: shared/utils/encryption.py
  - Functions:
    - encrypt_password(password: str, key: str) -> str - Fernet encryption
    - decrypt_password(encrypted: str, key: str) -> str - Fernet decryption
  - Environment: DB_ENCRYPTION_KEY (generate with Fernet.generate_key())
- Create the migration script for the default tenant
  - File: shared/scripts/create_default_tenant.py
  - Action:
    - Read credentials from the current .env
    - Encrypt the password with DB_ENCRYPTION_KEY
    - Insert the "default" tenant into the tenant DB
    - Test decryption and the Oracle connection
- Update Docker Compose with the tenant config DB
  - File: docker-compose.yml
  - New service:
      roa-tenant-config-db:
        image: postgres:15-alpine
        container_name: roa-tenant-config-db
        environment:
          POSTGRES_DB: tenant_config
          POSTGRES_USER: tenant_admin
          POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
        volumes:
          - tenant-config-data:/var/lib/postgresql/data
          - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
        networks:
          - roa-network
- Update .env.example with the tenant DB variables
      # Tenant Configuration Database
      TENANT_DB_URL=postgresql://tenant_admin:password@localhost:5432/tenant_config
      # For SQLite (development): sqlite:///data/tenant_config.db
      DB_ENCRYPTION_KEY=GENERATE_WITH_Fernet.generate_key()
Verifiable Output
- ✅ Tenant DB is created successfully (PostgreSQL/SQLite)
- ✅ Schema tables created (tenants, tenant_users, audit_logs)
- ✅ Default tenant loads with credentials from the current .env
- ✅ Password encryption/decryption works
- ✅ Test: pytest shared/tests/test_tenant_config.py -v
- ✅ Docker: docker-compose up roa-tenant-config-db starts successfully
PHASE 2: MultiTenantPoolManager (3-4 days)
Objective: Implement a pool manager that creates separate Oracle pools per tenant with lazy initialization.
Tasks
- Implement the MultiTenantPoolManager class
  - File: shared/database/multi_tenant_pool.py
  - Pattern: Singleton (similar to the current OraclePool)
  - Structure:
      class MultiTenantPoolManager:
          _instance: Optional['MultiTenantPoolManager'] = None
          _pools: Dict[str, oracledb.ConnectionPool] = {}  # tenant_id -> pool
          _tenant_configs: Dict[str, TenantConfig] = {}
          _pool_locks: Dict[str, asyncio.Lock] = {}  # Serializes pool creation per tenant (coroutine-safe, not thread-safe)
          _last_access: Dict[str, datetime] = {}  # For cleanup of inactive pools

          async def initialize(self, tenant_db_url: str):
              """Load tenant configs from tenant DB"""

          async def get_connection(self, tenant_id: str):
              """Context manager - get connection from tenant pool (lazy init)"""

          async def _ensure_pool(self, tenant_id: str):
              """Lazy initialize pool if not exists"""

          async def reload_tenant(self, tenant_id: str):
              """Reload tenant config and recreate pool"""

          async def cleanup_inactive_pools(self, max_idle_hours: int = 1):
              """Close pools inactive > max_idle_hours"""

          async def close_all_pools(self):
              """Shutdown - close all pools"""
- Implement lazy pool initialization
  - Logic:
      async def _ensure_pool(self, tenant_id: str):
          if tenant_id in self._pools:
              self._last_access[tenant_id] = datetime.utcnow()
              return  # Pool already exists

          # Acquire the per-tenant lock so concurrent requests create only one pool
          async with self._pool_locks.setdefault(tenant_id, asyncio.Lock()):
              # Double-check inside the lock
              if tenant_id in self._pools:
                  return

              # Load tenant config
              tenant_config = await self._load_tenant_config(tenant_id)
              if not tenant_config.is_active:
                  raise ValueError(f"Tenant {tenant_id} is not active")

              # Create pool
              pool = oracledb.create_pool(
                  user=tenant_config.oracle_user,
                  password=tenant_config.oracle_password,
                  host=tenant_config.oracle_host,
                  port=tenant_config.oracle_port,
                  sid=tenant_config.oracle_sid,
                  min=tenant_config.min_connections,
                  max=tenant_config.max_connections,
                  increment=1,
                  getmode=oracledb.POOL_GETMODE_WAIT
              )
              self._pools[tenant_id] = pool
              self._tenant_configs[tenant_id] = tenant_config
              self._last_access[tenant_id] = datetime.utcnow()
              logger.info(f"Created pool for tenant {tenant_id} ({tenant_config.name})")
- Implement the get_connection context manager
  - Pattern: Same as OraclePool.get_connection(), but per tenant
      @asynccontextmanager
      async def get_connection(self, tenant_id: str):
          await self._ensure_pool(tenant_id)  # Lazy init
          pool = self._pools[tenant_id]
          connection = None
          try:
              connection = pool.acquire()
              self._last_access[tenant_id] = datetime.utcnow()
              logger.debug(f"Connection acquired for tenant {tenant_id}")
              yield connection
          finally:
              if connection is not None:
                  connection.close()  # Returns the connection to the pool
                  logger.debug(f"Connection returned for tenant {tenant_id}")
- Implement pool cleanup for inactive tenants
  - Scheduled task: Run every hour, close pools inactive > 1h
      async def cleanup_inactive_pools(self, max_idle_hours: int = 1):
          now = datetime.utcnow()
          inactive_tenants = []
          for tenant_id, last_access in self._last_access.items():
              idle_hours = (now - last_access).total_seconds() / 3600
              if idle_hours > max_idle_hours:
                  inactive_tenants.append(tenant_id)

          for tenant_id in inactive_tenants:
              logger.info(f"Closing inactive pool for tenant {tenant_id}")
              pool = self._pools.pop(tenant_id, None)
              if pool:
                  pool.close()
              self._tenant_configs.pop(tenant_id, None)
              self._last_access.pop(tenant_id, None)
- Implement tenant config reload (for dynamic updates)
  - Use case: An admin updates the tenant config in the DB and the application reloads it without a restart
      async def reload_tenant(self, tenant_id: str):
          # Close existing pool
          old_pool = self._pools.pop(tenant_id, None)
          if old_pool:
              old_pool.close()

          # Reload config from DB
          tenant_config = await self._tenant_config_loader.get_tenant(tenant_id)
          if not tenant_config:
              raise ValueError(f"Tenant {tenant_id} not found")

          # Pool will be recreated on next request (lazy init)
          self._tenant_configs.pop(tenant_id, None)
          self._last_access.pop(tenant_id, None)
          logger.info(f"Reloaded tenant config for {tenant_id}")
- Add backward compatibility layer
  - The "default" tenant maps to the credentials in .env for zero breaking changes
      async def _load_default_tenant_from_env(self) -> TenantConfig:
          """Fallback: Load default tenant from .env if tenant DB is not available"""
          return TenantConfig(
              id='default',
              name='Default Tenant (Legacy)',
              connection_type='ssh_tunnel' if os.getenv('ORACLE_HOST') == 'localhost' else 'direct',
              oracle_host=os.getenv('ORACLE_HOST', 'localhost'),
              oracle_port=int(os.getenv('ORACLE_PORT', '1526')),
              oracle_sid=os.getenv('ORACLE_SID', 'ROA'),
              oracle_user=os.getenv('ORACLE_USER'),
              oracle_password=os.getenv('ORACLE_PASSWORD'),
              min_connections=2,
              max_connections=10,
              is_active=True
          )
- Mark OraclePool as DEPRECATED
  - File: shared/database/oracle_pool.py
  - Action: Add deprecation warning
      import warnings

      class OraclePool:
          """
          DEPRECATED: Use MultiTenantPoolManager instead.
          This class is kept for backward compatibility only.
          Will be removed in version 2.0.
          """
          def __init__(self):
              warnings.warn(
                  "OraclePool is deprecated. Use MultiTenantPoolManager for multi-tenant support.",
                  DeprecationWarning,
                  stacklevel=2
              )
              # ... rest of code
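The lazy initialization described in the tasks above relies on double-checked locking with one `asyncio.Lock` per tenant. The following is a minimal, self-contained sketch of that pattern; `LazyPoolRegistry` and `pool_factory` are hypothetical names, and a plain dict stands in for a real `oracledb` pool.

```python
import asyncio
from typing import Callable, Dict


class LazyPoolRegistry:
    """Create one pool object per tenant on first use (double-checked locking)."""

    def __init__(self, pool_factory: Callable[[str], object]):
        self._pool_factory = pool_factory  # hypothetical stand-in for oracledb.create_pool
        self._pools: Dict[str, object] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    async def get_pool(self, tenant_id: str) -> object:
        # Fast path: the pool already exists, no lock needed.
        if tenant_id in self._pools:
            return self._pools[tenant_id]
        lock = self._locks.setdefault(tenant_id, asyncio.Lock())
        async with lock:
            # Re-check: another coroutine may have created the pool while we waited.
            if tenant_id not in self._pools:
                await asyncio.sleep(0)  # simulate an awaitable creation step
                self._pools[tenant_id] = self._pool_factory(tenant_id)
            return self._pools[tenant_id]


async def demo() -> int:
    created = []

    def factory(tenant_id: str) -> object:
        created.append(tenant_id)
        return {"tenant": tenant_id}

    registry = LazyPoolRegistry(factory)
    # Ten concurrent first requests for the same tenant.
    pools = await asyncio.gather(*(registry.get_pool("client-a") for _ in range(10)))
    assert all(p is pools[0] for p in pools)
    return len(created)


creations = asyncio.run(demo())
```

Even with ten concurrent first requests, the factory runs once. Note that `asyncio.Lock` only coordinates coroutines on one event loop; it does not protect against access from other threads.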
Verifiable Output
- ✅ MultiTenantPoolManager creates pools per tenant
- ✅ Lazy initialization: Pool created only on the first request
- ✅ The "default" tenant works with credentials from .env (backward compatible)
- ✅ Pool cleanup: Inactive pools are closed automatically after 1h
- ✅ Reload tenant: Config update without restarting the application
- ✅ Test: pytest shared/tests/test_multi_tenant_pool.py -v
- ✅ Test: Connect to 3 dummy tenants simultaneously
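The idle-pool rule (close pools inactive for more than one hour) is easy to unit test when the selection step is a pure function. The sketch below is one way to isolate it; `find_idle_tenants` is a hypothetical helper, and it uses timezone-aware datetimes as an assumption rather than `datetime.utcnow()`.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def find_idle_tenants(
    last_access: Dict[str, datetime],
    now: datetime,
    max_idle_hours: float = 1.0,
) -> List[str]:
    """Return tenant IDs whose pools have been idle longer than max_idle_hours."""
    limit = timedelta(hours=max_idle_hours)
    return sorted(t for t, seen in last_access.items() if now - seen > limit)


now = datetime(2025, 10, 25, 12, 0, tzinfo=timezone.utc)
last_access = {
    "client-a": now - timedelta(minutes=5),
    "client-b": now - timedelta(hours=2),
    "client-c": now - timedelta(hours=1),  # exactly at the limit: kept
}
idle = find_idle_tenants(last_access, now)
```

Only `client-b` is selected here; a pool that has been idle for exactly the limit is kept because the comparison is strict.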
PHASE 3: SSH Tunnel Management per Tenant (2-3 days)
Objective: Implement an SSH tunnel manager that creates and monitors one SSH subprocess per remote tenant.
Tasks
- Implement the SSHTunnelManager class
  - File: shared/database/ssh_tunnel_manager.py
  - Responsibilities:
    - Start SSH tunnel subprocess per tenant
    - Monitor tunnel health (periodic checks)
    - Auto-restart on failure (exponential backoff)
    - Cleanup at shutdown
  - Structure:
      class SSHTunnelManager:
          _tunnels: Dict[str, subprocess.Popen] = {}  # tenant_id -> SSH process
          _tunnel_ports: Dict[str, int] = {}  # tenant_id -> local port
          _restart_attempts: Dict[str, int] = {}  # For exponential backoff

          async def start_tunnel(self, tenant_config: TenantConfig) -> int:
              """Start SSH tunnel for tenant, return local port"""

          async def stop_tunnel(self, tenant_id: str):
              """Stop SSH tunnel subprocess"""

          async def check_tunnel_health(self, tenant_id: str) -> bool:
              """Check if tunnel is alive and responding"""

          async def restart_tunnel(self, tenant_id: str):
              """Restart tunnel with exponential backoff"""

          async def cleanup_all_tunnels(self):
              """Shutdown - kill all SSH processes"""
- Implement the SSH tunnel start logic
  - Logic:
      async def start_tunnel(self, tenant_config: TenantConfig) -> int:
          tenant_id = tenant_config.id

          # Generate unique local port for this tenant
          local_port = tenant_config.ssh_tunnel_local_port or self._allocate_port()

          # Build SSH command.
          # No '-f': with -f, ssh forks into the background and the Popen handle
          # would point at an already-exited parent, breaking poll()/terminate().
          ssh_cmd = [
              'ssh', '-N',
              '-L', f'{local_port}:{tenant_config.oracle_host}:{tenant_config.oracle_port}',
              '-p', str(tenant_config.ssh_port),
              '-i', tenant_config.ssh_key_path,
              '-o', 'ServerAliveInterval=60',
              '-o', 'ServerAliveCountMax=3',
              '-o', 'ExitOnForwardFailure=yes',
              f'{tenant_config.ssh_user}@{tenant_config.ssh_host}'
          ]

          # Start process
          process = subprocess.Popen(ssh_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

          # Wait for tunnel to establish (max 10 seconds)
          for _ in range(10):
              if self._check_port_open('localhost', local_port):
                  break
              await asyncio.sleep(1)
          else:
              process.kill()
              raise RuntimeError(f"SSH tunnel failed to start for tenant {tenant_id}")

          self._tunnels[tenant_id] = process
          self._tunnel_ports[tenant_id] = local_port
          logger.info(f"SSH tunnel started for tenant {tenant_id} on port {local_port}")
          return local_port
- Implement tunnel health checks
  - Periodic check: Every 60 seconds, verify tunnel is alive
      async def check_tunnel_health(self, tenant_id: str) -> bool:
          if tenant_id not in self._tunnels:
              return False

          process = self._tunnels[tenant_id]
          local_port = self._tunnel_ports[tenant_id]

          # Check process is alive
          if process.poll() is not None:
              logger.warning(f"SSH tunnel process died for tenant {tenant_id}")
              return False

          # Check port is accessible
          if not self._check_port_open('localhost', local_port):
              logger.warning(f"SSH tunnel port {local_port} not accessible for tenant {tenant_id}")
              return False

          return True

      def _check_port_open(self, host: str, port: int) -> bool:
          import socket
          try:
              with socket.create_connection((host, port), timeout=2):
                  return True
          except OSError:  # Refused, unreachable, or timed out (not a bare except)
              return False
- Implement auto-restart with exponential backoff
  - Logic: If the tunnel dies, restart with delays of 5s, 10s, 20s, 40s, capped at 60s
      async def restart_tunnel(self, tenant_id: str):
          attempts = self._restart_attempts.get(tenant_id, 0)
          delay = min(5 * (2 ** attempts), 60)  # Exponential backoff, max 60s
          logger.info(f"Restarting tunnel for tenant {tenant_id} (attempt {attempts+1}, delay {delay}s)")
          await asyncio.sleep(delay)

          try:
              await self.stop_tunnel(tenant_id)
              tenant_config = await self._get_tenant_config(tenant_id)
              await self.start_tunnel(tenant_config)
              # Reset attempts on success
              self._restart_attempts[tenant_id] = 0
              logger.info(f"Tunnel restarted successfully for tenant {tenant_id}")
          except Exception as e:
              self._restart_attempts[tenant_id] = attempts + 1
              logger.error(f"Tunnel restart failed for tenant {tenant_id}: {e}")
              raise
- Integrate the SSH tunnel manager into MultiTenantPoolManager
  - Logic: If a tenant has connection_type='ssh_tunnel', start the tunnel before creating the pool
      # In MultiTenantPoolManager._ensure_pool()
      tenant_config = await self._load_tenant_config(tenant_id)

      # Start SSH tunnel if needed
      if tenant_config.connection_type == 'ssh_tunnel':
          if await self._ssh_tunnel_manager.check_tunnel_health(tenant_id):
              # Tunnel already running (e.g. the pool was closed by idle cleanup): reuse its port
              local_port = self._ssh_tunnel_manager._tunnel_ports[tenant_id]
          else:
              local_port = await self._ssh_tunnel_manager.start_tunnel(tenant_config)
          # Override Oracle host/port to use tunnel
          tenant_config.oracle_host = 'localhost'
          tenant_config.oracle_port = local_port

      # Create pool (rest of code same as before)
      pool = oracledb.create_pool(...)
- Implement cleanup at shutdown
  - Logic: Stop all SSH processes gracefully
      async def cleanup_all_tunnels(self):
          for tenant_id, process in self._tunnels.items():
              try:
                  process.terminate()  # SIGTERM
                  await asyncio.sleep(2)
                  if process.poll() is None:
                      process.kill()  # SIGKILL if not dead
                  logger.info(f"Stopped SSH tunnel for tenant {tenant_id}")
              except Exception as e:
                  logger.error(f"Error stopping tunnel for tenant {tenant_id}: {e}")

          self._tunnels.clear()
          self._tunnel_ports.clear()
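- Add a background task for health monitoring
  - File: backend/app/main.py
  - Task: Run every 60 seconds
      async def monitor_ssh_tunnels():
          # _tunnels lives on the SSHTunnelManager, not on the pool manager
          tunnel_manager = multi_tenant_pool._ssh_tunnel_manager
          while True:
              await asyncio.sleep(60)
              # Copy the keys: restart_tunnel() mutates the dict while we iterate
              for tenant_id in list(tunnel_manager._tunnels.keys()):
                  if not await tunnel_manager.check_tunnel_health(tenant_id):
                      logger.warning(f"Tunnel unhealthy for tenant {tenant_id}, restarting...")
                      try:
                          await tunnel_manager.restart_tunnel(tenant_id)
                      except Exception:
                          # restart_tunnel() already logged the failure; keep monitoring the other tenants
                          continue

      # In lifespan startup (keep a reference so the task is not garbage-collected)
      monitor_task = asyncio.create_task(monitor_ssh_tunnels())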
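The tasks above call `self._allocate_port()` and `self._check_port_open()` without showing the former. One common approach, sketched here as an assumption rather than the project's implementation, is to bind to port 0 and let the operating system pick a free port. The function names `allocate_free_port` and `is_port_open` are hypothetical.

```python
import socket


def allocate_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo: a listening socket stands in for an established tunnel.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
open_port = listener.getsockname()[1]
open_result = is_port_open("127.0.0.1", open_port)
listener.close()
closed_result = is_port_open("127.0.0.1", open_port)

free_port = allocate_free_port()
```

The port is released before `ssh` binds it, so another process could claim it in between. Storing a fixed `ssh_tunnel_local_port` per tenant, as the schema allows, avoids that race.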
Verifiable Output
- ✅ An SSH tunnel subprocess starts per remote tenant
- ✅ Tunnel health check detects dead tunnels
- ✅ Auto-restart with exponential backoff works
- ✅ Multiple tenants with simultaneous SSH tunnels (unique port allocation)
- ✅ Cleanup at shutdown: all SSH processes stop
- ✅ Test: pytest shared/tests/test_ssh_tunnel_manager.py -v
- ✅ Manual test: Kill the SSH process, verify auto-restart in < 60s
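The restart delays quoted in this phase follow from `min(5 * 2**attempts, 60)`. A small sketch makes the resulting schedule explicit; the helper names `backoff_delay` and `backoff_schedule` are hypothetical.

```python
from typing import List


def backoff_delay(attempts: int, base: float = 5.0, cap: float = 60.0) -> float:
    """Delay before the next restart: base * 2**attempts, capped at cap seconds."""
    return min(base * (2 ** attempts), cap)


def backoff_schedule(max_attempts: int) -> List[float]:
    """Delays for consecutive failed attempts 0..max_attempts-1."""
    return [backoff_delay(n) for n in range(max_attempts)]


schedule = backoff_schedule(6)
```

The first four delays are 5, 10, 20 and 40 seconds. The fifth would be 80 seconds uncapped, so it and every later delay are clamped to 60.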
PHASE 4: JWT & Middleware Update (2-3 days)
Objective: Update JWT tokens to include tenant_id, and update the middleware to extract it and validate tenant access.
Tasks
- Update the JWT handler to include tenant_id
  - File: shared/auth/jwt_handler.py
  - Changes:
      # In the TokenData model
      class TokenData(BaseModel):
          username: str
          user_id: Optional[int] = None
          tenant_id: str = Field(description="Tenant ID (UUID)")  # NEW
          companies: List[str] = Field(default_factory=list)
          permissions: List[str] = Field(default_factory=list)
          exp: datetime
          iat: datetime
          token_type: str = Field(alias="type")

      # In create_access_token()
      def create_access_token(
          self,
          username: str,
          tenant_id: str,  # NEW parameter
          companies: List[str],
          user_id: Optional[int] = None,
          permissions: Optional[List[str]] = None
      ) -> str:
          payload = {
              "username": username,
              "user_id": user_id,
              "tenant_id": tenant_id,  # NEW
              "companies": companies or [],
              "permissions": permissions or ["read"],
              "exp": expire,
              "iat": now,
              "type": "access"
          }
          # ... rest same
- Update the login endpoint to determine tenant_id
  - File: backend/app/main.py (auth router)
  - Logic:
    - Check the tenant_users table for user_id
    - If the user has access to multiple tenants, return the first one (default)
    - Or the user selects a tenant at login (future enhancement)
      # In the login endpoint
      # Get user's tenants from tenant_users table
      tenants = await tenant_config_loader.get_user_tenants(user_id)
      if not tenants:
          # Fallback: Use "default" tenant (backward compatibility)
          tenant_id = "default"
      else:
          # Use first tenant (or let user select in future)
          tenant_id = tenants[0]['tenant_id']

      # Create JWT with tenant_id
      access_token = jwt_handler.create_access_token(
          username=credentials.username,
          tenant_id=tenant_id,  # NEW
          companies=companies,
          user_id=user_id,
          permissions=["read", "reports"]
      )
- Implement TenantMiddleware to validate tenant access
  - File: shared/middleware/tenant_middleware.py
  - Responsibilities:
    - Extract tenant_id from the JWT token
    - Validate that the user has access to that tenant
    - Inject tenant_id into request.state.tenant_id
      class TenantMiddleware(BaseHTTPMiddleware):
          async def dispatch(self, request: Request, call_next):
              # Skip excluded paths
              if request.url.path in self.excluded_paths:
                  return await call_next(request)

              # Extract tenant_id from JWT token (already decoded by AuthMiddleware)
              user = getattr(request.state, 'user', None)
              if not user:
                  return JSONResponse(
                      status_code=401,
                      content={"detail": "Not authenticated"}
                  )

              tenant_id = user.get('tenant_id')
              if not tenant_id:
                  return JSONResponse(
                      status_code=400,
                      content={"detail": "Missing tenant_id in token"}
                  )

              # Validate tenant exists and is active
              tenant_config = await tenant_config_loader.get_tenant(tenant_id)
              if not tenant_config or not tenant_config.is_active:
                  return JSONResponse(
                      status_code=403,
                      content={"detail": f"Tenant {tenant_id} is not active"}
                  )

              # Validate user has access to this tenant
              user_id = user.get('user_id')
              has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
              if not has_access:
                  return JSONResponse(
                      status_code=403,
                      content={"detail": f"User {user_id} does not have access to tenant {tenant_id}"}
                  )

              # Inject tenant_id into request state
              request.state.tenant_id = tenant_id
              request.state.tenant_name = tenant_config.name

              # Continue request
              response = await call_next(request)

              # Log audit entry
              await self._log_audit(request, response, tenant_id, user_id)
              return response
- Update AuthenticationMiddleware to work with TenantMiddleware
  - File: shared/auth/middleware.py
  - Middleware order:
      # In main.py
      # Starlette runs the most recently added middleware first,
      # so AuthenticationMiddleware (added last) executes before TenantMiddleware.
      app.add_middleware(TenantMiddleware, excluded_paths=["/", "/docs", "/health", ...])
      app.add_middleware(AuthenticationMiddleware, excluded_paths=["/", "/docs", "/health", ...])
  - Flow: AuthMiddleware decodes the JWT → TenantMiddleware validates tenant access
- Update all routers to use tenant_id from request.state
  - Files: backend/app/routers/*.py
  - Pattern:
      # Before (single-tenant)
      async with oracle_pool.get_connection() as connection:
          # query...

      # After (multi-tenant)
      tenant_id = request.state.tenant_id  # Injected by TenantMiddleware
      async with multi_tenant_pool.get_connection(tenant_id) as connection:
          # query...
  - Example: dashboard.py
      @router.get("/{company_id}")
      async def get_dashboard(company_id: str, request: Request):
          tenant_id = request.state.tenant_id  # NEW
          async with multi_tenant_pool.get_connection(tenant_id) as connection:
              with connection.cursor() as cursor:
                  # ... rest same
- Update the Telegram bot for tenant support
  - File: backend/modules/telegram/app/auth/linking.py
  - Changes:
    - On linking, also store tenant_id in SQLite
    - The JWT token includes tenant_id
    - All requests to the backend include the correct tenant_id
- Add a tenant selection endpoint (future enhancement)
  - Endpoint: POST /api/auth/select-tenant
  - Use case: A user with access to multiple tenants can switch between them
  - Response: New JWT token with a different tenant_id
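The decision order inside TenantMiddleware (401, then 400, then 403, otherwise continue) can be tested without FastAPI by expressing it as a pure function. This is a sketch under assumptions: `check_tenant_access`, the `Tenant` dataclass, and the in-memory `grants` set are hypothetical stand-ins for the loader calls shown in the tasks above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Tenant:
    id: str
    is_active: bool = True


def check_tenant_access(
    user: Optional[dict],
    tenants: Dict[str, Tenant],
    grants: Set[Tuple[int, str]],
) -> Tuple[int, str]:
    """Return (status_code, detail) following the middleware's decision order."""
    if not user:
        return 401, "Not authenticated"
    tenant_id = user.get("tenant_id")
    if not tenant_id:
        return 400, "Missing tenant_id in token"
    tenant = tenants.get(tenant_id)
    if tenant is None or not tenant.is_active:
        return 403, f"Tenant {tenant_id} is not active"
    user_id = user.get("user_id")
    if (user_id, tenant_id) not in grants:
        return 403, f"User {user_id} does not have access to tenant {tenant_id}"
    return 200, "ok"


tenants = {
    "client-a": Tenant("client-a"),
    "client-b": Tenant("client-b"),
    "old": Tenant("old", is_active=False),
}
grants = {(123, "client-a"), (123, "old")}  # rows of tenant_users as (user_id, tenant_id)
user_a = {"username": "john.doe", "user_id": 123, "tenant_id": "client-a"}
user_b = {**user_a, "tenant_id": "client-b"}  # token for a tenant the user was never granted
user_old = {**user_a, "tenant_id": "old"}     # granted, but the tenant is inactive
```

With this data, `user_a` passes, while `user_b` and `user_old` are both rejected with 403, matching the "tenant A user requests tenant B" scenario.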
Verifiable Output
- ✅ JWT token includes the tenant_id field
- ✅ Login endpoint generates a token with the correct tenant_id
- ✅ TenantMiddleware extracts and validates tenant_id
- ✅ Routers use multi_tenant_pool.get_connection(tenant_id)
- ✅ A request to an invalid tenant returns 403 Forbidden
- ✅ A user without access to the tenant gets 403 Forbidden
- ✅ Test: pytest shared/tests/test_tenant_middleware.py -v
- ✅ Test: Login with a user who has access to tenant A, request to tenant B → 403
PHASE 5: Cache & Audit Logging Integration (1-2 days)
Objective: Update the Redis cache to use the real tenant_id (not "default") and implement audit logging per tenant.
Tasks
- Update the Redis cache to use the real tenant_id
  - File: shared/cache/redis_client.py (if it exists) or inline in the routers
  - Change: Replace the hardcoded "default" with the real tenant_id
  - Before:
      cache_key = f"cache:default:dashboard:{company_id}"
  - After:
      tenant_id = request.state.tenant_id
      cache_key = f"cache:{tenant_id}:dashboard:{company_id}"
- Implement cache invalidation per tenant
  - Use case: An admin updates tenant data; invalidate only that tenant's cache
  - Endpoint: DELETE /api/cache/{tenant_id} (admin only)
  - Logic:
      pattern = f"cache:{tenant_id}:*"
      # SCAN-based iteration avoids blocking Redis the way KEYS can on a large keyspace
      keys = list(redis_client.scan_iter(match=pattern))
      if keys:
          redis_client.delete(*keys)
- Implement audit logging in TenantMiddleware
  - File: shared/middleware/tenant_middleware.py
  - Logic: Log every request in the audit_logs table
      async def _log_audit(self, request: Request, response: Response, tenant_id: str, user_id: int):
          # Extract info
          action = f"{request.method} {request.url.path}"
          status = "success" if response.status_code < 400 else "error"
          # The response returned by call_next() is streamed and has no .body,
          # so record the status code instead of decoding the body.
          error_message = None if status == "success" else f"HTTP {response.status_code}"

          # Insert into audit_logs table
          await audit_logger.log(
              tenant_id=tenant_id,
              user_id=user_id,
              username=request.state.user.get('username'),
              action=action,
              resource=request.url.path,
              status=status,
              error_message=error_message,
              ip_address=request.client.host if request.client else None,
              user_agent=request.headers.get('user-agent')
          )
- Implement the AuditLogger helper class
  - File: shared/utils/audit_logger.py
  - Method:
        from typing import Optional

        class AuditLogger:
            def __init__(self, tenant_db_url: str):
                self.db_url = tenant_db_url

            async def log(
                self,
                tenant_id: str,
                user_id: int,
                username: str,
                action: str,
                resource: str,
                status: str,
                error_message: Optional[str] = None,
                ip_address: Optional[str] = None,
                user_agent: Optional[str] = None
            ):
                # Insert into the audit_logs table (PostgreSQL/SQLite)
                query = """
                    INSERT INTO audit_logs (
                        tenant_id, user_id, username, action, resource,
                        status, error_message, ip_address, user_agent
                    ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
                """
                await self._execute_query(query, [
                    tenant_id, user_id, username, action, resource,
                    status, error_message, ip_address, user_agent
                ])
- Add audit logs viewing endpoint
  - Endpoint: GET /api/audit-logs/{tenant_id} (tenant admin only)
  - Filters: ?user_id=123&start_date=2025-10-01&end_date=2025-10-31&status=error
  - Response: Paginated audit logs for tenant
- Add metrics per tenant (optional, future)
  - Metrics:
    - Request count per tenant
    - Response time per tenant
    - Error rate per tenant
    - Active users per tenant
  - Storage: Time-series database (InfluxDB) or Redis sorted sets
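The tenant-scoped cache key and the per-tenant invalidation from the tasks above could be sketched as follows. This is a minimal sketch, not the project's implementation: tenant_cache_key, invalidate_tenant_cache and InMemoryRedis are hypothetical names, and the client is only assumed to expose scan_iter(match=...) and delete(*keys) the way redis-py does. It iterates with SCAN instead of KEYS, because KEYS blocks the Redis server while it walks the whole keyspace.

```python
import fnmatch


def tenant_cache_key(tenant_id: str, *parts: object) -> str:
    """Build a key in the cache:{tenant_id}:... namespace."""
    return ":".join(["cache", tenant_id, *map(str, parts)])


def invalidate_tenant_cache(client, tenant_id: str, batch_size: int = 500) -> int:
    """Delete every key of one tenant in batches; returns the number deleted."""
    pattern = f"cache:{tenant_id}:*"
    deleted = 0
    batch = []
    for key in client.scan_iter(match=pattern):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    if batch:
        deleted += client.delete(*batch)
    return deleted


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client, for demonstration only."""

    def __init__(self):
        self.store = {}

    def scan_iter(self, match):
        # Snapshot the keys so deletions during iteration are safe
        return [k for k in list(self.store) if fnmatch.fnmatchcase(k, match)]

    def delete(self, *keys):
        return sum(1 for k in keys if self.store.pop(k, None) is not None)


client = InMemoryRedis()
client.store[tenant_cache_key("client-a-uuid", "dashboard", "COMP1")] = "a1"
client.store[tenant_cache_key("client-a-uuid", "invoices", "COMP1")] = "a2"
client.store[tenant_cache_key("client-b-uuid", "dashboard", "COMP1")] = "b1"
removed = invalidate_tenant_cache(client, "client-a-uuid")
```

After the call, only the client-b-uuid key remains in the store, which is the isolation property the DELETE /api/cache/{tenant_id} endpoint relies on.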
Verifiable Output
- ✅ Redis cache keys include the real tenant_id (not "default")
- ✅ Cache isolation: tenant A's cache is not visible to tenant B
- ✅ Per-tenant cache invalidation works
- ✅ Audit logs are saved in the audit_logs table
- ✅ Audit logs include tenant_id, user_id, action, status
- ✅ The audit logs viewing endpoint returns logs filtered per tenant
- ✅ Test: pytest shared/tests/test_audit_logging.py -v
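The filters of the audit logs viewing endpoint (user_id, start_date, end_date, status) could be translated into a parameterized query along these lines. It is a sketch under assumptions: build_audit_log_query is a hypothetical helper, the $n placeholder style follows the INSERT shown in the tasks, the created_at column is taken from the monitoring query used elsewhere in this plan, and page/page_size are assumed pagination parameters.

```python
from typing import Any, Optional


def build_audit_log_query(
    tenant_id: str,
    user_id: Optional[int] = None,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    status: Optional[str] = None,
    page: int = 1,
    page_size: int = 50,
) -> tuple[str, list[Any]]:
    """Return (sql, params); tenant_id is always the first, mandatory filter."""
    conditions = ["tenant_id = $1"]
    params: list[Any] = [tenant_id]

    def add(template: str, value: Any) -> None:
        # Append the value and number its placeholder by position
        params.append(value)
        conditions.append(template.format(n=len(params)))

    if user_id is not None:
        add("user_id = ${n}", user_id)
    if start_date is not None:
        add("created_at >= ${n}", start_date)
    if end_date is not None:
        add("created_at <= ${n}", end_date)
    if status is not None:
        add("status = ${n}", status)

    params.extend([page_size, (page - 1) * page_size])
    limit_n, offset_n = len(params) - 1, len(params)
    sql = (
        "SELECT tenant_id, user_id, username, action, resource, status, created_at "
        "FROM audit_logs WHERE " + " AND ".join(conditions)
        + f" ORDER BY created_at DESC LIMIT ${limit_n} OFFSET ${offset_n}"
    )
    return sql, params


sql, params = build_audit_log_query(
    "client-a-uuid", user_id=123, status="error", page=2, page_size=20
)
```

Because tenant_id is never optional, a caller cannot accidentally build a query that spans tenants; only the values travel as parameters, never as interpolated SQL.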
PHASE 6: Deployment & Testing (3-4 days)
Objective: Deploy multi-tenant to all environments (dev, Docker, Windows) and run the full test suite.
Tasks
- Update the development environment (WSL)
  - Setup:
        # Create SQLite tenant DB
        sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql
        # Generate encryption key
        python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
        # Update .env
        echo "TENANT_DB_URL=sqlite:///data/tenant_config.db" >> .env
        echo "DB_ENCRYPTION_KEY=<generated_key>" >> .env
        # Create default tenant
        python shared/scripts/create_default_tenant.py
        # Start app
        ./start-dev.sh
  - Verification: login works with tenant "default"
- Update the Docker deployment
  - File: docker-compose.yml
  - Changes:
    - Add the roa-tenant-config-db service (PostgreSQL)
    - Update the roa-backend env vars (TENANT_DB_URL, DB_ENCRYPTION_KEY)
    - Mount the SSH keys volume read-only
  - Deployment:
        # Build images
        docker-compose build
        # Start services
        docker-compose up -d
        # Initialize tenant DB
        docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
        # Verify
        docker-compose logs roa-backend | grep "tenant"
- Update the Windows IIS deployment
  - Script: deployment/windows/scripts/Setup-TenantDB.ps1
  - Actions:
    - Install SQL Server Express OR the PostgreSQL Windows service
    - Create the tenant_config database
    - Run the schema SQL
    - Generate the encryption key (store it in Windows Credential Manager)
    - Create the default tenant
  - Manual steps:
        # Run setup
        .\deployment\windows\scripts\Setup-TenantDB.ps1
        # Update web.config with TENANT_DB_URL
        # Restart ROA2WEB-Backend service
        Restart-Service ROA2WEB-Backend
- Implement comprehensive integration tests
  - File: shared/tests/integration/test_multi_tenant_flow.py
  - Scenarios:
    - Login with tenant A → Get dashboard → Cache hit for tenant A
    - Login with tenant B → Get dashboard → Cache miss (different tenant)
    - User with access to tenant A tries tenant B → 403 Forbidden
    - SSH tunnel restarts after being killed → Auto-recovery
    - Tenant inactive > 1h → Pool cleanup
  - Run: pytest shared/tests/integration/ -v --tb=short
- Implement load testing with multiple tenants
  - Tool: Locust or Apache Bench
  - Scenario: 3 tenants, 100 requests each, simultaneously
  - Script: shared/tests/load/test_multi_tenant_load.py
  - Metrics:
    - Response time per tenant (< 200ms avg)
    - Error rate (< 1%)
    - Pool usage (max connections per tenant)
    - SSH tunnel stability (no restarts)
- Create tenant onboarding guide
  - File: shared/docs/TENANT_ONBOARDING.md
  - Contents:
    - How to add a new tenant (manual SQL or admin UI)
    - SSH key setup for a remote tenant
    - User assignment to a tenant
    - Testing the tenant connection
    - Troubleshooting common issues
- Create monitoring dashboard (optional)
  - Tools: Grafana + Prometheus
  - Metrics:
    - Active tenants count
    - Pool connections per tenant
    - Request rate per tenant
    - Error rate per tenant
    - SSH tunnel uptime per tenant
Verifiable Output
- ✅ Development (WSL): multi-tenant works with the SQLite tenant DB
- ✅ Docker: multi-tenant works with the PostgreSQL tenant DB
- ✅ Windows IIS: multi-tenant works with SQL Server Express
- ✅ Integration tests pass (100% success rate)
- ✅ Load tests: 3 tenants × 100 requests, < 200ms avg response time
- ✅ SSH tunnels: No crashes during 1h load test
- ✅ Cache isolation validated: Tenant A cache ≠ Tenant B cache
- ✅ Audit logs populated correctly for all requests
- ✅ Documentation complete (onboarding guide, troubleshooting)
🔧 Connection Management
SSH Tunnel Configuration
Tenant with SSH Tunnel (Remote Client)
{
"id": "client-a-uuid",
"name": "Client A - Retail SRL",
"connection_type": "ssh_tunnel",
"oracle_host": "10.0.20.36",
"oracle_port": 1521,
"oracle_sid": "ROA",
"oracle_user": "CLIENT_A_USER",
"oracle_password_encrypted": "gAAAAABh...",
"ssh_host": "83.103.197.79",
"ssh_port": 22122,
"ssh_user": "roa2web",
"ssh_key_path": "/app/ssh-keys/client-a.key",
"ssh_tunnel_local_port": 15261,
"min_connections": 2,
"max_connections": 10,
"is_active": true
}
SSH Tunnel Flow:
Backend Process
↓
SSHTunnelManager.start_tunnel()
↓
subprocess: ssh -f -N -L 15261:10.0.20.36:1521 -p 22122 roa2web@83.103.197.79
↓
Tunnel established: localhost:15261 → 10.0.20.36:1521
↓
OraclePool connects to localhost:15261
↓
Oracle queries routed through the SSH tunnel
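The step from tenant config to the ssh command in the flow above could look like the sketch below. build_ssh_tunnel_command is a hypothetical helper, not the actual SSHTunnelManager code. Two deliberate differences from the command shown in the flow are assumptions of this sketch: it passes the key with -i (the flow omits it although the config carries ssh_key_path), and it drops -f so that a supervising process keeps a handle on the child instead of letting ssh background itself.

```python
from typing import Any


def build_ssh_tunnel_command(tenant: dict[str, Any]) -> list[str]:
    """Build the argv for a local port forward described by a tenant config."""
    if tenant["connection_type"] != "ssh_tunnel":
        raise ValueError(f"tenant {tenant['id']} does not use an SSH tunnel")
    forward = (
        f"{tenant['ssh_tunnel_local_port']}:"
        f"{tenant['oracle_host']}:{tenant['oracle_port']}"
    )
    return [
        "ssh",
        "-N",                                  # no remote command, forwarding only
        "-L", forward,                         # local_port:oracle_host:oracle_port
        "-p", str(tenant["ssh_port"]),
        "-i", tenant["ssh_key_path"],
        "-o", "ExitOnForwardFailure=yes",      # fail fast if the port cannot be bound
        f"{tenant['ssh_user']}@{tenant['ssh_host']}",
    ]


client_a = {
    "id": "client-a-uuid",
    "connection_type": "ssh_tunnel",
    "oracle_host": "10.0.20.36",
    "oracle_port": 1521,
    "ssh_host": "83.103.197.79",
    "ssh_port": 22122,
    "ssh_user": "roa2web",
    "ssh_key_path": "/app/ssh-keys/client-a.key",
    "ssh_tunnel_local_port": 15261,
}
command = build_ssh_tunnel_command(client_a)
```

Returning an argv list rather than a shell string lets the caller hand it to subprocess without shell=True, so values from the tenant DB are never interpreted by a shell.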
Direct Connection Configuration
Tenant with Direct Connection (LAN Client)
{
"id": "client-b-uuid",
"name": "Client B - Import Export SA",
"connection_type": "direct",
"oracle_host": "192.168.1.50",
"oracle_port": 1521,
"oracle_sid": "ROA",
"oracle_user": "CLIENT_B_USER",
"oracle_password_encrypted": "gAAAAABh...",
"ssh_host": null,
"ssh_port": null,
"ssh_user": null,
"ssh_key_path": null,
"ssh_tunnel_local_port": null,
"min_connections": 5,
"max_connections": 20,
"is_active": true
}
Direct Connection Flow:
Backend Process
↓
MultiTenantPoolManager.get_connection(tenant_id)
↓
Check connection_type: "direct" → Skip SSH tunnel
↓
OraclePool.create_pool(host=192.168.1.50, port=1521, ...)
↓
Oracle queries go directly to 192.168.1.50:1521
Mixed Environment Setup
3 Tenants: 2 SSH, 1 Direct
| Tenant ID | Name | Type | Oracle Host | SSH Tunnel | Local Port |
|---|---|---|---|---|---|
| client-a-uuid | Client A - Retail SRL | ssh_tunnel | 10.0.20.36:1521 | 83.103.197.79:22122 | 15261 |
| client-b-uuid | Client B - Import SA | direct | 192.168.1.50:1521 | N/A | N/A |
| client-c-uuid | Client C - Distribution | ssh_tunnel | 10.0.20.36:1521 | 212.18.45.99:22 | 15262 |
Resource Usage:
Backend Memory:
├── Pool Client A: 2-10 connections × ~5MB = 10-50MB
├── Pool Client B: 5-20 connections × ~5MB = 25-100MB
├── Pool Client C: 2-10 connections × ~5MB = 10-50MB
└── Total: ~50-200MB (vs single-tenant ~10-50MB)
SSH Processes:
├── Tunnel Client A: ~10MB RAM
├── Tunnel Client C: ~10MB RAM
└── Total: ~20MB
Total Overhead: ~70-220MB (acceptable for multi-tenant SaaS)
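The arithmetic behind these estimates can be reproduced with a few lines. The per-connection (~5MB) and per-tunnel (~10MB) figures are the ones stated above; estimate_overhead_mb is a hypothetical helper. The exact sums come out at 45-200MB for the pools and 65-220MB overall, so the lower bounds quoted above are rounded up slightly.

```python
PER_CONNECTION_MB = 5   # rough figure from the estimate above
PER_TUNNEL_MB = 10      # rough figure from the estimate above

tenants = [
    {"name": "Client A", "min": 2, "max": 10, "ssh": True},
    {"name": "Client B", "min": 5, "max": 20, "ssh": False},
    {"name": "Client C", "min": 2, "max": 10, "ssh": True},
]


def estimate_overhead_mb(tenants: list[dict]) -> dict[str, int]:
    """Sum pool memory (min/max) and SSH tunnel memory across tenants."""
    pool_min = sum(t["min"] for t in tenants) * PER_CONNECTION_MB
    pool_max = sum(t["max"] for t in tenants) * PER_CONNECTION_MB
    tunnels = sum(1 for t in tenants if t["ssh"]) * PER_TUNNEL_MB
    return {
        "pool_min": pool_min,
        "pool_max": pool_max,
        "tunnels": tunnels,
        "total_min": pool_min + tunnels,
        "total_max": pool_max + tunnels,
    }


estimate = estimate_overhead_mb(tenants)
```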
🔒 Security Model
Encryption Strategy
Password Encryption in the Tenant DB
from cryptography.fernet import Fernet
# Generate encryption key (store in .env)
encryption_key = Fernet.generate_key()  # Example: b'Xs3J7...'
# Encrypt password (encrypt() returns bytes; decode to str for storage)
fernet = Fernet(encryption_key)
encrypted_password = fernet.encrypt(b"oracle_password_plaintext").decode()
# Result: "gAAAAABh3J..."
# Decrypt password (at runtime)
decrypted_password = fernet.decrypt(encrypted_password.encode()).decode()
Security Properties:
- ✅ Symmetric encryption (Fernet: AES-128-CBC + HMAC-SHA256)
- ✅ Encryption key in an environment variable (DB_ENCRYPTION_KEY)
- ✅ Passwords encrypted at rest in the tenant DB
- ✅ Decryption only at pool initialization (memory only)
- ❌ NOT: passwords in logs, error messages, audit trails
Tenant Isolation
Complete Isolation Between Tenants
┌─────────────────────────────────────────────────────────┐
│ Tenant A │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Connection Pool (2-10 connections) │ │
│ │ - oracle_host: 10.0.20.36 (via SSH tunnel) │ │
│ │ - oracle_user: CLIENT_A_USER │ │
│ │ - Schema: CLIENT_A_SCHEMA │ │
│ └──────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Redis Cache Namespace │ │
│ │ - cache:client-a-uuid:* │ │
│ └──────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Audit Logs │ │
│ │ - audit_logs WHERE tenant_id='client-a-uuid' │ │
│ └──────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
❌ ZERO SHARING ❌
┌─────────────────────────────────────────────────────────┐
│ Tenant B │
│ (Same structure, COMPLETELY ISOLATED) │
└─────────────────────────────────────────────────────────┘
Isolation Guarantees:
- Connection Pool: tenant A connections are NEVER used for tenant B queries
- Cache: Redis keys namespaced per tenant (cache:{tenant_id}:*)
- Audit Logs: queries filter on WHERE tenant_id = $1 (indexed for performance)
- SSH Tunnels: separate processes, separate local ports (no crosstalk)
JWT Token Structure
Token with Tenant ID (Signed)
{
"username": "john.doe",
"user_id": 123,
"tenant_id": "client-a-uuid",
"companies": ["COMP1", "COMP2"],
"permissions": ["read", "reports"],
"exp": 1735142400,
"iat": 1735140600,
"type": "access"
}
Security Checks in TenantMiddleware:
# 1. Extract tenant_id from JWT (decoded by AuthMiddleware)
tenant_id = request.state.user.get('tenant_id')
# 2. Validate tenant exists and is active
tenant_config = await tenant_config_loader.get_tenant(tenant_id)
if not tenant_config or not tenant_config.is_active:
    raise HTTPException(403, "Tenant not active")
# 3. Validate user has access to this tenant
user_id = request.state.user.get('user_id')
has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
if not has_access:
    raise HTTPException(403, "User does not have access to this tenant")
# 4. Inject tenant_id into request state (treated as read-only downstream)
request.state.tenant_id = tenant_id  # Routers use this
Attack Scenarios Prevented:
- ❌ Tenant ID Tampering: the JWT is signed, so a client cannot change tenant_id without invalidating the signature
- ❌ Cross-Tenant Access: a user with access to tenant A cannot access tenant B (check in step 3)
- ❌ Inactive Tenant Access: tenant deactivated → requests rejected (check in step 2)
- ❌ SQL Injection via Tenant ID: UUID validated and used only in parameterized queries
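Why a signed token blocks tenant ID tampering can be shown with a standard-library-only illustration of HS256 signing. This is a teaching sketch, not the application's auth code: sign_token, verify_token and tamper_tenant_id are hypothetical helpers, the secret is a placeholder, and expiry (exp) is deliberately not checked here. A real deployment would rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Create header.payload.signature with HMAC-SHA256 (HS256)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    body = _b64url(json.dumps(payload, separators=(",", ":")).encode())
    signature = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes) -> dict:
    """Return the payload only if the signature matches; otherwise raise."""
    header, body, signature = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(body))


def tamper_tenant_id(token: str, new_tenant_id: str) -> str:
    """Simulate a client rewriting tenant_id while keeping the old signature."""
    header, body, signature = token.split(".")
    payload = json.loads(_b64url_decode(body))
    payload["tenant_id"] = new_tenant_id
    forged_body = _b64url(json.dumps(payload, separators=(",", ":")).encode())
    return f"{header}.{forged_body}.{signature}"


secret = b"placeholder-secret-for-illustration"
token = sign_token({"username": "john.doe", "user_id": 123, "tenant_id": "client-a-uuid"}, secret)
forged = tamper_tenant_id(token, "client-b-uuid")
```

verify_token(token, secret) returns the original payload, while verify_token(forged, secret) raises ValueError, because the signature was computed over the original payload bytes.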
🧪 Testing Strategy
Unit Tests
Test Coverage per Component
shared/tests/
├── test_tenant_config.py # TenantConfigLoader
│ ├── test_load_tenants() # Load all tenants from DB
│ ├── test_get_tenant() # Get specific tenant
│ ├── test_reload_tenant() # Reload tenant config
│ ├── test_encryption_decryption() # Password encryption/decryption
│ └── test_default_tenant_fallback() # Fallback to .env credentials
│
├── test_multi_tenant_pool.py # MultiTenantPoolManager
│ ├── test_lazy_pool_initialization() # Pool created only on first request
│ ├── test_pool_per_tenant() # Separate pools per tenant
│ ├── test_pool_cleanup_inactive() # Cleanup after 1h of inactivity
│ ├── test_tenant_reload() # Reload tenant without restart
│ └── test_connection_context_manager() # get_connection() pattern
│
├── test_ssh_tunnel_manager.py # SSHTunnelManager
│ ├── test_start_tunnel() # Start SSH tunnel subprocess
│ ├── test_stop_tunnel() # Stop SSH tunnel gracefully
│ ├── test_tunnel_health_check() # Detect dead tunnels
│ ├── test_auto_restart() # Restart with exponential backoff
│ └── test_cleanup_all_tunnels() # Kill all processes on shutdown
│
├── test_tenant_middleware.py # TenantMiddleware
│ ├── test_extract_tenant_id() # Extract tenant_id from JWT
│ ├── test_validate_tenant_access() # User access validation
│ ├── test_inactive_tenant_blocked() # Inactive tenant → 403
│ ├── test_cross_tenant_access_blocked() # User A tenant → User B tenant → 403
│ └── test_audit_logging() # Audit logs saved correctly
│
└── test_encryption.py # Encryption utils
├── test_fernet_encryption() # Encrypt/decrypt passwords
└── test_key_rotation() # Future: Key rotation support
Run Unit Tests:
cd shared/
pytest tests/ -v --cov=database --cov=middleware --cov=utils --cov-report=html
# Expected output:
# ✅ test_tenant_config.py::test_load_tenants PASSED
# ✅ test_multi_tenant_pool.py::test_lazy_pool_initialization PASSED
# ...
# Coverage: 85% (target: > 80%)
Integration Tests
End-to-End Scenarios
shared/tests/integration/
├── test_multi_tenant_flow.py # Complete multi-tenant flow
│ ├── test_login_with_tenant_a() # Login → JWT with tenant A
│ ├── test_dashboard_tenant_a() # Dashboard query tenant A
│ ├── test_cache_hit_tenant_a() # Cache hit for tenant A
│ ├── test_cross_tenant_isolation() # Tenant A cache ≠ Tenant B cache
│ └── test_audit_logs_populated() # Audit logs saved per tenant
│
├── test_ssh_tunnel_resilience.py # SSH tunnel stability
│ ├── test_tunnel_auto_recovery() # Kill tunnel → Auto-restart
│ ├── test_multiple_tunnels_parallel() # 3 SSH tenants simultaneously
│ └── test_tunnel_port_conflicts() # Port allocation unique
│
└── test_deployment_scenarios.py # Deployment compatibility
├── test_development_sqlite() # Development with SQLite tenant DB
├── test_docker_postgresql() # Docker with PostgreSQL tenant DB
└── test_backward_compatibility() # Tenant "default" works
Run Integration Tests:
# Requires: PostgreSQL tenant DB running + Redis + Oracle test server
docker-compose -f docker-compose.test.yml up -d
pytest shared/tests/integration/ -v --tb=short
# Expected output:
# ✅ test_multi_tenant_flow.py::test_login_with_tenant_a PASSED (0.5s)
# ✅ test_multi_tenant_flow.py::test_cache_hit_tenant_a PASSED (0.2s)
# ...
Load Testing
Performance Validation with Multiple Tenants
# shared/tests/load/test_multi_tenant_load.py
import random

from locust import HttpUser, task, between


class MultiTenantUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        # Log in to a random tenant
        self.tenant = random.choice(['client-a-uuid', 'client-b-uuid', 'client-c-uuid'])
        response = self.client.post('/api/auth/login', json={
            'username': f'user_{self.tenant}',
            'password': 'test_password'
        })
        self.token = response.json()['access_token']
        self.client.headers.update({'Authorization': f'Bearer {self.token}'})

    @task(3)
    def get_dashboard(self):
        self.client.get('/api/dashboard/COMP1')

    @task(2)
    def get_invoices(self):
        self.client.get('/api/invoices/COMP1')

    @task(1)
    def get_treasury(self):
        self.client.get('/api/treasury/COMP1')
Run Load Test:
locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001
# Scenario: 3 tenants × 100 users = 300 concurrent users
# Duration: 10 minutes
# Expected:
# - Response time: < 200ms (p95)
# - Error rate: < 1%
# - SSH tunnels: No restarts
# - Pool connections: Max 10 per tenant (no exhaustion)
📊 Migration Checklist
Pre-Migration
- Backup production database
      # Backup Oracle database
      expdp username/password@ROA directory=BACKUP dumpfile=pre_migration.dmp
      # Backup existing .env files
      cp backend/.env backend/.env.backup
- Document current single-tenant config
      # Save current credentials (keep this file out of version control)
      cat backend/.env > docs/pre_migration_env.txt
      # Save current SSH tunnel config
      ./ssh-tunnel-prod.sh status > docs/pre_migration_ssh.txt
- Test the deployment in a non-production environment
      # Create staging environment
      docker-compose -f docker-compose.staging.yml up -d
      # Deploy multi-tenant to staging
      # ... follow migration steps ...
      # Validate staging works
      curl http://staging.roa2web.local/api/health
- Generate DB encryption key
      python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
      # Save in .env: DB_ENCRYPTION_KEY=<generated_key>
- Prepare tenant configuration
  - Create tenant DB (PostgreSQL/SQLite)
  - Populate it with tenant "default" (existing credentials)
  - Add SSH keys for remote tenants
Migration Steps (Production)
Step 1: Deploy Tenant Config DB (30 min)
# Docker deployment
docker-compose up -d roa-tenant-config-db
# Verify DB is running
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c '\dt'
# Run schema
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -f /docker-entrypoint-initdb.d/schema.sql
Step 2: Populate Tenant "default" (15 min)
# Run migration script
docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
# Verify tenant created
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, connection_type FROM tenants;'
Step 3: Deploy Backend with MultiTenantPoolManager (45 min)
# Update .env with tenant DB URL
echo "TENANT_DB_URL=postgresql://tenant_admin:password@roa-tenant-config-db:5432/tenant_config" >> .env
# Rebuild backend image
docker-compose build roa-backend
# Deploy new backend (rolling update)
docker-compose up -d roa-backend
# Wait for health check
watch -n 2 'curl -s http://localhost:8001/health | jq'
Step 4: Verify Tenant "default" Works (15 min)
# Test login (should work exactly as before)
curl -X POST http://localhost:8001/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username": "test_user", "password": "test_password"}'
# Response should include tenant_id: "default"
# {
# "access_token": "eyJ...",
# "user": {
# "tenant_id": "default",
# ...
# }
# }
# Test dashboard (should work as before)
curl -H "Authorization: Bearer $TOKEN" http://localhost:8001/api/dashboard/COMP1
Step 5: Add New Tenants (One by One)
# Add tenant A (SSH tunnel)
docker-compose exec roa-backend python shared/scripts/add_tenant.py \
--name "Client A - Retail SRL" \
--connection-type ssh_tunnel \
--oracle-host 10.0.20.36 \
--oracle-user CLIENT_A_USER \
--oracle-password "encrypted_password" \
--ssh-host 83.103.197.79 \
--ssh-port 22122 \
--ssh-key /app/ssh-keys/client-a.key \
--ssh-local-port 15261
# Add users to tenant A
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
"INSERT INTO tenant_users (tenant_id, user_id, username) VALUES ('client-a-uuid', 123, 'john.doe');"
# Test tenant A login
curl -X POST http://localhost:8001/api/auth/login \
-H 'Content-Type: application/json' \
-d '{"username": "john.doe", "password": "password"}'
# Verify JWT includes tenant_id: "client-a-uuid"
Step 6: Monitor Logs per Tenant (Ongoing)
# Monitor all tenant logs
docker-compose logs -f roa-backend | grep "tenant_id"
# Monitor SSH tunnels
docker-compose logs -f roa-backend | grep "SSH tunnel"
# Monitor pool connections
docker-compose logs -f roa-backend | grep "pool"
# Check audit logs
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
'SELECT tenant_id, username, action, status, created_at FROM audit_logs ORDER BY created_at DESC LIMIT 20;'
Step 7: Performance Validation (1-2h)
# Run load test
locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001 --headless --users=100 --spawn-rate=10 --run-time=1h
# Monitor metrics
# - Response time: < 200ms (p95)
# - Error rate: < 1%
# - Pool usage: < 80% per tenant
# - SSH tunnels: No restarts
Post-Migration
- All tenants functional
  - Tenant "default" works (backward compatibility)
  - Tenant A works (SSH tunnel)
  - Tenant B works (direct connection)
- No performance degradation
  - Response time same as single-tenant (< 10% overhead)
  - No connection pool exhaustion
  - SSH tunnels stable (no auto-restarts)
- Audit logs populated
      -- Verify audit logs per tenant
      SELECT tenant_id, COUNT(*) FROM audit_logs GROUP BY tenant_id;
- Documentation updated
  - Update CLAUDE.md with the multi-tenant architecture
  - Update deployment guides (Docker, Windows)
  - Create tenant onboarding guide
- Monitoring dashboards
  - Grafana dashboard per tenant
  - Alerts for pool exhaustion, SSH tunnel failures
🎯 Deployment Guides
Development Setup (WSL/Local)
Prerequisites:
- Python 3.11+
- SQLite3
- Redis server
- SSH access to the Oracle server (for tenants with an SSH tunnel)
Setup Steps:
# 1. Create SQLite tenant DB
mkdir -p data
sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql
# 2. Generate encryption key
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" > .encryption_key
DB_ENCRYPTION_KEY=$(cat .encryption_key)
# 3. Update .env
cat >> backend/.env << EOF
# Tenant Configuration
TENANT_DB_URL=sqlite:///data/tenant_config.db
DB_ENCRYPTION_KEY=$DB_ENCRYPTION_KEY
EOF
# 4. Create default tenant
cd shared/
python scripts/create_default_tenant.py
# 5. Start Redis
redis-server --daemonize yes
# 6. Start application
cd ../
./start-dev.sh
# 7. Verify
curl http://localhost:8001/health
# Should return: {"database": "connected", "tenants_loaded": 1}
Add New Tenant (Development):
# Add tenant via SQL
sqlite3 data/tenant_config.db << EOF
INSERT INTO tenants (
id, name, connection_type,
oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port
) VALUES (
'dev-tenant-uuid',
'Dev Tenant - Test Company',
'ssh_tunnel',
'10.0.20.36',
1521,
'ROA',
'DEV_USER',
'encrypted_password_here',
'83.103.197.79',
22122,
'roa2web',
'/tmp/roa_oracle_server',
15263
);
-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('dev-tenant-uuid', 999, 'dev_user');
EOF
# Restart backend
pkill -f "uvicorn app.main:app"
./start-dev.sh
Docker Deployment (Proxmox LXC)
Prerequisites:
- Docker 24+
- Docker Compose 2.20+
- 4GB RAM minimum
- PostgreSQL 15 container
docker-compose.multi-tenant.yml:
version: '3.8'
services:
  # Tenant Configuration Database
  roa-tenant-config-db:
    image: postgres:15-alpine
    container_name: roa-tenant-config-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: tenant_config
      POSTGRES_USER: tenant_admin
      POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
    volumes:
      - tenant-config-data:/var/lib/postgresql/data
      - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
    networks:
      - roa-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U tenant_admin -d tenant_config"]
      interval: 10s
      timeout: 5s
      retries: 5
  # Backend (Multi-Tenant)
  roa-backend:
    build:
      context: .
      dockerfile: ./backend/Dockerfile
    image: roa2web/backend:multi-tenant
    container_name: roa-backend
    restart: unless-stopped
    environment:
      # Tenant Configuration
      - TENANT_DB_URL=postgresql://tenant_admin:${TENANT_DB_PASSWORD}@roa-tenant-config-db:5432/tenant_config
      - DB_ENCRYPTION_KEY=${DB_ENCRYPTION_KEY}
      # JWT Configuration
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}
      # Redis Cache
      - REDIS_URL=redis://:${REDIS_PASSWORD}@roa-redis:6379/0
    volumes:
      # SSH keys for tenant tunnels (read-only)
      - ./ssh-keys:/app/ssh-keys:ro
      - backend-logs:/app/logs
    networks:
      - roa-network
    depends_on:
      roa-tenant-config-db:
        condition: service_healthy
      roa-redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  # Redis Cache
  roa-redis:
    image: redis:7-alpine
    container_name: roa-redis
    restart: unless-stopped
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    networks:
      - roa-network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
  # Frontend (unchanged)
  roa-frontend:
    build:
      context: ./src
      dockerfile: Dockerfile
    image: roa2web/frontend:latest
    container_name: roa-frontend
    restart: unless-stopped
    networks:
      - roa-network
  # Nginx Gateway (unchanged)
  roa-gateway:
    build:
      context: ./nginx
      dockerfile: Dockerfile
    image: roa2web/nginx-gateway:latest
    container_name: roa-gateway
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    networks:
      - roa-network
    depends_on:
      - roa-backend
      - roa-frontend
volumes:
  tenant-config-data:
  redis-data:
  backend-logs:
networks:
  roa-network:
    driver: bridge
Deployment:
# 1. Create .env file
cat > .env << EOF
TENANT_DB_PASSWORD=$(openssl rand -base64 32)
DB_ENCRYPTION_KEY=$(python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
JWT_SECRET_KEY=$(openssl rand -base64 64)
REDIS_PASSWORD=$(openssl rand -base64 32)
EOF
# 2. Prepare SSH keys directory
mkdir -p ssh-keys
chmod 700 ssh-keys
cp /path/to/client-a.key ssh-keys/client-a.key
chmod 400 ssh-keys/client-a.key
# 3. Build and start services
docker-compose -f docker-compose.multi-tenant.yml build
docker-compose -f docker-compose.multi-tenant.yml up -d
# 4. Wait for tenant DB initialization (grep -m1 exits on the first match)
docker-compose -f docker-compose.multi-tenant.yml logs -f roa-tenant-config-db | grep -m1 "database system is ready"
# 5. Create default tenant
docker-compose -f docker-compose.multi-tenant.yml exec roa-backend python shared/scripts/create_default_tenant.py
# 6. Verify deployment
curl http://localhost/api/health
# {"api": "healthy", "database": "connected", "tenants_loaded": 1}
Add New Tenant:
# Connect to tenant DB
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config
# Insert tenant (with encrypted password)
INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port, is_active)
VALUES (
'client-a-uuid',
'Client A - Retail SRL',
'ssh_tunnel',
'10.0.20.36',
1521,
'ROA',
'CLIENT_A_USER',
'gAAAAABh...', -- Fernet encrypted password
'83.103.197.79',
22122,
'roa2web',
'/app/ssh-keys/client-a.key',
15261,
TRUE
);
-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('client-a-uuid', 123, 'john.doe');
\q
# Reload backend (or wait for auto-reload)
docker-compose restart roa-backend
Windows IIS Deployment
Prerequisites:
- Windows Server 2019+
- IIS 10+
- SQL Server Express 2019+ OR PostgreSQL 15 for Windows
- Python 3.11+ (Windows installer)
- Redis for Windows (MSI installer)
Setup Script: deployment/windows/scripts/Setup-MultiTenant.ps1
# Run as Administrator
.\deployment\windows\scripts\Setup-MultiTenant.ps1
<#
This script will:
1. Install SQL Server Express 2019
2. Create tenant_config database
3. Run schema SQL
4. Generate encryption key (save in Windows Credential Manager)
5. Create default tenant
6. Update ROA2WEB backend service
7. Restart IIS
#>
Manual Setup:
# 1. Install SQL Server Express
# Download from: https://www.microsoft.com/en-us/sql-server/sql-server-downloads
# Install with default instance name: SQLEXPRESS
# 2. Create tenant database
sqlcmd -S localhost\SQLEXPRESS -E -Q "CREATE DATABASE tenant_config"
# 3. Run schema
sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E -i shared\schemas\tenant_config_schema.sql
# 4. Generate encryption key
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" | Out-File -FilePath .encryption_key -NoNewline
# 5. Store key in Windows Credential Manager
cmdkey /generic:ROA2WEB_DB_ENCRYPTION_KEY /user:system /pass:$(Get-Content .encryption_key)
# 6. Update backend .env
@"
TENANT_DB_URL=mssql+pyodbc://localhost\SQLEXPRESS/tenant_config?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes
DB_ENCRYPTION_KEY=$(Get-Content .encryption_key)
"@ | Add-Content -Path C:\inetpub\wwwroot\roa2web\backend\.env
# 7. Create default tenant
cd C:\inetpub\wwwroot\roa2web
python shared\scripts\create_default_tenant.py
# 8. Restart backend service
Restart-Service ROA2WEB-Backend
# 9. Verify
curl http://localhost:8000/health
Add New Tenant (Windows):
# Connect to SQL Server
sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E
-- Insert tenant
INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, is_active)
VALUES (
'client-b-uuid',
'Client B - Import Export SA',
'direct',
'192.168.1.50',
1521,
'ROA',
'CLIENT_B_USER',
'gAAAAABh...', -- Encrypted password
1
);
-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('client-b-uuid', 456, 'jane.smith');
GO
EXIT
# Restart backend
Restart-Service ROA2WEB-Backend
📝 Configuration Examples
Tenant Config: SSH Tunnel (Development)
{
"id": "dev-client-uuid",
"name": "Development Client - Test Company",
"connection_type": "ssh_tunnel",
"oracle_host": "10.0.20.36",
"oracle_port": 1521,
"oracle_sid": "ROA",
"oracle_user": "DEV_USER",
"oracle_password_encrypted": "gAAAAABhXj7Ks3J...",
"ssh_host": "83.103.197.79",
"ssh_port": 22122,
"ssh_user": "roa2web",
"ssh_key_path": "/tmp/roa_oracle_server",
"ssh_tunnel_local_port": 15260,
"min_connections": 2,
"max_connections": 5,
"is_active": true
}
Tenant Config: Direct Connection (Production)
{
"id": "prod-client-uuid",
"name": "Production Client - Enterprise Corp",
"connection_type": "direct",
"oracle_host": "192.168.100.50",
"oracle_port": 1521,
"oracle_sid": "ROA",
"oracle_user": "PROD_USER",
"oracle_password_encrypted": "gAAAAABhXj8Nm4K...",
"ssh_host": null,
"ssh_port": null,
"ssh_user": null,
"ssh_key_path": null,
"ssh_tunnel_local_port": null,
"min_connections": 5,
"max_connections": 20,
"is_active": true
}
Tenant Config: Docker Deployment (PostgreSQL Tenant DB)
.env for Docker Compose:
# Tenant Configuration Database
TENANT_DB_PASSWORD=SecurePostgresPassword123!
DB_ENCRYPTION_KEY=<44-character key printed by Fernet.generate_key()>
# Backend
JWT_SECRET_KEY=YourVerySecureJWTSecretKeyHere123456789
# Redis
REDIS_PASSWORD=SecureRedisPassword456!
User-Tenant Mapping Example
-- User john.doe has access to 2 tenants
INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES
('client-a-uuid', 123, 'john.doe', TRUE),
('client-b-uuid', 123, 'john.doe', FALSE);
-- User jane.smith has access to 1 tenant
INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES
('client-b-uuid', 456, 'jane.smith', FALSE);
-- Query: Get all tenants for user
SELECT t.id, t.name, tu.is_admin
FROM tenants t
JOIN tenant_users tu ON t.id = tu.tenant_id
WHERE tu.user_id = 123 AND t.is_active = TRUE;
-- Result:
-- | id | name | is_admin |
-- |----------------|-------------------------------|----------|
-- | client-a-uuid | Client A - Retail SRL | TRUE |
-- | client-b-uuid | Client B - Import Export SA | FALSE |
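The mapping query above can be tried end to end against an in-memory SQLite database. The table definitions below are a simplified assumption containing only the columns the query touches; the real schema lives in shared/schemas/tenant_config_schema.sql and is not reproduced here. tenants_for_user and user_has_access are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified schema (assumption): only the columns used by the query
    CREATE TABLE tenants (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE tenant_users (
        tenant_id TEXT NOT NULL REFERENCES tenants(id),
        user_id INTEGER NOT NULL,
        username TEXT NOT NULL,
        is_admin INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (tenant_id, user_id)
    );
    INSERT INTO tenants (id, name) VALUES
        ('client-a-uuid', 'Client A - Retail SRL'),
        ('client-b-uuid', 'Client B - Import Export SA');
    INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES
        ('client-a-uuid', 123, 'john.doe', 1),
        ('client-b-uuid', 123, 'john.doe', 0),
        ('client-b-uuid', 456, 'jane.smith', 0);
    """
)


def tenants_for_user(conn: sqlite3.Connection, user_id: int) -> list[tuple[str, str, bool]]:
    """All active tenants a user belongs to, with the admin flag."""
    rows = conn.execute(
        """
        SELECT t.id, t.name, tu.is_admin
        FROM tenants t
        JOIN tenant_users tu ON t.id = tu.tenant_id
        WHERE tu.user_id = ? AND t.is_active = 1
        ORDER BY t.id
        """,
        (user_id,),
    ).fetchall()
    return [(tenant_id, name, bool(is_admin)) for tenant_id, name, is_admin in rows]


def user_has_access(conn: sqlite3.Connection, user_id: int, tenant_id: str) -> bool:
    """The membership check a tenant middleware would need before serving a request."""
    return any(tid == tenant_id for tid, _, _ in tenants_for_user(conn, user_id))


john_tenants = tenants_for_user(conn, 123)
```

For user 123 this yields both tenants, matching the result table above, while user 456 only has access to client-b-uuid.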
🎯 Success Criteria
Definition of Done
Functional:
- ✅ The application supports at least 3 simultaneous tenants
- ✅ Tenant identification from the JWT works correctly
- ✅ SSH tunnels start/stop automatically per tenant
- ✅ Connection pools isolated per tenant (zero sharing)
- ✅ Cache isolation between tenants (namespace per tenant)
- ✅ No cross-tenant data leakage in audit logs or cache
Deployment:
- ✅ Works in all deployment scenarios (dev/WSL, Docker, Windows IIS)
- ✅ Backward compatibility: tenant "default" works exactly like single-tenant
- ✅ Zero downtime for existing tenants when a new tenant is added (lazy loading)
- ✅ Migration script completes in < 2h (staging environment)
Performance:
- ✅ Overhead < 10% vs single-tenant (measured in load testing)
- ✅ Response time < 200ms (p95) with 3 tenants × 100 requests
- ✅ No connection pool exhaustion (max 80% usage per tenant)
- ✅ SSH tunnels stable (zero auto-restarts in a 1h load test)
Security:
- ✅ Passwords encrypted at rest in the tenant DB (Fernet AES-128)
- ✅ SSH keys mounted read-only in Docker volumes
- ✅ JWT tenant_id signed (cannot be modified by the client)
- ✅ Tenant access validation in middleware (403 for unauthorized)
- ✅ Audit logging of ALL operations per tenant
Testing:
- ✅ Unit tests: > 80% code coverage
- ✅ Integration tests: All scenarios pass (login, dashboard, cross-tenant isolation)
- ✅ Load tests: 3 tenants × 100 users, 10 minutes, < 1% error rate
- ✅ Manual testing: Tenant onboarding guide validated
Documentation:
- ✅ CLAUDE.md updated with the multi-tenant architecture
- ✅ Deployment guides (dev, Docker, Windows) complete
- ✅ Tenant onboarding guide created
- ✅ Troubleshooting guide created
- ✅ API documentation updated (Swagger/ReDoc)
⚠️ Risks & Mitigations
Risk: SSH Tunnel Instability
Scenario: The SSH tunnel process crashes, or the network between the backend and the SSH server is interrupted.
Impact: The affected tenant cannot reach its Oracle DB (requests fail with a connection error).
Mitigation:
- Health Checks: A background task checks tunnel health every 60s
- Auto-Restart: Restart the tunnel automatically with exponential backoff (5s, 10s, 20s, max 60s)
- Monitoring: Alert if a tunnel is down for > 5 minutes
- Fallback: Graceful degradation - other tenants keep working normally
Detection:
async def monitor_ssh_tunnels():
    # Iterate over a snapshot so a restart cannot mutate the dict mid-loop
    for tenant_id in list(ssh_tunnel_manager.tunnels):
        if not await ssh_tunnel_manager.check_tunnel_health(tenant_id):
            logger.error(f"Tunnel down for tenant {tenant_id}, restarting...")
            await ssh_tunnel_manager.restart_tunnel(tenant_id)
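The restart schedule named in the mitigation list (5s, 10s, 20s, capped at 60s) could be produced by a small delay generator. This is only a sketch: `backoff_delays` and its parameters are hypothetical names, not part of the existing SSH tunnel manager, and the 40s step is implied by doubling rather than stated in the plan.

```python
import itertools
from typing import Iterator


def backoff_delays(base: float = 5.0, cap: float = 60.0) -> Iterator[float]:
    """Yield restart delays that double on each attempt and never exceed `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)


# First five delays for the default schedule.
first_five = list(itertools.islice(backoff_delays(), 5))
print(first_five)  # [5.0, 10.0, 20.0, 40.0, 60.0]
```

A restart loop would sleep for the next delay after each failed attempt and create a fresh generator once the tunnel is healthy again.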
Risk: Connection Pool Exhaustion
Scenario: A tenant sends a burst of requests, its pool reaches max connections (e.g. 10), and new requests block or time out.
Impact: Slow response times or 503 Service Unavailable for that tenant.
Mitigation:
- Pool Limits: Set realistic limits per tenant (min=2, max=10 default, configurable)
- Queue Timeout: getmode=POOL_GETMODE_TIMEDWAIT with wait_timeout (e.g. 30s); plain POOL_GETMODE_WAIT waits with no timeout
- Rate Limiting: Limit requests per user/tenant (e.g. 100 req/min)
- Monitoring: Alert if pool usage is > 80% for > 5 minutes
- Scaling: Increase max_connections for high-traffic tenants
Configuration:
-- In the tenant config DB (SQL)
UPDATE tenants SET max_connections = 20 WHERE id = 'high-traffic-tenant-uuid';
# Then reload the tenant's pool (Python)
await multi_tenant_pool.reload_tenant('high-traffic-tenant-uuid')
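The per-tenant rate limit mentioned above (e.g. 100 req/min) could be enforced with a sliding-window counter. The sketch below is an in-memory, single-process illustration; `TenantRateLimiter` and its parameters are hypothetical names, and a multi-replica deployment would need shared state (for example in Redis) instead.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class TenantRateLimiter:
    """Sliding window: at most `limit` requests per `window` seconds per tenant."""

    def __init__(self, limit: int = 100, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, tenant_id: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow(tenant_id)` before acquiring a connection and respond with 429 Too Many Requests when it returns False, so a burst from one tenant is rejected before it can exhaust that tenant's pool.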
Risk: Tenant Credential Leak
Scenario: An attacker gains access to the tenant DB or to logs and sees Oracle passwords.
Impact: Data breach - the attacker can access the Oracle DB directly.
Mitigation:
- Encryption at Rest: Passwords encrypted with Fernet in the tenant DB
- Encryption Key Security: DB_ENCRYPTION_KEY kept in environment variables (not in git)
- Access Control: Tenant DB access restricted (firewall, VPN)
- No Plaintext Logs: NEVER log decrypted passwords (check in code reviews)
- Audit Logging: Log all access to tenant config (who/when)
- Key Rotation: Support key rotation (encrypt with the new key, decrypt with the old key)
Validation:
# Check logs for password leaks
docker-compose logs roa-backend | grep -i "password" | grep -v "encrypted"
# Should return ZERO results
# Check tenant DB
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, oracle_password_encrypted FROM tenants LIMIT 5;'
# oracle_password_encrypted should start with "gAAAAA..." (Fernet token)
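As a defensive layer behind the "No Plaintext Logs" rule, a logging filter could redact anything that looks like a password before it is written. This is a minimal sketch using only the standard library; `RedactSecretsFilter` and the regular expression are illustrative assumptions, not existing project code, and the pattern only catches `password=...` / `password: ...` style fragments.

```python
import logging
import re

# Matches "password=...", "passwd: ...", "pwd = ..." (case-insensitive).
_SECRET_PATTERN = re.compile(r"\b(password|passwd|pwd)(\s*[=:]\s*)\S+", re.IGNORECASE)


class RedactSecretsFilter(logging.Filter):
    """Replace password values in log messages with a fixed placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are covered too.
        message = record.getMessage()
        record.msg = _SECRET_PATTERN.sub(r"\1\2[REDACTED]", message)
        record.args = None
        return True  # never drop the record, only rewrite it


handler = logging.StreamHandler()
handler.addFilter(RedactSecretsFilter())
```

Attaching the filter to the handler rather than to a single logger means it applies to every record that handler emits. It complements code review; it does not replace it.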
Risk: Cross-Tenant Data Leakage
Scenario: A bug in the middleware or a router lets a user from tenant A access data from tenant B.
Impact: CRITICAL data breach - confidentiality is compromised.
Mitigation:
- Mandatory Middleware: TenantMiddleware validates tenant access for ALL requests
- Explicit Tenant ID: Routers MUST use request.state.tenant_id (no global state)
- Code Reviews: ALL router changes are reviewed for tenant isolation
- Integration Tests: Test that cross-tenant access is blocked (403 Forbidden)
- Audit Logging: Log tenant_id in ALL audit entries for forensics
Test Scenario:
# Test: a user from tenant A tries to access tenant B
# (login, client and secret_key come from the test setup)
from datetime import datetime, timedelta, timezone
import jwt
def test_cross_tenant_access_blocked():
    # Log in with tenant A
    token_a = login(user_id=123, tenant_id='client-a-uuid')
    # Change the JWT tenant_id to tenant B (attack simulation)
    forged_token = jwt.encode({
        'user_id': 123,
        'tenant_id': 'client-b-uuid',  # FORGED
        'exp': datetime.now(timezone.utc) + timedelta(hours=1)
    }, secret_key, algorithm='HS256')
    # Send the request with the forged token
    response = client.get('/api/dashboard/COMP1', headers={'Authorization': f'Bearer {forged_token}'})
    # MUST return 403 Forbidden (not 200 OK)
    assert response.status_code == 403
    assert 'does not have access to tenant' in response.json()['detail']
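The test above uses a correctly signed token, so a signature check alone would not reject it; the 403 has to come from a membership lookup. A minimal sketch of that decision follows, with hypothetical names (`authorize_tenant`, `TenantAccessDenied`) and an in-memory dict standing in for a query against the tenant_users table.

```python
from typing import Dict, Set


class TenantAccessDenied(Exception):
    """Raised when a user requests a tenant they do not belong to (maps to HTTP 403)."""


def authorize_tenant(user_id: int, tenant_id: str,
                     memberships: Dict[int, Set[str]]) -> str:
    """Return tenant_id if the user is a member, otherwise raise TenantAccessDenied.

    `memberships` is a stand-in for a lookup in the tenant_users table.
    """
    if tenant_id not in memberships.get(user_id, set()):
        raise TenantAccessDenied(
            f"User {user_id} does not have access to tenant {tenant_id}"
        )
    return tenant_id


memberships = {123: {"client-a-uuid"}}
print(authorize_tenant(123, "client-a-uuid", memberships))  # client-a-uuid
```

In the middleware, the returned value would be stored on request.state.tenant_id, and the exception would be translated into a 403 response carrying the same message.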
Risk: Performance Degradation cu Multiple Tenants
Scenario: With 10+ tenants, response time increases or the backend consumes too much memory.
Impact: Poor user experience, server overload.
Mitigation:
- Lazy Loading: Pools are created only when a tenant is first accessed (saves memory)
- Pool Cleanup: Pools inactive for > 1h are closed automatically
- Resource Limits: Set a realistic max_connections per tenant (avoids OOM)
- Monitoring: Track memory usage and response time per tenant
- Horizontal Scaling: Add more backend replicas (Docker Swarm, Kubernetes)
- Connection Pooling: Reuse connections (oracledb create_pool already does this)
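Lazy loading and idle cleanup can be combined in one small registry. The sketch below is an illustration under assumptions: `LazyPoolRegistry`, `factory` and `close` are hypothetical names, the real MultiTenantPoolManager may be structured differently, and the class is not safe for concurrent use without a lock.

```python
import time
from typing import Callable, Dict, Generic, List, Tuple, TypeVar

P = TypeVar("P")


class LazyPoolRegistry(Generic[P]):
    """Create one pool per tenant on first use; evict pools idle longer than `idle_ttl`."""

    def __init__(self, factory: Callable[[str], P], close: Callable[[P], None],
                 idle_ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._factory = factory    # builds a pool for a tenant_id
        self._close = close        # releases a pool's connections
        self._idle_ttl = idle_ttl  # seconds of inactivity before eviction
        self._clock = clock
        self._pools: Dict[str, Tuple[P, float]] = {}

    def get(self, tenant_id: str) -> P:
        """Return the tenant's pool, creating it on first access."""
        entry = self._pools.get(tenant_id)
        pool = entry[0] if entry is not None else self._factory(tenant_id)
        self._pools[tenant_id] = (pool, self._clock())  # refresh last-used time
        return pool

    def cleanup(self) -> List[str]:
        """Close and remove idle pools; return the evicted tenant IDs."""
        now = self._clock()
        expired = [tid for tid, (_, used) in self._pools.items()
                   if now - used > self._idle_ttl]
        for tenant_id in expired:
            pool, _ = self._pools.pop(tenant_id)
            self._close(pool)
        return expired
```

A background task could call `cleanup()` every few minutes. An evicted tenant simply gets a fresh pool on its next request, at the cost of one slower first call.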
Performance Baseline:
Single-Tenant:
- Memory: 50MB (1 pool × 2-10 connections)
- Response time: 50ms (p95)
Multi-Tenant (3 tenants):
- Memory: 150MB (3 pools × 2-10 connections)
- Response time: 55ms (p95)
- Overhead: 10% (at the limit of the < 10% target)
Multi-Tenant (10 tenants):
- Memory: 500MB (10 pools × 2-10 connections)
- Response time: 65ms (p95)
- Overhead: 30% (exceeds the < 10% target; needs optimization)
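The overhead figures in the baseline follow from the p95 response times relative to the 50ms single-tenant value. A short worked check, where `overhead_percent` is a hypothetical helper name:

```python
def overhead_percent(single_ms: float, multi_ms: float) -> float:
    """Relative p95 latency overhead of multi-tenant vs single-tenant, in percent."""
    return (multi_ms - single_ms) * 100 / single_ms


# p95 values taken from the baseline: 50ms single-tenant, 55ms and 65ms multi-tenant.
for tenants, p95_ms in [(3, 55), (10, 65)]:
    print(f"{tenants} tenants: {overhead_percent(50, p95_ms):.0f}% overhead")
# 3 tenants: 10% overhead
# 10 tenants: 30% overhead
```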
Optimization:
- Reduce min_connections from 2 to 1 for low-traffic tenants
- Aggressive cleanup: Idle > 30 min (instead of 1h)
- Cache more aggressively (reduce Oracle queries)
📚 References
Current Implementation
- OraclePool: shared/database/oracle_pool.py - Singleton pattern for single-tenant
- JWT Handler: shared/auth/jwt_handler.py - Token creation/validation (needs tenant_id)
- Auth Middleware: shared/auth/middleware.py - JWT verification (needs tenant validation)
- Backend Main: backend/app/main.py - Startup logic (needs MultiTenantPoolManager)
- SSH Tunnel Script: ssh-tunnel-prod.sh - Single tunnel script (needs per-tenant manager)
Inspiration & Patterns
- Redis Implementation Plan: shared/docs/REDIS_IMPLEMENTATION_PLAN.md - Good structure for this plan
- Docker Compose: docker-compose.yml - Current deployment (needs tenant-config-db service)
- Windows Deployment: deployment/windows/scripts/ - Deployment patterns for Windows
- Python oracledb Docs: https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html
- Fernet Encryption: https://cryptography.io/en/latest/fernet/
Multi-Tenant Best Practices
- Tenant Isolation Patterns: https://docs.microsoft.com/en-us/azure/architecture/guide/multitenant/
- Connection Pooling: https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#connection-pooling
- SSH Tunnel Management: https://www.ssh.com/academy/ssh/tunneling-example
- JWT Security: https://jwt.io/introduction
Testing Resources
- pytest-asyncio: https://pytest-asyncio.readthedocs.io/
- Locust Load Testing: https://docs.locust.io/en/stable/
- Docker Compose Testing: https://docs.docker.com/compose/
📅 Timeline Summary
| Phase | Duration | Objective | Verifiable Output |
|---|---|---|---|
| Phase 1 | 2-3 days | Tenant Config DB | Tenant DB works, default tenant created |
| Phase 2 | 3-4 days | MultiTenantPoolManager | Pools per tenant, lazy loading |
| Phase 3 | 2-3 days | SSH Tunnel Manager | SSH tunnels per tenant, auto-restart |
| Phase 4 | 2-3 days | JWT & Middleware | JWT with tenant_id, tenant validation |
| Phase 5 | 1-2 days | Cache & Audit | Redis cache per tenant, audit logs |
| Phase 6 | 3-4 days | Deployment & Testing | Deployed in all environments, tests pass |
| TOTAL | 13-19 days | Multi-Tenant Production-Ready | All success criteria met |
🚀 Next Steps
- Review this plan with the team/stakeholders
- Prioritize the phases (perhaps Phases 1+2 first, the rest afterwards)
- Set up a development environment for testing
- Create branch: feature/multi-tenant-architecture
- Start Phase 1: Tenant Configuration Database
- Iterate: Test after each phase, adjust the plan if needed
Document Version: 1.0 Last Updated: 2025-10-25 Author: Claude Code (Anthropic) Status: Ready for Implementation