roa2web-service-auto/shared/docs/MULTI_TENANT_UPGRADE_PLAN.md
Marius Mutu c56f832e81 Add comprehensive multi-tenant architecture upgrade plan
Creates detailed 60-page implementation roadmap for transforming ROA2WEB from
single-tenant to multi-tenant SaaS architecture. Plan includes 6 phases with
backward compatibility, hybrid connection support (SSH tunnel + direct), and
complete deployment strategies for dev/Docker/Windows environments.

Key features:
- Tenant isolation with separate Oracle connection pools per tenant
- Dynamic SSH tunnel management with auto-restart
- Encrypted credentials in PostgreSQL/SQLite tenant config DB
- JWT-based tenant identification and access validation
- Redis cache namespacing per tenant
- Comprehensive testing and migration strategies

Timeline: 14-20 days implementation
Target: <10% performance overhead, zero downtime migration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 22:59:12 +03:00


Multi-Tenant Architecture Upgrade Plan - ROA2WEB

Version: 1.0 Created: 2025-10-25 Status: Planning Phase


📋 Executive Summary

ROA2WEB will be transformed from a single-tenant application (one client, one Oracle database) into a multi-tenant SaaS architecture that supports:

  • Multiple concurrent clients with complete isolation between tenants (pools, cache, audit logs)
  • Hybrid connections: SSH tunnel for remote clients OR direct TCP for clients on the LAN
  • Flexible deployment: Development (WSL), Docker (Proxmox LXC), Windows IIS
  • Backward compatibility: the "default" tenant works exactly like the current single-tenant setup (zero breaking changes)
  • Gradual migration: each phase is independently testable, with incremental rollout
  • Security-first: passwords encrypted in the tenant DB, read-only SSH keys, JWT signing per tenant
  • Performance: < 10% overhead vs. single-tenant, with pool isolation per tenant

🏗️ Target Architecture

Single-Tenant (Current)

┌─────────────────────────────────────────────────────┐
│                  FastAPI Backend                    │
│                                                     │
│  ┌─────────────────────────────────────────────┐   │
│  │         OraclePool (Singleton)              │   │
│  │  - Hardcoded credentials from .env          │   │
│  │  - Min: 2, Max: 10 connections              │   │
│  │  - Shared by all users                      │   │
│  └─────────────────────────────────────────────┘   │
│                      ▼                              │
└──────────────────────┼──────────────────────────────┘
                       │
         ┌─────────────┴───────────┐
         │                         │
    SSH Tunnel                Direct Connection
  (Development)              (Windows Production)
         │                         │
         ▼                         ▼
┌─────────────────┐      ┌──────────────────┐
│  Oracle Server  │      │  Oracle Server   │
│   (Remote)      │      │   (Local LAN)    │
└─────────────────┘      └──────────────────┘

JWT Token Structure (Current):
{
  "username": "john.doe",
  "user_id": 123,
  "companies": ["COMP1", "COMP2"],
  "permissions": ["read", "reports"],
  "exp": 1234567890,
  "iat": 1234567800,
  "type": "access"
}

Multi-Tenant (Target)

┌────────────────────────────────────────────────────────────────────┐
│                        FastAPI Backend                             │
│                                                                    │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │            MultiTenantPoolManager (New)                      │ │
│  │                                                              │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │ │
│  │  │ Client A     │  │ Client B     │  │ Client C     │      │ │
│  │  │ Pool (2-10)  │  │ Pool (2-10)  │  │ Pool (2-10)  │      │ │
│  │  │ SSH Tunnel   │  │ Direct Conn  │  │ SSH Tunnel   │      │ │
│  │  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘      │ │
│  │         │                 │                 │              │ │
│  └─────────┼─────────────────┼─────────────────┼──────────────┘ │
│            │                 │                 │                │
└────────────┼─────────────────┼─────────────────┼────────────────┘
             │                 │                 │
    ┌────────┴─────┐  ┌────────┴─────┐  ┌────────┴─────┐
    │ SSH Process  │  │   Direct     │  │ SSH Process  │
    │ localhost:   │  │ 192.168.1.50 │  │ localhost:   │
    │   15261      │  │   :1521      │  │   15262      │
    └────────┬─────┘  └────────┬─────┘  └────────┬─────┘
             │                 │                 │
             ▼                 ▼                 ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │ Oracle       │  │ Oracle       │  │ Oracle       │
    │ Client A     │  │ Client B     │  │ Client C     │
    │ (Remote)     │  │ (LAN)        │  │ (Remote)     │
    └──────────────┘  └──────────────┘  └──────────────┘

                          ┌──────────────────────┐
                          │ Tenant Config DB     │
                          │ (PostgreSQL/SQLite)  │
                          │                      │
                          │ - tenants            │
                          │ - tenant_users       │
                          │ - audit_logs         │
                          └──────────────────────┘

JWT Token Structure (Target):
{
  "username": "john.doe",
  "user_id": 123,
  "tenant_id": "client-a-uuid",        ← NEW
  "companies": ["COMP1", "COMP2"],
  "permissions": ["read", "reports"],
  "exp": 1234567890,
  "iat": 1234567800,
  "type": "access"
}

Redis Cache Keys:
cache:{tenant_id}:dashboard:{company_id}  ← Already prepared!
cache:{tenant_id}:invoices:{filters_hash}
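
A minimal sketch of how keys in this layout could be built. The helper names `tenant_cache_key` and `filters_hash` are hypothetical (they are not taken from `redis_client.py`), and hashing the filters with SHA-256 over canonical JSON is an assumption about what `{filters_hash}` means.

```python
import hashlib
import json


def tenant_cache_key(tenant_id: str, resource: str, suffix: str) -> str:
    """Build a key in the cache:{tenant_id}:{resource}:{suffix} layout."""
    if not tenant_id or ":" in tenant_id:
        # A colon inside the tenant ID would blur the namespace boundary.
        raise ValueError(f"invalid tenant_id: {tenant_id!r}")
    return f"cache:{tenant_id}:{resource}:{suffix}"


def filters_hash(filters: dict) -> str:
    """Stable short hash of a filter dict; key order does not change the result."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


dashboard_key = tenant_cache_key("client-a-uuid", "dashboard", "COMP1")
invoices_key = tenant_cache_key(
    "client-a-uuid", "invoices", filters_hash({"year": 2025, "status": "paid"})
)
```

Because the tenant ID is the first variable segment, all keys of one tenant share the prefix `cache:{tenant_id}:`, which keeps per-tenant invalidation a simple prefix match.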

Key Architectural Decisions

  1. Lazy Pool Initialization: pools are created only when a tenant is first accessed (saves memory)
  2. SSH Tunnel per Tenant: a separate subprocess for each remote tenant (isolation, resilience)
  3. Separate Tenant Config DB: tenant config is not stored in Oracle (avoids circular dependencies)
  4. Signed JWT Tenant ID: the tenant ID lives in the signed token, so the client cannot change it without invalidating the signature
  5. Pool Cleanup: pools inactive for > 1h are closed automatically (saves resources)
  6. Backward Compatible: the "default" tenant maps to the current .env (zero migration pain)
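
Decision 4 relies on the token signature covering `tenant_id`. The sketch below illustrates that property with a hand-rolled HS256-style token built from the standard library only. It is an illustration, not the project's `jwt_handler.py`: the function names are hypothetical, and expiry checks and header validation are deliberately left out.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    signature = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes) -> dict:
    header, body, signature = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("signature mismatch")
    return json.loads(_b64url_decode(body))


def tamper_tenant(token: str, new_tenant_id: str) -> str:
    """Simulate a client rewriting tenant_id while reusing the old signature."""
    header, body, signature = token.split(".")
    payload = json.loads(_b64url_decode(body))
    payload["tenant_id"] = new_tenant_id
    return f"{header}.{_b64url(json.dumps(payload).encode())}.{signature}"
```

Verifying a token returned by `sign_token` yields the original payload, while a token passed through `tamper_tenant` fails with `ValueError`, since the signature no longer matches the modified body.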

🗂️ File Structure

New Files

shared/
├── database/
│   ├── multi_tenant_pool.py           ✅ NEW - MultiTenantPoolManager class
│   ├── tenant_config.py                ✅ NEW - Tenant configuration loader
│   ├── ssh_tunnel_manager.py          ✅ NEW - SSH tunnel per tenant management
│   └── tenant_models.py                ✅ NEW - Pydantic models for tenants
│
├── middleware/
│   └── tenant_middleware.py            ✅ NEW - Tenant identification middleware
│
├── schemas/
│   └── tenant_config_schema.sql        ✅ NEW - PostgreSQL/SQLite schema
│
└── utils/
    ├── encryption.py                   ✅ NEW - Fernet encryption for passwords
    └── tenant_utils.py                 ✅ NEW - Tenant helper functions

deployment/
├── docker/
│   └── tenant-config-db.dockerfile     ✅ NEW - PostgreSQL tenant config container
│
└── windows/
    └── tenant-config-setup.ps1         ✅ NEW - SQL Server Express setup for tenants

Modified Files

shared/
├── database/
│   └── oracle_pool.py                  ⚠️ MODIFY - Add DEPRECATED warning
│
├── auth/
│   ├── jwt_handler.py                  ⚠️ MODIFY - Add tenant_id to JWT payload
│   └── middleware.py                   ⚠️ MODIFY - Extract tenant_id, validate access
│
└── cache/
    └── redis_client.py                 ⚠️ MODIFY - Use real tenant_id (not "default")

reports-app/backend/
├── app/
│   ├── main.py                         ⚠️ MODIFY - Initialize MultiTenantPoolManager
│   └── routers/
│       ├── companies.py                ⚠️ MODIFY - Use tenant_id from request.state
│       ├── dashboard.py                ⚠️ MODIFY - Use tenant_id from request.state
│       ├── invoices.py                 ⚠️ MODIFY - Use tenant_id from request.state
│       └── treasury.py                 ⚠️ MODIFY - Use tenant_id from request.state
│
└── .env.example                        ⚠️ MODIFY - Add tenant config DB variables

docker-compose.yml                      ⚠️ MODIFY - Add tenant-config-db service

deployment/windows/
└── scripts/
    └── Install-ROA2WEB.ps1             ⚠️ MODIFY - Add tenant DB setup

Database Schema (Tenant Config DB)

PostgreSQL/SQLite Compatible Schema

-- shared/schemas/tenant_config_schema.sql

-- Tenants configuration table
CREATE TABLE IF NOT EXISTS tenants (
    id VARCHAR(36) PRIMARY KEY,                          -- UUID
    name VARCHAR(255) NOT NULL,                          -- Display name (ex: "Client A - Retail SRL")
    connection_type VARCHAR(20) NOT NULL,                -- 'ssh_tunnel' | 'direct'

    -- Oracle connection details
    oracle_host VARCHAR(255) NOT NULL,                   -- Oracle server IP/hostname
    oracle_port INTEGER NOT NULL DEFAULT 1521,
    oracle_sid VARCHAR(50) NOT NULL DEFAULT 'ROA',
    oracle_user VARCHAR(100) NOT NULL,
    oracle_password_encrypted TEXT NOT NULL,             -- Fernet encrypted password

    -- SSH tunnel configuration (NULL if connection_type='direct')
    ssh_host VARCHAR(255),                               -- SSH server IP
    ssh_port INTEGER DEFAULT 22,
    ssh_user VARCHAR(100),
    ssh_key_path VARCHAR(500),                           -- Path to SSH private key
    ssh_tunnel_local_port INTEGER,                       -- Local port for tunnel (ex: 15261)

    -- Pool configuration
    min_connections INTEGER NOT NULL DEFAULT 2,
    max_connections INTEGER NOT NULL DEFAULT 10,

    -- Status
    is_active BOOLEAN NOT NULL DEFAULT TRUE,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,

    -- Constraints
    CONSTRAINT chk_connection_type CHECK (connection_type IN ('ssh_tunnel', 'direct')),
    CONSTRAINT chk_ssh_config CHECK (
        (connection_type = 'direct') OR
        (connection_type = 'ssh_tunnel' AND ssh_host IS NOT NULL AND ssh_key_path IS NOT NULL)
    )
);

-- Tenant users mapping (which users have access to which tenants)
CREATE TABLE IF NOT EXISTS tenant_users (
    id SERIAL PRIMARY KEY,                               -- Auto-increment ID (SERIAL is PostgreSQL-only; SQLite needs INTEGER PRIMARY KEY)
    tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    user_id INTEGER NOT NULL,                            -- Oracle user ID from CONTAFIN_ORACLE.UTILIZATORI
    username VARCHAR(100) NOT NULL,                      -- Oracle username
    is_admin BOOLEAN NOT NULL DEFAULT FALSE,             -- Tenant admin (can manage tenant config)
    granted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    granted_by INTEGER,                                  -- User ID who granted access

    UNIQUE(tenant_id, user_id)
);

-- Audit logs per tenant
CREATE TABLE IF NOT EXISTS audit_logs (
    id SERIAL PRIMARY KEY,
    tenant_id VARCHAR(36) NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    user_id INTEGER NOT NULL,
    username VARCHAR(100) NOT NULL,
    action VARCHAR(100) NOT NULL,                        -- 'login', 'query', 'export', etc.
    resource VARCHAR(255),                               -- Resource accessed (ex: 'dashboard', 'invoices')
    status VARCHAR(20) NOT NULL,                         -- 'success' | 'error'
    error_message TEXT,
    ip_address VARCHAR(50),
    user_agent TEXT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Indexes for fast audit queries. Inline INDEX clauses are MySQL syntax;
-- PostgreSQL and SQLite need separate CREATE INDEX statements.
CREATE INDEX IF NOT EXISTS idx_tenant_user ON audit_logs(tenant_id, user_id);
CREATE INDEX IF NOT EXISTS idx_created_at ON audit_logs(created_at);

-- Insert default tenant (backward compatibility)
-- This maps to existing .env credentials
INSERT INTO tenants (
    id, name, connection_type,
    oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
    min_connections, max_connections, is_active
) VALUES (
    'default',
    'Default Tenant (Single-Tenant Legacy)',
    'ssh_tunnel',  -- Will be read from environment
    'localhost',   -- Will be overridden by environment if needed
    1526,
    'ROA',
    'CONTAFIN_ORACLE',
    'PLACEHOLDER_ENCRYPTED_PASSWORD',  -- Will be replaced by migration script
    2,
    10,
    TRUE
) ON CONFLICT (id) DO NOTHING;

-- Indexes for performance
CREATE INDEX IF NOT EXISTS idx_tenants_active ON tenants(is_active);
CREATE INDEX IF NOT EXISTS idx_tenant_users_user ON tenant_users(user_id);
CREATE INDEX IF NOT EXISTS idx_audit_tenant ON audit_logs(tenant_id);
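
The two CHECK constraints on `tenants` can be exercised against an in-memory SQLite database. The sketch below uses a reduced column set (an assumption made for brevity) and a hypothetical `try_insert` helper. It shows that a `direct` tenant needs no SSH fields, while an `ssh_tunnel` tenant without a key path is rejected.

```python
import sqlite3

# Reduced version of the tenants table: only the columns the constraints touch.
SCHEMA = """
CREATE TABLE tenants (
    id VARCHAR(36) PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    connection_type VARCHAR(20) NOT NULL,
    oracle_host VARCHAR(255) NOT NULL,
    ssh_host VARCHAR(255),
    ssh_key_path VARCHAR(500),
    CONSTRAINT chk_connection_type CHECK (connection_type IN ('ssh_tunnel', 'direct')),
    CONSTRAINT chk_ssh_config CHECK (
        (connection_type = 'direct') OR
        (connection_type = 'ssh_tunnel' AND ssh_host IS NOT NULL AND ssh_key_path IS NOT NULL)
    )
);
"""

INSERT = (
    "INSERT INTO tenants (id, name, connection_type, oracle_host, ssh_host, ssh_key_path) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)


def try_insert(conn: sqlite3.Connection, row: tuple) -> bool:
    """Return True if the row satisfies the constraints, False if it is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(INSERT, row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

direct_ok = try_insert(conn, ("t-direct", "LAN client", "direct", "192.168.1.50", None, None))
tunnel_ok = try_insert(
    conn, ("t-ssh", "Remote client", "ssh_tunnel", "10.0.0.5", "ssh.example.com", "/keys/t-ssh")
)
missing_key = try_insert(
    conn, ("t-bad", "Broken client", "ssh_tunnel", "10.0.0.5", "ssh.example.com", None)
)
unknown_type = try_insert(conn, ("t-vpn", "VPN client", "vpn", "10.0.0.6", None, None))
row_count = conn.execute("SELECT COUNT(*) FROM tenants").fetchone()[0]
```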

🚀 Upgrade Phases

PHASE 1: Tenant Configuration Database (2-3 days)

Objective: Create the tenant configuration database and a loader that reads tenant configs at startup.

Tasks

  1. Create the PostgreSQL/SQLite schema for tenant config

    • File: shared/schemas/tenant_config_schema.sql
    • Action: Define the tables tenants, tenant_users, audit_logs
    • Deployment:
      • Dev: SQLite (data/tenant_config.db)
      • Docker: PostgreSQL container (roa-tenant-config-db)
      • Windows: SQL Server Express OR PostgreSQL Windows service
  2. Implement TenantConfigLoader

    • File: shared/database/tenant_config.py
    • Class: TenantConfigLoader(db_url: str)
    • Methods:
      • async def load_tenants() -> Dict[str, TenantConfig] - Load all active tenants
      • async def get_tenant(tenant_id: str) -> Optional[TenantConfig] - Get specific tenant
      • async def reload_tenant(tenant_id: str) - Reload tenant config (for updates)
    • Pattern: Async context manager for DB connections
  3. Implement Pydantic models for tenant config

    • File: shared/database/tenant_models.py
    • Models:
      class TenantConfig(BaseModel):
          id: str  # UUID
          name: str
          connection_type: Literal['ssh_tunnel', 'direct']
          oracle_host: str
          oracle_port: int
          oracle_sid: str
          oracle_user: str
          oracle_password: str  # Decrypted
          ssh_host: Optional[str] = None
          ssh_port: Optional[int] = 22
          ssh_user: Optional[str] = None
          ssh_key_path: Optional[str] = None
          ssh_tunnel_local_port: Optional[int] = None
          min_connections: int = 2
          max_connections: int = 10
          is_active: bool = True
      
  4. Implement password encryption/decryption

    • File: shared/utils/encryption.py
    • Functions:
      • encrypt_password(password: str, key: str) -> str - Fernet encryption
      • decrypt_password(encrypted: str, key: str) -> str - Fernet decryption
    • Environment: DB_ENCRYPTION_KEY (generate with Fernet.generate_key())
  5. Create the migration script for the default tenant

    • File: shared/scripts/create_default_tenant.py
    • Action:
      • Read credentials from the current .env
      • Encrypt the password with DB_ENCRYPTION_KEY
      • Insert the "default" tenant into the tenant DB
      • Test decryption and the Oracle connection
  6. Update Docker Compose with the tenant config DB

    • File: docker-compose.yml
    • New service:
      roa-tenant-config-db:
        image: postgres:15-alpine
        container_name: roa-tenant-config-db
        environment:
          POSTGRES_DB: tenant_config
          POSTGRES_USER: tenant_admin
          POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
        volumes:
          - tenant-config-data:/var/lib/postgresql/data
          - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
        networks:
          - roa-network
      
  7. Update .env.example with the tenant DB variables

    # Tenant Configuration Database
    TENANT_DB_URL=postgresql://tenant_admin:password@localhost:5432/tenant_config
    # For SQLite (development): sqlite:///data/tenant_config.db
    DB_ENCRYPTION_KEY=GENERATE_WITH_Fernet.generate_key()
    

Verifiable Output

  • Tenant DB is created successfully (PostgreSQL/SQLite)
  • Schema tables are created (tenants, tenant_users, audit_logs)
  • Default tenant loads with credentials from the current .env
  • Password encryption/decryption works
  • Test: pytest shared/tests/test_tenant_config.py -v
  • Docker: docker-compose up roa-tenant-config-db starts successfully
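
The loader interface from task 2 can be sketched against SQLite as follows. This is a simplified illustration under stated assumptions: the methods are synchronous (the plan specifies async ones), `TenantSummary` is a hypothetical reduced stand-in for `TenantConfig`, and the table contains only four columns.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TenantSummary:
    """Hypothetical, reduced stand-in for the TenantConfig model."""
    id: str
    name: str
    connection_type: str
    is_active: bool


class SqliteTenantLoader:
    """Synchronous sketch of the loader interface (the plan's methods are async)."""

    _COLUMNS = "id, name, connection_type, is_active"

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.row_factory = sqlite3.Row  # access columns by name

    @staticmethod
    def _to_summary(row: sqlite3.Row) -> TenantSummary:
        return TenantSummary(
            row["id"], row["name"], row["connection_type"], bool(row["is_active"])
        )

    def load_tenants(self) -> Dict[str, TenantSummary]:
        """Return all active tenants keyed by tenant ID."""
        rows = self._conn.execute(
            f"SELECT {self._COLUMNS} FROM tenants WHERE is_active = 1"
        )
        return {row["id"]: self._to_summary(row) for row in rows}

    def get_tenant(self, tenant_id: str) -> Optional[TenantSummary]:
        """Return one tenant (active or not), or None if the ID is unknown."""
        row = self._conn.execute(
            f"SELECT {self._COLUMNS} FROM tenants WHERE id = ?", (tenant_id,)
        ).fetchone()
        return self._to_summary(row) if row is not None else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenants (id TEXT PRIMARY KEY, name TEXT NOT NULL, "
    "connection_type TEXT NOT NULL, is_active INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO tenants (id, name, connection_type, is_active) VALUES (?, ?, ?, ?)",
    [("default", "Default Tenant", "ssh_tunnel", 1), ("client-b", "Client B", "direct", 0)],
)
loader = SqliteTenantLoader(conn)
active = loader.load_tenants()
```

Here `load_tenants()` skips the inactive `client-b`, while `get_tenant("client-b")` still returns it so that callers can report "tenant is not active" rather than "tenant not found".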

PHASE 2: MultiTenantPoolManager (3-4 days)

Objective: Implement a pool manager that creates separate Oracle pools per tenant with lazy initialization.

Tasks

  1. Implement the MultiTenantPoolManager class

    • File: shared/database/multi_tenant_pool.py
    • Pattern: Singleton (similar to the current OraclePool)
    • Structure:
      class MultiTenantPoolManager:
          _instance: Optional['MultiTenantPoolManager'] = None
          _pools: Dict[str, oracledb.ConnectionPool] = {}  # tenant_id -> pool
          _tenant_configs: Dict[str, TenantConfig] = {}
          _pool_locks: Dict[str, asyncio.Lock] = {}  # Per-tenant locks; serialize pool creation across coroutines (asyncio.Lock is not thread-safe)
          _last_access: Dict[str, datetime] = {}  # For cleanup inactive pools
      
          async def initialize(self, tenant_db_url: str):
              """Load tenant configs from tenant DB"""
      
          async def get_connection(self, tenant_id: str):
              """Context manager - get connection from tenant pool (lazy init)"""
      
          async def _ensure_pool(self, tenant_id: str):
              """Lazy initialize pool if not exists"""
      
          async def reload_tenant(self, tenant_id: str):
              """Reload tenant config and recreate pool"""
      
          async def cleanup_inactive_pools(self, max_idle_hours: int = 1):
              """Close pools inactive > max_idle_hours"""
      
          async def close_all_pools(self):
              """Shutdown - close all pools"""
      
  2. Implement lazy pool initialization

    • Logic:
      async def _ensure_pool(self, tenant_id: str):
          if tenant_id in self._pools:
              self._last_access[tenant_id] = datetime.utcnow()
              return  # Pool already exists
      
          # Acquire the per-tenant lock so concurrent coroutines create the pool only once
          async with self._pool_locks.setdefault(tenant_id, asyncio.Lock()):
              # Double-check inside the lock
              if tenant_id in self._pools:
                  return
      
              # Load tenant config
              tenant_config = await self._load_tenant_config(tenant_id)
              if not tenant_config.is_active:
                  raise ValueError(f"Tenant {tenant_id} is not active")
      
              # Create pool
              pool = oracledb.create_pool(
                  user=tenant_config.oracle_user,
                  password=tenant_config.oracle_password,
                  host=tenant_config.oracle_host,
                  port=tenant_config.oracle_port,
                  sid=tenant_config.oracle_sid,
                  min=tenant_config.min_connections,
                  max=tenant_config.max_connections,
                  increment=1,
                  getmode=oracledb.POOL_GETMODE_WAIT
              )
      
              self._pools[tenant_id] = pool
              self._tenant_configs[tenant_id] = tenant_config
              self._last_access[tenant_id] = datetime.utcnow()
              logger.info(f"Created pool for tenant {tenant_id} ({tenant_config.name})")
      
  3. Implement the get_connection context manager

    • Pattern: Same as OraclePool.get_connection(), but per tenant
      @asynccontextmanager
      async def get_connection(self, tenant_id: str):
          await self._ensure_pool(tenant_id)  # Lazy init
      
          pool = self._pools[tenant_id]
          connection = None
          try:
              connection = pool.acquire()
              self._last_access[tenant_id] = datetime.utcnow()
              logger.debug(f"Connection acquired for tenant {tenant_id}")
              yield connection
          finally:
              if connection is not None:
                  connection.close()
                  logger.debug(f"Connection returned for tenant {tenant_id}")
      
  4. Implement pool cleanup for inactive tenants

    • Scheduled task: Run every hour, close pools inactive > 1h
      async def cleanup_inactive_pools(self, max_idle_hours: int = 1):
          now = datetime.utcnow()
          inactive_tenants = []
      
          for tenant_id, last_access in self._last_access.items():
              idle_hours = (now - last_access).total_seconds() / 3600
              if idle_hours > max_idle_hours:
                  inactive_tenants.append(tenant_id)
      
          for tenant_id in inactive_tenants:
              logger.info(f"Closing inactive pool for tenant {tenant_id}")
              pool = self._pools.pop(tenant_id, None)
              if pool:
                  pool.close()
              self._tenant_configs.pop(tenant_id, None)
              self._last_access.pop(tenant_id, None)
      
  5. Implement tenant config reload (for dynamic updates)

    • Use case: An admin updates the tenant config in the DB and the application reloads it without a restart
      async def reload_tenant(self, tenant_id: str):
          # Close existing pool
          old_pool = self._pools.pop(tenant_id, None)
          if old_pool:
              old_pool.close()
      
          # Reload config from DB
          tenant_config = await self._tenant_config_loader.get_tenant(tenant_id)
          if not tenant_config:
              raise ValueError(f"Tenant {tenant_id} not found")
      
          # Pool will be recreated on next request (lazy init)
          self._tenant_configs.pop(tenant_id, None)
          self._last_access.pop(tenant_id, None)
          logger.info(f"Reloaded tenant config for {tenant_id}")
      
  6. Add backward compatibility layer

    • The "default" tenant maps to the credentials in .env for zero breaking changes
      async def _load_default_tenant_from_env(self) -> TenantConfig:
          """Fallback: Load default tenant from .env if tenant DB is not available"""
          return TenantConfig(
              id='default',
              name='Default Tenant (Legacy)',
              connection_type='ssh_tunnel' if os.getenv('ORACLE_HOST') == 'localhost' else 'direct',
              oracle_host=os.getenv('ORACLE_HOST', 'localhost'),
              oracle_port=int(os.getenv('ORACLE_PORT', '1526')),
              oracle_sid=os.getenv('ORACLE_SID', 'ROA'),
              oracle_user=os.getenv('ORACLE_USER'),
              oracle_password=os.getenv('ORACLE_PASSWORD'),
              min_connections=2,
              max_connections=10,
              is_active=True
          )
      
  7. Mark OraclePool as DEPRECATED

    • File: shared/database/oracle_pool.py
    • Action: Add deprecation warning
      import warnings
      
      class OraclePool:
          """
          DEPRECATED: Use MultiTenantPoolManager instead.
          This class is kept for backward compatibility only.
          Will be removed in version 2.0.
          """
          def __init__(self):
              warnings.warn(
                  "OraclePool is deprecated. Use MultiTenantPoolManager for multi-tenant support.",
                  DeprecationWarning,
                  stacklevel=2
              )
              # ... rest of code
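
The double-checked locking used in `_ensure_pool` can be tried in isolation. The sketch below is a toy stand-in under stated assumptions: `LazyPoolRegistry` and its `factory` argument are hypothetical, and a plain dict replaces the Oracle pool. It shows that twenty concurrent requests for the same tenant create exactly one pool.

```python
import asyncio
from typing import Callable, Dict


class LazyPoolRegistry:
    """Toy registry: creates one 'pool' per tenant on first use."""

    def __init__(self, factory: Callable[[str], object]) -> None:
        self._factory = factory
        self._pools: Dict[str, object] = {}
        self._locks: Dict[str, asyncio.Lock] = {}
        self.created = 0  # number of factory calls, for inspection

    async def get_pool(self, tenant_id: str) -> object:
        # Fast path: the pool already exists, no lock needed.
        pool = self._pools.get(tenant_id)
        if pool is not None:
            return pool

        lock = self._locks.setdefault(tenant_id, asyncio.Lock())
        async with lock:
            # Double-check: another coroutine may have created it while we waited.
            pool = self._pools.get(tenant_id)
            if pool is None:
                await asyncio.sleep(0.01)  # stands in for slow pool creation
                pool = self._factory(tenant_id)
                self._pools[tenant_id] = pool
                self.created += 1
            return pool


async def demo() -> int:
    registry = LazyPoolRegistry(lambda tenant_id: {"tenant": tenant_id})
    pools = await asyncio.gather(*(registry.get_pool("client-a") for _ in range(20)))
    # Every caller received the same pool object.
    assert all(p is pools[0] for p in pools)
    return registry.created


created = asyncio.run(demo())
```

Without the second check inside the lock, every coroutine that queued on the lock would build its own pool after the first one finished.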
      

Verifiable Output

  • MultiTenantPoolManager creates pools per tenant
  • Lazy initialization: the pool is created only on the first request
  • The "default" tenant works with credentials from .env (backward compatible)
  • Pool cleanup: inactive pools are closed automatically after 1h
  • Reload tenant: config update without an application restart
  • Test: pytest shared/tests/test_multi_tenant_pool.py -v
  • Test: Connect to 3 dummy tenants concurrently
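
The idle-pool rule ("close pools inactive > 1h") can be checked without a database. In the sketch below, `FakePool`, `find_idle_tenants`, and `evict_idle` are hypothetical names. It also uses timezone-aware timestamps, a deliberate swap because `datetime.utcnow()` is deprecated as of Python 3.12.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


class FakePool:
    """Stand-in for a connection pool; only records whether close() was called."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def find_idle_tenants(
    last_access: Dict[str, datetime], now: datetime, max_idle: timedelta
) -> List[str]:
    """Tenants whose last access is strictly older than max_idle."""
    return sorted(t for t, seen in last_access.items() if now - seen > max_idle)


def evict_idle(
    pools: Dict[str, FakePool],
    last_access: Dict[str, datetime],
    now: datetime,
    max_idle: timedelta,
) -> List[str]:
    """Close and forget idle pools; the idle list is built before any mutation."""
    evicted = find_idle_tenants(last_access, now, max_idle)
    for tenant_id in evicted:
        pool = pools.pop(tenant_id, None)
        if pool is not None:
            pool.close()
        last_access.pop(tenant_id, None)
    return evicted


now = datetime(2025, 10, 25, 12, 0, tzinfo=timezone.utc)
pools = {"client-a": FakePool(), "client-b": FakePool(), "client-c": FakePool()}
stale_pool = pools["client-b"]
last_access = {
    "client-a": now - timedelta(minutes=5),
    "client-b": now - timedelta(hours=2),
    "client-c": now - timedelta(hours=1),  # exactly at the threshold: kept
}
evicted = evict_idle(pools, last_access, now, timedelta(hours=1))
```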

PHASE 3: SSH Tunnel Management per Tenant (2-3 days)

Objective: Implement an SSH tunnel manager that creates and monitors an SSH subprocess per remote tenant.

Tasks

  1. Implement the SSHTunnelManager class

    • File: shared/database/ssh_tunnel_manager.py
    • Responsibilities:
      • Start SSH tunnel subprocess per tenant
      • Monitor tunnel health (periodic checks)
      • Auto-restart on failure (exponential backoff)
      • Cleanup on shutdown
    • Structure:
      class SSHTunnelManager:
          _tunnels: Dict[str, subprocess.Popen] = {}  # tenant_id -> SSH process
          _tunnel_ports: Dict[str, int] = {}  # tenant_id -> local port
          _restart_attempts: Dict[str, int] = {}  # For exponential backoff
      
          async def start_tunnel(self, tenant_config: TenantConfig) -> int:
              """Start SSH tunnel for tenant, return local port"""
      
          async def stop_tunnel(self, tenant_id: str):
              """Stop SSH tunnel subprocess"""
      
          async def check_tunnel_health(self, tenant_id: str) -> bool:
              """Check if tunnel is alive and responding"""
      
          async def restart_tunnel(self, tenant_id: str):
              """Restart tunnel with exponential backoff"""
      
          async def cleanup_all_tunnels(self):
              """Shutdown - kill all SSH processes"""
      
  2. Implement SSH tunnel start logic

    • Logic:
      async def start_tunnel(self, tenant_config: TenantConfig) -> int:
          tenant_id = tenant_config.id
      
          # Generate unique local port for this tenant
          local_port = tenant_config.ssh_tunnel_local_port or self._allocate_port()
      
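          # Build SSH command. No '-f' here: with '-f' ssh forks into the background and the
          # process started by Popen exits at once, so poll() and kill() would no longer
          # refer to the real tunnel process.
          ssh_cmd = [
              'ssh', '-N',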
              '-L', f'{local_port}:{tenant_config.oracle_host}:{tenant_config.oracle_port}',
              '-p', str(tenant_config.ssh_port),
              '-i', tenant_config.ssh_key_path,
              '-o', 'ServerAliveInterval=60',
              '-o', 'ServerAliveCountMax=3',
              '-o', 'ExitOnForwardFailure=yes',
              f'{tenant_config.ssh_user}@{tenant_config.ssh_host}'
          ]
      
          # Start process
          process = subprocess.Popen(ssh_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
      
          # Wait for tunnel to establish (max 10 seconds)
          for _ in range(10):
              if self._check_port_open('localhost', local_port):
                  break
              await asyncio.sleep(1)
          else:
              process.kill()
              raise RuntimeError(f"SSH tunnel failed to start for tenant {tenant_id}")
      
          self._tunnels[tenant_id] = process
          self._tunnel_ports[tenant_id] = local_port
          logger.info(f"SSH tunnel started for tenant {tenant_id} on port {local_port}")
      
          return local_port
      
  3. Implement tunnel health checks

    • Periodic check: Every 60 seconds, verify tunnel is alive
      async def check_tunnel_health(self, tenant_id: str) -> bool:
          if tenant_id not in self._tunnels:
              return False
      
          process = self._tunnels[tenant_id]
          local_port = self._tunnel_ports[tenant_id]
      
          # Check process is alive
          if process.poll() is not None:
              logger.warning(f"SSH tunnel process died for tenant {tenant_id}")
              return False
      
          # Check port is accessible
          if not self._check_port_open('localhost', local_port):
              logger.warning(f"SSH tunnel port {local_port} not accessible for tenant {tenant_id}")
              return False
      
          return True
      
      def _check_port_open(self, host: str, port: int) -> bool:
          import socket
          try:
              with socket.create_connection((host, port), timeout=2):
                  return True
          except OSError:  # connection refused, host unreachable, or timeout
              return False
      
  4. Implement auto-restart with exponential backoff

    • Logic: If the tunnel dies, restart with delays of 5s, 10s, 20s, 40s, then capped at 60s
      async def restart_tunnel(self, tenant_id: str):
          attempts = self._restart_attempts.get(tenant_id, 0)
          delay = min(5 * (2 ** attempts), 60)  # Exponential backoff, max 60s
      
          logger.info(f"Restarting tunnel for tenant {tenant_id} (attempt {attempts+1}, delay {delay}s)")
          await asyncio.sleep(delay)
      
          try:
              await self.stop_tunnel(tenant_id)
              tenant_config = await self._get_tenant_config(tenant_id)
              await self.start_tunnel(tenant_config)
      
              # Reset attempts on success
              self._restart_attempts[tenant_id] = 0
              logger.info(f"Tunnel restarted successfully for tenant {tenant_id}")
          except Exception as e:
              self._restart_attempts[tenant_id] = attempts + 1
              logger.error(f"Tunnel restart failed for tenant {tenant_id}: {e}")
              raise
      
  5. Integrate the SSH tunnel manager into MultiTenantPoolManager

    • Logic: If the tenant has connection_type='ssh_tunnel', start the tunnel before creating the pool
      # In MultiTenantPoolManager._ensure_pool()
      
      tenant_config = await self._load_tenant_config(tenant_id)
      
      # Start SSH tunnel if needed
      if tenant_config.connection_type == 'ssh_tunnel':
          if await self._ssh_tunnel_manager.check_tunnel_health(tenant_id):
              # Tunnel already running: reuse its local port
              local_port = self._ssh_tunnel_manager._tunnel_ports[tenant_id]
          else:
              local_port = await self._ssh_tunnel_manager.start_tunnel(tenant_config)
          # Always point the pool at the local tunnel endpoint, not at the remote host
          tenant_config.oracle_host = 'localhost'
          tenant_config.oracle_port = local_port
      
      # Create pool (rest of code same as before)
      pool = oracledb.create_pool(...)
      
  6. Implement cleanup on shutdown

    • Logica: Kill all SSH processes gracefully
      async def cleanup_all_tunnels(self):
          for tenant_id, process in self._tunnels.items():
              try:
                  process.terminate()  # SIGTERM
                  await asyncio.sleep(2)
                  if process.poll() is None:
                      process.kill()  # SIGKILL if not dead
                  logger.info(f"Stopped SSH tunnel for tenant {tenant_id}")
              except Exception as e:
                  logger.error(f"Error stopping tunnel for tenant {tenant_id}: {e}")
      
          self._tunnels.clear()
          self._tunnel_ports.clear()
      
  7. Add a background task for health monitoring

    • File: reports-app/backend/app/main.py
    • Task: Run every 60 seconds
      async def monitor_ssh_tunnels():
          tunnel_manager = multi_tenant_pool._ssh_tunnel_manager
          while True:
              await asyncio.sleep(60)
              # Snapshot the keys: restarting a tunnel mutates the dict during iteration
              for tenant_id in list(tunnel_manager._tunnels):
                  if not await tunnel_manager.check_tunnel_health(tenant_id):
                      logger.warning(f"Tunnel unhealthy for tenant {tenant_id}, restarting...")
                      try:
                          await tunnel_manager.restart_tunnel(tenant_id)
                      except Exception:
                          # restart_tunnel() already logged the error; keep monitoring other tenants
                          continue
      
      # In lifespan startup (keep a reference so the task is not garbage-collected)
      monitor_task = asyncio.create_task(monitor_ssh_tunnels())
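
`start_tunnel` falls back to `self._allocate_port()` when a tenant has no fixed `ssh_tunnel_local_port`, but the plan does not define that helper. One possible approach, shown as a standalone function with a hypothetical name and signature, is to let the operating system pick a free port by binding to port 0 and to track the ports already handed out.

```python
import socket
from typing import Set


def allocate_local_port(reserved: Set[int], host: str = "127.0.0.1") -> int:
    """Ask the OS for a free TCP port that no other tenant has reserved yet."""
    for _ in range(50):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, 0))  # port 0 lets the OS choose a free port
            port = sock.getsockname()[1]
        if port not in reserved:
            reserved.add(port)
            return port
    raise RuntimeError("no free local port found")


reserved_ports: Set[int] = set()
port_a = allocate_local_port(reserved_ports)
port_b = allocate_local_port(reserved_ports)
```

The socket is closed before ssh binds the port, so another process could still grab it in between. The `ExitOnForwardFailure=yes` option in the SSH command makes that case fail loudly instead of leaving a tunnel without a forward.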
      

Verifiable Output

  • An SSH tunnel subprocess starts per remote tenant
  • Tunnel health check detects dead tunnels
  • Auto-restart with exponential backoff works
  • Multiple tenants with simultaneous SSH tunnels (unique port allocation)
  • Cleanup on shutdown: all SSH processes stop
  • Test: pytest shared/tests/test_ssh_tunnel_manager.py -v
  • Manual test: Kill the SSH process, verify auto-restart on the next health check (60s interval plus the initial 5s backoff)
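
The backoff schedule from task 4 follows directly from the formula `min(5 * (2 ** attempts), 60)`. The helper below (the name `restart_delay` is hypothetical) makes the sequence explicit for the first six attempts.

```python
def restart_delay(attempts: int, base: float = 5.0, cap: float = 60.0) -> float:
    """Delay in seconds before restart number attempts + 1, doubling up to the cap."""
    return min(base * (2 ** attempts), cap)


# Delays for attempts 0..5
schedule = [restart_delay(n) for n in range(6)]
# → [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```

Since the attempt counter resets to 0 after a successful restart, a tunnel that fails again later starts over at the 5s delay.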

PHASE 4: JWT & Middleware Update (2-3 days)

Objective: Update JWT tokens to include tenant_id and the middleware to extract and validate tenant access.

Tasks

  1. Update the JWT handler to include tenant_id

    • File: shared/auth/jwt_handler.py
    • Changes:
      # In the TokenData model
      class TokenData(BaseModel):
          username: str
          user_id: Optional[int] = None
          tenant_id: str = Field(description="Tenant ID (UUID)")  # NEW
          companies: List[str] = Field(default_factory=list)
          permissions: List[str] = Field(default_factory=list)
          exp: datetime
          iat: datetime
          token_type: str = Field(alias="type")
      
      # In create_access_token()
      def create_access_token(
          self,
          username: str,
          tenant_id: str,  # NEW parameter
          companies: List[str],
          user_id: Optional[int] = None,
          permissions: Optional[List[str]] = None
      ) -> str:
          payload = {
              "username": username,
              "user_id": user_id,
              "tenant_id": tenant_id,  # NEW
              "companies": companies or [],
              "permissions": permissions or ["read"],
              "exp": expire,
              "iat": now,
              "type": "access"
          }
          # ... rest same
      
  2. Update the login endpoint to determine tenant_id

    • File: reports-app/backend/app/main.py (auth router)
    • Logic:
      • Check the tenant_users table for user_id
      • If the user has access to multiple tenants, return the first one (default)
      • Alternatively, the user selects a tenant at login (future enhancement)
      # In the login endpoint
      
      # Get user's tenants from tenant_users table
      tenants = await tenant_config_loader.get_user_tenants(user_id)
      
      if not tenants:
          # Fallback: Use "default" tenant (backward compatibility)
          tenant_id = "default"
      else:
          # Use first tenant (or let user select in future)
          tenant_id = tenants[0]['tenant_id']
      
      # Create JWT with tenant_id
      access_token = jwt_handler.create_access_token(
          username=credentials.username,
          tenant_id=tenant_id,  # NEW
          companies=companies,
          user_id=user_id,
          permissions=["read", "reports"]
      )
      
  3. Implement TenantMiddleware to validate tenant access

    • File: shared/middleware/tenant_middleware.py
    • Responsibilities:
      • Extract tenant_id from the JWT token
      • Validate that the user has access to that tenant
      • Inject tenant_id into request.state.tenant_id
      class TenantMiddleware(BaseHTTPMiddleware):
          def __init__(self, app, excluded_paths=None):
              super().__init__(app)
              self.excluded_paths = set(excluded_paths or [])
      
          async def dispatch(self, request: Request, call_next):
              # Skip excluded paths
              if request.url.path in self.excluded_paths:
                  return await call_next(request)
      
              # Extract tenant_id from JWT token (already decoded by AuthMiddleware)
              user = getattr(request.state, 'user', None)
              if not user:
                  return JSONResponse(
                      status_code=401,
                      content={"detail": "Not authenticated"}
                  )
      
              tenant_id = user.get('tenant_id')
              if not tenant_id:
                  return JSONResponse(
                      status_code=400,
                      content={"detail": "Missing tenant_id in token"}
                  )
      
              # Validate tenant exists and is active
              tenant_config = await tenant_config_loader.get_tenant(tenant_id)
              if not tenant_config or not tenant_config.is_active:
                  return JSONResponse(
                      status_code=403,
                      content={"detail": f"Tenant {tenant_id} is not active"}
                  )
      
              # Validate user has access to this tenant
              user_id = user.get('user_id')
              has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
              if not has_access:
                  return JSONResponse(
                      status_code=403,
                      content={"detail": f"User {user_id} does not have access to tenant {tenant_id}"}
                  )
      
              # Inject tenant_id into request state
              request.state.tenant_id = tenant_id
              request.state.tenant_name = tenant_config.name
      
              # Continue request
              response = await call_next(request)
      
              # Log audit (awaited inline here; move it to a background task if it adds latency)
              await self._log_audit(request, response, tenant_id, user_id)
      
              return response
      
  4. Update AuthenticationMiddleware to work with TenantMiddleware

    • File: shared/auth/middleware.py
    • Middleware order:
      # In main.py. Starlette runs the middleware added last first,
      # so AuthenticationMiddleware wraps TenantMiddleware.
      app.add_middleware(TenantMiddleware, excluded_paths=["/", "/docs", "/health", ...])
      app.add_middleware(AuthenticationMiddleware, excluded_paths=["/", "/docs", "/health", ...])
      
    • Flow: AuthMiddleware decodes the JWT → TenantMiddleware validates tenant access
  5. Update all routers to use tenant_id from request.state

    • Files: reports-app/backend/app/routers/*.py
    • Pattern:
      # Before (single-tenant)
      async with oracle_pool.get_connection() as connection:
          # query...
      
      # After (multi-tenant)
      tenant_id = request.state.tenant_id  # Injected by TenantMiddleware
      async with multi_tenant_pool.get_connection(tenant_id) as connection:
          # query...
      
    • Example: dashboard.py
      @router.get("/{company_id}")
      async def get_dashboard(company_id: str, request: Request):
          tenant_id = request.state.tenant_id  # NEW
      
          async with multi_tenant_pool.get_connection(tenant_id) as connection:
              with connection.cursor() as cursor:
                  # ... rest same
      
  6. Update the Telegram bot for tenant support

    • File: reports-app/telegram-bot/app/auth/linking.py
    • Changes:
      • When linking, also store tenant_id in SQLite
      • The JWT token includes tenant_id
      • All requests to the backend carry the correct tenant_id
  7. Add a tenant selection endpoint (future enhancement)

    • Endpoint: POST /api/auth/select-tenant
    • Use case: a user with access to multiple tenants can switch between them
    • Response: a new JWT token with a different tenant_id

Verifiable Output

  • The JWT token includes a tenant_id field
  • The login endpoint generates a token with the correct tenant_id
  • TenantMiddleware extracts and validates tenant_id
  • Routers use multi_tenant_pool.get_connection(tenant_id)
  • A request for an invalid tenant returns 403 Forbidden
  • A user without access to the tenant gets 403 Forbidden
  • Test: pytest shared/tests/test_tenant_middleware.py -v
  • Test: log in as a user with access to tenant A, send a request for tenant B → 403
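
The order of checks in TenantMiddleware can be condensed into a pure function that is easy to unit-test before wiring it into the real middleware. This is a minimal sketch: `Tenant`, `tenant_access_status` and the in-memory `memberships` set are hypothetical stand-ins for the tenant config loader and the tenant_users table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    is_active: bool


def tenant_access_status(
    user: Optional[dict],
    tenants: dict[str, Tenant],
    memberships: set[tuple[int, str]],
) -> int:
    """Return the HTTP status the middleware would answer with (200 means allowed)."""
    if not user:
        return 401  # no decoded JWT on the request
    tenant_id = user.get("tenant_id")
    if not tenant_id:
        return 400  # token predates the tenant_id claim
    tenant = tenants.get(tenant_id)
    if tenant is None or not tenant.is_active:
        return 403  # unknown or deactivated tenant
    if (user.get("user_id"), tenant_id) not in memberships:
        return 403  # user is not assigned to this tenant
    return 200
```

Keeping the decision separate from the HTTP plumbing means the 401/400/403 cases from the list above can each be covered by a one-line test.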

PHASE 5: Cache & Audit Logging Integration (1-2 days)

Objective: Update the Redis cache to use the real tenant_id (not "default") and implement per-tenant audit logging.

Tasks

  1. Update the Redis cache to use the real tenant_id

    • File: shared/cache/redis_client.py (if it exists) or inline in the routers
    • Change: replace the hardcoded "default" with the real tenant_id
    • Before:
      cache_key = f"cache:default:dashboard:{company_id}"
      
    • After:
      tenant_id = request.state.tenant_id
      cache_key = f"cache:{tenant_id}:dashboard:{company_id}"
      
  2. Implement per-tenant cache invalidation

    • Use case: an admin updates tenant data; invalidate only that tenant's cache
    • Endpoint: DELETE /api/cache/{tenant_id} (admin only)
    • Logic:
      pattern = f"cache:{tenant_id}:*"
      # SCAN iterates incrementally; KEYS would block Redis on a large keyspace
      keys = list(redis_client.scan_iter(match=pattern, count=500))
      if keys:
          redis_client.delete(*keys)
      
  3. Implement audit logging in TenantMiddleware

    • File: shared/middleware/tenant_middleware.py
    • Logic: log every request to the audit_logs table
      async def _log_audit(self, request: Request, response: Response, tenant_id: str, user_id: int):
          # Extract info
          action = f"{request.method} {request.url.path}"
          status = "success" if response.status_code < 400 else "error"
          # The response returned by call_next() is streamed and has no .body, so record the status code
          error_message = None if status == "success" else f"HTTP {response.status_code}"
      
          # Insert into the audit_logs table
          await audit_logger.log(
              tenant_id=tenant_id,
              user_id=user_id,
              username=request.state.user.get('username'),
              action=action,
              resource=request.url.path,
              status=status,
              error_message=error_message,
              ip_address=request.client.host if request.client else None,
              user_agent=request.headers.get('user-agent')
          )
      
  4. Implement the AuditLogger helper class

    • File: shared/utils/audit_logger.py
    • Method:
      class AuditLogger:
          def __init__(self, tenant_db_url: str):
              self.db_url = tenant_db_url
      
          async def log(
              self,
              tenant_id: str,
              user_id: int,
              username: str,
              action: str,
              resource: str,
              status: str,
              error_message: Optional[str] = None,
              ip_address: Optional[str] = None,
              user_agent: Optional[str] = None
          ):
              # Insert into the audit_logs table ($n placeholders are PostgreSQL style; SQLite needs ?)
              query = """
                  INSERT INTO audit_logs (
                      tenant_id, user_id, username, action, resource,
                      status, error_message, ip_address, user_agent
                  ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)
              """
              await self._execute_query(query, [
                  tenant_id, user_id, username, action, resource,
                  status, error_message, ip_address, user_agent
              ])
      
  5. Add audit logs viewing endpoint

    • Endpoint: GET /api/audit-logs/{tenant_id} (tenant admin only)
    • Filters: ?user_id=123&start_date=2025-10-01&end_date=2025-10-31&status=error
    • Response: Paginated audit logs for tenant
  6. Add metrics per tenant (optional, future)

    • Metrics:
      • Request count per tenant
      • Response time per tenant
      • Error rate per tenant
      • Active users per tenant
    • Storage: Time-series database (InfluxDB) or Redis sorted sets
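
Returning to tasks 1 and 2 of this phase: the cache namespacing and per-tenant invalidation can be sketched with a plain dict standing in for Redis. `cache_key` and `invalidate_tenant` are hypothetical helper names; with a real Redis client the prefix match would be done by scanning for keys matching `cache:{tenant_id}:*`.

```python
def cache_key(tenant_id: str, *parts: str) -> str:
    """Build a namespaced key such as cache:<tenant_id>:dashboard:<company_id>."""
    return ":".join(("cache", tenant_id, *parts))


def invalidate_tenant(store: dict[str, str], tenant_id: str) -> int:
    """Delete every key in one tenant's namespace and return how many were removed."""
    # The trailing ':' keeps tenant 'a' from matching keys that belong to tenant 'ab'.
    prefix = f"cache:{tenant_id}:"
    doomed = [key for key in store if key.startswith(prefix)]
    for key in doomed:
        del store[key]
    return len(doomed)
```

Building every key through one helper is the design point here: a router that formats keys by hand is the most likely place for a missing tenant_id to slip in.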

Verifiable Output

  • Redis cache keys include the real tenant_id (not "default")
  • Cache isolation: tenant A's cache is not visible to tenant B
  • Per-tenant cache invalidation works
  • Audit logs are saved to the audit_logs table
  • Audit logs include tenant_id, user_id, action, status
  • The audit log viewing endpoint returns logs filtered per tenant
  • Test: pytest shared/tests/test_audit_logging.py -v
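
For the audit log viewing endpoint, the optional filters can be turned into a parameterized query so that tenant_id is always the first condition. This is a sketch against SQLite (which uses `?` placeholders); the reduced column set and the helper names `build_audit_query` and `fetch_audit_logs` are assumptions, and date filters are left out for brevity.

```python
import sqlite3
from typing import Any, Optional


def build_audit_query(
    tenant_id: str,
    user_id: Optional[int] = None,
    status: Optional[str] = None,
    limit: int = 50,
    offset: int = 0,
) -> tuple[str, list[Any]]:
    """Build a parameterized SELECT; the tenant filter is never optional."""
    clauses = ["tenant_id = ?"]
    params: list[Any] = [tenant_id]
    if user_id is not None:
        clauses.append("user_id = ?")
        params.append(user_id)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    sql = (
        "SELECT tenant_id, user_id, action, status FROM audit_logs "
        f"WHERE {' AND '.join(clauses)} ORDER BY id DESC LIMIT ? OFFSET ?"
    )
    return sql, params + [limit, offset]


def fetch_audit_logs(conn: sqlite3.Connection, tenant_id: str, **filters: Any) -> list[tuple]:
    """Run the query and return one page of rows for a single tenant."""
    sql, params = build_audit_query(tenant_id, **filters)
    return conn.execute(sql, params).fetchall()
```

Only column names and placeholders are interpolated into the SQL text; every user-supplied value travels as a bound parameter.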

PHASE 6: Deployment & Testing (3-4 days)

Objective: Deploy multi-tenant to all environments (dev, Docker, Windows) and test it end to end.

Tasks

  1. Update development environment (WSL)

    • Setup:
      # Create SQLite tenant DB
      sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql
      
      # Generate encryption key
      python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
      
      # Update .env
      echo "TENANT_DB_URL=sqlite:///data/tenant_config.db" >> .env
      echo "DB_ENCRYPTION_KEY=<generated_key>" >> .env
      
      # Create default tenant
      python shared/scripts/create_default_tenant.py
      
      # Start app
      ./start-dev.sh
      
    • Verification: login works with tenant "default"
  2. Update Docker deployment

    • File: docker-compose.yml
    • Changes:
      • Add roa-tenant-config-db service (PostgreSQL)
      • Update roa-backend env vars (TENANT_DB_URL, DB_ENCRYPTION_KEY)
      • Mount SSH keys volume read-only
    • Deployment:
      # Build images
      docker-compose build
      
      # Start services
      docker-compose up -d
      
      # Initialize tenant DB
      docker-compose exec roa-backend python shared/scripts/create_default_tenant.py
      
      # Verify
      docker-compose logs roa-backend | grep "tenant"
      
  3. Update Windows IIS deployment

    • Script: deployment/windows/scripts/Setup-TenantDB.ps1
    • Actions:
      • Install SQL Server Express OR the PostgreSQL Windows service
      • Create tenant_config database
      • Run schema SQL
      • Generate encryption key (store it in Windows Credential Manager)
      • Create default tenant
    • Manual steps:
      # Run setup
      .\deployment\windows\scripts\Setup-TenantDB.ps1
      
      # Update web.config with TENANT_DB_URL
      # Restart ROA2WEB-Backend service
      Restart-Service ROA2WEB-Backend
      
  4. Implement comprehensive integration tests

    • File: shared/tests/integration/test_multi_tenant_flow.py
    • Scenarios:
      • Login to tenant A → Get dashboard → Cache hit for tenant A
      • Login to tenant B → Get dashboard → Cache miss (different tenant)
      • User with access to tenant A tries tenant B → 403 Forbidden
      • A tenant's SSH tunnel is killed → Auto-recovery restarts it
      • Tenant inactive > 1h → Pool cleanup
    • Run:
      pytest shared/tests/integration/ -v --tb=short
      
  5. Implement load testing with multiple tenants

    • Tool: Locust or Apache Bench
    • Scenario: 3 tenants, 100 requests each, simultaneously
    • Script: shared/tests/load/test_multi_tenant_load.py
    • Metrics:
      • Response time per tenant (< 200ms avg)
      • Error rate (< 1%)
      • Pool usage (max connections per tenant)
      • SSH tunnel stability (no restarts)
  6. Create tenant onboarding guide

    • File: shared/docs/TENANT_ONBOARDING.md
    • Contents:
      • How to add a new tenant (manual SQL or admin UI)
      • SSH key setup for remote tenants
      • Assigning users to a tenant
      • Testing tenant connection
      • Troubleshooting common issues
  7. Create monitoring dashboard (optional)

    • Tools: Grafana + Prometheus
    • Metrics:
      • Active tenants count
      • Pool connections per tenant
      • Request rate per tenant
      • Error rate per tenant
      • SSH tunnel uptime per tenant

Verifiable Output

  • Development (WSL): multi-tenant works with the SQLite tenant DB
  • Docker: multi-tenant works with the PostgreSQL tenant DB
  • Windows IIS: multi-tenant works with SQL Server Express or PostgreSQL, whichever was installed
  • Integration tests pass (100% success rate)
  • Load tests: 3 tenants × 100 requests, < 200ms avg response time
  • SSH tunnels: No crashes during 1h load test
  • Cache isolation validated: Tenant A cache ≠ Tenant B cache
  • Audit logs are populated correctly for all requests
  • Documentation complete (onboarding guide, troubleshooting)

🔧 Connection Management

SSH Tunnel Configuration

Tenant with SSH Tunnel (Remote Client)

{
  "id": "client-a-uuid",
  "name": "Client A - Retail SRL",
  "connection_type": "ssh_tunnel",

  "oracle_host": "10.0.20.36",
  "oracle_port": 1521,
  "oracle_sid": "ROA",
  "oracle_user": "CLIENT_A_USER",
  "oracle_password_encrypted": "gAAAAABh...",

  "ssh_host": "83.103.197.79",
  "ssh_port": 22122,
  "ssh_user": "roa2web",
  "ssh_key_path": "/app/ssh-keys/client-a.key",
  "ssh_tunnel_local_port": 15261,

  "min_connections": 2,
  "max_connections": 10,
  "is_active": true
}

SSH Tunnel Flow:

Backend Process
    ↓
SSHTunnelManager.start_tunnel()
    ↓
subprocess: ssh -f -N -i /app/ssh-keys/client-a.key -L 15261:10.0.20.36:1521 -p 22122 roa2web@83.103.197.79
    ↓
Tunnel established: localhost:15261 → 10.0.20.36:1521
    ↓
OraclePool connects to localhost:15261
    ↓
Oracle queries routed through the SSH tunnel
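
The SSH command in this flow can be derived from the tenant configuration shown earlier. The sketch below is one possible shape: `build_ssh_tunnel_command` is a hypothetical helper, and it deliberately omits `-f` on the assumption that the manager wants the process it spawned to be the tunnel itself, so that health checks can poll it. `ExitOnForwardFailure` and `ServerAliveInterval` are standard OpenSSH options added here as a suggestion, not something the plan prescribes.

```python
def build_ssh_tunnel_command(cfg: dict) -> list[str]:
    """Build the argv list for a local port forward described by a tenant config."""
    forward = f"{cfg['ssh_tunnel_local_port']}:{cfg['oracle_host']}:{cfg['oracle_port']}"
    return [
        "ssh",
        "-N",                               # no remote command, forwarding only
        "-o", "ExitOnForwardFailure=yes",   # fail fast if the local port is taken
        "-o", "ServerAliveInterval=30",     # detect dead connections
        "-i", cfg["ssh_key_path"],
        "-L", forward,
        "-p", str(cfg["ssh_port"]),
        f"{cfg['ssh_user']}@{cfg['ssh_host']}",
    ]
```

Returning a list rather than a string lets the caller pass it straight to `subprocess.Popen` without a shell, which avoids quoting problems in host names or key paths.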

Direct Connection Configuration

Tenant with Direct Connection (LAN Client)

{
  "id": "client-b-uuid",
  "name": "Client B - Import Export SA",
  "connection_type": "direct",

  "oracle_host": "192.168.1.50",
  "oracle_port": 1521,
  "oracle_sid": "ROA",
  "oracle_user": "CLIENT_B_USER",
  "oracle_password_encrypted": "gAAAAABh...",

  "ssh_host": null,
  "ssh_port": null,
  "ssh_user": null,
  "ssh_key_path": null,
  "ssh_tunnel_local_port": null,

  "min_connections": 5,
  "max_connections": 20,
  "is_active": true
}

Direct Connection Flow:

Backend Process
    ↓
MultiTenantPoolManager.get_connection(tenant_id)
    ↓
Check connection_type: "direct" → Skip SSH tunnel
    ↓
OraclePool.create_pool(host=192.168.1.50, port=1521, ...)
    ↓
Oracle queries go directly to 192.168.1.50:1521
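
Both flows end with the pool connecting to a host and port; the only difference is which pair is used. A minimal sketch of that decision, with `resolve_oracle_endpoint` as a hypothetical helper name:

```python
def resolve_oracle_endpoint(cfg: dict) -> tuple[str, int]:
    """Return the (host, port) the Oracle pool should connect to for a tenant config."""
    kind = cfg["connection_type"]
    if kind == "direct":
        # LAN client: connect straight to the Oracle listener
        return cfg["oracle_host"], cfg["oracle_port"]
    if kind == "ssh_tunnel":
        # Remote client: connect to the local end of the tunnel
        return "localhost", cfg["ssh_tunnel_local_port"]
    raise ValueError(f"Unknown connection_type: {kind!r}")
```

Raising on an unknown connection_type is a deliberate choice in this sketch: a typo in the tenant DB should fail loudly at pool creation rather than silently fall back to a direct connection.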

Mixed Environment Setup

3 Tenants: 2 SSH, 1 Direct

| Tenant ID | Name | Type | Oracle Host | SSH Tunnel | Local Port |
|---|---|---|---|---|---|
| client-a-uuid | Client A - Retail SRL | ssh_tunnel | 10.0.20.36:1521 | 83.103.197.79:22122 | 15261 |
| client-b-uuid | Client B - Import Export SA | direct | 192.168.1.50:1521 | N/A | N/A |
| client-c-uuid | Client C - Distribution | ssh_tunnel | 10.0.20.36:1521 | 212.18.45.99:22 | 15262 |

Resource Usage:

Backend Memory:
├── Pool Client A: 2-10 connections × ~5MB = 10-50MB
├── Pool Client B: 5-20 connections × ~5MB = 25-100MB
├── Pool Client C: 2-10 connections × ~5MB = 10-50MB
└── Total: ~45-200MB (vs single-tenant ~10-50MB)

SSH Processes:
├── Tunnel Client A: ~10MB RAM
├── Tunnel Client C: ~10MB RAM
└── Total: ~20MB

Total Overhead: ~65-220MB (acceptable for multi-tenant SaaS)
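
The estimate above is simple arithmetic and can be recomputed whenever a tenant is added. The sketch below uses the same per-connection and per-tunnel figures (about 5 MB and 10 MB), which are rough planning numbers rather than measurements; `estimate_memory_mb` is a hypothetical helper.

```python
PER_CONNECTION_MB = 5   # rough planning figure per pooled Oracle connection
PER_TUNNEL_MB = 10      # rough planning figure per ssh process


def estimate_memory_mb(tenants: list[dict]) -> tuple[int, int]:
    """Return (low, high) memory estimates in MB for pools plus SSH tunnels."""
    low = sum(t["min_connections"] for t in tenants) * PER_CONNECTION_MB
    high = sum(t["max_connections"] for t in tenants) * PER_CONNECTION_MB
    tunnels = sum(1 for t in tenants if t["connection_type"] == "ssh_tunnel")
    return low + tunnels * PER_TUNNEL_MB, high + tunnels * PER_TUNNEL_MB


mixed_setup = [
    {"connection_type": "ssh_tunnel", "min_connections": 2, "max_connections": 10},
    {"connection_type": "direct", "min_connections": 5, "max_connections": 20},
    {"connection_type": "ssh_tunnel", "min_connections": 2, "max_connections": 10},
]
print(estimate_memory_mb(mixed_setup))  # pools: 45 to 200 MB, plus 20 MB for two tunnels
```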

🔒 Security Model

Encryption Strategy

Password Encryption in the Tenant DB

from cryptography.fernet import Fernet

# Generate encryption key (store it in .env)
encryption_key = Fernet.generate_key()  # Example: b'Xs3J7...'

# Encrypt password; encrypt() returns bytes, so decode to str before storing it in the tenant DB
fernet = Fernet(encryption_key)
encrypted_password = fernet.encrypt(b"oracle_password_plaintext").decode()
# Result: "gAAAAABh3J..."

# Decrypt password (at runtime)
decrypted_password = fernet.decrypt(encrypted_password.encode()).decode()

Security Properties:

  • Symmetric encryption (Fernet: AES-128 in CBC mode + HMAC-SHA256)
  • Encryption key in an environment variable (DB_ENCRYPTION_KEY)
  • Passwords encrypted at rest in the tenant DB
  • Decryption only at pool initialization (memory only)
  • NEVER: passwords in logs, error messages, or audit trails
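
One way to back up the "no passwords in logs" rule is a logging filter that masks anything that looks like a password assignment before it is written. This is a defensive sketch, not a replacement for keeping secrets out of log calls in the first place; `RedactSecretsFilter` and the regular expression are assumptions.

```python
import logging
import re

# Matches key=value or key: value pairs whose key ends in password/passwd/pwd
_SECRET = re.compile(r"(?i)(password|passwd|pwd)(\s*[=:]\s*)(\S+)")


class RedactSecretsFilter(logging.Filter):
    """Mask password-like values in the final, formatted log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its args first, then redact and freeze it
        record.msg = _SECRET.sub(r"\1\2***", record.getMessage())
        record.args = ()
        return True


def install_redaction(handler: logging.Handler) -> None:
    """Attach the filter to a handler so records from every logger pass through it."""
    handler.addFilter(RedactSecretsFilter())
```

Attaching the filter to handlers rather than to a single logger matters: logger-level filters are not applied to records that propagate up from child loggers.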

Tenant Isolation

Complete Isolation Between Tenants

┌─────────────────────────────────────────────────────────┐
│                    Tenant A                             │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │ Connection Pool (2-10 connections)               │   │
│  │ - oracle_host: 10.0.20.36 (via SSH tunnel)      │   │
│  │ - oracle_user: CLIENT_A_USER                     │   │
│  │ - Schema: CLIENT_A_SCHEMA                        │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │ Redis Cache Namespace                            │   │
│  │ - cache:client-a-uuid:*                          │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │ Audit Logs                                       │   │
│  │ - audit_logs WHERE tenant_id='client-a-uuid'     │   │
│  └──────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘

                    ❌ ZERO SHARING ❌

┌─────────────────────────────────────────────────────────┐
│                    Tenant B                             │
│  (Same structure, COMPLETELY ISOLATED)                  │
└─────────────────────────────────────────────────────────┘

Isolation Guarantees:

  1. Connection Pool: tenant A connections are NEVER used for tenant B queries
  2. Cache: Redis keys namespaced per tenant (cache:{tenant_id}:*)
  3. Audit Logs: queries filter on WHERE tenant_id = $1 (indexed for performance)
  4. SSH Tunnels: Separate processes, separate local ports (no crosstalk)
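
The "separate local ports" guarantee depends on no two tunnel tenants being configured with the same ssh_tunnel_local_port. A startup check along these lines could catch a misconfiguration before any tunnel is opened; `find_port_conflicts` is a hypothetical helper operating on tenant rows shaped like the JSON examples above.

```python
from collections import defaultdict


def find_port_conflicts(tenants: list[dict]) -> dict[int, list[str]]:
    """Map each local port claimed by more than one SSH-tunnel tenant to those tenant ids."""
    by_port: dict[int, list[str]] = defaultdict(list)
    for tenant in tenants:
        if tenant.get("connection_type") == "ssh_tunnel":
            by_port[tenant["ssh_tunnel_local_port"]].append(tenant["id"])
    return {port: ids for port, ids in by_port.items() if len(ids) > 1}
```

An empty result means the configuration is safe to start; anything else names the tenants that need a different port.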

JWT Token Structure

Token with Tenant ID (Signed)

{
  "username": "john.doe",
  "user_id": 123,
  "tenant_id": "client-a-uuid",
  "companies": ["COMP1", "COMP2"],
  "permissions": ["read", "reports"],
  "exp": 1735142400,
  "iat": 1735140600,
  "type": "access"
}

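Security Checks in TenantMiddleware:

# Condensed view of the checks. Inside BaseHTTPMiddleware.dispatch(), return a JSONResponse
# (as in the TenantMiddleware task) rather than raising HTTPException, which would surface as a 500.
# 1. Extract tenant_id from JWT (decoded by AuthMiddleware)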
tenant_id = request.state.user.get('tenant_id')

# 2. Validate tenant exists and is active
tenant_config = await tenant_config_loader.get_tenant(tenant_id)
if not tenant_config or not tenant_config.is_active:
    raise HTTPException(403, "Tenant not active")

# 3. Validate user has access to this tenant
user_id = request.state.user.get('user_id')
has_access = await tenant_config_loader.check_user_tenant_access(user_id, tenant_id)
if not has_access:
    raise HTTPException(403, "User does not have access to this tenant")

# 4. Inject tenant_id into request state (set once here; routers only read it)
request.state.tenant_id = tenant_id  # Routers use this

Attack Scenarios Prevented:

  • Tenant ID Tampering: the JWT is signed, so a client cannot modify tenant_id without invalidating the signature
  • Cross-Tenant Access: a user with access to tenant A cannot access tenant B (check in step 3)
  • Inactive Tenant Access: tenant deactivated → requests rejected (check in step 2)
  • SQL Injection via Tenant ID: the UUID is validated and used only in parameterized queries

🧪 Testing Strategy

Unit Tests

Test Coverage per Component

shared/tests/
├── test_tenant_config.py                    # TenantConfigLoader
│   ├── test_load_tenants()                  # Load all tenants from DB
│   ├── test_get_tenant()                    # Get specific tenant
│   ├── test_reload_tenant()                 # Reload tenant config
│   ├── test_encryption_decryption()         # Password encryption/decryption
│   └── test_default_tenant_fallback()       # Fallback to .env credentials
│
├── test_multi_tenant_pool.py               # MultiTenantPoolManager
│   ├── test_lazy_pool_initialization()      # Pool created only on first request
│   ├── test_pool_per_tenant()               # Separate pools per tenant
│   ├── test_pool_cleanup_inactive()         # Cleanup after 1h of inactivity
│   ├── test_tenant_reload()                 # Reload tenant without restart
│   └── test_connection_context_manager()    # get_connection() pattern
│
├── test_ssh_tunnel_manager.py              # SSHTunnelManager
│   ├── test_start_tunnel()                  # Start SSH tunnel subprocess
│   ├── test_stop_tunnel()                   # Stop SSH tunnel gracefully
│   ├── test_tunnel_health_check()           # Detect dead tunnels
│   ├── test_auto_restart()                  # Restart with exponential backoff
│   └── test_cleanup_all_tunnels()           # Kill all processes at shutdown
│
├── test_tenant_middleware.py               # TenantMiddleware
│   ├── test_extract_tenant_id()             # Extract tenant_id from JWT
│   ├── test_validate_tenant_access()        # User access validation
│   ├── test_inactive_tenant_blocked()       # Inactive tenant → 403
│   ├── test_cross_tenant_access_blocked()   # Tenant A user requests tenant B → 403
│   └── test_audit_logging()                 # Audit logs saved correctly
│
└── test_encryption.py                       # Encryption utils
    ├── test_fernet_encryption()             # Encrypt/decrypt passwords
    └── test_key_rotation()                  # Future: Key rotation support

Run Unit Tests:

cd shared/
pytest tests/ -v --cov=database --cov=middleware --cov=utils --cov-report=html

# Expected output:
# ✅ test_tenant_config.py::test_load_tenants PASSED
# ✅ test_multi_tenant_pool.py::test_lazy_pool_initialization PASSED
# ...
# Coverage: 85% (target: > 80%)

Integration Tests

End-to-End Scenarios

shared/tests/integration/
├── test_multi_tenant_flow.py                # Complete multi-tenant flow
│   ├── test_login_with_tenant_a()           # Login → JWT with tenant A
│   ├── test_dashboard_tenant_a()            # Dashboard query tenant A
│   ├── test_cache_hit_tenant_a()            # Cache hit for tenant A
│   ├── test_cross_tenant_isolation()        # Tenant A cache ≠ Tenant B cache
│   └── test_audit_logs_populated()          # Audit logs saved per tenant
│
├── test_ssh_tunnel_resilience.py           # SSH tunnel stability
│   ├── test_tunnel_auto_recovery()          # Kill tunnel → Auto-restart
│   ├── test_multiple_tunnels_parallel()     # 3 tenants with SSH tunnels simultaneously
│   └── test_tunnel_port_conflicts()         # Port allocation unique
│
└── test_deployment_scenarios.py            # Deployment compatibility
    ├── test_development_sqlite()            # Development with SQLite tenant DB
    ├── test_docker_postgresql()             # Docker with PostgreSQL tenant DB
    └── test_backward_compatibility()        # Tenant "default" works

Run Integration Tests:

# Requires: PostgreSQL tenant DB running + Redis + Oracle test server
docker-compose -f docker-compose.test.yml up -d

pytest shared/tests/integration/ -v --tb=short

# Expected output:
# ✅ test_multi_tenant_flow.py::test_login_with_tenant_a PASSED (0.5s)
# ✅ test_multi_tenant_flow.py::test_cache_hit_tenant_a PASSED (0.2s)
# ...

Load Testing

Performance Validation with Multiple Tenants

# shared/tests/load/test_multi_tenant_load.py

from locust import HttpUser, task, between
import random

class MultiTenantUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        # Login to random tenant
        self.tenant = random.choice(['client-a-uuid', 'client-b-uuid', 'client-c-uuid'])
        response = self.client.post('/api/auth/login', json={
            'username': f'user_{self.tenant}',
            'password': 'test_password'
        })
        self.token = response.json()['access_token']
        self.client.headers.update({'Authorization': f'Bearer {self.token}'})

    @task(3)
    def get_dashboard(self):
        self.client.get('/api/dashboard/COMP1')

    @task(2)
    def get_invoices(self):
        self.client.get('/api/invoices/COMP1')

    @task(1)
    def get_treasury(self):
        self.client.get('/api/treasury/COMP1')

Run Load Test:

locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001

# Scenario: 3 tenants × 100 users = 300 concurrent users
# Duration: 10 minutes
# Expected:
# - Response time: < 200ms (p95)
# - Error rate: < 1%
# - SSH tunnels: No restarts
# - Pool connections: Max 10 per tenant (no exhaustion)
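
The pass/fail criteria above (p95 under 200 ms, error rate under 1%) can be checked mechanically from exported request timings. A minimal sketch using the nearest-rank percentile definition; `p95` and `meets_targets` are hypothetical helpers, and the thresholds are the ones stated in this plan.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of response times."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def meets_targets(samples_ms: list[float], errors: int, total: int,
                  p95_limit_ms: float = 200.0, max_error_rate: float = 0.01) -> bool:
    """True when both the latency and the error-rate targets are met."""
    return p95(samples_ms) < p95_limit_ms and errors / total < max_error_rate
```

Running this per tenant rather than over the combined samples is what reveals a single slow tenant (for example one behind a congested SSH tunnel) that an aggregate figure would hide.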

📊 Migration Checklist

Pre-Migration

  • Backup production database

    # Backup Oracle database
    expdp username/password@ROA directory=BACKUP dumpfile=pre_migration.dmp
    
    # Backup existing .env files
    cp reports-app/backend/.env reports-app/backend/.env.backup
    
  • Document current single-tenant config

    # Save current credentials (plaintext secrets: keep this file out of version control)
    cat reports-app/backend/.env > docs/pre_migration_env.txt
    
    # Save current SSH tunnel config
    ./ssh_tunnel.sh status > docs/pre_migration_ssh.txt
    
  • Test the deployment in a non-production environment

    # Create staging environment
    docker-compose -f docker-compose.staging.yml up -d
    
    # Deploy multi-tenant to staging
    # ... follow migration steps ...
    
    # Validate staging works
    curl http://staging.roa2web.local/api/health
    
  • Generate DB encryption key

    python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
    # Save in .env: DB_ENCRYPTION_KEY=<generated_key>
    
  • Prepare tenant configuration

    • Create tenant DB (PostgreSQL/SQLite)
    • Populate it with tenant "default" (existing credentials)
    • Add SSH keys for remote tenants

Migration Steps (Production)

Step 1: Deploy Tenant Config DB (30 min)

# Docker deployment
docker-compose up -d roa-tenant-config-db

# Verify DB is running
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c '\dt'

# Run schema (only needed if the init script did not run, e.g. on a pre-existing data volume)
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -f /docker-entrypoint-initdb.d/schema.sql

Step 2: Populate Tenant "default" (15 min)

# Run migration script
docker-compose exec roa-backend python shared/scripts/create_default_tenant.py

# Verify tenant created
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, connection_type FROM tenants;'

Step 3: Deploy Backend with MultiTenantPoolManager (45 min)

# Update .env with tenant DB URL
echo "TENANT_DB_URL=postgresql://tenant_admin:password@roa-tenant-config-db:5432/tenant_config" >> .env

# Rebuild backend image
docker-compose build roa-backend

# Deploy new backend (this recreates the container; expect a brief restart, not a rolling update)
docker-compose up -d roa-backend

# Wait for health check
watch -n 2 'curl -s http://localhost:8001/health | jq'

Step 4: Verify Tenant "default" Works (15 min)

# Test login (should work exactly as before)
curl -X POST http://localhost:8001/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username": "test_user", "password": "test_password"}'

# Response should include tenant_id: "default"
# {
#   "access_token": "eyJ...",
#   "user": {
#     "tenant_id": "default",
#     ...
#   }
# }

# Test dashboard (should work as before)
curl -H "Authorization: Bearer $TOKEN" http://localhost:8001/api/dashboard/COMP1

Step 5: Add New Tenants (One by One)

# Add tenant A (SSH tunnel)
docker-compose exec roa-backend python shared/scripts/add_tenant.py \
  --name "Client A - Retail SRL" \
  --connection-type ssh_tunnel \
  --oracle-host 10.0.20.36 \
  --oracle-user CLIENT_A_USER \
  --oracle-password "encrypted_password" \
  --ssh-host 83.103.197.79 \
  --ssh-port 22122 \
  --ssh-key /app/ssh-keys/client-a.key \
  --ssh-local-port 15261

# Add users to tenant A
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
  "INSERT INTO tenant_users (tenant_id, user_id, username) VALUES ('client-a-uuid', 123, 'john.doe');"

# Test tenant A login
curl -X POST http://localhost:8001/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username": "john.doe", "password": "password"}'

# Verify JWT includes tenant_id: "client-a-uuid"

Step 6: Monitor Logs per Tenant (Ongoing)

# Monitor all tenant logs
docker-compose logs -f roa-backend | grep "tenant_id"

# Monitor SSH tunnels
docker-compose logs -f roa-backend | grep "SSH tunnel"

# Monitor pool connections
docker-compose logs -f roa-backend | grep "pool"

# Check audit logs
docker-compose exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c \
  'SELECT tenant_id, username, action, status, created_at FROM audit_logs ORDER BY created_at DESC LIMIT 20;'

Step 7: Performance Validation (1-2h)

# Run load test
locust -f shared/tests/load/test_multi_tenant_load.py --host=http://localhost:8001 --headless --users=100 --spawn-rate=10 --run-time=1h

# Monitor metrics
# - Response time: < 200ms (p95)
# - Error rate: < 1%
# - Pool usage: < 80% per tenant
# - SSH tunnels: No restarts

Post-Migration

  • All tenants functional

    • Tenant "default" works (backward compatibility)
    • Tenant A works (SSH tunnel)
    • Tenant B works (direct connection)
  • No performance degradation

    • Response time same as single-tenant (< 10% overhead)
    • No connection pool exhaustion
    • SSH tunnels stable (no auto-restarts)
  • Audit logs populated

    # Verify audit logs per tenant
    SELECT tenant_id, COUNT(*) FROM audit_logs GROUP BY tenant_id;
    
  • Documentation updated

    • Update CLAUDE.md with the multi-tenant architecture
    • Update deployment guides (Docker, Windows)
    • Create tenant onboarding guide
  • Monitoring dashboards

    • Grafana dashboard per tenant
    • Alerts for pool exhaustion, SSH tunnel failures

🎯 Deployment Guides

Development Setup (WSL/Local)

Prerequisites:

  • Python 3.11+
  • SQLite3
  • Redis server
  • SSH access to the Oracle server (for tenants using an SSH tunnel)

Setup Steps:

# 1. Create SQLite tenant DB
mkdir -p data
sqlite3 data/tenant_config.db < shared/schemas/tenant_config_schema.sql

# 2. Generate encryption key
python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" > .encryption_key
DB_ENCRYPTION_KEY=$(cat .encryption_key)

# 3. Update .env
cat >> reports-app/backend/.env << EOF
# Tenant Configuration
TENANT_DB_URL=sqlite:///data/tenant_config.db
DB_ENCRYPTION_KEY=$DB_ENCRYPTION_KEY
EOF

# 4. Create default tenant (run from the repo root so the relative sqlite:///data/... path resolves)
python shared/scripts/create_default_tenant.py

# 5. Start Redis
redis-server --daemonize yes

# 6. Start application
./start-dev.sh

# 7. Verify
curl http://localhost:8001/health
# Should return: {"database": "connected", "tenants_loaded": 1}

Add New Tenant (Development):

# Add tenant via SQL
sqlite3 data/tenant_config.db << EOF
INSERT INTO tenants (
  id, name, connection_type,
  oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted,
  ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port
) VALUES (
  'dev-tenant-uuid',
  'Dev Tenant - Test Company',
  'ssh_tunnel',
  '10.0.20.36',
  1521,
  'ROA',
  'DEV_USER',
  'encrypted_password_here',
  '83.103.197.79',
  22122,
  'roa2web',
  '/tmp/roa_oracle_server',
  15263
);

-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('dev-tenant-uuid', 999, 'dev_user');
EOF

# Restart backend
pkill -f "uvicorn app.main:app"
./start-dev.sh
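
The `'encrypted_password_here'` placeholder above has to be replaced with a Fernet token produced with the same `DB_ENCRYPTION_KEY`. A minimal sketch of how that value could be generated, assuming the `cryptography` package already used in the setup steps (the helper names are hypothetical, not the project's actual API):

```python
from cryptography.fernet import Fernet


def encrypt_oracle_password(plaintext: str, key: bytes) -> str:
    """Encrypt a password into the text form stored in oracle_password_encrypted."""
    return Fernet(key).encrypt(plaintext.encode("utf-8")).decode("ascii")


def decrypt_oracle_password(token: str, key: bytes) -> str:
    """Reverse of encrypt_oracle_password; raises InvalidToken on a wrong key."""
    return Fernet(key).decrypt(token.encode("ascii")).decode("utf-8")
```

In practice the key would be read from the `DB_ENCRYPTION_KEY` environment variable rather than passed around in code.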

Docker Deployment (Proxmox LXC)

Prerequisites:

  • Docker 24+
  • Docker Compose 2.20+
  • 4GB RAM minimum
  • PostgreSQL 15 container

docker-compose.multi-tenant.yml:

version: '3.8'

services:
  # Tenant Configuration Database
  roa-tenant-config-db:
    image: postgres:15-alpine
    container_name: roa-tenant-config-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: tenant_config
      POSTGRES_USER: tenant_admin
      POSTGRES_PASSWORD: ${TENANT_DB_PASSWORD}
    volumes:
      - tenant-config-data:/var/lib/postgresql/data
      - ./shared/schemas/tenant_config_schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
    networks:
      - roa-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U tenant_admin -d tenant_config"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Backend (Multi-Tenant)
  roa-backend:
    build:
      context: .
      dockerfile: ./reports-app/backend/Dockerfile
    image: roa2web/backend:multi-tenant
    container_name: roa-backend
    restart: unless-stopped
    environment:
      # Tenant Configuration
      - TENANT_DB_URL=postgresql://tenant_admin:${TENANT_DB_PASSWORD}@roa-tenant-config-db:5432/tenant_config
      - DB_ENCRYPTION_KEY=${DB_ENCRYPTION_KEY}

      # JWT Configuration
      - JWT_SECRET_KEY=${JWT_SECRET_KEY}

      # Redis Cache
      - REDIS_URL=redis://:${REDIS_PASSWORD}@roa-redis:6379/0
    volumes:
      # SSH keys for tenant tunnels (read-only)
      - ./ssh-keys:/app/ssh-keys:ro
      - backend-logs:/app/logs
    networks:
      - roa-network
    depends_on:
      roa-tenant-config-db:
        condition: service_healthy
      roa-redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Redis Cache
  roa-redis:
    image: redis:7-alpine
    container_name: roa-redis
    restart: unless-stopped
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    networks:
      - roa-network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Frontend (unchanged)
  roa-frontend:
    build:
      context: ./reports-app/frontend
      dockerfile: Dockerfile
    image: roa2web/frontend:latest
    container_name: roa-frontend
    restart: unless-stopped
    networks:
      - roa-network

  # Nginx Gateway (unchanged)
  roa-gateway:
    build:
      context: ./nginx
      dockerfile: Dockerfile
    image: roa2web/nginx-gateway:latest
    container_name: roa-gateway
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    networks:
      - roa-network
    depends_on:
      - roa-backend
      - roa-frontend

volumes:
  tenant-config-data:
  redis-data:
  backend-logs:

networks:
  roa-network:
    driver: bridge

Deployment:

# 1. Create .env file (hex secrets stay on one line and are safe inside connection URLs)
cat > .env << EOF
TENANT_DB_PASSWORD=$(openssl rand -hex 32)
DB_ENCRYPTION_KEY=$(python3 -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
JWT_SECRET_KEY=$(openssl rand -hex 64)
REDIS_PASSWORD=$(openssl rand -hex 32)
EOF
chmod 600 .env

# 2. Prepare SSH keys directory
mkdir -p ssh-keys
chmod 700 ssh-keys
cp /path/to/client-a.key ssh-keys/client-a.key
chmod 400 ssh-keys/client-a.key

# 3. Build and start services
docker-compose -f docker-compose.multi-tenant.yml build
docker-compose -f docker-compose.multi-tenant.yml up -d

# 4. Wait for tenant DB initialization (grep -m1 stops at the first match)
docker-compose -f docker-compose.multi-tenant.yml logs -f roa-tenant-config-db | grep -m1 "database system is ready"

# 5. Create default tenant
docker-compose -f docker-compose.multi-tenant.yml exec roa-backend python shared/scripts/create_default_tenant.py

# 6. Verify deployment
curl http://localhost/api/health
# {"api": "healthy", "database": "connected", "tenants_loaded": 1}
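
The generated passwords end up inside connection URLs (`TENANT_DB_URL`, `REDIS_URL`), so characters such as `/`, `+`, `=` or `@` in a secret can break URL parsing. The sketch below shows two ways to stay safe, under the assumption that secrets are generated from Python: produce hex-only secrets, or percent-encode the password when building the URL. The helper names are hypothetical.

```python
import secrets
from urllib.parse import quote


def make_secret(n_bytes: int = 32) -> str:
    """Hex secret: only [0-9a-f], so it needs no escaping in URLs or .env files."""
    return secrets.token_hex(n_bytes)


def tenant_db_url(password: str, host: str = "roa-tenant-config-db") -> str:
    """Build the tenant DB URL with the password percent-encoded."""
    # safe="" also encodes '/', which quote() leaves untouched by default
    encoded = quote(password, safe="")
    return f"postgresql://tenant_admin:{encoded}@{host}:5432/tenant_config"
```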

Add New Tenant:

# Connect to tenant DB
docker-compose -f docker-compose.multi-tenant.yml exec roa-tenant-config-db psql -U tenant_admin -d tenant_config

-- Insert tenant (with encrypted password)
INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, ssh_host, ssh_port, ssh_user, ssh_key_path, ssh_tunnel_local_port, is_active)
VALUES (
  'client-a-uuid',
  'Client A - Retail SRL',
  'ssh_tunnel',
  '10.0.20.36',
  1521,
  'ROA',
  'CLIENT_A_USER',
  'gAAAAABh...',  -- Fernet encrypted password
  '83.103.197.79',
  22122,
  'roa2web',
  '/app/ssh-keys/client-a.key',
  15261,
  TRUE
);

-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('client-a-uuid', 123, 'john.doe');

\q

# Reload backend (or wait for auto-reload)
docker-compose -f docker-compose.multi-tenant.yml restart roa-backend

Windows IIS Deployment

Prerequisites:

  • Windows Server 2019+
  • IIS 10+
  • SQL Server Express 2019+ OR PostgreSQL 15 for Windows
  • Python 3.11+ (Windows installer)
  • Redis for Windows (MSI installer)

Setup Script: deployment/windows/scripts/Setup-MultiTenant.ps1

# Run as Administrator
.\deployment\windows\scripts\Setup-MultiTenant.ps1

<#
This script will:
1. Install SQL Server Express 2019
2. Create tenant_config database
3. Run schema SQL
4. Generate encryption key (saved in Windows Credential Manager)
5. Create default tenant
6. Update ROA2WEB backend service
7. Restart IIS
#>

Manual Setup:

# 1. Install SQL Server Express
# Download from: https://www.microsoft.com/en-us/sql-server/sql-server-downloads
# Install with default instance name: SQLEXPRESS

# 2. Create tenant database
sqlcmd -S localhost\SQLEXPRESS -E -Q "CREATE DATABASE tenant_config"

# 3. Run schema
sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E -i shared\schemas\tenant_config_schema.sql

# 4. Generate encryption key
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" | Out-File -FilePath .encryption_key -NoNewline

# 5. Store key in Windows Credential Manager
cmdkey /generic:ROA2WEB_DB_ENCRYPTION_KEY /user:system "/pass:$(Get-Content .encryption_key)"

# 6. Update backend .env
@"
TENANT_DB_URL=mssql+pyodbc://localhost\SQLEXPRESS/tenant_config?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes
DB_ENCRYPTION_KEY=$(Get-Content .encryption_key)
"@ | Add-Content -Path C:\inetpub\wwwroot\roa2web\backend\.env

# 7. Create default tenant
cd C:\inetpub\wwwroot\roa2web
python shared\scripts\create_default_tenant.py

# 8. Restart backend service
Restart-Service ROA2WEB-Backend

# 9. Verify
curl http://localhost:8000/health

Add New Tenant (Windows):

# Connect to SQL Server
sqlcmd -S localhost\SQLEXPRESS -d tenant_config -E

-- Insert tenant
INSERT INTO tenants (id, name, connection_type, oracle_host, oracle_port, oracle_sid, oracle_user, oracle_password_encrypted, is_active)
VALUES (
  'client-b-uuid',
  'Client B - Import Export SA',
  'direct',
  '192.168.1.50',
  1521,
  'ROA',
  'CLIENT_B_USER',
  'gAAAAABh...',  -- Encrypted password
  1
);

-- Add user to tenant
INSERT INTO tenant_users (tenant_id, user_id, username)
VALUES ('client-b-uuid', 456, 'jane.smith');

GO
EXIT

# Restart backend
Restart-Service ROA2WEB-Backend

📝 Configuration Examples

Tenant Config: SSH Tunnel (Development)

{
  "id": "dev-client-uuid",
  "name": "Development Client - Test Company",
  "connection_type": "ssh_tunnel",

  "oracle_host": "10.0.20.36",
  "oracle_port": 1521,
  "oracle_sid": "ROA",
  "oracle_user": "DEV_USER",
  "oracle_password_encrypted": "gAAAAABhXj7Ks3J...",

  "ssh_host": "83.103.197.79",
  "ssh_port": 22122,
  "ssh_user": "roa2web",
  "ssh_key_path": "/tmp/roa_oracle_server",
  "ssh_tunnel_local_port": 15260,

  "min_connections": 2,
  "max_connections": 5,
  "is_active": true
}

Tenant Config: Direct Connection (Production)

{
  "id": "prod-client-uuid",
  "name": "Production Client - Enterprise Corp",
  "connection_type": "direct",

  "oracle_host": "192.168.100.50",
  "oracle_port": 1521,
  "oracle_sid": "ROA",
  "oracle_user": "PROD_USER",
  "oracle_password_encrypted": "gAAAAABhXj8Nm4K...",

  "ssh_host": null,
  "ssh_port": null,
  "ssh_user": null,
  "ssh_key_path": null,
  "ssh_tunnel_local_port": null,

  "min_connections": 5,
  "max_connections": 20,
  "is_active": true
}
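
The two configurations above differ mainly in whether the SSH fields are filled in. A minimal sketch of how a loader could validate that rule before opening any connection; the `TenantConfig` dataclass and its `validate` method are assumptions for illustration (field names follow the JSON above, and the defaults follow the min=2/max=10 pool limits mentioned in this plan):

```python
from dataclasses import dataclass
from typing import Optional

SSH_FIELDS = ("ssh_host", "ssh_port", "ssh_user", "ssh_key_path", "ssh_tunnel_local_port")


@dataclass
class TenantConfig:
    id: str
    name: str
    connection_type: str
    oracle_host: str
    oracle_port: int
    oracle_sid: str
    oracle_user: str
    oracle_password_encrypted: str
    ssh_host: Optional[str] = None
    ssh_port: Optional[int] = None
    ssh_user: Optional[str] = None
    ssh_key_path: Optional[str] = None
    ssh_tunnel_local_port: Optional[int] = None
    min_connections: int = 2
    max_connections: int = 10
    is_active: bool = True

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the config is usable."""
        errors = []
        if self.connection_type not in ("ssh_tunnel", "direct"):
            errors.append(f"unknown connection_type: {self.connection_type}")
        missing = [name for name in SSH_FIELDS if getattr(self, name) is None]
        if self.connection_type == "ssh_tunnel" and missing:
            errors.append("ssh_tunnel requires: " + ", ".join(missing))
        if not 1 <= self.min_connections <= self.max_connections:
            errors.append("need 1 <= min_connections <= max_connections")
        return errors
```

A JSON document like the ones above can be loaded with `TenantConfig(**json.loads(text))`.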

Tenant Config: Docker Deployment (PostgreSQL Tenant DB)

.env for Docker Compose:

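# Tenant Configuration Database
# (placeholder values: generate real secrets as in the Deployment steps; a real Fernet key is 44 characters)
TENANT_DB_PASSWORD=SecurePostgresPassword123!
DB_ENCRYPTION_KEY=Xs3J7vN2pQ8kR9mT1wY5zC6bA4dF0gH=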

# Backend
JWT_SECRET_KEY=YourVerySecureJWTSecretKeyHere123456789

# Redis
REDIS_PASSWORD=SecureRedisPassword456!

User-Tenant Mapping Example

-- User john.doe has access to 2 tenants
INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES
('client-a-uuid', 123, 'john.doe', TRUE),
('client-b-uuid', 123, 'john.doe', FALSE);

-- User jane.smith has access to 1 tenant
INSERT INTO tenant_users (tenant_id, user_id, username, is_admin) VALUES
('client-b-uuid', 456, 'jane.smith', FALSE);

-- Query: Get all tenants for user
SELECT t.id, t.name, tu.is_admin
FROM tenants t
JOIN tenant_users tu ON t.id = tu.tenant_id
WHERE tu.user_id = 123 AND t.is_active = TRUE;

-- Result:
-- | id             | name                          | is_admin |
-- |----------------|-------------------------------|----------|
-- | client-a-uuid  | Client A - Retail SRL         | TRUE     |
-- | client-b-uuid  | Client B - Import Export SA   | FALSE    |
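
The mapping query can be tried locally against an in-memory SQLite database. This is a sketch under stated assumptions: the two tables are reduced to the columns the query needs (the real schema lives in `shared/schemas/tenant_config_schema.sql` and is not reproduced here), booleans are stored as 0/1, and `tenants_for_user` is a hypothetical helper name.

```python
import sqlite3


def tenants_for_user(conn: sqlite3.Connection, user_id: int) -> list[tuple]:
    """All active tenants the user is mapped to, ordered by tenant id."""
    return conn.execute(
        """
        SELECT t.id, t.name, tu.is_admin
        FROM tenants t
        JOIN tenant_users tu ON t.id = tu.tenant_id
        WHERE tu.user_id = ? AND t.is_active = 1
        ORDER BY t.id
        """,
        (user_id,),
    ).fetchall()


# Simplified schema and the sample rows from the example above
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tenants (id TEXT PRIMARY KEY, name TEXT, is_active INTEGER);
CREATE TABLE tenant_users (
    tenant_id TEXT, user_id INTEGER, username TEXT, is_admin INTEGER,
    PRIMARY KEY (tenant_id, user_id)
);
INSERT INTO tenants VALUES ('client-a-uuid', 'Client A - Retail SRL', 1);
INSERT INTO tenants VALUES ('client-b-uuid', 'Client B - Import Export SA', 1);
INSERT INTO tenant_users VALUES ('client-a-uuid', 123, 'john.doe', 1);
INSERT INTO tenant_users VALUES ('client-b-uuid', 123, 'john.doe', 0);
INSERT INTO tenant_users VALUES ('client-b-uuid', 456, 'jane.smith', 0);
""")
```

The bound parameter (`?`) keeps the user id out of the SQL text, which matters once the value comes from a JWT claim.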

🎯 Success Criteria

Definition of Done

Functional:

  • The application supports at least 3 concurrent tenants
  • Tenant identification from the JWT works correctly
  • SSH tunnels start and stop automatically per tenant
  • Connection pools are isolated per tenant (zero sharing)
  • Cache isolation between tenants (namespace per tenant)
  • No cross-tenant data leakage in audit logs or cache
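
Cache isolation through a per-tenant namespace can be as simple as a key prefix. The sketch below assumes a `tenant:<tenant_id>:<...>` key layout, which is an illustrative choice and not a format fixed by this plan; both helper names are hypothetical.

```python
def tenant_cache_key(tenant_id: str, *parts: str) -> str:
    """Build a cache key that always starts with the tenant namespace."""
    if not tenant_id or ":" in tenant_id:
        # A ':' inside the id would let one tenant's keys overlap another's prefix
        raise ValueError("tenant_id must be non-empty and must not contain ':'")
    return ":".join(("tenant", tenant_id, *parts))


def tenant_key_pattern(tenant_id: str) -> str:
    """Glob pattern matching every key of one tenant (e.g. for invalidation)."""
    return tenant_cache_key(tenant_id, "*")
```

Routing every cache read and write through one helper like this makes it harder for a router to build an un-namespaced key by accident.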

Deployment:

  • Works in all deployment scenarios (dev/WSL, Docker, Windows IIS)
  • Backward compatibility: tenant "default" behaves exactly like the current single-tenant setup
  • Zero downtime for existing tenants when a new tenant is added (lazy loading)
  • Migration script completes in < 2h (staging environment)

Performance:

  • Overhead < 10% vs single-tenant (measured in load testing)
  • Response time < 200ms (p95) with 3 tenants × 100 requests
  • No connection pool exhaustion (max 80% usage per tenant)
  • SSH tunnels stable (zero auto-restarts in a 1h load test)

Security:

  • Passwords encrypted at rest in the tenant DB (Fernet: AES-128-CBC with HMAC-SHA256)
  • SSH keys mounted read-only in Docker volumes
  • JWT tenant_id is signed (cannot be modified by the client)
  • Tenant access validation in middleware (403 for unauthorized access)
  • Audit logging of ALL operations per tenant

Testing:

  • Unit tests: > 80% code coverage
  • Integration tests: All scenarios pass (login, dashboard, cross-tenant isolation)
  • Load tests: 3 tenants × 100 users, 10 minutes, < 1% error rate
  • Manual testing: Tenant onboarding guide validated

Documentation:

  • CLAUDE.md updated with the multi-tenant architecture
  • Deployment guides (dev, Docker, Windows) complete
  • Tenant onboarding guide created
  • Troubleshooting guide created
  • API documentation updated (Swagger/ReDoc)

⚠️ Risks & Mitigations

Risk: SSH Tunnel Instability

Scenario: The SSH tunnel process crashes, or the network between the backend and the SSH server is interrupted.

Impact: The affected tenant cannot reach its Oracle DB (requests fail with a connection error).

Mitigation:

  1. Health Checks: Background task checks tunnel health every 60s
  2. Auto-Restart: Restart the tunnel automatically with exponential backoff (5s, 10s, 20s, max 60s)
  3. Monitoring: Alert if a tunnel is down > 5 minutes
  4. Fallback: Graceful degradation; other tenants keep working normally

Detection:

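import asyncio

async def monitor_ssh_tunnels(interval_s: int = 60):
    while True:  # health check loop, one pass every interval_s seconds
        # Copy the ids: restart_tunnel() may change the collection while we iterate
        for tenant_id in list(ssh_tunnel_manager.tunnels):
            if not await ssh_tunnel_manager.check_tunnel_health(tenant_id):
                logger.error(f"Tunnel down for tenant {tenant_id}, restarting...")
                await ssh_tunnel_manager.restart_tunnel(tenant_id)
        await asyncio.sleep(interval_s)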
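
The exponential backoff from the mitigation list (5s, 10s, 20s, capped at 60s) can be sketched as a small generator plus a retry loop. `backoff_delays` and `restart_with_backoff` are hypothetical names, and `restart` stands for any coroutine function that returns True once the tunnel is back up.

```python
import asyncio


def backoff_delays(base_s: float = 5.0, cap_s: float = 60.0):
    """Yield base, 2*base, 4*base, ... never exceeding cap_s."""
    delay = min(base_s, cap_s)
    while True:
        yield delay
        delay = min(delay * 2, cap_s)


async def restart_with_backoff(restart, max_attempts: int = 5, sleep=asyncio.sleep) -> bool:
    """Call restart() until it succeeds, sleeping with backoff between failures."""
    for _attempt, delay in zip(range(max_attempts), backoff_delays()):
        if await restart():
            return True
        await sleep(delay)
    return False
```

Passing `sleep` as a parameter is a design choice that keeps the retry logic testable without real waiting.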

Risk: Connection Pool Exhaustion

Scenario: A tenant sends a burst of requests, the pool reaches max connections (e.g. 10), and new requests block or time out.

Impact: Slow responses or 503 Service Unavailable for that tenant.

Mitigation:

  1. Pool Limits: Set realistic limits per tenant (min=2, max=10 default, configurable)
  2. Queue Timeout: in python-oracledb, getmode=POOL_GETMODE_TIMEDWAIT with wait_timeout (e.g. 30s, given in milliseconds); POOL_GETMODE_WAIT alone waits indefinitely
  3. Rate Limiting: Limit requests per user/tenant (e.g. 100 req/min)
  4. Monitoring: Alert if pool usage > 80% for > 5 minutes
  5. Scaling: Increase max_connections for high-traffic tenants

Configuration:

-- In the tenant config DB (SQL)
UPDATE tenants SET max_connections = 20 WHERE id = 'high-traffic-tenant-uuid';

# Then reload the tenant in the backend (Python)
await multi_tenant_pool.reload_tenant('high-traffic-tenant-uuid')
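
The rate-limiting mitigation (for example 100 requests per minute per tenant) could be sketched as an in-memory sliding window. The class below is an assumption for illustration only: it is per-process, so a deployment with several backend replicas would need a shared store such as Redis instead.

```python
import time
from collections import defaultdict, deque


class TenantRateLimiter:
    """Allow at most `limit` requests per tenant within any `window_s` seconds."""

    def __init__(self, limit: int = 100, window_s: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._hits = defaultdict(deque)  # tenant_id -> timestamps of accepted requests

    def allow(self, tenant_id: str) -> bool:
        now = self.clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have left the window
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request rejected by `allow` would typically be answered with HTTP 429 before it ever asks the pool for a connection.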

Risk: Tenant Credential Leak

Scenario: An attacker gains access to the tenant DB or the logs and can read Oracle passwords.

Impact: Data breach; the attacker can access the Oracle DB directly.

Mitigation:

  1. Encryption at Rest: Passwords encrypted with Fernet in the tenant DB
  2. Encryption Key Security: DB_ENCRYPTION_KEY kept in environment variables (never in git)
  3. Access Control: Tenant DB access restricted (firewall, VPN)
  4. No Plaintext Logs: NEVER log decrypted passwords (verify in code reviews)
  5. Audit Logging: Log all access to the tenant config (who/when)
  6. Key Rotation: Support key rotation (encrypt with the new key, keep decrypting with the old key until all values are re-encrypted)
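
Key rotation as described in the last item maps onto `MultiFernet` from the `cryptography` package, which decrypts with any key in its list and encrypts with the first one. A minimal sketch, where `rotate_token` is a hypothetical helper and the surrounding migration (iterating over all tenants, updating rows) is left out:

```python
from cryptography.fernet import Fernet, MultiFernet


def rotate_token(token: str, new_key: bytes, old_key: bytes) -> str:
    """Re-encrypt a stored password token so that it is protected by new_key."""
    # The new key goes first: MultiFernet encrypts with the first key in the list
    keyring = MultiFernet([Fernet(new_key), Fernet(old_key)])
    return keyring.rotate(token.encode("ascii")).decode("ascii")
```

Once every `oracle_password_encrypted` value has been rotated, the old key can be removed from the configuration.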

Validation:

# Check logs for password leaks
docker-compose -f docker-compose.multi-tenant.yml logs roa-backend | grep -i "password" | grep -v "encrypted"
# Should return ZERO results

# Check tenant DB
docker-compose -f docker-compose.multi-tenant.yml exec roa-tenant-config-db psql -U tenant_admin -d tenant_config -c 'SELECT id, name, oracle_password_encrypted FROM tenants LIMIT 5;'
# oracle_password_encrypted should start with "gAAAAA..." (Fernet token)
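
The "starts with gAAAAA" check can be made a little stricter by looking at the token layout: a Fernet token is URL-safe base64 of a version byte `0x80`, an 8-byte timestamp, a 16-byte IV, the ciphertext in 16-byte blocks, and a 32-byte HMAC. The sketch below uses only the standard library; `looks_like_fernet_token` is a hypothetical helper that checks shape only and says nothing about whether the token decrypts with the current key.

```python
import base64
import binascii


def looks_like_fernet_token(value: str) -> bool:
    """Cheap structural check for values stored in oracle_password_encrypted."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except (binascii.Error, UnicodeEncodeError, ValueError):
        return False
    overhead = 1 + 8 + 16 + 32  # version + timestamp + IV + HMAC
    return (
        len(raw) >= overhead + 16          # at least one AES block of ciphertext
        and raw[0] == 0x80                 # Fernet version byte
        and (len(raw) - overhead) % 16 == 0
    )
```

Running this over the `tenants` table is a quick way to spot a row where a plaintext password was inserted by mistake.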

Risk: Cross-Tenant Data Leakage

Scenario: A bug in the middleware or in a router lets a user from tenant A access data from tenant B.

Impact: CRITICAL data breach; confidentiality is compromised.

Mitigation:

  1. Mandatory Middleware: TenantMiddleware validates tenant access for ALL requests
  2. Explicit Tenant ID: Routers MUST use request.state.tenant_id (no global state)
  3. Code Reviews: ALL router changes are reviewed for tenant isolation
  4. Integration Tests: Test that cross-tenant access is blocked (403 Forbidden)
  5. Audit Logging: Log tenant_id in ALL audit entries for forensics

Test Scenario:

from datetime import datetime, timedelta, timezone

import jwt  # PyJWT assumed; with python-jose use "from jose import jwt"

# Test: a user from tenant A tries to access tenant B.
# Assumes the test fixtures map user 123 to 'client-a-uuid' only.
def test_cross_tenant_access_blocked():
    # Log in with tenant A
    token_a = login(user_id=123, tenant_id='client-a-uuid')

    # Build a validly signed token that claims tenant B (attack simulation)
    forged_token = jwt.encode({
        'user_id': 123,
        'tenant_id': 'client-b-uuid',  # FORGED
        'exp': datetime.now(timezone.utc) + timedelta(hours=1)
    }, secret_key, algorithm='HS256')

    # Request with the forged token
    response = client.get('/api/dashboard/COMP1', headers={'Authorization': f'Bearer {forged_token}'})

    # MUST return 403 Forbidden (not 200 OK)
    assert response.status_code == 403
    assert 'does not have access to tenant' in response.json()['detail']
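
The check that this test exercises boils down to a membership lookup that runs after the JWT signature has been verified. A minimal sketch, assuming the user-to-tenant mapping has been loaded into a dict (in the real application it would come from the `tenant_users` table); `TenantAccessDenied` and `authorize_tenant` are hypothetical names, not the actual `TenantMiddleware` code:

```python
class TenantAccessDenied(Exception):
    """Raised when a user asks for a tenant they are not mapped to."""
    status_code = 403


def authorize_tenant(user_id: int, tenant_id: str, memberships: dict[int, set[str]]) -> str:
    """Return tenant_id if the user may use it, otherwise raise TenantAccessDenied."""
    if tenant_id not in memberships.get(user_id, set()):
        raise TenantAccessDenied(
            f"User {user_id} does not have access to tenant {tenant_id}"
        )
    return tenant_id
```

A middleware would call this once per request and store the returned value in `request.state.tenant_id`, so routers never read the tenant from anywhere else.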

Risk: Performance Degradation with Multiple Tenants

Scenario: With 10+ tenants, response time grows or the backend uses too much memory.

Impact: Poor user experience, server overload.

Mitigation:

  1. Lazy Loading: Pools are created only when a tenant is first accessed (saves memory)
  2. Pool Cleanup: Pools inactive for > 1h are closed automatically
  3. Resource Limits: Set a realistic max_connections per tenant (avoids OOM)
  4. Monitoring: Track memory usage and response time per tenant
  5. Horizontal Scaling: Add more backend replicas (Docker Swarm, Kubernetes)
  6. Connection Pooling: Reuse connections (the Oracle driver's create_pool already does this)
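
Lazy loading and idle cleanup (items 1 and 2) can be sketched together as a small registry. Everything here is an assumption for illustration: `LazyPoolRegistry` is not the planned `MultiTenantPoolManager`, `create_pool` stands for any callable that builds a pool for a tenant, and the sketch is not thread-safe.

```python
import time


class LazyPoolRegistry:
    """Create a pool on first use and close pools that stay idle too long."""

    def __init__(self, create_pool, idle_ttl_s: float = 3600.0, clock=time.monotonic):
        self._create_pool = create_pool
        self._idle_ttl_s = idle_ttl_s
        self._clock = clock
        self._pools = {}  # tenant_id -> (pool, last_used timestamp)

    def get(self, tenant_id: str):
        pool, _ = self._pools.get(tenant_id, (None, None))
        if pool is None:
            pool = self._create_pool(tenant_id)  # lazy: first access only
        self._pools[tenant_id] = (pool, self._clock())
        return pool

    def cleanup(self) -> list[str]:
        """Close and forget pools idle for longer than idle_ttl_s."""
        now = self._clock()
        idle = [t for t, (_, used) in self._pools.items() if now - used > self._idle_ttl_s]
        for tenant_id in idle:
            pool, _ = self._pools.pop(tenant_id)
            pool.close()
        return idle
```

A background task would call `cleanup()` periodically; a tenant whose pool was closed simply gets a fresh one on its next request.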

Performance Baseline:

Single-Tenant:
- Memory: 50MB (1 pool × 2-10 connections)
- Response time: 50ms (p95)

Multi-Tenant (3 tenants):
- Memory: 150MB (3 pools × 2-10 connections)
- Response time: 55ms (p95)
- Overhead: 10% (at the limit of the < 10% target)

Multi-Tenant (10 tenants):
- Memory: 500MB (10 pools × 2-10 connections)
- Response time: 65ms (p95)
- Overhead: 30% (exceeds the < 10% target, needs optimization)
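
The overhead figures above follow from the p95 response times relative to the single-tenant baseline. A short worked sketch of that arithmetic (the function name is hypothetical):

```python
def overhead_pct(baseline_ms: float, measured_ms: float) -> float:
    """Relative slowdown versus the single-tenant baseline, in percent."""
    return (measured_ms - baseline_ms) / baseline_ms * 100.0


# Figures from the baseline above: 50 ms single-tenant, 55 ms and 65 ms multi-tenant
three_tenants = overhead_pct(50.0, 55.0)   # (55 - 50) / 50 = 10%
ten_tenants = overhead_pct(50.0, 65.0)     # (65 - 50) / 50 = 30%
```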

Optimization:

  • Reduce min_connections from 2 to 1 for low-traffic tenants
  • Aggressive cleanup: close pools idle > 30 min (instead of 1h)
  • Cache more aggressively (reduce Oracle queries)

📚 References

Current Implementation

  • OraclePool: shared/database/oracle_pool.py - Singleton pattern for single-tenant
  • JWT Handler: shared/auth/jwt_handler.py - Token creation/validation (needs tenant_id)
  • Auth Middleware: shared/auth/middleware.py - JWT verification (needs tenant validation)
  • Backend Main: reports-app/backend/app/main.py - Startup logic (needs MultiTenantPoolManager)
  • SSH Tunnel Script: ssh_tunnel.sh - Single tunnel script (needs per-tenant manager)


📅 Timeline Summary

| Phase   | Duration   | Objective                     | Verifiable Output                           |
|---------|------------|-------------------------------|---------------------------------------------|
| Phase 1 | 2-3 days   | Tenant Config DB              | Tenant DB works, default tenant created     |
| Phase 2 | 3-4 days   | MultiTenantPoolManager        | Pools per tenant, lazy loading              |
| Phase 3 | 2-3 days   | SSH Tunnel Manager            | SSH tunnels per tenant, auto-restart        |
| Phase 4 | 2-3 days   | JWT & Middleware              | JWT with tenant_id, tenant validation       |
| Phase 5 | 1-2 days   | Cache & Audit                 | Redis cache per tenant, audit logs          |
| Phase 6 | 3-4 days   | Deployment & Testing          | Deployed in all environments, tests pass    |
| TOTAL   | 14-20 days | Multi-Tenant Production-Ready | All success criteria met                    |

🚀 Next Steps

  1. Review this plan with the team/stakeholders
  2. Prioritize the phases (perhaps Phases 1+2 first, the rest later)
  3. Set up a development environment for testing
  4. Create branch: feature/multi-tenant-architecture
  5. Start Phase 1: Tenant Configuration Database
  6. Iterate: test after each phase, adjust the plan if needed

Document Version: 1.0 | Last Updated: 2025-10-25 | Author: Claude Code (Anthropic) | Status: Ready for Implementation