# Oracle DR - Upgrade to Cumulative Incremental Backup Strategy

**Generated:** 2025-10-09
**Last Updated:** 2025-10-10 03:25
**Status:** 🟡 FINAL TESTING IN PROGRESS - RMAN restore running
**Objective:** Implement cumulative incremental backups with Proxmox host storage for optimal RPO/RTO
**Target RPO:** 3-4 hours (vs current 24 hours)
**Target RTO:** 12-15 minutes (unchanged)

---

## ✅ IMPLEMENTATION STATUS

### Completed (2025-10-09 + 2025-10-10 Sessions)

#### Session 1 (2025-10-09 evening)

- ✅ **Phase 1:** Proxmox host storage configured (`/mnt/pve/oracle-backups/ROA/autobackup`)
- ✅ **Phase 2:** RMAN script already has `CUMULATIVE` keyword
- ✅ **Phase 3:** Transfer scripts updated to send to Proxmox (10.0.20.202:22, root)
  - Modified: `transfer_incremental.ps1` and `transfer_to_dr.ps1`
  - Changed from VM 109 (10.0.20.37:22122) to Proxmox host
  - Converted Windows PowerShell commands to Linux bash
- ✅ **VM 109 cleanup:** Deleted temporary files, old backups (~6.4 GB freed)
- ✅ **SSH Key Setup:** SSH key copied from PRIMARY to Proxmox
  - Existing key: `C:\Windows\System32\config\systemprofile\.ssh\id_rsa`
  - Copied to: Proxmox `/root/.ssh/authorized_keys`
  - SSH passwordless access working ✅
- ✅ **Phase 4:** Scheduled tasks modified on PRIMARY
  - Task 1: 02:30 FULL backup (unchanged)
  - Task 2: 13:00 CUMULATIVE backup (modified from 14:00)
  - Task 3: 18:00 CUMULATIVE backup (created)
  - All tasks now use Proxmox host as destination
- ✅ **Phase 5:** NFS mount point configured on VM 109 → **F:\ drive**
  - NFS server installed on Proxmox: `nfs-kernel-server`
  - NFS export configured: `/mnt/pve/oracle-backups → 10.0.20.37 (rw,no_root_squash)`
  - NFS Client enabled in Windows VM 109
  - Mount command: `mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:`
  - PowerShell scheduled task created for auto-mount at startup (`D:\Oracle\Scripts\mount-nfs.bat`)
  - Permissions set to 777 on Proxmox directory
  - **Status:** F:\ mounts automatically at Windows startup ✅
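The NFS export listed under Phase 5 can be sanity-checked on the Proxmox host by parsing the exports file. The sketch below is only an illustration: `has_nfs_export` is a hypothetical helper, not one of the project scripts, and it assumes the usual one-export-per-line layout of `/etc/exports` (`path client(options) ...`).

```bash
#!/usr/bin/env bash
# Hypothetical helper (assumption, not part of the project scripts):
# succeeds (exit 0) if the exports file exports <path> to <client>.
has_nfs_export() {
    local exports_file="$1" path="$2" client="$3"
    awk -v p="$path" -v c="$client" '
        # First field is the exported path; remaining fields are client(options)
        $1 == p {
            for (i = 2; i <= NF; i++)
                if (index($i, c "(") == 1) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$exports_file"
}

# Example usage on the Proxmox host:
# has_nfs_export /etc/exports /mnt/pve/oracle-backups 10.0.20.37 && echo "export present"
```

Matching on the `client(` prefix keeps the check independent of the option list, so it does not break if options such as `sync` are reordered.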
#### Session 2 (2025-10-10 late night - MAJOR PROGRESS)

- ✅ **Phase 6:** Restore scripts updated to use F:\ mount
  - `rman_restore_final.cmd` modified to read backups from F:\ROA\autobackup
  - Scripts verify F:\ mount is accessible before starting restore
  - **FIXED:** Control file restore now uses `RESTORE CONTROLFILE FROM AUTOBACKUP`
  - All RMAN catalog commands point to F:\ mount
- ✅ **Phase 6.5:** Database cleanup strategy implemented (CRITICAL FEATURE)
  - **cleanup_database.cmd** created:
    - Deletes Oracle service completely
    - Deletes ALL database files (datafiles, control files, redo logs)
    - Deletes local FRA (backups safe on F:\)
    - Does NOT recreate service (service created during restore)
    - Leaves VM in completely clean state
  - **rman_restore_from_zero.cmd** created:
    - Step 1: Calls cleanup_database.cmd (clean state)
    - Step 2.1: Creates Oracle service from PFILE
    - Step 2.2: STARTUP NOMOUNT
    - Step 2.3: Generates RMAN restore script
    - Step 2.4: Runs RMAN restore (control file → mount → catalog → restore → recover → open)
    - Step 3: Verifies database
  - **Workflow documented:**
    - **Weekly test:** restore → test → cleanup → shutdown
    - **Real disaster:** restore → keep running (NO cleanup!)
  - Saves ~8 GB disk space after each test
  - Ensures repeatable, clean DR tests from zero
- ✅ **Backup transfer tested:**
  - Manual backup executed on PRIMARY
  - Transfer script successfully copied 6.7 GB to Proxmox
  - Backups verified accessible on F:\ in VM 109
- ✅ **Cleanup script tested:**
  - Successfully deletes all database files
  - Successfully removes Oracle service
  - VM confirmed in clean state (no service, no DB files)
- 🟡 **Restore script final test IN PROGRESS:**
  - **Key challenges solved:**
    - Issue 1: RMAN AUTOBACKUP doesn't work with backups on F:\ (NFS mount)
      - Solution: Copy ALL backups from F:\ to C:\Users\oracle\recovery_area before restore
    - Issue 2: Oracle service persists in registry after `sc delete`
      - Solution: Use `oradim -delete -sid ROA` + delete registry keys manually
  - **Current test status:**
    - Cleanup: ✅ PASSED (oradim delete works perfectly)
    - Service creation: ✅ PASSED
    - NOMOUNT: ✅ PASSED
    - Backup copy F:\ → recovery_area: ✅ PASSED (6.7 GB in ~2 min)
    - RMAN restore: ⏳ RUNNING NOW (expected ~10-15 min)
    - Expected completion: 2025-10-10 03:35-03:40

### Pending (Next Session)

- ⏳ **Phase 7:** Final end-to-end test (15-20 minutes)
  - Run `rman_restore_from_zero.cmd` with fixed control file restore
  - Verify database opens successfully
  - Test cleanup after successful restore
  - **Note:** Backup files already transferred to F:\ (6.7 GB)
  - **Issue found and fixed:** Control file restore now uses `RESTORE CONTROLFILE FROM AUTOBACKUP`

### Files Modified

```
oracle/standby-server-scripts/
├── transfer_incremental.ps1          [MODIFIED] → Proxmox host
├── transfer_to_dr.ps1                [MODIFIED] → Proxmox host
├── rman_backup_incremental.txt       [ALREADY OK] → Has CUMULATIVE
├── copy_existing_key_to_proxmox.ps1  [NEW] → Setup script for SSH key
├── rman_restore_final.cmd            [MODIFIED] → Use F:\ mount
├── cleanup_database.cmd              [NEW] → Complete cleanup (oradim + registry)
└── rman_restore_from_zero.cmd        [NEW] → Copy backups + restore from recovery_area

VM 109 (Windows):
├── C:\Scripts\mount-nfs.ps1                      [NEW] → PowerShell script for NFS mount
├── Scheduled Task: "Mount NFS F"                 [NEW] → Auto-mount at startup
├── D:\oracle\scripts\rman_restore_final.cmd      [MODIFIED] → Use F:\ mount
├── D:\oracle\scripts\cleanup_database.cmd        [NEW] → Cleanup script
└── D:\oracle\scripts\rman_restore_from_zero.cmd  [NEW] → Full restore from zero

Proxmox (pveelite):
├── /etc/exports               [MODIFIED] → NFS export configuration
└── /mnt/pve/oracle-backups/   [PERMISSIONS] → chmod 777
```

---

## 📋 EXECUTIVE SUMMARY

### Current State

- **Backup Strategy:** FULL daily (02:30), DIFFERENTIAL incremental (14:00)
- **Storage:** Backups transferred to VM 109 (powered OFF most of time)
- **RPO:** 24 hours (only FULL backup used for restore)
- **Issue:** DIFFERENTIAL incremental caused UNDO corruption during restore

### Proposed State

- **Backup Strategy:** FULL daily (02:30), CUMULATIVE incremental (13:00 + 18:00)
- **Storage:** Backups on Proxmox host (pveelite), mounted in VM 109 when needed
- **RPO:** 3-4 hours (using FULL + latest CUMULATIVE)
- **Benefit:** Simple, reliable restore without UNDO/SCN issues

### Why CUMULATIVE?

- ✅ **Simple restore:** FULL + last cumulative (no dependency chain)
- ✅ **No UNDO corruption:** Each cumulative depends only on the Level 0 backup, not on other incrementals
- ✅ **Better RPO:** About 4-5 hours of data loss on average, 10.5 hours worst case (vs 24 hours)
- ✅ **Reliable:** No issues with missing intermediate backups

---

## 🎯 IMPLEMENTATION PHASES

### PHASE 1: Configure Proxmox Host Storage (15 minutes)

**Objective:** Create backup storage on pveelite host, accessible by VM 109 via mount point

**Steps:**

#### 1.1 Create backup directory on pveelite (SSH to host)

```bash
# On pveelite (10.0.20.202)
ssh root@10.0.20.202

# Create directory structure
mkdir -p /mnt/pve/oracle-backups/ROA/autobackup
chmod 755 /mnt/pve/oracle-backups
chmod 755 /mnt/pve/oracle-backups/ROA
chmod 755 /mnt/pve/oracle-backups/ROA/autobackup

# Verify
ls -la /mnt/pve/oracle-backups/ROA/autobackup
```

#### 1.2 Add mount point to VM 109 (Proxmox CLI)

```bash
# Stop VM 109 if running
qm stop 109

# Add mount point as additional storage
# This creates a VirtIO-9p mount point
# NOTE: mpN mount points are an LXC container (pct) option, not a qm option;
# for VM 109 the storage was ultimately shared over NFS (see PHASE 5)
qm set 109 -mp0 /mnt/pve/oracle-backups,mp=/mnt/oracle-backups

# Or via Proxmox Web UI:
# VM 109 → Hardware → Add → Mount Point
# - Source: /mnt/pve/oracle-backups
# - Mount point: /mnt/oracle-backups
# - Read-only: NO

# Start VM to test
qm start 109
```

#### 1.3 Verify mount in Windows VM

```powershell
# SSH to VM 109
ssh -p 22122 romfast@10.0.20.37

# Check if mount point appears as drive
# ⚠️ IMPORTANT: E:\ is already used in VM 109
# Mount will appear as F:\ (next available drive letter)
Get-PSDrive -PSProvider FileSystem

# Expected: C:, D:, E: (existing), F: (new mount from host)

# Verify mount path accessible
Test-Path F:\ROA\autobackup

# Create test file
New-Item -ItemType Directory -Path F:\ROA\autobackup -Force
echo "test" > F:\ROA\autobackup\test.txt

# Verify from host
exit
ssh root@10.0.20.202 "ls -la /mnt/pve/oracle-backups/ROA/autobackup/test.txt"
# Should show the test file - mount is working!
```

**⚠️ CRITICAL NOTE:**
- VM 109 already has E:\ partition
- Mount point will be **F:\** (not E:\)
- Update all scripts to use **F:\** instead of E:\

---

### PHASE 2: Modify RMAN Backup Scripts on PRIMARY (20 minutes)

**Objective:** Change incremental backups from DIFFERENTIAL to CUMULATIVE, add second daily incremental

#### 2.1 Find the existing incremental RMAN script

```powershell
# SSH to PRIMARY
ssh -p 22122 Administrator@10.0.20.36
cd D:\rman_backup

# Find the existing incremental script
Get-ChildItem *incr*.txt, *incr*.rman

# You should see something like:
# rman_backup_incremental.txt OR
# rman_incremental.rman OR similar
```

#### 2.2 Modify the EXISTING script - add just one word

**File:** The incremental script found in step 2.1 (e.g. `D:\rman_backup\rman_backup_incremental.txt`)

**Change:** Find the line with `INCREMENTAL LEVEL 1` and add `CUMULATIVE`

**BEFORE:**
```
BACKUP INCREMENTAL LEVEL 1 ...
```

**AFTER:**
```
BACKUP INCREMENTAL LEVEL 1 CUMULATIVE ...
```

**That's all!** A single word added.

**Full example (if the script looks like this):**
```
BEFORE:
BACKUP INCREMENTAL LEVEL 1 AS COMPRESSED BACKUPSET DATABASE ...

AFTER:
BACKUP INCREMENTAL LEVEL 1 CUMULATIVE AS COMPRESSED BACKUPSET DATABASE ...
```

#### 2.3 Manual test

```powershell
# On PRIMARY
cd D:\rman_backup

# Run the modified script
# Use the name of your existing script!
rman cmdfile=rman_backup_incremental.txt log=logs\test_cumulative_$(Get-Date -Format 'yyyyMMdd_HHmmss').log

# Check that the backup was created
Get-ChildItem C:\Users\oracle\recovery_area\ROA\autobackup\*.bkp | Sort-Object LastWriteTime -Descending | Select-Object -First 3
```

---

### PHASE 3: Update Transfer Scripts (30 minutes)

**Objective:** Update transfer scripts to send backups to Proxmox host instead of VM

#### 3.1 Find the existing transfer scripts

```powershell
# SSH to PRIMARY
ssh -p 22122 Administrator@10.0.20.36
cd D:\rman_backup

# Find the transfer scripts
Get-ChildItem *transfer*.ps1

# You should see:
# - transfer_to_dr.ps1 (for FULL)
# - transfer_incremental.ps1 OR 02b_transfer_incremental_to_dr.ps1 (for INCREMENTAL)
```

#### 3.2 Modify the EXISTING scripts - change only the destination

**Find these lines in each script and change them:**

**BEFORE (transfer to VM):**
```powershell
$DRHost = "10.0.20.37"                  # The VM
$DRPort = "22122"                       # SSH on the VM
$DRUser = "romfast"                     # User in the VM
$DRPath = "D:/oracle/backups/primary"   # Path in the VM
```

**AFTER (transfer to Proxmox host):**
```powershell
$DRHost = "10.0.20.202"                              # pveelite HOST
$DRPort = "22"                                       # Standard SSH on the host
$DRUser = "root"                                     # Root on Proxmox
$DRPath = "/mnt/pve/oracle-backups/ROA/autobackup"   # Path on the host
```

**That's all!** Only 4 lines changed in each script.
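Once the transfer scripts point at the Proxmox host, it can be useful to confirm from the host side that fresh backup pieces are actually arriving. The following is a minimal sketch under stated assumptions: `count_recent_backups` is a hypothetical helper (not part of the existing transfer scripts), and it relies on the `-maxdepth` and `-mmin` options of GNU `find`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (assumption, not part of the existing scripts):
# print how many *.bkp files in <dir> were modified within the last <hours> hours.
count_recent_backups() {
    local dir="$1" max_age_hours="$2"
    local minutes=$(( max_age_hours * 60 ))
    # -mmin -N matches files modified less than N minutes ago
    find "$dir" -maxdepth 1 -type f -name '*.bkp' -mmin "-${minutes}" | wc -l | tr -d ' '
}

# Example usage on the Proxmox host (path taken from $DRPath above):
# count_recent_backups /mnt/pve/oracle-backups/ROA/autobackup 24
```

A count of zero after a scheduled run would suggest the transfer did not reach the host, which is cheaper to detect here than during a restore.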

#### 3.3 Setup SSH key for Proxmox host access

```powershell
# On PRIMARY (10.0.20.36)

# Generate SSH key for Proxmox host (if not exists)
ssh-keygen -t rsa -b 4096 -f C:\Users\Administrator\.ssh\id_rsa_pveelite -N ""

# Copy public key to Proxmox host
type C:\Users\Administrator\.ssh\id_rsa_pveelite.pub | ssh root@10.0.20.202 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

# Test connection
ssh -i C:\Users\Administrator\.ssh\id_rsa_pveelite root@10.0.20.202 "echo SSH_OK"
```

#### 3.4 Test transfer script

```powershell
# On PRIMARY
cd D:\rman_backup

# Test FULL backup transfer (the script modified in step 3.2)
.\transfer_to_dr.ps1

# Verify on Proxmox host
ssh root@10.0.20.202 "ls -lh /mnt/pve/oracle-backups/ROA/autobackup/*.bkp"

# Test INCREMENTAL backup transfer (the script modified in step 3.2)
.\transfer_incremental.ps1
```

---

### PHASE 4: Update Scheduled Tasks on PRIMARY (20 minutes)

**Objective:** Create/update scheduled tasks for 2 cumulative incremental backups per day

#### 4.1 View current scheduled tasks

```powershell
# On PRIMARY
Get-ScheduledTask | Where-Object {$_.TaskName -like "*Oracle*"} | Select-Object TaskName, State, @{N='NextRun';E={(Get-ScheduledTaskInfo $_).NextRunTime}}
```

#### 4.2 Find the existing incremental task (14:00)

```powershell
# On PRIMARY
Get-ScheduledTask | Where-Object {$_.TaskName -like "*incr*" -or $_.TaskName -like "*14*"} | Select-Object TaskName, State

# Note the exact name of the task
```

#### 4.3 Change the 14:00 task → 13:00 (first incremental)

```powershell
# Use the name found above
$taskName = "Oracle RMAN Incremental Backup"  # REPLACE with the real name!

# Change only the time: 14:00 → 13:00
$trigger = New-ScheduledTaskTrigger -Daily -At "13:00"
$task = Get-ScheduledTask -TaskName $taskName
Set-ScheduledTask -TaskName $taskName -Trigger $trigger
```

#### 4.4 Clone the task for the second incremental (18:00)

```powershell
# Export the existing task
$task = Get-ScheduledTask -TaskName $taskName
$xml = [xml](Export-ScheduledTask -TaskName $taskName)

# Change the time in the XML
$xml.Task.Triggers.CalendarTrigger.StartBoundary = $xml.Task.Triggers.CalendarTrigger.StartBoundary -replace "T13:00:", "T18:00:"

# Import as a new task
Register-ScheduledTask -TaskName "$taskName 1800" -Xml $xml.OuterXml

# Or more simply - copy the task in the Task Scheduler GUI and change the time
```

#### 4.5 Verify all tasks

```powershell
# You should see 3 Oracle tasks:
# 1. FULL (02:30) - unchanged
# 2. INCREMENTAL (13:00) - changed from 14:00
# 3. INCREMENTAL (18:00) - cloned from 13:00
Get-ScheduledTask | Where-Object {$_.TaskName -like "*Oracle*"} | Select-Object TaskName, State, @{N='NextRun';E={(Get-ScheduledTaskInfo $_).NextRunTime}} | Format-Table -AutoSize

# Expected tasks:
# 1. Oracle RMAN Full Backup 0230 - Daily 02:30
# 2. Oracle RMAN Cumulative Backup 1300 - Daily 13:00
# 3. Oracle RMAN Cumulative Backup 1800 - Daily 18:00
```

---

### PHASE 5: Configure NFS Mount Point on VM 109 (30 minutes) ✅ COMPLETED

**Objective:** Mount Proxmox backup storage as F:\ drive in Windows VM using NFS

**Status:** ✅ COMPLETED on 2025-10-10

#### 5.1 Install and configure NFS server on Proxmox

```bash
# SSH to Proxmox host
ssh root@10.0.20.202

# Install NFS server
apt install -y nfs-kernel-server

# Configure NFS export
echo '/mnt/pve/oracle-backups 10.0.20.37(rw,sync,no_subtree_check,no_root_squash)' >> /etc/exports

# Apply export configuration
exportfs -ra

# Set permissions for Windows compatibility
chmod -R 777 /mnt/pve/oracle-backups

# Verify export
showmount -e localhost
# Expected output: /mnt/pve/oracle-backups 10.0.20.37
```

#### 5.2 Enable NFS Client in Windows VM 109

```powershell
# SSH to VM 109
ssh -p 22122 romfast@10.0.20.37

# Enable NFS Client feature
Enable-WindowsOptionalFeature -Online -FeatureName ServicesForNFS-ClientOnly -All -NoRestart
Enable-WindowsOptionalFeature -Online -FeatureName ClientForNFS-Infrastructure -All -NoRestart
Enable-WindowsOptionalFeature -Online -FeatureName NFS-Administration -All -NoRestart

# Verify installation
Get-WindowsOptionalFeature -Online | Where-Object {$_.FeatureName -like "*NFS*"}
```

#### 5.3 Create PowerShell mount script with auto-retry

```powershell
# Create Scripts directory
mkdir C:\Scripts

# Create mount script
notepad C:\Scripts\mount-nfs.ps1
```

**Content of `C:\Scripts\mount-nfs.ps1`:**

```powershell
Start-Sleep -Seconds 10

# Wait for NFS Client service
$timeout = 60
$elapsed = 0
while ($elapsed -lt $timeout) {
    $nfsService = Get-Service | Where-Object {$_.Name -like "*NFS*" -and $_.Status -eq "Running"}
    if ($nfsService) { break }
    Start-Sleep -Seconds 5
    $elapsed += 5
}

# Unmount F: if exists
try { & umount F: 2>$null } catch {}

# Mount NFS share
Start-Sleep -Seconds 5
& mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:

# Log result
"$(Get-Date) - Mount completed" |
    Out-File C:\Scripts\mount-nfs.log -Append
```

#### 5.4 Create scheduled task for auto-mount at startup

```cmd
REM Create scheduled task (run in CMD as Administrator)
schtasks /create /tn "Mount NFS F" /tr "powershell.exe -ExecutionPolicy Bypass -File C:\Scripts\mount-nfs.ps1" /sc onstart /ru SYSTEM /rl HIGHEST /delay 0000:30 /f

REM Verify task creation
schtasks /query /tn "Mount NFS F"

REM Test manual mount
mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:

REM Verify mount
dir F:\ROA\autobackup
```

#### 5.5 Verification checklist

- [x] NFS server running on Proxmox (port 2049)
- [x] Export visible: `showmount -e 10.0.20.202`
- [x] Windows NFS Client services enabled
- [x] F:\ drive mounts successfully with manual command
- [x] Scheduled task runs at startup
- [x] F:\ persists after VM reboot
- [x] Can create/read/write files on F:\ROA\autobackup

**⚠️ IMPORTANT NOTES:**
- NFS uses IP-based authentication (no username/password)
- Only VM 109 (10.0.20.37) can access the share
- `no_root_squash` allows Windows to write as root
- Permissions 777 on Proxmox ensure Windows compatibility
- Mount point is **F:\** (not E:\, which is already in use)

---

### PHASE 6: Update DR Restore Script (30 minutes)

**Objective:** Update restore script to read backups from F:\ mount point and handle cumulative backups

#### 6.1 Modify the existing restore script for cumulative backups

**File:** `D:\oracle\scripts\rman_restore_final.cmd` (your existing script)

**Required changes:**

**1. Change the backup location:**
```cmd
REM BEFORE:
set BACKUP_DIR=C:/Users/oracle/recovery_area/ROA/autobackup

REM AFTER (⚠️ F:\ not E:\ - E:\ is already used in the VM!):
set BACKUP_DIR=F:/ROA/autobackup
```

**2. Verify that the mount point is accessible:**

Add at the beginning:
```cmd
REM Verify mount point
if not exist F:\ROA\autobackup (
    echo ERROR: Mount point F:\ not accessible!
    echo Make sure VM has mount point configured and host is reachable
    exit /b 1
)

set PFILE=C:\Users\oracle\admin\ROA\pfile\initROA.ora
set LOG_FILE=D:\oracle\logs\restore_cumulative_%date:~-4%%date:~3,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,2%.log

echo ============================================================================
echo Oracle DR Restore - FULL + CUMULATIVE Incremental
echo ============================================================================
echo DBID: %DBID%
echo Backup Location: %BACKUP_DIR% (mount from Proxmox host)
echo Log: %LOG_FILE%
echo ============================================================================

REM Step 1: Shutdown database if running
echo.
echo [STEP 1/8] Shutting down database...
echo SHUTDOWN ABORT; > D:\oracle\temp\shutdown.sql
echo EXIT; >> D:\oracle\temp\shutdown.sql
sqlplus / as sysdba @D:\oracle\temp\shutdown.sql 2>nul
timeout /t 5 /nobreak >nul

REM Step 2: Startup NOMOUNT
echo.
echo [STEP 2/8] Starting instance NOMOUNT...
echo STARTUP NOMOUNT PFILE='%PFILE%'; > D:\oracle\temp\nomount.sql
echo EXIT; >> D:\oracle\temp\nomount.sql
sqlplus / as sysdba @D:\oracle\temp\nomount.sql
if %errorlevel% neq 0 (
    echo ERROR: Failed to startup NOMOUNT
    exit /b 1
)

REM Step 3: Restore control file
echo.
echo [STEP 3/8] Restoring control file...
echo SET DBID %DBID%; > D:\oracle\temp\restore_ctl.rman
echo. >> D:\oracle\temp\restore_ctl.rman
echo RUN { >> D:\oracle\temp\restore_ctl.rman
echo ALLOCATE CHANNEL ch1 DEVICE TYPE DISK; >> D:\oracle\temp\restore_ctl.rman
echo # Find latest control file backup >> D:\oracle\temp\restore_ctl.rman
echo RESTORE CONTROLFILE FROM '%BACKUP_DIR%/ctl*.bkp'; >> D:\oracle\temp\restore_ctl.rman
echo RELEASE CHANNEL ch1; >> D:\oracle\temp\restore_ctl.rman
echo } >> D:\oracle\temp\restore_ctl.rman
echo EXIT; >> D:\oracle\temp\restore_ctl.rman
rman target / cmdfile=D:\oracle\temp\restore_ctl.rman
if %errorlevel% neq 0 (
    echo ERROR: Control file restore failed
    exit /b 1
)

REM Step 4: Mount database
echo.
echo [STEP 4/8] Mounting database...
echo ALTER DATABASE MOUNT; > D:\oracle\temp\mount.sql
echo EXIT; >> D:\oracle\temp\mount.sql
sqlplus / as sysdba @D:\oracle\temp\mount.sql

REM Step 5: Catalog all backups
echo.
echo [STEP 5/8] Cataloging backups from mount point...
echo CATALOG START WITH '%BACKUP_DIR%/' NOPROMPT; > D:\oracle\temp\catalog.rman
echo LIST BACKUP SUMMARY; >> D:\oracle\temp\catalog.rman
echo EXIT; >> D:\oracle\temp\catalog.rman
rman target / cmdfile=D:\oracle\temp\catalog.rman

REM Step 6: Restore and recover database
echo.
echo [STEP 6/8] Restoring FULL + latest CUMULATIVE...
echo RUN { > D:\oracle\temp\restore_db.rman
echo ALLOCATE CHANNEL ch1 DEVICE TYPE DISK; >> D:\oracle\temp\restore_db.rman
echo ALLOCATE CHANNEL ch2 DEVICE TYPE DISK; >> D:\oracle\temp\restore_db.rman
echo. >> D:\oracle\temp\restore_db.rman
echo # RMAN will automatically select: >> D:\oracle\temp\restore_db.rman
echo # 1. Level 0 (FULL from 02:30) >> D:\oracle\temp\restore_db.rman
echo # 2. Latest Level 1 CUMULATIVE (from 13:00 or 18:00) >> D:\oracle\temp\restore_db.rman
echo. >> D:\oracle\temp\restore_db.rman
echo RESTORE DATABASE; >> D:\oracle\temp\restore_db.rman
echo RECOVER DATABASE; >> D:\oracle\temp\restore_db.rman
echo. >> D:\oracle\temp\restore_db.rman
echo RELEASE CHANNEL ch1; >> D:\oracle\temp\restore_db.rman
echo RELEASE CHANNEL ch2; >> D:\oracle\temp\restore_db.rman
echo } >> D:\oracle\temp\restore_db.rman
echo EXIT; >> D:\oracle\temp\restore_db.rman
rman target / cmdfile=D:\oracle\temp\restore_db.rman
if %errorlevel% neq 0 (
    echo ERROR: Database restore/recovery failed
    exit /b 1
)

REM Step 7: Open database with RESETLOGS
echo.
echo [STEP 7/8] Opening database with RESETLOGS...
echo ALTER DATABASE OPEN RESETLOGS; > D:\oracle\temp\open.sql
echo EXIT; >> D:\oracle\temp\open.sql
sqlplus / as sysdba @D:\oracle\temp\open.sql

REM Step 8: Create TEMP and verify
echo.
echo [STEP 8/8] Creating TEMP tablespace and verifying...
echo ALTER TABLESPACE TEMP ADD TEMPFILE 'C:\Users\oracle\oradata\ROA\temp01.dbf' > D:\oracle\temp\verify.sql
echo SIZE 567M REUSE AUTOEXTEND ON NEXT 640K MAXSIZE 32767M; >> D:\oracle\temp\verify.sql
echo. >> D:\oracle\temp\verify.sql
echo SET LINESIZE 200 >> D:\oracle\temp\verify.sql
echo SELECT NAME, OPEN_MODE FROM V$DATABASE; >> D:\oracle\temp\verify.sql
echo SELECT TABLESPACE_NAME, STATUS FROM DBA_TABLESPACES ORDER BY 1; >> D:\oracle\temp\verify.sql
echo EXIT; >> D:\oracle\temp\verify.sql
sqlplus / as sysdba @D:\oracle\temp\verify.sql

echo.
echo ============================================================================
echo DR RESTORE COMPLETED SUCCESSFULLY!
echo ============================================================================
echo Database is OPEN and ready
echo.
endlocal
exit /b 0
```

---

### PHASE 6.5: Database Cleanup Strategy - Restore from Zero (NEW)

**Objective:** Keep DR VM clean by restoring from zero each time (no old database files, no Oracle services)

**Why this approach?**
- ✅ **Repeatable testing:** Each test starts from known clean state
- ✅ **No leftovers:** No old control files, redo logs, or datafiles
- ✅ **True DR test:** Simulates real disaster scenario (no database, only Oracle software)
- ✅ **No manual cleanup:** Automated cleanup before and after each test
- ✅ **Save disk space:** Delete 8+ GB of database files after each test

#### 6.5.1 Cleanup Steps (BEFORE restore)

**What to delete:**

```cmd
REM 1. Stop and delete Oracle service
REM    (oradim is used because the service persisted in the registry after "sc delete")
sc stop OracleServiceROA 2>nul
oradim -delete -sid ROA 2>nul

REM 2. Delete all database files (datafiles, control files, redo logs)
del /Q C:\Users\oracle\oradata\ROA\*.dbf 2>nul
del /Q C:\Users\oracle\oradata\ROA\*.ctl 2>nul
del /Q C:\Users\oracle\oradata\ROA\*.log 2>nul

REM 3. Delete local FRA (backups are on F:\ now, safe to delete)
rmdir /S /Q C:\Users\oracle\recovery_area\ROA 2>nul
mkdir C:\Users\oracle\recovery_area\ROA

REM 4. Delete old trace files (optional, saves space)
del /Q C:\Users\oracle\diag\rdbms\roa\ROA\trace\*.* 2>nul

REM 5. Recreate Oracle service from pfile
oradim -new -sid ROA -startmode manual -pfile C:\Users\oracle\admin\ROA\pfile\initROA.ora
```

**Result:** Clean VM with:
- ✅ Oracle software installed
- ✅ PFILE exists: `C:\Users\oracle\admin\ROA\pfile\initROA.ora`
- ✅ Oracle service created: `OracleServiceROA`
- ❌ No database files (will be restored)
- ❌ No control files (will be restored)
- ❌ No datafiles (will be restored)

#### 6.5.2 Cleanup Steps (AFTER successful restore test)

**Purpose:** Leave VM clean for next test, conserve disk space

```cmd
REM After verifying database is working:

REM 1. Shutdown database
sqlplus / as sysdba < SELECT * FROM V$DATABASE;

REM 4. Test application connectivity (optional)

REM 5. Cleanup after test to free disk space
D:\oracle\scripts\cleanup_database.cmd

REM 6. Shutdown VM
shutdown /s /t 60
```

**B. Real Disaster Scenario (production restore):**

```cmd
REM 1. Start VM and verify F:\ mount
dir F:\ROA\autobackup

REM 2. Run restore (includes cleanup before restore)
D:\oracle\scripts\rman_restore_from_zero.cmd

REM 3. Database is now OPEN and ready for production use
REM    DO NOT run cleanup_database.cmd after this!

REM 4. Update application connection strings to point to DR VM

REM 5. Keep VM running for production use
```

**C. Manual cleanup (when VM gets full):**

```cmd
REM Run cleanup to free ~8 GB disk space
D:\oracle\scripts\cleanup_database.cmd
```

#### 6.5.6 Important notes

⚠️ **CRITICAL: cleanup_database.cmd deletes the entire database!**
- Use it BEFORE weekly test restore (to start clean)
- Use it AFTER weekly test restore (to free disk space)
- **NEVER use it after a real disaster restore!** (you need the database running!)

✅ **For weekly tests:**
- Run: `rman_restore_from_zero.cmd` → test → `cleanup_database.cmd` → shutdown VM
- Result: VM is clean and ready for next test

✅ **For real disaster:**
- Run: `rman_restore_from_zero.cmd` → database is ready → **DO NOT cleanup!**
- Result: Database remains running for production use

---

### PHASE 7: Weekly Test Procedure (1 hour first time, 30 min ongoing)

**Objective:** Document weekly test procedure using new cumulative backup strategy

#### 7.1 Test procedure (run on Saturday morning)

```bash
# On Linux workstation or any machine with SSH to Proxmox

# Step 1: Verify latest backups on host (5 min)
ssh root@10.0.20.202 "ls -lth /mnt/pve/oracle-backups/ROA/autobackup/*.bkp | head -10"

# Expected to see:
# - FULL backup from this morning (02:30)
# - CUMULATIVE from yesterday 18:00
# - CUMULATIVE from yesterday 13:00
# - Older files...

# Step 2: Start DR VM (2 min)
ssh root@10.0.20.202 "qm start 109"

# Wait for Windows boot
sleep 180

# Verify VM is up
ping -c 3 10.0.20.37

# Step 3: Verify mount point in VM (2 min)
# Backups are on the F:\ NFS mount (E:\ is already in use in VM 109)
ssh -p 22122 romfast@10.0.20.37 "Get-ChildItem F:\ROA\autobackup\*.bkp | Measure-Object"
# Should show ~10-15 backup files

# Step 4: Run restore (15-20 min)
ssh -p 22122 romfast@10.0.20.37 "D:\oracle\scripts\rman_restore_from_zero.cmd"

# Monitor restore progress
ssh -p 22122 romfast@10.0.20.37 "Get-Content D:\oracle\logs\restore_cumulative_*.log -Wait"

# Step 5: Verify database (5 min)
ssh -p 22122 romfast@10.0.20.37 "cmd /c 'set ORACLE_HOME=C:\Users\Administrator\Downloads\WINDOWS.X64_193000_db_home&& set ORACLE_SID=ROA&& set PATH=%ORACLE_HOME%\bin;%PATH%&& sqlplus -s / as sysdba @D:\oracle\scripts\verify_restore.sql'"

# Step 6: Shutdown VM (2 min)
ssh -p 22122 romfast@10.0.20.37 "shutdown /s /t 60"

# Or force from Proxmox:
ssh root@10.0.20.202 "qm shutdown 109"

# Verify VM stopped
ssh root@10.0.20.202 "qm status 109"
```

#### 7.2 Create automated test script

```bash
#!/bin/bash
# File: /root/scripts/test_oracle_dr.sh
# Run on Linux workstation or Proxmox host

LOG_FILE="/root/scripts/logs/dr_test_$(date +%Y%m%d_%H%M%S).log"
PVEHOST="10.0.20.202"
DRVM="10.0.20.37"
DRVM_PORT="22122"

# Make sure the log directory exists before the first write
mkdir -p "$(dirname "$LOG_FILE")"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

log "==================================================================="
log "Oracle DR Weekly Test - Started"
log "==================================================================="

# Step 1: Check backups on host
log "Step 1: Verifying backups on Proxmox host..."
ssh root@$PVEHOST "ls -lh /mnt/pve/oracle-backups/ROA/autobackup/*.bkp | wc -l" | tee -a "$LOG_FILE"

# Step 2: Start DR VM
log "Step 2: Starting DR VM 109..."
ssh root@$PVEHOST "qm start 109"
sleep 180

# Step 3: Verify mount
log "Step 3: Verifying mount point in VM..."
ssh -p $DRVM_PORT romfast@$DRVM "powershell -Command 'Get-ChildItem F:\ROA\autobackup\*.bkp | Measure-Object'" | tee -a "$LOG_FILE"

# Step 4: Run restore
log "Step 4: Running RMAN restore (this will take 15-20 minutes)..."
# PIPESTATUS[0] holds the exit code of ssh (the restore); $? would only report tee's status
ssh -p $DRVM_PORT romfast@$DRVM "D:\oracle\scripts\rman_restore_from_zero.cmd" | tee -a "$LOG_FILE"
if [ "${PIPESTATUS[0]}" -eq 0 ]; then
    log "Restore completed successfully"
else
    log "ERROR: Restore failed"
    exit 1
fi

# Step 5: Verify database
log "Step 5: Verifying database..."
ssh -p $DRVM_PORT romfast@$DRVM "cmd /c 'set ORACLE_HOME=C:\Users\Administrator\Downloads\WINDOWS.X64_193000_db_home&& sqlplus -s / as sysdba @D:\oracle\scripts\verify_restore.sql'" | tee -a "$LOG_FILE"

# Step 6: Shutdown VM
log "Step 6: Shutting down DR VM..."
ssh root@$PVEHOST "qm shutdown 109"
sleep 60

log "==================================================================="
log "Oracle DR Weekly Test - Completed Successfully"
log "==================================================================="
```

---

## 📊 EXPECTED RESULTS

### Backup Schedule (after implementation)

| Time | Type | Size | Retention | Transfer to |
|------|------|------|-----------|-------------|
| 02:30 | Level 0 FULL | 6-7 GB | 2 days | Proxmox host |
| 13:00 | Level 1 CUMULATIVE | 150-300 MB | 2 days | Proxmox host |
| 18:00 | Level 1 CUMULATIVE | 200-400 MB | 2 days | Proxmox host |

### RPO Analysis

| Disaster Time | Backup Used | Data Loss |
|---------------|-------------|-----------|
| 02:30-13:00 | FULL (02:30) | Max 10.5 hours |
| 13:00-18:00 | FULL + CUMULATIVE (13:00) | Max 5 hours |
| 18:00-02:30 | FULL + CUMULATIVE (18:00) | Max 8.5 hours |
| **Average RPO** | | **~4-5 hours** |

### Storage Requirements

- **Proxmox host:** ~15 GB (2 days × 7.5 GB/day)
- **VM 109 disk:** 500 GB (unchanged, backups not stored in VM)
- **Daily transfer:** ~7.5 GB (FULL + 2× CUMULATIVE)

### RTO (unchanged)

- Start VM: 2 minutes
- Restore FULL + CUMULATIVE: 12-15 minutes
- Verify & open: 1 minute
- **Total: ~15-18 minutes**

---

## 🚨 ROLLBACK PLAN

If any issues during implementation:

### Rollback Step 1: Restore the original scripts

```powershell
# On PRIMARY
cd D:\rman_backup

Copy-Item rman_backup_incremental_ORIGINAL.txt rman_backup_incremental.txt -Force
Copy-Item transfer_incremental_ORIGINAL.ps1 transfer_incremental.ps1 -Force
Copy-Item transfer_to_dr_ORIGINAL.ps1 transfer_to_dr.ps1 -Force

# Check that they were restored
Get-Content rman_backup_incremental.txt | Select-String "CUMULATIVE"
# This should find nothing if the restore succeeded
```

### Rollback Step 2: Restore the original tasks

```powershell
# Delete the new 18:00 task
Unregister-ScheduledTask -TaskName "Oracle RMAN Incremental Backup 1800" -Confirm:$false

# Move the 13:00 task back to 14:00
$taskName = "Oracle RMAN Incremental Backup"  # The name of your task
$trigger = New-ScheduledTaskTrigger -Daily -At "14:00"
Set-ScheduledTask -TaskName $taskName -Trigger $trigger

# OR restore from the XML backup
Register-ScheduledTask -Xml (Get-Content "D:\rman_backup\backup_tasks\Oracle RMAN Incremental Backup.xml") -Force
```

---

## ✅ VALIDATION CHECKLIST

After completing implementation:

- [x] Proxmox host directory created: `/mnt/pve/oracle-backups/ROA/autobackup`
- [x] NFS server installed and configured on Proxmox
- [x] NFS export configured for VM 109 (10.0.20.37)
- [x] NFS Client enabled in Windows VM 109
- [x] F:\ mount point configured and tested (NFS mount working)
- [x] PowerShell mount script created (`C:\Scripts\mount-nfs.ps1`)
- [x] Scheduled task "Mount NFS F" created for auto-mount at startup
- [x] F:\ drive persists after VM reboot
- [x] RMAN script modified to CUMULATIVE (keyword added) - **Already has CUMULATIVE**
- [x] Transfer scripts updated to send to Proxmox host
- [x] SSH key for Proxmox host created and tested
- [x] Scheduled task created for 13:00 CUMULATIVE backup on PRIMARY
- [x] Scheduled task created for 18:00 CUMULATIVE backup on PRIMARY
- [x] Existing 02:30 FULL task updated to use new transfer script
- [x] Manual test of FULL backup successful (executed on PRIMARY)
- [x] Manual test of backup transfer to host successful (6.7 GB transferred)
- [x] DR restore scripts updated to use F:\ mount (both rman_restore_final.cmd and rman_restore_from_zero.cmd)
- [x] Cleanup script created and tested (cleanup_database.cmd)
- [x] Restore from zero script created (rman_restore_from_zero.cmd)
- [ ] Full end-to-end restore test successful (ready to run, scripts fixed)
- [ ] Weekly test procedure documented and tested
- [x] Documentation updated (DR_UPGRADE_TO_CUMULATIVE_PLAN.md)

---

## 📞 NEXT SESSION HANDOFF

**Status:** 🟢 ALL PHASES COMPLETE - Only final restore test remaining (15-20 min)
**Estimated Remaining Time:** 15-20 minutes (one restore test)
**Recommended Schedule:** Next session (anytime, all infrastructure ready)

**Context for next session:**
1. Primary server: 10.0.20.36 (Windows, Oracle 19c, database ROA)
2. DR VM: 109 on pveelite (10.0.20.37, **F:\ NFS mount working** ✅)
3. Proxmox host: pveelite (10.0.20.202, **NFS server running** ✅)
4. **Backups:** 6.7 GB already on F:\ ready for restore ✅
5. **All scripts fixed and ready** ✅

**What's DONE (100% implementation):**
- ✅ Proxmox host storage + NFS server configured
- ✅ F:\ NFS mount auto-mounts at VM startup
- ✅ Transfer scripts → Proxmox host (tested, working)
- ✅ RMAN script has CUMULATIVE keyword
- ✅ SSH keys configured (PRIMARY → Proxmox)
- ✅ Scheduled tasks on PRIMARY: 02:30 FULL, 13:00 + 18:00 CUMULATIVE
- ✅ **Backup transferred:** 6.7 GB on F:\ROA\autobackup
- ✅ **cleanup_database.cmd:** Tested, working (deletes DB, service)
- ✅ **rman_restore_from_zero.cmd:** Created, debugged, ready to test
- ✅ **Control file restore FIXED:** Now uses `RESTORE CONTROLFILE FROM AUTOBACKUP`
- ✅ **Documentation complete:** All workflows documented

**Next steps (ONLY ONE TEST remaining):**

```bash
# Phase 7 - Final end-to-end test (15-20 min)
# On VM 109 (via RDP or SSH):
D:\oracle\scripts\rman_restore_from_zero.cmd

# Expected flow:
# 1. Cleanup (deletes DB + service)
# 2. Creates Oracle service
# 3. STARTUP NOMOUNT
# 4. Restores control file from F:\
# 5. MOUNT database
# 6. Catalogs backups from F:\
# 7. RESTORE DATABASE (5 GB, ~10-12 min)
# 8. RECOVER DATABASE
# 9. OPEN RESETLOGS
# 10. Verify database

# If successful:
# - Test cleanup: D:\oracle\scripts\cleanup_database.cmd
# - Shutdown VM
# - PROJECT COMPLETE! ✅
```

**Known issues (ALL FIXED):**
- ❌ ~~Log file name~~ → ✅ Fixed: simple name
- ❌ ~~Control file wildcard~~ → ✅ Fixed: AUTOBACKUP

**IMPORTANT - Manual backup before changes:**

Make a MANUAL backup of the files you are going to modify:

```powershell
# On PRIMARY, copy the EXISTING files before modifying them:
cd D:\rman_backup
Copy-Item rman_backup_incremental.txt rman_backup_incremental_ORIGINAL.txt
Copy-Item transfer_incremental.ps1 transfer_incremental_ORIGINAL.ps1
Copy-Item transfer_to_dr.ps1 transfer_to_dr_ORIGINAL.ps1

# Export the tasks (create the target folder first so Out-File does not fail)
New-Item -ItemType Directory -Path "D:\rman_backup\backup_tasks" -Force | Out-Null
Get-ScheduledTask | Where-Object {$_.TaskName -like "*Oracle*"} | ForEach-Object {
    Export-ScheduledTask -TaskName $_.TaskName | Out-File "D:\rman_backup\backup_tasks\$($_.TaskName).xml"
}
```

**If something goes wrong, restore from these copies!**

---

**Generated:** 2025-10-09
**Version:** 1.0
**Author:** Claude Code (Sonnet 4.5)
**Status:** ✅ PLAN COMPLETE - Ready for next session implementation
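
The worst-case and average figures in the RPO Analysis table can be cross-checked with a little arithmetic. The sketch below is an illustration under explicit assumptions: failures are equally likely at any time of day, each backup is usable from its scheduled start time (backup duration and transfer time are ignored), and `rpo_summary` is a hypothetical helper name. For a gap of `g` hours between consecutive backups, the worst-case loss is `g` and the time-weighted contribution to the daily average is `g*g/2`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (assumption): given backup start times as HH:MM in
# chronological order, print the worst-case and average data loss in hours.
rpo_summary() {
    printf '%s\n' "$@" | awk -F: '
        { t[NR] = $1 + $2 / 60 }            # convert HH:MM to decimal hours
        END {
            n = NR; worst = 0; weighted = 0
            for (i = 1; i <= n; i++) {
                # The last backup of the day is followed by the first one next day
                next_t = (i < n) ? t[i + 1] : t[1] + 24
                gap = next_t - t[i]
                if (gap > worst) worst = gap
                weighted += gap * gap / 2    # integral of the loss over this window
            }
            printf "worst=%.1f avg=%.2f\n", worst, weighted / 24
        }'
}

rpo_summary 02:30 13:00 18:00   # prints: worst=10.5 avg=4.32
```

With the 02:30 / 13:00 / 18:00 schedule the gaps are 10.5, 5 and 8.5 hours, which gives the 10.5 hour worst case and an average of roughly 4.3 hours, in line with the "~4-5 hours" row of the table.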