🛡️ Oracle DR System - Complete Architecture

📊 System Overview

┌─────────────────────────────────────────────────────────────────┐
│                     PRODUCTION ENVIRONMENT                       │
├─────────────────────────────────────────────────────────────────┤
│  PRIMARY SERVER (10.0.20.36)                                    │
│  Windows Server + Oracle 19c                                     │
│  ┌──────────────────────────────┐                              │
│  │ Database: ROA                 │                              │
│  │ Size: ~80 GB                  │                              │
│  │ Tables: 42,625                │                              │
│  └──────────────────────────────┘                              │
│         │                                                        │
│         ▼ Backups (Daily)                                       │
│  ┌──────────────────────────────┐                              │
│  │ 02:30 - FULL backup (6-7 GB) │                              │
│  │ 13:00 - CUMULATIVE (200 MB)  │                              │
│  │ 18:00 - CUMULATIVE (300 MB)  │                              │
│  └──────────────────────────────┘                              │
└─────────────────────────────────────────────────────────────────┘
                    │
                    │ SSH Transfer (Port 22)
                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                        DR ENVIRONMENT                            │
├─────────────────────────────────────────────────────────────────┤
│  PROXMOX HOST (10.0.20.202 - pveelite)                         │
│  ┌──────────────────────────────┐                              │
│  │ Backup Storage (NFS Server)   │◄─────── Monitoring Scripts  │
│  │ /mnt/pve/oracle-backups/      │         /opt/scripts/       │
│  │ └── ROA/autobackup/           │                              │
│  └──────────────────────────────┘                              │
│         │                                                        │
│         │ NFS Mount (F:\)                                       │
│         ▼                                                        │
│  ┌──────────────────────────────┐                              │
│  │ DR VM 109 (10.0.20.37)       │                              │
│  │ Windows Server + Oracle 19c   │                              │
│  │ Status: OFF (normally)        │                              │
│  │ Starts for: Tests or Disaster │                              │
│  └──────────────────────────────┘                              │
└─────────────────────────────────────────────────────────────────┘

🎯 Quick Actions

Emergency DR Activation (Production Down!)

# 1. Start DR VM
ssh root@10.0.20.202 "qm start 109"

# 2. Connect to VM (wait 3 min for boot)
ssh -p 22122 romfast@10.0.20.37

# 3. Run restore (takes ~10-15 minutes)
D:\oracle\scripts\rman_restore_from_zero.cmd

# 4. Database is now RUNNING - Update app connections to 10.0.20.37
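
The fixed three-minute wait in step 2 could be replaced by polling until the VM answers. The sketch below shows one way to do that. `wait_until` is a hypothetical helper, not part of the DR scripts, and the 300-second timeout and 10-second interval in the usage comment are assumed values.

```shell
# Hypothetical helper (not part of the DR scripts): retry a command until it
# succeeds or roughly TIMEOUT seconds of sleeping have accumulated.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
    timeout_s=$1
    interval_s=$2
    shift 2
    waited=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1    # gave up: the command never succeeded
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
    return 0
}

# Example with assumed values: poll the DR VM's SSH port for up to 5 minutes.
# wait_until 300 10 ssh -p 22122 -o BatchMode=yes -o ConnectTimeout=5 romfast@10.0.20.37 exit
```

The helper only reports whether the command eventually succeeded, so the restore in step 3 can be chained after it with `&&`.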

🧪 Weekly Test (Every Saturday)

# Automatic at 06:00 via cron, or manual:
ssh root@10.0.20.202 "/opt/scripts/weekly-dr-test-proxmox.sh"

# What it does:
# ✓ Starts VM → Restores DB → Tests → Cleanup → Shutdown
# ✓ Sends email report with results

📊 Check Backup Health

# Manual check (runs daily at 09:00 automatically)
ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh"

# Example output:
# Status: OK
# FULL backup age: 11 hours ✓
# CUMULATIVE backup age: 2 hours ✓
# Disk usage: 45% ✓
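
The age figures in that report can be derived from file modification times. The following is a minimal sketch of such a check, not the actual logic of oracle-backup-monitor-proxmox.sh (which is not shown here). The function names are hypothetical, and GNU `stat` is assumed, as found on a Debian-based Proxmox host.

```shell
# Sketch only: the real monitor script may work differently.
# Prints the age, in whole hours, of the newest file matching a glob pattern.
# Assumes GNU stat (-c %Y prints the modification time as epoch seconds).
newest_age_hours() {
    dir=$1
    pattern=$2
    newest=0
    for f in "$dir"/$pattern; do
        [ -e "$f" ] || continue              # the glob matched nothing
        mtime=$(stat -c %Y "$f")
        if [ "$mtime" -gt "$newest" ]; then newest=$mtime; fi
    done
    [ "$newest" -gt 0 ] || return 1          # no backup file found
    now=$(date +%s)
    echo $(( (now - newest) / 3600 ))
}

# Returns 0 when the newest matching backup is younger than MAX_HOURS,
# 1 when it is too old, and 2 when no backup exists at all.
backup_is_fresh() {
    age=$(newest_age_hours "$1" "$2") || return 2
    [ "$age" -lt "$3" ]
}

# Example (thresholds are passed in by the caller):
# backup_is_fresh /mnt/pve/oracle-backups/ROA/autobackup 'FULL_*.BKP' 25 || echo "FULL backup too old"
```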

🗂️ Component Locations

📁 PRIMARY Server (10.0.20.36)

D:\rman_backup\
├── rman_backup_full.txt          # RMAN script for FULL backup
├── rman_backup_incremental.txt   # RMAN script for CUMULATIVE
├── transfer_to_dr.ps1            # Transfer FULL to Proxmox
└── transfer_incremental.ps1      # Transfer CUMULATIVE to Proxmox

Scheduled Tasks:
├── 02:30 - Oracle RMAN Full Backup
├── 13:00 - Oracle RMAN Cumulative Backup
└── 18:00 - Oracle RMAN Cumulative Backup

📁 PROXMOX Host (10.0.20.202)

/opt/scripts/
├── oracle-backup-monitor-proxmox.sh  # Daily backup monitoring
├── weekly-dr-test-proxmox.sh         # Weekly DR test
└── PROXMOX_NOTIFICATIONS_README.md   # Documentation

/mnt/pve/oracle-backups/ROA/autobackup/
├── FULL_20251010_023001.BKP         # Latest FULL backup
├── INCR_20251010_130001.BKP         # CUMULATIVE 13:00
└── INCR_20251010_180001.BKP         # CUMULATIVE 18:00
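
The file names in this listing appear to follow a TYPE_YYYYMMDD_HHMMSS.BKP pattern. Assuming that pattern holds for all backups (it is inferred from the three names above, not documented), the type and timestamp can be pulled out with plain shell parameter expansion. The helper names are hypothetical.

```shell
# Assumes names follow the TYPE_YYYYMMDD_HHMMSS.BKP pattern seen in the listing.
backup_type() {
    name=${1##*/}            # strip any directory part
    echo "${name%%_*}"       # text before the first underscore
}

backup_timestamp() {
    name=${1##*/}
    stamp=${name#*_}         # drop the "TYPE_" prefix
    stamp=${stamp%.*}        # drop the ".BKP" suffix
    d=${stamp%_*}            # YYYYMMDD
    t=${stamp#*_}            # HHMMSS
    # Slice the fixed-width fields with cut to stay POSIX sh compatible.
    printf '%s-%s-%s %s:%s:%s\n' \
        "$(echo "$d" | cut -c1-4)" "$(echo "$d" | cut -c5-6)" "$(echo "$d" | cut -c7-8)" \
        "$(echo "$t" | cut -c1-2)" "$(echo "$t" | cut -c3-4)" "$(echo "$t" | cut -c5-6)"
}

# backup_type FULL_20251010_023001.BKP       prints: FULL
# backup_timestamp FULL_20251010_023001.BKP  prints: 2025-10-10 02:30:01
```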

Cron Jobs:
0 9 * * * /opt/scripts/oracle-backup-monitor-proxmox.sh
0 6 * * 6 /opt/scripts/weekly-dr-test-proxmox.sh

📁 DR VM 109 (10.0.20.37) - When Running

D:\oracle\scripts\
├── rman_restore_from_zero.cmd    # Main restore script ⭐
├── cleanup_database.cmd          # Cleanup after test
└── mount-nfs.bat                 # Mount F:\ at startup

F:\ (NFS mount from Proxmox)
└── ROA\autobackup\               # All backup files

🔄 How It Works

Backup Flow (Daily)

PRIMARY                    PROXMOX
   │                          │
   ├─02:30─FULL─Backup────────►
   │         (6-7 GB)         │
   │                          │
   ├─13:00─CUMULATIVE─────────►
   │         (200 MB)         │
   │                          │
   └─18:00─CUMULATIVE─────────►
             (300 MB)      Storage

                        ┌──────────┐
                        │ Monitor  │ 09:00 Daily
                        │ Check Age│ Alert if old
                        └──────────┘

Restore Process

Start VM → Mount F:\ → Copy Backups → RMAN Restore → Database OPEN
  2min      Auto         2min           8min           Ready!

Total Time: ~15 minutes

🔧 Manual Operations

Test Individual Components

# 1. Test backup transfer (on PRIMARY)
D:\rman_backup\transfer_incremental.ps1

# 2. Test NFS mount (on VM 109)
mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:
dir F:\ROA\autobackup

# 3. Test notification system
ssh root@10.0.20.202 "touch -d '2 days ago' /mnt/pve/oracle-backups/ROA/autobackup/*FULL*.BKP"
ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh"
# Should send WARNING notification

# 4. Test database restore (on VM 109)
D:\oracle\scripts\rman_restore_from_zero.cmd

Force Actions

# Force backup now (on PRIMARY)
rman cmdfile=D:\rman_backup\rman_backup_incremental.txt

# Force cleanup VM (on VM 109)
D:\oracle\scripts\cleanup_database.cmd

# Force VM shutdown (hard stop; for a graceful shutdown use "qm shutdown 109")
ssh root@10.0.20.202 "qm stop 109"

🐛 Troubleshooting

Backup Monitor Not Sending Alerts

# 1. Check templates exist
ssh root@10.0.20.202 "ls /usr/share/pve-manager/templates/default/oracle-*"

# 2. Reinstall templates
ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh --install"

# 3. Check Proxmox notifications work
#    (single quotes so that $(hostname) expands on the Proxmox host, not locally)
ssh root@10.0.20.202 'pvesh create /nodes/$(hostname)/apt/update'
# Should receive update notification

F:\ Drive Not Accessible in VM

# On VM 109:
# 1. Check NFS Client service
Get-Service | Where {$_.Name -like "*NFS*"}

# 2. Manual mount
mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:

# 3. Check Proxmox NFS server
ssh root@10.0.20.202 "showmount -e localhost"
# Should show: /mnt/pve/oracle-backups 10.0.20.37
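
The export check in step 3 can also be done without reading the output by eye. A minimal sketch, assuming the `showmount -e` output has one "PATH CLIENTS" line per export with comma-separated clients; `export_allows` is a hypothetical name, and subnet entries such as 10.0.20.0/24 are not interpreted.

```shell
# Sketch: read `showmount -e` output on stdin and succeed if EXPORT_PATH is
# exported to CLIENT (exact host match or "*"). Subnets are not evaluated.
export_allows() {
    export_path=$1
    client=$2
    awk -v p="$export_path" -v c="$client" '
        $1 == p {
            n = split($2, hosts, ",")
            for (i = 1; i <= n; i++)
                if (hosts[i] == c || hosts[i] == "*") found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Example:
# ssh root@10.0.20.202 "showmount -e localhost" | export_allows /mnt/pve/oracle-backups 10.0.20.37
```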

Restore Fails

# 1. Check backup files exist
dir F:\ROA\autobackup\*.BKP

# 2. Check Oracle service
sc query OracleServiceROA

# 3. Check PFILE exists
dir C:\Users\oracle\admin\ROA\pfile\initROA.ora

# 4. View restore log
type D:\oracle\logs\restore_from_zero.log

VM Won't Start

# Check VM status
ssh root@10.0.20.202 "qm status 109"

# Check VM config
ssh root@10.0.20.202 "qm config 109 | grep -E 'memory|cores|bootdisk'"

# Force unlock if locked
ssh root@10.0.20.202 "qm unlock 109"

# Start with console (qm terminal requires a serial device configured on the VM)
ssh root@10.0.20.202 "qm start 109 && qm terminal 109"
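
When scripting these checks, the state reported by `qm status` can drive the next action. The sketch below assumes the usual "status: <state>" output line. Both function names and the suggested follow-up messages are illustrative only.

```shell
# Sketch: extract the state word from `qm status` output such as "status: running".
vm_state() {
    sed -n 's/^status:[[:space:]]*//p'
}

# Suggest a next step for a given state (messages are illustrative).
next_step() {
    case $1 in
        running) echo "VM is up: try SSH on port 22122" ;;
        stopped) echo "VM is stopped: run qm start 109" ;;
        *)       echo "Unexpected state '$1': inspect qm config 109" ;;
    esac
}

# Example:
# next_step "$(ssh root@10.0.20.202 'qm status 109' | vm_state)"
```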

📈 Monitoring & Metrics

Key Metrics

Metric            Target       Alert Threshold
----------------  -----------  ---------------
FULL Backup Age   < 24h        > 25h
CUMULATIVE Age    < 6h         > 7h
Backup Size       ~7 GB/day    > 10 GB
Restore Time      < 15 min     > 30 min
Disk Usage        < 80%        > 80%
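
A threshold comparison of this kind is simple to script. The sketch below is an illustration under assumptions, not the monitoring script itself: `check_metric` and `disk_usage_pct` are hypothetical names, and all values are treated as integers.

```shell
# Sketch: compare a measured integer value against an alert threshold.
# Prints an OK or ALERT line and returns non-zero on alert.
check_metric() {
    label=$1
    value=$2
    threshold=$3
    if [ "$value" -gt "$threshold" ]; then
        echo "ALERT: $label = $value (threshold $threshold)"
        return 1
    fi
    echo "OK: $label = $value"
}

# Disk usage of the filesystem holding PATH, as a bare integer percentage.
# Uses POSIX `df -P`, whose fifth column is the capacity (for example "45%").
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example, using the disk usage threshold from the table:
# check_metric "Disk usage (%)" "$(disk_usage_pct /mnt/pve/oracle-backups)" 80
```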

Check Logs

# Backup logs (on PRIMARY)
Get-Content D:\rman_backup\logs\backup_*.log -Tail 50

# Transfer logs (on PRIMARY)
Get-Content D:\rman_backup\logs\transfer_*.log -Tail 50

# Monitoring logs (on Proxmox)
tail -50 /var/log/oracle-dr/*.log

# Restore logs (on VM 109)
type D:\oracle\logs\restore_from_zero.log
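
Long RMAN logs are easier to triage with a summary of the error codes they contain. RMAN and Oracle errors carry codes of the form RMAN-nnnnn and ORA-nnnnn, so a short filter is enough. `summarize_errors` is a hypothetical helper, and the example assumes the log has first been copied to a Linux machine.

```shell
# Sketch: count each RMAN-/ORA- error code found in a log file,
# most frequent first. Prints nothing if the log contains no such codes.
summarize_errors() {
    grep -Eo '(RMAN|ORA)-[0-9]{5}' "$1" | sort | uniq -c | sort -rn
}

# Example (the local path /tmp/restore.log is an assumption):
# ssh -p 22122 romfast@10.0.20.37 'type D:\oracle\logs\restore_from_zero.log' > /tmp/restore.log
# summarize_errors /tmp/restore.log
```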

🔐 Security & Access

SSH Keys Setup

PRIMARY (10.0.20.36) ──────► PROXMOX (10.0.20.202)
                      SSH Key
                      Port 22

LINUX WORKSTATION ─────────► PROXMOX (10.0.20.202)
                      SSH Key
                      Port 22

LINUX WORKSTATION ─────────► VM 109 (10.0.20.37)
                      SSH Key
                      Port 22122

Required Credentials

  • PRIMARY: Administrator (for scheduled tasks)
  • PROXMOX: root (for scripts and VM control)
  • VM 109: romfast (user), SYSTEM (Oracle service)

📅 Maintenance Schedule

Day       Time   Action             Duration  Impact
--------  -----  -----------------  --------  ------
Daily     02:30  FULL Backup        30 min    None
Daily     09:00  Monitor Backups    1 min     None
Daily     13:00  CUMULATIVE Backup  5 min     None
Daily     18:00  CUMULATIVE Backup  5 min     None
Saturday  06:00  DR Test            30 min    None

🚨 Disaster Recovery Procedure

When PRIMARY is DOWN:

  1. Confirm PRIMARY is unreachable

    ping 10.0.20.36  # Should fail
    
  2. Start DR VM

    ssh root@10.0.20.202 "qm start 109"
    
  3. Wait for boot (3 minutes)

  4. Connect to DR VM

    ssh -p 22122 romfast@10.0.20.37
    
  5. Run restore

    D:\oracle\scripts\rman_restore_from_zero.cmd
    
  6. Verify database

    sqlplus / as sysdba
    SELECT name, open_mode FROM v$database;
    -- Should show: ROA, READ WRITE
    
  7. Update application connections

    • Change from: 10.0.20.36:1521/ROA
    • Change to: 10.0.20.37:1521/ROA

  8. Monitor DR system

    • Database is now production
    • Do NOT run cleanup!
    • Keep VM running
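
The verification in step 6 can be automated by checking the query output for the expected row. This is a minimal sketch under assumptions: `db_is_read_write` is a hypothetical helper, and the remote invocation in the comment has not been tested against this environment.

```shell
# Sketch: read sqlplus output on stdin and succeed only if it contains a row
# with the given database name followed by READ WRITE.
# [[:space:]]* before the end anchor also tolerates Windows CR line endings.
db_is_read_write() {
    db_name=$1
    grep -Eq "^${db_name}[[:space:]]+READ WRITE[[:space:]]*$"
}

# Example (untested assumption about running sqlplus over SSH on the VM):
# printf 'SELECT name, open_mode FROM v$database;\nEXIT;\n' |
#     ssh -p 22122 romfast@10.0.20.37 'sqlplus -s / as sysdba' |
#     db_is_read_write ROA && echo "ROA is open READ WRITE"
```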

📝 Quick Reference Card

╔══════════════════════════════════════════════════════════════════════╗
║                          DR QUICK REFERENCE                          ║
╠══════════════════════════════════════════════════════════════════════╣
║ PRIMARY DOWN?                                                        ║
║ ssh root@10.0.20.202                                                 ║
║ qm start 109                                                         ║
║ # Wait 3 min                                                         ║
║ ssh -p 22122 romfast@10.0.20.37                                      ║
║ D:\oracle\scripts\rman_restore_from_zero.cmd                         ║
╠══════════════════════════════════════════════════════════════════════╣
║ TEST DR?                                                             ║
║ ssh root@10.0.20.202 "/opt/scripts/weekly-dr-test-proxmox.sh"        ║
╠══════════════════════════════════════════════════════════════════════╣
║ CHECK BACKUPS?                                                       ║
║ ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh" ║
╠══════════════════════════════════════════════════════════════════════╣
║ SUPPORT:                                                             ║
║ Logs: /var/log/oracle-dr/                                            ║
║ Docs: /opt/scripts/PROXMOX_NOTIFICATIONS_README.md                   ║
╚══════════════════════════════════════════════════════════════════════╝

Last Updated: October 10, 2025
Version: 2.0 - Complete DR System with Proxmox Integration
Status: Production Ready