Oracle DR: Complete cleanup and restore scripts with Proxmox integration

- Remove outdated planning documents and implementation guides
- Update README with comprehensive DR procedures and monitoring
- Enhance rman_restore_from_zero.cmd with SPFILE creation and auto-start
- Add Proxmox monitoring and weekly test scripts
- Archive old implementation documentation

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-10-10 15:13:29 +03:00
parent cbad9ee779
commit b44e3c8f9b
10 changed files with 2034 additions and 463 deletions
--- a/oracle/standby-server-scripts/README.md
+++ b/oracle/standby-server-scripts/README.md
@@ -1,445 +1,389 @@
-# Oracle ROA - Disaster Recovery Setup
-## Backup-Based DR: Windows PRIMARY (10.0.20.36) → Linux DR (10.0.20.37)
+# 🛡️ Oracle DR System - Complete Architecture

-**Database:** ROA (Accounting)
-**Strategy:** 4-Level Backup Protection
-**RTO:** 45-75 minutes
-**RPO:** Max 1 day (the last backup, taken at 02:00 AM)
+## 📊 System Overview

---
-
-## 📋 SYSTEM COMPONENTS
-
-### PRIMARY Server (10.0.20.36 - Windows)
- Oracle 19c SE2 database ROA (production)
- Daily RMAN backup at 02:00 AM (COMPRESSED)
- Transfer to DR at 03:00 AM
- Copy to external HDD at 21:00
-
-### DR Server (10.0.20.37 - Linux LXC 109)
- Docker container: `oracle-standby`
- Oracle 19c installed (database STOPPED until a disaster)
- Receives backups automatically from PRIMARY
- Retention: 1 backup (ONLY the most recent - relevant for accounting!)
-
---
-
-## 🗂️ FILES IN THIS DIRECTORY
-
-| File | Description | Used On |
-|--------|-----------|------------|
-| `01_rman_backup_upgraded.txt` | Upgraded RMAN script with compression | PRIMARY (Windows) |
-| `02_transfer_to_dr.ps1` | PowerShell script to transfer backups → DR | PRIMARY (Windows) |
-| `03_setup_dr_transfer_task.ps1` | Task Scheduler setup for the transfer | PRIMARY (Windows) |
-| `04_full_dr_restore.sh` | FULL restore script on DR (disaster recovery) | DR (Linux) |
-| `05_test_restore_dr.sh` | MONTHLY restore test (verifies DR capability) | DR (Linux) |
-| `06_quick_verify_backups.sh` | DAILY backup verification (monitoring) | DR (Linux) |
-| **OPTIONAL - Incremental Backups (improved RPO):** | | |
-| `01b_rman_backup_incremental.txt` | Incremental RMAN script (midday) | PRIMARY (Windows) |
-| `02b_transfer_incremental_to_dr.ps1` | Incremental transfer → DR | PRIMARY (Windows) |
-| `03b_setup_incremental_tasks.ps1` | Task setup for incremental backups | PRIMARY (Windows) |
-| **Documentation:** | | |
-| `STRATEGIE_BACKUP_CONTABILITATE.md` | Documentation of the complete strategy | Reference |
-| `STRATEGIE_INCREMENTAL.md` | Incremental backup for a better RPO (OPTIONAL) | Reference |
-| `PLAN_BACKUP_DR_SIMPLE.md` | Original detailed technical plan | Reference |
-| `VERIFICARE_DR.md` | Guide to verifying and testing DR capability | Reference |
-| `RATIONAL_RETENTIE.md` | Rationale for REDUNDANCY 1 for accounting | Reference |
-| `README.md` | This file - quick start guide | Reference |
-
---
-
-## 🚀 QUICK SETUP (Quick Start)
-
-### Step 1: Setup SSH Keys (PRIMARY → DR)
-
-```powershell
-# On PRIMARY (10.0.20.36) - PowerShell as Administrator
-ssh-keygen -t rsa -b 4096 -f "$env:USERPROFILE\.ssh\id_rsa" -N '""'
-
-# Display the public key
-Get-Content "$env:USERPROFILE\.ssh\id_rsa.pub"
-# Copy the OUTPUT
 ```
+┌─────────────────────────────────────────────────────────────────┐
+│                     PRODUCTION ENVIRONMENT                       │
+├─────────────────────────────────────────────────────────────────┤
+│  PRIMARY SERVER (10.0.20.36)                                    │
+│  Windows Server + Oracle 19c                                     │
+│  ┌──────────────────────────────┐                              │
+│  │ Database: ROA                 │                              │
+│  │ Size: ~80 GB                  │                              │
+│  │ Tables: 42,625                │                              │
+│  └──────────────────────────────┘                              │
+│         │                                                        │
+│         ▼ Backups (Daily)                                       │
+│  ┌──────────────────────────────┐                              │
+│  │ 02:30 - FULL backup (6-7 GB) │                              │
+│  │ 13:00 - CUMULATIVE (200 MB)  │                              │
+│  │ 18:00 - CUMULATIVE (300 MB)  │                              │
+│  └──────────────────────────────┘                              │
+└─────────────────────────────────────────────────────────────────┘
+                    │
+                    │ SSH Transfer (Port 22)
+                    ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                        DR ENVIRONMENT                            │
+├─────────────────────────────────────────────────────────────────┤
+│  PROXMOX HOST (10.0.20.202 - pveelite)                         │
+│  ┌──────────────────────────────┐                              │
+│  │ Backup Storage (NFS Server)   │◄─────── Monitoring Scripts  │
+│  │ /mnt/pve/oracle-backups/      │         /opt/scripts/       │
+│  │ └── ROA/autobackup/           │                              │
+│  └──────────────────────────────┘                              │
+│         │                                                        │
+│         │ NFS Mount (F:\)                                       │
+│         ▼                                                        │
+│  ┌──────────────────────────────┐                              │
+│  │ DR VM 109 (10.0.20.37)       │                              │
+│  │ Windows Server + Oracle 19c   │                              │
+│  │ Status: OFF (normally)        │                              │
+│  │ Starts for: Tests or Disaster │                              │
+│  └──────────────────────────────┘                              │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## 🎯 Quick Actions
+
+### ⚡ Emergency DR Activation (Production Down!)

 ```bash
-# On DR Server (10.0.20.37)
-ssh root@10.0.20.37
+# 1. Start DR VM
+ssh root@10.0.20.202 "qm start 109"

-# Add the public key
-mkdir -p /root/.ssh
-chmod 700 /root/.ssh
-nano /root/.ssh/authorized_keys
-# PASTE the public key here, then save (Ctrl+X, Y, Enter)
-chmod 600 /root/.ssh/authorized_keys
+# 2. Connect to VM (wait 3 min for boot)
+ssh -p 22122 romfast@10.0.20.37

-exit
+# 3. Run restore (takes ~10-15 minutes)
+D:\oracle\scripts\rman_restore_from_zero.cmd
+
+# 4. Database is now RUNNING - Update app connections to 10.0.20.37
 ```

-```powershell
-# Connection test (on PRIMARY)
-ssh -i "$env:USERPROFILE\.ssh\id_rsa" root@10.0.20.37 "echo 'SSH OK'"
-# You should see "SSH OK" WITHOUT a password prompt!
-```
-
---
-
-### Step 2: Upgrade the RMAN Backup Script (PRIMARY)
-
-```powershell
-# On PRIMARY - back up the old script
-Copy-Item "D:\rman_backup\rman_backup.txt" "D:\rman_backup\rman_backup.txt.backup_$(Get-Date -Format 'yyyyMMdd')"
-
-# Copy the contents of 01_rman_backup_upgraded.txt
-# into D:\rman_backup\rman_backup.txt
-
-# OR directly:
-# Copy-Item "\\path\to\01_rman_backup_upgraded.txt" "D:\rman_backup\rman_backup.txt"
-```
-
-**What the upgrade does:**
- ✅ Adds compression → shrinks the backup from 23GB to ~8GB
- ✅ Includes ARCHIVELOG DELETE INPUT
- ✅ REDUNDANCY 1 (keeps only the last backup - relevant for accounting!)
- ✅ BACKUP VALIDATE (integrity check after the backup)
- ✅ Parallelism 2 channels (faster)
-
---
-
-### Step 3: Install the Transfer Script (PRIMARY)
-
-```powershell
-# Create the logs directory
-New-Item -ItemType Directory -Force -Path "D:\rman_backup\logs"
-
-# Copy the script
-Copy-Item "\\path\to\02_transfer_to_dr.ps1" "D:\rman_backup\transfer_to_dr.ps1"
-
-# Manual test
-PowerShell -ExecutionPolicy Bypass -File "D:\rman_backup\transfer_to_dr.ps1"
-```
-
---
-
-### Step 4: Setup Task Scheduler (PRIMARY)
-
-```powershell
-# Run the setup script as Administrator
-PowerShell -ExecutionPolicy Bypass -File "\\path\to\03_setup_dr_transfer_task.ps1"
-
-# OR manually:
-$action = New-ScheduledTaskAction -Execute "PowerShell.exe" `
-    -Argument "-ExecutionPolicy Bypass -File D:\rman_backup\transfer_to_dr.ps1"
-
-$trigger = New-ScheduledTaskTrigger -Daily -At "03:00AM"
-
-$principal = New-ScheduledTaskPrincipal -UserId "SYSTEM" `
-    -LogonType ServiceAccount -RunLevel Highest
-
-Register-ScheduledTask -TaskName "Oracle_DR_Transfer" `
-    -Action $action -Trigger $trigger -Principal $principal
-
-# Verify
-Get-ScheduledTask -TaskName "Oracle_DR_Transfer"
-```
-
---
-
-### Step 5: Setup DR Server (Linux)
+### 🧪 Weekly Test (Every Saturday)

 ```bash
-# On DR Server (10.0.20.37)
-ssh root@10.0.20.37
+# Automatic at 06:00 via cron, or manual:
+ssh root@10.0.20.202 "/opt/scripts/weekly-dr-test-proxmox.sh"

-# Directories are already created; verify:
-ls -la /opt/oracle/backups/primary/
-ls -la /opt/oracle/scripts/dr/
-ls -la /opt/oracle/logs/dr/
-
-# Verify the Docker container
-docker ps | grep oracle-standby
-
-# Verify the Oracle software
-docker exec -u oracle oracle-standby bash -c 'ls -la $ORACLE_HOME/bin/rman'
+# What it does:
+# ✓ Starts VM → Restores DB → Tests → Cleanup → Shutdown
+# ✓ Sends email report with results
 ```

-**The restore script (`04_full_dr_restore.sh`) is already installed on DR!**
-
---
-
-## 🔥 DISASTER RECOVERY - Emergency Procedure
-
-### When should you activate DR?
-
-**✅ YES - Activate DR if:**
- PRIMARY server 10.0.20.36 has NOT responded for >30 minutes
- Oracle database is corrupt (will not open)
- Disk crash on C:\ or D:\
- Ransomware / malware
-
-**❌ NO - Do not activate DR for:**
- Minor performance problems
- A user accidentally deleting a few records
- Windows restart or maintenance
- Errors fixable in <30 minutes
-
---
-
-### DR Procedure (60 minutes)
+### 📊 Check Backup Health

 ```bash
-# Connect to the DR server
-ssh root@10.0.20.37
+# Manual check (runs daily at 09:00 automatically)
+ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh"

-# IMPORTANT: Verify that PRIMARY is REALLY down!
-ping -c 10 10.0.20.36
-# If it responds → STOP! Do NOT continue!
-
-# Run the restore script
-/opt/oracle/scripts/dr/full_dr_restore.sh
-
-# Monitor progress
-tail -f /opt/oracle/logs/dr/restore_*.log
-
-# After ~45-60 minutes, verify the database is OPEN
-docker exec -u oracle oracle-standby bash -c "
-export ORACLE_SID=ROA
-export ORACLE_HOME=/opt/oracle/product/19c/dbhome_1
-\$ORACLE_HOME/bin/sqlplus / as sysdba <<< 'SELECT name, open_mode FROM v\$database;'
-"
-
-# Expected output:
-# NAME      OPEN_MODE
-# --------- ----------
-# ROA       READ WRITE
+# Output:
+# Status: OK
+# FULL backup age: 11 hours ✓
+# CUMULATIVE backup age: 2 hours ✓
+# Disk usage: 45% ✓
 ```

-**After restore:**
-1. Update connection strings: `10.0.20.36:1521/ROA` → `10.0.20.37:1521/ROA`
-2. Notify users
-3. Test applications
-4. Monitor performance
-
---
-
-## 📊 ARCHITECTURE FLOW
+## 🗂️ Component Locations

+### 📁 PRIMARY Server (10.0.20.36)
 ```
-┌──────────────────────────────────────────────┐
-│     PRIMARY 10.0.20.36 (Windows)             │
-│                                              │
-│  02:00 → RMAN Backup COMPRESSED              │
-│          └─ FRA: ~8GB (vs 23GB original)     │
-│                   ↓                          │
-│  21:00 → MareBackup (EXISTING)              │
-│          └─ Copy → E:\backup_roa\           │
-│                   ↓                          │
-│  03:00 → Transfer DR (NEW)                  │
-│          └─ SCP → 10.0.20.37                │
-│                                              │
-└──────────────────────────────────────────────┘
-                    ↓ SSH/SCP
-┌──────────────────────────────────────────────┐
-│     DR 10.0.20.37 (Linux LXC 109)           │
-│     Docker: oracle-standby                   │
-│                                              │
-│  /opt/oracle/backups/primary/                │
-│  ├─ *.BKP  (backup files)                   │
-│  └─ Retention: 1 backup (latest only!)      │
-│                                              │
-│  Database: STOPPED (started on disaster)    │
-│                                              │
-│  On disaster:                                │
-│  → /opt/oracle/scripts/dr/full_dr_restore.sh│
-│  → RTO: 45-75 minutes                       │
-│  → RPO: Max 1 day                           │
-│                                              │
-└──────────────────────────────────────────────┘
+D:\rman_backup\
+├── rman_backup_full.txt          # RMAN script for FULL backup
+├── rman_backup_incremental.txt   # RMAN script for CUMULATIVE
+├── transfer_to_dr.ps1            # Transfer FULL to Proxmox
+└── transfer_incremental.ps1      # Transfer CUMULATIVE to Proxmox
+
+Scheduled Tasks:
+├── 02:30 - Oracle RMAN Full Backup
+├── 13:00 - Oracle RMAN Cumulative Backup
+└── 18:00 - Oracle RMAN Cumulative Backup
 ```

---
-
-## ✅ IMPLEMENTATION CHECKLIST
-
-### Pre-Implementation
- [ ] Back up the old RMAN script (`rman_backup.txt.backup_*`)
- [ ] Check disk space on PRIMARY (C:\, D:\, E:\)
- [ ] Check disk space on DR (`/opt/oracle` >50GB free)
- [ ] Container `oracle-standby` is running on DR
-
-### Setup SSH (30 minutes)
- [ ] Generate SSH keys on PRIMARY
- [ ] Copy the public key to DR
- [ ] Test the passwordless connection
- [ ] Verify the firewall allows port 22
-
-### PRIMARY Setup (20 minutes)
- [ ] Upgrade `rman_backup.txt` (add compression)
- [ ] Copy `transfer_to_dr.ps1` to `D:\rman_backup\`
- [ ] Create directory `D:\rman_backup\logs\`
- [ ] Setup Task Scheduler (Oracle_DR_Transfer at 03:00 AM)
- [ ] Manually test the transfer script
-
-### DR Setup (10 minutes)
- [ ] Verify directories (`/opt/oracle/backups/primary`)
- [ ] Script `full_dr_restore.sh` installed
- [ ] Correct permissions (oracle:dba)
- [ ] Oracle container functional
-
-### Testing (60 minutes)
- [ ] Manual RMAN backup test (verify compression)
- [ ] Manual transfer test (verify backups arrive on DR)
- [ ] Check transfer logs (no errors)
- [ ] Test restore on DR (OPTIONAL but RECOMMENDED!)
-
-### Go-Live
- [ ] Monitor for 3 consecutive nights
- [ ] Review logs daily
- [ ] Document issues
- [ ] Update documentation
-
---
-
-## 📈 MONITORING
-
-### Daily Checks (5 minutes)
-
-```powershell
-# On PRIMARY - quick health check
-# Check 1: Last backup
-$lastBackup = Get-ChildItem "C:\Users\Oracle\recovery_area\ROA\BACKUPSET" -Recurse -File |
-    Sort-Object LastWriteTime -Descending | Select-Object -First 1
-$age = (Get-Date) - $lastBackup.LastWriteTime
-Write-Host "Last backup: $($age.Hours) hours ago"
-
-# Check 2: Transfer log
-Get-Content "D:\rman_backup\logs\transfer_*.log" | Select-String "completed successfully" | Select-Object -Last 1
-
-# Check 3: Disk space
-Get-PSDrive C,D,E | Format-Table Name, @{L="Free(GB)";E={[math]::Round($_.Free/1GB,1)}}
+### 📁 PROXMOX Host (10.0.20.202)
 ```
+/opt/scripts/
+├── oracle-backup-monitor-proxmox.sh  # Daily backup monitoring
+├── weekly-dr-test-proxmox.sh         # Weekly DR test
+└── PROXMOX_NOTIFICATIONS_README.md   # Documentation
+
+/mnt/pve/oracle-backups/ROA/autobackup/
+├── FULL_20251010_023001.BKP         # Latest FULL backup
+├── INCR_20251010_130001.BKP         # CUMULATIVE 13:00
+└── INCR_20251010_180001.BKP         # CUMULATIVE 18:00
+
+Cron Jobs:
+0 9 * * * /opt/scripts/oracle-backup-monitor-proxmox.sh
+0 6 * * 6 /opt/scripts/weekly-dr-test-proxmox.sh
+```
+
+### 📁 DR VM 109 (10.0.20.37) - When Running
+```
+D:\oracle\scripts\
+├── rman_restore_from_zero.cmd    # Main restore script ⭐
+├── cleanup_database.cmd          # Cleanup after test
+└── mount-nfs.bat                 # Mount F:\ at startup
+
+F:\ (NFS mount from Proxmox)
+└── ROA\autobackup\               # All backup files
+```
+
+## 🔄 How It Works
+
+### Backup Flow (Daily)
+```
+PRIMARY                    PROXMOX
+   │                          │
+   ├─02:30─FULL─Backup────────►
+   │         (6-7 GB)         │
+   │                          │
+   ├─13:00─CUMULATIVE─────────►
+   │         (200 MB)         │
+   │                          │
+   └─18:00─CUMULATIVE─────────►
+             (300 MB)      Storage
+
+                        ┌──────────┐
+                        │ Monitor  │ 09:00 Daily
+                        │ Check Age│ Alert if old
+                        └──────────┘
+```
+
+### Restore Process
+```
+Start VM → Mount F:\ → Copy Backups → RMAN Restore → Database OPEN
+  2min      Auto         2min           8min           Ready!
+
+Total Time: ~15 minutes
+```
+
+## 🔧 Manual Operations
+
+### Test Individual Components

 ```bash
-# On DR - weekly
-ssh root@10.0.20.37 "ls -lth /opt/oracle/backups/primary/*.BKP | head -5"
+# 1. Test backup transfer (on PRIMARY)
+D:\rman_backup\transfer_incremental.ps1
+
+# 2. Test NFS mount (on VM 109)
+mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:
+dir F:\ROA\autobackup
+
+# 3. Test notification system
+ssh root@10.0.20.202 "touch -d '2 days ago' /mnt/pve/oracle-backups/ROA/autobackup/*FULL*.BKP"
+ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh"
+# Should send WARNING notification
+
+# 4. Test database restore (on VM 109)
+D:\oracle\scripts\rman_restore_from_zero.cmd
 ```

-### Weekly Checks (10 minutes)
+### Force Actions

 ```bash
-# On DR - check backup status
-ssh root@10.0.20.37 "/opt/oracle/scripts/dr/06_quick_verify_backups.sh"
+# Force backup now (on PRIMARY)
+rman cmdfile=D:\rman_backup\rman_backup_incremental.txt
+
+# Force cleanup VM (on VM 109)
+D:\oracle\scripts\cleanup_database.cmd
+
+# Force VM shutdown
+ssh root@10.0.20.202 "qm stop 109"
 ```

-### Monthly Tasks (MANDATORY!)
+## 🐛 Troubleshooting

-**First Sunday of the month - full RESTORE TEST:**
+### ❌ Backup Monitor Not Sending Alerts

 ```bash
-# On DR - restore test (takes 45-75 min)
-ssh root@10.0.20.37
-/opt/oracle/scripts/dr/05_test_restore_dr.sh
+# 1. Check templates exist
+ssh root@10.0.20.202 "ls /usr/share/pve-manager/templates/default/oracle-*"

-# Check the report
-cat /opt/oracle/logs/dr/test_report_$(date +%Y%m%d).txt
+# 2. Reinstall templates
+ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh --install"
+
+# 3. Check Proxmox notifications work
+#    (single quotes so $(hostname) expands on the Proxmox host, not locally)
+ssh root@10.0.20.202 'pvesh create /nodes/$(hostname)/apt/update'
+# Should receive update notification
 ```

- **Review:** Metrics, logs, disk space, RTO
- **Update:** Documentation if needed
- **Notify:** Management of the test result
-
---
-
-## 🐛 TROUBLESHOOTING
-
-### "Transfer failed - SSH connection refused"
-
-```powershell
-# Connection test
-ping 10.0.20.37
-ssh -v -i "$env:USERPROFILE\.ssh\id_rsa" root@10.0.20.37 "echo OK"
-```
-
-**Solutions:**
- Verify the DR server is powered on
- Check firewall (port 22)
- Regenerate SSH keys
-
---
-
-### "RMAN backup failed"
-
-```sql
- On PRIMARY
-sqlplus / as sysdba
-
-- Check FRA usage
-SELECT * FROM v$recovery_area_usage;
-
- Manual cleanup
-RMAN> DELETE NOPROMPT OBSOLETE;
-```
-
-**Solutions:**
- Disk full → clean up old backups
- FRA quota exceeded → increase size
- Oracle process crash → restart database
-
---
-
-### "Restore failed on DR"
+### ❌ F:\ Drive Not Accessible in VM

 ```bash
-# Check backup files integrity
-md5sum /opt/oracle/backups/primary/*.BKP
+# On VM 109:
+# 1. Check NFS Client service
+Get-Service | Where {$_.Name -like "*NFS*"}

-# Check container logs
-docker logs oracle-standby --tail 100
+# 2. Manual mount
+mount -o rw,nolock,mtype=hard,timeout=60 10.0.20.202:/mnt/pve/oracle-backups F:

-# Check Oracle alert log
-docker exec oracle-standby tail -100 /opt/oracle/diag/rdbms/roa/ROA/trace/alert_ROA.log
+# 3. Check Proxmox NFS server
+ssh root@10.0.20.202 "showmount -e localhost"
+# Should show: /mnt/pve/oracle-backups 10.0.20.37
+```
+
+### ❌ Restore Fails
+
+```bash
+# 1. Check backup files exist
+dir F:\ROA\autobackup\*.BKP
+
+# 2. Check Oracle service
+sc query OracleServiceROA
+
+# 3. Check PFILE exists
+dir C:\Users\oracle\admin\ROA\pfile\initROA.ora
+
+# 4. View restore log
+type D:\oracle\logs\restore_from_zero.log
+```
+
+### ❌ VM Won't Start
+
+```bash
+# Check VM status
+ssh root@10.0.20.202 "qm status 109"
+
+# Check VM config
+ssh root@10.0.20.202 "qm config 109 | grep -E 'memory|cores|bootdisk'"
+
+# Force unlock if locked
+ssh root@10.0.20.202 "qm unlock 109"
+
+# Start with console
+ssh root@10.0.20.202 "qm start 109 && qm terminal 109"
+```
+
+## 📈 Monitoring & Metrics
+
+### Key Metrics
+| Metric | Target | Alert Threshold |
+|--------|--------|-----------------|
+| FULL Backup Age | < 24h | > 25h |
+| CUMULATIVE Age | < 6h | > 7h |
+| Backup Size | ~7 GB/day | > 10 GB |
+| Restore Time | < 15 min | > 30 min |
+| Disk Usage | < 80% | > 80% |
+
+### Check Logs
+
+```bash
+# Backup logs (on PRIMARY)
+Get-Content D:\rman_backup\logs\backup_*.log -Tail 50
+
+# Transfer logs (on PRIMARY)
+Get-Content D:\rman_backup\logs\transfer_*.log -Tail 50
+
+# Monitoring logs (on Proxmox)
+tail -50 /var/log/oracle-dr/*.log
+
+# Restore logs (on VM 109)
+type D:\oracle\logs\restore_from_zero.log
+```
+
+## 🔐 Security & Access
+
+### SSH Keys Setup
+```
+PRIMARY (10.0.20.36) ──────► PROXMOX (10.0.20.202)
+                      SSH Key
+                      Port 22
+
+LINUX WORKSTATION ─────────► PROXMOX (10.0.20.202)
+                      SSH Key
+                      Port 22
+
+LINUX WORKSTATION ─────────► VM 109 (10.0.20.37)
+                      SSH Key
+                      Port 22122
+```
+
+### Required Credentials
+- **PRIMARY**: Administrator (for scheduled tasks)
+- **PROXMOX**: root (for scripts and VM control)
+- **VM 109**: romfast (user), SYSTEM (Oracle service)
+
+## 📅 Maintenance Schedule
+
+| Day | Time | Action | Duration | Impact |
+|-----|------|--------|----------|--------|
+| Daily | 02:30 | FULL Backup | 30 min | None |
+| Daily | 09:00 | Monitor Backups | 1 min | None |
+| Daily | 13:00 | CUMULATIVE Backup | 5 min | None |
+| Daily | 18:00 | CUMULATIVE Backup | 5 min | None |
+| Saturday | 06:00 | DR Test | 30 min | None |
+
+## 🚨 Disaster Recovery Procedure
+
+### When PRIMARY is DOWN:
+
+1. **Confirm PRIMARY is unreachable**
+   ```bash
+   ping 10.0.20.36  # Should fail
+   ```
+
+2. **Start DR VM**
+   ```bash
+   ssh root@10.0.20.202 "qm start 109"
+   ```
+
+3. **Wait for boot (3 minutes)**
+
+4. **Connect to DR VM**
+   ```bash
+   ssh -p 22122 romfast@10.0.20.37
+   ```
+
+5. **Run restore**
+   ```cmd
+   D:\oracle\scripts\rman_restore_from_zero.cmd
+   ```
+
+6. **Verify database**
+   ```sql
+   sqlplus / as sysdba
+   SELECT name, open_mode FROM v$database;
+   -- Should show: ROA, READ WRITE
+   ```
+
+7. **Update application connections**
+   - Change from: 10.0.20.36:1521/ROA
+   - Change to: 10.0.20.37:1521/ROA
+
+8. **Monitor DR system**
+   - Database is now production
+   - Do NOT run cleanup!
+   - Keep VM running
+
+## 📝 Quick Reference Card
+
+```
+╔══════════════════════════════════════════════════════════════╗
+║                    DR QUICK REFERENCE                        ║
+╠══════════════════════════════════════════════════════════════╣
+║ PRIMARY DOWN?                                                ║
+║ ssh root@10.0.20.202                                        ║
+║ qm start 109                                                 ║
+║ # Wait 3 min                                                 ║
+║ ssh -p 22122 romfast@10.0.20.37                            ║
+║ D:\oracle\scripts\rman_restore_from_zero.cmd                ║
+╠══════════════════════════════════════════════════════════════╣
+║ TEST DR?                                                     ║
+║ ssh root@10.0.20.202 "/opt/scripts/weekly-dr-test-proxmox.sh"║
+╠══════════════════════════════════════════════════════════════╣
+║ CHECK BACKUPS?                                               ║
+║ ssh root@10.0.20.202 "/opt/scripts/oracle-backup-monitor-proxmox.sh"║
+╠══════════════════════════════════════════════════════════════╣
+║ SUPPORT:                                                     ║
+║ Logs: /var/log/oracle-dr/                                   ║
+║ Docs: /opt/scripts/PROXMOX_NOTIFICATIONS_README.md          ║
+╚══════════════════════════════════════════════════════════════╝
 ```

 ---

-## 📞 SUPPORT
-
-### Log Locations
-
-| Type | Location |
-|-----|----------|
-| **RMAN Backup** | Oracle Alert Log |
-| **Transfer DR** | `D:\rman_backup\logs\transfer_YYYYMMDD.log` |
-| **Restore DR** | `/opt/oracle/logs/dr/restore_*.log` |
-| **Task Scheduler** | Event Viewer > Task Scheduler |
-
-### Escalation
-
-| Severity | Response Time | Action |
-|----------|---------------|--------|
-| **P1 - PRIMARY Down** | Immediate | Activate DR |
-| **P2 - Backup Failed** | 2 hours | Retry manual |
-| **P3 - Transfer Failed** | 4 hours | Retry next night |
-
---
-
-## 📚 FULL DOCUMENTATION
-
-For complete technical details, see:
- **`STRATEGIE_BACKUP_CONTABILITATE.md`** - The complete 4-level protection strategy
- **`PLAN_BACKUP_DR_SIMPLE.md`** - Original detailed technical plan
-
---
-
-## ✨ NEXT STEPS
-
-1. **Read this README in full**
-2. **Follow the IMPLEMENTATION CHECKLIST** (section above)
-3. **Manually test** all components
-4. **Monitor** the first 3 days after activation
-5. **Schedule the first monthly restore test** (mandatory!)
-
---
-
-**Last updated:** 2025-10-07
-**Status:** Production Ready
-**Versiune:** 1.0
+**Last Updated:** October 10, 2025
+**Version:** 2.0 - Complete DR System with Proxmox Integration
+**Status:** ✅ Production Ready
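
Review note: the Key Metrics table in the new README lists alert thresholds (FULL backup older than 25h, CUMULATIVE older than 7h), but this diff does not show how `oracle-backup-monitor-proxmox.sh` evaluates them. The sketch below is one way such an age check could be written, not the script's actual implementation. The function names `newest_age_hours` and `classify_age` are hypothetical, and the code assumes GNU `find` and `date`, as found on a Debian-based Proxmox host.

```shell
#!/usr/bin/env bash
# Sketch only: thresholds come from the README's Key Metrics table.
# Function names are hypothetical and not taken from the real monitor script.

# Print the age, in whole hours, of the newest file in directory $1
# whose name matches glob pattern $2. Return 1 if nothing matches.
newest_age_hours() {
    local dir=$1 pattern=$2 newest now
    # %T@ prints the modification time as seconds since the epoch (GNU find).
    newest=$(find "$dir" -maxdepth 1 -type f -name "$pattern" -printf '%T@\n' 2>/dev/null \
        | sort -n | tail -n 1)
    [ -n "$newest" ] || return 1
    now=$(date +%s)
    # Drop the fractional part of the timestamp before integer arithmetic.
    echo $(( (now - ${newest%.*}) / 3600 ))
}

# Print WARNING if age $1 (hours) exceeds alert threshold $2 (hours), else OK.
classify_age() {
    local age=$1 limit=$2
    if [ "$age" -gt "$limit" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Example use on the Proxmox host (paths from the README):
#   dir=/mnt/pve/oracle-backups/ROA/autobackup
#   age=$(newest_age_hours "$dir" 'FULL_*.BKP') && classify_age "$age" 25
#   age=$(newest_age_hours "$dir" 'INCR_*.BKP') && classify_age "$age" 7
```

One point to confirm against the real script: with the schedule in the README, the 09:00 check would find the newest CUMULATIVE backup from 18:00 the previous day, roughly 15 hours old, so a 7h alert threshold would warn on every run unless the script handles that gap.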