Compare commits: 8da1208ca7...249bf4d98a

2 commits: 249bf4d98a, 1b523c1624

@@ -195,6 +195,131 @@ ssh root@10.0.20.202 "qm stop 109"

## 🐛 Troubleshooting

### 🔍 Debugging Restore Tests

#### Check Backup Files on Proxmox (10.0.20.202)

```bash
# 1. List all backup files with size and date (newest first)
ssh root@10.0.20.202 "ls -lht /mnt/pve/oracle-backups/ROA/autobackup/*.BKP"

# 2. Count backup files
ssh root@10.0.20.202 "ls /mnt/pve/oracle-backups/ROA/autobackup/*.BKP | wc -l"

# 3. Check latest backups (last 24 hours)
ssh root@10.0.20.202 "find /mnt/pve/oracle-backups/ROA/autobackup -name '*.BKP' -mtime -1 -ls"

# 4. Show backup files grouped by type (with new naming convention)
ssh root@10.0.20.202 "ls -lh /mnt/pve/oracle-backups/ROA/autobackup/ | grep -E '(L0_|L1_|ARC_|SPFILE_|CF_|O1_MF)'"

# 5. Check disk space usage
ssh root@10.0.20.202 "df -h /mnt/pve/oracle-backups"
ssh root@10.0.20.202 "du -sh /mnt/pve/oracle-backups/ROA/autobackup/"

# 6. Verify newest Level 0 backup timestamp (only the most recent L0 file is inspected)
ssh root@10.0.20.202 "ls -t /mnt/pve/oracle-backups/ROA/autobackup/L0_*.BKP 2>/dev/null | head -1 | xargs -r stat | grep Modify || echo 'No L0 backups with new naming'"
```
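
Check 3 above can be turned into a pass/fail test for use in a cron job or monitoring script. The following is a minimal sketch, not part of the existing DR scripts: the function name `has_recent_backup` is hypothetical, and it assumes that "recent" means modified within the last 24 hours, as in the `find -mtime -1` call above.

```bash
# Hypothetical helper: is there at least one *.BKP modified in the last 24 hours?
# Prints the number of recent files and returns 0 only if that number is > 0.
has_recent_backup() (
    dir="$1"
    # -mtime -1 matches files modified less than 24 hours ago (same test as check 3).
    recent=$(find "$dir" -name '*.BKP' -mtime -1 2>/dev/null | wc -l)
    # Arithmetic expansion strips any padding that wc may add.
    echo "recent_backups=$((recent))"
    [ "$((recent))" -gt 0 ]
)
```

Run on the Proxmox host, it would be called as `has_recent_backup /mnt/pve/oracle-backups/ROA/autobackup`; a non-zero exit status means no backup arrived in the last day.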

#### Verify Backup Files on DR VM (when running)

```powershell
# 1. Check NFS mount is accessible
Test-Path F:\ROA\autobackup

# 2. List all backup files
Get-ChildItem F:\ROA\autobackup\*.BKP | Format-Table Name, Length, LastWriteTime

# 3. Count backup files
(Get-ChildItem F:\ROA\autobackup\*.BKP).Count

# 4. Show total backup size
"{0:N2} GB" -f ((Get-ChildItem F:\ROA\autobackup\*.BKP | Measure-Object -Property Length -Sum).Sum / 1GB)

# 5. Check latest Level 0 backup
Get-ChildItem F:\ROA\autobackup\L0_*.BKP -ErrorAction SilentlyContinue | Sort-Object LastWriteTime -Descending | Select-Object -First 1

# 6. Check what was copied during last restore
Get-Content D:\oracle\logs\restore_from_zero.log | Select-String "Copying|Copied"
```

#### Check DR Test Results

```bash
# 1. View the last 100 lines of the latest DR test log
ssh root@10.0.20.202 "ls -t /var/log/oracle-dr/dr_test_*.log | head -1 | xargs tail -n 100"

# 2. Check test status (passed/failed)
ssh root@10.0.20.202 "grep -E 'PASSED|FAILED|Database Verification' /var/log/oracle-dr/dr_test_*.log | tail -5"

# 3. See backup selection logic output
ssh root@10.0.20.202 "grep -A5 'TEST MODE: Selecting' /var/log/oracle-dr/dr_test_*.log | tail -20"

# 4. Check how many files were selected
ssh root@10.0.20.202 "grep 'Total files selected' /var/log/oracle-dr/dr_test_*.log | tail -1"

# 5. View RMAN/ORA errors (if any)
ssh root@10.0.20.202 "grep -iE 'RMAN-|ORA-' /var/log/oracle-dr/dr_test_*.log | tail -20"
```
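
Checks 2 and 5 above can be combined into a one-line summary of a single log. This is a minimal sketch under stated assumptions: the function name `summarize_dr_log` is hypothetical, the log is assumed to contain the literal markers `PASSED` / `FAILED` and `RMAN-nnnnn` / `ORA-nnnnn` codes that the greps above look for, and treating any `FAILED` marker as outranking `PASSED` is a choice made here, not documented behavior of the DR scripts.

```bash
# Hypothetical helper (not part of the DR scripts): summarize one DR test log.
# Assumption: a FAILED marker anywhere in the log outranks PASSED.
summarize_dr_log() (
    log="$1"
    status="UNKNOWN"
    errors=0
    if [ -r "$log" ]; then
        if grep -q 'FAILED' "$log"; then
            status="FAILED"
        elif grep -q 'PASSED' "$log"; then
            status="PASSED"
        fi
        # Count lines that carry an RMAN-nnnnn or ORA-nnnnn code.
        # grep -c exits 1 when the count is 0, hence the "|| true".
        errors=$(grep -Ec '(RMAN|ORA)-[0-9]+' "$log" || true)
    fi
    echo "status=$status errors=$errors"
    # Succeed only when the log says PASSED.
    [ "$status" = "PASSED" ]
)
```

Given a log path as its only argument, the function prints `status=... errors=...` and returns a non-zero status for anything other than a passed test, which makes it usable as a condition in an alerting script.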

#### Simulate Test Locally (on DR VM)

```powershell
# 1. Start Oracle service manually
Start-Service OracleServiceROA

# 2. Run cleanup to prepare for restore
D:\oracle\scripts\cleanup_database.ps1 /SILENT

# 3. Run restore in test mode
D:\oracle\scripts\rman_restore_from_zero.ps1 -TestMode

# 4. Verify database opened correctly
#    (the @script argument is quoted so PowerShell does not treat @ as the splatting operator)
sqlplus / as sysdba "@D:\oracle\scripts\verify_db.sql"

# 5. Check what backups were used
Get-Content D:\oracle\logs\restore_from_zero.log | Select-String "backup piece"

# 6. View database verification output
Get-Content D:\oracle\logs\restore_from_zero.log | Select-String -Pattern "DB_NAME|OPEN_MODE|TABLES" -Context 0,1
```

#### Common Restore Test Issues

| Issue | Check | Fix |
|-------|-------|-----|
| Test reports FAILED but DB is open | Check log for "OPEN_MODE: READ WRITE" | Already fixed in latest version |
| Missing datafiles in restore | Count backup files: should be 15-40+ | Wait for next full backup or copy all files |
| "No backups found" error | Verify NFS mount: `Test-Path F:\` | Remount NFS or check Proxmox NFS service |
| Restore takes > 30 min | Check backup size: should be ~5-8 GB | Normal for first restore after format change |
| RMAN-06023 errors | Check for `L0_*.BKP` files on `F:\` | Old format: need new backup with naming convention |

#### Verify Naming Convention is Active

```bash
# Check if new naming convention is being used (after Oct 11, 2025)
ssh root@10.0.20.202 "ls /mnt/pve/oracle-backups/ROA/autobackup/ | grep -E '^(L0_|L1_|ARC_|SPFILE_|CF_)' | wc -l"
# Should return > 0 if active

# If 0, backups are still using old format (O1_MF_ANNNN_*)
# Wait for next scheduled backup (02:30 daily) or run manual backup
```
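
A per-prefix breakdown shows more than a single total, for example whether archive logs exist but no Level 0 backup does. The sketch below assumes the prefixes listed in this section (`L0_`, `L1_`, `ARC_`, `SPFILE_`, `CF_`, plus the old `O1_MF_` format); the function name `count_backup_types` is hypothetical and not part of the existing scripts.

```bash
# Hypothetical helper: count files per naming-convention prefix in a directory.
# Prints one "PREFIX=COUNT" line per known prefix, including the old O1_MF_ format.
count_backup_types() (
    dir="$1"
    for prefix in L0_ L1_ ARC_ SPFILE_ CF_ O1_MF_; do
        # find (rather than a shell glob) returns nothing, not an error, on no match.
        count=$(find "$dir" -maxdepth 1 -type f -name "${prefix}*" 2>/dev/null | wc -l)
        echo "${prefix}=$((count))"
    done
)
```

On the Proxmox host this would be run as `count_backup_types /mnt/pve/oracle-backups/ROA/autobackup`. If only the `O1_MF_` line is non-zero, backups are still in the old format.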

#### Manual Test Run with Verbose Output

```bash
# Run test with full output visible
ssh root@10.0.20.202
cd /opt/scripts
./weekly-dr-test-proxmox.sh 2>&1 | tee /tmp/dr_test_manual.log

# Watch in real-time what's happening
# Look for these key stages:
# - "TEST MODE: Selecting latest backup set"
# - "Total files selected: XX"
# - "RMAN restore completed successfully"
# - "OPEN_MODE: READ WRITE"
```
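
After the run, the four key stages listed above can be checked against the saved log instead of scrolling through it. This is a minimal sketch: the function name `check_dr_stages` is hypothetical, and it assumes the stage messages appear in the log verbatim as quoted in the comments above.

```bash
# Hypothetical helper: report which key stages appear in a DR test log.
# Prints "OK" or "MISSING" per stage; returns 0 only if every stage was found.
check_dr_stages() (
    log="$1"
    missing=0
    for stage in \
        "TEST MODE: Selecting latest backup set" \
        "Total files selected:" \
        "RMAN restore completed successfully" \
        "OPEN_MODE: READ WRITE"
    do
        # -F matches the stage text literally, not as a regular expression.
        if grep -qF -- "$stage" "$log" 2>/dev/null; then
            echo "OK      $stage"
        else
            echo "MISSING $stage"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
)
```

For the manual run above, the call would be `check_dr_stages /tmp/dr_test_manual.log`. The first `MISSING` line indicates the stage at which the test stopped.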

### ❌ Backup Monitor Not Sending Alerts

```bash
@@ -384,6 +509,6 @@ LINUX WORKSTATION ─────────► VM 109 (10.0.20.37)

---

-**Last Updated:** October 10, 2025
+**Last Updated:** October 11, 2025
-**Version:** 2.0 - Complete DR System with Proxmox Integration
+**Version:** 2.1 - Added restore test debugging guide + naming convention
**Status:** ✅ Production Ready