ROMFASTSQL

Author	SHA1	Message	Date
Marius	989477f7a4	Add ROA Oracle Database Windows setup scripts with old client support PowerShell scripts for setting up Oracle 21c/XE with ROA application: - Automated tablespace, user creation and imports - sqlnet.ora config for Instant Client 11g/ODBC compatibility - Oracle 21c read-only Home path handling (homes/OraDB21Home1) - Listener restart + 10G password verifier for legacy auth - Tested on VM 302 with CONTAFIN_ORACLE schema import Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 17:08:02 +02:00
Marius	665c2b5d37	Add Oracle 18c sqlnet.ora config for old ODBC/Instant Client 11g compatibility - Add config/sqlnet.ora with ALLOWED_LOGON_VERSION=8 for old client support - Add scripts/fix-sqlnet.sh startup script to persist config across container restarts - Update README with ORA-28040 troubleshooting, ODBC connection params, and deployment instructions - Fix SID description: Oracle 18c has PDB (XEPDB1), not non-CDB - Update container recreation instructions with startup scripts volume Resolves ORA-28040: No matching authentication protocol when connecting from Windows ODBC with Oracle Instant Client 11.2 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 00:22:03 +02:00
Marius	fb474c3726	cleanup	2026-01-27 23:43:20 +02:00
Marius	d2b24c1c47	Update Oracle 18c/21c export scripts and documentation - Increase LXC 108 memory from 4GB to 8GB + 2GB swap - Add manual startup/shutdown instructions for Oracle containers - Document CDB/PDB architecture and correct connection strings - Fix export-roa2.sh: use XEPDB1 PDB for Oracle 18c, separate DMPDIR - Fix export-roa2.ps1: dual DMPDIR paths, auto-start containers - Add container/database status checks before export - Add TNS entries with SERVICE_NAME=XEPDB1 (not SID=XE) - Document DBMS_CUBE_EXP warnings as harmless Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 23:42:34 +02:00
Marius	7c6e54f018	Add CLAUDE.md for Claude Code guidance Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 18:07:23 +02:00
Marius	c5936791d0	Rename claude-agent to lxc171-claude-agent with full setup documentation - Rename proxmox/claude-agent/ to proxmox/lxc171-claude-agent/ - Move scripts to scripts/ subdirectory - Add complete installation guide for new LXC from scratch - Update proxmox/README.md with LXC 171 documentation and navigation - Add LXC 171 to containers table - Remove .serena/project.yml Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 17:54:17 +02:00
Marius	a567f75f25	Reorganize oracle/ and chatbot/ into proxmox/ per LXC/VM structure - Move oracle/migration-scripts/ to proxmox/lxc108-oracle/migration/ - Move oracle/roa/ and oracle/roa-romconstruct/ to proxmox/lxc108-oracle/sql/ - Move oracle/standby-server-scripts/ to proxmox/vm109-windows-dr/ - Move chatbot/ to proxmox/lxc104-flowise/ - Update proxmox/README.md with new structure and navigation - Update all documentation with correct directory references - Remove unused input/claude-agent-sdk/ files Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 17:28:53 +02:00
Marius	4d51d5b2d2	Reorganize proxmox documentation into subdirectories per LXC/VM - Create cluster/ for Proxmox cluster infrastructure (SSH guide, HA monitor, UPS) - Create lxc108-oracle/ for Oracle Database documentation and scripts - Create vm201-windows/ for Windows 11 VM docs and SSL certificate scripts - Add SSL certificate monitoring scripts (check-ssl-certificates.ps1, monitor-ssl-certificates.sh) - Remove archived VM107 references (decommissioned) - Update all cross-references between files - Update main README.md with new structure and navigation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 17:02:49 +02:00
Marius	1da4c2347c	Fix Oracle 10g compatibility in PACK_CONTAFIN SCRIE_JC_2007 procedure Replace FORALL bulk operations with FOR loops to avoid PLS-00436 error on Oracle 10.2.0.5. The older Oracle version does not support referencing record fields from collection in FORALL statements. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-15 13:06:17 +02:00
Marius	a96b9b8d8b	romconstruct	2026-01-15 13:00:14 +02:00
Marius	1011d9202c	Fix UPS notifications and add periodic battery status emails - Fix permission denied on log files (chown nut:nut) - Fix upssched.conf permissions (root:nut) - Add sudo for perl to allow PVE::Notify from user nut - Add periodic battery status emails every minute when on battery - Add charging status emails at 5, 10, 30 min after power restore - Remove diacritics from all notification messages - Update documentation with sudo and permissions setup Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 13:06:25 +02:00
Marius	ab6ac77d50	Add UPS email notifications and automatic UPS shutdown - Add email notifications via PVE::Notify for all UPS events: - ONBATT: when UPS switches to battery - ONLINE: when power is restored - LOWBATT: critical battery level - SHUTDOWN_START/NODE/PRIMARY: during cluster shutdown - COMMBAD: communication lost with UPS - Add automatic UPS shutdown command after cluster shutdown (protects against power surge when power returns) - Update upssched.conf with ONLINE handler and immediate ONBATT notification - Add notification templates for HTML and text emails - Update documentation with new features and timer configuration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-13 20:11:30 +02:00
Marius	e0f84298e9	cleanup	2026-01-11 15:29:12 +02:00
Marius	00c6410dbd	Document VM 201 power outage incident and update HA configuration - Add troubleshooting guide for 2026-01-11 power outage incident - Update vm201-windows11.md with correct storage details (disk-1, disk-3) - Remove HA configuration, document manual failover procedure - Add ZFS replication status and commands - Document lessons learned: ISO attachments block migration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-11 15:28:00 +02:00
Marius	594b77e449	Add Claude Agent LXC setup and workflow scripts - Create LXC 171 (claude-agent) on pveelite with Ubuntu 24.04 - Install Node.js 20.x, Claude Code, tmux, Tailscale - Configure SSH access and Gitea integration - Add workflow scripts: start-agent.sh, work.sh, new-task.sh, finish-task.sh - Add code-server for mobile file browsing - Document complete setup in proxmox/claude-agent/README.md LXC Details: - IP internal: 10.0.20.171 - IP Tailscale: 100.95.55.51 - code-server: port 8080 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-31 18:53:23 +02:00
Marius	f01341a707	Add Claude Code configuration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-31 01:03:23 +02:00
Marius	42f5d5ac85	Add chatbot infrastructure documentation Document Flowise and ngrok configuration on LXC 104, including troubleshooting steps for CORS and version issues. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-31 01:02:50 +02:00
Marius	91b9e08e9d	general assembly	2025-12-20 22:23:11 +02:00
Marius	90d77704d6	Reorganize Proxmox documentation with clear structure and VM/LXC mapping ## Changes ### Documentation Reorganization - README.md: Complete restructure with logical sections - Infrastructure General (proxmox-ssh-guide.md) - LXC Containers (oracle-database-lxc108.md) - Virtual Machines (vm201-.md) - Cluster-Wide Resources (cluster-ha-monitor.sh, ups/) - Archived/Decommissioned (archived-vm107-monitor.sh) - Added quick navigation "Am nevoie să..." section - Added recommended workflows - Added complete directory structure map - proxmox-ssh-guide.md: Added documentation references section - Clear links to all related documentation - When to use each document - Quick start snippets for each resource ### File Renames for Clarity - `certificat-letsencrypt-iis.md` → `vm201-certificat-letsencrypt-iis.md` - `troubleshooting-vm201-backup-nfs.md` → `vm201-troubleshooting-backup-nfs.md` - `ha-monitor.sh` → `cluster-ha-monitor.sh` - `vm107-monitor.sh` → `archived-vm107-monitor.sh` ### New Documentation - vm201-windows11.md: Complete VM 201 documentation - Hardware configuration - Installed services (IIS, SQLPlus, WinNUT, RDP) - Network configuration - Backup and recovery procedures - Common troubleshooting ## Benefits - Clear naming convention: VM/LXC/Cluster prefixes - Central index in README.md with navigation - Cross-references between documents - Complete VM 201 documentation suite - Clear archival of decommissioned resources 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:43:44 +02:00
Marius	cd7b2ed9e7	Clarify storage configuration and fix node names Storage Configuration improvements: - Add "Noduri" column showing which nodes have access to each storage - Clarify that 'local' is separate on each node (non-shared) - Clarify that 'local-zfs' is shared across pvemini, pve1, pveelite - Clarify that 'backup' is only on pvemini (10.0.20.201) - Add detailed explanations for each storage type - Add storage paths section with important locations Node name corrections: - Fix node name: pve2 → pveelite (correct cluster name) - Update all references across proxmox-ssh-guide.md and README.md - Add node descriptions in tables for clarity Benefits: - Users now know exactly which storage is available on which nodes - Clear distinction between shared and non-shared storage - Correct node naming throughout documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:01:08 +02:00
Marius	f1b982794b	Reorganize Oracle and Proxmox documentation structure - Move oracle/CONEXIUNI-ORACLE.md → proxmox/oracle-database-lxc108.md - Create proxmox/README.md as documentation index - Update proxmox-ssh-guide.md: * Remove VM 107 references (decommissioned) * Update LXC and VM tables with IP addresses * Add IP address map for all services * Simplify Oracle section (detailed info in oracle-database-lxc108.md) * Update backup job configuration Benefits: - All infrastructure docs in proxmox/ directory - Clear separation: general Proxmox (proxmox-ssh-guide.md) vs Oracle-specific (oracle-database-lxc108.md) - No duplicate information between files - Easy navigation with README.md index 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 12:58:22 +02:00
Marius	b4c2a24281	Fix Oracle DR test ORA-00600 error by forcing service shutdown in cleanup Problem: DR weekly test failed with ORA-00600 [kcbzib_kcrsds_1] when executed via cron, but succeeded when run manually. Error occurred during "ALTER DATABASE OPEN RESETLOGS" step after successful restore and recovery. Root Cause Analysis: - Manual test (12:09): Undo initialization = 0ms, no errors - Cron test (10:45): Undo initialization = 2735ms, ORA-00600 crash - Alert log showed: "Undo initialization recovery: err:600" - Oracle instance was in inconsistent state from previous run The cleanup_database.ps1 script had an "optimization" that preserved the running Oracle service to "save ~30s startup time". This left the service in an inconsistent state between test runs, causing Oracle to crash when attempting to open the database with RESETLOGS. Solution: Modified cleanup_database.ps1 to ALWAYS stop Oracle service completely: 1. SHUTDOWN ABORT the instance (not just when /AFTER flag) 2. Stop-Service OracleServiceROA (force clean state) 3. Kill remaining oracle processes 4. Service starts fresh during restore (clean Undo initialization) Changes: - Removed if/else branch that skipped shutdown before restore - Always perform full shutdown regardless of /AFTER parameter - Updated messages to reflect clean state approach - Added explanation: "This ensures no state inconsistencies (prevents ORA-00600)" Testing: Manual test confirmed clean 0ms Undo initialization after fix. Related: Works in conjunction with weekly-dr-test-proxmox.sh PATH fix (commit `34f91ba`) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-06 12:25:38 +02:00
Marius	34f91ba206	Fix Oracle DR test cron execution by adding explicit PATH Problem: The weekly DR test script worked when run manually but failed when executed via cron with "Failed to start VM 109" error at 0 seconds. Cause: Cron jobs run with a minimal PATH that doesn't include /usr/sbin where Proxmox commands (qm, pvesh, etc.) are located. Manual execution had the full PATH including /usr/sbin. Solution: Added explicit PATH export at the start of the script to ensure all required system binaries are accessible regardless of execution context. Testing: Successfully verified with cron test at 11:32 - VM started properly, restore process completed normally. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-06 11:37:27 +02:00
Marius	c715a0a89d	transfer backups	2025-10-31 01:19:15 +02:00
Marius	63bcdf5c7f	copy backup	2025-10-31 01:05:31 +02:00
Marius	13a7cd6d96	IIS SSL certificate	2025-10-26 18:52:44 +02:00
Marius	bc75ce30c2	Add chatbot documentation and Claude agent SDK resources	2025-10-21 16:07:35 +03:00
Marius	132b4fb34b	Proxmox HA: Fix false FAILED alerts and suppress cron notification emails Fixed two critical issues with HA monitoring: 1. False positive quorum errors - corosync-quorumtool not in cron PATH 2. Unwanted cron emails from PVE::Notify INFO messages to STDERR Changes: - Set proper PATH including /usr/sbin for corosync-quorumtool - Split notification code: verbose shows all, non-verbose redirects STDERR to /dev/null - Prevents cron from sending duplicate notification emails 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-16 13:35:43 +03:00
Marius	8bb494c60e	Oracle DR: Fix backup retention to keep exactly 2 days instead of 3 Changed -mtime logic from +$RetentionDays to +($RetentionDays - 1) to correctly implement 2-day retention. Previously kept 3 days (today + 2 previous), now keeps exactly 2 days (today + yesterday). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-15 22:54:02 +03:00
Marius	b50cc2b8c4	Oracle DR: Fix backup retention and monitoring for new naming convention Problem: Backups accumulated on DR (73 files, 4 days) instead of keeping only 2 days - transfer_incremental.ps1 had no cleanup function (ran 2x/day without cleanup) - transfer_to_dr.ps1 cleanup had poor logging - oracle-backup-monitor-proxmox.sh couldn't detect new L0/L1 backup format Changes: - Add cleanup to transfer_incremental.ps1 (delete backups older than 2 days) - Improve cleanup logging in transfer_to_dr.ps1 (shows count before/after) - Update oracle-backup-monitor-proxmox.sh to detect both naming conventions: * Old: FULL.BKP, INCR.BKP * New: L0_.BKP (Level 0), L1_.BKP (Level 1) - Remove temporary files from /input/ directory Result: Monitor now correctly reports backup age, cleanup runs after each transfer 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-14 18:05:11 +03:00
Marius	249bf4d98a	debugging	2025-10-11 19:18:37 +03:00
Marius	1b523c1624	Oracle DR: Add comprehensive restore test debugging guide to README - Add section 'Debugging Restore Tests' with practical troubleshooting commands - Check backup files on Proxmox: list, count, verify timestamps - Verify backup files on DR VM: NFS mount, file counts, sizes - Check DR test results: parse logs for PASSED/FAILED status - Simulate test locally: manual restore steps for debugging - Common issues table with checks and fixes - Verify naming convention is active (L0_, L1_ format) - Manual test run with verbose output for real-time monitoring Helps diagnose issues like: - False FAILED notifications - Missing datafiles - RMAN-06023 errors - Backup selection problems Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 19:03:08 +03:00
Marius	8da1208ca7	Oracle DR: Fix false FAILED notification - parse database status from log - Replace complex SSH+PowerShell query with simple log file parsing - rman_restore_from_zero.ps1 already verifies and outputs database status - Parse 'OPEN_MODE: READ WRITE' and 'TABLES: <count>' from LOG_FILE - Fixes issue where successful restore was reported as FAILED - More reliable: avoids SSH escaping issues with Select-String -Quiet Root cause: SSH+PowerShell+sqlplus+Select-String chain was too fragile and returned empty/false even when database was successfully opened (42625 tables). Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 18:55:05 +03:00
Marius	a7273d1820	Oracle RMAN: Fix backup location - add full path to FORMAT - Add full path to FORMAT in rman_backup.txt and rman_backup_incremental.txt - Files now stored in C:\Users\oracle\recovery_area\ROA\autobackup- Fixes issue where backups were created in ORACLE_HOME\DATABASE instead of recovery area - Ensures transfer_to_dr.ps1 can find and transfer all backups correctly Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 16:26:56 +03:00
Marius	62848e635d	Oracle DR: Add naming convention to RMAN backups for smart restore selection - Add FORMAT to rman_backup.txt: L0_, ARC_, SPFILE_, CF_ - Add FORMAT to rman_backup_incremental.txt: L1_, ARC_, SPFILE_, CF_ - Update rman_restore_from_zero.ps1 TestMode to select files by naming convention - Select only latest L0 backup set + all L1 incrementals/archives (faster DR tests) - Backward compatible with old autobackup naming (fallback to copy all) - Fixes missing datafiles issue (previously only copied 8 files, now copies full backup set) Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 16:12:41 +03:00
Marius	f1002d6e4a	Oracle DR: Add /AFTER parameter to cleanup - smart shutdown based on context Critical fix based on user analysis: PROBLEM: Cleanup is called in 2 contexts with different requirements: 1. BEFORE restore (from rman_restore): Should NOT shutdown 2. AFTER restore (from weekly-test): MUST shutdown to delete files USER INSIGHT: "Why shutdown if restore will clean anyway? But AFTER restore, you MUST shutdown to release file locks for deletion!" SOLUTION: Add /AFTER parameter to cleanup_database.ps1: WITHOUT /AFTER (before restore): - Skip SHUTDOWN ABORT - Skip Stop-Service - Leave service in current state (running/stopped) - Files CAN be deleted (no lock before restore) - Optimization: If service running → restore saves ~30s WITH /AFTER (after restore): - SHUTDOWN ABORT (stop instance) - Stop-Service (release file locks) - REQUIRED for file deletion after restore - Files are locked by active instance/service CALL SITES: 1. rman_restore: cleanup_database.ps1 /SILENT (no /AFTER) 2. weekly-test: cleanup_database.ps1 /SILENT /AFTER (with /AFTER) FLOW OPTIMIZATION: Test 1: Service stopped → start(30s) → restore → cleanup /AFTER Test 2: Service stopped → start(30s) → restore → cleanup /AFTER → No improvement yet BUT if we keep service running between tests: Test 1: Service stopped → start(30s) → restore → cleanup /AFTER Test 2: Service running → restore(0s saved!) → cleanup /AFTER → Save 30s on subsequent tests! Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 15:34:00 +03:00
Marius	5af33fc217	Oracle DR: Add SHUTDOWN ABORT before NOMOUNT to clean auto-started instance Critical fix for service auto-start behavior: Problem (identified by user): - When Oracle service starts, it automatically tries to start instance - Uses configured PFILE which references control files - After cleanup, control files don't exist - Instance ends up in partial/error state - STARTUP NOMOUNT may fail or behave unexpectedly Root Cause: - Oracle service on Windows has auto-start behavior - Service startup takes ~30s trying to start instance - Without valid control files, instance is partially started - This interferes with manual STARTUP NOMOUNT Solution: Before STARTUP NOMOUNT, explicitly clean any existing instance: ```sql SHUTDOWN ABORT; -- Clean any partial instance STARTUP NOMOUNT PFILE='...'; -- Fresh clean start ``` Implementation: - Use WHENEVER SQLERROR CONTINUE (SHUTDOWN may error if no instance) - Explicit SHUTDOWN ABORT before NOMOUNT - Ensures clean instance state for RMAN restore - Service running + clean NOMOUNT instance = ready for restore User requirement met: Instance in NOMOUNT state (not mounted/open) Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 15:18:45 +03:00
Marius	4b7bb29b9e	Oracle DR: Fix service start with polling and timeout (not blocking) Critical fix - service MUST be running for SQLPlus connection: Problem (confirmed by user): - After cleanup, service is stopped - sqlplus / as sysdba → ORA-12560: TNS:protocol adapter error - Start-Service blocks indefinitely (user saw 25+ warnings) - Service takes ~30 seconds to start Previous attempt (WRONG): - Assumed SQLPlus works with stopped service ✗ - User proved ORA-12560 occurs when service stopped ✓ Correct Solution: - Start service in background job (non-blocking) - Poll service status every 3 seconds - Timeout after 60 seconds (2x expected startup time) - Progress logging every 15 seconds - Cleanup background job when done Implementation: ```powershell Start-Job { Start-Service OracleServiceROA } while (elapsed < 60s) { if (service.Status == Running) → break sleep 3s } ``` Result: - Service starts in ~30s (user confirmed) - Script doesn't block - SQL*Plus can connect successfully - Graceful fallback if timeout exceeded Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 15:14:24 +03:00
Marius	e4df4c11d8	Oracle DR: Fix service start hang - don't start stopped service Critical fix for service preservation: Problem: - After cleanup, Oracle service is stopped - Start-Service attempts to start the instance automatically - Without database files, service startup hangs indefinitely - PowerShell Start-Service blocks waiting for service to start Root Cause: - Oracle service on Windows tries to auto-start the instance - With no controlfile/database files, it cannot start - Start-Service waits forever (user reported 25+ warnings) Solution: - Do NOT attempt to start the stopped service - SQLPlus can connect '/ as sysdba' even if service is stopped - STARTUP NOMOUNT will manually start the instance - This is the correct Oracle workflow for restore from zero Windows SQLPlus requirements: ✓ ORACLE_SID set (we set this) ✓ Service exists in registry (preserved after cleanup) ✓ ORACLE_HOME set (we set this) ✗ Service running NOT required for NOMOUNT startup The service will naturally transition to Running state when STARTUP NOMOUNT successfully starts the instance. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 15:12:21 +03:00
Marius	4256d5a914	Oracle DR: Optimize backup copy - TestMode only copies latest backup set Major performance optimization for weekly DR tests: TestMode (weekly testing): - Copy ONLY latest full backup + everything after it - Includes: latest DAILY_FULL + incrementals + controlfiles + SPFILE - Excludes: older full backups (not needed for testing) - Benefit: ~60-70% reduction (14GB → 4-5GB) - Copy time: 2min → 30-45sec (saves ~1-1.5 min) - Risk: Low - testing only needs to verify latest backup works Standalone Mode (real DR): - Copy ALL backups (unchanged behavior) - Includes: all full backups + redundancy for fallback - Benefit: Maximum safety for disaster recovery - If latest backup corrupted → RMAN uses previous backup Implementation: - Finds latest DAILY_FULL.BKP (Level 0 backup) - Gets its timestamp - Copies all *.BKP files >= that timestamp - Automatic inclusion of incrementals, controlfiles, SPFILE backups Combined optimization results: - VM polling: saves 60-120s - Service preservation: saves 40s - Backup copy (TestMode): saves 60-90s Total: 160-250 seconds (2.5-4 minutes) per test Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 14:58:12 +03:00
Marius	5750b42836	Oracle DR: Replace fixed VM boot wait with intelligent polling Performance optimization for VM startup: Before: Fixed 180s wait regardless of actual boot time After: Intelligent polling with early exit when VM is ready Implementation: - Poll every 5 seconds (max 180s timeout) - Check 1: VM running status in Proxmox (qm status) - Check 2: SSH connectivity test - Check 3: PowerShell availability (what we actually need) - Exit immediately when all checks pass - Progress logging every 30 seconds - Fallback: Continue after 180s with warning Benefits: - Fast VM boot (30s) → saves 150s (2min 30s) - Normal VM boot (60s) → saves 120s (2min) - Slow VM boot → 180s (same as before) - More robust: verifies SSH+PowerShell actually work Average expected improvement: 60-120 seconds per test Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 14:30:32 +03:00
Marius	835d8b465b	Oracle DR: Fix database verification, add bash log, and collect full RMAN log Critical fixes and improvements: 1. Database verification fix (robust): - Use Select-String -Quiet to get True/False boolean - Convert PowerShell boolean to bash-friendly format - Check for 'READ WRITE' in entire sqlplus output - Eliminates false negatives from text parsing issues 2. Collect FULL RMAN restore log: - Removed -Head 200 limitation - Now sends complete RMAN log in email - Better debugging with full context - Updated templates: "first 200 lines" → "complete" 3. Add bash script log to email notifications: - Include last 100 lines of bash execution log - Separate "RMAN Restore Log" and "Bash Script Log" sections - Both text and HTML templates updated - Shows script flow and any bash-level errors This fixes the issue where 42,625 tables were restored successfully but test reported FAILED due to query output format mismatch. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 14:25:58 +03:00
Marius	12700261c7	Oracle DR: Fix database verification and restore log collection Critical fixes for false negatives in DR test reporting: 1. Database verification fix: - Changed from 'findstr' (CMD) to 'Select-String' (PowerShell native) - findstr was failing in PowerShell context causing db_status to be empty - Result: DB with 42,625 tables was incorrectly reported as FAILED 2. Restore log collection fix: - Changed from 'type' (CMD) to 'Get-Content' (PowerShell native) - type command doesn't work through SSH PowerShell context - Added -ErrorAction SilentlyContinue for cleaner error handling - Simplified fallback logic using [-z] instead of string matching Both issues were caused by mixing CMD commands in PowerShell context. Now uses PowerShell-native commands throughout for consistency. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 11:11:37 +03:00
Marius	c4504cac70	Oracle DR: Increase recovery area size to 50G Adjust db_recovery_file_dest_size in auto-generated PFILE: - Previous: 20G - New: 50G - Reason: Provide more space for RMAN restore operations and backups Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 11:06:33 +03:00
Marius	eade344f28	Oracle DR: Auto-create PFILE if missing using tested configuration Enhancement to rman_restore_from_zero.ps1: - Auto-generate initROA.ora if not found at service creation - Uses exact tested configuration from initROA.ora: - memory_target=1024M (tested DR VM allocation) - _allow_resetlogs_corruption=TRUE (critical for DR restore!) - control_files in oradata + recovery_area - Standard Oracle 19c parameters for DR environment Benefits: - Script is now fully self-sufficient - No manual PFILE setup required - DR VM can be restored from completely clean state - Uses battle-tested configuration Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 11:05:17 +03:00
Marius	5bed910b1c	Oracle DR: Optimize test speed by preserving service between tests Performance improvements: - cleanup_database.ps1: Skip service deletion (saves ~25s per test) - Remove oradim -delete, sc.exe delete, registry cleanup - Add SPFILE deletion to ensure PFILE-based startup - Service now persists between tests for reuse - rman_restore_from_zero.ps1: Smart service check (saves ~15s per test) - Check if service exists before creating - Skip oradim -new if service already present - Only create service on first run or if missing Total time savings: ~40 seconds per weekly DR test Service lifecycle: Created once, reused indefinitely until manual cleanup Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 10:59:43 +03:00
Marius	3a51880c9e	Oracle DR: Fix RMAN crosscheck sequence and improve error handling - Fix CROSSCHECK BACKUP command to execute after database is mounted - Correct CATALOG command to use recovery_area instead of F:\ path - Add robust backup file validation with detailed error reporting - Improve file-by-file backup copying with individual error tracking - Enhance restore log collection for both success and failure scenarios - Fix database verification to check OPEN_MODE instead of STATUS - Add comprehensive directory and permissions error handling Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 10:32:49 +03:00
Marius	9ed0ee9e0e	Oracle DR: Add TestMode parameter for dual behavior rman_restore_from_zero.ps1: - Add -TestMode switch parameter - TestMode (weekly DR test): Skip service/listener config, only verify restore works - Standalone mode: Full config with SPFILE + Listener for production use weekly-dr-test-proxmox.sh: - Call restore script with -TestMode flag - Avoids service recreation and SSH disconnect during tests Benefits: - Weekly tests are faster and cleaner (no service restart) - Manual restore prepares system for production use - No more 'Broken pipe' errors during tests Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 02:33:43 +03:00
Marius	f79331f7cc	Oracle DR: Fix service recreation causing SSH disconnect Remove service delete/recreate at step 3.3 that was causing 'Broken pipe' error Service is already configured with auto-start at step 2.1 - no need to recreate Issue: oradim -delete was killing running database and breaking SSH connection Solution: Skip recreation, service already has correct auto-start configuration Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 02:29:38 +03:00
Marius	6accd1f996	Oracle DR: Fix verification commands and auto-start services weekly-dr-test-proxmox.sh: - Replace Unix commands (echo, grep) with PowerShell equivalents - Use PowerShell Select-String for database status verification - Fix table count query to work properly through SSH rman_restore_from_zero.ps1: - Set Oracle service to AUTOMATIC startup (was manual) - Set Listener service to AUTOMATIC startup - Auto-start Listener after database restore - Add fallback to lsnrctl if service start fails Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2025-10-11 02:03:57 +03:00
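
Several commits in this log (5750b42836 for VM boot, 4b7bb29b9e for the Oracle service start) replace a fixed wait or a blocking call with polling plus a timeout. The following is only a minimal bash sketch of that pattern, not the code from the repository: the function name `wait_until` and its argument order are hypothetical, chosen for illustration.

```bash
#!/usr/bin/env bash
# wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Hypothetical helper (an assumption, not taken from the repository).
# Runs CMD every INTERVAL seconds and returns 0 as soon as CMD succeeds,
# or 1 once TIMEOUT seconds of waiting have accumulated without success.
# Only the sleep time is counted, so slow checks extend the real elapsed time.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0    # check passed: exit early instead of waiting out the timeout
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1            # timed out; the caller decides whether to warn and continue
}
```

With the values from commit 5750b42836 (poll every 5 seconds, give up after 180), a call might look like `wait_until 180 5 vm_is_ready`, where `vm_is_ready` stands for a hypothetical function wrapping the checks that commit lists (VM status, SSH connectivity, PowerShell availability).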


86 Commits