4 Commits

Author SHA1 Message Date
Claude Agent
8846c9c855 docs(dr): document failback DR -> PRIMARY procedure + restore script
Adds end-to-end procedure for moving production back from DR (10.0.20.37)
to a repaired/reinstalled PRIMARY (10.0.20.36): final RMAN backup on DR
in restricted/read-only mode, RMAN restore on PRIMARY, app connection
switch, scheduled-task reactivation, VM 109 stop. The companion PowerShell
script handles the restore with sanity checks (IP, NFS, backup freshness)
and aborts if the Oracle major version is not 19: failback to 21c would
need an extra dictionary upgrade step (~30-60 min) that adds untested
risk during the critical window. The recommended path is to fail back on
19c and upgrade later in a planned window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 20:18:09 +00:00
Claude Agent
1d67f0705b docs(dr): refresh README with post-2026-04-25 architecture
VM 109 returned to its original home on pveelite, co-located with
oracle-backups NFS storage. The README is updated to reflect that:
the VM is now in HA (ha-prefer-pveelite, state=stopped, nofailback=1)
rather than excluded from HA, and the new layered defences (trap
guard, watchdog cron, dynamic memory pre-flight, max_restart caps)
are documented alongside the original 8a0c557 trap.

Adds a Storage Failover section describing the pveelite -> pvemini
manual failover flow: email alert from pveelite-down-alert.sh,
failover-dr-to-pvemini.sh on the surviving node, failback when
pveelite returns. The pve1 nightly mirror is the third copy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 19:19:38 +00:00
Claude Agent
1203c24d63 docs(proxmox): document HA, corosync tuning, diagnostic tools and mail relay
Following the 2026-04-20 cluster outage, the cluster README now covers
HA resource limits, corosync token tuning (10s tolerance for USB glitches),
rasdaemon/netconsole/kdump diagnostic stack on pvemini, mail relay via
mail.romfast.ro with SMTP auth, OOM alerting via cron, and swap on pveelite.

VM 109 README now clearly states it was removed from HA and is only
started by the weekly DR test script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 11:30:46 +00:00
Marius
a567f75f25 Reorganize oracle/ and chatbot/ into proxmox/ per LXC/VM structure
- Move oracle/migration-scripts/ to proxmox/lxc108-oracle/migration/
- Move oracle/roa/ and oracle/roa-romconstruct/ to proxmox/lxc108-oracle/sql/
- Move oracle/standby-server-scripts/ to proxmox/vm109-windows-dr/
- Move chatbot/ to proxmox/lxc104-flowise/
- Update proxmox/README.md with new structure and navigation
- Update all documentation with correct directory references
- Remove unused input/claude-agent-sdk/ files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 17:28:53 +02:00