Claude Agent 62e9926bd4 feat(dr): add cluster + memory pre-flight, deploy VM 109 watchdog
DR test script now refuses to start VM 109 if:
  * cluster is not quorate (e.g. mid-failover into a degraded state),
  * available memory on the host is below VM 109 config + 1 GB margin.

Both checks scale automatically — memory threshold is computed from
qm config so resizing VM 109 does not require touching the script.

Adds vm109-watchdog.sh, scheduled cluster-wide every minute. The
watchdog is the second line of defence behind the cleanup trap from
8a0c557: it force-stops VM 109 if the trap was bypassed (script
killed, host crash mid-test, manual run forgotten). It honours
/var/run/vm109-debug.flag for legitimate manual sessions and is
node-aware via /etc/pve/qemu-server/109.conf so it can be deployed
on every node without coordinating with VM 109's current location.

Both safeguards target the 04-18 → 04-20 chain: VM 109 left running
2.5 days then sandwiched against an HA failover that pushed CT 108
Oracle (8 GB) onto pveelite (16 GB) → OOM cascade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:48:12 +00:00
2025-09-30 02:06:34 +03:00
Description
No description provided
1.9 MiB
Languages
PLSQL 49.3%
PowerShell 21.8%
Shell 21.4%
Batchfile 3.7%
Python 3.5%
Other 0.3%