Proxmox Infrastructure Documentation - ROMFASTSQL
Directory Structure
proxmox/
├── README.md  # This file (main index)
│
├── cluster/  # Proxmox cluster infrastructure
│   ├── README.md  # SSH guide, HA, corosync tuning, diagnostic tools, mail relay, OOM alert
│   ├── cluster-ha-monitor.sh  # HA monitoring script
│   ├── incidents/  # Cluster incident post-mortems
│   │   └── 2026-04-20-cluster-outage.md  # OOM cascade on pveelite + USB LAN watchdog
│   └── ups/  # UPS system for the cluster
│       ├── README.md
│       ├── docs/
│       ├── scripts/
│       └── config/
│
├── lxc103-dokploy/  # LXC 103 - Dokploy + Traefik (Deployment Platform)
│   ├── README.md  # Configuration, architecture, deploy workflow
│   └── docs/
│       └── pdf-qr-app.md  # Deploying pdf-qr-app on Dokploy
│
├── lxc104-flowise/  # LXC 104 - Flowise AI (Chatbot Maria)
│   ├── README.md  # Chatbot infrastructure, ngrok, troubleshooting
│   └── docs/
│       ├── prd.md  # Product Requirements Document
│       ├── v1-arhitectura.md  # v1 architecture (Flowise + Groq)
│       └── v2-arhitectura.md  # v2 architecture (Claude Agent SDK)
│
├── lxc108-oracle/  # LXC 108 - Oracle Database XE 21c
│   ├── README.md  # Complete Oracle documentation
│   ├── scripts/
│   │   ├── export-roa2.sh
│   │   └── export-roa2.ps1
│   ├── migration/  # Oracle 10g → 21c migration scripts
│   │   ├── README.md
│   │   ├── 00-MASTER-MIGRATION.sh
│   │   └── ...
│   └── sql/
│       ├── roa/  # Oracle 10g compatibility SQL scripts
│       └── roa-romconstruct/  # PACK_CONTAFIN package
│
├── vm109-windows-dr/  # VM 109 - Windows Standby (Disaster Recovery)
│   ├── README.md  # DR configuration, RMAN backup
│   ├── docs/
│   │   ├── PLAN_TESTARE_MONITORIZARE.md
│   │   ├── PROXMOX_NOTIFICATIONS_README.md
│   │   └── archive/  # Previous plans and status reports
│   └── scripts/
│       ├── rman_backup*.bat  # Windows RMAN scripts
│       ├── transfer_backups.ps1  # Backup transfer
│       └── *-proxmox.sh  # Monitoring from Proxmox
│
├── vm201-windows/  # VM 201 - Windows 11 (roacentral)
│   ├── README.md  # General VM information
│   ├── docs/
│   │   ├── vm201-certificat-letsencrypt-iis.md
│   │   ├── vm201-dokploy-infrastructure.md  # Dokploy architecture + domains
│   │   ├── vm201-troubleshooting-backup-nfs.md
│   │   └── vm201-troubleshooting-pana-curent-2026-01-11.md
│   ├── iis-configs/  # web.config files for IIS sites
│   │   ├── roa-qr.web.config  # Proxy roa-qr.romfast.ro → LXC 103
│   │   └── roa-apps-wildcard.web.config  # Proxy *.roa.romfast.ro → LXC 103
│   └── scripts/
│       ├── check-ssl-certificates.ps1
│       ├── monitor-ssl-certificates.sh
│       └── setup-new-iis-sites.ps1  # Setup of new IIS sites (Dokploy)
│
├── lxc110-moltbot/  # LXC 110 - MoltBot (AI Telegram Bot)
│   ├── README.md  # Configuration, security, commands
│   └── docs/
│
└── lxc171-claude-agent/  # LXC 171 - Claude Agent (Development)
    ├── README.md  # Configuration, connecting, workflow
    └── scripts/
        ├── start-agent.sh  # Start tmux session
        ├── work.sh  # Interactive workflow menu
        ├── new-task.sh  # Create new branch
        └── finish-task.sh  # Finish task (commit+push)
Documentation per Component
Proxmox Cluster
Directory: cluster/
| File | Description |
|---|---|
| README.md | Complete guide: SSH, nodes, storage, Proxmox commands, IP map |
| cluster-ha-monitor.sh | High Availability monitoring script |
| ups/ | UPS system: NUT configuration, orchestrated shutdown, battery test |
Quick Start:
# Proxmox access
ssh root@10.0.20.201
# Cluster status
ssh root@10.0.20.201 "pvecm status"
# UPS status
ssh root@10.0.20.201 "upsc nutdev1"
LXC 106 - Gitea (Git Server)
Directory: lxc106-gitea/
IP: 10.0.20.165 | Host: pvemini
| File | Description |
|---|---|
| README.md | Configuration, common operations, editing app.ini, troubleshooting |
Quick Start:
# Status
ssh root@10.0.20.201 "pct exec 106 -- docker ps"
# Restart after a config change
ssh root@10.0.20.201 "pct exec 106 -- docker restart gitea"
# Logs
ssh root@10.0.20.201 "pct exec 106 -- docker logs gitea --tail 50"
LXC 103 - Dokploy + Traefik (Deployment Platform)
Directory: lxc103-dokploy/
IP: 10.0.20.167 | Host: pvemini
| File | Description |
|---|---|
| README.md | Configuration, architecture, app deploy workflow |
| docs/pdf-qr-app.md | Deploying pdf-qr-app in Dokploy |
Role: Control plane for deploying ROMFAST's public applications.
Traefik on LXC 103 routes all *.roa.romfast.ro subdomains.
Quick Start:
# Status of the Dokploy + Traefik containers
ssh root@10.0.20.201 "pct exec 103 -- docker ps"
# Traefik logs
ssh root@10.0.20.201 "pct exec 103 -- docker logs traefik -f"
URL: https://dokploy.romfast.ro
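To check whether a deployed app answers through this routing, a small helper can build the public URL and probe it. This is only a minimal sketch: the function names `app_url` and `http_status` and the app name `demo` are hypothetical, and it assumes apps are published as `<name>.roa.romfast.ro`, as the wildcard routing above suggests.

```shell
#!/bin/sh
# Assumption: apps deployed through Dokploy are published under this wildcard domain.
APP_DOMAIN="roa.romfast.ro"

# app_url NAME - print the public HTTPS URL for an app name (hypothetical helper).
app_url() {
  printf 'https://%s.%s\n' "$1" "$APP_DOMAIN"
}

# http_status URL - print only the HTTP status code returned for URL.
# curl prints 000 when no response was received.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Example (needs network access; "demo" is a placeholder app name):
#   http_status "$(app_url demo)"
```

A status in the 2xx or 3xx range suggests the request reached an application; a 404 from Traefik usually means no router matches that hostname.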
LXC 104 - Flowise AI (Chatbot Maria)
Directory: lxc104-flowise/
IP: 10.0.20.161 | Host: pvemini
| File | Description |
|---|---|
| README.md | Flowise configuration, ngrok, CORS troubleshooting |
| docs/prd.md | Chatbot Product Requirements Document |
| docs/v1-arhitectura.md | Flowise + Groq architecture |
| docs/v2-arhitectura.md | Claude Agent SDK architecture (planned) |
Quick Start:
# Service status
ssh root@10.0.20.201 "pct exec 104 -- systemctl status flowise"
ssh root@10.0.20.201 "pct exec 104 -- systemctl status ngrok"
# Restart Flowise
ssh root@10.0.20.201 "pct exec 104 -- systemctl restart flowise"
# Test the chatbot
curl -s "https://mutual-special-koala.ngrok-free.app/api/v1/prediction/d4911620-07fe-41f8-adb4-f2f52d6ec766" \
  -X POST -H "Content-Type: application/json" -d '{"question":"test"}'
Public URL: https://mutual-special-koala.ngrok-free.app
Web page: https://www.romfast.ro/chatbot_maria.html
LXC 108 - Oracle Database
Directory: lxc108-oracle/
IP: 10.0.20.121 | Host: pvemini
| File | Description |
|---|---|
| README.md | PDBs, users, passwords, connection strings, DMP export/import |
| scripts/export-roa2.sh | Export script for the roa2 PDB |
| scripts/export-roa2.ps1 | Export script for Windows |
Quick Start:
# Enter the container (-t allocates the terminal the interactive shell needs)
ssh -t root@10.0.20.201 "pct enter 108"
# Restart Oracle
ssh root@10.0.20.201 "pct exec 108 -- docker restart oracle-xe"
# SQL*Plus connection
sqlplus sys/romfastsoft@10.0.20.121:1521/roa as sysdba
LXC 110 - MoltBot (AI Chatbot)
Directory: lxc110-moltbot/
IP: 10.0.20.173 (internal) | 100.120.119.70 (Tailscale) | Host: pveelite
Channels: Telegram + WhatsApp | Model: Claude Opus 4.5
| File | Description |
|---|---|
| README.md | Complete configuration, security, MoltBot commands |
Quick Start:
# Terminal UI (direct)
ssh -t moltbot@10.0.20.173 "clawdbot tui"
# Status
ssh moltbot@10.0.20.173 "clawdbot status"
# Web Dashboard (via SSH tunnel)
ssh -L 18789:127.0.0.1:18789 -N moltbot@10.0.20.173 &
# then open http://localhost:18789
# Restart gateway
ssh moltbot@10.0.20.173 "clawdbot gateway restart"
Components: Ubuntu 24.04, MoltBot v2026.1.24-3, Node.js v22, Bun, Tailscale SSH
LXC 171 - Claude Agent (Development Environment)
Directory: lxc171-claude-agent/
IP: 10.0.20.171 (internal) | 100.95.55.51 (Tailscale) | Host: pveelite
| File | Description |
|---|---|
| README.md | Complete configuration, SSH connection, workflow |
| scripts/start-agent.sh | Start/attach the tmux session |
| scripts/work.sh | Interactive menu for the Git workflow |
| scripts/new-task.sh | Create a new branch for a task |
| scripts/finish-task.sh | Finish a task (commit + push) |
Quick Start:
# Connect (internal network)
ssh claude@10.0.20.171
# Connect (Tailscale - from a phone or from outside)
ssh claude@100.95.55.51
# Start the workflow
~/start-agent.sh # starts tmux
~/work.sh # interactive menu
Components: Ubuntu 24.04, Node.js v20, Claude Code, tmux, Git, Tailscale
VM 109 - Windows Standby (Disaster Recovery)
Directory: vm109-windows-dr/
Role: Backup of the Oracle database from an external Windows server (RMAN)
| File | Description |
|---|---|
| README.md | DR configuration, RMAN backup, transfer scripts |
| docs/PLAN_TESTARE_MONITORIZARE.md | DR testing and monitoring plan |
| docs/PROXMOX_NOTIFICATIONS_README.md | Proxmox notifications configuration |
| docs/archive/ | Previous implementation plans and status reports |
| scripts/rman_backup*.bat | RMAN scripts for the Windows backup |
| scripts/transfer_backups.ps1 | Backup transfer to storage |
| scripts/*-proxmox.sh | Monitoring scripts run from Proxmox |
Quick Start:
# Oracle DR backup monitoring
/mnt/e/proiecte/ROMFASTSQL/proxmox/vm109-windows-dr/scripts/oracle-backup-monitor-proxmox.sh
# Weekly DR test
/mnt/e/proiecte/ROMFASTSQL/proxmox/vm109-windows-dr/scripts/weekly-dr-test-proxmox.sh
VM 201 - Windows 11
Directory: vm201-windows/
IP: DHCP | Host: pvemini | Role: IIS reverse proxy, application client
| File | Description |
|---|---|
| README.md | Hardware configuration, services, network, backup |
| docs/vm201-certificat-letsencrypt-iis.md | Let's Encrypt SSL certificates, Win-ACME, SNI |
| docs/vm201-troubleshooting-backup-nfs.md | NFS backup incident (2025-10-08) |
| docs/vm201-troubleshooting-pana-curent-2026-01-11.md | Power outage incident |
| scripts/check-ssl-certificates.ps1 | Certificate check/renewal (Windows) |
| scripts/monitor-ssl-certificates.sh | Certificate monitoring (Proxmox) |
Quick Start:
# Renew SSL certificates (from Proxmox)
ssh root@10.0.20.201 "qm guest exec 201 -- powershell -Command 'cd C:\\Tools\\win-acme; .\\wacs.exe --renew --force'"
ssh root@10.0.20.201 "qm guest exec 201 -- cmd /c iisreset"
# Verify certificates
echo | openssl s_client -connect roa.romfast.ro:443 -servername roa.romfast.ro 2>/dev/null | openssl x509 -noout -dates
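The certificate check above prints raw dates; turning the expiry date into a number of remaining days makes it easier to act on. The sketch below is an illustration only: the helper names `cert_not_after` and `days_left` are hypothetical, it assumes GNU `date` (for the `-d` option), and it is not the logic of `monitor-ssl-certificates.sh`.

```shell
#!/bin/sh
# cert_not_after HOST - print the notAfter date of the certificate served on HOST:443.
cert_not_after() {
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2
}

# days_left END_DATE [NOW_DATE] - whole days from NOW_DATE (default: now) to END_DATE.
# Dates are interpreted in UTC; requires GNU date for the -d option.
days_left() (
  end_s=$(date -u -d "$1" +%s) || exit 1
  now_s=$(date -u -d "${2:-now}" +%s) || exit 1
  echo $(( (end_s - now_s) / 86400 ))
)

# Example (needs network access):
#   days_left "$(cert_not_after roa.romfast.ro)"
```

A negative result means the certificate has already expired; a threshold check could be wrapped around the result to warn before the renewal window.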
Quick Resource Map
Proxmox Cluster Nodes
| Node | IP | Web GUI |
|---|---|---|
| pve1 | 10.0.20.200 | https://10.0.20.200:8006 |
| pvemini | 10.0.20.201 | https://10.0.20.201:8006 |
| pveelite | 10.0.20.202 | https://10.0.20.202:8006 |
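The node table can be encoded in a small lookup so scripts do not hard-code IPs in several places. This is a hedged sketch: the function names `node_ip`, `gui_url` and `check_nodes` are hypothetical, and `check_nodes` assumes root SSH key access to every node, which this index only shows for pvemini.

```shell
#!/bin/sh
# node_ip NAME - print the IP of a cluster node, taken from the table above.
node_ip() {
  case "$1" in
    pve1)     echo 10.0.20.200 ;;
    pvemini)  echo 10.0.20.201 ;;
    pveelite) echo 10.0.20.202 ;;
    *) echo "unknown node: $1" >&2; return 1 ;;
  esac
}

# gui_url NAME - print the Proxmox web GUI URL for a node.
gui_url() (
  ip=$(node_ip "$1") || exit 1
  echo "https://$ip:8006"
)

# check_nodes - try a no-op SSH command on each node and report reachability.
# Assumption: key-based root SSH works on all three nodes.
check_nodes() {
  for n in pve1 pvemini pveelite; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$(node_ip "$n")" true 2>/dev/null; then
      echo "$n: reachable"
    else
      echo "$n: unreachable"
    fi
  done
}
```

Calling `check_nodes` before running `pvecm status` gives a quick view of which nodes respond at all.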
LXC Containers
| VMID | Name | IP | Service | Documentation |
|---|---|---|---|---|
| 100 | portainer | 10.0.20.170 | Docker Management (Remote Node) | cluster/README.md |
| 103 | dokploy | 10.0.20.167 | Dokploy + Traefik (App Deployment) | lxc103-dokploy/ |
| 104 | flowise | 10.0.20.161 | Flowise AI (Chatbot Maria) | lxc104-flowise/ |
| 106 | gitea | 10.0.20.165 | Git Server | lxc106-gitea/ |
| 108 | central-oracle | 10.0.20.121 | Oracle XE 21c | lxc108-oracle/ |
| 110 | moltbot | 10.0.20.173 | MoltBot AI (Telegram+WhatsApp) | lxc110-moltbot/ |
| 171 | claude-agent | 10.0.20.171 | Claude Code Dev Environment | lxc171-claude-agent/ |
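The Quick Start commands for the containers hosted on pvemini (103, 104, 106 and 108 in the sections above) all follow the same pattern: SSH to the node, then `pct exec <VMID> -- <command>`. A wrapper along these lines could reduce the repetition. It is a sketch under assumptions: the name `lxc_exec` and the `DRY_RUN` variable are hypothetical, and it only applies to containers running on the node it connects to (LXC 110 and 171 live on pveelite and are reached by direct SSH instead).

```shell
#!/bin/sh
# Node used as entry point in the Quick Start sections (pvemini).
PVE_NODE="root@10.0.20.201"

# lxc_exec VMID COMMAND... - run COMMAND inside container VMID via pct exec.
# With DRY_RUN set (hypothetical switch), only print the command that would run.
lxc_exec() (
  vmid=$1
  shift
  if [ -n "${DRY_RUN:-}" ]; then
    echo "ssh $PVE_NODE pct exec $vmid -- $*"
  else
    ssh "$PVE_NODE" "pct exec $vmid -- $*"
  fi
)

# Examples:
#   lxc_exec 106 docker ps
#   lxc_exec 104 systemctl status flowise
```

Arguments are joined with spaces before being sent over SSH, so commands that need their own quoting should still be written out in full as in the Quick Start sections.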
Virtual Machines
| VMID | Name | OS | Documentation |
|---|---|---|---|
| 109 | standby-dr | Windows Server | vm109-windows-dr/ |
| 201 | roacentral | Windows 11 | vm201-windows/ |
| 300 | Win11-Template | Windows 11 | cluster/README.md |
Quick Navigation - I need to...
Infrastructure
- See all IPs and services → cluster/README.md
- Configure SSH → cluster/README.md → "Configurare Inițială SSH"
- Monitor the HA cluster → cluster/cluster-ha-monitor.sh
- Manage the UPS → cluster/ups/README.md
Flowise AI / Chatbot Maria (LXC 104)
- Configure the chatbot → lxc104-flowise/README.md
- Troubleshoot CORS/ngrok → lxc104-flowise/README.md → "Troubleshooting"
- Chatbot PRD → lxc104-flowise/docs/prd.md
- Future architecture → lxc104-flowise/docs/v2-arhitectura.md
Oracle Database (LXC 108)
- Connect to Oracle → lxc108-oracle/README.md → "Conexiuni Oracle"
- Export/import DMP → lxc108-oracle/README.md → "Export și Import Data Pump"
- Restart Oracle → lxc108-oracle/README.md → "Restart Oracle"
- 10g→21c migration scripts → lxc108-oracle/migration/README.md
- Oracle 10g SQL scripts → lxc108-oracle/sql/roa/
Windows VM 109 - Disaster Recovery
- Configure RMAN backup → vm109-windows-dr/README.md
- Monitor backups → vm109-windows-dr/scripts/oracle-backup-monitor-proxmox.sh
- Weekly DR test → vm109-windows-dr/scripts/weekly-dr-test-proxmox.sh
- DR test plan → vm109-windows-dr/docs/PLAN_TESTARE_MONITORIZARE.md
Windows VM 201
- Renew SSL certificates → vm201-windows/docs/vm201-certificat-letsencrypt-iis.md
- Resolve "VM locked" problems → vm201-windows/docs/vm201-troubleshooting-backup-nfs.md
- General information → vm201-windows/README.md
- Configure new IIS sites (Dokploy) → vm201-windows/docs/vm201-dokploy-infrastructure.md
- Automated IIS setup script → vm201-windows/scripts/setup-new-iis-sites.ps1
Dokploy + Traefik (LXC 103)
- Deploy a new application → lxc103-dokploy/README.md → "Workflow: Adăugare App Nouă"
- Set up the LXC 100 server → lxc103-dokploy/README.md → "Pasul 2"
- Deploy pdf-qr-app → lxc103-dokploy/docs/pdf-qr-app.md
- Domain architecture → vm201-windows/docs/vm201-dokploy-infrastructure.md
MoltBot AI (LXC 110)
- Configuration and commands → lxc110-moltbot/README.md
- Terminal UI → ssh -t moltbot@10.0.20.173 "clawdbot tui"
- Troubleshooting → lxc110-moltbot/README.md → "Troubleshooting"
- Channels: Telegram + WhatsApp | Model: Claude Opus 4.5
Claude Agent (LXC 171)
- Configuration and connecting → lxc171-claude-agent/README.md
- Development workflow → lxc171-claude-agent/README.md → "Workflow Complet"
- Workflow scripts → lxc171-claude-agent/scripts/
- Troubleshooting → lxc171-claude-agent/README.md → "Troubleshooting"
Web Services
| Service | URL |
|---|---|
| Proxmox pvemini | https://10.0.20.201:8006 |
| Oracle EM Express | http://10.0.20.121:5500/em |
| Portainer (Oracle) | http://10.0.20.121:9443 |
| Portainer (main) | http://10.0.20.170:9443 |
| Gitea | http://10.0.20.165:3000 |
| Dokploy (internal) | http://10.0.20.167:3000 |
| Dokploy (public) | https://dokploy.romfast.ro |
| pdf-qr-app | https://roa-qr.romfast.ro |
| Apps wildcard | https://*.roa.romfast.ro |
| Flowise AI (local) | http://10.0.20.161:3000 |
| Flowise AI (public) | https://mutual-special-koala.ngrok-free.app |
| Chatbot Maria | https://www.romfast.ro/chatbot_maria.html |
Configured Automated Tasks
| Task | Location | Frequency | Purpose |
|---|---|---|---|
| SSL Certificate Check | VM 201 Task Scheduler | Daily 07:00 | Checks/renews certificates |
| SSL Monitor | Proxmox cron | Daily 08:00 | External certificate monitoring |
| Win-ACME Renew | VM 201 Task Scheduler | Daily 09:00 | Automatic Let's Encrypt renewal |
| UPS Monthly Test | Proxmox cron | Monthly | UPS battery test |
| Backup Job | Proxmox | Daily 02:00 | Backup of all LXCs/VMs |
Last updated: 2026-03-02 | Author: Marius Mutu | Project: ROMFASTSQL - Infrastructure Documentation