feat(maintenance): guard DB + log growth (Option B + daily prune + rotation)

Root cause of the 2GB prod import.db: the sync_run_orders audit junction
recorded every order on every run; under the 1-minute scheduler ~98% of
21.7M rows were no-op ALREADY_IMPORTED re-observations. NSSM stdout/stderr
also grew unbounded (rotation never applied to the live service).

Changes:
- sqlite_service: skip ALREADY_IMPORTED rows in sync_run_orders (write-side
  guard, _SKIP_JUNCTION_STATUSES); add prune_sync_history(retention_days)
  with incremental_vacuum.
- maintenance_service (new): cleanup_old_logs + run_daily_maintenance.
- scheduler_service: start_maintenance_job (daily CronTrigger).
- main.py: RotatingFileHandler (sync_comenzi_current.log, 10MB x5) instead
  of a new timestamped file per start; schedule daily maintenance + one-shot
  catch-up at startup.
- scripts/db_maintenance.py (new): one-shot prune + VACUUM + log cleanup,
  plain sqlite3, invoked by deploy.ps1 while the service is stopped.
- deploy.ps1: stop -> run db_maintenance.py -> (re)apply NSSM AppRotate*
  idempotently -> start, so rotation reaches pre-existing services.
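
The write-side guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in sqlite_service: only the `sync_run_orders` table name, the `_SKIP_JUNCTION_STATUSES` name, and the `ALREADY_IMPORTED` status come from this message. The column layout, the `record_run_orders` helper, and the other status values are assumptions made for the example.

```python
import sqlite3

# Statuses that carry no new information and are not written to the junction.
# The set name and ALREADY_IMPORTED come from the commit message; everything
# else in this sketch (schema, helper names, other statuses) is hypothetical.
_SKIP_JUNCTION_STATUSES = frozenset({"ALREADY_IMPORTED"})


def record_run_orders(conn, run_id, orders):
    """Insert (order_number, status) pairs for one run, skipping no-op statuses.

    Returns the number of rows actually written.
    """
    rows = [
        (run_id, order_number, status)
        for order_number, status in orders
        if status not in _SKIP_JUNCTION_STATUSES
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO sync_run_orders (run_id, order_number, status) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def demo_guard():
    """Run the guard against an in-memory database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sync_run_orders (run_id TEXT, order_number TEXT, status TEXT)"
    )
    written = record_run_orders(
        conn,
        "run-1",
        [("A1", "IMPORTED"), ("A2", "ALREADY_IMPORTED"), ("A3", "ERROR")],
    )
    stored = conn.execute("SELECT COUNT(*) FROM sync_run_orders").fetchone()[0]
    conn.close()
    return written, stored
```

Filtering before the insert keeps the junction proportional to real state changes rather than to the number of scheduler ticks, which is the growth pattern named in the root cause.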

Retention defaults: 7 days history, 7 days logs.
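
A retention prune along these lines might look like the sketch below. It assumes a hypothetical `sync_runs(run_id, started_at)` parent table with ISO-8601 UTC timestamps; only `prune_sync_history(retention_days)`, `sync_run_orders`, and the use of `incremental_vacuum` are taken from this message, and the `now` parameter is added here purely to make the example deterministic.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_sync_history(conn, retention_days=7, now=None):
    """Delete runs older than the retention window, plus their junction rows.

    Assumes a hypothetical sync_runs(run_id, started_at) table whose
    started_at values are ISO-8601 UTC strings. Returns the deleted run count.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:  # one transaction for both deletes
        conn.execute(
            "DELETE FROM sync_run_orders WHERE run_id IN "
            "(SELECT run_id FROM sync_runs WHERE started_at < ?)",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM sync_runs WHERE started_at < ?", (cutoff,)
        ).rowcount
    # Hands freed pages back to the OS only when the database uses
    # auto_vacuum=INCREMENTAL; otherwise it does nothing. fetchall() steps
    # the statement to completion.
    conn.execute("PRAGMA incremental_vacuum").fetchall()
    return deleted


def demo_prune():
    """Prune an in-memory database holding one old and one recent run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
    conn.execute("CREATE TABLE sync_runs (run_id TEXT PRIMARY KEY, started_at TEXT)")
    conn.execute("CREATE TABLE sync_run_orders (run_id TEXT, order_number TEXT)")
    now = datetime(2026, 6, 26, tzinfo=timezone.utc)
    old = (now - timedelta(days=10)).isoformat()
    recent = (now - timedelta(days=1)).isoformat()
    conn.executemany(
        "INSERT INTO sync_runs VALUES (?, ?)", [("old", old), ("new", recent)]
    )
    conn.executemany(
        "INSERT INTO sync_run_orders VALUES (?, ?)", [("old", "A1"), ("new", "A2")]
    )
    conn.commit()
    deleted = prune_sync_history(conn, retention_days=7, now=now)
    remaining = [r[0] for r in conn.execute("SELECT run_id FROM sync_run_orders")]
    conn.close()
    return deleted, remaining
```

Note that `incremental_vacuum` only shrinks a file whose `auto_vacuum` mode is already incremental, which is presumably why the one-shot deploy script runs a full VACUUM while the service is stopped.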

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude Agent
2026-06-26 09:59:41 +00:00
parent ccc6a933fa
commit dcc5042586
6 changed files with 331 additions and 29 deletions

@@ -1,6 +1,7 @@
import logging
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.triggers.cron import CronTrigger
logger = logging.getLogger(__name__)
@@ -42,6 +43,31 @@ def start_scheduler(interval_minutes: int = 10):
    logger.info(f"Scheduler started with interval {interval_minutes}min")


def start_maintenance_job(hour: int = 3):
    """Schedule the daily DB/log maintenance job (prune history + cleanup logs).

    Runs independently of the sync job; starts the scheduler if it isn't already
    running so maintenance happens even when auto-sync is disabled.
    """
    if _scheduler is None:
        init_scheduler()

    from . import maintenance_service

    _scheduler.add_job(
        maintenance_service.run_daily_maintenance,
        trigger=CronTrigger(hour=hour, minute=0),
        id="maintenance_job",
        name="Daily DB/Log Maintenance",
        replace_existing=True
    )

    if not _scheduler.running:
        _scheduler.start()
    logger.info(f"Maintenance job scheduled daily at {hour:02d}:00")


def stop_scheduler():
    """Stop the scheduler."""
    global _is_running