feat(maintenance): guard DB + log growth (Option B + daily prune + rotation)

Root cause of the 2GB prod import.db: the sync_run_orders audit junction
table recorded every order on every run; under the 1-minute scheduler,
~98% of its 21.7M rows were no-op ALREADY_IMPORTED re-observations. The
NSSM stdout/stderr logs also grew unbounded because rotation was never
applied to the live service.

Changes:
- sqlite_service: skip ALREADY_IMPORTED rows in sync_run_orders (write-side
  guard, _SKIP_JUNCTION_STATUSES); add prune_sync_history(retention_days)
  with incremental_vacuum.
- maintenance_service (new): cleanup_old_logs + run_daily_maintenance.
- scheduler_service: start_maintenance_job (daily CronTrigger).
- main.py: RotatingFileHandler (sync_comenzi_current.log, 10MB x5) instead
  of a new timestamped file per start; schedule daily maintenance + one-shot
  catch-up at startup.
- scripts/db_maintenance.py (new): one-shot prune + VACUUM + log cleanup,
  plain sqlite3, invoked by deploy.ps1 while the service is stopped.
- deploy.ps1: stop -> run db_maintenance.py -> (re)apply NSSM AppRotate*
  idempotently -> start, so rotation reaches pre-existing services.
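
A minimal sketch of the write-side guard and the prune described in the sqlite_service bullet above, using plain sqlite3. Only `_SKIP_JUNCTION_STATUSES`, `prune_sync_history(retention_days)`, the `ALREADY_IMPORTED` status and `incremental_vacuum` come from this commit message. The table layout (`sync_runs`, `sync_run_orders` and their columns) and the helpers `init_db` and `record_run_orders` are assumptions made for illustration; the real service may be structured differently.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Statuses that are not written to the junction table (name from the commit).
_SKIP_JUNCTION_STATUSES = frozenset({"ALREADY_IMPORTED"})


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical schema; the real tables and columns may differ."""
    # auto_vacuum has to be set before the first table exists; on an existing
    # database it only takes effect after a full VACUUM.
    conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS sync_runs (
            run_id TEXT PRIMARY KEY,
            started_at TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS sync_run_orders (
            run_id TEXT NOT NULL,
            order_number TEXT NOT NULL,
            status TEXT NOT NULL
        );
        """
    )


def record_run_orders(conn, run_id, orders):
    """Insert (order_number, status) pairs, skipping no-op statuses.

    Returns the number of rows actually written.
    """
    rows = [
        (run_id, number, status)
        for number, status in orders
        if status not in _SKIP_JUNCTION_STATUSES
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO sync_run_orders (run_id, order_number, status) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def prune_sync_history(conn, retention_days=7, now=None):
    """Delete runs older than retention_days; returns the number of runs removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:
        conn.execute(
            "DELETE FROM sync_run_orders WHERE run_id IN "
            "(SELECT run_id FROM sync_runs WHERE started_at < ?)",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM sync_runs WHERE started_at < ?", (cutoff,)
        ).rowcount
    # Runs outside the transaction. fetchall() steps the pragma to completion;
    # it is a no-op unless auto_vacuum is INCREMENTAL.
    conn.execute("PRAGMA incremental_vacuum").fetchall()
    return deleted
```

In this sketch timestamps are stored as ISO 8601 strings in one timezone, so a plain string comparison is enough for the cutoff.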

Retention defaults: 7 days history, 7 days logs.
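
The 7-day log retention could be enforced along the following lines. This is a sketch only: the commit names `cleanup_old_logs` but does not show its signature, so the parameters (`log_dir`, `retention_days`, `pattern`, `now`) and the choice of file modification time as the age criterion are assumptions.

```python
import os
import tempfile
import time
from pathlib import Path


def cleanup_old_logs(log_dir, retention_days=7, pattern="sync_comenzi_*", now=None):
    """Delete files matching pattern whose mtime is older than retention_days.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = []
    for path in Path(log_dir).glob(pattern):
        if path.is_file() and path.stat().st_mtime < cutoff:
            try:
                path.unlink()
                removed.append(path.name)
            except OSError:
                # For example, the file is still held open by the running
                # service on Windows; skip it and retry on the next run.
                pass
    return sorted(removed)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        stale_file = Path(demo_dir) / "sync_comenzi_20260101_000000.log"
        stale_file.write_text("old log")
        stale_time = time.time() - 30 * 86400
        os.utime(stale_file, (stale_time, stale_time))
        print(cleanup_old_logs(demo_dir))
```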

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude Agent
2026-06-26 09:59:41 +00:00
parent ccc6a933fa
commit dcc5042586
6 changed files with 331 additions and 29 deletions


@@ -1,9 +1,10 @@
import asyncio
from contextlib import asynccontextmanager
from datetime import datetime
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from pathlib import Path
import logging
import logging.handlers
import os
from .config import settings
@@ -19,8 +20,12 @@ _stream_handler.setFormatter(_formatter)
 _log_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), 'logs')
 os.makedirs(_log_dir, exist_ok=True)
-_log_filename = f"sync_comenzi_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log"
-_file_handler = logging.FileHandler(os.path.join(_log_dir, _log_filename), encoding='utf-8')
+# Rotating handler (10MB x 5 backups) instead of a new timestamped file per
+# start; caps log growth and stops file proliferation across restarts. Fixed
+# name still matches the QA glob `sync_comenzi_*.log`.
+_file_handler = logging.handlers.RotatingFileHandler(
+    os.path.join(_log_dir, "sync_comenzi_current.log"),
+    maxBytes=10 * 1024 * 1024, backupCount=5, encoding='utf-8')
 _file_handler.setFormatter(_formatter)
 _root_logger = logging.getLogger()
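
A small, self-contained sketch of how a `RotatingFileHandler` configured like the one in this hunk behaves. It uses a tiny `maxBytes` so rollover happens quickly (the commit uses 10 MB); the `demo_rotation` helper and the `rotation_demo` logger name are hypothetical. One thing the demo makes visible: rotated backups get a numeric suffix (`sync_comenzi_current.log.1` and so on), so a `sync_comenzi_*.log` glob matches only the active file, not the backups.

```python
import glob
import logging
import logging.handlers
import os
import tempfile


def demo_rotation(log_dir: str) -> list[str]:
    """Write enough records to force several rollovers; return the file names."""
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "sync_comenzi_current.log"),
        maxBytes=1024,  # demo value only; the commit uses 10 * 1024 * 1024
        backupCount=5,
        encoding="utf-8",
    )
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    logger.addHandler(handler)
    try:
        for i in range(500):
            logger.info("order %04d processed", i)
    finally:
        logger.removeHandler(handler)
        handler.close()
    return sorted(os.listdir(log_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in demo_rotation(demo_dir):
            print(name)
        # Only the active file matches the narrower glob.
        print(glob.glob(os.path.join(demo_dir, "sync_comenzi_*.log")))
```

With `backupCount=5` the handler keeps at most six files in total: the active log plus five backups, with the oldest backup discarded on each rollover.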
@@ -54,6 +59,15 @@ async def lifespan(app: FastAPI):
     except Exception:
         pass
+    # Daily DB/log maintenance (prune audit history + cleanup old logs) + a
+    # one-shot catch-up so a long-down service reclaims immediately on start.
+    try:
+        from .services import maintenance_service
+        scheduler_service.start_maintenance_job()
+        asyncio.create_task(maintenance_service.run_daily_maintenance())
+    except Exception as e:
+        logger.warning(f"Maintenance scheduling failed: {e}")
     logger.info("GoMag Import Manager started")
     yield