# Plan: ATM - Automated Trading Monitor (M2D, Faza 1) - ENG-REVIEWED

**Source plan:** `/home/claude/.claude/plans/swirling-drifting-starfish.md`
**CEO plan artifact:** `~/.gstack/projects/romfast-workspace/ceo-plans/2026-04-15-atm-trading.md`
**Eng review mode:** FULL_REVIEW (4 decisions made, 0 unresolved)
**Design doc:** `~/.gstack/projects/romfast-workspace/claude-master-design-20260415-atm-trading.md` (APPROVED)
**Eng test plan:** `~/.gstack/projects/romfast-workspace/claude-master-eng-review-test-plan-20260415-212932.md`

---

## Context

User trades the M2D strategy manually on DIA (TradeStation), with execution on the TradeLocker US30 CFD (prop firm). Same strategy on GLD → XAUUSD. 4h/evening dual-screen monitoring.

Faza 1 goal: bot auto-detects the M2D trigger and sends a Discord/Telegram notification with screenshot + SL/TP1/TP2 levels; user executes manually in TradeLocker. Faza 2 (auto-execution) is deferred until the prop firm TOS is verified and Faza 1 is proven over 20+ sessions.

**Review changed two things from the original plan:**

1. **State machine spec corrected.** Original "last 3 consecutive non-gray dots" is wrong. Actual M2D is phased: Phase 1 arming (turquoise → gray/dark-green) → Phase 2 trigger (light-green).
2. **Levels extraction corrected.** Original plan had levels.py extracting SL/TP at trigger. But those lines only appear on the TradeStation chart *after* the user enters the trade in TradeLocker. Corrected to two-phase: alert at trigger, chart-scan after entry. (The spec-math SL/TP first proposed for the trigger alert was later dropped; see ENG REVIEW decision 2.)

Plus 5 accepted expansions (labeled corpus, level fallback, layout canary, trade journal, TOS checklist).

---

## Approach: B (Structured Python service, dry-run, audit log) + CEO-reviewed additions

Runs on the Windows machine alongside TradeStation. `mss` screenshots → ROI color-sample on M2D MAPS strip → phased state machine → Discord webhook + Telegram bot → JSONL audit + trade journal → dry-run replay against labeled corpus.
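
As a rough illustration of how the stages in the pipeline above might be wired into one detection cycle, here is a minimal sketch. The `Pipeline` class, its field names, and the callable signatures are all hypothetical (none are fixed by this plan); each stage is injected so the same loop could serve both live runs and dry-run replay.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """One detection cycle: capture -> classify -> state machine -> notify -> audit.

    All names here are illustrative assumptions, not the plan's final API.
    """
    capture: Callable[[], object]        # e.g. an mss grab of the chart window
    classify: Callable[[object], str]    # ROI color-sample -> color name
    step: Callable[[str, str], tuple[str, Optional[str]]]  # (state, color) -> (new_state, fired)
    notify: Callable[[str, object], None]  # fan out to Discord + Telegram
    audit: Callable[[dict], None]        # append one JSONL record
    state: str = "IDLE"

    def run_cycle(self) -> Optional[str]:
        frame = self.capture()
        color = self.classify(frame)
        new_state, fired = self.step(self.state, color)
        if fired is not None:
            # Only a FIRE produces a notification; every cycle is audited.
            self.notify(fired, frame)
        self.audit({
            "rightmost_dot_color": color,
            "state": new_state,
            "transition": f"{self.state}->{new_state}",
            "trigger": fired,
            "notified": fired is not None,
        })
        self.state = new_state
        return fired
```

Because every stage is a plain callable, a replay harness could pass a function that reads the next file from `samples/` as `capture` and a list's `append` as `audit`, leaving the loop itself unchanged.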

---

## State Machine Spec (corrected + exhaustive)

States:

- `IDLE`
- `ARMED_BUY`: turquoise seen
- `PRIMED_BUY`: turquoise + at least one dark-green seen
- `ARMED_SELL`: yellow seen
- `PRIMED_SELL`: yellow + at least one dark-red seen

**Default rule:** any (state, event) pair not listed below → stay in current state, no action, log as `noise`.

Transitions, BUY side (the IDLE rows also cover SELL arming):

| From | Event | To | Action |
|------|-------|-----|--------|
| IDLE | turquoise | ARMED_BUY | log arm_ts |
| IDLE | yellow | ARMED_SELL | log arm_ts (sell) |
| IDLE | dark-green / dark-red / light-green / light-red / gray | IDLE | noise (log phase-skip if light-green/light-red) |
| ARMED_BUY | gray | ARMED_BUY | persist |
| ARMED_BUY | turquoise | ARMED_BUY | refresh arm_ts |
| ARMED_BUY | dark-green | PRIMED_BUY | log prime_ts |
| ARMED_BUY | yellow | ARMED_SELL | opposite rearm |
| ARMED_BUY | dark-red | ARMED_BUY | ignore (minority noise) |
| ARMED_BUY | light-green | IDLE | **skip detected**: no FIRE, log phase_skip |
| ARMED_BUY | light-red | IDLE | skip detected, log |
| PRIMED_BUY | dark-green | PRIMED_BUY | accumulate |
| PRIMED_BUY | dark-red | PRIMED_BUY | ignore (minority noise) |
| PRIMED_BUY | **light-green** | IDLE | **FIRE BUY**, lockout(BUY)=4min |
| PRIMED_BUY | light-red | IDLE | skip detected (wrong trigger) |
| PRIMED_BUY | gray | IDLE | **COOLED**: signal dead, log |
| PRIMED_BUY | turquoise | ARMED_BUY | rearm fresh |
| PRIMED_BUY | yellow | ARMED_SELL | opposite rearm |

SELL side mirrors exactly: swap turquoise↔yellow, dark-green↔dark-red, light-green↔light-red, BUY↔SELL.

Notes:

- No time-based TTL on ARMED/PRIMED. State persists until the trigger fires, the signal is cooled by gray after PRIMED, an opposite-color rearm occurs, or the process restarts (Windows Task Scheduler stops the bot at session end → natural session-boundary reset).
- Cooling rule: "gray after dark-green" = signal "racit" (the user's term, Romanian for "cooled"). Gray during ARMED_BUY (before any dark-green) is OK.
- After FIRE: 4-minute lockout per-direction.
  BUY lockout doesn't block SELL and vice versa. Single timestamp per direction.
- An opposite-side Phase 1 color (yellow while on the BUY side, turquoise while on the SELL side) rearms to the opposite side, capturing a direction flip.
- Phase-skip (arming color → trigger color with no dark-green/dark-red priming step in between) → IDLE, no FIRE, logged. A skip would be legitimate only if the indicator collapsed phases, which it doesn't per observed behavior.

---

## Detection Details

- **Loop interval:** 5 seconds (36 cycles per 3-min bar; stays well inside the notification-latency target).
- **Rightmost-dot detection:** scan the ROI from the right edge leftward and find the first non-background pixel cluster; that is the rightmost dot. Don't hardcode x-pixel positions (the chart scrolls; hardcoded positions drift).
- **Debounce:** configurable `debounce_depth` in config.toml (default `1`, i.e. single-read acceptance). Increase if future sessions show mid-bar color flicker. Screenshot-in-notification is the user's visual verification on top.
- **Rolling window:** keep the last 20 classified dots with their detection timestamps. The state machine consumes the newest *accepted* (post-debounce) dot per cycle.
- **Classification:** nearest-color match in RGB Euclidean distance, per-color tolerance from calibration. Report confidence = `1 - distance_nearest / distance_second_nearest`. Log confidence every cycle. If all distances > tolerance → `UNKNOWN`, state unchanged.

---

## Levels Extraction (two-phase, simplified)

**Phase A, at trigger (immediate alert to Discord + Telegram):**

- No entry-price compute. No spec-math SL/TP. User places a manual 0.6% SL in TradeLocker at entry; actual TP1/TP2/SL come in Phase B from the chart.
- Notification: `🟢 BUY signal DIA→US30 | 22:47:03` + annotated screenshot (detected dot highlighted).

**Phase B, after user trades (chart-scan confirmation):**

- After Phase A fires, detector keeps watching the chart ROI for horizontal colored lines (red=SL, green=TP1/TP2).
- When lines appear (user has entered trade in TradeLocker and TradeStation drew them) → scan y-pixels via Hough + color mask, convert via y-axis calibration → send second alert to both channels: `✅ Levels: SL=484.35 | TP1=485.20 | TP2=485.88`.
- If chart-line scan times out (no lines in 10 min) → silent (user didn't trade).
- If only 2 lines detected (user didn't set TP2 or line not rendered yet) → partial-result alert.
- Phase B overlap with next signal: guarded by per-direction lockout + Phase-B completion flag; a new FIRE cannot issue until prior Phase B closes (timeout or success).

---

## Dedup / Lockout

- Time-based lockout: after any FIRE, block re-fire for 4 minutes (one 3-min bar + 1 min safety).
- Tracked per-direction: BUY lockout doesn't block SELL.
- Stored as single timestamp per direction (not pixel-keyed).

---

## Observability

- **Heartbeat:** every 30 min to a separate Discord thread (not main alerts channel): `🟢 22:00 alive | 0 triggers | confidence avg 0.85 | chart OK`. Silence >35 min = watchdog concern (user notices).
- **Layout canary:** every 60 cycles (5 min), hash a stable reference region (axis labels, chart border). Baseline is stored in config. On significant divergence (>threshold) → `⚠️ Layout changed — auto-paused, recalibrate` to alerts channel. Bot pauses detection until operator acknowledges (touch a pause-file or restart).
- **Low-confidence alert:** 3+ consecutive cycles with confidence below threshold → `⚠️ Bot lost sight` (already in original plan).
- **Window-lost alert:** TradeStation window not found for 60s → `⚠️ Cannot find chart`.
- **Audit JSONL:** per-cycle, daily rotation (`logs/YYYY-MM-DD.jsonl`), fields: `{ts, window_found, roi_ok, rightmost_dot_color, confidence, state, transition, trigger, notified, reason}`.

---

## Files to Create

- `/workspace/atm/pyproject.toml`: Python 3.11+ required.
  Deps: `mss`, `opencv-python`, `numpy`, `requests`, `pygetwindow`, `pywin32` (DPI + window capture), `rich` (CLI), `pillow` (screenshot annotation). **No `tomli`; use stdlib `tomllib`.**
- `/workspace/atm/config.toml`: populated by calibration tool (ROI coords, per-color RGB + tolerance, `debounce_depth`, y-axis scale, canary-region baseline hash, Discord webhook URL, Telegram bot token + chat_id)
- `/workspace/atm/src/atm/config.py`: **[ENG-REVIEW]** `@dataclass Config` with `Config.load(path)` that validates on load (RGB tuples, positive tolerances, both notifier credentials present, y-axis 2-point pair). Fail fast at startup.
- `/workspace/atm/src/atm/vision.py`: **[ENG-REVIEW]** shared primitives: ROI crop, perceptual hash, pixel-to-price linear interp, Hough line detection with color mask. Used by detector/canary/levels to avoid drift.
- `/workspace/atm/src/atm/detector.py`: screenshot loop, rightmost-dot scan, color classification, rolling window, debounce
- `/workspace/atm/src/atm/state_machine.py`: explicit phased state machine (spec above), exhaustive transition table
- `/workspace/atm/src/atm/levels.py`: Phase B chart-scan only (Phase A entry-price compute removed after ENG-REVIEW)
- `/workspace/atm/src/atm/canary.py`: layout fingerprint hash + drift check + auto-pause
- `/workspace/atm/src/atm/notifier/__init__.py`: abstract `Notifier` protocol: `send_alert()`, `send_heartbeat()`, `send_levels_confirm()`
- `/workspace/atm/src/atm/notifier/fanout.py`: **[ENG-REVIEW]** `FanoutNotifier` wraps N backends, each with its own worker thread + bounded queue (size 50, drop-oldest on overflow) + retry with exponential backoff + dead-letter file on total failure. Main loop never blocks.
- `/workspace/atm/src/atm/notifier/discord.py`: webhook POST, annotated screenshot upload (multipart)
- `/workspace/atm/src/atm/notifier/telegram.py`: **[ENG-REVIEW]** built in parallel with Discord (no longer deferred); bot API, photo upload
- `/workspace/atm/src/atm/audit.py`: JSONL logger with daily local-midnight rotation, line-buffered write for crash safety
- `/workspace/atm/src/atm/calibrate.py`: Tkinter: window pick → DPI check → ROI corners → per-color sample → y-axis scale → canary region → save versioned config
- `/workspace/atm/src/atm/labeler.py`: **[EXPANSION]** Tkinter label UI → `labels.json`
- `/workspace/atm/src/atm/dryrun.py`: replay with precision/recall/confusion matrix when labels present
- `/workspace/atm/src/atm/journal.py`: **[EXPANSION]** `atm journal` CLI → `trades.jsonl`
- `/workspace/atm/src/atm/report.py`: **[EXPANSION]** weekly aggregation
- `/workspace/atm/src/atm/main.py`: CLI: `atm calibrate`, `atm label `, `atm dryrun `, `atm run [--duration Xh]`, `atm journal`, `atm report [--week YYYY-WW]`
- `/workspace/atm/tests/`: **[ENG-REVIEW]** unit + E2E per test plan at `~/.gstack/projects/romfast-workspace/claude-master-eng-review-test-plan-20260415-212932.md`
- `/workspace/atm/samples/`, `/workspace/atm/logs/`
- `/workspace/atm/configs/`: versioned config archive. **[ENG-REVIEW]** No symlink (Windows admin-required); use `configs/current.txt` marker file storing the active filename. `Config.load()` reads the marker.
- `/workspace/atm/docs/phase2-prop-firm-audit.md`: structured TOS checklist
- `/workspace/atm/README.md`: setup, calibration workflow, per-session operating checklist, DPI/multi-monitor notes

---

## Build Order

1. **`pyproject.toml` + package scaffold**: Python 3.11+, `pip install -e .`, `atm --help` works.
2. **Standalone screenshot-dump script**: `mss` timer dumps to `samples/` every 5s during trading sessions. Build corpus in parallel.
3. **`config.py` + `vision.py`**: Config dataclass with validation; shared vision primitives. Ship with unit tests for config load + pixel-to-price interp.
4. **`calibrate.py`**: versioned config in `configs/YYYY-MM-DD-HHMM.toml`; `configs/current.txt` marker file points at active. DPI check + canary region capture.
5. **`labeler.py`**: once ~30 samples exist, tag them. `labels.json` is ground truth.
6. **`state_machine.py`** + **unit tests** (clean BUY, clean SELL, cooling, opposite-rearm, lockout per-direction, noise, phase-skip, all state×color pairs via parameterized test).
7. **`detector.py`** + **unit tests** (empty/background ROI, rightmost-cluster, rolling window FIFO, debounce depth=1, classification edges including UNKNOWN).
8. **`canary.py`** + **unit tests** (drift threshold, pause-file gating).
9. **`levels.py`** (Phase B only) + **unit tests** (Hough line detection with color mask, 2 vs 3 lines, 10-min timeout, pixel-to-price roundtrip).
10. **`notifier/fanout.py` + `discord.py` + `telegram.py`** + **unit tests** (queue overflow drop-oldest, 429 backoff, dead-letter on total failure, fanout: one backend down still delivers). Both channels built in parallel and fire together from day 1.
11. **`audit.py`** + **unit tests** (daily rotation at local midnight, line-buffered flush crash safety).
12. **`dryrun.py`**: replay on `samples/` against `labels.json`. **Acceptance gate before live: precision = 100%, recall ≥ 95%.**
13. **E2E replay test**: feed `samples/` through detector → state_machine → notifier-mock → in-memory audit; assert labels match FIREs.
14. **`journal.py`**, **`report.py`**, **`main.py`** (unified CLI).
15. **Windows Task Scheduler setup**: 16:30→18:30, 21:00→23:00. `atm run --duration 2h`. Manual DST check twice yearly.
16. **`docs/phase2-prop-firm-audit.md`**: TOS checklist template.

---

## Existing Utilities to Reuse

Greenfield Python project. No internal utilities.
External libs: `mss` (screenshot), `pygetwindow` (window locate), `pywin32` (DPI + window capture), `opencv-python` (line detection in Phase B), `numpy` (color math), `requests` (Discord webhook + Telegram bot API), `rich` (CLI), `pillow` (annotated screenshots). Config parsing uses stdlib `tomllib` (no `tomli` dependency).

---

## Verification

End-to-end, in build order:

1. **State machine unit tests:** `pytest tests/test_state_machine.py`: all scenarios (clean BUY, clean SELL, cooling, rearm, lockout, noise, phase-skip) pass.
2. **Calibration:** `atm calibrate` → step through → `config.toml` populated with plausible RGBs for described colors + y-axis scale sane + canary region picked.
3. **Labeled corpus:** ≥30 screenshots in `samples/`, `atm label ./samples` tags each.
4. **Dry-run with metrics:** `atm dryrun ./samples` → precision + recall + confusion matrix printed. **Acceptance gate:** precision = 100%, recall ≥ 95%. If not met → tune tolerances, re-run.
5. **Live test notification-only (2 sessions):** `atm run`. Verify:
   - Discord + Telegram notifications within 5s of trigger, both channels receive.
   - Phase A message: direction + timestamp + annotated screenshot.
   - Phase B levels-alert fires once TradeStation draws SL/TP lines; correct SL/TP1/TP2 prices.
   - Heartbeat messages every 30 min in thread.
   - Audit JSONL complete, state transitions visible.
   - Kill one notifier (e.g. wrong token) → other still delivers, dead-letter file for failed one.
6. **Canary test:** manually move TradeStation window during session → layout-changed alert within 5 min. Move back → restart bot → resumes.
7. **Scheduler test:** Windows Task Scheduler starts bot at 16:30, stops at 18:30 cleanly, log rotates at midnight.
8. **Journal test:** after real trade, `atm journal` → prompt flow complete → `trades.jsonl` entry present.
9. **Report test:** after 1 week of live use, `atm report --week 2026-16` → precision per color, slippage distribution, P&L summary.
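
To make the scenarios in step 1 concrete, the following is a minimal table-driven sketch of the state machine spec. It is an illustration, not the plan's `state_machine.py`: the `StateMachine` class, the action strings, and the choice to reset to IDLE while suppressing the alert during a lockout are assumptions. Only the BUY rows are written out; the SELL rows are generated by the mirror rule, and unlisted pairs fall through to the default noise rule.

```python
from dataclasses import dataclass, field

LOCKOUT_S = 240.0  # 4-minute per-direction lockout from the Dedup / Lockout section

# BUY-side rows: (state, color) -> (next_state, action). Unlisted pairs are noise.
BUY_ROWS = {
    ("IDLE", "turquoise"): ("ARMED_BUY", "arm"),
    ("IDLE", "light-green"): ("IDLE", "phase_skip"),
    ("ARMED_BUY", "gray"): ("ARMED_BUY", "persist"),
    ("ARMED_BUY", "turquoise"): ("ARMED_BUY", "refresh_arm"),
    ("ARMED_BUY", "dark-green"): ("PRIMED_BUY", "prime"),
    ("ARMED_BUY", "yellow"): ("ARMED_SELL", "opposite_rearm"),
    ("ARMED_BUY", "light-green"): ("IDLE", "phase_skip"),
    ("ARMED_BUY", "light-red"): ("IDLE", "phase_skip"),
    ("PRIMED_BUY", "dark-green"): ("PRIMED_BUY", "accumulate"),
    ("PRIMED_BUY", "light-green"): ("IDLE", "fire"),
    ("PRIMED_BUY", "light-red"): ("IDLE", "phase_skip"),
    ("PRIMED_BUY", "gray"): ("IDLE", "cooled"),
    ("PRIMED_BUY", "turquoise"): ("ARMED_BUY", "rearm"),
    ("PRIMED_BUY", "yellow"): ("ARMED_SELL", "opposite_rearm"),
}

SWAP = {"turquoise": "yellow", "dark-green": "dark-red", "light-green": "light-red"}
SWAP |= {v: k for k, v in SWAP.items()}


def mirror(name: str) -> str:
    """Apply the SELL-mirror rule to a color or state name."""
    if name in SWAP:
        return SWAP[name]
    if name.endswith("_BUY"):
        return name[:-3] + "SELL"
    if name.endswith("_SELL"):
        return name[:-4] + "BUY"
    return name  # IDLE and gray are their own mirror


TABLE = dict(BUY_ROWS)
TABLE.update({(mirror(s), mirror(c)): (mirror(n), a) for (s, c), (n, a) in BUY_ROWS.items()})


@dataclass
class StateMachine:
    state: str = "IDLE"
    last_fire: dict = field(default_factory=dict)  # direction -> time of last FIRE

    def step(self, color: str, now: float):
        """Consume one accepted dot; return (action, fired_direction_or_None)."""
        nxt, action = TABLE.get((self.state, color), (self.state, "noise"))
        fired = None
        if action == "fire":
            direction = "BUY" if self.state == "PRIMED_BUY" else "SELL"
            if now - self.last_fire.get(direction, float("-inf")) < LOCKOUT_S:
                action = "locked_out"  # assumption: reset to IDLE but send no alert
            else:
                self.last_fire[direction] = now
                fired = direction
        self.state = nxt
        return action, fired


sm = StateMachine()
for i, color in enumerate(["turquoise", "gray", "dark-green", "light-green"]):
    action, fired = sm.step(color, now=i * 5.0)
print(action, fired)  # fire BUY
```

Passing `now` in explicitly (rather than reading the clock inside `step`) keeps the lockout testable without sleeping, which suits the parameterized state×color tests listed in the build order.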

---

## Risk Register

- **Prop firm TOS (Faza 2 blocker):** read TOS using `docs/phase2-prop-firm-audit.md` checklist before any auto-execution work. If EA/automation prohibited → Faza 2 dead, stay on Faza 1 permanently.
- **TradeStation layout change:** canary catches it within 5 min → auto-pause. Recalibrate. Losing a session to a layout change is acceptable cost.
- **Calibration drift over time:** versioned configs in `configs/` let you roll back to last-known-good if new calibration misfires.
- **DIA↔US30 price divergence:** accepted (user's judgment). Faza 1 journal captures slippage per signal, feeding Faza 2 go/no-go.
- **Screen sharing / RDP during trading:** overlay can break classification. Low prob, documented in README as operator hygiene.
- **Windows Task Scheduler DST transitions:** twice per year, schedule may misfire. Manual check first week of each DST change.

---

## Out of Scope (Faza 1)

- Any automated click in TradeLocker (Faza 2 work)
- Multi-symbol concurrent monitoring (single chart at a time; user switches manually between DIA and GLD)
- Backtesting on historical data (strategy already manually validated)
- Web UI / dashboard (headless + Discord/Telegram only)
- Ack feedback loop (react-on-notification labeling): deferred to TODOS.md as `P2-ack-loop`; shipping baseline first, adding feedback once detection quality verified
- ~~Telegram notifier: built only after Discord is stable 5+ sessions~~ **[ENG-REVIEW]** now in scope; built in parallel with Discord from day 1 (decision 4)

---

## Accepted Expansions (CEO review, SELECTIVE mode)

1. ✅ **Labeled sample corpus + dry-run metrics**: `labeler.py`, `labels.json`, automated precision/recall in dryrun. Makes acceptance criteria ("false-positives = 0, false-negatives ≤ 5%") machine-checkable.
2. ✅ **Level-extractor fallback (spec-math)**: Phase A always uses spec-math; Phase B validates against chart. Redundancy on fragile piece. **[ENG-REVIEW]** Phase A spec-math was later dropped (decision 2); the Phase B chart-scan is now the only source of levels.
3. ✅ **Layout canary + auto-pause**: `canary.py` hashes stable UI region, auto-pauses on drift.
   Catches silent classification-with-wrong-positions failure mode.
4. ✅ **Trade journal CLI**: `atm journal` + `trades.jsonl` + weekly report. Data for Faza 2 go/no-go decision.
5. ✅ **Prop-firm TOS audit checklist**: `docs/phase2-prop-firm-audit.md`. Structured Faza 2 evaluation framework shipped now.

## Deferred to TODOS.md

- **Ack feedback loop**: Discord reaction emojis feeding precision tuning. High value, operationally heavier (bot vs webhook). Add after Faza 1 baseline stable.

---

## GSTACK REVIEW REPORT

| Review | Trigger | Why | Runs | Status | Findings |
|--------|---------|-----|------|--------|----------|
| CEO Review | `/plan-ceo-review` | Scope & strategy | 1 | CLEAR (SELECTIVE EXPANSION) | 6 proposals, 5 accepted, 1 deferred; 2 arch corrections |
| Codex Review | `/codex review` | Independent 2nd opinion | 0 | - | - |
| Eng Review | `/plan-eng-review` | Architecture & tests (required) | 1 | CLEAR (FULL_REVIEW) | 9 issues found, 0 critical gaps; 4 decisions made, 0 unresolved |
| Design Review | `/plan-design-review` | UI/UX gaps | 0 | - | SKIPPED (no UI scope: CLI + Discord/Telegram) |
| DX Review | `/plan-devex-review` | Developer experience gaps | 0 | - | SKIPPED (personal tool, single user) |

**UNRESOLVED:** 0

**ENG REVIEW DECISIONS:**

1. **Bar flicker** → debounce depth=1 (configurable), rely on screenshot-in-notification for visual verification.
2. **Phase A entry price** → dropped. User places manual 0.6% SL in TradeLocker at entry. Phase A = direction + screenshot only. Phase B = real SL/TP1/TP2 from chart.
3. **Notifier blocking** → fire-and-forget worker threads per backend, bounded queue (size 50, drop-oldest), retry w/ backoff, dead-letter on total failure.
4. **Alert SPoF** → Discord + Telegram built in parallel from day 1, both fire together.

**ENG REVIEW OBVIOUS FIXES (stated, no decision):**

- Exhaustive state transition table (all state×color pairs, default-noise rule, SELL mirror explicit).
- Python 3.11+ pin, drop `tomli` dep, use stdlib `tomllib`.
- Windows symlink → `configs/current.txt` marker file.
- Shared `vision.py` module (ROI, hash, interp, Hough).
- `@dataclass Config` with fail-fast load-time validation.
- DPI check + multi-monitor note in calibrate + README.

**ENG REVIEW TEST SCOPE (accepted: FULL):** unit tests for every module (state_machine, detector, levels Phase B, canary, audit, notifier fanout/retry, calibrate roundtrip, config validate) + 1 E2E replay harness asserting labeled-corpus precision/recall. Test plan artifact: `~/.gstack/projects/romfast-workspace/claude-master-eng-review-test-plan-20260415-212932.md`.

**VERDICT:** CEO + ENG CLEARED, ready to implement. Run `/ship` after implementation. No further reviews required before build.
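
For the E2E replay harness named in the test scope, the precision/recall assertion might look like the sketch below. The per-sample label shape (sample name mapped to `"BUY"`, `"SELL"`, or `None`) is an assumption, since the `labels.json` schema is not defined in this plan, and `replay_metrics` / `gate_passes` are hypothetical helper names. A FIRE in the wrong direction is counted as both a false positive and a false negative.

```python
from typing import Mapping, Optional


def replay_metrics(
    labels: Mapping[str, Optional[str]],
    fires: Mapping[str, Optional[str]],
) -> tuple[float, float]:
    """Return (precision, recall) of replayed FIREs against labeled samples.

    Assumed shape: sample name -> "BUY" / "SELL" / None (no trigger expected).
    """
    tp = fp = fn = 0
    for sample in labels.keys() | fires.keys():
        want, got = labels.get(sample), fires.get(sample)
        if got is not None and got == want:
            tp += 1
        else:
            if got is not None:
                fp += 1  # fired with no label, or in the wrong direction
            if want is not None:
                fn += 1  # labeled trigger missed, or wrong direction
    # Assumption: with nothing fired (or nothing labeled) the ratio is treated as 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def gate_passes(precision: float, recall: float) -> bool:
    """Acceptance gate from the plan: precision = 100%, recall >= 95%."""
    return precision == 1.0 and recall >= 0.95
```

Inside a test, `assert gate_passes(*replay_metrics(labels, fires))` would then fail the build whenever a tolerance change introduces a false FIRE or drops recall below the gate.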