dashboard/api.py: adaug link "Ralph" (lucide bot icon) în NAV_HTML între
Workspace și KB. Pagina ralph.html se injectează corect cu nav-ul (verificat
live via curl pe :8088/ralph.html).
tests/test_e2e_planning_walkthrough.py (nou): 4 teste integration care
simulează scripted exact ce face un user pe Discord:
- click Planifică pe game-library cu UI scope → 4 faze (incl design-review)
- /office-hours → ceo → eng → design → final-plan.md stub scris pe disk
- "Dau drumul" → status approved + final_plan_path în approved-tasks.json
- description fără UI keywords → 3 faze (skip design)
- /cancel mid-planning → status revert pending, state cleared
- mesaj fără planning state → cade pe Claude main chat (NU orchestrator)
Subprocess `claude -p` mock-uit; testează tot wire-up-ul router → orchestrator
→ session și schema approved-tasks.json. Nu consumă credite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces 5s polling on /echo/ralph.html with EventSource streaming and adds
a rollback control for the running Ralph cards.
Server (dashboard/handlers/ralph.py):
- /api/ralph/stream — Server-Sent Events. Emits `event: status` whenever a
signature over the projects' state changes (poll filesystem at 2s); emits
`event: heartbeat` every 30s to keep proxies happy. Disables proxy
buffering via X-Accel-Buffering:no.
- /api/ralph/<slug>/rollback (POST) — runs `git revert --no-edit HEAD` in
the project; falls back to `git reset --hard HEAD~1` only if revert
reports conflict. After rolling back the commit, decrements `passes` on
the last user story marked complete in prd.json (atomic temp+rename
write, same pattern as ralph_dag.py). Returns
`{success, message, reverted_commit, story_reverted, method}`.
- _ralph_validate_slug tightened to a strict regex (alphanum + dash +
underscore, ≤64 chars) plus explicit ../, /, \ rejection. All previously
accepted slugs still pass; URL-encoded traversal and shell metachars
now blocked before the filesystem is touched.
- _ralph_collect_status / _ralph_signature factored out of
handle_ralph_status so the SSE loop can reuse them and detect changes
cheaply.
Server (dashboard/api.py):
- HTTPServer → ThreadingHTTPServer with daemon_threads=True. SSE is a
long-lived response; without threading a single client would block all
other dashboard endpoints.
- /api/ralph/stream (GET) and /api/ralph/<slug>/rollback (POST) wired
into the dispatch.
Client (dashboard/ralph.html):
- EventSource('/api/ralph/stream') with permanent fallback to 5s polling
when readyState=CLOSED (no server, CORS blocked, browser without SSE).
- Indicator badge: 🟢 Live (SSE), ⏱ Polling (fallback), Offline.
- Rollback button (undo-2 icon) on running cards; native confirm() with
message: "Asta va da git revert HEAD pe <slug> și va decrementa ultima
story trecută. Continui?"
Tests (tests/test_dashboard_ralph_endpoint.py, +20 cases):
- Strict slug validator: underscore allowed, >64 rejected, special chars
/ backslash / URL-encoded traversal rejected.
- _ralph_collect_status + _ralph_signature: stable when nothing changes,
flips when project added or `passes` toggles.
- Rollback: invalid slug → 400, non-git project → 400, real two-commit
repo revert succeeds and decrements last passing story (US-002 goes
passes:false while US-001 stays passes:true), no-passing-stories case
succeeds with story_reverted=None, response shape contract, atomic
helper leaves no .tmp file behind.
- API routing smoke: confirms ThreadingHTTPServer + stream + rollback
references present in dashboard/api.py.
39/39 tests pass on tests/test_dashboard_ralph_endpoint.py. Pre-existing
failures in test_dashboard_constants.py::test_base_dir_is_echo_core (the
worktree dir is `echo-core-realtime`, not `echo-core`) and
test_dashboard_unified_index.py::test_index_has_all_panels are unrelated
to this change and reproduced on master.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Echo Core devine planning agent: poartă o conversație multi-fază cu Marius
folosind skill-urile gstack (/office-hours → /plan-ceo-review →
/plan-eng-review → /plan-design-review opt) și produce final-plan.md în
~/workspace/<slug>/scripts/ralph/, gata să fie consumat de Ralph PRD
generator (W3) noaptea.
Decizii arhitecturale (din eng review + spike findings):
- PlanningSession ca clasă SEPARATĂ de chat-ul main (NU mode=string param)
— separation explicit. claude_session.py rămâne strict pentru chat;
planning trăiește în src/planning_session.py + src/planning_orchestrator.py.
Inheritance literală nu se aplică (claude_session.py expune funcții
module-level, nu o clasă) — separation e satisfacută prin module distinct.
- Fresh subprocess PER skill phase, NU single resumed session — phase-urile
coordinează via disk artifacts (gstack convention în
~/.gstack/projects/<slug>/). Avoids context window growth.
- --max-turns 20 default + retry pe error_max_turns la --max-turns 30.
Spike a arătat că prompt-uri complexe pot exploda turn budget-ul.
- approved-tasks.json schema extins cu planning_session_id + final_plan_path
(Status flow: pending → planning → approved → running → complete).
- State separat în sessions/planning.json (NU active.json), keyed pe
(adapter, channel_id) pentru re-resume la restart echo-core.
Trigger-e:
- Discord: slash command /plan <slug> [descriere] cu autocomplete pe pending,
buton "🧠 Planifică" în RalphProjectView, și /cancel slash command.
- Telegram: /plan + /cancel commands, plus buton "🧠 Planifică" în
ralph project keyboard.
- Router: state-aware routing — dacă chat-ul e în planning, mesajele plain
trec la PlanningOrchestrator.respond() prin --resume; /cancel revine la
status pending; /advance / "Continuă faza" advance fază nouă (fresh
subprocess); /finalize sau "Dau drumul" promote la status approved.
Discord defer pattern: toate butoanele noi (PlanningActiveView,
PlanningFinalView, "🧠 Planifică") apelează await
interaction.response.defer(ephemeral=True) ÎNAINTE de orice IO — evită
"Interaction failed" pe IO >3s.
UX strings warm + colaborativ (per design review): "🧠 Pornesc planning
pentru ...", "Răspunde aici", "Continuă faza", "Dau drumul tonight",
"Anulează" — niciun "Submit/Approve/Cancel" generic.
Tests: 23 noi (test_planning_session, test_planning_orchestrator,
test_router_planning) — toate pass. Mock pe _run_claude pentru a evita
subprocess Claude real în CI.
Files new:
prompts/planning_agent.md
src/planning_session.py
src/planning_orchestrator.py
tests/test_planning_session.py
tests/test_planning_orchestrator.py
tests/test_router_planning.py
Files modified:
src/claude_session.py — _run_claude(cwd=...) optional + surface subtype/is_error
src/router.py — state-aware routing, start_planning_session, planning_advance/approve/cancel, _ralph_propose schema cu planning_session_id + final_plan_path
src/adapters/discord_bot.py — /plan + /cancel slash commands; planning views imported
src/adapters/discord_views.py — PlanningActiveView, PlanningFinalView, "Planifică" button în RalphProjectView, _split_chunks helper
src/adapters/telegram_bot.py — /plan + /cancel handlers, callback_ralph extins cu plan/planadvance/plancancel/planapprove, planning keyboards
Status testelor pe modulele atinse: 75 passed, 0 failed
(test_claude_session security_section preexistent — neatins).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructurare Ralph QC loop pe smart gate dispatcher tag-driven (în loc de
5 faze fixe), DAG dependsOn cu propagare blocked, retry guard 3-strike, rate
limit detection, plus dashboard live cu polling 5s.
Changes:
- tools/ralph_prd_generator.py: parametru optional final_plan_path; când e
furnizat, invocă Claude Opus pe final-plan.md pentru extragere user stories
cu schema extinsă (tags, dependsOn, acceptanceCriteria 3-5). Backward compat
păstrat — fără final_plan_path, fallback la heuristic-ul vechi.
- tools/ralph/prd-template.json: schema W3 (tags[], dependsOn[], retries,
failed, blocked, failureReason, requiresDesignReview).
- tools/ralph/prompt.md: 4 faze (impl, base quality, smart gates, commit) +
dispatcher pe story.tags. Tags vide → run-all-gates fallback (safe default).
- tools/ralph_dag.py (nou): tag validation heuristic anti-silent-regression
(force ui dacă diff atinge .vue/.tsx/.html/.css/.scss; force db pentru
migrations sau .sql; force vercel dacă există vercel.json) + topological
sort cu blocked propagation + atomic prd.json updates.
- tools/ralph/ralph.sh: --max-turns 30, DAG-aware story selection, retry
counter cu auto-fail la 3, rate limit detection (sleep 30min + 1 retry),
CLI subcommands prin tools/ralph_dag.py helper.
- dashboard/handlers/ralph.py (nou): /api/ralph/status + /<slug>/log + /prd
+ /stop. Defensive vs corrupt prd.json. Sandbox-ed PID kill.
- dashboard/ralph.html (nou): live cards 3/2/1 col responsive, polling 5s,
drawer pentru log/PRD viewer, status colors (--status-running/blocked/
failed/complete declarate inline), Lucide icons cu aria-labels.
- dashboard/api.py: mount /api/ralph/* (GET status/log/prd, POST stop).
- tests/: 72 teste noi (smart gates, DAG, retry, dashboard endpoint).
Note arhitecturale:
- Polling 5s ales peste SSE/WebSocket (suficient pentru iter Ralph 8-15min)
- Tag validation rulează POST-iter pe diff git pentru anti-silent-regression
- Rate limit retry: 1 dată per rulare, apoi mark failed=rate_limited
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Loads cron/jobs.json and asserts: unique names, valid cron expressions
(APScheduler parseable), bool enabled field; kind:"shell" entries must
have non-empty channel, non-empty command list of strings, valid
report_on, and timeout within [1, 3600] when present; claude entries
must have non-empty prompt, valid model, list-typed allowed_tools.
Sanity-checks that shell commands reference existing scripts in the
repo and that no imported claude prompt still points at /home/moltbot/clawd/.
Four checks:
- The script file exists at the expected path.
- The source contains the marker print statement (fast regression guard).
- Running the script against an empty config produces a matching marker
(^GSTACK-CRON: changes=\d+$) with changes=0.
- The marker is the last non-empty line of stdout so tailers can parse it.
The runtime test copies the script into a tmp cwd so that the script's
SCRIPT_DIR-relative state files (hashes.json, versions.json, snapshots/,
monitor.log) don't pollute the repo.
Adds four new test groups to tests/test_scheduler.py:
- TestTimezone: asserts AsyncIOScheduler is constructed with Europe/Bucharest.
- TestShellKind: 16 cases covering add_shell_job validation (duplicate name
across claude/shell, invalid cron, empty/non-list/non-string command,
bad report_on, bad timeout bounds/type, empty channel, custom report_on
and timeout pass-through).
- TestShellExecute: 14 cases covering the report_on contract:
- exit 0 + marker N>0 → forwards stdout
- exit 0 + marker N==0 → silent
- exit 0 + no marker → silent + warning
- report_on=always and =never variants
- non-zero exit reports stderr even when report_on='never'
- TimeoutExpired and launch exceptions report '[cron:X] Error: ...'
- per-job timeout passed to subprocess.run; default 300 when None
- subprocess.run receives the job's command list verbatim
- stdout trimmed to 1500 ch; stderr trimmed to 500 ch
- TestBackwardCompat: a jobs.json entry without a 'kind' field dispatches
to _execute_claude_job (never to _execute_shell_job); the existing Claude
add_job/run_job round-trip still works with the old CLI invocation.
- TestMarkerRegex: parametrised positive/negative cases for _MARKER_RE.
Fast commands for git, email, calendar, notes, search, reminders, and
diagnostics — all execute instantly without Claude CLI. Incremental
embeddings indexing in heartbeat (1h cooldown) + inline indexing after
/note, /jurnal, /email save. Fix Ollama URL (localhost → 10.0.20.161),
fix email_process.py KB path (kb/ → memory/kb/).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Calendar no longer bypasses quiet hours. First run after quiet hours
sends full daily summary, subsequent runs only remind for next event
within 45 min with deduplication. Calendar cooldown set to 30 min.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Heartbeat system overhaul:
- Fix email/calendar checks to parse JSON output correctly
- Add per-check cooldowns and quiet hours config
- Send findings to Discord channel instead of just logging
- Auto-reindex KB when stale files detected
- Claude CLI called only if HEARTBEAT.md has extra instructions
- All settings configurable via config.json heartbeat section
Move hardcoded values to config.json:
- allowed_tools list (claude_session.py)
- Ollama URL/model (memory_search.py now reads ollama.url from config)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace all ~/clawd and ~/.clawdbot paths with ~/echo-core equivalents
in tools (git_commit, ralph_prd_generator, backup_config, lead-gen)
- Update personality files: TOOLS.md repo/paths, AGENTS.md security audit cmd
- Migrate HANDOFF.md architectural decisions to docs/architecture.md
- Tighten credentials/ dir to 700, add to .gitignore
- Add .claude/ and *.pid to .gitignore
- Various adapter, router, and session improvements from prior work
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from --output-format json to --output-format stream-json --verbose
so that _run_claude() parses all assistant text blocks (not just the final
result field). Discord/Telegram/WhatsApp now receive every intermediate
message Claude writes between tool calls.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Created echo-core.service and echo-whatsapp-bridge.service (user units)
- CLI status/doctor now use systemctl --user show instead of PID file
- CLI restart uses kill+start pattern for reliability
- Added echo stop command
- CLI shebang uses venv python directly for keyring support
- Updated tests to mock _get_service_status instead of PID file
- 440 tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bridge: allow fromMe messages in groups, include participant field in
message queue, bind to 0.0.0.0 for network access, QR served as HTML.
Adapter: process registered group messages (route to Claude), extract
participant for user identification, fix unbound 'phone' variable.
Tested end-to-end: WhatsApp group chat with Claude working. 442 tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Node.js bridge (bridge/whatsapp/): Baileys client with Express HTTP API
on localhost:8098 — QR code linking, message queue, reconnection logic.
Python adapter (src/adapters/whatsapp.py): polls bridge every 2s, routes
through router.py, separate whatsapp.owner/admins auth, security logging.
Integrated in main.py alongside Discord + Telegram via asyncio.gather.
CLI: echo whatsapp status/qr. 442 tests pass (32 new, zero failures).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Prompt injection protection: external messages wrapped in [EXTERNAL CONTENT]
markers, system prompt instructs Claude to never follow external instructions
- Invocation logging: all Claude CLI calls logged with channel, model, duration,
token counts to echo-core.invoke logger
- Security logging: separate echo-core.security logger for unauthorized access
attempts (DMs from non-admins, unauthorized admin/owner commands)
- Security log routed to logs/security.log in addition to main log
- Extended echo doctor: Claude CLI functional check, config.json secret scan,
.gitignore completeness, file permissions, Ollama reachability, bot process
- Subprocess env stripping logged at debug level
373 tests pass (10 new security tests).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Semantic search over memory/*.md files using all-minilm embeddings.
Adds /search Discord command and `echo memory search/reindex` CLI.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
/model (show/change), /restart (owner), /logs, set_session_model API, model reset on /clear. 20 new tests (161 total).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Subprocess wrapper for Claude CLI with start/resume/clear sessions, personality system prompt, atomic session tracking. 38 new tests (89 total).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>