Files
echo-core/tests/test_planning_session.py
Marius Mutu 51e56af557 feat(ralph): conversational planning agent (W2)
Echo Core devine planning agent: poartă o conversație multi-fază cu Marius
folosind skill-urile gstack (/office-hours → /plan-ceo-review →
/plan-eng-review → /plan-design-review opt) și produce final-plan.md în
~/workspace/<slug>/scripts/ralph/, gata să fie consumat de Ralph PRD
generator (W3) noaptea.

Decizii arhitecturale (din eng review + spike findings):
- PlanningSession ca clasă SEPARATĂ de chat-ul main (NU mode=string param)
  — separation explicit. claude_session.py rămâne strict pentru chat;
  planning trăiește în src/planning_session.py + src/planning_orchestrator.py.
  Inheritance literală nu se aplică (claude_session.py expune funcții
  module-level, nu o clasă) — separation e satisfacută prin module distinct.
- Fresh subprocess PER skill phase, NU single resumed session — phase-urile
  coordinează via disk artifacts (gstack convention în
  ~/.gstack/projects/<slug>/). Avoids context window growth.
- --max-turns 20 default + retry pe error_max_turns la --max-turns 30.
  Spike a arătat că prompt-uri complexe pot exploda turn budget-ul.
- approved-tasks.json schema extins cu planning_session_id + final_plan_path
  (Status flow: pending → planning → approved → running → complete).
- State separat în sessions/planning.json (NU active.json), keyed pe
  (adapter, channel_id) pentru re-resume la restart echo-core.

Trigger-e:
- Discord: slash command /plan <slug> [descriere] cu autocomplete pe pending,
  buton "🧠 Planifică" în RalphProjectView, și /cancel slash command.
- Telegram: /plan + /cancel commands, plus buton "🧠 Planifică" în
  ralph project keyboard.
- Router: state-aware routing — dacă chat-ul e în planning, mesajele plain
  trec la PlanningOrchestrator.respond() prin --resume; /cancel revine la
  status pending; /advance / "Continuă faza" advance fază nouă (fresh
  subprocess); /finalize sau "Dau drumul" promote la status approved.

Discord defer pattern: toate butoanele noi (PlanningActiveView,
PlanningFinalView, "🧠 Planifică") apelează await
interaction.response.defer(ephemeral=True) ÎNAINTE de orice IO — evită
"Interaction failed" pe IO >3s.

UX strings warm + colaborativ (per design review): "🧠 Pornesc planning
pentru ...", "Răspunde aici", "Continuă faza", "Dau drumul tonight",
"Anulează" — niciun "Submit/Approve/Cancel" generic.

Tests: 23 noi (test_planning_session, test_planning_orchestrator,
test_router_planning) — toate pass. Mock pe _run_claude pentru a evita
subprocess Claude real în CI.

Files new:
  prompts/planning_agent.md
  src/planning_session.py
  src/planning_orchestrator.py
  tests/test_planning_session.py
  tests/test_planning_orchestrator.py
  tests/test_router_planning.py

Files modified:
  src/claude_session.py        — _run_claude(cwd=...) optional + surface subtype/is_error
  src/router.py                — state-aware routing, start_planning_session, planning_advance/approve/cancel, _ralph_propose schema cu planning_session_id + final_plan_path
  src/adapters/discord_bot.py  — /plan + /cancel slash commands; planning views imported
  src/adapters/discord_views.py — PlanningActiveView, PlanningFinalView, "Planifică" button în RalphProjectView, _split_chunks helper
  src/adapters/telegram_bot.py — /plan + /cancel handlers, callback_ralph extins cu plan/planadvance/plancancel/planapprove, planning keyboards

Status testelor pe modulele atinse: 75 passed, 0 failed
(test_claude_session security_section preexistent — neatins).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 18:38:51 +00:00

279 lines
10 KiB
Python

"""Tests for src/planning_session.py — PlanningSession + state persistence."""
from __future__ import annotations
import json
from pathlib import Path
from unittest.mock import patch
import pytest
from src import planning_session
from src.planning_session import (
PHASE_NEEDS_INPUT_MARKER,
PHASE_READY_MARKER,
PlanningSession,
_channel_key,
build_planning_system_prompt,
clear_planning_state,
get_planning_state,
is_in_planning,
)
# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------
@pytest.fixture
def tmp_planning_state(tmp_path, monkeypatch):
"""Redirect planning state file into a tmp dir for each test."""
fake_sessions_dir = tmp_path / "sessions"
fake_sessions_dir.mkdir()
fake_state = fake_sessions_dir / "planning.json"
monkeypatch.setattr(planning_session, "SESSIONS_DIR", fake_sessions_dir)
monkeypatch.setattr(planning_session, "PLANNING_STATE_FILE", fake_state)
yield fake_state
@pytest.fixture
def fake_workspace(tmp_path, monkeypatch):
"""Pretend ~/workspace/<slug>/ exists so PlanningSession.cwd resolves."""
workspace = tmp_path / "workspace"
workspace.mkdir()
(workspace / "demo").mkdir()
monkeypatch.setattr(planning_session, "WORKSPACE_ROOT", workspace)
yield workspace
# ---------------------------------------------------------------------------
# build_planning_system_prompt
# ---------------------------------------------------------------------------
class TestBuildPlanningSystemPrompt:
def test_substitutes_slug_phase_description(self):
prompt = build_planning_system_prompt(
slug="demo", description="Add filter X", phase="/office-hours"
)
# Even if the prompt template differs, the values must appear at least
# once each.
assert "demo" in prompt
assert "Add filter X" in prompt
assert "/office-hours" in prompt
def test_returns_empty_when_template_missing(self, tmp_path, monkeypatch):
monkeypatch.setattr(
planning_session, "PLANNING_PROMPT_FILE", tmp_path / "missing.md"
)
assert build_planning_system_prompt("a", "b", "/x") == ""
# ---------------------------------------------------------------------------
# state get/set/clear
# ---------------------------------------------------------------------------
class TestPlanningState:
def test_clear_returns_false_when_absent(self, tmp_planning_state):
assert clear_planning_state("discord", "ch-1") is False
def test_get_returns_none_when_absent(self, tmp_planning_state):
assert get_planning_state("discord", "ch-1") is None
assert is_in_planning("discord", "ch-1") is False
def test_persist_and_recover(self, tmp_planning_state, fake_workspace):
# Build a session WITHOUT actually invoking claude — call _persist directly.
sess = PlanningSession(
slug="demo",
description="desc",
phase="/office-hours",
channel_id="ch-1",
adapter="discord",
session_id="sess-uuid-1",
)
sess._last_response = "hello world " + PHASE_NEEDS_INPUT_MARKER
sess._last_subtype = "success"
sess._persist(action="start", cost_usd=0.42)
assert is_in_planning("discord", "ch-1") is True
state = get_planning_state("discord", "ch-1")
assert state is not None
assert state["slug"] == "demo"
assert state["session_id"] == "sess-uuid-1"
assert state["phase"] == "/office-hours"
assert state["last_subtype"] == "success"
assert "hello world" in state["last_text_excerpt"]
recovered = PlanningSession.from_state("discord", "ch-1")
assert recovered is not None
assert recovered.slug == "demo"
assert recovered.session_id == "sess-uuid-1"
assert recovered.phase == "/office-hours"
assert clear_planning_state("discord", "ch-1") is True
assert get_planning_state("discord", "ch-1") is None
# ---------------------------------------------------------------------------
# is_phase_ready
# ---------------------------------------------------------------------------
class TestIsPhaseReady:
def test_returns_true_when_marker_present(self, fake_workspace):
sess = PlanningSession("demo", "d", "/x", "ch", session_id="abc")
sess._last_response = f"some text {PHASE_READY_MARKER}"
assert sess.is_phase_ready() is True
def test_returns_false_when_marker_absent(self, fake_workspace):
sess = PlanningSession("demo", "d", "/x", "ch", session_id="abc")
sess._last_response = "some text without marker"
assert sess.is_phase_ready() is False
# ---------------------------------------------------------------------------
# cwd resolution
# ---------------------------------------------------------------------------
class TestCwd:
def test_workspace_dir_used_when_present(self, fake_workspace):
sess = PlanningSession("demo", "d", "/x", "ch")
assert sess.cwd == fake_workspace / "demo"
def test_falls_back_to_project_root_when_missing(
self, fake_workspace, monkeypatch
):
sess = PlanningSession("nonexistent-slug", "d", "/x", "ch")
# Falls back to PROJECT_ROOT
assert sess.cwd == planning_session.PROJECT_ROOT
# ---------------------------------------------------------------------------
# command construction
# ---------------------------------------------------------------------------
class TestBuildCmd:
def test_start_includes_skill_phase_and_max_turns(self, fake_workspace):
sess = PlanningSession("demo", "Add filter", "/office-hours", "ch")
cmd = sess._build_cmd(
"/office-hours Add filter",
resume=None,
max_turns=20,
with_system_prompt=False,
)
assert cmd[0:3] == [planning_session.CLAUDE_BIN, "-p", "/office-hours Add filter"]
assert "--max-turns" in cmd
assert "20" in cmd
assert "--output-format" in cmd
assert "stream-json" in cmd
assert "--dangerously-skip-permissions" in cmd
# No --resume on a fresh start
assert "--resume" not in cmd
def test_resume_includes_resume_flag(self, fake_workspace):
sess = PlanningSession(
"demo", "Add filter", "/office-hours", "ch", session_id="abc"
)
cmd = sess._build_cmd(
"user reply",
resume="abc",
max_turns=20,
with_system_prompt=False,
)
assert "--resume" in cmd
assert "abc" in cmd
def test_with_system_prompt_appends_flag(self, fake_workspace, tmp_path, monkeypatch):
# Create a tiny prompt file so build_planning_system_prompt returns text.
fake = tmp_path / "planning_agent.md"
fake.write_text("phase={phase} slug={slug}", encoding="utf-8")
monkeypatch.setattr(planning_session, "PLANNING_PROMPT_FILE", fake)
sess = PlanningSession("demo", "d", "/office-hours", "ch")
cmd = sess._build_cmd(
"prompt", resume=None, max_turns=20, with_system_prompt=True
)
assert "--system-prompt" in cmd
idx = cmd.index("--system-prompt")
assert "phase=/office-hours" in cmd[idx + 1]
assert "slug=demo" in cmd[idx + 1]
# ---------------------------------------------------------------------------
# start() — integration-flavoured, mocks _run_claude
# ---------------------------------------------------------------------------
class TestStart:
def test_persists_session_id_and_response(
self, tmp_planning_state, fake_workspace
):
fake_result = {
"result": "Bună! Câteva întrebări… " + PHASE_NEEDS_INPUT_MARKER,
"session_id": "claude-uuid-99",
"usage": {"input_tokens": 1000, "output_tokens": 200},
"total_cost_usd": 0.55,
"subtype": "success",
"is_error": False,
}
with patch("src.planning_session._run_claude", return_value=fake_result) as mock_run:
sess = PlanningSession.start(
slug="demo",
description="Add filter X",
phase="/office-hours",
channel_id="ch-1",
adapter="discord",
)
mock_run.assert_called_once()
# cwd kw passed
_, kwargs = mock_run.call_args
assert "cwd" in kwargs
assert sess.session_id == "claude-uuid-99"
assert "Bună" in sess.last_response
state = get_planning_state("discord", "ch-1")
assert state["session_id"] == "claude-uuid-99"
assert state["slug"] == "demo"
assert state["phase"] == "/office-hours"
def test_retries_on_error_max_turns(self, tmp_planning_state, fake_workspace):
# First call returns error_max_turns with no session_id; second returns success.
first = {
"result": "deep tool use",
"session_id": "",
"usage": {},
"total_cost_usd": 0.6,
"subtype": "error_max_turns",
"is_error": True,
}
second = {
"result": "now I have a question",
"session_id": "claude-uuid-2",
"usage": {},
"total_cost_usd": 0.7,
"subtype": "success",
"is_error": False,
}
with patch(
"src.planning_session._run_claude", side_effect=[first, second]
) as mock_run:
sess = PlanningSession.start(
slug="demo",
description="Add filter X",
phase="/office-hours",
channel_id="ch-1",
adapter="discord",
)
assert mock_run.call_count == 2
assert sess.session_id == "claude-uuid-2"
# Second call uses RETRY_MAX_TURNS
second_args = mock_run.call_args_list[1][0][0]
assert "30" in second_args # RETRY_MAX_TURNS
def test_respond_requires_session_id(self, fake_workspace):
sess = PlanningSession("demo", "d", "/x", "ch") # no session_id
with pytest.raises(RuntimeError):
sess.respond("hello")