Second voice UX iteration. Targets Marius's live-test pain points from today.
- **Voice-mode system prompt** (personality/VOICE_MODE.md, plumbed via
claude_session.build_system_prompt(voice_mode=True)) — when the voice
adapter starts a session, append voice-tailored instructions: short replies,
no markdown, no abbreviations, time without seconds, distances rounded
to "mii"/"milioane", no curly quotes / em-dash / ellipsis. Marius asked
for an "in-the-car friend" persona for voice.
- **Isolated voice session key** (router.py) — voice mode uses
`voice:<channel_id>` so it doesn't share context with the text adapter
on the same Discord channel. Fresh start, voice prompt applied
automatically without `/clear` ceremony. `/clear` drops both keys.
- **Metric units + Romanian thousands** (src/voice/normalize.py) —
`384.000 km` was being read as "trei sute optzeci și patru virgulă zero
zero zero km" because the dot was treated as decimal separator and `km`
wasn't expanded. New `normalize_thousands` collapses Romanian thousands
separators (`X.000`/`X.000.000`) before number expansion, and
`expand_units` handles km/kg/cm/mm/ml/ha/mp with correct Romanian
pluralization ("un kilometru", "două kilograme", "douăzeci de
centimetri", "o sută de kilometri" with "de" particle).
- **`/voice setvoice <M1-F5>` slash command** (discord_voice.py) — Discord
native autocomplete; swaps the live TTSQueue voice_id AND persists
voice.default_voice to config.json. No restart needed.
- **Verbal voice change** (src/voice/voice_commands.py — new module +
29 tests) — say "schimbă vocea pe M5" / "vorbește cu vocea F3" / "voce
em cinci" from inside the voice channel. Detector requires both a
trigger word (voce/vorbește/schimbă/treci pe) and a recognizable voice
ID (direct "M5", word form "em cinci", or fallback substring match for
Whisper-mangled forms like "unul cinci"=M5 and "Mâcinci"=M5). On
detection: live-swap, persist to config, mirror to chat with
`🎤 ... / 🔊 Voce → M5`, speak short ack in the NEW voice, skip
Claude. "pământinci" still can't be recovered (no recoverable digit
substring); user gets passthrough to Claude in that case.
- **Whisper initial_prompt** now lists the voice-command vocabulary so
STT biases toward producing clean "M5" / "F3" tokens instead of
inventing "pământ" / "unul" phonetic neighbors.
- **Fast barge-in** (pipeline.py EchoVoiceSink) — previously `ttsq.clear()`
only fired in `on_segment_done` (after 800ms silence + 2-3s STT ≈ 3s lag).
Now also fires from the sink as soon as VAD detects ≥2 consecutive
windows (~200ms) of sustained speech on Marius's user while Echo has
pending TTS frames. Single-window glitches don't cut Echo off; sustained
speech does. (Acoustic echo bleed-through still requires headphones —
no AEC in the bot.)
- Tests: 130 voice + router tests pass; updated test_router.py to expect
`/clear` to drop both text and voice session keys.
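
A minimal sketch of the thousands and unit handling described above, not the actual contents of src/voice/normalize.py. It assumes any dot followed by exactly three digits is a thousands separator, and it returns only the unit phrase (number-to-words expansion is left out). The helper names `needs_de` and `unit_phrase` and the three-entry unit table are hypothetical.

```python
import re

# Assumption: "X.ddd" groups are thousands separators (Romanian uses a comma for decimals).
_THOUSANDS = re.compile(r"\b(\d{1,3})((?:\.\d{3})+)\b")

# Hypothetical subset of the unit table: abbreviation -> (singular, plural).
_UNITS = {
    "km": ("kilometru", "kilometri"),
    "kg": ("kilogram", "kilograme"),
    "cm": ("centimetru", "centimetri"),
}


def normalize_thousands(text: str) -> str:
    """Collapse dot-separated thousands so '384.000' becomes '384000'."""
    return _THOUSANDS.sub(lambda m: m.group(1) + m.group(2).replace(".", ""), text)


def needs_de(n: int) -> bool:
    """Romanian puts 'de' before the noun when n >= 20 and n % 100 is 0 or 20..99."""
    last_two = n % 100
    return n >= 20 and (last_two == 0 or last_two >= 20)


def unit_phrase(n: int, unit: str) -> str:
    """Return the spoken unit for a count, e.g. (100, 'km') -> 'de kilometri'."""
    singular, plural = _UNITS[unit]
    if n == 1:
        return singular
    return ("de " if needs_de(n) else "") + plural
```

Running `normalize_thousands` before number expansion is what keeps the dot from being read as "virgulă".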
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Vendored fork: discord-ext-voice_recv 0.5.3a+echo.dave1
Patches the receive pipeline to handle Discord's mandatory DAVE E2E
encryption on voice gateway v=8. Without this, opus_decode raised
"corrupted stream" on every received packet in a DAVE-active room and
voice-to-voice never connected.
DAVE patch (vendor/discord-ext-voice-recv/reader.py):
- `_maybe_dave_decrypt(rtp_packet)`: gate mirrors discord.py 2.7.1
`voice_state.can_encrypt`. Uses davey's `can_passthrough(user_id)` to
branch — peers in passthrough send transport-only packets that pass
through verbatim; peers in DAVE epoch go through `davey.decrypt`.
- Hooked in `callback()` between transport decrypt and feed_rtp;
drops on decrypt failure without killing the reader thread.
- Bumps __version__ to '0.5.3a+echo.dave1' (PEP 440 local segment) so a
contract test can fail fast on accidental upstream-sync overwrite.
Pipeline fixes uncovered while testing DAVE end-to-end:
- src/voice/pipeline.py: silero-vad v6+ requires exactly 512 samples per
call at 16kHz; our 100ms window (1600 samples) was silently raising
ValueError → VAD always returned False → STT never fired. Slice the
window into 512-sample chunks. Bump whisper beam_size 1→5 and add a
Romanian `initial_prompt` — transcriptions go from "Eco salt." gibberish
to "Echo, salutare, te rog spune-mi cât este ora."
- src/voice/tts_stream.py: EchoStreamingAudioSource.read() returns a 20ms
silence frame instead of b'' on empty queue. Empty return is treated
by Discord as end-of-stream and kills the player, so any TTS pushed
later would be silently discarded.
- src/adapters/discord_voice.py: actually attach EchoStreamingAudioSource
to the voice client after the wakeup beep (chained via `after=`),
which was missing entirely — TTS frames had no consumer.
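
The two pipeline fixes above can be sketched roughly as follows. This is an illustration, not the code in pipeline.py or tts_stream.py: the names `split_for_vad` and `read_frame` are hypothetical, and carrying the leftover samples into the next window is an assumption (the commit only says the window is sliced into 512-sample chunks). The silence frame size assumes Discord's usual 48 kHz, 16-bit, stereo PCM.

```python
import queue

VAD_CHUNK = 512  # samples per call at 16 kHz, as required by silero-vad v6+

# 20 ms of 48 kHz 16-bit stereo PCM: 960 samples * 2 channels * 2 bytes = 3840 zero bytes.
SILENCE_20MS = bytes(48_000 // 50 * 2 * 2)


def split_for_vad(samples, carry=()):
    """Split a window into full 512-sample chunks; return (chunks, leftover)."""
    buf = list(carry) + list(samples)
    full = len(buf) - len(buf) % VAD_CHUNK
    chunks = [buf[i:i + VAD_CHUNK] for i in range(0, full, VAD_CHUNK)]
    return chunks, buf[full:]


def read_frame(frames: "queue.Queue[bytes]") -> bytes:
    """Return the next TTS frame, or silence so the player is never told the stream ended."""
    try:
        return frames.get_nowait()
    except queue.Empty:
        return SILENCE_20MS
```

A 1600-sample window yields three full chunks and 64 leftover samples, which the caller can pass back as `carry` on the next window.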
Tests:
- tests/test_voice_recv_dave.py: 11 unit + callback integration tests
covering bypass paths, can_passthrough gate, decrypt error handling.
- tests/test_voice_adapter_contract.py: +test_voice_recv_fork_version
and +test_voice_connection_state_has_dave_attrs guard against
upstream drift on either side.
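
A rough idea of what the fork-version guard could check, assuming the contract is simply "the version string carries an `echo` local segment". The helper name `has_echo_local_segment` is hypothetical; the real test may compare the full string instead.

```python
def has_echo_local_segment(version: str) -> bool:
    """True if a PEP 440 version has a local segment starting with 'echo'."""
    public, plus, local = version.partition("+")
    return bool(public) and plus == "+" and local.split(".")[0] == "echo"
```

An upstream sync that overwrites the vendored package would reset the version to one without the `+echo...` suffix, so this check would fail immediately.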
Config:
- config.json: voice.allowed_user_ids whitelist for Marius's user id.
Status: voice-to-voice loop closes end-to-end (DAVE → VAD → Whisper →
Claude → Supertonic → audio out). Latency is ~8-13s per turn; reducing
it is out of scope for this commit. See TODOS.md for the real-time UX
follow-up plan.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Imported Claude jobs default to the echo-work channel; grup-sprijin-5feb
and grup-sprijin-pregatire route to echo-sprijin. The existing echo-core
channel is preserved.
Loop through consecutive newsletter numbers until one is missing, so
backlog gets delivered in a single run. Use httpx for 404 check and
point to absolute claude binary path for cron. Enable job in config.
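
The backlog loop can be sketched like this. The existence check is injected so the example runs without network access; in the job itself it would be the httpx request that treats a 404 as "missing". The function name `collect_backlog` and the `limit` safety cap are assumptions, not part of the commit.

```python
from typing import Callable, List


def collect_backlog(start: int, exists: Callable[[int], bool], limit: int = 50) -> List[int]:
    """Collect consecutive newsletter numbers from `start` until one is missing."""
    found: List[int] = []
    number = start
    # `limit` (hypothetical) guards against an endless loop if the check never fails.
    while len(found) < limit and exists(number):
        found.append(number)
        number += 1
    return found
```

With issues 10 to 12 published, `collect_backlog(10, exists)` returns all three in one run instead of one per cron invocation.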
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- heartbeat saves unread whitelisted emails via email_process --save --json
- fix: add --add-dir so Claude CLI subprocess can access memory/ symlink
- email_check/process: use BODY.PEEK[] to avoid marking emails as read
- email_process: simplify credential loading via credential_store only
- config: heartbeat interval 30→120min, quiet hours end 08→07
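
For reference, a minimal sketch of a non-marking fetch with the standard library `imaplib`. `BODY.PEEK[]` returns the same content as `BODY[]` but does not set the `\Seen` flag. The function name `fetch_raw_peek` is hypothetical, and `conn` is assumed to be an already authenticated `imaplib.IMAP4` connection with a mailbox selected.

```python
def fetch_raw_peek(conn, msg_num: str) -> bytes:
    """Fetch the full raw message without marking it as read."""
    typ, data = conn.fetch(msg_num, "(BODY.PEEK[])")
    # imaplib returns the payload as a (header, body) tuple in the first element.
    if typ != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"fetch failed for message {msg_num}: {typ}")
    return data[0][1]
```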
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Switch Bash permission patterns from space to colon separator
- Add memory.bak/ to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fast commands for git, email, calendar, notes, search, reminders, and
diagnostics — all execute instantly without Claude CLI. Incremental
embeddings indexing in heartbeat (1h cooldown) + inline indexing after
/note, /jurnal, /email save. Fix Ollama URL (localhost → 10.0.20.161),
fix email_process.py KB path (kb/ → memory/kb/).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Calendar no longer bypasses quiet hours. The first run after quiet hours
sends the full daily summary; subsequent runs only remind about the next
event within 45 min, with deduplication. Calendar cooldown set to 30 min.
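
One way to picture the summary-then-reminder behaviour, as a sketch under assumptions: the state dict layout, the event shape (`id`, `start`), and the name `calendar_action` are all hypothetical, and the quiet-hours check is assumed to happen in the caller.

```python
from datetime import datetime, timedelta


def calendar_action(now: datetime, events: list, state: dict,
                    window: timedelta = timedelta(minutes=45)):
    """Return ('summary' | 'reminder' | 'none', events_to_send) and update state."""
    today = now.date().isoformat()
    if state.get("summary_date") != today:
        # First run of the day after quiet hours: send everything once.
        state["summary_date"] = today
        return "summary", list(events)
    upcoming = [e for e in events if now <= e["start"] <= now + window]
    if not upcoming:
        return "none", []
    next_event = min(upcoming, key=lambda e: e["start"])
    reminded = state.setdefault("reminded", set())
    if next_event["id"] in reminded:
        return "none", []  # deduplication: already reminded about this event
    reminded.add(next_event["id"])
    return "reminder", [next_event]
```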
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Heartbeat system overhaul:
- Fix email/calendar checks to parse JSON output correctly
- Add per-check cooldowns and quiet hours config
- Send findings to Discord channel instead of just logging
- Auto-reindex KB when stale files detected
- Claude CLI called only if HEARTBEAT.md has extra instructions
- All settings configurable via config.json heartbeat section
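
A small sketch of the two gating checks, assuming quiet hours are given as whole hours and may wrap past midnight. The function names and the example start hour of 23 are illustrative and not taken from config.json.

```python
from datetime import datetime, timedelta
from typing import Optional


def in_quiet_hours(hour: int, start: int, end: int) -> bool:
    """True if `hour` falls in [start, end), handling ranges that wrap midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


def cooldown_elapsed(last_run: Optional[datetime], now: datetime, minutes: int) -> bool:
    """True if the check has never run or its per-check cooldown has passed."""
    return last_run is None or now - last_run >= timedelta(minutes=minutes)
```

A check would run only when `not in_quiet_hours(...)` and `cooldown_elapsed(...)` both hold.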
Move hardcoded values to config.json:
- allowed_tools list (claude_session.py)
- Ollama URL/model (memory_search.py now reads ollama.url from config)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bridge: allow fromMe messages in groups, include participant field in
message queue, bind to 0.0.0.0 for network access, QR served as HTML.
Adapter: process registered group messages (route to Claude), extract
participant for user identification, fix unbound 'phone' variable.
Tested end-to-end: WhatsApp group chat with Claude working. 442 tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Node.js bridge (bridge/whatsapp/): Baileys client with Express HTTP API
on localhost:8098 — QR code linking, message queue, reconnection logic.
Python adapter (src/adapters/whatsapp.py): polls bridge every 2s, routes
through router.py, separate whatsapp.owner/admins auth, security logging.
Integrated in main.py alongside Discord + Telegram via asyncio.gather.
CLI: echo whatsapp status/qr. 442 tests pass (32 new, zero failures).
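
The polling loop could look roughly like this. It is a sketch, not the code in src/adapters/whatsapp.py: `fetch` and `handle` are hypothetical injected coroutines standing in for the bridge HTTP call and the router, and the `stop` event is an assumption added so the loop can exit cleanly.

```python
import asyncio
from typing import Optional


async def poll_bridge(fetch, handle, interval: float = 2.0,
                      stop: Optional[asyncio.Event] = None) -> None:
    """Poll the bridge for queued messages every `interval` seconds until stopped."""
    stop = stop or asyncio.Event()
    while not stop.is_set():
        for message in await fetch():
            await handle(message)
        try:
            # Sleep for one interval, but wake immediately if stop is set.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```

A coroutine like this can sit next to the Discord and Telegram tasks in the same `asyncio.gather` call.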
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>