Files
echo-core/config.json
Marius Mutu e589e4885e feat(voice): voice-mode prompt, isolated session, units, verbal voice swap, fast barge-in
Second voice UX iteration. Targets Marius's live-test pain points from today.

- **Voice-mode system prompt** (personality/VOICE_MODE.md, plumbed via
  claude_session.build_system_prompt(voice_mode=True)) — when the voice
  adapter starts a session, append voice-tailored instructions: short replies,
  no markdown, no abbreviations, time without seconds, distances rounded
  to "mii"/"milioane", no curly quotes / em-dash / ellipsis. Marius asked
  for a "in-the-car friend" persona for voice.

- **Isolated voice session key** (router.py) — voice mode uses
  `voice:<channel_id>` so it doesn't share context with the text adapter
  on the same Discord channel. Fresh start, voice prompt applied
  automatically without `/clear` ceremony. `/clear` drops both keys.

- **Metric units + Romanian thousands** (src/voice/normalize.py) —
  `384.000 km` was being read as "trei sute optzeci și patru virgulă zero
  zero zero km" because the dot was treated as decimal separator and `km`
  wasn't expanded. New `normalize_thousands` collapses Romanian thousands
  separators (`X.000`/`X.000.000`) before number expansion, and
  `expand_units` handles km/kg/cm/mm/ml/ha/mp with correct Romanian
  pluralization ("un kilometru", "două kilograme", "douăzeci de
  centimetri", "o sută de kilometri" with "de" particle).

- **`/voice setvoice <M1-F5>` slash command** (discord_voice.py) — Discord
  native autocomplete; swaps the live TTSQueue voice_id AND persists
  voice.default_voice to config.json. No restart needed.

- **Verbal voice change** (src/voice/voice_commands.py — new module +
  29 tests) — say "schimbă vocea pe M5" / "vorbește cu vocea F3" / "voce
  em cinci" from inside the voice channel. Detector requires both a
  trigger word (voce/vorbește/schimbă/treci pe) and a recognizable voice
  ID (direct "M5", word form "em cinci", or fallback substring match for
  Whisper-mangled forms like "unul cinci"=M5 and "Mâcinci"=M5). On
  detection: live-swap, persist to config, mirror to chat with
  `🎤 ... / 🔊 Voce → M5`, speak short ack in the NEW voice, skip
  Claude. "pământinci" still can't be recovered (no recoverable digit
  substring); user gets passthrough to Claude in that case.

- **Whisper initial_prompt** now lists the voice-command vocabulary so
  STT biases toward producing clean "M5" / "F3" tokens instead of
  inventing "pământ" / "unul" phonetic neighbors.

- **Fast barge-in** (pipeline.py EchoVoiceSink) — previously `ttsq.clear()`
  only fired in `on_segment_done` (after 800ms silence + 2-3s STT ≈ 3s lag).
  Now also fires from the sink as soon as VAD detects ≥2 consecutive
  windows (~200ms) of sustained speech on Marius's user while Echo has
  pending TTS frames. Single-window glitches don't cut Echo off; sustained
  speech does. (Acoustic echo bleed-through still requires headphones —
  no AEC in the bot.)

- Tests: 130 voice + router tests pass; updated test_router.py to expect
  `/clear` to drop both text and voice session keys.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:59:10 +00:00

123 lines
2.5 KiB
JSON

{
"bot": {
"name": "Echo",
"default_model": "sonnet",
"owner": "949388626146517022",
"admins": [
"5040014994"
]
},
"channels": {
"echo-core": {
"id": "1471916752119009432",
"default_model": "sonnet"
},
"echo-work": {
"id": "1466726254312030259",
"default_model": "sonnet"
},
"echo-sprijin": {
"id": "1466739361503772864",
"default_model": "sonnet"
},
"echo-self": {
"id": "1466739112747864175",
"default_model": "sonnet"
}
},
"telegram_channels": {},
"whatsapp": {
"enabled": true,
"bridge_url": "http://127.0.0.1:8098",
"owner": "40723197939",
"admins": []
},
"whatsapp_channels": {
"echo-test": {
"id": "120363424350922235@g.us",
"default_model": "sonnet"
}
},
"heartbeat": {
"enabled": false,
"interval_minutes": 120,
"channel": "echo-core",
"model": "haiku",
"quiet_hours": [
23,
7
],
"checks": {
"email": true,
"calendar": true,
"kb_index": true,
"git": false
},
"cooldowns": {
"email": 1800,
"calendar": 1800,
"kb_index": 14400,
"git": 14400
}
},
"newsletter_cercetasi": {
"enabled": true,
"cron": "0 17 * * 4,5,1",
"channel": "echo-core"
},
"allowed_tools": [
"Read",
"Edit",
"Write",
"Glob",
"Grep",
"WebFetch",
"WebSearch",
"Bash(python3 *)",
"Bash(.venv/bin/python3 *)",
"Bash(pip *)",
"Bash(pytest *)",
"Bash(git *)",
"Bash(npm *)",
"Bash(node *)",
"Bash(npx *)",
"Bash(systemctl --user *)",
"Bash(trash *)",
"Bash(mkdir *)",
"Bash(cp *)",
"Bash(mv *)",
"Bash(ls *)",
"Bash(cat *)",
"Bash(chmod *)",
"Bash(docker *)",
"Bash(docker-compose *)",
"Bash(docker compose *)",
"Bash(ssh *@10.0.20.*)",
"Bash(ssh root@10.0.20.*)",
"Bash(ssh echo@10.0.20.*)",
"Bash(scp *10.0.20.*)",
"Bash(rsync *10.0.20.*)"
],
"discord": {
"email_webhook_url": "https://discord.com/api/webhooks/1496421990846697583/OM8z1eBsJC6-UB9-Zi5RkHP23NNv9UrEznRMx4Y3wSWOFmLazPoi-8_iEKMp0Qgsqr-m"
},
"ollama": {
"url": "http://10.0.20.161:11434"
},
"voice": {
"allowed_user_ids": [
"949388626146517022"
],
"user_name": "Marius",
"default_voice": "M5",
"auto_leave_minutes": 5
},
"paths": {
"personality": "personality/",
"tools": "tools/",
"memory": "memory/",
"logs": "logs/",
"sessions": "sessions/"
}
}