Optimizări cost 97%: session initialization, model routing, prompt caching
- Session Initialization Rule: Load ONLY SOUL.md, USER.md, IDENTITY.md, memory/YYYY-MM-DD.md
* Skip MEMORY.md, session history on startup (load on-demand via memory_search)
* Result: 50KB → 8KB context = 80% token savings
- Model Routing: Haiku default, Sonnet/Opus for complex reasoning only
* Haiku: routine tasks, memory searches (/bin/bash.00025/1K tokens)
* Sonnet/Opus: architecture, security, complex debugging
- Prompt Caching enabled for Sonnet + Opus (90% discount on reused content)
* TTL: 5m cache window
* Static files (SOUL.md, USER.md) cached automatically
* Savings: 5KB prompt = $0.015 → $0.0015 per reused call
- Rate Limits: 5s between API calls, 10s between searches, max 5 searches/batch
- Budgets: $5/day warning @ 75%, $200/month warning @ 75%
Gateway config (~/.openclaw/clawdbot.json):
* agents.defaults.model.cache enabled for opus + sonnet
* rateLimits + budgets sections added
* heartbeat routing to Ollama ready (manual setup)
Files updated:
- AGENTS.md: Core optimization rules documented
- memory/kb/tools/session-initialization.md: Detailed initialization strategy
- ~/.openclaw/clawdbot.json: Model config + caching + rate limits + budgets