claude-agent-service/tests
Viktor Barzin eccf0dd407
conversational: trim per-turn context to cut brain TTFT ~1.3s
The no-tools conversational agent was dragging the full project context (this
repo's CLAUDE.md, the MCP server configs, local settings) plus the dynamic
system-prompt sections into every voice turn — ~45k input tokens -> ~3.4s
time-to-first-token (measured against the live pod, 2026-06-21).

Add --setting-sources user + --exclude-dynamic-system-prompt-sections to both
the gateway (json) and realtime (stream-json) conversational argvs: context
drops to ~23k and TTFT to ~2.1s (~1.3s/turn faster) with no change to the
reply. Helps the portal-assistant v1 gateway AND the v2 realtime agent (both
run the same turn). The /execute agent path is untouched.
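
As a rough illustration of the change described above, the sketch below appends the two flags to a conversational argv. Only the flag names (--setting-sources user, --exclude-dynamic-system-prompt-sections) and the json / stream-json output formats come from the commit message. The helper name, the constant, and the base argv contents are hypothetical and are not taken from the repo.

```python
# Hypothetical sketch: helper and base argvs are illustrative assumptions,
# not the service's actual code. Only the two trim flags are from the commit.
TRIM_FLAGS = (
    ("--setting-sources", "user"),                    # load user-level settings only
    ("--exclude-dynamic-system-prompt-sections",),    # drop dynamic prompt sections
)


def with_trimmed_context(base_argv):
    """Return a copy of base_argv with each trim flag appended at most once."""
    argv = list(base_argv)
    for flag in TRIM_FLAGS:
        if flag[0] not in argv:
            argv.extend(flag)
    return argv


# Assumed base argvs for the two conversational paths (gateway and realtime).
gateway_argv = with_trimmed_context(["claude", "-p", "--output-format", "json"])
realtime_argv = with_trimmed_context(["claude", "-p", "--output-format", "stream-json"])
```

Keeping the flags in one shared helper is one way to make sure the gateway and realtime paths cannot drift apart, since both are meant to run the same turn. The /execute path would simply not call it.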

Investigation ruled out the assumed culprits: CLI startup is only ~0.5s, and a
warm prompt cache does NOT lower TTFT (turn 2 read all 45k from cache yet TTFT
was unchanged) — the cost was the context size, not the spawn.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 18:00:21 +00:00
__init__.py Initial extraction from monorepo 2026-05-07 17:07:12 +00:00
conftest.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_ci_watcher.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_dispatch_policy.py afk: wire the T3 adapter to the REAL orchestration contract + fix priority 2026-06-15 22:27:00 +00:00
test_afk_notifier.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_phase_checklist.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_poller.py afk: wire the T3 adapter to the REAL orchestration contract + fix priority 2026-06-15 22:27:00 +00:00
test_afk_run_state_machine.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_t3_client.py afk: wire the T3 adapter to the REAL orchestration contract + fix priority 2026-06-15 22:27:00 +00:00
test_afk_t3_live.py afk: wire the T3 adapter to the REAL orchestration contract + fix priority 2026-06-15 22:27:00 +00:00
test_afk_tracker.py afk: add the autonomous issue-implementer loop (SHIPS DISABLED) 2026-06-15 21:15:11 +00:00
test_afk_watcher.py afk: wire the T3 adapter to the REAL orchestration contract + fix priority 2026-06-15 22:27:00 +00:00
test_breakglass.py breakglass UI v2: attachable sessions (tmux model) + mobile-first redesign 2026-06-14 19:19:03 +00:00
test_concurrency.py parallel execution: replace single-flight lock with bounded semaphore + per-job workspace 2026-06-02 20:57:41 +00:00
test_conversational.py conversational: trim per-turn context to cut brain TTFT ~1.3s 2026-06-21 18:00:21 +00:00
test_main.py parallel execution: replace single-flight lock with bounded semaphore + per-job workspace 2026-06-02 20:57:41 +00:00
test_openai_compat.py chat-completions: stream conversational turns (SSE token relay) for realtime voice 2026-06-17 22:22:38 +00:00