Migrated from monorepo during Forgejo registry consolidation 2026-05-07
Viktor Barzin eccf0dd407
conversational: trim per-turn context to cut brain TTFT ~1.3s
The no-tools conversational agent was dragging the full project context (this
repo's CLAUDE.md, the MCP server configs, local settings) plus the dynamic
system-prompt sections into every voice turn — ~45k input tokens -> ~3.4s
time-to-first-token (measured against the live pod, 2026-06-21).

Add --setting-sources user + --exclude-dynamic-system-prompt-sections to both
the gateway (json) and realtime (stream-json) conversational argvs: context
drops to ~23k and TTFT to ~2.1s (~1.3s/turn faster) with no change to the
reply. Helps the portal-assistant v1 gateway AND the v2 realtime agent (both
run the same turn). The /execute agent path is untouched.
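
A minimal sketch of how the two conversational argvs could share the trimming flags. The two flags and the json / stream-json output formats come from the message above; the `claude -p` base command, the `--output-format` flag placement, and the names `CONTEXT_TRIM_FLAGS` and `conversational_argv` are assumptions for illustration, not the code in `app`.

```python
from typing import List, Optional

# Flags named in the commit message; they keep project-level settings and
# dynamic system-prompt sections out of each voice turn.
CONTEXT_TRIM_FLAGS = [
    "--setting-sources", "user",
    "--exclude-dynamic-system-prompt-sections",
]

# Output formats used by the gateway (json) and realtime (stream-json) paths.
ALLOWED_FORMATS = ("json", "stream-json")


def conversational_argv(output_format: str,
                        base: Optional[List[str]] = None) -> List[str]:
    """Build a no-tools conversational argv with the trimmed context.

    `base` is a hypothetical base command; the real one is not shown here.
    """
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    argv = list(base) if base is not None else ["claude", "-p"]
    argv += ["--output-format", output_format]
    argv += CONTEXT_TRIM_FLAGS
    return argv


gateway_argv = conversational_argv("json")
realtime_argv = conversational_argv("stream-json")
```

Keeping the flags in one shared list is one way to make sure the gateway and realtime paths cannot drift apart, since both run the same turn.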

Investigation ruled out the assumed culprits: CLI startup is only ~0.5s, and a
warm prompt cache does NOT lower TTFT (turn 2 read all 45k from cache yet TTFT
was unchanged) — the cost was the context size, not the spawn.
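
One way to reproduce this kind of measurement is to time a subprocess from spawn until its first line of output. The sketch below is an assumption about method, not the procedure used for the numbers above: `time_to_first_line` is a hypothetical helper, and first-output-line latency only approximates TTFT when the command streams its reply. The figure it returns includes spawn time, so the startup cost would need to be subtracted to isolate the model's share.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def time_to_first_line(argv: List[str]) -> Tuple[float, str]:
    """Return (seconds from spawn to first stdout line, that line)."""
    start = time.monotonic()
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        first = proc.stdout.readline()       # blocks until the first line arrives
        elapsed = time.monotonic() - start   # includes process startup
        for _ in proc.stdout:                # drain the rest so the child can exit
            pass
    return elapsed, first.rstrip("\n")


if __name__ == "__main__":
    # Stand-in child process; replace with the conversational argv under test.
    demo = [sys.executable, "-c",
            "import time; time.sleep(0.2); print('first'); print('rest')"]
    seconds, line = time_to_first_line(demo)
    print(f"first line {line!r} after {seconds:.2f}s")
```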

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 18:00:21 +00:00
.github/workflows ci: move image build off-infra to GHA -> ghcr (ADR-0002) 2026-06-13 01:45:36 +00:00
.woodpecker ci: move image build off-infra to GHA -> ghcr (ADR-0002) 2026-06-13 01:45:36 +00:00
agents conversational: add no-tools multi-turn Brain endpoint for portal-assistant 2026-06-17 18:38:44 +00:00
app conversational: trim per-turn context to cut brain TTFT ~1.3s 2026-06-21 18:00:21 +00:00
beads Initial extraction from monorepo 2026-05-07 17:07:12 +00:00
docs docs: capture AFK implementation pipeline design + ADRs 0002-0004 2026-06-14 19:09:12 +00:00
frontend breakglass UI v2: attachable sessions (tmux model) + mobile-first redesign 2026-06-14 19:19:03 +00:00
tests conversational: trim per-turn context to cut brain TTFT ~1.3s 2026-06-21 18:00:21 +00:00
.dockerignore Initial extraction from monorepo 2026-05-07 17:07:12 +00:00
.gitignore breakglass: in-cluster emergency-recovery UI for the devvm 2026-06-12 21:36:05 +00:00
CONTEXT.md docs: capture breakglass design (CONTEXT glossary + ADR 0001) 2026-06-12 20:59:13 +00:00
docker-entrypoint-breakglass.sh breakglass: in-cluster emergency-recovery UI for the devvm 2026-06-12 21:36:05 +00:00
Dockerfile breakglass: in-cluster emergency-recovery UI for the devvm 2026-06-12 21:36:05 +00:00
LICENSE.txt Initial extraction from monorepo 2026-05-07 17:07:12 +00:00
requirements-dev.txt Initial extraction from monorepo 2026-05-07 17:07:12 +00:00
requirements.txt Initial extraction from monorepo 2026-05-07 17:07:12 +00:00