Record the architecture for moving code implementation AFK, decided in a
design/grilling session. The owner wants the human-in-the-loop boundary to
stop at design + spec: once an issue is triaged ready-for-agent, an agent
should implement it test-first, push it, and see it to a healthy deploy on
its own, escalating only when it can't proceed.
Decisions captured:
- claude-agent-service is the control plane (poller + watcher + safety);
a dedicated in-cluster T3 Code instance is the executor + cockpit, because
T3 can only show sessions it launched itself -> we dispatch into it
(ADR 0003).
- AFK code pushes straight to master; on a broken deploy it fix-forwards
then freezes the broken state for forensics rather than reverting
(ADR 0002).
- Implementation agents use persistent per-repo checkouts + git worktrees on
SSD-NFS for warm caches, reversing the throwaway-clone rule for this path
because concurrency is serial-within-repo (ADR 0004).
Pilot-gated: five integration unknowns must be validated against a dedicated
T3 instance before the poller is wired. No code yet.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor wants a Claude-driven web UI on the agent service to act as a
breakglass: when the devvm is down he can open it, have Claude SSH in to
diagnose/repair, and power-cycle the VM via the Proxmox host if needed.
Grilling settled the design. Recording it now as the design record before
implementation:
- CONTEXT.md: glossary for the breakglass language (breakglass agent,
warm/cold case, forced-command verb, cycle vs reset, forensics).
- ADR 0001: the security architecture — isolated deployment in its own
namespace + narrow Vault policy (the existing claude-agent namespace's
terraform-state policy grants secret/data/* to Bash-wielding agents that
ingest untrusted input, so co-locating root-on-devvm keys would be
exfiltratable); warm-case-only scope (devvm wedged, cluster healthy —
the in-cluster UI can't survive the shared PVE host going down, which
stays the separate cold-path SSH design); and bounded-but-broad host
capability (full sudo on devvm, autonomous forced-command PVE power
verbs, forensics-first).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Multiple agent calls now run concurrently, each in its own isolated git
checkout (local clone of the warm base, hardlinked objects, git-crypt
re-unlocked), so concurrent jobs never share a working tree.
- execution_lock (asyncio.Lock) -> execution_semaphore (default MAX_CONCURRENCY=10);
excess calls queue FIFO instead of 409/503. MAX_QUEUE_DEPTH safety valve.
- /execute never returns 409; jobs go queued -> running. Timeout covers
execution only, not queue wait.
- /v1/chat/completions queues for a slot instead of 503-busy.
- /health: busy = at-capacity, plus active/queued/capacity fields.
- per-job workspace prepare/cleanup under a short git lock; the agent run holds none.
- in-memory job registry evicted past JOB_TTL_SECONDS.
Design: docs/2026-06-02-parallel-execution-design.md
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>