claude-breakglass: in-cluster warm break-glass UI for the devvm

Stand up the infra for Viktor's break-glass: when the devvm is wedged (cluster healthy), open breakglass.viktorbarzin.me, have Claude SSH in to diagnose/fix, and power-cycle VM 102 via the Proxmox host if needed. App half landed in the claude-agent-service repo.

New stack stacks/claude-breakglass/: own namespace + SA, NO Vault role (ESO syncs only its key, so the pod has zero direct Vault access). Hardened to survive the pressure it exists to fix: priorityClassName tier-0-core, broad node-pressure tolerations, anti-affinity off node1, imagePullPolicy Always. auth="required" ingress so it rides the Authentik resilience proxy and stays reachable via the basic-auth fallback during an auth-stack outage. Runs the shared claude-agent-service image with the breakglass entrypoint.

files/breakglass-pve is the PVE forced-command (status|forensics|reset|stop|start|cycle on VM 102, forensics-first).

Isolation: the shared claude-agent pod's terraform-state Vault policy is explicitly DENIED secret/claude-breakglass/* (stacks/vault/main.tf) so a prompt-injected agent on that pod can't read the root-on-devvm key.

traefik: add a checksum/auth-proxy-htpasswd annotation so the auth-proxy rolls when the emergency basic-auth password rotates (it's a subPath mount that doesn't auto-update). The password was regenerated this session so Viktor has a known emergency credential, which the auth-stack-outage failure domain requires.

Docs: docs/runbooks/breakglass-ui.md (full incident + bootstrap procedure, incl. the per-host from= NAT quirks) and a security.md note recording the two new privileged footholds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
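
For readers unfamiliar with SSH forced commands, the verb dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the contents of files/breakglass-pve: the function names (run, forensics, dispatch), the DRY_RUN switch, and the choice of what "forensics" collects are all assumptions. Only the verb list, VM 102, and the forensics-first ordering come from the commit message. The qm subcommands used (status, config, reset, stop, start) are standard Proxmox CLI verbs.

```shell
#!/bin/sh
# Hypothetical sketch of a forced-command dispatcher; NOT the real breakglass-pve.
VMID=102

# Print the command (default, DRY_RUN=1) or execute it (DRY_RUN=0).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder forensics step: the commit does not say what the real script collects.
forensics() {
  run qm status "$VMID" --verbose
  run qm config "$VMID"
}

# Accept exactly one known verb; anything else is rejected with exit code 2.
dispatch() {
  case "$1" in
    status)    run qm status "$VMID" ;;
    forensics) forensics ;;
    reset)     forensics; run qm reset "$VMID" ;;
    stop)      forensics; run qm stop "$VMID" ;;
    start)     run qm start "$VMID" ;;
    cycle)     forensics; run qm stop "$VMID"; run qm start "$VMID" ;;
    *)         echo "usage: status|forensics|reset|stop|start|cycle" >&2; return 2 ;;
  esac
}
```

In a real deployment the script would end with something like dispatch "$SSH_ORIGINAL_COMMAND", and the key's authorized_keys entry would pin it with command="..." and from="...". Because the whole client-supplied string is matched against the verb list, extra arguments or shell metacharacters fall through to the usage branch instead of being executed.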
parent 02785987dd
commit 32cf75635f
7 changed files with 629 additions and 0 deletions
@@ -598,6 +598,19 @@ resource "vault_policy" "terraform_state" {
 path "secret/metadata/vault" {
   capabilities = ["deny"]
 }
+# Explicit deny on the breakglass SSH key (added with the claude-breakglass
+# stack, 2026-06-12). That key grants root-on-devvm + PVE VM-102 power
+# verbs; it must NOT be readable by the shared claude-agent pod, whose
+# agents (recruiter-triage, nextcloud-todos-exec) ingest untrusted input
+# with Bash. The breakglass pod runs in its own namespace with NO Vault
+# role and gets the key via ESO only. See
+# claude-agent-service/docs/adr/0001-breakglass-security-architecture.md.
+path "secret/data/claude-breakglass/*" {
+  capabilities = ["deny"]
+}
+path "secret/metadata/claude-breakglass/*" {
+  capabilities = ["deny"]
+}
 EOT
 }
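
One way to confirm the deny actually took effect is to ask Vault which capabilities a token has on a path under the denied prefix. The helper below is a sketch under assumptions: the function name is_denied and the example key path are hypothetical, and it assumes the vault CLI is on PATH and VAULT_TOKEN holds a token bound to the terraform_state policy. It relies on vault token capabilities, which prints the current token's capabilities on a path.

```shell
# Return 0 if the current token's only capability on the given path is "deny".
# Assumption: VAULT_ADDR and VAULT_TOKEN are already exported by the caller.
is_denied() {
  caps=$(vault token capabilities "$1") || return 1
  [ "$caps" = "deny" ]
}
```

A possible call, with a placeholder key name, is is_denied secret/data/claude-breakglass/example && echo "deny in effect". The same check against the secret/metadata/claude-breakglass/ prefix covers the second path block in the hunk above.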