claude-breakglass: in-cluster warm break-glass UI for the devvm
Stand up the infra for Viktor's break-glass: when the devvm is wedged (cluster healthy), open breakglass.viktorbarzin.me, have Claude SSH in to diagnose/fix, and power-cycle VM 102 via the Proxmox host if needed. App half landed in the claude-agent-service repo.

New stack stacks/claude-breakglass/: own namespace + SA, NO Vault role (ESO syncs only its key, so the pod has zero direct Vault access). Hardened to survive the pressure it exists to fix: priorityClassName tier-0-core, broad node-pressure tolerations, anti-affinity off node1, imagePullPolicy Always. auth="required" ingress so it rides the Authentik resilience proxy and stays reachable via the basic-auth fallback during an auth-stack outage. Runs the shared claude-agent-service image with the breakglass entrypoint. files/breakglass-pve is the PVE forced-command (status|forensics|reset|stop|start|cycle on VM 102, forensics-first).

Isolation: the shared claude-agent pod's terraform-state Vault policy is explicitly DENIED secret/claude-breakglass/* (stacks/vault/main.tf) so a prompt-injected agent on that pod can't read the root-on-devvm key.

traefik: add a checksum/auth-proxy-htpasswd annotation so the auth-proxy rolls when the emergency basic-auth password rotates (it's a subPath mount that doesn't auto-update). The password was regenerated this session so Viktor has a known emergency credential, which the auth-stack-outage failure domain requires.

Docs: docs/runbooks/breakglass-ui.md (full incident + bootstrap procedure, incl. the per-host from= NAT quirks) and a security.md note recording the two new privileged footholds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
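The isolation rule in the commit message can be pictured roughly as follows. This is a minimal sketch, not the contents of stacks/vault/main.tf: the policy name terraform-state and the denied path come from the message, while the resource label and the placeholder comment for the existing grants are assumptions.

```hcl
# Hypothetical sketch of the deny rule; only the policy name and the
# denied path are taken from the commit message.
resource "vault_policy" "terraform_state" {
  name   = "terraform-state"
  policy = <<-EOT
    # (Existing grants for the shared claude-agent pod would sit here.)

    # Explicit deny: in Vault, "deny" takes precedence over any other
    # capability granted on a matching path.
    path "secret/claude-breakglass/*" {
      capabilities = ["deny"]
    }
  EOT
}
```

If the secret mount is KV v2, API paths carry a data/ or metadata/ segment after the mount name, so the real rule may need to be written against those paths.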
parent 02785987dd
commit 32cf75635f
7 changed files with 629 additions and 0 deletions
@@ -851,6 +851,10 @@ resource "kubernetes_deployment" "auth_proxy" {
           # nginx only reads its config at startup — roll the pods whenever
           # the ConfigMap content changes.
           "checksum/auth-proxy-config" = sha1(kubernetes_config_map.auth_proxy_config.data["default.conf"])
+          # The emergency-fallback htpasswd is a subPath secret mount, which
+          # does NOT auto-update on change — roll the pods when it rotates so a
+          # regenerated emergency password actually takes effect.
+          "checksum/auth-proxy-htpasswd" = sha1(var.auth_fallback_htpasswd)
         }
       }
       spec {
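For context on why the new annotation is needed: Kubernetes does not refresh a Secret that is mounted through subPath, so the running container keeps the old file until the pod is replaced. The sketch below shows the mount side of that pattern next to the checksum annotation. It is an illustration only; apart from var.auth_fallback_htpasswd and the annotation key, every name, namespace, image, and path is an assumption rather than something taken from the repo.

```hcl
# Hypothetical, trimmed-down deployment. Only the variable name and the
# annotation key match the diff; everything else is assumed.
variable "auth_fallback_htpasswd" {
  type      = string
  sensitive = true
}

resource "kubernetes_secret" "auth_fallback" {
  metadata {
    name      = "auth-fallback-htpasswd" # assumed name
    namespace = "traefik"                # assumed namespace
  }
  data = {
    htpasswd = var.auth_fallback_htpasswd
  }
}

resource "kubernetes_deployment" "auth_proxy_sketch" {
  metadata {
    name      = "auth-proxy-sketch"
    namespace = "traefik"
  }
  spec {
    selector {
      match_labels = { app = "auth-proxy-sketch" }
    }
    template {
      metadata {
        labels = { app = "auth-proxy-sketch" }
        annotations = {
          # Changing the secret value changes this hash, which changes the
          # pod template and so triggers a rolling restart.
          "checksum/auth-proxy-htpasswd" = sha1(var.auth_fallback_htpasswd)
        }
      }
      spec {
        container {
          name  = "nginx"
          image = "nginx:stable" # assumed image
          volume_mount {
            name       = "htpasswd"
            mount_path = "/etc/nginx/.htpasswd" # assumed path
            sub_path   = "htpasswd" # subPath mounts are not live-updated
            read_only  = true
          }
        }
        volume {
          name = "htpasswd"
          secret {
            secret_name = kubernetes_secret.auth_fallback.metadata[0].name
          }
        }
      }
    }
  }
}
```

Hashing the value into a pod-template annotation keeps the restart tied to the actual content, so an apply that does not change the password leaves the pods alone.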