infra

Viktor Barzin ae6dde45c2 k8s-version-upgrade: trigger etcd snapshot via existing backup-etcd Job; broaden agent RBAC	2026-05-22 14:16:43 +00:00
..
main.tf	k8s-version-upgrade: trigger etcd snapshot via existing backup-etcd Job; broaden agent RBAC	2026-05-22 14:16:43 +00:00
terragrunt.hcl	k8s-version-upgrade: automated kubeadm/kubelet/kubectl upgrade pipeline	2026-05-22 14:16:42 +00:00

Viktor Barzin ae6dde45c2 k8s-version-upgrade: trigger etcd snapshot via existing backup-etcd Job; broaden agent RBAC

Stage 2 now reuses the existing default/backup-etcd CronJob (NFS-backed
PV pointing at 192.168.1.127:/srv/nfs/etcd-backup) instead of trying to
ssh into master and run etcdctl against a non-existent /mnt/main mount.
The agent triggers a one-shot Job from cronjob/backup-etcd, waits up to
10 min for it to finish, then parses the backup-manage container log for
the "Backup done" line and its byte count.
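
A minimal Python sketch of the trigger, wait, and parse sequence described above. The job name, the `snapshot_commands` and `parse_backup_size` helpers, and the assumed shape of the "Backup done" log line are illustrative assumptions; the real agent code and the exact backup-manage log format are not shown here.

```python
import re
from typing import List, Optional

# Assumed line shape, e.g. "Backup done: 48234496 bytes". The actual
# backup-manage output is not shown in the source and may differ.
BACKUP_DONE = re.compile(r"Backup done\D*(\d+)")


def snapshot_commands(job_name: str, namespace: str = "default") -> List[List[str]]:
    """Build the kubectl invocations for one snapshot run (nothing is executed)."""
    return [
        # Create a one-shot Job from the existing CronJob template.
        ["kubectl", "create", "job", job_name,
         "--from=cronjob/backup-etcd", "-n", namespace],
        # Block until the Job completes, for at most 10 minutes.
        ["kubectl", "wait", "--for=condition=complete",
         f"job/{job_name}", "--timeout=600s", "-n", namespace],
        # Fetch the log of the container that reports the result.
        ["kubectl", "logs", f"job/{job_name}",
         "-c", "backup-manage", "-n", namespace],
    ]


def parse_backup_size(log_text: str) -> Optional[int]:
    """Return the byte count from the last "Backup done" line, or None."""
    size = None
    for line in log_text.splitlines():
        match = BACKUP_DONE.search(line)
        if match:
            size = int(match.group(1))
    return size
```

Each command list can be passed to `subprocess.run(cmd, check=True, capture_output=True, text=True)`. A `None` result from `parse_backup_size` would then be treated as a failed snapshot, so the upgrade does not proceed without a verified backup.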

Test 2 (dry-run) surfaced 5 real cluster blockers; the agent loop works
end-to-end at the planning level.

Expanded the claude-agent ServiceAccount's privileges via a sibling
ClusterRole (claude-agent-upgrade-ops):
  - patch namespaces/k8s-upgrade (in-flight annotation)
  - create batch/jobs (trigger etcd snapshot Job)
  - patch nodes (cordon/uncordon)
  - create pods/eviction (drain)
  - delete pods (drain fallback)
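
One reading of the permission list above, written as a Kubernetes ClusterRole manifest built in Python. This is a sketch, not the definition in main.tf (which is not shown here): the API groups and the use of `resourceNames` to scope the namespace patch are assumptions inferred from the bullet wording.

```python
import json


def upgrade_ops_cluster_role() -> dict:
    """Return a ClusterRole manifest mirroring the five listed permissions."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": "claude-agent-upgrade-ops"},
        "rules": [
            # In-flight annotation on the k8s-upgrade namespace only (assumed scoping).
            {"apiGroups": [""], "resources": ["namespaces"],
             "resourceNames": ["k8s-upgrade"], "verbs": ["patch"]},
            # Trigger the etcd snapshot Job.
            {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": ["create"]},
            # Cordon and uncordon nodes.
            {"apiGroups": [""], "resources": ["nodes"], "verbs": ["patch"]},
            # Drain through the eviction subresource.
            {"apiGroups": [""], "resources": ["pods/eviction"], "verbs": ["create"]},
            # Drain fallback: delete pods directly.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["delete"]},
        ],
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(upgrade_ops_cluster_role(), indent=2))
```

The list names only the newly added verbs. Read access such as get and list on pods and nodes is assumed to come from the agent's existing role.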
