infra/modules
Viktor Barzin 445feb118f infra: per-VM I/O caps + terragrunt v0.77 plumbing + state recovery
WHAT LANDED:
- terragrunt.hcl (root): added telmate/proxmox to k8s_providers
  required_providers. Other stacks just don't instantiate a provider
  block — harmless. Replaces the same-name override trick the infra
  stack used to do, which stopped working under Terragrunt v0.77
  ("Detected generate blocks with the same name").
- stacks/infra/terragrunt.hcl: new generate "proxmox_provider" block
  writes proxmox_provider.tf with the provider config; credentials
  read from Vault secret/viktor at plan/apply time (no env vars).
- modules/create-vm: new mbps_rd / mbps_wr number variables (default 0
  = uncapped), wired into scsi0/scsi1 disk{} blocks as
  mbps_r_concurrent / mbps_wr_concurrent. lifecycle.ignore_changes
  extended to scsi6..scsi29 (K8s nodes have many CSI-managed slots),
  plus scsihw and qemu_os (vary per-VM; non-trivial live changes).
- stacks/infra/main.tf: docker-registry-vm gains mbps_rd=40,
  mbps_wr=40 in HCL — already applied live via qm set on 2026-05-26.

WHAT FAILED AND WAS ROLLED BACK:
- Attempted import of 7 VMs (102 devvm, 103 home-assistant, 200
  k8s-master, 201 k8s-node1, 202 k8s-node2, 203 k8s-node3, 204
  k8s-node4) via import {} blocks. The telmate/proxmox v3.0.2-rc07
  provider mangled proxmox-csi PVC slots on apply for vmid 202 and
  203: every scsi slot got rewritten from `vm-9999-pvc-<uuid>` to
  the boot disk `vm-<vmid>-disk-0`. Restored both .conf files from
  the 2026-05-24 nightly PVE config backup at /mnt/backup/pve-config/
  etc-pve/nodes/pve/qemu-server/{202,203}.conf — no reboots, no data
  loss, K8s CSI reconciled PVC attachments within minutes. Removed
  the 7 imports from state via `terraform state rm` and re-encrypted.
  Tracked in beads code-xzbl: blocked on bpg/proxmox provider
  migration (telmate has the same dynamic-disk defect that bit us on
  iSCSI back in 2026-04-02; see memory id=539).

LIVE CAPS STILL IN PLACE (qm set, 2026-05-26 ~03:13 UTC):
  102 devvm 60/60   103 home-assistant 40/40   200 k8s-master 100/60
  201 k8s-node1 150/120   202 k8s-node2 150/120   203 k8s-node3 150/120
  204 k8s-node4 150/120   220 docker-registry 40/40
  (pfSense 101 BSD + Windows10 300 intentionally out of scope.)

PRE-EXISTING DRIFT EXPOSED (NOT NEW):
- HCL declares k8s-master (200) and k8s-node2 (202) but neither was
  ever imported into TF state — confirmed against the SOPS-encrypted
  state in git (lineage e1cc5bb5, serial 42, last touched 2026-04-06).
  This commit leaves both declarations in place but does NOT import
  them; that's part of the code-xzbl follow-up.

Closes: code-s9xr
2026-05-26 06:46:47 +00:00
..
create-template-vm infra: re-enable unattended-upgrades with kured prometheus-gating 2026-05-10 17:07:32 +00:00
create-vm infra: per-VM I/O caps + terragrunt v0.77 plumbing + state recovery 2026-05-26 06:46:47 +00:00
docker-registry [forgejo] Phase 4 final decommission: drop registry-private container + port 5050 2026-05-07 19:08:17 +00:00
kubernetes anubis: HA with shared valkey/redis store + replicas=2 2026-05-16 11:54:54 +00:00