scripts: hook apply-mbps-caps into the PVE host as a systemd timer

The qm-set I/O caps were previously only applied by manual one-shot
runs of apply-mbps-caps.sh, so any config drift (manual `qm set`,
config restored from /mnt/backup/pve-config like we did on 2026-05-26,
fresh VM clone) would leave the affected VM uncapped until someone
remembered to re-run the script.

Adds apply-mbps-caps.service (Type=oneshot) + apply-mbps-caps.timer
firing:
  - OnBootSec=5min        — catches PVE host reboots & restored configs
  - OnCalendar=hourly     — catches manual qm-set drift / fresh clones
  - Persistent=true       — runs missed schedule after PVE downtime
  - RandomizedDelaySec=2min

Same install pattern as the other PVE operational scripts (nfs-mirror,
daily-backup, offsite-sync-backup, lvm-pvc-snapshot; see memory id=609 +
id=542): source in this repo, deployed to /usr/local/bin and
/etc/systemd/system/ on the PVE host.
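
The deploy step described above could look roughly like the sketch below. It is an illustration under assumptions, not the repo's actual installer: the function name `install_caps_units`, the idea that all three files sit in one source directory, and the optional root prefix (useful for a dry run) are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: install_caps_units, the single source directory and the
# optional root prefix are assumptions, not taken from the repo.
install_caps_units() {
  local src="$1" root="${2:-}"

  # Script goes to /usr/local/bin, unit files to /etc/systemd/system.
  install -D -m 0755 "$src/apply-mbps-caps.sh" \
    "$root/usr/local/bin/apply-mbps-caps.sh"
  install -D -m 0644 "$src/apply-mbps-caps.service" \
    "$root/etc/systemd/system/apply-mbps-caps.service"
  install -D -m 0644 "$src/apply-mbps-caps.timer" \
    "$root/etc/systemd/system/apply-mbps-caps.timer"

  # Only touch the live systemd instance when installing to the real root.
  if [[ -z "$root" ]]; then
    systemctl daemon-reload
    systemctl enable --now apply-mbps-caps.timer
  fi
}
```

Run as root on the PVE host with only the source directory as argument; passing a second argument stages the files under that prefix without calling `systemctl`. `systemctl list-timers apply-mbps-caps.timer` then shows the next scheduled run.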

Script hardening: kept `set -uo pipefail` but dropped `-e` so that one
missing VM doesn't abort the rest; each VM is skipped unless `qm status`
succeeds for it; added a fast-path "already at target" no-op log line
to keep hourly runs quiet.
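
The effect of dropping `-e` can be seen in isolation with a minimal sketch. Here `step` is a hypothetical stand-in for the real per-VM function and fails only for the made-up item `bad`; the loop keeps going and reports the failure through its exit status.

```shell
#!/usr/bin/env bash
set -uo pipefail   # deliberately no -e: a failing step must not end the run

# Hypothetical stand-in for the per-VM work; fails only for the item "bad".
step() {
  if [[ "$1" == "bad" ]]; then
    echo "FAILED: $1"
    return 1
  fi
  echo "ok: $1"
}

# Run every item, remember whether any of them failed, report it at the end.
run_all() {
  local rc=0 item
  for item in "$@"; do
    step "$item" || rc=1
  done
  return "$rc"
}
```

Because the aggregate status is non-zero when any item fails, a oneshot service wrapping such a loop is still marked failed by systemd even though every item was attempted.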

Installed on PVE (192.168.1.127) and smoke-tested: caps re-applied
successfully on all 8 VMs, next run 12:00 EEST. Journal on the PVE host:
`journalctl -u apply-mbps-caps -f`.
Viktor Barzin 2026-05-26 08:12:15 +00:00
parent 232409e798
commit 56a338f80b
3 changed files with 79 additions and 22 deletions

@@ -0,0 +1,12 @@
[Unit]
Description=Apply per-VM I/O caps via qm set (idempotent)
Documentation=https://github.com/ViktorBarzin/infra/blob/master/scripts/apply-mbps-caps.sh
After=pve-cluster.service
Wants=pve-cluster.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/apply-mbps-caps.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=apply-mbps-caps

@@ -1,10 +1,19 @@
 #!/usr/bin/env bash
-# Apply per-VM I/O caps via qm set on the PVE host.
-# - Reads each VM's current boot-disk options
-# - Appends mbps_rd=<N>,mbps_wr=<N>
-# - Re-applies via qm set (live, no reboot needed)
-# - Verifies with qm config | grep mbps
-set -euo pipefail
+# Apply per-VM I/O caps via `qm set` on the PVE host.
+#
+# - Reads each target VM's current boot-disk options.
+# - Appends/normalises `mbps_rd=<N>,mbps_wr=<N>`.
+# - Re-applies via `qm set` (live, no reboot needed).
+# - Idempotent: re-running with no drift is a no-op at the storage
+# level (proxmox config rewrite is cheap).
+# - Continues on per-VM failures so one missing/stopped VM doesn't
+# skip the rest — designed to be safe under the systemd timer.
+#
+# Backed by `apply-mbps-caps.{service,timer}` (hourly + 5min-after-boot).
+# Why these values: see beads code-9v2j + memory id=2726 (alloy IO storm)
+# + memory id=1575 (VMs intentionally out of TF).
+
+set -uo pipefail # NOT -e — keep going if a single VM step fails.
 
 # vmid:disk_slot:mbps_rd:mbps_wr (Linux VMs only — skipping 101 pfsense BSD, 300 Windows)
 TARGETS=(
@@ -14,34 +23,52 @@ TARGETS=(
   "201:scsi1:150:120" # k8s-node1 (GPU + many CSI disks; boots from scsi1)
   "202:scsi0:150:120" # k8s-node2
   "203:scsi0:150:120" # k8s-node3
-  "204:scsi0:150:120" # k8s-node4 (currently doing write recovery)
+  "204:scsi0:150:120" # k8s-node4
   "220:scsi0:40:40" # docker-registry
 )
 
-for spec in "${TARGETS[@]}"; do
+apply_one() {
+  local spec="$1"
+  local vmid slot rd wr
   IFS=: read -r vmid slot rd wr <<<"$spec"
-  printf '\n=== VMID %s slot=%s rd=%s MB/s wr=%s MB/s ===\n' "$vmid" "$slot" "$rd" "$wr"
 
-  current=$(qm config "$vmid" | awk -v s="$slot:" '$1==s {sub(/^[^ ]+ /, ""); print; exit}')
-  if [[ -z "$current" ]]; then
-    echo " ERROR: could not read $slot for vmid $vmid — skipping"
-    continue
+  # Skip non-existent VMs cleanly (e.g. node decommissioned, never rebuilt).
+  if ! qm status "$vmid" >/dev/null 2>&1; then
+    echo "vmid $vmid: not present on this host — skipping"
+    return 0
+  fi
+
+  local current cleaned newvalue
+  current=$(qm config "$vmid" | awk -v s="$slot:" '$1==s {sub(/^[^ ]+ /, ""); print; exit}')
+  if [[ -z "$current" ]]; then
+    echo "vmid $vmid: no $slot line in config — skipping"
+    return 0
   fi
 
-  # Strip any existing mbps_rd / mbps_wr from the current string (idempotent)
   cleaned=$(echo "$current" | sed -E 's/,mbps_rd=[0-9]+//g; s/,mbps_wr=[0-9]+//g')
   newvalue="${cleaned},mbps_rd=${rd},mbps_wr=${wr}"
 
+  # Skip the qm-set call entirely when state already matches — keeps
+  # journal noise low under the hourly timer.
+  if [[ "$current" == "$newvalue" ]]; then
+    echo "vmid $vmid: $slot already at mbps_rd=${rd},mbps_wr=${wr} — no-op"
+    return 0
+  fi
+
+  echo "vmid $vmid: updating $slot"
   echo " before: $current"
   echo " after: $newvalue"
+  if qm set "$vmid" "--$slot" "$newvalue"; then
+    echo " ok"
+  else
+    echo " FAILED: qm set returned non-zero"
+    return 1
+  fi
+}
 
-  qm set "$vmid" "--$slot" "$newvalue"
-  echo " verify: $(qm config "$vmid" | awk -v s="$slot:" '$1==s {print; exit}')"
-done
-
-echo
-echo "=== Final verification — mbps on all targets ==="
+rc=0
 for spec in "${TARGETS[@]}"; do
-  IFS=: read -r vmid slot _ _ <<<"$spec"
-  echo "vmid $vmid: $(qm config "$vmid" | awk -v s="$slot:" '$1==s {print; exit}')"
+  apply_one "$spec" || rc=1
 done
+
+exit "$rc"
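
The strip-then-append step and the no-op check can be tried without a PVE host. The sketch below mirrors the `sed` expression from the diff; the helper names `normalise_caps` and `needs_update` and the sample volume string `local-lvm:vm-202-disk-0,size=64G` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of apply-mbps-caps.sh.

# Print the disk option string with any existing caps removed and the
# requested mbps_rd/mbps_wr pair appended at the end.
normalise_caps() {
  local current="$1" rd="$2" wr="$3" cleaned
  cleaned=$(sed -E 's/,mbps_rd=[0-9]+//g; s/,mbps_wr=[0-9]+//g' <<<"$current")
  printf '%s,mbps_rd=%s,mbps_wr=%s\n' "$cleaned" "$rd" "$wr"
}

# Succeeds (exit 0) when the string differs from its normalised form,
# i.e. when a `qm set` call would be needed.
needs_update() {
  [[ "$1" != "$(normalise_caps "$1" "$2" "$3")" ]]
}
```

A consequence of comparing whole strings is that caps with the right values in a different position (for example `mbps_rd` before `size=`) still count as drift. Such a line is rewritten once into the normalised order and matches on every later run.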

@@ -0,0 +1,18 @@
[Unit]
Description=Re-apply per-VM I/O caps periodically + after PVE boot
[Timer]
# After every PVE host reboot — caps survive in /etc/pve/qemu-server/<vmid>.conf
# normally, but a config restore from backup can drop them (see 2026-05-26
# incident where we restored 202.conf + 203.conf from /mnt/backup/pve-config/).
OnBootSec=5min
# Hourly during normal operation — catches manual `qm set` drift or fresh
# VM clones that haven't had caps applied yet.
OnCalendar=hourly
Persistent=true
RandomizedDelaySec=2min
[Install]
WantedBy=timers.target