apply-mbps-caps: compare normalized option sets (true idempotency) + devvm I/O-stall post-mortem [ci skip]
The raw string compare never matched qm config's canonical key order, so the hourly timer re-issued 'qm set' against every running capped VM, live-rewriting QEMU throttle state via QMP 24x/day. Implicated in today's devvm freeze (15:21-16:48 UTC): the guest's disk I/O stalled inside QEMU (blockstats frozen at 0 while QMP stayed responsive) on the legacy lsi controller path with no iothread.

Viktor asked to root-cause the freeze before choosing fixes, then approved mitigating via VM settings: this commit fixes the hourly trigger and documents the incident; the controller swap (virtio-scsi-single + iothread=1 + aio=threads) is staged on VM 102 separately, pending his cold stop/start.

Adds docs/post-mortems/2026-06-11-devvm-qemu-io-stall.md (evidence chain, ruled-out causes, capture-before-kill autopsy steps) and syncs compute.md + proxmox-inventory.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
parent 2e0cebff87
commit c3a63fcd38
4 changed files with 136 additions and 4 deletions
@@ -27,6 +27,12 @@ TARGETS=(
     "220:scsi0:40:40"  # docker-registry
 )
 
+# Sort a disk spec's comma-separated options so two specs with the same
+# option set but different key order compare equal.
+normalized() {
+    tr ',' '\n' <<<"$1" | LC_ALL=C sort | paste -sd, -
+}
+
 apply_one() {
     local spec="$1"
     local vmid slot rd wr
@@ -49,8 +55,13 @@ apply_one() {
     newvalue="${cleaned},mbps_rd=${rd},mbps_wr=${wr}"
 
     # Skip the qm-set call entirely when state already matches — keeps
-    # journal noise low under the hourly timer.
-    if [[ "$current" == "$newvalue" ]]; then
+    # journal noise low under the hourly timer. Compare option SETS, not raw
+    # strings: `qm config` prints keys in its own canonical order, so a raw
+    # compare never matched and every hourly run re-issued `qm set`, which
+    # live-rewrites the running VM's QEMU throttle state via QMP (implicated
+    # in the 2026-06-11 devvm I/O stall — see
+    # docs/post-mortems/2026-06-11-devvm-qemu-io-stall.md).
+    if [[ "$(normalized "$current")" == "$(normalized "$newvalue")" ]]; then
         echo "vmid $vmid: $slot already at mbps_rd=${rd},mbps_wr=${wr} — no-op"
         return 0
     fi
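
Review note: the effect of the order-insensitive compare can be checked in isolation with a small standalone sketch. The `normalized()` body is copied from this commit. The two disk specs are hypothetical values made up for illustration, not taken from any VM in the inventory, and the exact key order `qm config` prints is assumed rather than verified here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Copied from the commit: sort a disk spec's comma-separated options so
# that key order no longer affects the comparison.
normalized() {
    tr ',' '\n' <<<"$1" | LC_ALL=C sort | paste -sd, -
}

# Hypothetical specs with the same option set in a different key order.
from_qm_config='local-lvm:vm-102-disk-0,mbps_rd=40,mbps_wr=40,size=64G'
from_script='local-lvm:vm-102-disk-0,size=64G,mbps_rd=40,mbps_wr=40'

# Old behaviour: raw string compare.
if [[ "$from_qm_config" == "$from_script" ]]; then
    echo "raw compare: equal"
else
    echo "raw compare: differ"
fi

# New behaviour: compare the sorted option sets.
if [[ "$(normalized "$from_qm_config")" == "$(normalized "$from_script")" ]]; then
    echo "normalized compare: equal"
else
    echo "normalized compare: differ"
fi
```

With these inputs the raw compare reports a difference and the normalized compare reports equality, which is the case that previously caused `qm set` to be re-issued. A changed value (for example `mbps_rd=50` against `mbps_rd=40`) still compares unequal after sorting, so genuine cap changes are still applied.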