nodes: journald -> volatile (RAM) to cut sdc write-IOPS
Some checks failed
ci/woodpecker/push/default Pipeline failed
Node "container churn" investigation (code-oflt): container logs (~30 KB/s) and overlayfs (~17 KB/s) are negligible; the node OS-disk churn is ext4 journal (jbd2) metadata writes driven mostly by journald's continuous appends.

node4 + node5 had drifted to uncapped persistent journald (4 GB each, ~100 KB/s); master/node1-3 were correctly capped at 500M. Node + pod journals already ship to Loki (alloy loki.source.journal), so on-disk journald is pure write-IOPS overhead on the IOPS-bound sdc.

Switch journald to Storage=volatile (RAM, RuntimeMaxUse=200M) fleet-wide:
- cloud_init.yaml: drop-in 90-oflt-volatile.conf for new nodes (replaces the old persistent seds).
- running nodes (master + node1-5): pushed the same drop-in via qm guest exec + journald restart + cleared /var/log/journal.

Verified node5: OS-disk writers jbd2/sda1-8 931->46 KB/s, systemd-journal gone (~94% drop); ~4 GB freed each on node4/node5. Logs stay queryable in Loki. Trade-off: a hard crash loses the last unshipped journal.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
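
The drop-in described in the commit message can be sketched as a small shell function. This is an illustration, not the script that was actually run on the nodes: the function name write_journald_dropin and its optional directory argument are assumptions added so the snippet can be tried against a scratch directory. The file name and contents are taken from the cloud_init.yaml change.

```shell
#!/bin/sh
# Sketch: write the volatile-journald drop-in from this commit.
# write_journald_dropin is a hypothetical helper name; the target directory
# is overridable and defaults to the path used in cloud_init.yaml.
write_journald_dropin() {
    conf_dir="${1:-/etc/systemd/journald.conf.d}"
    mkdir -p "$conf_dir"
    # Same four lines the runcmd printf emits.
    printf '[Journal]\nStorage=volatile\nRuntimeMaxUse=200M\nCompress=yes\n' \
        > "$conf_dir/90-oflt-volatile.conf"
}

# On a real node the remaining steps from the commit would follow (as root):
#   write_journald_dropin
#   rm -rf /var/log/journal
#   systemctl restart systemd-journald
#
# From the Proxmox host, something along these lines could push the same
# steps into a running guest (VMID 104 is a placeholder, not from the commit):
#   qm guest exec 104 -- sh -c 'mkdir -p /etc/systemd/journald.conf.d && ...'
```

Writing a drop-in under journald.conf.d, rather than editing /etc/systemd/journald.conf with sed, keeps the override in one file that can be replaced or removed without touching the distribution default.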
parent 1afe41880e
commit 71501be408
2 changed files with 7 additions and 8 deletions
File diff suppressed because one or more lines are too long
@@ -88,13 +88,12 @@ write_files:
 runcmd:
 # Enable weekly TRIM/discard to reclaim freed blocks in LVM thin pool
 - systemctl enable --now fstrim.timer
-# Enable persistent journald logging for crash forensics, with size limits to reduce disk wear
-- mkdir -p /var/log/journal
-- sed -i 's/#Storage=auto/Storage=persistent/' /etc/systemd/journald.conf
-- sed -i 's/#SystemMaxUse=/SystemMaxUse=500M/' /etc/systemd/journald.conf
-- sed -i 's/#MaxRetentionSec=/MaxRetentionSec=7day/' /etc/systemd/journald.conf
-- sed -i 's/#MaxFileSec=/MaxFileSec=1day/' /etc/systemd/journald.conf
-- sed -i 's/#Compress=yes/Compress=yes/' /etc/systemd/journald.conf
+# journald in RAM (volatile, capped) — node + pod journals already ship to
+# Loki via alloy (loki.source.journal), so on-disk journald is pure sdc
+# write-IOPS overhead on the IOPS-bound HDD. code-oflt 2026-06-30.
+- mkdir -p /etc/systemd/journald.conf.d
+- printf '[Journal]\nStorage=volatile\nRuntimeMaxUse=200M\nCompress=yes\n' > /etc/systemd/journald.conf.d/90-oflt-volatile.conf
+- rm -rf /var/log/journal
 - systemctl restart systemd-journald
 %{if is_k8s_template}
 # Node DNS is intentionally STOCK — no resolved drop-ins, no /etc/hosts
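
After the drop-in is in place, one way to confirm which Storage= value wins is to read the drop-in directory in lexical order and take the last assignment, since later files override earlier ones. The helper below is a rough sketch under that assumption. The name effective_journald_storage is hypothetical, and the function deliberately ignores the main journald.conf and the drop-in directories under /run and /usr/lib, so treat it as a quick check rather than a full resolver.

```shell
#!/bin/sh
# Sketch: report the last Storage= value found in a journald drop-in directory.
# effective_journald_storage is a hypothetical helper, not part of systemd.
effective_journald_storage() {
    conf_dir="${1:-/etc/systemd/journald.conf.d}"
    # The glob expands in sorted order, so the last match comes from the
    # highest-sorting file (e.g. 90-oflt-volatile.conf).
    cat "$conf_dir"/*.conf 2>/dev/null | sed -n 's/^Storage=//p' | tail -n 1
}

# Example on a node:
#   effective_journald_storage
# For the fully merged configuration, systemd's own tooling is the better
# source of truth:
#   systemd-analyze cat-config systemd/journald.conf
#   journalctl --disk-usage
```

With Storage=volatile the journal lives under /run/log/journal, so /var/log/journal should stay absent after the restart.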