Commit graph

8 commits

Author SHA1 Message Date
Viktor Barzin
6442978f07 fan-control: merge Fan %/RPM dashboard cards + RPM estimate fallback [ci skip]
The Fan % and Fan RPM sensor-graph cards had identical trend shapes (RPM ∝ %),
so merge them into one "Fan speed" card: % trend (stable Pushgateway sensor) +
RPM beneath. RPM reads sensor.r730_fan_speed (Redfish) but falls back to the
calibrated estimate (rpm≈160·%+1520, shown with a "~" prefix) when that sensor
is unavailable — it blips out intermittently, so the readout never goes blank.
The Override readout likewise shows both "% · rpm". HA-side only; daemon
unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 14:31:32 +00:00
Viktor Barzin
405ca79531 fan-control: Override slider now tracks live fan speed while unlocked [ci skip]
The dashboard Override slider used to show a stale stored % (e.g. 5%) while the
fans were actually at ~53%, which was confusing. Add
automation.r730_fan_override_track_live_speed_while_unlocked: while unlocked it
mirrors the live commanded % (sensor.r730_fan_control_target) into the Override,
so it always shows the actual absolute fan speed and updates as the fan moves.
While locked it stops tracking and is the user's editable setpoint. The readout
under the slider now shows the live "% · rpm" (actual, not an estimate). HA-side
only; daemon unchanged. Verified live: slider forced to 10 → synced to 58 target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 14:20:38 +00:00
Viktor Barzin
c059405632 fan-control: simplify HA dashboard + Lock = freeze-current/algo-off [ci skip]
The dashboard-it Server → Fans view is now minimal: fan speed (% + RPM), an
Override % slider, and a Lock toggle. Lock now means "freeze the current speed,
algorithm off" — a new automation (r730_fan_lock_freeze_current_speed_resume_algo)
snapshots the live target % into Override and sets mode=manual on lock-ON, and
mode=auto on lock-OFF. The host daemon is unchanged (the toggle just drives the
mode it already reads). cool/quiet stay reachable via the entity but are off the
simplified view; the 60-min auto-revert is kept as a dormant safety net. Verified
live: lock ON → mode=manual + Override captured the live 60%; lock OFF → auto.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 13:27:46 +00:00
Viktor Barzin
d17b25cdcc fan-control: document the HA Fan Lock (opt out of 60-min auto-revert) [ci skip]
A manual/cool/quiet override in HA auto-reverts to `auto` after 60 min. Add a
Fan Lock (`input_boolean.r730_fan_lock`) that gates that automation so a
deliberate override persists, with a visible "🔒 FAN CONTROL LOCKED" banner on
the dashboard-it Server view so it isn't forgotten. The automation re-checks the
lock after the hour (locking mid-countdown cancels the revert) and the 83 °C
ceiling still wins. HA-side only (helper + automation + dashboard live on
ha-sofia, auto-git-tracked there); these docs are the infra-repo record.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 12:22:00 +00:00
Viktor Barzin
51456a96f6 fan-control: estimate + expose fan power (fan_watts_est)
The iDRAC reports only total DCMI watts + RPM (no per-fan power), so add a
cube-law fan-power estimate: fan_W ~= 0.0205*(RPM/1000)^3, calibrated to the
2026-06-05 sweep (fits within ~3W; ~2W floor -> ~99W full). The daemon reads
live RPM each loop and pushes pve_fan_control_fan_rpm + _fan_watts_est.
Surfaced in HA as sensor.r730_fan_power_est + a "Fan Power (est)" card on the
dashboard-it Server view, next to total power. 46 bash tests green; verified
live (9120rpm -> ~15W est).

[ci skip]

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 11:10:27 +00:00
Viktor Barzin
324f2dc3bf fan-control: continuous linear curve (replaces discrete step-bands)
Replace the step-band fan curve with a continuous linear ramp — the bands
flapped at edges (e.g. 45<->65%). Web-researched: linear + 2-3C hysteresis
is the homelab standard; PID is overkill for this slow thermal loop.
fan% now interpolates between env-tunable anchors:
  COOL  50C/30% -> 83C/100% (~2.1%/C; ~51% at the ~60C equilibrium)
  QUIET 68C/20% -> 83C/100% (near-silent until ~70C)
Both reach 100% at the 83C ceiling. Anti-oscillation: asymmetric
hysteresis (fc_decide) + a MIN_STEP (3%) min-change threshold.
41 bash tests green; deployed + verified live (59C -> 49%, smooth).

[ci skip]

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 10:29:35 +00:00
Viktor Barzin
945c1936e3 fan-control docs: HA control (mode/manual-% + auto-revert + dashboard)
Document the HA-control feature shipped in 8beca1df: the daemon reads the
ha-sofia r730_fan_mode/manual_pct helpers, the 60-min auto-revert automation,
and the dashboard-it Server-view sensors + control tiles.

[ci skip]

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:29:35 +00:00
Viktor Barzin
90ad6b9125 fan-control: presence-aware IPMI fan curve for the R730 PVE host
The iDRAC stock curve runs the CPU at ~72°C on the 7080 RPM floor even
under load (optimises for quiet, not cool). Add a bash daemon + systemd
unit that drives the chassis fans from CPU temp on two curves, picked by
garage occupancy (the server is in the garage): COOL when empty
(measured ~58-65°C under load), QUIET near the silent floor when the
ha-sofia garage door shows someone is there (open, or <15min since last
activity).

Manual fan mode is backstopped: bash EXIT trap + systemd ExecStopPost
hand fans back to Dell auto on stop/crash; CPU>=83°C or repeated IPMI
failures do the same. Pushgateway metrics (job=fan_control). 36 unit
tests cover the pure curve/hysteresis/presence/parse logic; DRY_RUN +
RUN_ONCE for integration checks. Deployed and verified on 192.168.1.127
(CPU 70->58°C in cool mode, hysteresis stepping confirmed).

Design:  docs/plans/2026-06-04-pve-fan-control-design.md
Runbook: docs/runbooks/fan-control.md

[ci skip]

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:19:11 +00:00