From d91fbd4a604cdad07805a98f42a5186ee7adb5a3 Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Fri, 17 Apr 2026 19:42:55 +0000
Subject: [PATCH] [monitoring] Delete orphan server-power-cycle/main.sh with
 iDRAC default creds
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Context

stacks/monitoring/modules/monitoring/server-power-cycle/main.sh is an old
shell implementation of a power-cycle watchdog that polled the Dell iDRAC
on 192.168.1.4 for PSU voltage. It hardcoded the Dell iDRAC default
credentials (root:calvin) in 5 `curl -u root:calvin` calls. Both remotes
are public, so those credentials (and the implicit statement that 'this
host has not rotated the default BMC password') have been exposed.

The current implementation is main.py in the same directory. It reads
iDRAC credentials from the environment variables `idrac_user` and
`idrac_password` (see module's iDRAC_USER_ENV_VAR / iDRAC_PASSWORD_ENV_VAR
constants), which are populated from Vault via ExternalSecret at runtime.

main.sh is not referenced by any Terraform, ConfigMap, or deploy script:
grep confirms no `file()` / `templatefile()` / `filebase64()` call loads
it, and no hand-rolled shell wrapper invokes it.

## This change

- git rm stacks/monitoring/modules/monitoring/server-power-cycle/main.sh

main.py is retained unchanged.

## What is NOT in this change

- iDRAC password rotation on 192.168.1.4. The BMC should be moved off the
  vendor default `calvin` regardless; rotation is tracked in the broader
  remediation plan and in the iDRAC web UI.
- A separate finding in stacks/monitoring/modules/monitoring/idrac.tf (the
  redfish-exporter ConfigMap has `default: username: root, password:
  calvin` as a fallback for iDRAC hosts not explicitly listed) is NOT
  addressed here. It is filed as its own task so the fix (drop the default
  block vs. source from env) can be considered in isolation.
- Git-history scrub of main.sh is pending the broader filter-repo pass.
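
For illustration only, the environment-variable pattern described under
Context could look roughly like this in shell. This is a hypothetical
sketch, not the contents of main.py; the helper name `idrac_credentials`
is an assumption, and only the variable names `idrac_user` and
`idrac_password` come from this commit message.

```sh
#!/bin/sh
# Hypothetical helper: build the user:password pair from the environment
# instead of hardcoding it in each curl call.
idrac_credentials() {
    # Fail loudly rather than falling back to a vendor default.
    if [ -z "${idrac_user:-}" ] || [ -z "${idrac_password:-}" ]; then
        echo "idrac_user and idrac_password must be set" >&2
        return 1
    fi
    printf '%s:%s\n' "$idrac_user" "$idrac_password"
}

# Example call site (commented out so the sketch has no side effects):
#   creds=$(idrac_credentials) || exit 1
#   curl -s -k -u "$creds" \
#       https://192.168.1.4/redfish/v1/Chassis/System.Embedded.1/Power/PowerSupplies/PSU.Slot.2
```

The point of the sketch is that a missing secret aborts the run, so no
default credential ever needs to live in the repository.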

## Test plan

### Automated

    $ grep -rn 'server-power-cycle/main\.sh\|main\.sh' \
        --include='*.tf' --include='*.hcl' --include='*.yaml' \
        --include='*.yml' --include='*.sh'
    (no consumer references)

### Manual Verification

1. `git show HEAD --stat` shows only the one deletion.
2. `test ! -e stacks/monitoring/modules/monitoring/server-power-cycle/main.sh`
3. `kubectl -n monitoring get deploy idrac-redfish-exporter` still shows
   the exporter running (unrelated to this file).
4. main.py continues to run its watchdog loop without regression, because
   it was never coupled to main.sh.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../monitoring/server-power-cycle/main.sh     | 66 -------------------
 1 file changed, 66 deletions(-)
 delete mode 100644 stacks/monitoring/modules/monitoring/server-power-cycle/main.sh

diff --git a/stacks/monitoring/modules/monitoring/server-power-cycle/main.sh b/stacks/monitoring/modules/monitoring/server-power-cycle/main.sh
deleted file mode 100644
index fd7d362f..00000000
--- a/stacks/monitoring/modules/monitoring/server-power-cycle/main.sh
+++ /dev/null
@@ -1,66 +0,0 @@
-#!/bin/sh
-
-tag=server-power-cycle-script
-logger -t $tag start $(date '+%F-%R')
-
-if [ -f /tmp/server-power-cycle-lock ]; then
-    logger -t $tag 'Script already running. exiting'
-    exit 0
-fi
-touch /tmp/server-power-cycle-lock
-
-
-if [ -f /root/server-power-cycle/state.off ]; then
-    logger -t $tag 'Server state set to off'
-    while true; do
-        sleep 60 # sleep 1 minute
-        logger -t $tag 'Trying to connect to idrac system...'
-        curl --connect-timeout 5 -s -k -u root:calvin -H"Content-type: application/json" -X GET https://192.168.1.4/redfish/v1/Chassis/System.Embedded.1/Power/PowerSupplies/PSU.Slot.2
-        if [[ $? -eq 0 ]]; then
-            logger -t $tag "Connected to idrac, assuming power is back on"
-            logger -t $tag "Power supply restored, sending power on command"
-            curl -s -k -u root:calvin -X POST -d '{"Action": "Reset", "ResetType": "On"}' -H"Content-type: application/json" https://192.168.1.4/redfish/v1/Systems/System.Embedded.1/Actions/ComputerSystem.Reset
-            rm /root/server-power-cycle/state.off
-
-            logger -t $tag end $(date '+%F-%R')
-            rm /tmp/server-power-cycle-lock
-            exit 0
-        fi
-    done
-fi
-
-
-voltage=$(curl -s -k -u root:calvin -H"Content-type: application/json" -X GET https://192.168.1.4/redfish/v1/Chassis/System.Embedded.1/Power/PowerSupplies/PSU.Slot.2 |jq .LineInputVoltage)
-# check input voltage on the pwoer supply connected to the outer system
-if [[ $voltage -gt 0 ]]; then
-    logger -t $tag "power supply is on. exiting"
-    logger -t $tag end $(date '+%F-%R')
-    rm /tmp/server-power-cycle-lock
-    exit 0
-fi
-
-to_wait=30
-echo "Continuously checking power supply for the next $to_wait minutes"
-
-for i in $(seq 30); do
-    logger -t $tag "Sleeping a minute..Minute $i"
-    sleep 60
-
-    # check input voltage on the pwoer supply connected to the outer system
-    voltage=$(curl -s -k -u root:calvin -H"Content-type: application/json" -X GET https://192.168.1.4/redfish/v1/Chassis/System.Embedded.1/Power/PowerSupplies/PSU.Slot.2 |jq .LineInputVoltage)
-    if [[ $voltage -gt 0 ]]; then
-        logger -t $tag "power supply is on. exiting"
-
-        logger -t $tag end $(date '+%F-%R')
-        rm /tmp/server-power-cycle-lock
-        exit 0
-    fi
-
-done
-
-logger -t $tag "Power supply did not come back, sending graceful shutdown signal"
-curl -s -k -u root:calvin -X POST -d '{"Action": "Reset", "ResetType": "GracefulShutdown"}' -H"Content-type: application/json" https://192.168.1.4/redfish/v1/Systems/System.Embedded.1/Actions/ComputerSystem.Reset
-
-touch /root/server-power-cycle/state.off
-rm /tmp/server-power-cycle-lock
-logger -t $tag end $(date '+%F-%R')