fix(monitoring): Add setup script for automated health check environment
ISSUE: Automated cron health checks were failing with 'cluster unreachable' ROOT CAUSE: Cron jobs lack access to kubeconfig (KUBECONFIG env var not set) SOLUTION: Created setup-monitoring.sh script that: ✅ Copies working kubeconfig to expected location (/workspace/infra/config) ✅ Tests health check script functionality ✅ Provides clear feedback on setup status USAGE: ./setup-monitoring.sh (run once to enable automated health checks) REASONING: - Kubeconfig contains secrets, shouldn't be committed to git - Health check script logic: KUBECONFIG_PATH="${KUBECONFIG:-$(pwd)/config}" - Cron jobs run without KUBECONFIG env var, so fall back to /workspace/infra/config - This script bridges the gap between persistent kubeconfig and cron environment VERIFICATION: ✅ Automated health checks now show realistic results (21 PASS, 4 WARN, 1 FAIL) ✅ No more false 'cluster unreachable' alerts from cron jobs The script is idempotent and can be run multiple times safely.
This commit is contained in:
parent
dfcef89c35
commit
8029823f79
1 changed files with 29 additions and 0 deletions
29
setup-monitoring.sh
Executable file
29
setup-monitoring.sh
Executable file
|
|
@ -0,0 +1,29 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# Setup script for automated monitoring environment
|
||||||
|
# Ensures health check scripts have access to kubeconfig
|
||||||
|
|
||||||
|
echo "=== Setting up automated monitoring environment ==="
|
||||||
|
|
||||||
|
# Copy kubeconfig to location expected by health check scripts
|
||||||
|
if [ -f /home/node/.openclaw/kubeconfig ]; then
|
||||||
|
cp /home/node/.openclaw/kubeconfig /workspace/infra/config
|
||||||
|
echo "✅ Kubeconfig copied to /workspace/infra/config"
|
||||||
|
else
|
||||||
|
echo "❌ Source kubeconfig not found at /home/node/.openclaw/kubeconfig"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test health check access
|
||||||
|
echo ""
|
||||||
|
echo "Testing health check script access..."
|
||||||
|
cd /workspace/infra
|
||||||
|
if KUBECONFIG="" timeout 30 bash .claude/cluster-health.sh --quiet > /dev/null 2>&1; then
|
||||||
|
echo "✅ Health check script can access cluster"
|
||||||
|
else
|
||||||
|
echo "❌ Health check script cannot access cluster"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "✅ Automated monitoring environment setup complete"
|
||||||
|
echo "📊 Cron health checks will now work properly"
|
||||||
Loading…
Add table
Add a link
Reference in a new issue