Agents: devops-engineer, dba, security-engineer, sre, network-engineer, platform-engineer, observability-engineer, home-automation-engineer. Scripts: deploy-status, db-health, backup-verify, tls-check, crowdsec-status, authentik-audit, oom-investigator, resource-report, dns-check, network-health, nfs-health, truenas-status, platform-status, monitoring-health. Also: known-issues.md suppression list, cluster-health-checker port-forward fix.
Known Issues (suppress in all agents)
Permanent
- ha-london Uptime Kuma monitor down — external HA on Raspberry Pi, not in this cluster
- PVFillingUp for navidrome-music — Synology NAS volume, threshold is 95%, expected
Intermittent
- CrowdSec Helm release stuck in pending-upgrade — known issue, workaround: helm rollback
- Resource usage >80% on nodes — WARN only, overcommit is by design (2x LimitRange ratio)
How agents consume this file
Each agent definition includes: "Before reporting issues, read .claude/reference/known-issues.md and suppress any matches."
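A rough sketch of what "suppress any matches" could look like if done deterministically. This is an assumption for illustration only: the agents are instructed in prose to read the file and judge matches themselves. The function names, the substring-matching rule, and the sample findings (including the host example.internal) are hypothetical and not part of this change.

```python
# Hypothetical sketch: approximates "suppress any matches" with
# case-insensitive substring matching. The real agents read
# .claude/reference/known-issues.md and decide on their own.

SEPARATOR = " \u2014 "  # dash separating an issue from its explanation


def load_known_issues(text):
    """Return the lowercased issue part of every '- ' bullet in the file."""
    subjects = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- "):
            # Keep only the text before the explanation.
            subject = line[2:].partition(SEPARATOR)[0]
            subjects.append(subject.strip().lower())
    return subjects


def suppress_known(findings, known_subjects):
    """Drop findings that mention any known-issue subject."""
    return [
        f for f in findings
        if not any(s in f.lower() for s in known_subjects)
    ]


# Abbreviated copy of two entries from the suppression list.
SAMPLE = (
    "Permanent\n"
    "- ha-london Uptime Kuma monitor down \u2014 external HA on Raspberry Pi\n"
    "Intermittent\n"
    "- CrowdSec Helm release stuck in pending-upgrade \u2014 workaround: helm rollback\n"
)

# Made-up findings, for illustration only.
findings = [
    "CRIT: ha-london Uptime Kuma monitor down for 3h",
    "WARN: certificate for example.internal expires in 5 days",
]

remaining = suppress_known(findings, load_known_issues(SAMPLE))
print(remaining)
```

Only the certificate warning should survive here, since the first finding contains a known-issue subject. Substring matching is deliberately crude; entries such as "Resource usage >80% on nodes" describe a condition rather than a fixed string, which is presumably why the agents are asked to read the file and apply judgment instead.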