add infrastructure agent team: 8 specialized agents + 14 diagnostic scripts
Agents: devops-engineer, dba, security-engineer, sre, network-engineer, platform-engineer, observability-engineer, home-automation-engineer.
Scripts: deploy-status, db-health, backup-verify, tls-check, crowdsec-status, authentik-audit, oom-investigator, resource-report, dns-check, network-health, nfs-health, truenas-status, platform-status, monitoring-health.
Also: known-issues.md suppression list, cluster-health-checker port-forward fix.
parent fca4e02c54
commit ff83ec3325
24 changed files with 3153 additions and 1 deletions
.claude/reference/known-issues.md (normal file, 12 lines added)
@@ -0,0 +1,12 @@
# Known Issues (suppress in all agents)

## Permanent
- ha-london Uptime Kuma monitor down — external HA on Raspberry Pi, not in this cluster
- PVFillingUp for navidrome-music — Synology NAS volume, threshold is 95%, expected

## Intermittent
- CrowdSec Helm release stuck in pending-upgrade — known issue, workaround: helm rollback
- Resource usage >80% on nodes — WARN only, overcommit is by design (2x LimitRange ratio)

## How agents consume this file
Each agent definition includes: "Before reporting issues, read `.claude/reference/known-issues.md` and suppress any matches."
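
The suppression step described above is carried out by the agents reading the file, so there is no code for it in this commit. As a rough illustration only, the same idea could be sketched in Python as follows. The function names `parse_known_issues` and `suppress_known`, the case-insensitive substring match, and the sample findings (including the etcd one) are all assumptions made for the example, not part of the repository.

```python
# Hypothetical sketch: parse the bullets of known-issues.md and drop
# any finding that mentions one of them. Not the repository's own code.

EM_DASH = "\u2014"  # separator used between summary and explanation in the file

# Two entries copied from known-issues.md, embedded so the sketch is self-contained.
SAMPLE_MD = (
    "## Permanent\n"
    "- ha-london Uptime Kuma monitor down \u2014 external HA on Raspberry Pi, not in this cluster\n"
    "- PVFillingUp for navidrome-music \u2014 Synology NAS volume, threshold is 95%, expected\n"
)


def parse_known_issues(markdown: str) -> list[str]:
    """Return the lowercased summary (text before the dash) of every bullet."""
    summaries = []
    for line in markdown.splitlines():
        if line.startswith("- "):
            summary = line[2:].split(f" {EM_DASH} ", 1)[0]
            summaries.append(summary.strip().lower())
    return summaries


def suppress_known(findings: list[str], known: list[str]) -> list[str]:
    """Keep only findings that contain none of the known-issue summaries."""
    return [f for f in findings if not any(k in f.lower() for k in known)]


known = parse_known_issues(SAMPLE_MD)
findings = [
    "CRITICAL: ha-london Uptime Kuma monitor down",
    "WARN: PVFillingUp for navidrome-music at 96%",
    "CRITICAL: etcd leader election failing",  # made-up finding for illustration
]
remaining = suppress_known(findings, known)
print(remaining)
```

A plain substring match is deliberately strict: a finding phrased differently from the bullet would not be suppressed, which errs on the side of reporting too much rather than hiding a real problem.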