infra/.claude/reference/known-issues.md
Viktor Barzin ff83ec3325 add infrastructure agent team: 8 specialized agents + 14 diagnostic scripts
Agents: devops-engineer, dba, security-engineer, sre, network-engineer,
platform-engineer, observability-engineer, home-automation-engineer.
Scripts: deploy-status, db-health, backup-verify, tls-check, crowdsec-status,
authentik-audit, oom-investigator, resource-report, dns-check, network-health,
nfs-health, truenas-status, platform-status, monitoring-health.
Also: known-issues.md suppression list, cluster-health-checker port-forward fix.
2026-03-15 02:01:07 +00:00


# Known Issues (suppress in all agents)
## Permanent
- ha-london Uptime Kuma monitor down — external HA on Raspberry Pi, not in this cluster
- PVFillingUp for navidrome-music — Synology NAS volume, threshold is 95%, expected
## Intermittent
- CrowdSec Helm release stuck in pending-upgrade — known issue, workaround: helm rollback
- Resource usage >80% on nodes — WARN only, overcommit is by design (2x LimitRange ratio)
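
The CrowdSec workaround above can be sketched as a small helper. This is a minimal sketch, not the repository's actual script: it assumes the release status comes from `helm status <release> -o json`, and the release name and namespace `crowdsec` used in the usage note are hypothetical placeholders. The helper only builds the rollback command; running it is left to the operator.

```python
import json
from typing import List, Optional


def rollback_command(status_json: str, release: str, namespace: str) -> Optional[List[str]]:
    """Return a `helm rollback` argv if the release is stuck in pending-upgrade, else None.

    `status_json` is assumed to be the output of `helm status <release> -o json`.
    """
    info = json.loads(status_json).get("info", {})
    if info.get("status") != "pending-upgrade":
        return None  # release is not stuck; nothing to do
    # With no revision argument, helm rolls back to the previous release.
    return ["helm", "rollback", release, "--namespace", namespace]
```

For example, `rollback_command(output, "crowdsec", "crowdsec")` returns `None` for a release whose status is `deployed`, and a command list that can be passed to `subprocess.run` when the status is `pending-upgrade`.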
## How agents consume this file
Each agent definition includes: "Before reporting issues, read `.claude/reference/known-issues.md` and suppress any matches."
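
The agents read this file as plain text and match findings by judgment. For scripts that need the same behavior deterministically, the following is a rough sketch under stated assumptions: each entry is a `- ` bullet under a `## ` section, with the subject separated from its note by a spaced dash, and a finding counts as a match when it contains every keyword of an entry's subject. The names `KnownIssue`, `parse_known_issues`, and `suppress` are hypothetical and do not exist in the repository.

```python
import re
from dataclasses import dataclass
from typing import List, Set

# Subject and note are separated by a spaced em dash (U+2014) or spaced hyphen.
SEPARATOR = re.compile(r"\s+[\u2014-]\s+")


@dataclass
class KnownIssue:
    section: str  # e.g. "Permanent" or "Intermittent"
    subject: str  # what the alert or finding is about
    note: str     # why it is safe to suppress


def parse_known_issues(markdown: str) -> List[KnownIssue]:
    """Collect bullet entries, tagging each with its enclosing `## ` section."""
    issues: List[KnownIssue] = []
    section = ""
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
        elif line.startswith("- "):
            parts = SEPARATOR.split(line[2:], maxsplit=1)
            note = parts[1].strip() if len(parts) > 1 else ""
            issues.append(KnownIssue(section, parts[0].strip(), note))
    return issues


def _keywords(text: str) -> Set[str]:
    """Lowercased words of four or more characters; hyphenated names stay whole."""
    return {w for w in re.findall(r"[a-z0-9-]+", text.lower()) if len(w) >= 4}


def suppress(findings: List[str], issues: List[KnownIssue]) -> List[str]:
    """Drop findings that contain every keyword of some known-issue subject."""
    subjects = [k for k in (_keywords(i.subject) for i in issues) if k]
    return [
        f for f in findings
        if not any(subject <= _keywords(f) for subject in subjects)
    ]
```

Keyword matching is deliberately crude: it misses rephrased findings, and a broad entry can hide a new problem that happens to share its keywords, so it approximates the instruction quoted above rather than replacing it.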