Infrastructure Repository Knowledge
Instructions for Claude
- When the user says "remember" something: Always update this file (.claude/CLAUDE.md) with the information so it persists across sessions
- When discovering new patterns or versions: Add them to the appropriate section below
- Use /update-knowledge command: Or edit this file directly to add learnings
Execution Environment (CRITICAL)
- File operations (Read, Edit, Write, Glob, Grep): Run locally at /Volumes/wizard/code/infra
- Git commands: Run locally (git status, git log, git diff, etc.)
- ALL other commands: Use the remote executor relay (kubectl, terraform, helm, python, etc.)
Remote Command Execution (ALWAYS USE THIS)
For any command that is not file editing or git, use the file-based relay:
To execute a remote command:
# 1. Write command
echo "your-command-here" > /Volumes/wizard/code/infra/.claude/cmd_input.txt
# 2. Wait and check status
sleep 1 && cat /Volumes/wizard/code/infra/.claude/cmd_status.txt
# 3. Read output (when status is "done:*")
cat /Volumes/wizard/code/infra/.claude/cmd_output.txt
Status values: ready | running | done:N (N = exit code)
Requires user to start executor in another terminal:
.claude/remote-executor.sh wizard@10.0.10.10 /home/wizard/code/infra
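The three relay steps above can be wrapped in one helper. This is a minimal sketch, not part of the repository: the function name relay_run, the RELAY_DIR variable, and the 60-second default timeout are assumptions made for illustration. It also assumes the executor moves the status away from any earlier done:N value within the first second after a new command is written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command through the file-based relay.
# RELAY_DIR, relay_run and the timeout are assumptions, not repo code.
RELAY_DIR="${RELAY_DIR:-/Volumes/wizard/code/infra/.claude}"

relay_run() {
  local cmd="$1" timeout="${2:-60}" waited=0 status=""

  # 1. Write command
  echo "$cmd" > "$RELAY_DIR/cmd_input.txt"

  # 2. Wait and check status once per second until it reports done:N
  while [ "$waited" -lt "$timeout" ]; do
    sleep 1
    waited=$((waited + 1))
    status="$(cat "$RELAY_DIR/cmd_status.txt" 2>/dev/null || true)"
    case "$status" in
      done:*)
        # 3. Read output, then return the remote exit code N
        cat "$RELAY_DIR/cmd_output.txt"
        return "${status#done:}"
        ;;
    esac
  done

  echo "relay timed out after ${timeout}s (last status: ${status:-none})" >&2
  return 124
}
```

Usage would look like relay_run "kubectl get pods -A", with the function's return value mirroring the remote exit code.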
Overview
Terraform-based infrastructure repository managing a home Kubernetes cluster on Proxmox VMs. Uses git-crypt for secrets encryption.
Directory Structure
- main.tf - Main Terraform entry point, imports all modules
- modules/kubernetes/ - Kubernetes service deployments (one folder per service)
- modules/create-vm/ - Proxmox VM creation module
- secrets/ - Encrypted secrets (TLS certs, keys) via git-crypt
- cli/ - Go CLI tool for infrastructure management
- scripts/ - Helper scripts (cluster management, node updates)
- playbooks/ - Ansible playbooks for node configuration
- diagram/ - Infrastructure diagrams (Python-based)
Key Patterns
- Each service in modules/kubernetes/<service>/main.tf defines its own namespace, deployments, services, and ingress
- NFS storage from 10.0.10.15 for persistent data
- TLS secrets managed via setup_tls_secret module
- Ingress uses nginx-ingress with annotations for customization
- GPU workloads use node_selector = { "gpu": "true" }
- Services expose to *.viktorbarzin.me domains
NFS Volume Pattern
Prefer inline NFS volumes over separate PV/PVC resources. Use the nfs {} block directly in pod/deployment/cronjob specs:
volume {
name = "data"
nfs {
server = "10.0.10.15"
path = "/mnt/main/<service>"
}
}
Only use PV/PVC when the Helm chart requires existingClaim (like the Nextcloud Helm chart).
Factory Pattern (for multi-user services)
Used when a service needs one instance per user. Structure:
modules/kubernetes/<service>/
├── main.tf # Namespace, TLS secret, user module calls
└── factory/
└── main.tf # Deployment, service, ingress templates with ${var.name}
Examples: actualbudget, freedify
To add a new user:
- Export NFS share at /mnt/main/<service>/<username> in TrueNAS
- Add Cloudflare route in tfvars
- Add module block in main.tf calling factory
Common Variables
- tls_secret_name - TLS certificate secret name
- tier - Deployment tier label
- Service-specific passwords passed as variables
Service Versions (as of 2025-01)
- Immich: v2.4.1
- Freedify: latest (music streaming, factory pattern)
Useful Commands
# ALWAYS use -target for terraform apply (speeds up execution)
terraform apply -target=module.kubernetes_cluster.module.<service_name>
terraform plan -target=module.kubernetes_cluster.module.<service_name>
terraform fmt -recursive
kubectl get pods -A
Terraform target examples:
- terraform apply -target=module.kubernetes_cluster.module.monitoring - Apply monitoring
- terraform apply -target=module.kubernetes_cluster.module.immich - Apply immich
- terraform apply -target=module.docker-registry-vm - Apply docker registry VM
- Only skip -target when explicitly told to apply everything
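Because every Kubernetes service shares the module.kubernetes_cluster.module. prefix, the target address can be built from the service name alone. The sketch below assumes two hypothetical helpers, k8s_target and tf_service, which are not part of the repository. tf_service only prints the command so that it can be handed to the remote executor relay instead of being run locally.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: build targeted terraform commands for one service.

# Print the -target flag for a service under module.kubernetes_cluster.
k8s_target() {
  printf '%s%s\n' '-target=module.kubernetes_cluster.module.' "$1"
}

# Print (not run) a targeted plan or apply command for a service.
# Usage: tf_service plan immich
tf_service() {
  local action="$1" service="$2"
  case "$action" in
    plan|apply) ;;
    *)
      echo "usage: tf_service plan|apply <service_name>" >&2
      return 2
      ;;
  esac
  echo "terraform $action $(k8s_target "$service")"
}
```

Top-level modules such as module.docker-registry-vm do not carry the cluster prefix, so their targets still need to be written out by hand.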
Module Structure
Top-level modules in main.tf:
- module.k8s-node-template - K8s node VM template
- module.non-k8s-node-template - Non-k8s VM template
- module.docker-registry-template - Docker registry template
- module.docker-registry-vm - Docker registry VM
- module.kubernetes_cluster - Main K8s cluster (contains all services)
Kubernetes Services (under module.kubernetes_cluster.module.*)
Core (tier 0-1):
metallb, dbaas, technitium, nginx-ingress, crowdsec, cloudflared,
redis, metrics-server, authentik, nvidia, vaultwarden, reverse-proxy,
wireguard, headscale, xray, monitoring
GPU (tier 2):
immich, frigate, ollama, ebook2audiobook
Edge/Aux (tier 3-4):
blog, drone, hackmd, mailserver, privatebin, shadowsocks,
city-guesser, echo, url, webhook_handler, excalidraw, travel_blog,
dashy, send, ytdlp, uptime-kuma, calibre, audiobookshelf,
paperless-ngx, jsoncrack, servarr, ntfy, cyberchef, diun,
meshcentral, nextcloud, homepage, matrix, linkwarden, actualbudget,
owntracks, dawarich, changedetection, tandoor, n8n, real-estate-crawler,
tor-proxy, onlyoffice, forgejo, freshrss, navidrome, networking-toolbox,
tuya-bridge, stirling-pdf, isponsorblocktv, rybbit, wealthfolio,
kyverno, speedtest, freedify, netbox, f1-stream, kms, k8s-dashboard,
descheduler, reloader, infra-maintenance
CI/CD
- Drone CI (.drone.yml) for automated deployments
- Auto-updates TLS certificates
- ALWAYS add [ci skip] to commit messages when you've already run terraform apply to avoid triggering CI redundantly
- After committing, run git push origin master to sync changes
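The [ci skip] rule is easy to forget, so a small helper can append the marker when it is missing. This is a sketch only: ci_skip_message is a hypothetical name, not something defined in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure a commit message carries [ci skip].
ci_skip_message() {
  case "$1" in
    *"[ci skip]"*) printf '%s\n' "$1" ;;          # marker already present
    *)             printf '%s [ci skip]\n' "$1" ;; # append the marker
  esac
}

# Example use after terraform apply has already been run by hand
# (git runs locally, so no relay is needed):
#   git commit -am "$(ci_skip_message "monitoring: add replica alerts")"
#   git push origin master
```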
Infrastructure
- Proxmox hypervisor for VMs
- Kubernetes cluster with GPU node
- NFS server at 10.0.10.15 for storage
- Redis shared service at redis.redis.svc.cluster.local