Infrastructure Documentation

This repository contains the configuration and documentation for a homelab Kubernetes cluster running on Proxmox. The infrastructure hosts 70+ services managed declaratively with Terraform and Terragrunt.

Quick Reference

Network Ranges

  • Physical Network: 192.168.1.0/24 - Physical devices and host network
  • Management VLAN 10: 10.0.10.0/24 - Infrastructure VMs and management
  • Kubernetes VLAN 20: 10.0.20.0/24 - Kubernetes cluster network
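
As a quick way to check which of the ranges above an address belongs to, here is a minimal Python sketch using the standard-library ipaddress module. The three CIDR blocks are taken from the list above; the dictionary labels and the classify helper are hypothetical names chosen for illustration and do not come from the repository.

```python
import ipaddress
from typing import Optional

# CIDR blocks copied from the "Network Ranges" list.
# The keys are illustrative labels, not identifiers used by the repo.
NETWORKS = {
    "physical": ipaddress.ip_network("192.168.1.0/24"),
    "management-vlan10": ipaddress.ip_network("10.0.10.0/24"),
    "kubernetes-vlan20": ipaddress.ip_network("10.0.20.0/24"),
}


def classify(address: str) -> Optional[str]:
    """Return the label of the range containing `address`, or None if no range matches."""
    ip = ipaddress.ip_address(address)
    for label, network in NETWORKS.items():
        if ip in network:
            return label
    return None


if __name__ == "__main__":
    for candidate in ("192.168.1.1", "10.0.10.5", "10.0.20.15", "8.8.8.8"):
        print(candidate, "->", classify(candidate))
```

An address outside all three ranges returns None, which is a simple signal that it is neither on the physical network nor on one of the two VLANs.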

Key URLs

  • Public: viktorbarzin.me
  • Internal: viktorbarzin.lan

Architecture Documentation

Document            Description
Overview            Infrastructure overview, hardware specs, VM inventory, and service catalog
Networking          Network topology, VLANs, routing, and firewall rules
VPN                 Headscale mesh VPN and Cloudflare Tunnel configuration
Storage             Proxmox host NFS, Proxmox CSI (LVM-thin + LUKS2), and persistent volume management
Authentication      Authentik SSO, OIDC flows, and service integration
Security            CrowdSec IPS, Kyverno policies, and security controls
Monitoring          Prometheus, Grafana, Loki, and observability stack
Secrets Management  HashiCorp Vault integration and secret rotation
CI/CD               Woodpecker CI pipeline and deployment automation
Backup & DR         Backup strategy, disaster recovery, and restore procedures
Compute             Proxmox VMs, GPU passthrough, K8s resource management, and VPA
Databases           PostgreSQL, MySQL, Redis, and database operators
Multi-tenancy       Namespace isolation, tier system, and resource quotas

Operations

  • Runbooks - Step-by-step operational procedures
  • Plans - Infrastructure change plans and rollout strategies

Getting Started

  1. Review the Overview for a high-level understanding
  2. Read the Networking doc to understand connectivity
  3. Check Compute for resource management patterns
  4. Explore individual architecture docs based on your area of interest