diff --git a/docs/architecture/compute.md b/docs/architecture/compute.md
index fe27f730..d4ccf6e1 100644
--- a/docs/architecture/compute.md
+++ b/docs/architecture/compute.md
@@ -2,7 +2,7 @@
 ## Overview
-The infrastructure runs on a single Dell R730 server with Proxmox VE, hosting a 5-node Kubernetes cluster. Compute resources are managed through a combination of Vertical Pod Autoscaler (VPA) recommendations, tier-based LimitRange defaults, and ResourceQuota enforcement. The cluster employs a no-CPU-limits policy to avoid CFS throttling while using memory requests=limits for stability. GPU workloads run on a dedicated node with Tesla T4 passthrough.
+The infrastructure runs on a single Dell R730 server with Proxmox VE, hosting a 7-node Kubernetes cluster. Compute resources are managed through a combination of Vertical Pod Autoscaler (VPA) recommendations, tier-based LimitRange defaults, and ResourceQuota enforcement. The cluster employs a no-CPU-limits policy to avoid CFS throttling while using memory requests=limits for stability. GPU workloads run on a dedicated node with Tesla T4 passthrough.
 ## Architecture Diagram
diff --git a/docs/architecture/dns.md b/docs/architecture/dns.md
index eec99830..e90956d2 100644
--- a/docs/architecture/dns.md
+++ b/docs/architecture/dns.md
@@ -28,7 +28,7 @@ graph TB
  end
  subgraph "Kubernetes Cluster"
- NodeLocalDNS[NodeLocal DNSCache
DaemonSet, 5 nodes
169.254.20.10 + 10.96.0.10]
+ NodeLocalDNS[NodeLocal DNSCache
DaemonSet, 7 nodes
169.254.20.10 + 10.96.0.10]
  CoreDNS[CoreDNS
kube-system
.:53 + viktorbarzin.lan:53]
  KubeDNSUpstream[kube-dns-upstream
ClusterIP, selects CoreDNS pods]
diff --git a/docs/architecture/mailserver.md b/docs/architecture/mailserver.md
index 0026b932..0edeffb4 100644
--- a/docs/architecture/mailserver.md
+++ b/docs/architecture/mailserver.md
@@ -22,7 +22,7 @@ flowchart TB
  MX --> PF[pfSense WAN
vtnet0 192.168.1.2]
  PF -->|NAT rdr
WAN:25/465/587/993
→ 10.0.20.1:same| HAP
  HAP[pfSense HAProxy
4 TCP frontends on 10.0.20.1
send-proxy-v2 to backends]
- HAP -->|round-robin
tcp-check inter 120s| KN{k8s worker
node1..4}
+ HAP -->|round-robin
tcp-check inter 120s| KN{k8s worker
node1..6}
  KN -->|NodePort 30125-30128
ETP: Cluster → kube-proxy SNAT| PODEXT
  %% Internal ingress path
diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md
index 8f1a19ce..5ca53660 100644
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -134,10 +134,10 @@ This three-tier network design isolates Kubernetes workloads from management inf
 ### Compute Layer
-The Kubernetes cluster consists of 5 nodes:
+The Kubernetes cluster consists of 7 nodes:
 - **k8s-master (200)**: 8c/32GB control plane running kube-apiserver, etcd, controller-manager
-- **k8s-node1 (201)**: 16c/32GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
-- **k8s-node2-4 (202-204)**: 8c/32GB workers running general-purpose workloads
+- **k8s-node1 (201)**: 16c/48GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
+- **k8s-node2-6 (202-206)**: 8c/32GB workers running general-purpose workloads
 GPU passthrough on node1 uses PCIe device 0000:06:00.0. The NVIDIA GPU Operator's gpu-feature-discovery auto-labels whichever node carries the card with `nvidia.com/gpu.present=true`; `null_resource.gpu_node_config` taints the same set of nodes with `nvidia.com/gpu=true:PreferNoSchedule`. No hostname is hardcoded — moving the card to a different node requires no Terraform edits.