Compare commits


2 commits

Author SHA1 Message Date
Viktor Barzin
e35d693972 storage: migrate send off proxmox-lvm to NFS (LUN-cap relief)
Some checks failed
ci/woodpecker/push/build-cli Pipeline was successful
ci/woodpecker/push/default Pipeline was canceled
Send (timvisee/send) stores encrypted upload blobs on disk with metadata in
Redis — no embedded DB, NFS-safe. Frees one proxmox-csi SCSI-LUN slot on node2.

- swap send-data-proxmox -> nfs_volume module
- blobs copied + verified (273M, 22 entries, HTTP 200 on NFS)
- block PVC removed; LV retained per SC policy (code-dfjn cleanup)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 20:04:37 +00:00
Viktor Barzin
c24b4a21d8 docs(architecture): fix stale 5-node claim -> 7 nodes (k8s-node1..6) [ci skip]
Cluster grew to 7 nodes (k8s-master + node1..6; node5/6 added ~10d ago)
but several docs still said "5 nodes". Corrected with live specs:

- overview.md: 7-node enumeration; node1 is 16c/48GB (doc wrongly said
  32GB), node2-6 are 8c/32GB general workers
- compute.md: "5-node" -> "7-node" cluster description
- dns.md: NodeLocal DNSCache DaemonSet "5 nodes" -> "7 nodes"
- mailserver.md: HAProxy backend diagram "node1..4" -> "node1..6"

Illustrative "0/5 nodes available" scheduler-error examples left as-is.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 20:03:58 +00:00
5 changed files with 17 additions and 34 deletions

compute.md

@@ -2,7 +2,7 @@
 ## Overview
-The infrastructure runs on a single Dell R730 server with Proxmox VE, hosting a 5-node Kubernetes cluster. Compute resources are managed through a combination of Vertical Pod Autoscaler (VPA) recommendations, tier-based LimitRange defaults, and ResourceQuota enforcement. The cluster employs a no-CPU-limits policy to avoid CFS throttling while using memory requests=limits for stability. GPU workloads run on a dedicated node with Tesla T4 passthrough.
+The infrastructure runs on a single Dell R730 server with Proxmox VE, hosting a 7-node Kubernetes cluster. Compute resources are managed through a combination of Vertical Pod Autoscaler (VPA) recommendations, tier-based LimitRange defaults, and ResourceQuota enforcement. The cluster employs a no-CPU-limits policy to avoid CFS throttling while using memory requests=limits for stability. GPU workloads run on a dedicated node with Tesla T4 passthrough.
 ## Architecture Diagram

dns.md

@@ -28,7 +28,7 @@ graph TB
 end
 subgraph "Kubernetes Cluster"
-NodeLocalDNS[NodeLocal DNSCache<br/>DaemonSet, 5 nodes<br/>169.254.20.10 + 10.96.0.10]
+NodeLocalDNS[NodeLocal DNSCache<br/>DaemonSet, 7 nodes<br/>169.254.20.10 + 10.96.0.10]
 CoreDNS[CoreDNS<br/>kube-system<br/>.:53 + viktorbarzin.lan:53]
 KubeDNSUpstream[kube-dns-upstream<br/>ClusterIP, selects CoreDNS pods]

mailserver.md

@@ -22,7 +22,7 @@ flowchart TB
 MX --> PF[pfSense WAN<br/>vtnet0 192.168.1.2]
 PF -->|NAT rdr<br/>WAN:25/465/587/993<br/>→ 10.0.20.1:same| HAP
 HAP[pfSense HAProxy<br/>4 TCP frontends on 10.0.20.1<br/>send-proxy-v2 to backends]
-HAP -->|round-robin<br/>tcp-check inter 120s| KN{k8s worker<br/>node1..4}
+HAP -->|round-robin<br/>tcp-check inter 120s| KN{k8s worker<br/>node1..6}
 KN -->|NodePort 30125-30128<br/>ETP: Cluster → kube-proxy SNAT| PODEXT
 %% Internal ingress path

overview.md

@@ -134,10 +134,10 @@ This three-tier network design isolates Kubernetes workloads from management inf
 ### Compute Layer
-The Kubernetes cluster consists of 5 nodes:
+The Kubernetes cluster consists of 7 nodes:
 - **k8s-master (200)**: 8c/32GB control plane running kube-apiserver, etcd, controller-manager
-- **k8s-node1 (201)**: 16c/32GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
-- **k8s-node2-4 (202-204)**: 8c/32GB workers running general-purpose workloads
+- **k8s-node1 (201)**: 16c/48GB GPU node with Tesla T4 passthrough, tainted for GPU workloads only
+- **k8s-node2-6 (202-206)**: 8c/32GB workers running general-purpose workloads
 GPU passthrough on node1 uses PCIe device 0000:06:00.0. The NVIDIA GPU Operator's gpu-feature-discovery auto-labels whichever node carries the card with `nvidia.com/gpu.present=true`; `null_resource.gpu_node_config` taints the same set of nodes with `nvidia.com/gpu=true:PreferNoSchedule`. No hostname is hardcoded — moving the card to a different node requires no Terraform edits.


@@ -27,33 +27,16 @@ module "tls_secret" {
   tls_secret_name = var.tls_secret_name
 }
-resource "kubernetes_persistent_volume_claim" "data_proxmox" {
-  wait_until_bound = false
-  metadata {
-    name = "send-data-proxmox"
-    namespace = kubernetes_namespace.send.metadata[0].name
-    annotations = {
-      "resize.topolvm.io/threshold" = "10%"
-      "resize.topolvm.io/increase" = "100%"
-      "resize.topolvm.io/storage_limit" = "5Gi"
-    }
-  }
-  spec {
-    access_modes = ["ReadWriteOnce"]
-    storage_class_name = "proxmox-lvm"
-    resources {
-      requests = {
-        storage = "1Gi"
-      }
-    }
-  }
-  lifecycle {
-    # The autoresizer expands requests.storage up to storage_limit and
-    # PVCs can't shrink. Without this, every TF apply tries to revert
-    # to the spec value, K8s rejects the shrink, and the PVC ends up
-    # in Terminating-but-in-use limbo.
-    ignore_changes = [spec[0].resources[0].requests]
-  }
+# Upload blobs on NFS. Migrated off proxmox-lvm 2026-06-05 for LUN-cap relief
+# Send stores encrypted file blobs on disk (metadata in Redis), no embedded DB,
+# NFS-safe. See docs/plans/2026-06-05-block-storage-harden-nfs-design.md
+module "nfs_send" {
+  source = "../../modules/kubernetes/nfs_volume"
+  name = "send-data-nfs"
+  namespace = kubernetes_namespace.send.metadata[0].name
+  nfs_server = var.nfs_server
+  nfs_path = "/srv/nfs/send"
+  storage = "5Gi"
 }
 resource "kubernetes_deployment" "send" {
@@ -142,7 +125,7 @@ resource "kubernetes_deployment" "send" {
   volume {
     name = "data"
     persistent_volume_claim {
-      claim_name = kubernetes_persistent_volume_claim.data_proxmox.metadata[0].name
+      claim_name = module.nfs_send.claim_name
     }
   }
 }
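
The diff above calls the `nfs_volume` module and reads its `claim_name` output, but the module body is not part of this change. As a rough illustration only, a module with that interface could pair a statically provisioned NFS PersistentVolume with a matching claim. Everything below is an assumption rather than the repository's actual module: the variable names simply mirror the arguments passed in the diff, and the `nfs-static` storage class name, the `ReadWriteMany` access mode, and the `Retain` reclaim policy are hypothetical choices.

```hcl
# Hypothetical sketch of an nfs_volume-style module; NOT the real module.
variable "name" { type = string }
variable "namespace" { type = string }
variable "nfs_server" { type = string }
variable "nfs_path" { type = string }
variable "storage" { type = string }

# Static PV that points at the NFS export.
resource "kubernetes_persistent_volume" "this" {
  metadata {
    name = var.name
  }
  spec {
    capacity = {
      storage = var.storage
    }
    access_modes                     = ["ReadWriteMany"] # assumption
    persistent_volume_reclaim_policy = "Retain"          # assumption
    storage_class_name               = "nfs-static"      # hypothetical name
    persistent_volume_source {
      nfs {
        server = var.nfs_server
        path   = var.nfs_path
      }
    }
  }
}

# Claim bound explicitly to the PV above through volume_name.
resource "kubernetes_persistent_volume_claim" "this" {
  metadata {
    name      = var.name
    namespace = var.namespace
  }
  spec {
    access_modes       = ["ReadWriteMany"]
    storage_class_name = "nfs-static"
    volume_name        = kubernetes_persistent_volume.this.metadata[0].name
    resources {
      requests = {
        storage = var.storage
      }
    }
  }
}

# Output consumed by the deployment as module.nfs_send.claim_name.
output "claim_name" {
  value = kubernetes_persistent_volume_claim.this.metadata[0].name
}
```

With a layout like this the deployment only needs the claim name, which is why the second hunk swaps one reference for another and leaves the rest of the `volume` block unchanged.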