docs: update hardware inventory for R730 RAM upgrade to 272GB

Upgraded from 144GB (4x32G + 2x8G) to 272GB (8x32G + 2x8G) DDR4-2400. Added physical DIMM slot diagram, channel layout, and BIOS speed override notes. Updated compute architecture with correct CPU (single socket), VM memory values, and capacity figures.
2026-04-02 00:48:13 +03:00 · 2026-04-02 00:48:13 +03:00 · 2d8aa5ed89
commit 2d8aa5ed89
parent 87c858f026
3 changed files with 277 additions and 22 deletions
--- a/docs/architecture/compute.md
+++ b/docs/architecture/compute.md
@ -9,16 +9,16 @@ The infrastructure runs on a single Dell R730 server with Proxmox VE, hosting a
 ```mermaid
 graph TB
    subgraph Physical["Dell R730 Physical Host"]
-        CPU["2x Xeon E5-2699 v4<br/>22c/44t each<br/>44c/88t total"]
-        RAM["142GB DDR4 ECC"]
+        CPU["1x Xeon E5-2699 v4<br/>22c/44t<br/>CPU2 unpopulated"]
+        RAM["272GB DDR4-2400 ECC"]
        GPU["NVIDIA Tesla T4<br/>PCIe 0000:06:00.0"]
        DISK["1.1TB SSD<br/>931GB SSD<br/>10.7TB HDD"]
    end

    subgraph Proxmox["Proxmox VE"]
        direction TB
-        MASTER["VM 200: k8s-master<br/>8c / 8GB<br/>10.0.20.100"]
-        NODE1["VM 201: k8s-node1<br/>16c / 16GB<br/>GPU Passthrough<br/>nvidia.com/gpu=true:NoSchedule"]
+        MASTER["VM 200: k8s-master<br/>8c / 16GB<br/>10.0.20.100"]
+        NODE1["VM 201: k8s-node1<br/>16c / 32GB<br/>GPU Passthrough<br/>nvidia.com/gpu=true:NoSchedule"]
        NODE2["VM 202: k8s-node2<br/>8c / 24GB"]
        NODE3["VM 203: k8s-node3<br/>8c / 24GB"]
        NODE4["VM 204: k8s-node4<br/>8c / 24GB"]
@ -60,9 +60,9 @@ graph TB
 | Component | Specification |
 |-----------|---------------|
 | Model | Dell PowerEdge R730 |
-| CPU | 2x Intel Xeon E5-2699 v4 (22 cores / 44 threads each) |
-| Total Cores/Threads | 44 cores / 88 threads |
-| RAM | 142GB DDR4 ECC |
+| CPU | 1x Intel Xeon E5-2699 v4 (22 cores / 44 threads, CPU2 unpopulated) |
+| Total Cores/Threads | 22 cores / 44 threads |
+| RAM | 272GB DDR4-2400 ECC RDIMM (10 DIMMs: 8x32G Samsung + 2x8G Hynix) |
 | GPU | NVIDIA Tesla T4 (16GB GDDR6, PCIe 0000:06:00.0) |
 | Storage | 1.1TB SSD + 931GB SSD + 10.7TB HDD |
 | Hypervisor | Proxmox VE |
@ -71,13 +71,13 @@ graph TB

 | VM | VMID | vCPUs | RAM | Network | Role | Taints |
 |----|------|-------|-----|---------|------|--------|
-| k8s-master | 200 | 8 | 8GB | vmbr1:vlan20 (10.0.20.100) | Control Plane | `node-role.kubernetes.io/control-plane:NoSchedule` |
-| k8s-node1 | 201 | 16 | 16GB | vmbr1:vlan20 | GPU Worker | `nvidia.com/gpu=true:NoSchedule` |
+| k8s-master | 200 | 8 | 16GB | vmbr1:vlan20 (10.0.20.100) | Control Plane | `node-role.kubernetes.io/control-plane:NoSchedule` |
+| k8s-node1 | 201 | 16 | 32GB | vmbr1:vlan20 | GPU Worker | `nvidia.com/gpu=true:NoSchedule` |
 | k8s-node2 | 202 | 8 | 24GB | vmbr1:vlan20 | Worker | None |
 | k8s-node3 | 203 | 8 | 24GB | vmbr1:vlan20 | Worker | None |
 | k8s-node4 | 204 | 8 | 24GB | vmbr1:vlan20 | Worker | None |

-**Total Cluster Resources**: 48 vCPUs, 104GB RAM (excluding control plane)
+**Total Cluster Resources**: 48 vCPUs, 120GB RAM (excluding control plane)

 ### GPU Passthrough

@ -443,7 +443,7 @@ spec:
 **Rationale**:
 - **CFS Throttling**: Linux Completely Fair Scheduler throttles containers to their exact CPU limit, even when CPU is idle. This causes artificial performance degradation.
 - **Burstability**: Services can burst to unused CPU during low-load periods, improving response times.
- **Memory-bound**: With 142GB RAM across 48 vCPUs, memory exhaustion occurs before CPU saturation. Memory is the constraining resource.
+- **Memory-bound**: With 272GB host RAM (180GB allocated to VMs), memory is no longer the primary constraint. 92GB headroom available for new VMs.

 **Tradeoff**: A runaway process could monopolize CPU. Mitigated by CPU requests reserving capacity and PriorityClass preemption.

@ -492,7 +492,7 @@ spec:
 - **Growth Buffer**: Services grow over time (more users, more data). Headroom delays the need for manual intervention.
 - **GPU Volatility**: GPU workloads (ML inference) have unpredictable memory usage. 30% headroom reduces OOMKills.

-**Tradeoff**: Slightly higher memory allocation. Accepted because 142GB RAM provides ample capacity.
+**Tradeoff**: Slightly higher memory allocation. Accepted because 272GB RAM provides ample capacity.

 ## Troubleshooting