stem95su: scheduled Drive->site sync CronJob (every 10m)
CronJob stem95su-gdrive-sync (*/10) mounts the content PVC RW and rclone-syncs the read-only Drive folder "claude" (stem claude/files) onto it (rclone/rclone:1.74.3, scope=drive.readonly, empty-source guard + --max-delete 25).

ESO ExternalSecret stem95su-rclone <- Vault secret/stem95su. Requires the GCP OAuth app to be published to Production; otherwise the refresh token expires ~weekly.

Lands the gdrive-sync stack on master (it had landed on a feature branch by accident on the shared devvm checkout).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
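
The two safety checks named in the message (empty-source guard and a deletion cap) can be sketched as a small decision function. This is only an illustrative model, not the CronJob's actual script: `should_sync` and its arguments are hypothetical names, and rclone enforces `--max-delete` itself during the run rather than by pre-planning as done here.

```python
def should_sync(source_files, dest_files, max_delete=25):
    """Decide whether a one-way source -> destination sync looks safe.

    Hypothetical helper: models an empty-source guard plus a deletion cap.
    Returns a (proceed, reason) tuple.
    """
    source = set(source_files)
    dest = set(dest_files)

    # Empty-source guard: an empty listing usually means the remote is
    # unreachable or the token is invalid, not that every file was removed.
    if not source:
        return False, "source listing is empty; refusing to sync"

    # Files present only at the destination would be deleted by a sync.
    to_delete = dest - source
    if len(to_delete) > max_delete:
        return False, f"{len(to_delete)} deletions exceed limit of {max_delete}"

    return True, f"ok: {len(to_delete)} deletions planned"
```

With the default cap, a run that would remove 26 files is refused, while one that removes 25 proceeds.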
parent 05b50d2b96
commit 6d224861c4
1168 changed files with 120 additions and 358547 deletions
@@ -1,29 +0,0 @@
# Node Configuration Drift Quick Wins — Design
**Date**: 2026-02-22
**Status**: Approved
**Context**: From Talos Linux evaluation — these close 95% of the drift gap without changing the OS
## Quick Win 1: Add GPU Label to Terraform

**File**: `stacks/platform/modules/nvidia/main.tf`

Extend the existing `null_resource.gpu_node_taint` to also apply the `gpu=true` label, and rename it to `gpu_node_config`. Both commands are idempotent (`--overwrite` for the taint; the label is a no-op if already set).
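
As a rough illustration of the two commands the renamed resource would run, the sketch below builds them as argument lists. The helper name `gpu_node_commands` and the taint `nvidia.com/gpu=true:NoSchedule` are assumptions for illustration only; the design names the `gpu=true` label but not the taint key or effect.

```python
def gpu_node_commands(node, taint="nvidia.com/gpu=true:NoSchedule", label="gpu=true"):
    """Build the kubectl invocations for tainting and labelling a GPU node.

    Hypothetical helper: the taint value is an assumed placeholder.
    """
    return [
        # --overwrite makes re-applying the same taint safe.
        ["kubectl", "taint", "nodes", node, taint, "--overwrite"],
        # Re-applying a label with an unchanged value is a no-op.
        ["kubectl", "label", "nodes", node, label],
    ]
```

Each list can be passed to `subprocess.run(..., check=True)` or joined into the `local-exec` command string.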
## Quick Win 2: Improve API Server OIDC/Audit Idempotency

**Files**: `stacks/platform/modules/rbac/apiserver-oidc.tf`, `audit-policy.tf`

The current grep-before-sed checks prevent duplicate entries but don't handle value changes. Improve the OIDC check to compare the actual issuer URL value, not just the flag name. The audit policy file is always re-uploaded (good); the manifest edit is skipped if already configured (acceptable).
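
The difference between checking for the flag name and comparing its value can be sketched as follows. This is a minimal illustration, not the module's actual check: `oidc_issuer_state` is a hypothetical name, and it assumes the manifest lists the flag as an unquoted `- --oidc-issuer-url=<url>` entry.

```python
import re

FLAG = "--oidc-issuer-url"

def oidc_issuer_state(manifest_text, desired_url):
    """Classify the API server manifest as 'missing', 'match', or 'drift'.

    Hypothetical helper: assumes an unquoted list entry such as
    '    - --oidc-issuer-url=https://issuer.example.com'.
    """
    pattern = re.compile(
        r"^\s*-\s*" + re.escape(FLAG) + r"=(\S+)\s*$", re.MULTILINE
    )
    match = pattern.search(manifest_text)
    if match is None:
        return "missing"  # flag absent: the entry needs to be added
    # A name-only grep stops here; comparing the value also catches drift.
    return "match" if match.group(1) == desired_url else "drift"
```

A `drift` result is the case the grep-only check misses: the flag exists, so the edit is skipped, even though the configured issuer no longer matches the desired one.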
## Quick Win 3: Enable Node-Exporter via Prometheus Helm Chart

**File**: `stacks/platform/modules/monitoring/prometheus_chart_values.tpl`

Uncomment `prometheus-node-exporter: enabled: true`. Delete `playbooks/deploy_node_exporter.yaml` (unused, superseded by DaemonSet).
## Quick Win 4: Document Node Rebuild Procedure

**File**: `.claude/CLAUDE.md`

Add a "Node Rebuild Procedure" section documenting the full sequence: VM creation from template → cloud-init → kubeadm join → verify mirrors/labels/taints.