fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent
6d224861c4
commit
fd0f4a0365
1166 changed files with 358546 additions and 0 deletions
203
.claude/reference/authentik-state.md
Normal file
@@ -0,0 +1,203 @@
# Authentik Current State
|
||||
|
||||
> Snapshot of applications, groups, users, and flows. Use `authentik` skill for management tasks.
|
||||
|
||||
## Applications (11)
|
||||
| Application | Provider Type | Auth Flow |
|-------------|--------------|-----------|
| Cloudflare Access | OAuth2/OIDC | explicit consent |
| Domain wide catch all | Proxy (forward auth) | implicit consent |
| Forgejo | OAuth2/OIDC | explicit consent |
| Grafana | OAuth2/OIDC | implicit consent |
| Headscale | OAuth2/OIDC | explicit consent |
| Immich | OAuth2/OIDC | explicit consent |
| Kubernetes | OAuth2/OIDC (public) | implicit consent |
| Kubernetes Dashboard | OAuth2/OIDC (confidential) | implicit consent |
| linkwarden | OAuth2/OIDC | explicit consent |
| wrongmove | OAuth2/OIDC | implicit consent |
|
||||
|
||||
> **Kubernetes Dashboard** (TF-managed in `stacks/k8s-dashboard/authentik.tf`):
> confidential client `k8s-dashboard`, built for seamless dashboard SSO via
> oauth2-proxy. **Currently IDLE**: the apiserver rejects all OIDC tokens (see
> `docs/plans/2026-06-04-k8s-dashboard-sso-design.md` §12), so the dashboard runs
> on forward-auth + token-paste instead and oauth2-proxy is unwired. Kept for a
> future SSO retry once apiserver OIDC is fixed.
>
> **admin-services-restriction** policy (TF-managed in
> `stacks/authentik/admin-services-restriction.tf`, adopted 2026-06-04): gates the
> 15 admin-only hostnames to `Home Server Admins`, with a carve-out admitting the
> `kubernetes-*` RBAC groups to `k8s.viktorbarzin.me` (dashboard login page).
|
||||
|
||||
## Groups (9)
|
||||
| Group | Parent | Superuser | Purpose |
|-------|--------|-----------|---------|
| Allow Login Users | -- | No | Parent group for login-permitted users |
| authentik Admins | -- | Yes | Full admin access |
| Headscale Users | Allow Login Users | No | VPN access |
| Home Server Admins | Allow Login Users | No | Server admin access |
| Wrongmove Users | Allow Login Users | No | Real-estate app access |
| kubernetes-admins | -- | No | K8s cluster-admin RBAC |
| kubernetes-power-users | -- | No | K8s power-user RBAC |
| kubernetes-namespace-owners | -- | No | K8s namespace-owner RBAC |
| Task Submitters | -- | No | Task submission access |
|
||||
|
||||
## Users (8 real)
|
||||
| Username | Name | Type | Groups |
|----------|------|------|--------|
| akadmin | authentik Default Admin | internal | authentik Admins, Home Server Admins, Headscale Users |
| vbarzin@gmail.com | Viktor Barzin | internal | authentik Admins, Home Server Admins, Wrongmove Users, Headscale Users |
| emil.barzin@gmail.com | Emil Barzin | internal | Home Server Admins, Headscale Users |
| ancaelena98@gmail.com | Anca Milea | external | Wrongmove Users, Headscale Users |
| vabbit81@gmail.com | GHEORGHE Milea | external | Headscale Users, kubernetes-namespace-owners, sops-vabbit81 |
| valentinakolevabarzina@gmail.com | Valentina | internal | Headscale Users |
| anca.r.cristian10@gmail.com | -- | internal | Wrongmove Users |
| kadir.tugan@gmail.com | Kadir | internal | Wrongmove Users |
|
||||
|
||||
## Login Sources
|
||||
- **Google** (OAuth) -- user matching by identifier
|
||||
- **GitHub** (OAuth) -- user matching by email_link
|
||||
- **Facebook** (OAuth) -- user matching by email_link
|
||||
- All sources use `invitation-enrollment` as enrollment flow (new users require invitation)
|
||||
|
||||
## Authorization Flows
|
||||
- **Explicit consent** (`default-provider-authorization-explicit-consent`): Shows consent screen
|
||||
- **Implicit consent** (`default-provider-authorization-implicit-consent`): Auto-redirects
|
||||
|
||||
## Invitation Enrollment Flow
|
||||
Slug: `invitation-enrollment` | PK: `7d667321-2b02-4e16-8161-148078a8dac1`
|
||||
|
||||
New users can only sign up via invitation link. Admins generate single-use invite links.
|
||||
|
||||
### Stages (in order)
|
||||
| Order | Stage | Type | Purpose |
|-------|-------|------|---------|
| 10 | invitation-validation | Invitation | Validates `?itoken=` parameter, blocks without valid token |
| 20 | enrollment-identification | Identification | Shows social login (Google/GitHub/Facebook) + passkey |
| 30 | enrollment-prompt | Prompt | Collects name and email (pre-filled from social login) |
| 40 | enrollment-user-write | User Write | Creates user in `Allow Login Users` group |
| 50 | enrollment-login | User Login | Auto-login after signup (policy: `invitation-group-assignment` adds user to target group from invitation `fixed_data.group`) |
|
||||
|
||||
### Invitation Management
|
||||
Script: `.claude/scripts/authentik-invite.sh`
|
||||
|
||||
```bash
# Create invitation (single-use, no expiry)
./authentik-invite.sh create "Headscale Users"

# Create invitation with expiry
./authentik-invite.sh create "Wrongmove Users" --days 7

# Add user to group after enrollment
./authentik-invite.sh assign <username> "Headscale Users"

# List pending invitations
./authentik-invite.sh list
```
|
||||
|
||||
Invited users sign up via social login (Google/GitHub/Facebook) or passkey. No username/password enrollment.
|
||||
The target group (e.g. "Headscale Users") is auto-assigned on enrollment via the `invitation-group-assignment` expression policy. The `assign` command is available for manual post-enrollment group changes.
|
||||
|
||||
## Cleanup Log (2026-03-13)
|
||||
### Deleted Flows
|
||||
- `enrollment-inviation` (typo) -- previous invitation attempt
|
||||
- `headscale-authentication` -- not used by any provider
|
||||
- `headscale-authorization` -- not used by any provider
|
||||
- `default-enrollment-flow` -- password-based, unused
|
||||
- `oauth-enrollment` -- replaced by invitation-enrollment
|
||||
|
||||
### Deleted Stages
|
||||
- `enrollment-invitation`, `enrollment-invitation-write` (from old invitation flow)
|
||||
- `invitation` (unbound)
|
||||
- `default-enrollment-prompt-first`, `default-enrollment-prompt-second` (from default enrollment)
|
||||
- `default-enrollment-user-write`, `default-enrollment-email-verification`, `default-enrollment-user-login`
|
||||
|
||||
### Deleted Groups
|
||||
- `authentik Read-only` -- 0 users, unused role
|
||||
|
||||
### Deleted Policies
|
||||
- `map github username to email` -- unbound
|
||||
- `Map Google Attributes` -- unbound
|
||||
|
||||
### Deleted Roles
|
||||
- `authentik Read-only` -- no group assignment
|
||||
|
||||
## Policy Fix (2026-04-06)
|
||||
### Unbound brute-force-protection Policy
|
||||
The `brute-force-protection` ReputationPolicy (PK: `ac98cb11-31d3-46ab-8883-bf51e6b09a60`, `check_username=True`, `check_ip=True`, `threshold=-5`) was bound to 3 authentication flows, causing "Flow does not apply to current user" for all unauthenticated users (no username to evaluate → failure_result=false → flow denied).
|
||||
|
||||
Removed bindings from:
|
||||
- `default-authentication-flow` (PK: `34618cf3`) — username/password login
|
||||
- `webauthn` (PK: `0b60c2a5`) — passkey login
|
||||
- `default-source-authentication` (PK: via policybindingmodel `1a779f24`) — Google/GitHub/Facebook OAuth
|
||||
|
||||
The policy still exists with 0 bindings. If brute-force protection is needed, bind it to the **password stage** (not the flow level).
|
||||
|
||||
## Session Duration (2026-05-01)
|
||||
|
||||
Pinned via Terraform in `stacks/authentik/`:
|
||||
|
||||
| Knob | Value | Surface | Effect |
|------|-------|---------|--------|
| `UserLoginStage.session_duration` on `default-authentication-login` | `weeks=4` | `authentik_stage_user_login.default_login` in `authentik_provider.tf` | Authenticated users stay logged in 4 weeks across browser restarts. No sliding refresh: resets on each login. |
| `ProxyProvider.access_token_validity` on `Provider for Domain wide catch all` | `weeks=4` | `authentik_provider_proxy.catchall.access_token_validity` in `authentik_provider.tf` | Cookie `Max-Age` on `authentik_proxy_*` and `expires` on rows in `authentik_providers_proxy_proxysession`. Bumped 2026-05-10 from `hours=168`. **Bumping requires `kubectl rollout restart deploy/ak-outpost-authentik-embedded-outpost`**: the gorilla session store binds the value once at outpost startup; the 5-min provider refresh logs `"reusing existing session store"` and skips rebuild. |
| `AUTHENTIK_SESSIONS__UNAUTHENTICATED_AGE` (server + worker) | `hours=2` | `server.env` + `worker.env` in `modules/authentik/values.yaml` | Anonymous Django sessions (bots, healthcheckers, partial flows) are reaped within 2h instead of the 1d default. |
|
||||
|
||||
Notes:
|
||||
- There is **no** `Brand.session_duration`; `UserLoginStage` is the only correct lever for authenticated session lifetime.
|
||||
- Embedded outpost session storage: PostgreSQL table `authentik_providers_proxy_proxysession` in authentik 2025.10+ (PR #16628), but **only when `IsEmbedded()` returns true** (i.e. `Outpost.managed == "goauthentik.io/outposts/embedded"`). Our outpost record had `managed=null` until 2026-05-10, which silently kept it on the gorilla `FilesystemStore` at `/dev/shm` (TMPDIR) and re-exposed the 2026-04-18 mismatched-session-ID class on every pod restart. Fix landed 2026-05-10: see `authentik_outpost.embedded` in `authentik_provider.tf` and post-mortem `2026-04-18-authentik-outpost-shm-full.md`.
|
||||
- The proxy outpost service has a known goauthentik 2026.2.2 bug (`internal/outpost/controllers/k8s/service.py:52`): for embedded outposts the controller sets the Service selector to `app.kubernetes.io/name=authentik` (the server pods), not `authentik-outpost-proxy`. We work around it via a `kubernetes_json_patches.service` patch on the outpost record (replaces `/spec/selector` with the outpost's own labels). Without this, endpoints are empty and Traefik forward-auth fails over to the Basic Auth realm `Emergency Access`.
|
||||
- The standalone embedded-outpost deployment needs `AUTHENTIK_POSTGRESQL__{HOST,PORT,USER,PASSWORD,NAME}` env vars to reach the dbaas cluster. These are codified via a `kubernetes_json_patches.deployment` patch that adds `envFrom` referencing the shared `goauthentik` Secret. The `app.kubernetes.io/component=server` pod label is also injected via JSON patch (it matches the `component:server` half of the Service selector that the controller adds for embedded outposts).
|
||||
- `ProxyProvider.remember_me_offset` stays UI-managed via `ignore_changes`.
|
||||
- The Authentik provider's resource schema does **not** expose the `Outpost.managed` field. We rely on TF's "write only fields it knows about" semantic: the server-set `goauthentik.io/outposts/embedded` value is preserved across applies because Terraform never writes `managed`. Re-verify this assumption after any Terraform provider upgrade: if the schema starts exposing `managed`, an apply could reset it.
|
||||
- The `unauthenticated_age` env var is injected via `server.env` / `worker.env` (not `authentik.sessions.unauthenticated_age`) because we set `authentik.existingSecret.secretName: goauthentik`, which makes the chart skip rendering its own `AUTHENTIK_*` Secret. The `authentik.*` value block is therefore inert in this stack — anything new under `authentik.*` must use the `*.env` arrays instead. The same applies to the existing `authentik.cache.*`, `authentik.web.*`, `authentik.worker.*` blocks (currently inert; live values come from the orphaned, helm-keep-policy `goauthentik` Secret created by chart 2025.10.3 before `existingSecret` was introduced).
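
As a rough sanity check on the values in the table above, the arithmetic behind `weeks=4` versus the earlier `hours=168` can be sketched in shell. The helper name `duration_to_seconds` is hypothetical, and it handles only a single `unit=N` pair; that is an assumption made for illustration, not a full parser of authentik's duration format.

```bash
# Hypothetical helper: convert a single "unit=N" duration into seconds.
# Only weeks, days, hours and minutes are handled (illustrative subset).
duration_to_seconds() {
  local unit="${1%%=*}" n="${1#*=}"
  case "$unit" in
    weeks)   echo $(( n * 7 * 24 * 3600 )) ;;
    days)    echo $(( n * 24 * 3600 )) ;;
    hours)   echo $(( n * 3600 )) ;;
    minutes) echo $(( n * 60 )) ;;
    *)       echo "unsupported unit: $unit" >&2; return 1 ;;
  esac
}
```

With these definitions, `duration_to_seconds weeks=4` prints `2419200` (28 days) and `duration_to_seconds hours=168` prints `604800` (7 days), which is the difference a cookie `Max-Age` would show before and after the 2026-05-10 bump.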
|
||||
|
||||
## Upgrade Validation Checklist
|
||||
|
||||
Run after **any** of these:
|
||||
- Authentik chart version bump in `stacks/authentik/modules/authentik/main.tf` (the `version = "..."` line on `helm_release.authentik`).
|
||||
- `goauthentik/authentik` Terraform provider version bump.
|
||||
- Outpost pod recreation (kured reboot, eviction, manual `rollout restart`, scheduler move).
|
||||
|
||||
The fragile surfaces are the `kubernetes_json_patches` and the `Outpost.managed` field — both rely on assumptions that can silently break across upgrades. The checklist exercises the same path the alerts watch, so it doubles as a smoke test for the alerts.
|
||||
|
||||
```bash
# 1. Service routes to the outpost pod (NOT the server pods).
#    Empty endpoints => auth-proxy fallback fires; expected: ONE pod IP, ports 9000/9300/9443.
kubectl -n authentik get endpoints ak-outpost-authentik-embedded-outpost

# 2. Service selector still excludes the server pods. Expected: includes
#    `app.kubernetes.io/name: authentik-outpost-proxy`. If it flips to
#    `name: authentik`, the goauthentik upstream bug came back or our
#    JSON patch was unset.
kubectl -n authentik get svc ak-outpost-authentik-embedded-outpost -o jsonpath='{.spec.selector}'

# 3. Outpost mode + session backend. Expected log lines on startup:
#      {"embedded":true,"event":"Outpost mode",...}
#      {"event":"using PostgreSQL session backend",...}
#    If embedded=false or `using filesystem session backend`, the postgres
#    fix is broken (likely `Outpost.managed` got cleared, or the upstream
#    schema started exposing `managed` and TF reset it).
#    The second pattern has no leading quote so it matches the end of the
#    `"using ... session backend"` event string.
kubectl -n authentik logs deploy/ak-outpost-authentik-embedded-outpost | grep -E '"Outpost mode"|session backend"' | head -3

# 4. /dev/shm is essentially empty (postgres backend = no filesystem use).
#    A file count above a few dozen indicates filesystem fallback is firing.
kubectl -n authentik exec deploy/ak-outpost-authentik-embedded-outpost -- sh -c 'df -h /dev/shm; ls /dev/shm | wc -l'

# 5. Postgres session table is growing with traffic. Expected: rows with
#    `expires` ~28 days out (matches access_token_validity = weeks=4).
kubectl -n authentik exec deploy/goauthentik-server -- ak shell -c "
from django.db import connection; c = connection.cursor()
c.execute('SELECT COUNT(*), MAX(expires) FROM authentik_providers_proxy_proxysession')
print(c.fetchone())"

# 6. Edge auth flow: should be 302 → authentik. NOT 401 with WWW-Authenticate.
curl -sS -o /dev/null -D - 'https://terminal.viktorbarzin.me/' -H 'User-Agent: Mozilla/5.0' \
  | grep -iE '^HTTP|^location|x-auth-fallback|www-authenticate'

# 7. Terraform plan-to-zero on the whole authentik stack.
( cd stacks/authentik && /home/wizard/code/infra/scripts/tg plan ) | grep -E 'No changes|Plan:'
```
|
||||
|
||||
Steps 1, 3, 6 cover the failure modes the Prometheus alerts trigger on (`AuthentikForwardAuthFallbackActive`, `AuthentikOutpostForwardAuth400Spike`). Steps 4 and 5 cover the silent-regression case (filesystem fallback) where the alerts don't fire but the system loses its postgres-backed session persistence on the next pod restart.
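
For scripted runs of step 6, the pass/fail decision can be sketched as a small filter over the response headers. The function name `classify_edge_auth` and its three output labels are hypothetical; the only facts taken from the checklist are that a healthy edge returns a 302 with a `Location` header and that the fallback shows up as a 401 carrying `WWW-Authenticate`.

```bash
# Hypothetical helper: read HTTP response headers on stdin and classify them.
classify_edge_auth() {
  local headers status
  headers="$(tr -d '\r')"   # strip CRs so line-anchored matches work
  status="$(printf '%s\n' "$headers" | awk 'toupper($1) ~ /^HTTP/ { print $2; exit }')"
  if [ "$status" = "401" ] && printf '%s\n' "$headers" | grep -qi '^www-authenticate:'; then
    echo "fallback"         # Basic Auth answered: forward-auth is failing over
  elif [ "$status" = "302" ] && printf '%s\n' "$headers" | grep -qi '^location:'; then
    echo "ok"               # redirect issued; the target should still be authentik
  else
    echo "unknown"
  fi
}
```

A possible invocation is `curl -sS -o /dev/null -D - 'https://terminal.viktorbarzin.me/' -H 'User-Agent: Mozilla/5.0' | classify_edge_auth`. The sketch does not inspect where the redirect points, so an `ok` result still deserves a glance at the `Location` value.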
|
||||
|
||||
If step 2 shows the controller restored `app.kubernetes.io/name=authentik`, watch the goauthentik/authentik issue tracker for fixes around `internal/outpost/controllers/k8s/service.py:52`; an upstream patch might let us drop our `kubernetes_json_patches.service` workaround.
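
The step 2 comparison can also be sketched as a string check on the selector. This assumes `kubectl` prints the selector map as compact JSON (recent versions do for `-o jsonpath` on a map); the function name `check_outpost_selector` and its labels are hypothetical.

```bash
# Hypothetical helper: classify the selector JSON printed by step 2.
check_outpost_selector() {
  case "$1" in
    *'"app.kubernetes.io/name":"authentik-outpost-proxy"'*) echo "ok" ;;
    *'"app.kubernetes.io/name":"authentik"'*)               echo "regressed" ;;
    *)                                                      echo "unknown" ;;
  esac
}
```

One way to use it: `check_outpost_selector "$(kubectl -n authentik get svc ak-outpost-authentik-embedded-outpost -o jsonpath='{.spec.selector}')"`. A `regressed` result corresponds to the selector pointing back at the server pods.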
|
||||
31
.claude/reference/github-api.md
Normal file
@@ -0,0 +1,31 @@
# GitHub API Reference
|
||||
|
||||
> Token locations and common API patterns.
|
||||
|
||||
## GitHub API
|
||||
- **Username**: `ViktorBarzin`
|
||||
- **Token**: `grep github_pat terraform.tfvars | cut -d'"' -f2` (git-crypt encrypted)
|
||||
- **Scopes**: Full access (repo, admin:public_key, admin:repo_hook, delete_repo, admin:org, workflow, write:packages)
|
||||
- **`gh` CLI**: Blocked by sandbox — use `curl` instead
|
||||
|
||||
```bash
GITHUB_TOKEN=$(grep github_pat terraform.tfvars | cut -d'"' -f2)

# List repos
curl -s -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/users/ViktorBarzin/repos?per_page=100"

# Create repo
curl -s -X POST -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/user/repos" \
  -d '{"name":"repo-name","private":true}'

# Add deploy key
curl -s -X POST -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/repos/ViktorBarzin/<repo>/keys" \
  -d '{"title":"key-name","key":"ssh-ed25519 ...","read_only":false}'

# Create webhook
curl -s -X POST -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/repos/ViktorBarzin/<repo>/hooks" \
  -d '{"config":{"url":"https://ci.viktorbarzin.me/hook","content_type":"json","secret":"..."},"events":["push","pull_request"]}'
```
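
The token lookup works because `cut -d'"' -f2` returns the text between the first pair of double quotes on the matched line, so it assumes a single-line `key = "value"` entry. A minimal sketch of the same idea as a reusable function follows; the name `tfvars_value` is hypothetical, and `grep -m1` is added here so that only the first matching line is used.

```bash
# Hypothetical helper: print the first double-quoted value on the first
# line of FILE that mentions KEY (same grep | cut idea as the token lookup).
tfvars_value() {
  local key="$1" file="$2"
  grep -m1 -- "$key" "$file" | cut -d'"' -f2
}
```

For example, `GITHUB_TOKEN=$(tfvars_value github_pat terraform.tfvars)`. Values that themselves contain a double quote, or entries split across lines, would not survive this approach.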
|
||||
|
||||
## Capabilities
|
||||
- **GitHub**: Create/delete repos, push code, manage SSH/deploy keys, manage webhooks, manage org settings, manage packages
|
||||
12
.claude/reference/known-issues.md
Normal file
@@ -0,0 +1,12 @@
# Known Issues (suppress in all agents)
|
||||
|
||||
## Permanent
|
||||
- ha-london Uptime Kuma monitor down — external HA on Raspberry Pi, not in this cluster
|
||||
- PVFillingUp for navidrome-music — Synology NAS volume, threshold is 95%, expected
|
||||
|
||||
## Intermittent
|
||||
- CrowdSec Helm release stuck in pending-upgrade — known issue, workaround: helm rollback
|
||||
- Resource usage >80% on nodes — WARN only, overcommit is by design (2x LimitRange ratio)
|
||||
|
||||
## How agents consume this file
|
||||
Each agent definition includes: "Before reporting issues, read `.claude/reference/known-issues.md` and suppress any matches."
|
||||
115
.claude/reference/patterns.md
Normal file
@@ -0,0 +1,115 @@
# Detailed Infrastructure Patterns
|
||||
|
||||
Reference file for patterns, procedures, and tables. Read on demand when the specific topic comes up.
|
||||
|
||||
## NFS Volume Pattern
|
||||
Use the `nfs_volume` shared module for all NFS volumes (creates static PVs, CSI-backed, `soft,timeo=30,retrans=3`):
|
||||
```hcl
module "nfs_data" {
  source     = "../../modules/kubernetes/nfs_volume" # ../../../../ for platform modules, ../../../ for sub-stacks
  name       = "<service>-data"                      # Must be globally unique (PV is cluster-scoped)
  namespace  = kubernetes_namespace.<service>.metadata[0].name
  nfs_server = var.nfs_server                        # 192.168.1.127 (Proxmox host)
  nfs_path   = "/srv/nfs/<service>"                  # HDD NFS, or "/srv/nfs-ssd/<service>" for SSD
}
# In pod spec: persistent_volume_claim { claim_name = module.nfs_data.claim_name }
```
|
||||
**Note**: Some legacy PVs still reference `/mnt/main/<service>` paths (from the TrueNAS era). These work via compatibility on the Proxmox host. New PVs should use `/srv/nfs/` or `/srv/nfs-ssd/`.
|
||||
**DO NOT use inline `nfs {}` blocks**: they mount with the `hard,timeo=600` defaults, so I/O can hang indefinitely if the NFS server becomes unreachable.
|
||||
|
||||
## Adding NFS Exports
|
||||
1. Create dir on Proxmox host: `ssh root@192.168.1.127 "mkdir -p /srv/nfs/<service> && chmod 777 /srv/nfs/<service>"`
|
||||
2. Edit `/etc/exports` on the Proxmox host — add the export entry
|
||||
3. Reload exports: `ssh root@192.168.1.127 "exportfs -ra"`
|
||||
4. Verify: `showmount -e 192.168.1.127`
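
The steps above can be sketched as a dry-run helper that prints the commands for review instead of executing them. The function name `nfs_export_commands` is hypothetical. Step 2 is deliberately left manual because the exact `/etc/exports` entry is not specified here.

```bash
# Hypothetical helper: print (not run) the commands for steps 1, 3 and 4.
# Step 2 (editing /etc/exports) stays manual; its entry format is not given.
nfs_export_commands() {
  local service="$1" host="${2:-192.168.1.127}"
  [ -n "$service" ] || { echo "usage: nfs_export_commands <service> [host]" >&2; return 1; }
  printf 'ssh root@%s "mkdir -p /srv/nfs/%s && chmod 777 /srv/nfs/%s"\n' "$host" "$service" "$service"
  printf 'ssh root@%s "exportfs -ra"\n' "$host"
  printf 'showmount -e %s\n' "$host"
}
```

Running `nfs_export_commands stem95su` prints three lines that can be pasted one at a time, with the `/etc/exports` edit done between the first and second.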
|
||||
|
||||
## Static Site Hosting
|
||||
Two patterns for serving a folder of static files (HTML/CSS/JS/media):
|
||||
|
||||
1. **Image-baked** (default for git-native content): bake files into an `nginx:*-alpine` image at build time, deploy like any owned app (CI builds + pushes, Keel/Woodpecker rolls out). Reference: `stacks/blog` (Hugo → nginx, `Website/Dockerfile`). Use when content lives in git and changes via commits.
|
||||
|
||||
2. **NFS-backed** (for externally-authored / large / non-git content): a stock `nginx:1.28-alpine` Deployment mounts an `nfs_volume` PVC **read-only** at `/usr/share/nginx/html`; a tiny ConfigMap supplies `/etc/nginx/conf.d/default.conf` (just `root` + `index <entry>.html`). Files are dropped on `/srv/nfs/<site>` out-of-band (Nextcloud "PVE NFS Pool" or rsync) — no rebuild, auto-backed-up by `nfs-mirror`. Reference: `stacks/stem95su` (established 2026-06-07). Use when content is authored outside git (e.g. exported tools), is large (avoids git/image bloat), or a non-dev updates it. **The export subdir on the PVE host must exist before the pod mounts** — the `nfs_volume` module does NOT create it (see "Adding NFS Exports"; a subdir under the already-exported `/srv/nfs` needs no new `/etc/exports` line).
|
||||
|
||||
Both front with `ingress_factory` (`auth="none"` for open public content → CrowdSec + ai-bot-block still apply; or chain `anubis_instance` for a PoW gate, as `blog` does).
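
For the NFS-backed pattern, the ConfigMap payload is small enough to sketch. The helper below renders a minimal `default.conf`; its name `render_static_site_conf` is hypothetical, and the `listen 80` and `server_name _` lines are assumptions, since the description above only specifies `root` and `index <entry>.html`.

```bash
# Hypothetical helper: render a minimal default.conf for the NFS-backed pattern.
# `listen 80` and `server_name _` are assumptions; only `root` and
# `index <entry>.html` come from the pattern description.
render_static_site_conf() {
  local entry="${1:-index}"
  cat <<EOF
server {
    listen 80;
    server_name _;
    root /usr/share/nginx/html;
    index ${entry}.html;
}
EOF
}
```

The output of `render_static_site_conf home` could be pasted into the ConfigMap that supplies `/etc/nginx/conf.d/default.conf`; the actual file in `stacks/stem95su` may differ.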
|
||||
|
||||
## ~~iSCSI Storage~~ (REMOVED — replaced by proxmox-lvm)
|
||||
> iSCSI via democratic-csi and TrueNAS has been fully removed (2026-04). All database storage now uses `StorageClass: proxmox-lvm` (Proxmox CSI, LVM-thin hotplug). TrueNAS has been decommissioned.
|
||||
|
||||
## Anti-AI Scraping (4 Active Layers) (Updated 2026-05-10)
|
||||
Default `anti_ai_scraping = true` in ingress_factory. Disable per-service: `anti_ai_scraping = false`.
|
||||
1. **Anubis PoW challenge** (per-site reverse proxy) — `modules/kubernetes/anubis_instance/`. Latest: `ghcr.io/techarohq/anubis:v1.25.0`. Difficulty 2 (~250 ms desktop / ~700 ms mobile), 30-day JWT cookie scoped to `viktorbarzin.me` so a single solve covers every Anubis-fronted subdomain. Active on: `viktorbarzin.me`, `kms.viktorbarzin.me`, `travel.viktorbarzin.me`. Add to a stack: `module "anubis" { source = "../../modules/kubernetes/anubis_instance"; name = "X"; namespace = ...; target_url = "http://<svc>.<ns>.svc.cluster.local" }`, then point ingress_factory at `module.anubis.service_name` + `port = module.anubis.service_port` and set `anti_ai_scraping = false`. Shared ed25519 signing key in Vault `secret/viktor` -> `anubis_ed25519_key`. **Avoid putting Anubis in front of CLI/API/Git endpoints (Forgejo, APIs, WebDAV)** — clients without JS can't solve PoW.
|
||||
2. **Bot blocking forwardAuth** (ForwardAuth → bot-block-proxy → poison-fountain) — global default for non-Anubis sites. `bot-block-proxy` (OpenResty in `traefik` ns) is fail-open with 100 ms connect / 200 ms read timeouts so a downed poison-fountain costs ≤200 ms per request. Source: `stacks/traefik/modules/traefik/main.tf`.
|
||||
3. **X-Robots-Tag noai** — set by `traefik-anti-ai-headers` middleware. Anubis additionally serves a comprehensive `/robots.txt` (`SERVE_ROBOTS_TXT=true`) to well-behaved bots.
|
||||
4. **Tarpit/poison content** (standalone at poison.viktorbarzin.me, `stacks/poison-fountain/`). Currently scaled to `replicas = 0` — fail-open path means no live traffic, no penalty.
|
||||
|
||||
Trap links (formerly a layer) removed April 2026 — rewrite-body plugin broken on Traefik v3.6.12 (Yaegi bugs). `strip-accept-encoding` and `anti-ai-trap-links` middlewares deleted.
|
||||
Rybbit analytics injection now via Cloudflare Worker (`stacks/rybbit/worker/`, HTMLRewriter, wildcard route `*.viktorbarzin.me/*`, 28 site ID mappings).
|
||||
Key files: `modules/kubernetes/anubis_instance/`, `stacks/poison-fountain/`, `stacks/rybbit/worker/`, `stacks/traefik/modules/traefik/main.tf`
|
||||
|
||||
## Terragrunt Architecture
|
||||
- Root `terragrunt.hcl`: DRY providers, backend, variable loading, `generate "tiers"` block
|
||||
- Each stack: `stacks/<service>/main.tf`, state at `state/stacks/<service>/terraform.tfstate`
|
||||
- Platform modules: `stacks/platform/modules/<service>/`, shared: `modules/kubernetes/`
|
||||
- Syntax: `--non-interactive`, `terragrunt run --all -- <command>` (not `run-all`)
|
||||
- Tiers auto-generated into `tiers.tf` — never add `locals { tiers = {} }` manually
|
||||
|
||||
## Factory Pattern (Multi-User Services)
|
||||
Structure: `stacks/<service>/main.tf` + `factory/main.tf`. Examples: `actualbudget`, `freedify`.
|
||||
To add a user: export NFS share, add Cloudflare route in tfvars, add module block calling factory.
|
||||
|
||||
## Node Rebuild Procedure
|
||||
1. Drain: `kubectl drain k8s-nodeX --ignore-daemonsets --delete-emptydir-data`
|
||||
2. Delete: `kubectl delete node k8s-nodeX`
|
||||
3. Destroy VM (remove from `stacks/infra/main.tf`)
|
||||
4. Get fresh join command: `ssh wizard@10.0.20.100 'sudo kubeadm token create --print-join-command'` (tokens expire 24h)
|
||||
5. Update `k8s_join_command` in `terraform.tfvars`, add VM to `stacks/infra/main.tf`, apply
|
||||
6. GPU node (k8s-node1): apply platform stack to re-apply GPU label/taint
|
||||
|
||||
## Kyverno Resource Governance
|
||||
|
||||
### LimitRange Defaults (injected when no explicit `resources {}`)
|
||||
| Tier | Default Mem | Max Mem | Default CPU | Max CPU |
|------|------------|---------|-------------|---------|
| 0-core | 512Mi | 8Gi | 500m | 4 |
| 1-cluster | 512Mi | 4Gi | 500m | 2 |
| 2-gpu | 2Gi | 16Gi | 1 | 8 |
| 3-edge / 4-aux | 256Mi | 4Gi | 250m | 2 |
| No tier | 256Mi | 2Gi | 250m | 1 |
|
||||
|
||||
### ResourceQuota (opt-out: `resource-governance/custom-quota=true`)
|
||||
| Tier | lim CPU | lim Mem | Pods |
|------|---------|---------|------|
| 0-core | 32 | 64Gi | 100 |
| 1-cluster | 16 | 32Gi | 30 |
| 2-gpu | 48 | 96Gi | 40 |
| 3-edge / 4-aux | 8-16 | 16-32Gi | 20-30 |
|
||||
|
||||
Custom quotas: authentik, monitoring (opted out), nvidia (opted out), nextcloud, onlyoffice.
|
||||
LimitRange opt-out: `resource-governance/custom-limitrange=true` + custom `kubernetes_limit_range` in stack.
|
||||
|
||||
### Other Policies
|
||||
- `inject-priority-class-from-tier` (CREATE only), `inject-ndots` (ndots:2), `sync-tier-label`
|
||||
- `goldilocks-vpa-auto-mode`: VPA `off` globally — Terraform owns resources, Goldilocks observe-only
|
||||
- Security policies ALL Audit mode: `deny-privileged-containers`, `deny-host-namespaces`, `restrict-sys-admin`, `require-trusted-registries`
|
||||
|
||||
### Debugging Container Failures
|
||||
1. **OOMKilled?** → `kubectl describe limitrange tier-defaults -n <ns>`. edge/aux default = 256Mi.
|
||||
2. **Won't schedule?** → `kubectl describe resourcequota tier-quota -n <ns>`.
|
||||
3. **Evicted?** → aux-tier pods (priority 200K, Never preempt) evicted first.
|
||||
4. **Unexpected limits?** → LimitRange injects defaults. Always set explicit resources.
|
||||
5. **Need more?** → Set explicit `resources {}` or add quota/limitrange opt-out labels.
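
When triaging an OOMKilled pod without cluster access, the injected defaults can be looked up from the LimitRange table. A minimal sketch follows; the function name `tier_defaults` and its `mem=... cpu=...` output format are hypothetical, and the values are copied from the "LimitRange Defaults" table, so they go stale if that table changes.

```bash
# Hypothetical helper: LimitRange defaults per tier, copied from the table
# in "LimitRange Defaults". Unknown or empty tiers fall back to "No tier".
tier_defaults() {
  case "$1" in
    0-core|1-cluster) echo "mem=512Mi cpu=500m" ;;
    2-gpu)            echo "mem=2Gi cpu=1" ;;
    3-edge|4-aux)     echo "mem=256Mi cpu=250m" ;;
    *)                echo "mem=256Mi cpu=250m" ;;
  esac
}
```

`kubectl describe limitrange tier-defaults -n <ns>` remains the authoritative source; the sketch only mirrors the documented defaults.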
|
||||
|
||||
## Authentik (Identity Provider)
|
||||
- **URL**: `https://authentik.viktorbarzin.me` | **API**: `/api/v3/` | **Token**: `authentik_api_token` in tfvars
|
||||
- 3 server + 3 worker + 3 PgBouncer + embedded outpost
|
||||
- Forward auth: `protected = true` in ingress_factory
|
||||
- OIDC for K8s: issuer `.../application/o/kubernetes/`, client `kubernetes` (public)
|
||||
- See archived skills for management tasks and OIDC gotchas
|
||||
|
||||
## Archived Troubleshooting Runbooks
|
||||
28 skills in `.claude/skills/archived/` — load when the specific issue arises.
|
||||
Topics: authentik, bluestacks, clickhouse-nfs, coturn, crowdsec, fastapi-svelte-gpu,
grafana-datasource, helm-stuck, ingress-migration, image-caching, gpu-devices, hpa-storm,
nfs-mount, kubelet-manifest, llm-gpu, loki-helm, librespot, nextcloud-calendar, nfsv4-idmapd,
openclaw-deploy, pfsense-dnsmasq, pfsense-nat, proxmox-disk, python-sanitize, terraform-state,
traefik-helm, traefik-rewrite-body.
|
||||
130
.claude/reference/proxmox-inventory.md
Normal file
@@ -0,0 +1,130 @@
# Proxmox Inventory & Infrastructure
|
||||
|
||||
> Static reference for VMs, hardware, and network topology.
|
||||
|
||||
## Proxmox Host Hardware
|
||||
- **Model**: Dell R730
|
||||
- **CPU**: Intel Xeon E5-2699 v4 @ 2.20GHz (22 cores / 44 threads, single socket, CPU2 unpopulated)
|
||||
- **RAM**: 272 GB DDR4-2400 ECC RDIMM (10 DIMMs, see Memory Layout below)
|
||||
- **GPU**: NVIDIA Tesla T4 (PCIe passthrough to k8s-node1)
|
||||
- **iDRAC**: 192.168.1.4 (root/calvin)
|
||||
- **Disks**: 1.1TB RAID1 SAS (backup) + 931GB Samsung SSD + 10.7TB RAID1 HDD
|
||||
- **NFS server**: Proxmox host serves NFS directly. HDD NFS: `/srv/nfs` on ext4 LV `pve/nfs-data` (2TB). SSD NFS: `/srv/nfs-ssd` on ext4 LV `ssd/nfs-ssd-data` (100GB). Exports use `async` mode (safe with UPS + databases on block storage). TrueNAS (10.0.10.15) decommissioned.
|
||||
- **Proxmox access**: `ssh root@192.168.1.127`
|
||||
|
||||
## Memory Layout (updated 2026-04-01)
|
||||
|
||||
### Physical DIMM Slot Map
|
||||
|
||||
```
|
||||
╔══════════════════════════════════════════════════════════════════════════════╗
|
||||
║ CPU1 DIMM SLOTS ║
|
||||
║ ║
|
||||
║ ┌─── WHITE (1st per channel) ───┐ ║
|
||||
║ │ │ ║
|
||||
║ │ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ║
|
||||
║ │ │ A1 │ │ A2 │ │ A3 │ │ A4 │ ║
|
||||
║ │ │ 32G │ │ 32G │ │ 32G │ │ 32G │ Samsung M393A4K40BB1-CRC (2R) ║
|
||||
║ │ │██████│ │██████│ │██████│ │██████│ ║
|
||||
║ │ └──────┘ └──────┘ └──────┘ └──────┘ ║
|
||||
║ │ Ch 0 Ch 1 Ch 2 Ch 3 ║
|
||||
║ └────────────────────────────────┘ ║
|
||||
║ ║
|
||||
║ ┌─── BLACK (2nd per channel) ───┐ ║
|
||||
║ │ │ ║
|
||||
║ │ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ║
|
||||
║ │ │ A5 │ │ A6 │ │ A7 │ │ A8 │ ║
|
||||
║ │ │ 32G │ │ 32G │ │ 32G │ │ 32G │ Samsung M393A4K40CB1-CRC (2R) ║
|
||||
║ │ │▓▓▓▓▓▓│ │▓▓▓▓▓▓│ │▓▓▓▓▓▓│ │▓▓▓▓▓▓│ ║
|
||||
║ │ └──────┘ └──────┘ └──────┘ └──────┘ ║
|
||||
║ │ Ch 0 Ch 1 Ch 2 Ch 3 ║
|
||||
║ └────────────────────────────────┘ ║
|
||||
║ ║
|
||||
║ ┌─── GREEN (3rd per channel) ───┐ ║
|
||||
║ │ │ ║
|
||||
║ │ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ║
|
||||
║ │ │ A9 │ │ A10 │ │ A11 │ │ A12 │ ║
|
||||
║ │ │ │ │ │ │ 8G │ │ 8G │ SK Hynix HMA81GR7AFR8N-UH (1R) ║
|
||||
║ │ │ empty│ │ empty│ │░░░░░░│ │░░░░░░│ ║
|
||||
║ │ └──────┘ └──────┘ └──────┘ └──────┘ ║
|
||||
║ │ Ch 0 Ch 1 Ch 2 Ch 3 ║
|
||||
║ └────────────────────────────────┘ ║
|
||||
║ ║
|
||||
║ B1-B12: All empty (requires CPU2) ║
|
||||
║ ║
|
||||
║ Legend: ██ = Samsung BB1 32G ▓▓ = Samsung CB1 32G ░░ = Hynix 8G ║
|
||||
╚══════════════════════════════════════════════════════════════════════════════╝
|
||||
```
|
||||
|
||||
### Channel Summary
|
||||
|
||||
```
Channel 0:  A1 [32G] ──── A5 [32G] ──── A9 [    ]   = 64 GB   ✓ matched
Channel 1:  A2 [32G] ──── A6 [32G] ──── A10[    ]   = 64 GB   ✓ matched
Channel 2:  A3 [32G] ──── A7 [32G] ──── A11[ 8G ]   = 72 GB   ~ +8G bonus
Channel 3:  A4 [32G] ──── A8 [32G] ──── A12[ 8G ]   = 72 GB   ~ +8G bonus
            ─────────     ─────────     ──────────
              WHITE         BLACK         GREEN       TOTAL: 272 GB
```

### DIMM Details

- **A1-A4**: Samsung M393A4K40BB1-CRC 32GB DDR4-2400 ECC RDIMM (2-rank, original)
- **A5-A8**: Samsung M393A4K40CB1-CRC 32GB DDR4-2400 ECC RDIMM (2-rank, added 2026-04-01)
- **A11-A12**: SK Hynix HMA81GR7AFR8N-UH 8GB DDR4-2400 ECC RDIMM (1-rank, relocated from A5/A6)
- **A9-A10, B1-B12**: Empty (B-side requires CPU2)
- **Speed**: 2400 MHz (BIOS override: 3 DPC defaults to 1866 MHz; forced to 2400 via System BIOS > Memory Settings > Memory Frequency)

## Network Topology
```
10.0.10.0/24   - Management: Wizard (10.0.10.10)
10.0.20.0/24   - Kubernetes: pfSense GW (10.0.20.1), Registry (10.0.20.10),
                 k8s-master (10.0.20.100), DNS (10.0.20.101), MetalLB (10.0.20.102-200)
192.168.1.0/24 - Physical: Proxmox (192.168.1.127)
```
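
As a quick sanity check when assigning addresses, the subnets and the MetalLB range listed above can be encoded with Python's standard `ipaddress` module. This is a minimal sketch; the names `SUBNETS`, `subnet_of`, and `in_metallb_pool` are hypothetical and not part of the repository.

```python
import ipaddress

# Subnets transcribed from the topology listing; the dict keys are illustrative labels.
SUBNETS = {
    "management": ipaddress.ip_network("10.0.10.0/24"),
    "kubernetes": ipaddress.ip_network("10.0.20.0/24"),
    "physical": ipaddress.ip_network("192.168.1.0/24"),
}

# MetalLB pool bounds as listed above (10.0.20.102-200, inclusive).
METALLB_FIRST = ipaddress.ip_address("10.0.20.102")
METALLB_LAST = ipaddress.ip_address("10.0.20.200")


def subnet_of(addr):
    """Return the label of the subnet containing addr, or None if it is outside all of them."""
    ip = ipaddress.ip_address(addr)
    for name, net in SUBNETS.items():
        if ip in net:
            return name
    return None


def in_metallb_pool(addr):
    """Return True if addr falls inside the MetalLB range, so a static host should avoid it."""
    ip = ipaddress.ip_address(addr)
    return METALLB_FIRST <= ip <= METALLB_LAST


if __name__ == "__main__":
    for addr in ("10.0.10.10", "10.0.20.100", "10.0.20.150", "192.168.1.127"):
        print(addr, subnet_of(addr), in_metallb_pool(addr))
```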

## Network Bridges
- **vmbr0**: Physical bridge on `eno1`, IP `192.168.1.127/24` (physical/home network)
- **vmbr1**: Internal-only bridge, VLAN-aware: VLAN 10 (management) and VLAN 20 (kubernetes)

## VM Inventory

| VMID | Name | Status | CPUs | RAM | Network | Disk | Notes |
|------|------|--------|------|-----|---------|------|-------|
| 101 | pfsense | running | 8 | 4GB | vmbr0, vmbr1:vlan10, vmbr1:vlan20 | 32G | Gateway/firewall |
| 102 | devvm | running | 16 | 24GB | vmbr1:vlan10 | 100G | Development VM + t3code Workstation host. 8G swapfile (swappiness=10). Capacity budget: ~4-5G RAM/active user, max ~3-4 concurrent active Claude sessions. NOT Terraform-managed. |
| 103 | home-assistant | running | 8 | 8GB | vmbr0 | 64G | HA Sofia, net0 (vlan10) disabled, SSH: vbarzin@192.168.1.8 |
| 105 | pbs | stopped | 16 | 8GB | vmbr1:vlan10 | 32G | Proxmox Backup (unused) |
| 200 | k8s-master | running | 8 | 16GB | vmbr1:vlan20 | 64G | Control plane (10.0.20.100) |
| 201 | k8s-node1 | running | 16 | 32GB | vmbr1:vlan20 | 256G | GPU node, Tesla T4 |
| 202 | k8s-node2 | running | 8 | 24GB | vmbr1:vlan20 | 256G | Worker |
| 203 | k8s-node3 | running | 8 | 24GB | vmbr1:vlan20 | 256G | Worker |
| 204 | k8s-node4 | running | 8 | 24GB | vmbr1:vlan20 | 256G | Worker |
| 220 | docker-registry | running | 4 | 4GB | vmbr1:vlan20 | 64G | MAC DE:AD:BE:EF:22:22 (10.0.20.10) |
| 300 | Windows10 | running | 16 | 8GB | vmbr0 | 100G | Windows VM |
| ~~9000~~ | ~~truenas~~ | **stopped/decommissioned** | - | - | - | - | NFS migrated to Proxmox host (192.168.1.127) at `/srv/nfs` and `/srv/nfs-ssd` |

**Total VM RAM allocated**: 196 GB of 272 GB (72%), leaving 76 GB free for future VMs (devvm corrected 8GB→24GB 2026-06-08)

## VM Templates
| VMID | Name | Purpose |
|------|------|---------|
| 1000 | ubuntu-2404-cloudinit-non-k8s-template | Base for non-K8s VMs |
| 1001 | docker-registry-template | Docker registry VM |
| 2000 | ubuntu-2404-cloudinit-k8s-template | Base for K8s nodes |

## PVE Host Systemd Services (Custom)

| Unit | Type | Schedule | Purpose |
|------|------|----------|---------|
| `lvm-pvc-snapshot.timer` | Timer | Daily 03:00 | LVM thin snapshots of all PVCs (7-day retention) |
| `daily-backup.timer` | Timer | Daily 05:00 | PVC file backup, auto SQLite backup, pfSense, PVE config |
| `offsite-sync-backup.timer` | Timer | Daily 06:00 | Two-step rsync to Synology (sda + NFS via inotify) |
| `nfs-change-tracker.service` | Service | Continuous | inotifywait on `/srv/nfs` + `/srv/nfs-ssd`, logs to `/mnt/backup/.nfs-changes.log` |

## GPU Node (currently k8s-node1)
- **VMID**: 201, **PCIe**: `0000:06:00.0` (NVIDIA Tesla T4); physical passthrough, no Terraform pin
- **Taint**: `nvidia.com/gpu=true:PreferNoSchedule` (applied dynamically to every NFD-discovered GPU node)
- **Label**: `nvidia.com/gpu.present=true` (auto-applied by gpu-feature-discovery; also `feature.node.kubernetes.io/pci-10de.present=true` from NFD)
- GPU workloads need: `node_selector = { "nvidia.com/gpu.present" : "true" }` + nvidia toleration
- Taint applied via `null_resource.gpu_node_config` in `stacks/nvidia/modules/nvidia/main.tf`; node discovery keyed on the NFD `pci-10de.present` label so the taint follows the card to whichever host is carrying it
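
The scheduling requirement above (node selector plus a toleration for the GPU taint) can be sketched as a pod-spec fragment. The example uses Kubernetes manifest field names (`nodeSelector`, `tolerations`) rather than the Terraform `node_selector` syntax quoted above, and the helper names are hypothetical. The `tolerates` function is a simplified model of Kubernetes taint matching, included only to show that the fragment covers the documented taint.

```python
# Taint documented above: nvidia.com/gpu=true:PreferNoSchedule
GPU_TAINT = {"key": "nvidia.com/gpu", "value": "true", "effect": "PreferNoSchedule"}


def gpu_pod_spec_fragment():
    """Build the pod-spec fields a GPU workload would need (illustrative only)."""
    return {
        "nodeSelector": {"nvidia.com/gpu.present": "true"},
        "tolerations": [
            {
                "key": "nvidia.com/gpu",
                "operator": "Equal",
                "value": "true",
                "effect": "PreferNoSchedule",
            }
        ],
    }


def tolerates(toleration, taint):
    """Simplified taint matching: an empty effect matches any effect,
    and operator Exists with an empty key matches any taint key."""
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key")
    if operator == "Exists":
        return not key or key == taint["key"]
    return key == taint["key"] and toleration.get("value") == taint.get("value")


if __name__ == "__main__":
    fragment = gpu_pod_spec_fragment()
    print(any(tolerates(t, GPU_TAINT) for t in fragment["tolerations"]))
```

Because the taint effect is `PreferNoSchedule`, pods without the toleration can still land on the GPU node when no other node fits; the toleration only removes the scheduler's preference against it.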

164
.claude/reference/upgrade-config.json
Normal file
@@ -0,0 +1,164 @@
{
  "github_repo_overrides": {
    "ghcr.io/immich-app/immich-server": "immich-app/immich",
    "ghcr.io/immich-app/immich-machine-learning": "immich-app/immich",
    "docker.io/vaultwarden/server": "dani-garcia/vaultwarden",
    "vaultwarden/server": "dani-garcia/vaultwarden",
    "docker.io/mailserver/docker-mailserver": "docker-mailserver/docker-mailserver",
    "mailserver/docker-mailserver": "docker-mailserver/docker-mailserver",
    "docker.n8n.io/n8nio/n8n": "n8n-io/n8n",
    "headscale/headscale": "juanfont/headscale",
    "technitium/dns-server": "TechnitiumSoftware/DnsServer",
    "ghcr.io/paperless-ngx/paperless-ngx": "paperless-ngx/paperless-ngx",
    "ghcr.io/blakeblackshear/frigate": "blakeblackshear/frigate",
    "ghcr.io/dgtlmoon/changedetection.io": "dgtlmoon/changedetection.io",
    "ghcr.io/linkwarden/linkwarden": "linkwarden/linkwarden",
    "ghcr.io/open-webui/open-webui": "open-webui/open-webui",
    "ghcr.io/advplyr/audiobookshelf": "advplyr/audiobookshelf",
    "ghcr.io/browserless/chromium": "browserless/chromium",
    "ghcr.io/rybbit-io/rybbit-backend": "rybbit-io/rybbit",
    "ghcr.io/rybbit-io/rybbit-client": "rybbit-io/rybbit",
    "ghcr.io/gurucomputing/headscale-ui": "gurucomputing/headscale-ui",
    "ghcr.io/dmunozv04/isponsorblocktv": "dmunozv04/iSponsorBlockTV",
    "ghcr.io/gramps-project/grampsweb": "gramps-project/gramps-web",
    "ghcr.io/project-osrm/osrm-backend": "Project-OSRM/osrm-backend",
    "ghcr.io/flaresolverr/flaresolverr": "FlareSolverr/FlareSolverr",
    "ghcr.io/therobbiedavis/listenarr": "therobbiedavis/listenarr",
    "ghcr.io/immichframe/immichframe": "immichframe/ImmichFrame",
    "lscr.io/linuxserver/qbittorrent": "linuxserver/docker-qbittorrent",
    "lscr.io/linuxserver/lidarr": "linuxserver/docker-lidarr",
    "lscr.io/linuxserver/prowlarr": "linuxserver/docker-prowlarr",
    "lscr.io/linuxserver/readarr": "linuxserver/docker-readarr",
    "lscr.io/linuxserver/speedtest-tracker": "linuxserver/docker-speedtest-tracker",
    "privatebin/nginx-fpm-alpine": "PrivateBin/PrivateBin",
    "freshrss/freshrss": "FreshRSS/FreshRSS",
    "hackmdio/hackmd": "hackmdio/codimd",
    "onlyoffice/documentserver": "ONLYOFFICE/DocumentServer",
    "netboxcommunity/netbox": "netbox-community/netbox",
    "stirlingtools/stirling-pdf": "Stirling-Tools/Stirling-PDF",
    "phpipam/phpipam-www": "phpipam/phpipam",
    "rhasspy/wyoming-whisper": "rhasspy/wyoming-addons",
    "rhasspy/wyoming-piper": "rhasspy/wyoming-addons",
    "clickhouse/clickhouse-server": "ClickHouse/ClickHouse",
    "docker.io/athomasson2/ebook2audiobook": "athomasson2/ebook2audiobook",
    "amruthpillai/reactive-resume": "AmruthPillai/Reactive-Resume",
    "dpage/pgadmin4": "pgadmin-org/pgadmin4",
    "ghcr.io/yourok/torrserver": "YouROK/TorrServer",
    "opentripplanner/opentripplanner": "opentripplanner/OpenTripPlanner",
    "codeberg.org/forgejo/forgejo": "forgejo/forgejo",
    "shlinkio/shlink": "shlinkio/shlink",
    "shlinkio/shlink-web-client": "shlinkio/shlink-web-client",
    "dgtlmoon/sockpuppetbrowser": "dgtlmoon/sockpuppetbrowser"
  },
  "helm_chart_repo_overrides": {
    "https://charts.goauthentik.io/": "goauthentik/authentik",
    "https://traefik.github.io/charts": "traefik/traefik-helm-chart",
    "https://kyverno.github.io/kyverno/": "kyverno/kyverno",
    "https://mysql.github.io/mysql-operator/": "mysql/mysql-operator",
    "https://cloudnative-pg.github.io/charts": "cloudnative-pg/cloudnative-pg",
    "https://charts.external-secrets.io": "external-secrets/external-secrets",
    "https://metallb.github.io/metallb": "metallb/metallb",
    "https://nextcloud.github.io/helm/": "nextcloud/helm",
    "https://crowdsecurity.github.io/helm-charts": "crowdsecurity/helm-charts",
    "https://helm.releases.hashicorp.com": "hashicorp/vault-helm",
    "https://bitnami-labs.github.io/sealed-secrets": "bitnami-labs/sealed-secrets",
    "https://grafana.github.io/helm-charts": "grafana/helm-charts",
    "https://prometheus-community.github.io/helm-charts": "prometheus-community/helm-charts",
    "https://democratic-csi.github.io/charts/": "democratic-csi/democratic-csi",
    "https://stakater.github.io/stakater-charts": "stakater/Reloader",
    "https://topolvm.github.io/pvc-autoresizer": "topolvm/pvc-autoresizer",
    "https://kubernetes-sigs.github.io/descheduler/": "kubernetes-sigs/descheduler",
    "https://kubernetes-sigs.github.io/metrics-server/": "kubernetes-sigs/metrics-server",
    "https://charts.fairwinds.com/stable": "FairwindsOps/goldilocks",
    "https://helm.ngc.nvidia.com/nvidia": "NVIDIA/gpu-operator",
    "oci://ghcr.io/woodpecker-ci/helm": "woodpecker-ci/helm",
    "oci://10.0.20.10:5000/bitnamicharts": "bitnami/charts"
  },
  "db_backed_services": {
    "affine": { "type": "postgresql", "db_name": "affine", "shared": true },
    "claude-memory": { "type": "postgresql", "db_name": "claude_memory", "shared": true },
    "crowdsec": { "type": "postgresql", "db_name": "crowdsec", "shared": true },
    "dawarich": { "type": "postgresql", "db_name": "dawarich", "shared": true },
    "health": { "type": "postgresql", "db_name": "health", "shared": true },
    "linkwarden": { "type": "postgresql", "db_name": "linkwarden", "shared": true },
    "n8n": { "type": "postgresql", "db_name": "n8n", "shared": true },
    "netbox": { "type": "postgresql", "db_name": "netbox", "shared": true },
    "rybbit": { "type": "postgresql", "db_name": "rybbit", "shared": true },
    "tandoor": { "type": "postgresql", "db_name": "tandoor", "shared": true },
    "technitium": { "type": "postgresql", "db_name": "technitium", "shared": true },
    "trading-bot": { "type": "postgresql", "db_name": "trading_bot", "shared": true },
    "woodpecker": { "type": "postgresql", "db_name": "woodpecker", "shared": true },
    "immich": { "type": "postgresql", "db_name": "immich", "dedicated": true, "backup_cronjob": "postgresql-backup", "backup_namespace": "immich" },
    "authentik": { "type": "postgresql", "dedicated": true, "notes": "Uses PgBouncer, managed by Helm chart" },
    "hackmd": { "type": "mysql", "db_name": "codimd", "shared": true },
    "mailserver": { "type": "mysql", "db_name": "mailserver", "shared": true },
    "monitoring": { "type": "mysql", "db_name": "monitoring", "shared": true, "notes": "Grafana backend" },
    "nextcloud": { "type": "mysql", "db_name": "nextcloud", "shared": true },
    "onlyoffice": { "type": "mysql", "db_name": "onlyoffice", "shared": true },
    "paperless-ngx": { "type": "mysql", "db_name": "paperless_ngx", "shared": true },
    "phpipam": { "type": "mysql", "db_name": "phpipam", "shared": true },
    "real-estate-crawler": { "type": "mysql", "db_name": "wrongmove", "shared": true },
    "speedtest": { "type": "mysql", "db_name": "speedtest", "shared": true },
    "url": { "type": "mysql", "db_name": "shlink", "shared": true },
    "vault": { "type": "mysql", "db_name": "vault", "shared": true }
  },
  "backup_infrastructure": {
    "postgresql": {
      "cronjob_name": "postgresql-backup",
      "namespace": "dbaas",
      "credential_secret": "pg-cluster-superuser",
      "credential_key": "password",
      "host": "pg-cluster-rw.dbaas",
      "backup_pvc": "dbaas-postgresql-backup-host"
    },
    "mysql": {
      "cronjob_name": "mysql-backup",
      "namespace": "dbaas",
      "credential_secret": "cluster-secret",
      "credential_key": "ROOT_PASSWORD",
      "host": "mysql.dbaas",
      "backup_pvc": "dbaas-mysql-backup-host"
    }
  },
  "version_jump_always_step": [
    "authentik",
    "nextcloud",
    "immich"
  ],
  "auto_detect_rules": {
    "ghcr.io/{org}/{repo}": "Use org/repo directly, strip -server/-backend suffixes if repo 404s",
    "docker.io/{org}/{repo}": "Try org/repo on GitHub",
    "lscr.io/linuxserver/{app}": "Map to linuxserver/docker-{app}",
    "quay.io/{org}/{repo}": "Try org/repo on GitHub",
    "registry.gitlab.com/{org}/{repo}": "Try org/repo on GitHub (may be GitLab-only)"
  },
  "skip_image_patterns": [
    "viktorbarzin/*",
    "registry.viktorbarzin.me/*",
    "ancamilea/*",
    "mghee/*",
    "*postgres*",
    "*mysql*",
    "*redis*",
    "*clickhouse*",
    "*etcd*",
    "registry.k8s.io/*",
    "quay.io/tigera/*",
    "quay.io/metallb/*",
    "nvcr.io/*",
    "reg.kyverno.io/*"
  ],
  "breaking_change_keywords": [
    "breaking",
    "BREAKING",
    "migration required",
    "schema change",
    "database migration",
    "manual intervention",
    "action required",
    "removed",
    "deprecated",
    "renamed",
    "incompatible"
  ]
}
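
The config above does not say how it is consumed, so the following is only a sketch of one plausible reading: check `skip_image_patterns` first, then `github_repo_overrides`, then fall back to the `auto_detect_rules` prefixes, and scan release notes for `breaking_change_keywords`. All function names are hypothetical. `CONFIG` is a small excerpt of the file rather than the full content, and the case-sensitive keyword match is an assumption based on the list containing both `breaking` and `BREAKING`.

```python
import fnmatch

# Excerpt of upgrade-config.json, inlined so the sketch is self-contained.
CONFIG = {
    "github_repo_overrides": {
        "ghcr.io/immich-app/immich-server": "immich-app/immich",
        "headscale/headscale": "juanfont/headscale",
    },
    "skip_image_patterns": ["viktorbarzin/*", "*postgres*", "registry.k8s.io/*"],
    "breaking_change_keywords": ["breaking", "BREAKING", "migration required", "deprecated"],
}

# Registries whose {org}/{repo} path is tried directly on GitHub (per auto_detect_rules).
DIRECT_REGISTRIES = ("ghcr.io", "docker.io", "quay.io", "registry.gitlab.com")


def strip_tag(image):
    """Drop a trailing @digest or :tag, leaving any registry port intact."""
    image = image.split("@", 1)[0]
    if ":" in image.rsplit("/", 1)[-1]:
        image = image[: image.rfind(":")]
    return image


def should_skip(image, config):
    """Return True if the image name matches any skip pattern (glob-style)."""
    name = strip_tag(image)
    return any(fnmatch.fnmatchcase(name, p) for p in config["skip_image_patterns"])


def resolve_repo(image, config):
    """Map an image reference to a GitHub org/repo, or None if no rule applies."""
    name = strip_tag(image)
    override = config["github_repo_overrides"].get(name)
    if override:
        return override
    parts = name.split("/")
    if len(parts) == 3 and parts[0] == "lscr.io" and parts[1] == "linuxserver":
        return f"linuxserver/docker-{parts[2]}"
    if len(parts) == 3 and parts[0] in DIRECT_REGISTRIES:
        return f"{parts[1]}/{parts[2]}"
    return None


def breaking_hits(release_notes, config):
    """Return the configured keywords found in the release notes (case-sensitive)."""
    return [kw for kw in config["breaking_change_keywords"] if kw in release_notes]


if __name__ == "__main__":
    print(resolve_repo("lscr.io/linuxserver/lidarr:latest", CONFIG))
    print(should_skip("docker.io/library/postgres:16", CONFIG))
    print(breaking_hits("BREAKING: migration required before upgrade", CONFIG))
```

Note that glob patterns such as `*postgres*` match across `/` here, so they also catch fully qualified names like `docker.io/library/postgres`. The "strip -server/-backend suffixes if repo 404s" fallback is not modelled, since it would require a network lookup.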