From 5c42155b81342732badcfce4b549840d8f31100d Mon Sep 17 00:00:00 2001
From: Viktor Barzin
Date: Fri, 3 Jul 2026 12:17:45 +0000
Subject: [PATCH] docs: Valia-sites domain language + ADR-0018 (off-infra Pages, in-cluster sync)

Grill session with Viktor: his mother Valia will keep asking for 1-page
site hosting, so the pattern is being made repeatable. Decisions: all
Valia sites serve off-infra on Cloudflare Pages (survive homelab
outages); one shared in-cluster CronJob mirrors her Drive folders every
10 min and redeploys on change; English subdomain names picked by
Viktor; failed-Job-only visibility; stem95su migrates onto the pattern.
CONTEXT.md gains Valia site / Content folder / Entry file; full
rationale and rejected options in ADR-0018.

Co-Authored-By: Claude Fable 5
---
 CONTEXT.md                                    | 15 ++++++
 ...a-sites-off-infra-pages-in-cluster-sync.md | 47 +++++++++++++++++++
 2 files changed, 62 insertions(+)
 create mode 100644 docs/adr/0018-valia-sites-off-infra-pages-in-cluster-sync.md

diff --git a/CONTEXT.md b/CONTEXT.md
index 5cd34ebc..76b101d0 100644
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -237,6 +237,20 @@ _Avoid_: expecting Diun to deploy; conflating with **Keel**.
 **Anubis**:
 A PoW reverse-proxy issuing a 30-day JWT cookie, used in front of public content-bearing sites without app-level auth (blog, wiki, landing pages). Never in front of Git, WebDAV, CalDAV, or API endpoints (clients can't solve PoW).
 
+### Externally-authored sites
+
+**Valia site**:
+A small public static site authored by Valia (Viktor's mother, external to the infra) and hosted for her under `.viktorbarzin.me`. Its source of truth is a **Content folder** she owns; the live site is a mirror of that folder, fresh within ~10 minutes. Hosted **off-infra** (Cloudflare Pages) by decision: a homelab outage freezes content but never takes her sites down. Viktor picks the English subdomain name per site at registration (her folder names stay Bulgarian). Current instances: `stem95su`, `bridge`.
+_Avoid_: "school site" (the family may grow beyond school projects); treating the deployed copy as editable — edits land only in the **Content folder**.
+
+**Content folder**:
+The Google Drive folder (or subfolder) Valia shares with `vbarzin@gmail.com` holding one **Valia site**'s files. Strictly read-only from the infra side — nothing ever writes back to her Drive. Empty or half-uploaded folder states must never wipe a live site.
+_Avoid_: syncing a folder root when the servable content lives in a subfolder (stem95su serves `stem claude/files/`, not the folder root).
+
+**Entry file**:
+The HTML file a **Valia site** serves at `/`. Defaults to `index.html`; per-site override when she names it differently (stem95su: `stem_board.html`). The override is a registration-time setting, not a constraint on her authoring.
+_Avoid_: asking Valia to rename her files to fit hosting conventions.
+
 ## Relationships
 
 - A **Service** is defined by exactly one **Stack** — **flat** or wrapping a **Stack-local module** — which sources zero or more shared **Factory modules** and resolves to one or more K8s workloads.
@@ -248,6 +262,7 @@ A PoW reverse-proxy issuing a 30-day JWT cookie, used in front of public content
 - A **Service**'s image reaches the cluster via **Woodpecker deploy** (push-driven, on commit) or **Keel** (poll-driven, on a new registry tag); **Diun** only notifies. Operator-managed StatefulSets are rolled by neither.
 - An owned **Service**'s image is built by GitHub Actions from the **Canonical repo**'s **GitHub mirror** and hosted on ghcr.io (ADR-0002); the **Forgejo registry** keeps only a frozen last-known-good tag per **Service**.
 - Tier-1 **State tier** state and ~12 app databases share one **CNPG** `pg-cluster`, reached through **PgBouncer**; their credentials rotate via the `vault-database` store.
+- A **Valia site** mirrors exactly one **Content folder** and serves exactly one **Entry file** at `/`; the folder is hers, the subdomain name is Viktor's, the hosting is off-infra.
 
 ## Example dialogue
 
diff --git a/docs/adr/0018-valia-sites-off-infra-pages-in-cluster-sync.md b/docs/adr/0018-valia-sites-off-infra-pages-in-cluster-sync.md
new file mode 100644
index 00000000..5344382a
--- /dev/null
+++ b/docs/adr/0018-valia-sites-off-infra-pages-in-cluster-sync.md
@@ -0,0 +1,47 @@
+# Valia sites are served off-infra (Cloudflare Pages), synced in-cluster
+
+Valia (Viktor's mother) authors small one-page static sites in Google Drive folders she
+shares, and keeps asking for them to be hosted — two exist already (`stem95su`, `bridge`)
+and more are expected. We decided all **Valia sites** are served **off-infra on Cloudflare
+Pages** under `.viktorbarzin.me`, kept fresh by **one shared in-cluster
+CronJob** (`stacks/valia-sites/`) that mirrors each **Content folder** every 10 minutes
+(rclone, drive.readonly) and re-deploys only on change (wrangler direct upload). The
+existing in-cluster `stem95su` serving stack (nginx + NFS + ingress + per-site sync)
+migrates onto this and is retired.
+
+Why off-infra serving: these are her sites, shown to teachers/parents — they must survive
+homelab outages (cf. the 2026-06-27 egress incident that took every proxied in-cluster
+site down). With Pages, a homelab outage degrades to "content frozen until we're back",
+never "site down". Serving costs no cluster resources and no per-site nginx/PVC/ingress/
+Anubis. Why the syncer stays in-cluster anyway: secrets stay in Vault (no per-site GHA
+secret sprawl), and the stem95su guard patterns (hard-fail on Drive auth errors, never
+wipe a live site on an empty/partial folder, capped deletes) carry over wholesale. The
+deliberate asymmetry — off-infra serving, on-infra syncing — is the point, not an
+accident.
+
+## Considered options
+
+- **In-cluster everywhere** (generalise stem95su into a factory module): one roof, no
+  Cloudflare Pages dependency — but her sites share the homelab's fate and each site
+  spends cluster resources to serve static files a free CDN serves better.
+- **Pages for new sites only**: less work now, two patterns and two runbooks forever.
+- **GHA-scheduled sync** (fully off-infra pipeline): no cluster dependency at all, but
+  Drive + Cloudflare credentials would live as GitHub secrets per repo, outside Vault.
+
+## Consequences
+
+- Registration is one entry in the `sites` map (name, Content folder, optional Entry
+  file); CI applies Pages project, custom domain, public CNAME, and internal-DNS config
+  together. Names are English, picked by Viktor (most → bridge set the precedent).
+- The internal split-horizon zone learns Valia sites from a ConfigMap the
+  `technitium-ingress-dns-sync` script consumes — declaratively, including **removal**
+  (the previous static-CNAME approach was add-only; a retired site left a stale record).
+- Deploy-on-change is mandatory, not an optimisation: Pages caps monthly deployments on
+  the free tier, and a 10-minute cadence would burn ~4,300/month if unchanged runs
+  deployed.
+- Failure visibility is **failed-Job-only** by explicit choice (no stale-sync alert, no
+  per-site uptime monitors, no notifications to Valia) — Viktor fields "it didn't
+  update" reports, consistent with the alert-noise-reduction posture. Revisit if a
+  silent stall actually bites.
+- If the homelab is down, content updates pause; the sites keep serving last-deployed
+  content. Accepted degradation.
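
Reviewer sketch (not part of the patch): the guard patterns and the deploy-on-change rule in ADR-0018 could be modelled roughly as below, together with the cadence arithmetic behind the ~4,300/month figure. Every name here (`plan_sync`, `Decision`, `runs_per_month`, `max_deletes`) is hypothetical, the delete cap of 5 is an arbitrary placeholder, and treating a missing Entry file as a partial upload is an assumption. The real syncer uses rclone and wrangler and may decide differently.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


def runs_per_month(interval_minutes: int, days: int = 30) -> int:
    """CronJob runs in a month of `days` days at the given cadence."""
    return (MINUTES_PER_DAY // interval_minutes) * days


@dataclass(frozen=True)
class Decision:
    deploy: bool
    reason: str


def plan_sync(
    live: dict[str, str],
    incoming: dict[str, str],
    entry_file: str = "index.html",
    max_deletes: int = 5,  # placeholder; the ADR only says deletes are capped
) -> Decision:
    """Decide whether a freshly mirrored Content folder should be deployed.

    `live` and `incoming` map relative file paths to content hashes for the
    last-deployed copy and the new mirror respectively (assumed shape).
    """
    # Guard: an empty folder must never wipe a live site.
    if not incoming:
        return Decision(False, "empty folder")
    # Guard (assumption): no Entry file means the upload is still partial.
    if entry_file not in incoming:
        return Decision(False, "entry file missing")
    # Guard: refuse runs that would delete more files than the cap allows.
    deleted = set(live) - set(incoming)
    if len(deleted) > max_deletes:
        return Decision(False, "delete cap exceeded")
    # Deploy-on-change: identical content spends no Pages deployment.
    if incoming == live:
        return Decision(False, "unchanged")
    return Decision(True, "changed")


if __name__ == "__main__":
    print(runs_per_month(10))
    print(plan_sync({"index.html": "a1"}, {"index.html": "b2"}))
```

Returning a reason alongside the boolean keeps skipped runs explainable in Job logs, which matters when failed Jobs are the only visibility channel. A Drive auth error would be handled earlier, by failing the Job outright rather than returning a skip.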