Swap the single "Returns over time windows" table (panel 9201) for 5 stat
panels (1d/7d/30d/90d/12mo), each showing Return % (Modified-Dietz) as the
headline value + Δ market (£, net of contributions) as a second value,
colored red/green by sign.
Same per-window Modified-Dietz math as the old table, just scoped to one
interval per panel — verified against live wealthfolio_sync PG and reproduced
through Grafana's datasource API (e.g. 30d = 8.15% / £86,875, 12mo = 38.68% /
£297,846, matching the prior table exactly). Kept the same 24×8 grid footprint
so nothing else on the dashboard reflows.
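For reference, a minimal Python sketch of the per-window Modified-Dietz calculation. The real panels compute this in SQL against PG; the function name, argument names and figures below are hypothetical and only illustrate the standard formula (gain net of contributions, divided by start value plus time-weighted flows).

```python
from datetime import date


def modified_dietz(v_start, v_end, flows, start, end):
    """Return (return_fraction, market_gain) for one window.

    flows is a list of (date, amount) external cash flows inside the
    window; contributions are positive, withdrawals negative.
    """
    total_days = (end - start).days
    net_flow = sum(amount for _, amount in flows)
    # Each flow is weighted by the fraction of the window it was invested for.
    weighted = sum(
        amount * (total_days - (d - start).days) / total_days
        for d, amount in flows
    )
    gain = v_end - v_start - net_flow  # market delta, net of contributions
    return gain / (v_start + weighted), gain


# Hypothetical figures, not taken from the dashboard.
ret, gain = modified_dietz(
    v_start=1000.0,
    v_end=1200.0,
    flows=[(date(2026, 1, 16), 100.0)],
    start=date(2026, 1, 1),
    end=date(2026, 1, 31),
)
```

Each stat panel would evaluate this once for its own interval, with `ret` as the headline value and `gain` as the second value.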
Already applied via targeted `tg apply` of the wealth.json configmap; [ci skip]
because a full monitoring-stack CI apply would pull in unrelated pre-existing
drift.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Explicitly own the keel.sh/policy annotation in TF (was relying on the
Kyverno-stamped `patch` default). Set policy=all + trigger=poll +
pollSchedule, expand ignore_changes per KEEL_LIFECYCLE_V1 to cover
Keel-written runtime annotations (change-cause, update-time, revision,
match-tag).
The agent acts only on newly-created todos; the Updated listener re-fired on
every edit (incl. the agent's own note-append). Live Updated webhook (id=2)
already deleted via OCS API.
Surfaced while installing the nextcloud-todos-api plugin (a pod roll):
- 2026.5.4 gateway rejects an openai-codex `agentRuntime` key it writes itself
(commit 4b39cb72) -> crashloop on any restart. Pinned image back to 2026.2.26.
- startup steps (plugins enable / mcp set / memory index) backgrounded +
timeout-guarded so a hung npm-install can never block the gateway.
- install-nextcloud-todos-plugin init-container SHA-pinned (:f85c6de1) +
imagePullPolicy Always: IfNotPresent served a stale cached :latest, so the
plugin manifest (configSchema) fix never landed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Frees a per-node SCSI-LUN slot on node6 (20->19, below the check #47 >=20 WARN
threshold). FreshRSS extensions are static plugin files (no embedded DB; app DB is
external MySQL) -> NFS-safe. Empty volume (re-installable). Applied
deadlock-safe: -target deployment+module first (Recreate releases old PVC),
then full apply destroys the now-unused proxmox PVC.
Frees a per-node SCSI-LUN slot on node6 (21->20). Volume holds only
config.json (no embedded DB) -> NFS-safe. config pre-seeded to
/srv/nfs/isponsorblocktv before cutover. RWO-destroy deadlock during apply
(TF deleting the in-use old PVC before rolling the deployment) was broken by
patching the deployment claim to the NFS PVC; TF reconciled to the same value.
vabbit81's app lives in novelapp, but the dashboard injects his SA token
(system:serviceaccount:vabbit81:dashboard-vabbit81), while the existing
binding only granted the OIDC User vabbit81@gmail.com (OIDC blocked). Add the
SA as a second subject so the web dashboard (token-injector) can manage
novelapp. Verified: SA can list/create in novelapp; injector path returns 200.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
namespace-owners could read all tenants' pods/configmaps/etc cluster-wide
(read-only) via the broad namespace_owner_readonly role. Give the dashboard
SAs a dedicated dashboard-nav-readonly ClusterRole = namespaces + nodes (list)
only — enough for the dashboard namespace-picker/Nodes view, but no
cross-tenant resource reads. Own-namespace access (admin) unchanged. Verified:
gheorghe can list namespaces/nodes + full vabbit81, but list pods/configmaps -A
= no, other namespaces = no.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
monitoring-quota requests.memory sat at 89% (18.2/20Gi), tripping the
ResourceQuota>80% WARN. Root cause was over-provisioned requests, not real
usage: loki requested 3Gi but its VPA upperBound is 364Mi and actual ~315Mi.
prometheus's 4Gi is legitimately required (2Gi tmpfs WAL shares the cgroup;
OOMs at 3Gi during WAL replay) so it stays; grafana's main container is
already 512Mi. Trimmed loki to 1Gi request (~3x its observed ceiling; 4Gi
Burstable limit preserves query-spike headroom) -> quota 78.8%, clears the
WARN. NOTE: alloy DaemonSet (562Mi/node) grows with node count, so revisit
(bump the 20Gi quota) as the cluster expands.
The prometheus-backup sidecar runs monthly on the 1st SUNDAY 04:00 UTC.
Consecutive first Sundays can be 35 days apart (e.g. May 3 -> Jun 7), but
the alert threshold was 32d (2764800s) -> it false-fired every year for the
~3 days between day-32 and the next run. Raised to 40d (3456000s): clears
the max first-Sunday spacing with margin, still catches a genuinely missed
monthly backup. Backup itself is healthy (last May 3, next Jun 7). Verified:
live rule now > 3.456e6, alert state inactive.
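The spacing argument can be checked with a short Python sketch. The year 2026 is an assumption inferred from the May 3 / Jun 7 dates; the helper names are illustrative and are not part of the alert rule.

```python
from datetime import date, timedelta


def first_sunday(year, month):
    d = date(year, month, 1)
    # weekday(): Monday=0 ... Sunday=6
    return d + timedelta(days=(6 - d.weekday()) % 7)


def first_sunday_gaps(year):
    """Return (previous_run, next_run, gap_days) for each monthly run."""
    runs = [first_sunday(year, m) for m in range(1, 13)]
    runs.append(first_sunday(year + 1, 1))
    return [(a, b, (b - a).days) for a, b in zip(runs, runs[1:])]


OLD_THRESHOLD_S = 32 * 86400  # 2764800
NEW_THRESHOLD_S = 40 * 86400  # 3456000

gaps = first_sunday_gaps(2026)  # assumed year
max_gap_days = max(g for _, _, g in gaps)
# Gaps that outlast the threshold are the ones that false-fire.
false_fires_old = [(a, b) for a, b, g in gaps if g * 86400 > OLD_THRESHOLD_S]
false_fires_new = [(a, b) for a, b, g in gaps if g * 86400 > NEW_THRESHOLD_S]
```

With the 32d threshold every 35-day gap lands in `false_fires_old`; with 40d the list is empty, while a genuinely skipped month (56 days or more between runs) would still exceed it.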
Elevates the shared claude-agent-service pod (SA claude-agent, ns
claude-agent) so the nextcloud-todos-exec agent can run autonomously.
Viktor explicitly chose to elevate the SHARED service knowing every
agent on the pod inherits these creds — each grant is security-sensitive
and flagged inline for review.
Vault (stacks/vault/main.tf):
- terraform-state k8s-auth role: add `claude-agent` to
bound_service_account_names (was only `default` — the pod's own SA
token could not log in, so scripts/tg apply died fetching the PG
backend password). `default` kept.
- terraform-state policy broadened from `database/static-creds/pg-terraform-state`
read only to read on database/static-creds/*, database/creds/*,
secret/data/* and secret/metadata/* — what stacks read at plan/apply
time. FLAG: grants the shared pod broad Vault READ (effectively all app
secrets + rotating DB creds); secret/data/vault is not explicitly denied.
claude-agent-service stack (stacks/claude-agent-service/main.tf):
- ExternalSecret: add FORGEJO_TOKEN (secret/ci/global -> forgejo_push_token,
viktor-scoped admin PAT) and HA_MCP_URL (secret/openclaw -> ha_sofia_mcp_url).
- git-init: add url.insteadOf rewrite to authenticate git pushes to
forgejo.viktorbarzin.me with $FORGEJO_TOKEN (PRs opened via Forgejo API).
- New claude-agent-exec ClusterRole+Binding: cluster-wide
get/list/watch/create/update/patch/delete on core (incl. secrets),
apps, batch, networking.k8s.io, rbac roles/rolebindings. Additive to the
existing read-only claude-agent role; does NOT bind cluster-admin. FLAG:
very broad — close to cluster-admin in blast radius.
- Vault login: VAULT_ADDR + VAULT_K8S_ROLE env + vault-token-refresher
sidecar (k8s-auth login role=terraform-state every 30m -> shared
emptyDir); main container symlinks ~/.vault-token so scripts/tg auto-auths.
- MCP: project-scoped .mcp.json at infra repo root wires `ha` (HTTP,
${HA_MCP_URL}) and `paperless` (in-cluster Service, no token in-cluster).
Not applied, not pushed — code only, for human review of the privilege grants.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The monthly wealthfolio-sync CronJob mounts the same RWO
wealthfolio-data-encrypted volume (shared wealthfolio.db SQLite) as the
always-running wealthfolio app Deployment. RWO attaches to only one node,
but the sync had no affinity — so the 2026-06-01 run landed on node4 while
the app held the volume on node3 and hung in ContainerCreating for 3 days
(FailedAttachVolume / Multi-Attach), surfacing as a problematic_pods WARN.
Add a required podAffinity (app=wealthfolio, topologyKey hostname) so the
sync always schedules onto the app's node, where the volume is already
attached (RWO permits multiple pods on the same node). Verified: a fresh
sync run co-located on node3, attached cleanly, and broker-sync started.
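As a sketch of the shape of that rule, the pod-spec fragment can be rendered in Python as below. The stack itself is Terraform, so this is only an illustration; `kubernetes.io/hostname` is assumed to be what "topologyKey hostname" refers to, and the helper name is hypothetical.

```python
import json


def required_colocation(app_label, topology_key="kubernetes.io/hostname"):
    """Build a required podAffinity that pins a pod to the node already
    running pods labelled app=<app_label>."""
    return {
        "podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # Assumed topology key: one topology domain per node.
                    "topologyKey": topology_key,
                }
            ]
        }
    }


affinity = required_colocation("wealthfolio")
print(json.dumps(affinity, indent=2))
```

Because the term is required rather than preferred, the sync pod stays Pending if no app pod is running, instead of landing on a node where the RWO volume cannot attach.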
Namespace-owners (e.g. gheorghe) were blocked at forward-auth — k8s.viktorbarzin.me
was Home-Server-Admins-only. Carve-out: the dashboard host now also admits
kubernetes-admins/power-users/namespace-owners so they can reach the login page;
per-namespace access is still enforced by the pasted SA token (dashboard-sa.tf).
All other admin-only hosts unchanged. Policy adopted from UI into TF via import.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pragmatic dashboard access while OIDC SSO is blocked: each namespace-owner
(from k8s_users) gets a ServiceAccount scoped to admin on their namespace(s)
+ cluster read-only, plus a long-lived token to paste into the dashboard
'Token' login. Real per-namespace isolation, no apiserver-OIDC dependency.
Verified: vabbit81 SA = admin in vabbit81, read-only elsewhere, no cross-ns write.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The chrome-service stack ran `playwright launch-server`, which creates
ephemeral browser contexts per `connect()`. Despite the encrypted PVC
mounted at /profile, no chromium user-data ever persisted — only npm
cache + fontconfig. Logging in via noVNC was effectively a no-op.
Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal),
fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host
header to bypass Chrome's hardcoded DNS-rebinding protection (no
`--remote-allow-hosts` flag exists in stock Chrome 130; verified by
binary string grep). Bridge also forces Connection: close on HTTP
responses so Node ws opens a fresh TCP for the WS upgrade rather than
trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage
actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves
GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot,
bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via
CDP, dumps storage_state() (cookies + localStorage), writes atomically
to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.
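A minimal sketch of the atomic-write step the harvester relies on, assuming the usual temp-file-then-rename pattern; the function name is hypothetical and the actual harvester code may differ.

```python
import json
import os
import tempfile


def write_json_atomically(path, payload):
    """Write payload as JSON so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target so
    # the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomically swaps in the new snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A caller would pass the dict returned by `storage_state()` and the snapshot path, so the snapshot-server sidecar only ever serves a complete previous or complete new file.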
Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`,
env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no
longer used by code; ExternalSecret kept for symmetry with the snapshot
endpoint).
Dev-box side (out of scope for this commit — see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...`
so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the
snapshot via the bearer-gated HTTPS endpoint.
Docs updated:
- docs/architecture/chrome-service.md — new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md — day-2 ops (refresh, rotation,
failure modes, restore).
- stacks/chrome-service/README.md — connect_over_cdp recipe.
Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.
Dashboard back to the working forward-auth + kong-proxy state. The
oauth2-proxy SSO path is blocked by a deeper issue: the apiserver rejects
ALL valid Authentik OIDC tokens (both legacy --oidc-* flags and structured
AuthenticationConfiguration), despite verified signature, issuer, audience,
email_verified, synced clock, and reachable+trusted JWKS. Needs dedicated
apiserver-OIDC investigation. oauth2-proxy + k8s-dashboard Authentik app
left deployed (idle, harmless) pending that.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase 4 infrastructure-as-code for the nextcloud-todos service (watches the
Nextcloud Personal task list; classifies todos via local qwen3-8b and routes
research/mutating work through claude-agent-service). Clones the
recruiter-responder service pattern end-to-end. Written only — NOT applied.
- stacks/nextcloud-todos/{main.tf,terragrunt.hcl}: new aux stack cloning
recruiter-responder — ns (tier aux, istio-injection disabled, keel enrolled),
two ExternalSecrets (vault-kv app secrets + vault-database DSN), Recreate
deployment with alembic-migrate init-container, ClusterIP svc, /cb-only
HMAC-gated ingress (auth=none, proxied), and an idempotent webhook-register
null_resource (OCS webhook_listeners API, both CalendarObject Created/Updated
events -> internal svc URL, Bearer auth).
- stacks/vault/main.tf: pg_nextcloud_todos static role (nextcloud_todos, 7d
rotation) + pg-nextcloud-todos in the postgresql allowed_roles array.
- stacks/dbaas/modules/dbaas/main.tf: pg_nextcloud_todos_db null_resource
(clone of pg_tripit_db) — creates role+DB, pins role search_path, and
creates schema nextcloud_todos AUTHORIZATION nextcloud_todos.
- stacks/openclaw/main.tf: install-nextcloud-todos-plugin init-container,
nextcloud-todos-api in plugins.allow + the doctor-fix re-add + plugins
enable, NEXTCLOUD_TODOS_URL/NEXTCLOUD_TODOS_TOKEN env, and the cross-path
ESO key (secret/nextcloud-todos.webhook_bearer_token).
- stacks/uptime-kuma/modules/uptime-kuma/main.tf: internal /healthz HTTP
monitor. Prometheus /metrics scrape via svc annotations in the new stack.
- .gitleaksignore: allowlist two curl-auth-user false positives (the OCS
webhook curl uses a Vault-sourced shell var, not a literal credential).
KV seed (secret/nextcloud-todos) + applies are deferred to the apply runbook.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add cp lines in the seed-beads-agent init-container so the two new
nextcloud-todos agent definitions (baked into the image at
/usr/share/agent-seed/ by the claude-agent-service Dockerfile) land in
~/.claude/agents/ at pod start. Phase 3, task 3.3.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The grabber + bp-spender shared a 1Gi proxmox-lvm RWO PVC holding two
plain-text files (mam_id cookie + grabbed_ids.txt dedup list — no embedded
DB). On 2026-06-04 a grabber pod wedged in ContainerCreating for >1h because
its proxmox-lvm disk couldn't hot-plug onto a SCSI-LUN-saturated node VM
(k8s-node3, QEMU `query-pci` QMP timeout); `concurrencyPolicy: Forbid` then
blocked every run → 0 grabs → MAMFarmingStuck.
NFS (nfs_volume module, matching the other 9 servarr apps) removes this
volume from the per-VM SCSI hotplug path entirely: it mounts over the
network, consumes zero LUN slots, and is RWX so the grabber + bp-spender can
co-schedule on any node. Data (mam_id + grabbed_ids.txt) was copied across
before the switch; verified a grabber run Succeeds on NFS on node4 with the
preserved dedup list (tracked IDs carried over). Lever #1 from
docs/architecture/storage.md "Per-VM SCSI-LUN cap".
The apiserver rejects the email username-claim when email_verified is false
(invalid bearer token 401). Authentik external/social users are unverified,
so the default scope-email mapping fails. Mirror the proven kubernetes
provider: use the custom 'Kubernetes Email (verified)' mapping (hardcodes
email_verified=true) + 'Kubernetes Groups'. Drop the now-unneeded dual-aud
mapping (apiserver trusts the k8s-dashboard issuer w/ audience=client_id) and
align oauth2-proxy scope to 'openid email profile groups'.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Provider had signing_key=null → Authentik signed id_tokens with HS256 and
served an empty JWKS, so oauth2-proxy (and the apiserver) failed signature
verification (500 'failed to verify id token signature' on the callback).
Use the same 'authentik Self-signed Certificate' keypair the kubernetes
provider uses.
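A quick way to confirm which algorithm an id_token was signed with is to decode its header. The sketch below uses only the standard library and a fabricated sample token; it inspects the header and does not verify the signature.

```python
import base64
import json


def jwt_header(token):
    """Decode the (unverified) header segment of a compact JWT."""
    header_b64 = token.split(".")[0]
    # JWT segments are base64url without padding; restore it first.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64(obj):
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fabricated token for illustration only, not a real Authentik token.
sample = ".".join([_b64({"alg": "HS256", "typ": "JWT"}), _b64({"sub": "x"}), "sig"])
alg = jwt_header(sample)["alg"]
```

An `alg` of HS256 means the token is signed with a shared secret, which a verifier cannot check against public keys from the JWKS; after assigning a keypair the header would be expected to show an asymmetric algorithm such as RS256.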
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Authentik group policy denied admins: it gated on kubernetes-* group
membership, but cluster access is email-based RBAC (User bindings from
k8s_users), not group-based. vbarzin@gmail.com (Home Server Admins) gets
cluster-admin via oidc-admin-vbarzin but isn't in any kubernetes-* group,
so the gate locked him out. Apiserver RBAC is now the sole gate — matching
the kubelogin CLI (authenticate freely, RBAC decides actions).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Dashboard now authenticates via Authentik (oauth2-proxy, k8s-dashboard
issuer) and applies each user's own RBAC via the apiserver multi-issuer
AuthenticationConfiguration. Committed so CI converges (uncommitted local
applies were being reverted by the Woodpecker terragrunt-apply pipeline).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the legacy single --oidc-* flags (which kubeadm v1.34 had wiped,
silently disabling apiserver OIDC) with an apiserver.config.k8s.io/v1
AuthenticationConfiguration trusting BOTH the kubernetes (CLI) and
k8s-dashboard (oauth2-proxy) issuers. Enables per-user RBAC for the
dashboard via SSO while keeping the CLI issuer working. Remote script
health-gates /livez and auto-rolls-back on failure (single master).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds outbound mail for linked-email verification: EMAIL_PROVIDER=smtp + SMTP_*
app env (submits via the cluster mailserver as spam@, relayed by Brevo),
SMTP_PASSWORD mapped to the existing PLANS_IMAP_PASSWORD (no new secret), and a
token-gated /api/emails/confirm ingress carve-out (auth=none, like the calendar
feed). Applied locally via scripts/tg.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2 replicas in kubernetes-dashboard ns; OIDC code-flow against the
k8s-dashboard Authentik client, injects user id_token as Bearer upstream
to kong-proxy. ESO syncs client/cookie secrets from Vault. Ingress still
points at kong-proxy — no user-facing change yet.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A hand-created (non-TF) uptime-kuma monitor "Traefik LoadBalancer" (id=95)
port-checked 10.0.20.200:443 — the shared LB IP Traefik moved OFF on
2026-05-30 when it took its dedicated .203 (ETP=Local). It had been DOWN
for ~5 days, surfacing as the cluster-health "uptime_kuma internal down(1)"
WARN.
Add it to local.internal_monitors as "Traefik LoadBalancer (10.0.20.203)"
(port 10.0.20.203:443) so it's managed like the TP-Link/Proxmox direct-IP
probes — a direct check of the MetalLB L2 + Traefik bind, complementing the
[External] traefik (full CF path) and Traefik Dashboard (in-cluster)
monitors. The sync CronJob created it (id=902, reporting UP @1ms); the
orphan id=95 was deleted via the uptime-kuma API.
Context (smart) search latency was caused by the 665MB vchord clip_index
decaying out of PG shared_buffers (~33% resident -> ~1.8s cold ANN reads vs
~4ms warm), NOT by yesterday's ML MODEL_TTL/clip-keepalive change (CLIP textual
is warm ~15ms on GPU). The postStart prewarm runs once at pod start and
pg_prewarm.autoprewarm only re-warms at startup, so the index decays under job
buffer-pressure over days.
- clip-index-prewarm CronJob (immich, */5): pg_prewarm('clip_index') keeps the
whole index resident -> searches stay ~4ms.
- immich-search-probe CronJob (immich, */5): times a random-vector ANN query +
reads clip_index residency, pushes gauges to the Pushgateway.
- Prometheus alerts ImmichSmartSearchSlow / ImmichClipIndexColdCache /
ImmichSearchProbeStale (+ inhibition when the probe is stale).
- cluster_healthcheck.sh check #46 check_immich_search (TOTAL_CHECKS 45->46).
- Docs: infra CLAUDE.md immich note, monitoring.md, cluster-health skill.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
travel-agent's transport-to-airport + weather-brief workflows now run inside
tripit (DB-driven instead of CalDAV), so the standalone CronJob stack is
retired (namespace + ExternalSecret + 2 CronJobs destroyed via scripts/tg).
secret/travel-agent left in Vault as an archive. Applied locally.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Merge travel-agent's two workflows into tripit (beads code-muqi): adds
tripit-transport-nudge (08:00) + tripit-weather-brief (21:00) CronJobs on
Europe/London, an optional per-job timezone, and SLACK_BOT_TOKEN +
DAWARICH_API_KEY in the tripit-secrets ExternalSecret. Nudges post to Slack
#travel + Web Push; Dawarich location via the public host (in-cluster *.svc
is 403'd by Rails host-auth). Vault secret/tripit seeded from
secret/travel-agent + secret/owntracks. Applied locally via scripts/tg.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In-cluster pods resolved forgejo.viktorbarzin.me to the public IP
(176.12.22.76) and hairpinned out through the WAN gateway, intermittently
timing out buildkit pushes from Woodpecker build pods (which, unlike
kubelet, don't use the per-node containerd Forgejo mirror). This silently
failed CI build-and-push for Forgejo-hosted repos (recruiter-responder
pipelines #15-#18 at the push step).
Add a CoreDNS `rewrite name exact forgejo.viktorbarzin.me
traefik.traefik.svc.cluster.local` so pods resolve to the Traefik ClusterIP
(reachable in-cluster, unlike the ETP=Local LB .203; the Service-name target
auto-tracks the ClusterIP so it can't rot on a Traefik renumber). Traefik's
*.viktorbarzin.me wildcard keeps SNI/TLS valid. Makes the per-pod
woodpecker-server hostAlias belt-and-suspenders.
Applied via targeted apply (coredns ConfigMap only, to avoid reconciling 7
unrelated pre-existing drifts in the stack) + verified:
- pod resolves forgejo.viktorbarzin.me -> 10.111.111.95 (Traefik ClusterIP)
- recruiter-responder pipeline #20 build-and-push succeeds via ClusterIP
Docs: networking.md (K8s cluster DNS path) + .claude/CLAUDE.md (forgejo
registry quick-ref). Advances beads code-yh33.
[ci skip]
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
terragrunt generates backend.tf per run (remote_state generate,
if_exists=overwrite_terragrunt) from get_env("PG_CONN_STR"); these 72 committed
copies are stale artifacts already covered by .gitignore:65. They held a
plaintext (Vault-rotated, ~expired) PG password + the .200 state-backend literal
and were re-committed by CI on every run. git rm --cached stops that; they
regenerate locally from PG_CONN_STR. The live .200:5432 literal now lives only
in scripts/tg (its single bootstrap source).
Part of the L4 LB-IP review (docs/plans/2026-06-03-lb-ip-hygiene-design.md).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Part of the L4 LB-IP review (docs/plans/2026-06-03-lb-ip-hygiene-design.md).
The 2026-05-30 Traefik .200->.203 move left consumers pointing at the dead
.200; this fixes the two in-Terraform ones and replaces the stale networking
doc with an accurate registry + a renumber checklist.
- woodpecker: forgejo.viktorbarzin.me hostAlias hardcoded 10.0.20.200
(.200:443 refuses TLS now; the next woodpecker apply would re-pin it and
break pipeline creation). Now reads the Traefik ClusterIP dynamically via a
kubernetes_service data source -- cannot rot on a future renumber and avoids
the ETP=Local hairpin trap.
- monitoring: ViktorBarzinApexDrift alert summary said "expected 10.0.20.200"
-> 10.0.20.203 (cosmetic; alert logic already correct).
- docs/architecture/networking.md: rewrote the MetalLB section (it wrongly had
KMS on .200, mailserver on a LB IP, "two dedicated") into an accurate 4-IP
registry + LB-IP renumber checklist (in-band + out-of-band consumers).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The shlink-web admin SPA (shlink.viktorbarzin.me) showed "Something went
wrong while loading short URLs". Root cause: the web client was untagged
(:latest) and Keel's 2026-05-26 match-tag rewrite downgraded it to the
ancient 0.1.1 (2019 image), which speaks the removed /rest/v1/authenticate
API (404) and serves nginx on port 80. Backend (shlink:5.0.2) was healthy.
Pin shlink-web-client to 4.7.1 (current stable; :latest/:stable resolve to
it) and align container port + both probes + service target_port to 8080
(the port the 4.x nginx listens on). A clean semver anchor can no longer be
Keel-downgraded to 0.1.1.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reconciles the tripit stack source with live state and adds the forward
flow. Ingest now polls vbarzin@gmail.com [Gmail]/All Mail read-only over a
rolling 12-month X-GM-RAW travel-sender window (Croatia Jet2 refs excluded),
filing trips under MAIL_DEFAULT_OWNER_EMAIL=vbarzin@gmail.com (Viktor's
Authentik login identity). Adds an ingest-plans CronJob that polls spam@
filtered to To:plans@viktorbarzin.me (the @viktorbarzin.me catch-all target)
so forwarded bookings are extracted and attached to the matching trip;
IMAP_PASSWORD is overridden per-job to spam@'s creds (PLANS_IMAP_PASSWORD).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The recruiter-api plugin's announceEvent() sends recruiter cards to Telegram
via OPENLOBSTER_CHANNELS_TELEGRAM_TOKEN (its fallback path, since OpenClaw
doesn't pass api.bot to "kind: tools" plugins). That env was never set in the
container, so every hourly poll threw on the send, events were never marked
consumed, and no Telegram notification ever went out — the rest of the
"recruiter pipeline has no responses" problem (the GPU/triage half was fixed
separately). Wire it from openclaw-secrets.telegram_bot_token (same token as
channels.telegram.botToken). Verified: the 3 backlogged events were announced
+ consumed on the openclaw restart.
Drafting (the /api/draft 500 that also degraded the cards) was fixed in
parallel by swapping Vault secret/recruiter-responder gpt_mini_model from the
slow/timing-out qwen3-coder-480b to meta/llama-3.3-70b-instruct (~1.6s).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>