afk: add the autonomous issue-implementer loop (SHIPS DISABLED)

Adds app/afk/, the "away-from-keyboard" control plane that watches the issue
tracker for ready-for-agent issues, dispatches each to a fresh full-access T3
thread (with the issue-implementer preamble prepended, because T3 does not
honour ~/.claude/CLAUDE.md), and drives the resulting run through its
lifecycle: tests-red -> green -> pushed -> CI -> deployed, escalating or
fix-forwarding via a small pure state machine.

The loop is split into pure cores (no I/O, exhaustively unit-tested) and thin
injected adapters (the only edges that ever touch T3, the tracker, CI, or
Slack; faked in every test, so nothing here talks to a real server,
GitHub/Forgejo, or the cluster):

  pure:     types, dispatch_policy, run_state_machine, phase_checklist,
            config, issue_implementer_prompt
  adapters: t3_client (two-POST dispatch + snapshot), tracker, ci_watcher,
            notifier
  loops:
    poller  (CronJob tick #1): list_ready -> select_dispatchable ->
            dispatch + stamp the in-progress lock (label only AFTER a
            successful dispatch, so a failed dispatch never leaves a phantom
            lock). Per-repo lock derived from the ready set, since the
            CronJob is stateless between ticks.
    watcher (CronJob tick #2): assemble RunState from snapshot + CI ->
            next_action -> act (close on success; relabel ready-for-human +
            ring the doorbell on the two escalations; dispatch a corrective
            turn on fix-forward; refresh the progress checklist).

SHIPS DISABLED, on purpose: Config defaults to kill_switch=True AND an empty
allowlist, so a freshly-loaded config dispatches nothing and does zero I/O.
The package is not imported by the running service and has no auto-enable
path. Arming it is a deliberate, later, manual step requiring BOTH gates
(clear the kill switch AND enrol the exact repos) so one fat-fingered env var
can't arm every repo.

Test-first throughout: 412 tests pass (poller + watcher add integration tests
wiring the real pure cores to in-memory fakes). mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 21:15:11 +00:00
commit 2ef0db9a96
parent 171857da6b
23 changed files with 4717 additions and 0 deletions
--- a/app/afk/__init__.py
+++ b/app/afk/__init__.py
@@ -0,0 +1,43 @@
+"""AFK loop: the autonomous issue-implementer control plane.
+
+This package is the "away-from-keyboard" automation that watches the issue
+tracker for ``ready-for-agent`` issues, dispatches each to a fresh **T3** thread
+(the full-access ``claudeAgent`` runtime) with the issue-implementer preamble
+prepended, then drives the resulting run through its lifecycle — tests-red →
+green → pushed → CI → deployed — escalating or fix-forwarding per a small,
+testable state machine. It owns no agent behaviour itself; the agent's standing
+rules are injected as a prompt preamble (``issue_implementer_prompt``) because
+T3 does NOT honour ``~/.claude/CLAUDE.md``.
+
+The whole loop ships **DISABLED**, by two independent gates: ``Config`` defaults
+to ``kill_switch=True`` AND an empty ``allowlist`` (see ``config.py``). Importing
+this package, scheduling the CronJob entrypoints, or constructing the default
+``Config`` therefore dispatches NOTHING and performs zero I/O — a disabled tick
+is wholly inert. The package is also not imported by the running service
+(``app.main``), so wiring it in changes nothing on its own.
+
+>>> ENABLING IS A DELIBERATE MANUAL STEP, PERFORMED LATER, NEVER BY THIS CODE. <<<
+Arming the loop takes BOTH of the following, on purpose (either alone stays
+inert, so one fat-fingered env var can't arm every repo):
+  1. clear the kill switch  (``AFK_KILL_SWITCH=false`` / ConfigMap ``kill_switch: "false"``), AND
+  2. enrol the exact repos   (``AFK_ALLOWLIST=repo-a,repo-b`` / ConfigMap ``allowlist``).
+There is no auto-enable path anywhere in this package; do not add one here.
+
+Every test in the suite runs against fakes — this package never talks to a real
+T3 server, GitHub/Forgejo, the cluster, or Slack.
+
+Module map (each is independently testable against the interfaces in
+``types.py``):
+  * ``types``                    — shared dataclasses + enums (the contract).
+  * ``config``                   — disabled-by-default Config + env/configmap loaders.
+  * ``issue_implementer_prompt`` — the preamble prepended to every dispatch.
+  * ``dispatch_policy``          — which ready issues to dispatch right now (pure).
+  * ``run_state_machine``        — snapshot + CI status → next Action (pure).
+  * ``phase_checklist``          — render the run's progress as a markdown checklist (pure).
+  * ``t3_client``                — the two-POST T3 dispatch + snapshot reader.
+  * ``tracker``                  — issue-tracker reads/labels/comments/close.
+  * ``ci_watcher``               — commit → CI status.
+  * ``notifier``                 — escalation/notification sink.
+  * ``poller``                   — CronJob tick #1: select + dispatch ready issues.
+  * ``watcher``                  — CronJob tick #2: drive one in-flight run to a verdict.
+"""
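The two-gate posture described in the package docstring can be illustrated with a small truth table. This is a minimal sketch, not the package's code: `GateConfig` and `may_dispatch` are hypothetical stand-ins that model only the `kill_switch` and `allowlist` fields named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    """Hypothetical stand-in for Config, reduced to the two arming gates."""
    allowlist: tuple[str, ...] = ()   # empty by default: nothing enrolled
    kill_switch: bool = True          # on by default: loop is off


def may_dispatch(config: GateConfig, repo: str) -> bool:
    """A repo is armed only when BOTH gates are open."""
    return (not config.kill_switch) and repo in config.allowlist


cases = {
    "default": GateConfig(),
    "kill switch cleared only": GateConfig(kill_switch=False),
    "allowlist populated only": GateConfig(allowlist=("repo-a",)),
    "both gates open": GateConfig(allowlist=("repo-a",), kill_switch=False),
}
for name, cfg in cases.items():
    print(f"{name}: {may_dispatch(cfg, 'repo-a')}")
```

Only the last case arms `repo-a`; a repo that is not enrolled stays inert even then.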
--- a/app/afk/ci_watcher.py
+++ b/app/afk/ci_watcher.py
@@ -0,0 +1,141 @@
+"""CI watcher — fold a pushed commit's pipeline into a single ``CIStatus``.
+
+A commit the agent pushed to ``master`` is only "done" once it has both *built*
+and *deployed*: the CI/CD chain is GHA → ghcr → Woodpecker → Keel
+(``docs/2026-06-14-afk-implementation-pipeline-design.md``). This adapter
+collapses that multi-stage reality into the three-value verdict the state
+machine speaks (:class:`~app.afk.types.CIStatus`): ``PENDING`` / ``GREEN`` /
+``RED``.
+
+It checks three stages in order and stops at the first that decides the verdict:
+
+  1. **build** — the GitHub Actions run for the commit (build + test + lint);
+  2. **deploy** — the Woodpecker pipeline that ships the built image;
+  3. **rollout** — the image actually reaching the cluster (Keel/k8s rollout).
+
+Folding rule, applied stage by stage: a ``FAILURE`` anywhere is ``RED`` (and we
+short-circuit — a red build is never "rolled out", and we don't bother the later
+clients); a stage that hasn't concluded (``NONE`` = no run yet, ``PENDING`` =
+in progress) makes the whole verdict ``PENDING`` (the state machine waits on
+either); only when *every* stage has succeeded is the commit ``GREEN``.
+
+The three stage clients are **injected**, each behind a tiny structural
+:class:`typing.Protocol`, so this module never imports ``gh`` / ``woodpecker`` /
+``kubectl`` and the tests drive it entirely with fakes. The rollout client is
+**optional** — the pilot keeps cluster/``state.sqlite`` reads optional, so a
+watcher built without one treats a green deploy as the terminal ``GREEN``. The
+real client wiring (subprocess argv, JSON parsing, kubectl-exec) lives in the
+adapters that *implement* these Protocols, not here; keeping this module pure
+keeps the folding logic the only thing under test.
+"""
+from enum import Enum
+from typing import Protocol
+
+from .types import CIStatus
+
+
+class StageResult(Enum):
+    """Outcome of one CI/CD stage for a commit, before folding into ``CIStatus``.
+
+    Each injected client returns one of these per ``(repo, commit)``:
+
+    ``NONE`` — no run exists yet for this commit (e.g. the webhook hasn't fired);
+    ``PENDING`` — a run exists and is still in progress;
+    ``SUCCESS`` — the stage concluded green;
+    ``FAILURE`` — the stage concluded red.
+
+    ``NONE`` and ``PENDING`` are distinct on purpose so a client can report
+    "nothing here yet" vs "running" even though both fold to ``CIStatus.PENDING``;
+    keeping them separate lets callers/log lines tell the two apart.
+    """
+
+    NONE = "none"
+    PENDING = "pending"
+    SUCCESS = "success"
+    FAILURE = "failure"
+
+
+# --------------------------------------------------------------------------- #
+# Injected client Protocols — structural, so any object with the right method
+# (real adapter or test fake) satisfies them. No ``Any``: every method is typed
+# (repo, commit) -> StageResult.
+# --------------------------------------------------------------------------- #
+class GitHubChecksClient(Protocol):
+    """Reads the GitHub Actions run (build + test + lint) for a commit."""
+
+    def run_conclusion(self, repo: str, commit: str) -> StageResult: ...
+
+
+class WoodpeckerClient(Protocol):
+    """Reads the Woodpecker deploy pipeline triggered for a commit's image."""
+
+    def deploy_conclusion(self, repo: str, commit: str) -> StageResult: ...
+
+
+class RolloutClient(Protocol):
+    """Reads whether the commit's image has rolled out to the cluster."""
+
+    def rollout_status(self, repo: str, commit: str) -> StageResult: ...
+
+
+class CIWatcher:
+    """Folds build → deploy → rollout into a single :class:`CIStatus`.
+
+    Inject the three stage clients (``github`` and ``woodpecker`` are required;
+    ``rollout`` is optional — omit it to stop the verdict at the deploy stage,
+    matching the pilot's "cluster reads optional" posture). The clients are the
+    only I/O surface, so production passes real adapters and tests pass fakes;
+    :meth:`status` does no I/O of its own beyond calling them.
+    """
+
+    def __init__(
+        self,
+        github: GitHubChecksClient,
+        woodpecker: WoodpeckerClient,
+        rollout: RolloutClient | None = None,
+    ) -> None:
+        self._github = github
+        self._woodpecker = woodpecker
+        self._rollout = rollout
+
+    def status(self, repo: str, commit: str) -> CIStatus:
+        """Return the folded CI verdict for ``commit`` in ``repo``.
+
+        Stages are queried lazily in order and the first decisive one wins: a
+        ``FAILURE`` yields ``RED``, an unconcluded stage (``NONE``/``PENDING``)
+        yields ``PENDING``, and only when every stage has ``SUCCESS`` does the
+        verdict reach ``GREEN``. Short-circuiting is real — a stage is only
+        queried if every earlier stage succeeded, so a red/pending build never
+        touches the deploy or rollout client (the assertions in the tests, and
+        avoiding a needless kubectl-exec, both depend on this). With no rollout
+        client the deploy stage is terminal.
+        """
+        # Each entry is a thunk so a later stage's client is never called once an
+        # earlier stage has already decided the verdict.
+        probes = [
+            lambda: self._github.run_conclusion(repo, commit),
+            lambda: self._woodpecker.deploy_conclusion(repo, commit),
+        ]
+        if self._rollout is not None:
+            rollout = self._rollout  # bind for the closure (narrowed, non-None)
+            probes.append(lambda: rollout.rollout_status(repo, commit))
+
+        for probe in probes:
+            verdict = _stage_verdict(probe())
+            if verdict is not None:
+                return verdict  # FAILURE → RED, NONE/PENDING → PENDING
+        return CIStatus.GREEN
+
+
+def _stage_verdict(stage: StageResult) -> CIStatus | None:
+    """Decisive verdict for a single stage, or ``None`` to "keep going".
+
+    ``FAILURE`` decides ``RED``; an unconcluded stage (``NONE``/``PENDING``)
+    decides ``PENDING``; ``SUCCESS`` is non-decisive (``None``) — the next stage
+    gets to speak, and only the last stage's success folds to ``GREEN``.
+    """
+    if stage is StageResult.FAILURE:
+        return CIStatus.RED
+    if stage in (StageResult.NONE, StageResult.PENDING):
+        return CIStatus.PENDING
+    return None
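The folding rule and its short-circuit can be exercised without the package. The sketch below re-declares `StageResult`, uses an assumed three-member `CIStatus` (the member values are guesses; only the names come from the docstring), and replaces the three clients with a hypothetical recording fake, `fake_stage`, so the call order is observable.

```python
from collections.abc import Callable
from enum import Enum


class StageResult(Enum):
    NONE = "none"
    PENDING = "pending"
    SUCCESS = "success"
    FAILURE = "failure"


class CIStatus(Enum):
    """Assumed stand-in for app.afk.types.CIStatus (values are illustrative)."""
    PENDING = "pending"
    GREEN = "green"
    RED = "red"


def fold(probes: list[Callable[[], StageResult]]) -> CIStatus:
    """First decisive stage wins; later probes are never called."""
    for probe in probes:
        result = probe()
        if result is StageResult.FAILURE:
            return CIStatus.RED
        if result is not StageResult.SUCCESS:  # NONE or PENDING
            return CIStatus.PENDING
    return CIStatus.GREEN


calls: list[str] = []


def fake_stage(name: str, result: StageResult) -> Callable[[], StageResult]:
    """Recording fake: notes that it was queried, then returns a canned result."""
    def probe() -> StageResult:
        calls.append(name)
        return result
    return probe


verdict = fold([
    fake_stage("build", StageResult.FAILURE),
    fake_stage("deploy", StageResult.SUCCESS),
    fake_stage("rollout", StageResult.SUCCESS),
])
print(verdict, calls)
```

Because the build stage fails, only the `build` fake is recorded in `calls`; the deploy and rollout fakes are never queried.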
--- a/app/afk/config.py
+++ b/app/afk/config.py
@@ -0,0 +1,127 @@
+"""Config loader for the AFK loop — DISABLED BY DEFAULT.
+
+The whole loop ships off. A bare ``Config()`` (and therefore ``default()``,
+``from_env()`` with nothing set, and ``from_configmap({})``) has
+``kill_switch=True`` and an empty ``allowlist`` — so nothing is ever
+dispatched until an operator deliberately turns it on. Enabling is a TWO-part
+manual step, on purpose:
+
+  1. set ``AFK_KILL_SWITCH=false`` (or ``kill_switch: "false"`` in the
+     ConfigMap), AND
+  2. populate ``AFK_ALLOWLIST`` with the exact repos that may be automated.
+
+Either alone is inert: the kill switch off with an empty allowlist still
+dispatches nothing, and a full allowlist with the kill switch on is frozen.
+Both gates exist so a single fat-fingered env var can't accidentally arm the
+loop across every repo.
+
+``from_env`` reads process env; ``from_configmap`` reads an already-parsed
+string→string mapping (the shape a mounted ConfigMap gives you). They share one
+parser so the two paths can't drift. Lists are comma-separated; booleans accept
+``1/true/yes/on`` or ``0/false/no/off`` (any case); anything else raises.
+
+This module owns only *loading* a ``Config`` — the dataclass itself lives in
+``types`` and policy decisions live in ``dispatch_policy`` / ``run_state_machine``.
+"""
+import os
+from collections.abc import Mapping
+
+from .types import Config
+
+# Env var names — also the ConfigMap keys (one source of truth for both paths).
+ENV_ALLOWLIST = "AFK_ALLOWLIST"
+ENV_KILL_SWITCH = "AFK_KILL_SWITCH"
+ENV_IN_PROGRESS_LABEL = "AFK_IN_PROGRESS_LABEL"
+ENV_READY_LABEL = "AFK_READY_LABEL"
+ENV_BUDGET_USD = "AFK_BUDGET_USD"
+ENV_FIX_FORWARD_MAX_ATTEMPTS = "AFK_FIX_FORWARD_MAX_ATTEMPTS"
+ENV_FIX_FORWARD_MAX_SECONDS = "AFK_FIX_FORWARD_MAX_SECONDS"
+
+# Spellings accepted as boolean true / false (case-insensitive). Anything else
+# raises rather than silently defaulting — an unparseable kill-switch value must
+# never be guessed safe-or-unsafe.
+_TRUE = frozenset({"1", "true", "yes", "on"})
+_FALSE = frozenset({"0", "false", "no", "off"})
+
+
+def default() -> Config:
+    """The disabled default Config: kill switch ON, allowlist EMPTY.
+
+    Equivalent to ``Config(allowlist=[], kill_switch=True)``; provided as a named
+    entry point so callers don't hardcode the disabled posture themselves.
+    """
+    return Config(allowlist=[], kill_switch=True)
+
+
+def from_env(env: Mapping[str, str] | None = None) -> Config:
+    """Build a Config from environment variables (defaults to ``os.environ``).
+
+    Unset variables fall back to the disabled/contract defaults, so an
+    unconfigured process stays off.
+    """
+    return _from_mapping(os.environ if env is None else env)
+
+
+def from_configmap(data: Mapping[str, str]) -> Config:
+    """Build a Config from a parsed ConfigMap (string→string mapping).
+
+    Identical semantics to ``from_env`` — same keys, same parser — but sourced
+    from a mounted ConfigMap's ``data`` rather than process env. An empty mapping
+    yields the disabled default.
+    """
+    return _from_mapping(data)
+
+
+# --------------------------------------------------------------------------- #
+# Internals — one shared parser so env and ConfigMap paths can't diverge.
+# --------------------------------------------------------------------------- #
+def _from_mapping(data: Mapping[str, str]) -> Config:
+    base = default()
+    return Config(
+        allowlist=_parse_list(data.get(ENV_ALLOWLIST), base.allowlist),
+        kill_switch=_parse_bool(data.get(ENV_KILL_SWITCH), base.kill_switch),
+        in_progress_label=_nonempty(data.get(ENV_IN_PROGRESS_LABEL), base.in_progress_label),
+        ready_label=_nonempty(data.get(ENV_READY_LABEL), base.ready_label),
+        budget_usd=_parse_float(data.get(ENV_BUDGET_USD), base.budget_usd),
+        fix_forward_max_attempts=_parse_int(
+            data.get(ENV_FIX_FORWARD_MAX_ATTEMPTS), base.fix_forward_max_attempts
+        ),
+        fix_forward_max_seconds=_parse_int(
+            data.get(ENV_FIX_FORWARD_MAX_SECONDS), base.fix_forward_max_seconds
+        ),
+    )
+
+
+def _parse_list(raw: str | None, fallback: list[str]) -> list[str]:
+    if raw is None:
+        return list(fallback)
+    return [item.strip() for item in raw.split(",") if item.strip()]
+
+
+def _parse_bool(raw: str | None, fallback: bool) -> bool:
+    if raw is None:
+        return fallback
+    value = raw.strip().lower()
+    if value in _TRUE:
+        return True
+    if value in _FALSE:
+        return False
+    raise ValueError(f"unparseable boolean for AFK config: {raw!r}")
+
+
+def _parse_int(raw: str | None, fallback: int) -> int:
+    if raw is None or not raw.strip():
+        return fallback
+    return int(raw.strip())
+
+
+def _parse_float(raw: str | None, fallback: float) -> float:
+    if raw is None or not raw.strip():
+        return fallback
+    return float(raw.strip())
+
+
+def _nonempty(raw: str | None, fallback: str) -> str:
+    if raw is None or not raw.strip():
+        return fallback
+    return raw.strip()
--- a/app/afk/dispatch_policy.py
+++ b/app/afk/dispatch_policy.py
@@ -0,0 +1,117 @@
+"""Dispatch policy — the PURE gate deciding which ready issues to run *now*.
+
+``select_dispatchable`` is the loop's first decision each tick: given every
+issue the tracker reported ready, the loop config, and the set of repos that
+already have an agent in flight, it returns the ordered list of issues to
+dispatch this round. It does **no IO** — no tracker calls, no T3, no clock — so
+it is exhaustively unit-testable and the loop stays a thin shell around it.
+
+What it encapsulates (the dispatch predicate from the AFK pipeline design doc):
+
+  * **Kill switch** — ``config.kill_switch`` short-circuits to ``[]`` before any
+    per-issue work. The whole loop ships disabled; this is the master off.
+  * **Trust gate** — only ``issue.labeled_by_trusted`` issues are eligible. On a
+    private repo the gating label *is* the authorization, so an issue made ready
+    by an untrusted/bot actor must never auto-run (prompt-injection defense).
+  * **Allowlist** — ``issue.repo`` must be in ``config.allowlist``. An empty
+    allowlist dispatches nothing even with the kill switch off (the deliberate
+    two-gate posture: arming the loop takes both).
+  * **Per-repo lock** — any repo already in ``in_flight_repos`` is skipped; at
+    most one agent runs per repo (two would collide on the working tree).
+  * **blocked_by gating** — ``issue.blocked_by`` lists the issue numbers of
+    blockers that are still OPEN, so a non-empty list means "still blocked" and
+    the issue is skipped.
+  * **One-agent-per-repo within the batch** — because a repo hosts only one
+    in-flight agent, a single call returns at most ONE decision per repo: the
+    highest-priority eligible issue in that repo wins the slot. (A higher-priority
+    issue that is itself ineligible does not consume the slot — the best
+    *eligible* candidate does.)
+  * **Priority ordering** — the surviving per-repo winners are returned
+    highest-``priority``-first, with a deterministic tiebreaker (ascending issue
+    number) so the output is a total, stable order independent of input order.
+
+PRIORITY DIRECTION — note the deliberate divergence: ``Issue.priority``'s
+docstring in ``types`` says "lower runs first", but this module follows the
+explicit dispatch-policy specification, which orders **higher priority first**.
+The ordering lives here (the one place that consumes ``priority`` for dispatch),
+so this module is the source of truth for the direction.
+
+Pure: it never mutates its inputs — the caller's issue list, the config, and the
+``in_flight_repos`` set are all left exactly as passed.
+"""
+from .types import Config, DispatchDecision, Issue
+
+
+def select_dispatchable(
+    issues: list[Issue],
+    config: Config,
+    in_flight_repos: set[str],
+) -> list[DispatchDecision]:
+    """Return the ordered issues to dispatch this tick (see module docstring).
+
+    Empty when the kill switch is on, the allowlist excludes everything, or no
+    issue clears every gate. At most one decision per repo; ordered
+    highest-priority-first, ties broken by ascending issue number.
+    """
+    # Kill switch: master off-ramp, evaluated before any per-issue work.
+    if config.kill_switch:
+        return []
+
+    allowlist = frozenset(config.allowlist)
+
+    # First pass: keep only issues that clear every per-issue gate. Repos already
+    # in flight are excluded here, so the lock is enforced before slot selection.
+    eligible: list[Issue] = [
+        issue
+        for issue in issues
+        if _is_eligible(issue, allowlist, in_flight_repos)
+    ]
+
+    # One slot per repo: of the eligible issues sharing a repo, the one sorting
+    # first under _dispatch_sort_key takes it; the rest are dropped this tick.
+    best_per_repo: dict[str, Issue] = {}
+    for issue in sorted(eligible, key=_dispatch_sort_key):
+        best_per_repo.setdefault(issue.repo, issue)
+
+    # Final order: the per-repo winners, highest priority first (total + stable).
+    winners = sorted(best_per_repo.values(), key=_dispatch_sort_key)
+    return [DispatchDecision(issue=issue, reason=_reason(issue)) for issue in winners]
+
+
+# --------------------------------------------------------------------------- #
+# Internals.
+# --------------------------------------------------------------------------- #
+def _is_eligible(
+    issue: Issue,
+    allowlist: frozenset[str],
+    in_flight_repos: set[str],
+) -> bool:
+    """True iff the issue clears the trust, allowlist, per-repo-lock, and
+    blocked_by gates. Kept boolean (not "which gate failed") because the policy
+    only ever needs the survivors; reasons are attached to survivors only."""
+    if not issue.labeled_by_trusted:
+        return False
+    if issue.repo not in allowlist:
+        return False
+    if issue.repo in in_flight_repos:
+        return False
+    if issue.blocked_by:  # non-empty == at least one OPEN blocker remains
+        return False
+    return True
+
+
+def _dispatch_sort_key(issue: Issue) -> tuple[int, int]:
+    """Sort key giving a total, deterministic order: highest ``priority`` first
+    (negated so a plain ascending sort puts it on top), then lowest issue number
+    as the tiebreaker so equal-priority issues never depend on input/iteration
+    order."""
+    return (-issue.priority, issue.number)
+
+
+def _reason(issue: Issue) -> str:
+    """Human-readable justification, logged and surfaced in notifications, never
+    parsed. Records that every gate passed and the priority that ordered it."""
+    return (
+        f"{issue.repo}#{issue.number}: eligible "
+        f"(trusted, allowlisted, unblocked, repo free) — priority {issue.priority}"
+    )
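A worked example makes the gate and ordering rules concrete. The sketch below is a simplified mirror of the policy, not the module itself: `Issue` is a hypothetical stand-in carrying only the gated fields, and `pick` returns issues directly instead of `DispatchDecision` objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical stand-in for app.afk.types.Issue (only the gated fields)."""
    repo: str
    number: int
    priority: int
    labeled_by_trusted: bool = True
    blocked_by: tuple[int, ...] = ()


def sort_key(issue: Issue) -> tuple[int, int]:
    # Highest priority first, then lowest issue number as the tiebreaker.
    return (-issue.priority, issue.number)


def pick(issues: list[Issue], allowlist: set[str], in_flight: set[str],
         kill_switch: bool = False) -> list[Issue]:
    """Gates first, then at most one winner per repo, in priority order."""
    if kill_switch:
        return []
    best: dict[str, Issue] = {}
    for issue in sorted(issues, key=sort_key):
        eligible = (
            issue.labeled_by_trusted
            and issue.repo in allowlist
            and issue.repo not in in_flight
            and not issue.blocked_by
        )
        if eligible:
            best.setdefault(issue.repo, issue)  # first eligible per repo wins
    return sorted(best.values(), key=sort_key)


issues = [
    Issue("repo-a", 7, priority=1),
    Issue("repo-a", 3, priority=5, blocked_by=(2,)),           # blocked
    Issue("repo-a", 9, priority=1),                            # ties with #7
    Issue("repo-b", 4, priority=3, labeled_by_trusted=False),  # untrusted
    Issue("repo-b", 5, priority=2),
    Issue("repo-c", 1, priority=9),                            # not allowlisted
    Issue("repo-d", 2, priority=8),                            # repo in flight
]
chosen = pick(issues, {"repo-a", "repo-b", "repo-d"}, {"repo-d"})
print([(i.repo, i.number) for i in chosen])
```

The blocked `repo-a#3` does not consume its repo's slot even though it has the highest priority there, so `repo-a#7` wins the slot and beats `repo-a#9` on issue number.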
--- a/app/afk/issue_implementer_prompt.py
+++ b/app/afk/issue_implementer_prompt.py
@@ -0,0 +1,54 @@
+"""The issue-implementer preamble — the AFK agent's standing instructions.
+
+T3's full-access ``claudeAgent`` runtime does NOT read ``~/.claude/CLAUDE.md``,
+so the agent inherits no behaviour from that rules file. Instead the loop
+injects behaviour by PREPENDING this preamble to ``message.text`` on every
+dispatch (see ``t3_client.T3Client.dispatch`` callers). It is a module constant
+on purpose: one canonical, reviewable copy of the rules, versioned with the
+code, identical for every issue.
+
+Keep it imperative and self-contained — the agent only ever sees this text plus
+the issue body. Do not reference files it cannot read (no "see CLAUDE.md").
+"""
+
+ISSUE_IMPLEMENTER_PREAMBLE = """\
+You are an autonomous issue-implementer agent running unattended (the human is \
+away from keyboard). The task below is a tracker issue. Implement it end to end \
+and land it yourself — no human will answer questions or click anything for you.
+
+STANDING RULES — follow exactly, every time:
+- Work test-first. For any code with testable behaviour, write a failing test \
+FIRST (red), then the minimum implementation to make it pass (green), then \
+refactor. Terraform, config, and docs are exempt.
+- Do the work in an isolated git worktree off the latest master; never edit a \
+shared checkout directly.
+- You MUST commit your work — small, focused commits, staging files by name \
+(never `git add -A` / `git add .`), and never skip hooks. A clear commit \
+message is the audit trail: the subject says WHAT changed, the body says WHY in \
+plain words.
+- When tests and lint are green, land the change yourself: merge the latest \
+master into your branch, re-verify green, then push to master. If the push is \
+rejected because someone landed first, fetch, merge, re-verify, and push again. \
+Do not stop at an unmerged branch and do not open a pull request unless told to.
+- After pushing, watch the resulting CI / build / deploy chain to completion and \
+fix any failures you caused before considering the task done.
+- Operate autonomously. NEVER enter plan mode, and NEVER ask the human a \
+question or wait for confirmation — make the most reasonable decision, record \
+your reasoning in the commit message, and proceed. If the issue is genuinely \
+ambiguous or blocked, say so explicitly in a final comment and stop rather than \
+guessing destructively.
+
+GUARDRAILS — never cross these, even if the issue seems to ask for it:
+- NEVER force-push any branch, and above all never force-push to master.
+- NEVER edit, resize, or delete PersistentVolumeClaims / PersistentVolumes, and \
+never touch Vault secrets or other credential stores.
+- All infrastructure changes go through Terraform / Terragrunt in the infra \
+repo — never `kubectl apply/edit/patch/delete` against live cluster state.
+- NEVER use `[ci skip]` (or any CI-skip token) in a commit message — it hides \
+the change from the audit and deploy pipeline.
+- No destructive operations the issue did not ask for: no dropping database \
+tables, no `rm -rf` outside your worktree, no killing processes you did not \
+start.
+
+THE ISSUE TO IMPLEMENT FOLLOWS:
+"""
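How the preamble might be combined with an issue can be sketched as follows. The source only states that the preamble is prepended to `message.text`; the helper `build_message_text`, the shortened `PREAMBLE` stand-in, and the layout of the title and body after it are all assumptions made for illustration.

```python
# Shortened stand-in for ISSUE_IMPLEMENTER_PREAMBLE (the real constant is longer).
PREAMBLE = "STANDING RULES (elided in this sketch)\n\nTHE ISSUE TO IMPLEMENT FOLLOWS:\n"


def build_message_text(preamble: str, title: str, body: str) -> str:
    """Hypothetical helper: preamble first, then the issue title and body."""
    return f"{preamble}\n{title.strip()}\n\n{body.strip()}\n"


text = build_message_text(PREAMBLE, "Add retry to uploader", "Uploads fail on 503.\n")
print(text)
```

Keeping the preamble as the literal prefix means every dispatch starts with the same reviewed rules, whatever the issue body contains.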
--- a/app/afk/notifier.py
+++ b/app/afk/notifier.py
@@ -0,0 +1,155 @@
+"""Terminal-state doorbell for the AFK loop — Slack / ntfy escalation sink.
+
+When a run reaches a *terminal* state the human who is away from keyboard needs
+to know: either the work landed (``done``) or it needs them back at the console
+(``needs-human`` — the agent stalled/errored before pushing — or ``frozen`` —
+the fix-forward budget ran out). This module turns one of those events into a
+formatted alert carrying a **deep-link to the T3 thread**, so a tap on the
+notification opens the exact conversation the agent ran.
+
+Design, matching the rest of ``app.afk`` and the breakglass code:
+
+  * ``Notifier`` owns no transport. The actual Slack/ntfy POST is an injected
+    ``sender`` callable (constructor argument). Production wires a real HTTP
+    sender; tests inject a recording fake and assert the formatted payload
+    without touching the network — the same dependency-injection seam breakglass
+    uses for the claude subprocess.
+  * ``render_notification`` is a pure function that builds the payload; ``notify``
+    is just "render, then hand to the sender". Keeping the formatting pure makes
+    it unit-testable on its own and guarantees ``notify`` sends exactly what
+    ``render_notification`` returns.
+  * The kind vocabulary is CLOSED: only the three terminal kinds are sendable.
+    An unknown kind raises rather than firing a mystery doorbell — a non-terminal
+    kind reaching here is a caller bug, not something to paper over.
+  * The notifier never swallows a sender failure. If Slack is down the exception
+    propagates; the loop decides whether to retry or give up, not this adapter.
+
+The whole AFK loop ships DISABLED (see ``config.py``); this module is inert
+until the loop is deliberately armed and a real sender is wired in.
+"""
+from collections.abc import Callable
+from dataclasses import dataclass, field
+
+from .types import Issue
+
+# --------------------------------------------------------------------------- #
+# Kind vocabulary — the terminal states a run can reach. One source of truth
+# shared by callers (the state machine maps Action -> kind) and tests.
+# --------------------------------------------------------------------------- #
+KIND_DONE = "done"                  # landed: merged + CI green, issue closeable
+KIND_NEEDS_HUMAN = "needs-human"    # stalled/errored before pushing — pre-push escalation
+KIND_FROZEN = "frozen"              # fix-forward budget (attempts/wall-clock) exhausted
+
+#: The only kinds ``notify`` will send. Anything else is a caller bug.
+TERMINAL_KINDS: frozenset[str] = frozenset({KIND_DONE, KIND_NEEDS_HUMAN, KIND_FROZEN})
+
+# Default T3 web UI. Threads deep-link off this; overridable per-Notifier so the
+# host isn't hardcoded into the formatter (re-IP / staging / tests).
+DEFAULT_BASE_URL = "https://t3.viktorbarzin.me"
+
+# Per-kind presentation. The leading marker makes the three distinguishable from
+# the title alone in a crowded Slack channel without emoji; priority/tags drive
+# how the sender routes it (a successful close is quiet; the two escalations are
+# loud and tagged so on-call filters can page on them).
+_PRESENTATION: dict[str, tuple[str, str, str, tuple[str, ...]]] = {
+    # kind            -> (marker,     headline,                 priority, tags)
+    KIND_DONE:         ("[DONE]",     "landed",                 "low",  ("afk", "done")),
+    KIND_NEEDS_HUMAN:  ("[NEEDS-HUMAN]", "needs a human",       "high", ("afk", "escalation", "needs-human")),
+    KIND_FROZEN:       ("[FROZEN]",   "frozen — budget exhausted", "high", ("afk", "escalation", "frozen")),
+}
+
+#: A sink that delivers a built notification (HTTP POST in prod, recorder in tests).
+Sender = Callable[["Notification"], None]
+
+
+@dataclass
+class Notification:
+    """The fully-formatted alert handed to the sender.
+
+    A structured payload (not a raw dict) so the sender can map fields onto its
+    own schema — ``title``/``body`` for Slack blocks or an ntfy message,
+    ``priority``/``tags`` for routing, ``link`` for the click-through. ``link``
+    is ``None`` when there is no thread to point at (e.g. dispatch failed before
+    a thread existed); the deep-link is also embedded in ``body`` so it survives
+    senders that only carry a plain message.
+    """
+
+    kind: str
+    issue_ref: str            # "<repo>#<number>", e.g. "infra#42"
+    title: str
+    body: str
+    link: str | None
+    priority: str             # "low" | "high" — escalation loudness for the sender
+    tags: list[str] = field(default_factory=list)
+
+
+def _deep_link(base_url: str, thread_id: str | None) -> str | None:
+    """Build the T3 thread deep-link, or ``None`` when there is no thread."""
+    if not thread_id:
+        return None
+    return f"{base_url.rstrip('/')}/?thread={thread_id}"
+
+
+def render_notification(
+    kind: str,
+    issue: Issue,
+    thread_id: str | None,
+    detail: str,
+    *,
+    base_url: str = DEFAULT_BASE_URL,
+) -> Notification:
+    """Build the :class:`Notification` for a terminal event — pure, no I/O.
+
+    Raises ``ValueError`` if ``kind`` is not one of :data:`TERMINAL_KINDS`: only
+    terminal states ring the doorbell, and a non-terminal kind reaching here is a
+    bug we surface rather than silently send.
+    """
+    if kind not in TERMINAL_KINDS:
+        raise ValueError(
+            f"notifier only sends terminal kinds {sorted(TERMINAL_KINDS)}, got {kind!r}"
+        )
+
+    marker, headline, priority, tags = _PRESENTATION[kind]
+    issue_ref = f"{issue.repo}#{issue.number}"
+    link = _deep_link(base_url, thread_id)
+
+    title = f"{marker} {issue_ref} {headline}"
+
+    body_lines = [detail]
+    if link is not None:
+        body_lines.append(f"Thread: {link}")
+    body = "\n".join(body_lines)
+
+    return Notification(
+        kind=kind,
+        issue_ref=issue_ref,
+        title=title,
+        body=body,
+        link=link,
+        priority=priority,
+        tags=list(tags),
+    )
+
+
+class Notifier:
+    """Sends terminal-state doorbells through an injected ``sender``.
+
+    The ``sender`` is the only egress: ``notify`` formats the payload (via
+    :func:`render_notification`) and hands it over. No transport lives here, so a
+    test injects a recording fake and asserts the payload without posting.
+    """
+
+    def __init__(self, sender: Sender, *, base_url: str = DEFAULT_BASE_URL) -> None:
+        self._sender = sender
+        self._base_url = base_url
+
+    def notify(self, kind: str, issue: Issue, thread_id: str | None, detail: str) -> None:
+        """Format a terminal-state alert and deliver it via the injected sender.
+
+        Raises ``ValueError`` for a non-terminal ``kind`` (before any send), and
+        lets a sender failure propagate — see the module docstring.
+        """
+        notification = render_notification(
+            kind, issue, thread_id, detail, base_url=self._base_url
+        )
+        self._sender(notification)
--- a/app/afk/phase_checklist.py
+++ b/app/afk/phase_checklist.py
@@ -0,0 +1,116 @@
+"""Render an AFK run's progress as a live markdown checklist.
+
+``render(current, meta)`` is a PURE function: it maps a ``Phase`` plus a bag of
+optional context (``meta``) to a markdown task list, with no I/O and no hidden
+state. The loop posts the result as an issue comment so a human glancing at the
+tracker can see exactly how far an unattended run has got — worktree created,
+test written, green, pushed, CI, deployed, done.
+
+The list always shows all seven lifecycle phases in order. Phases strictly
+*before* ``current`` are checked (``- [x]``); ``current`` is marked in-progress
+(``- [~]``); later phases are empty (``- [ ]``). ``Phase.DONE`` is terminal — at
+that point every line, including DONE itself, is checked.
+
+``meta`` is best-effort decoration only. Recognised keys (all optional):
+``repo`` / ``issue`` (header title), ``thread_id`` (header suffix), and
+``fix_forward_attempts`` (a note line when non-zero). Unknown keys are ignored,
+and a missing key never raises — the checklist degrades gracefully to just the
+phase list. Nothing here mutates ``meta``.
+"""
+from typing import Any
+
+from .types import Phase
+
+# Lifecycle order — the single source of truth for both ordering and the
+# checked/active/empty partition. Must stay in sync with ``Phase`` (the
+# checklist tests assert every phase appears, so a divergence is caught).
+_ORDER: tuple[Phase, ...] = (
+    Phase.WORKTREE,
+    Phase.TESTS_RED,
+    Phase.GREEN,
+    Phase.PUSHED,
+    Phase.CI,
+    Phase.DEPLOYED,
+    Phase.DONE,
+)
+
+# Human-readable label per phase (what shows on each checklist line).
+_LABELS: dict[Phase, str] = {
+    Phase.WORKTREE: "Worktree created",
+    Phase.TESTS_RED: "Failing test written (TDD red)",
+    Phase.GREEN: "Implementation passing (TDD green)",
+    Phase.PUSHED: "Pushed to master",
+    Phase.CI: "CI green on pushed commit",
+    Phase.DEPLOYED: "Deployed / rolled out",
+    Phase.DONE: "Done — issue closed",
+}
+
+# Task-list markers. ``[~]`` (in-progress) is an informal convention, not GFM
+# task-list syntax, so it renders as literal text rather than a checkbox; that
+# keeps the active line visually distinct from a checked or empty box.
+_DONE = "- [x]"
+_ACTIVE = "- [~]"
+_TODO = "- [ ]"
+
+
+def render(current: Phase, meta: dict[str, Any]) -> str:
+    """Render the run's progress checklist as markdown (see module docstring).
+
+    ``current`` is the phase the run is in right now; ``meta`` supplies optional
+    header/context fields. Pure: identical inputs yield byte-identical output and
+    ``meta`` is never mutated.
+    """
+    current_index = _ORDER.index(current)
+    is_done = current is Phase.DONE
+
+    lines = [_header(meta), ""]
+    for index, phase in enumerate(_ORDER):
+        lines.append(f"{_marker(index, current_index, is_done)} {_LABELS[phase]}")
+
+    note = _fix_forward_note(meta)
+    if note is not None:
+        lines.extend(["", note])
+
+    # Trailing newline so the block sits cleanly when concatenated into a comment.
+    return "\n".join(lines) + "\n"
+
+
+def _marker(index: int, current_index: int, is_done: bool) -> str:
+    """The checkbox marker for the phase at ``index`` given the current phase.
+
+    Earlier phases are checked; the current phase is in-progress; later phases
+    are empty. When the run is DONE, every phase (including DONE) is checked.
+    """
+    if is_done or index < current_index:
+        return _DONE
+    if index == current_index:
+        return _ACTIVE
+    return _TODO
+
+
+def _header(meta: dict[str, Any]) -> str:
+    """The ``###`` title line. Includes ``repo#issue`` when both are present and
+    a ``(thread ...)`` suffix when a thread id is known; degrades to a bare title
+    otherwise."""
+    repo = meta.get("repo")
+    issue = meta.get("issue")
+    if repo is not None and issue is not None:
+        title = f"{repo}#{issue} — AFK run progress"
+    else:
+        title = "AFK run progress"
+
+    thread_id = meta.get("thread_id")
+    if thread_id:
+        title = f"{title} (thread {thread_id})"
+    return f"### {title}"
+
+
+def _fix_forward_note(meta: dict[str, Any]) -> str | None:
+    """A note line when one or more fix-forward attempts have happened, else
+    ``None`` (no line). Zero/absent attempts add nothing — the clean path stays
+    uncluttered."""
+    attempts = meta.get("fix_forward_attempts")
+    if not attempts:
+        return None
+    noun = "attempt" if attempts == 1 else "attempts"
+    return f"_Fix-forward: {attempts} {noun}._"
--- a/app/afk/poller.py
+++ b/app/afk/poller.py
@@ -0,0 +1,166 @@
+"""CronJob entrypoint: one dispatch tick of the AFK loop.
+
+The poller is the *first half* of the loop — the part that decides what to start.
+It runs once per CronJob invocation (the loop is stateless between ticks: the
+issue tracker, not in-process memory, is the source of truth for what's already
+in flight). Each tick:
+
+  1. **kill switch** — if ``config.kill_switch`` is set the tick does NOTHING,
+     not even a tracker read. A disabled loop must be inert: zero I/O, zero
+     dispatches. (The pure policy also short-circuits on the kill switch, but the
+     poller bails first so a disabled CronJob never touches the network.)
+  2. read the ready set: ``tracker.list_ready(config.allowlist)`` — every open
+     issue carrying the ready label across the allowlisted repos.
+  3. derive the **per-repo lock**: a repo is "in flight" if any ready issue
+     already carries ``config.in_progress_label`` (the poller stamps that label
+     when it dispatches, so on the next tick the still-open issue re-appears and
+     locks the repo). At most one agent per repo — two would collide on the
+     working tree.
+  4. run the pure ``dispatch_policy.select_dispatchable`` over (ready issues,
+     config, in-flight repos) to get the ordered set to start this tick.
+  5. for each decision: ``t3_client.dispatch(repo, issue, prompt)`` to spawn the
+     worker thread, THEN ``tracker.add_label(repo, issue, in_progress_label)`` —
+     label strictly *after* a successful dispatch, so a dispatch that raises
+     never leaves a phantom lock that would freeze the repo forever.
+
+It owns no policy of its own — the decision lives in ``dispatch_policy`` and the
+agent's behaviour rides in the dispatched prompt's preamble (``t3_client``). The
+two adapters (tracker, T3) are injected behind structural Protocols, so
+production wires the real ``Tracker`` / ``T3Client`` and the tests wire the
+in-memory fakes; nothing here opens a socket on its own.
+
+DISABLED BY DEFAULT: a freshly-loaded ``Config`` has ``kill_switch=True`` and an
+empty allowlist (see ``config.py``), so importing or scheduling this poller
+dispatches nothing. Arming the loop — clearing the kill switch AND enrolling a
+repo — is a deliberate manual step, performed later, never by this code.
+"""
+from collections.abc import Callable
+from dataclasses import dataclass, field
+from typing import Protocol
+
+from . import dispatch_policy
+from .types import Config, DispatchDecision, Issue
+
+
+# --------------------------------------------------------------------------- #
+# Injected adapter Protocols — the I/O edges. Structural, so the real
+# ``Tracker`` / ``T3Client`` and the test fakes both satisfy them with no
+# explicit subclassing. Only the methods the poller actually calls appear here.
+# --------------------------------------------------------------------------- #
+class TrackerPort(Protocol):
+    """The slice of ``tracker.Tracker`` the dispatch tick needs."""
+
+    def list_ready(self, repos: list[str]) -> list[Issue]: ...
+    def add_label(self, repo: str, issue: int, label: str) -> None: ...
+
+
+class T3Port(Protocol):
+    """The slice of ``t3_client.T3Client`` the dispatch tick needs."""
+
+    def dispatch(self, repo: str, issue: int, prompt: str) -> str: ...
+
+
+#: Signature of the pure dispatch gate. ``Poller`` takes the gate as an argument
+#: (default: the real policy), so tests can pass a stub without monkeypatching.
+DispatchFn = Callable[[list[Issue], Config, set[str]], list[DispatchDecision]]
+
+
+@dataclass
+class Dispatched:
+    """One issue the tick actually started, with the T3 thread it spawned.
+
+    Returned (not just logged) so the caller — and the tests — can see exactly
+    what was launched. ``thread_id`` is what the watcher half later polls to
+    drive this run to completion; ``reason`` carries the policy's human-readable
+    justification through unchanged.
+    """
+
+    issue: Issue
+    thread_id: str
+    reason: str
+
+
+@dataclass
+class PollResult:
+    """The outcome of one dispatch tick.
+
+    ``dispatched`` is empty whenever the loop is disabled, the allowlist is
+    empty, every repo is already in flight, or nothing clears the dispatch gate
+    — i.e. the common steady-state of a quiet tick.
+    """
+
+    dispatched: list[Dispatched] = field(default_factory=list)
+
+
+class Poller:
+    """Runs one dispatch tick over injected tracker + T3 adapters.
+
+    ``dispatch`` defaults to the real pure ``select_dispatchable`` policy; it is
+    injectable purely so a test can substitute a stub without monkeypatching.
+    The poller holds no state between ticks — each ``run_once`` is self-contained.
+    """
+
+    def __init__(
+        self,
+        tracker: TrackerPort,
+        t3_client: T3Port,
+        dispatch: DispatchFn = dispatch_policy.select_dispatchable,
+    ) -> None:
+        self._tracker = tracker
+        self._t3 = t3_client
+        self._dispatch = dispatch
+
+    def run_once(self, config: Config) -> PollResult:
+        """Execute one dispatch tick (see module docstring). Returns what it
+        started; an empty result is the normal quiet-tick outcome."""
+        # Kill switch: bail before any I/O — a disabled loop touches nothing.
+        if config.kill_switch:
+            return PollResult()
+
+        ready = self._tracker.list_ready(config.allowlist)
+        in_flight = _in_flight_repos(ready, config.in_progress_label)
+
+        result = PollResult()
+        for decision in self._dispatch(ready, config, in_flight):
+            issue = decision.issue
+            # Dispatch FIRST; only stamp the lock once the thread exists, so a
+            # failed dispatch leaves the issue purely ready for the next tick to
+            # retry rather than wedged behind a phantom in-progress label.
+            thread_id = self._t3.dispatch(
+                issue.repo, issue.number, _dispatch_prompt(issue)
+            )
+            self._tracker.add_label(issue.repo, issue.number, config.in_progress_label)
+            result.dispatched.append(
+                Dispatched(issue=issue, thread_id=thread_id, reason=decision.reason)
+            )
+        return result
+
+
+# --------------------------------------------------------------------------- #
+# Internals — pure helpers.
+# --------------------------------------------------------------------------- #
+def _in_flight_repos(ready: list[Issue], in_progress_label: str) -> set[str]:
+    """Repos that already have an agent in flight, read off the ready set.
+
+    A repo is in flight if any of its ready issues still carries the in-progress
+    label — the stamp the poller applied on a previous tick's dispatch. Because
+    the dispatched issue keeps its ready label until the watcher closes/relabels
+    it, it re-appears here and locks the repo until the run finishes.
+    """
+    return {issue.repo for issue in ready if in_progress_label in issue.labels}
+
+
+def _dispatch_prompt(issue: Issue) -> str:
+    """The turn prompt for one issue's worker thread.
+
+    The full-access agent fetches the issue body itself (it has ``gh``), so the
+    prompt only needs to point unambiguously at the concrete ``repo#number``; the
+    standing rules are prepended by ``t3_client`` as the issue-implementer
+    preamble. Kept deliberately terse — one canonical instruction, no per-issue
+    templating to drift.
+    """
+    return (
+        f"Implement issue #{issue.number} in the `{issue.repo}` repository. "
+        f"Fetch the issue with `gh issue view {issue.number} --repo {issue.repo}` "
+        f"(and its comments) to get the full task, then implement it end to end."
+    )
--- a/app/afk/run_state_machine.py
+++ b/app/afk/run_state_machine.py
@@ -0,0 +1,84 @@
+"""Run state machine: assembled ``RunState`` -> next ``Action`` (ADR-0002).
+
+This is the heart of the AFK loop's per-issue control: each tick the loop
+assembles a :class:`~app.afk.types.RunState` (thread liveness from the
+orchestration snapshot, CI verdict from the watcher, plus its own ``pushed`` /
+``fix_forward_attempts`` / ``elapsed_seconds`` bookkeeping) and calls
+:func:`next_action` to decide what to do next.
+
+The function is **pure** — it reads only its two arguments, never the clock, the
+network, or any global. That keeps the lifecycle policy a plain decision table
+the test suite can exhaust combinatorially; the loop owns all the I/O (closing
+issues, dispatching corrective turns, escalating) based on the Action returned.
+
+The decision table (first match wins):
+
+  * pushed AND CI green                         -> CLOSE_SUCCESS
+      The run is healthy and verified; close the issue. The thread's own status
+      is irrelevant once a pushed commit is green.
+  * pushed AND CI red, budget remaining         -> FIX_FORWARD
+      A pushed commit broke CI. Dispatch another corrective turn — but only
+      while BOTH budgets hold: ``fix_forward_attempts < fix_forward_max_attempts``
+      AND ``elapsed_seconds < fix_forward_max_seconds`` (strict; at/over either
+      bound is exhausted).
+  * pushed AND CI red, budget exhausted         -> FREEZE_ESCALATE
+      Out of fix-forward attempts or wall-clock; stop churning and hand to a
+      human with the broken commit left in place.
+  * not pushed AND thread ERROR/IDLE            -> ESCALATE_PREPUSH
+      The agent will never reach green: it errored, or its turn finished /
+      stalled with nothing pushed. There is no pushed commit to fix forward, so
+      escalate before-push (a different remediation path than FREEZE_ESCALATE).
+  * everything else                             -> WAIT
+      Still in flight: working toward a first push (thread running / unknown), or
+      pushed with CI not yet decided. Poll again next tick.
+"""
+from .types import Action, CIStatus, Config, RunState, ThreadStatus
+
+# Thread states that mean the agent is finished with this turn — it will not push
+# any further on its own. Reaching one of these with nothing pushed is terminal
+# (escalate), whereas RUNNING / None (no snapshot entry yet) means keep waiting.
+_TERMINAL_THREAD_STATES: frozenset[ThreadStatus] = frozenset(
+    {ThreadStatus.ERROR, ThreadStatus.IDLE}
+)
+
+
+def next_action(state: RunState, config: Config) -> Action:
+    """Decide the next :class:`Action` for one issue's run.
+
+    Pure and total: every reachable ``(thread_status, ci_status, pushed,
+    attempts, elapsed)`` combination maps to exactly one Action via the table in
+    the module docstring. See that table for the rationale of each branch.
+    """
+    if state.pushed:
+        # A commit is out; the CI verdict on it drives everything from here.
+        if state.ci_status is CIStatus.GREEN:
+            return Action.CLOSE_SUCCESS
+        if state.ci_status is CIStatus.RED:
+            return (
+                Action.FIX_FORWARD
+                if _fix_forward_budget_remaining(state, config)
+                else Action.FREEZE_ESCALATE
+            )
+        # CI pending / not yet reported -> wait for the verdict.
+        return Action.WAIT
+
+    # Nothing pushed yet. If the turn is over (errored or gone idle) the run can
+    # never reach green on its own -> escalate before-push; otherwise it is still
+    # working toward a first push -> wait.
+    if state.thread_status in _TERMINAL_THREAD_STATES:
+        return Action.ESCALATE_PREPUSH
+    return Action.WAIT
+
+
+def _fix_forward_budget_remaining(state: RunState, config: Config) -> bool:
+    """True while another fix-forward turn is allowed.
+
+    Both bounds must hold (strict ``<``): the run has spent fewer than
+    ``fix_forward_max_attempts`` corrective turns AND fewer than
+    ``fix_forward_max_seconds`` of wall-clock. Hitting either cap exhausts the
+    budget.
+    """
+    return (
+        state.fix_forward_attempts < config.fix_forward_max_attempts
+        and state.elapsed_seconds < config.fix_forward_max_seconds
+    )
--- a/app/afk/t3_client.py
+++ b/app/afk/t3_client.py
@@ -0,0 +1,159 @@
+"""Adapter for the in-cluster T3 Code instance — the AFK executor + cockpit.
+
+The control plane keeps the brain; T3 runs the agent. This module is the thin
+wire between them: it turns "implement issue N of repo R with this prompt" into
+the TWO HTTP commands T3's orchestration API needs, and reads the fleet
+snapshot the watcher polls. It owns no AFK behaviour — the agent's standing
+rules ride in as the ``ISSUE_IMPLEMENTER_PREAMBLE`` prepended to the turn
+message, because T3's full-access ``claudeAgent`` runtime does NOT honour
+``~/.claude/CLAUDE.md`` (see ``issue_implementer_prompt``).
+
+Two operations, both against the dedicated in-cluster T3 pod:
+
+  * ``dispatch(repo, issue, prompt) -> thread_id`` — POSTs ``thread.create``
+    then ``thread.turn.start`` to ``/api/orchestration/dispatch``. The create
+    command selects the ``claudeAgent`` instance in ``full-access`` runtime mode
+    and returns a thread id; the turn command targets that thread and delivers
+    ``ISSUE_IMPLEMENTER_PREAMBLE + prompt`` as ``message.text``. One dispatch =
+    one worktree-isolated worker.
+  * ``snapshot() -> dict`` — GETs ``/api/orchestration/snapshot``, the full fleet
+    read-model. T3 has no outbound webhooks, so the watcher polls this for
+    per-thread ``running``/``idle``/``error`` status.
+
+The HTTP transport and the bearer provider are **injected** (constructor
+args), so the production wiring hands in an ``httpx.Client`` plus a Vault-backed
+token reader, while tests hand in an in-memory fake — nothing here ever opens a
+socket on its own. The bearer is re-read from the provider on **every** request
+because T3's ``orchestration:operate`` token expires hourly and is refreshed out
+of band.
+"""
+from collections.abc import Callable
+from typing import Protocol
+
+from .issue_implementer_prompt import ISSUE_IMPLEMENTER_PREAMBLE
+
+# Orchestration API paths, relative to the configured base URL.
+_DISPATCH_PATH = "/api/orchestration/dispatch"
+_SNAPSHOT_PATH = "/api/orchestration/snapshot"
+
+# Pilot-baked dispatch envelope: which backend instance runs the thread and in
+# which runtime mode. Constants (not config) — every AFK thread is identical.
+_INSTANCE_ID = "claudeAgent"
+_RUNTIME_MODE = "full-access"
+
+# JSON shapes. Command bodies and the snapshot read-model are open string-keyed
+# objects; ``object`` values keep us honest without a bare ``Any``.
+type Json = dict[str, object]
+
+
+class HttpResponse(Protocol):
+    """The httpx-shaped response surface this adapter relies on.
+
+    Both ``httpx.Response`` and the test fake satisfy it: ``raise_for_status``
+    turns a non-2xx into an exception (so a failed ``thread.create`` aborts
+    before ``thread.turn.start`` ever fires) and ``json`` parses the body.
+    """
+
+    def raise_for_status(self) -> object: ...
+
+    def json(self) -> Json: ...
+
+
+class HttpClient(Protocol):
+    """Minimal injected transport: a JSON ``post`` and a ``get``, both taking
+    explicit keyword-only headers. A strict subset of ``httpx.Client``, so
+    production injects the real client as-is and tests inject a recorder."""
+
+    def post(self, url: str, *, json: Json, headers: dict[str, str]) -> HttpResponse: ...
+
+    def get(self, url: str, *, headers: dict[str, str]) -> HttpResponse: ...
+
+
+class T3Client:
+    """Dispatch/snapshot adapter for one in-cluster T3 instance.
+
+    ``base_url`` is the T3 service root (a trailing slash is tolerated);
+    ``http`` is the injected transport; ``bearer_provider`` returns the current
+    ``orchestration:operate`` token, re-read per request for hourly rotation.
+    """
+
+    def __init__(
+        self,
+        base_url: str,
+        http: HttpClient,
+        bearer_provider: Callable[[], str],
+    ) -> None:
+        self._base_url = base_url.rstrip("/")
+        self._http = http
+        self._bearer_provider = bearer_provider
+
+    # ----------------------------------------------------------------- #
+    # Public API (the ``t3_client.T3Client`` contract).
+    # ----------------------------------------------------------------- #
+    def dispatch(self, repo: str, issue: int, prompt: str) -> str:
+        """Spawn one worker thread for ``issue`` of ``repo`` and return its id.
+
+        Two POSTs to ``/api/orchestration/dispatch``: ``thread.create`` (selects
+        the ``claudeAgent`` instance, ``full-access`` runtime) yields the thread
+        id; ``thread.turn.start`` then delivers ``ISSUE_IMPLEMENTER_PREAMBLE +
+        prompt`` to that thread. A failed create raises and short-circuits the
+        turn (we never fire a turn at a thread that wasn't created).
+        """
+        create_resp = self._post(
+            _DISPATCH_PATH,
+            {
+                "command": "thread.create",
+                "repo": repo,
+                "issue": issue,
+                "modelSelection": {"instanceId": _INSTANCE_ID},
+                "runtimeMode": _RUNTIME_MODE,
+            },
+        )
+        thread_id = self._thread_id_of(create_resp.json())
+
+        self._post(
+            _DISPATCH_PATH,
+            {
+                "command": "thread.turn.start",
+                "threadId": thread_id,
+                "message": {"text": ISSUE_IMPLEMENTER_PREAMBLE + prompt},
+            },
+        )
+        return thread_id
+
+    def snapshot(self) -> Json:
+        """Return the parsed fleet read-model from ``/api/orchestration/snapshot``."""
+        return self._get(_SNAPSHOT_PATH).json()
+
+    # ----------------------------------------------------------------- #
+    # Internals.
+    # ----------------------------------------------------------------- #
+    def _post(self, path: str, body: Json) -> HttpResponse:
+        resp = self._http.post(self._url(path), json=body, headers=self._headers())
+        resp.raise_for_status()
+        return resp
+
+    def _get(self, path: str) -> HttpResponse:
+        resp = self._http.get(self._url(path), headers=self._headers())
+        resp.raise_for_status()
+        return resp
+
+    def _url(self, path: str) -> str:
+        return f"{self._base_url}{path}"
+
+    def _headers(self) -> dict[str, str]:
+        return {"Authorization": f"Bearer {self._bearer_provider()}"}
+
+    @staticmethod
+    def _thread_id_of(create_response: Json) -> str:
+        """Extract the new thread id from a ``thread.create`` reply.
+
+        T3 returns it as ``threadId``; we fail loudly on a malformed reply rather
+        than dispatch a turn at an empty/None id.
+        """
+        thread_id = create_response.get("threadId")
+        if not isinstance(thread_id, str) or not thread_id:
+            raise ValueError(
+                f"thread.create response missing a usable threadId: {create_response!r}"
+            )
+        return thread_id
--- a/app/afk/tracker.py
+++ b/app/afk/tracker.py
@@ -0,0 +1,243 @@
+"""Issue-tracker adapter — the loop's read/write port onto GitHub issues.
+
+``Tracker`` is the only place the AFK loop touches the issue tracker. It wraps an
+injected ``GitHubClient`` (the port) so the policy/state-machine code — and the
+tests — never depend on a real ``gh`` or the network: production injects
+``GhCliClient`` (shells out to ``gh`` with no-shell argv); tests inject a fake.
+
+The split is deliberate. The ``GitHubClient`` port speaks only in *primitives*
+(list raw issues for a label, fetch a single issue's label events, and the four
+mutations). All the loop-specific *decisions* live on ``Tracker``:
+
+  * ``labeled_by_trusted`` — decided **fail-closed** from the actor who made the
+    most-recent application of the ready label. On private repos only
+    collaborators can label, so the label *is* the authorization (design doc,
+    "Trigger & dispatch predicate"); an unattributable label is never trusted.
+  * ``blocked_by`` — the issue numbers in the body's "Blocked by #N" clauses
+    (the per-issue dependency the design doc gates dispatch on).
+  * ``priority`` — read off a ``priority:<n>`` label, lowest wins (lower runs
+    first, matching ``Issue.priority`` semantics in ``types``).
+
+Keeping the decisions here, not in the client, is what lets the whole read path
+be tested against a thin fake. Mutations (``add_label`` / ``remove_label`` /
+``comment`` / ``close``) are pass-throughs the loop drives during a run.
+"""
+import json
+import re
+from collections.abc import Callable
+from subprocess import PIPE, run
+from typing import Protocol, runtime_checkable
+
+from .types import Issue
+
+# Trusted author associations: GitHub tags each issue event actor with their
+# association to the repo. Only these may arm an issue for the AFK loop — the
+# trust gate from the design doc. Overridable per Tracker for a tighter policy.
+DEFAULT_TRUSTED_ASSOCIATIONS: frozenset[str] = frozenset({"OWNER", "MEMBER", "COLLABORATOR"})
+
+# Default gating label; mirrors Config.ready_label so a Tracker built without an
+# explicit override matches the production default.
+DEFAULT_READY_LABEL = "ready-for-agent"
+
+# "Blocked by #3, #4 and #10" → [3, 4, 10]. We match a "blocked by" lead-in
+# (case-insensitive) and then harvest every "#<n>" in the clause that follows,
+# up to the next line break — so a bare "#7 for context" elsewhere is ignored.
+_BLOCKED_BY_CLAUSE = re.compile(r"\bblocked\s+by\b([^\n\r]*)", re.IGNORECASE)
+_ISSUE_REF = re.compile(r"#(\d+)")
+
+# "priority:2" → 2. Anything non-numeric (e.g. "priority:high") is not a numeric
+# priority and is skipped.
+_PRIORITY_LABEL = re.compile(r"^priority:(\d+)$")
+
+
+@runtime_checkable
+class GitHubClient(Protocol):
+    """The primitive surface ``Tracker`` depends on — one issue tracker, faked
+    in tests. Implementations must not embed loop policy; they only fetch raw
+    data and perform the four mutations.
+
+    ``list_issues`` returns the ``gh issue list --json number,labels,body`` shape
+    (``labels`` is a list of ``{"name": ...}``; ``body`` may be ``None``).
+    ``label_events`` returns the ``labeled`` timeline events for one issue, each
+    with ``label.name``, ``actor.login`` and ``author_association``.
+    """
+
+    def list_issues(self, repo: str, label: str) -> list[dict]: ...
+    def label_events(self, repo: str, number: int) -> list[dict]: ...
+    def add_label(self, repo: str, number: int, label: str) -> None: ...
+    def remove_label(self, repo: str, number: int, label: str) -> None: ...
+    def comment(self, repo: str, number: int, body: str) -> None: ...
+    def close(self, repo: str, number: int) -> None: ...
+
+
+class Tracker:
+    """Adapter that turns raw issue-tracker data into ``Issue`` records and
+    relays mutations, over an injected :class:`GitHubClient`."""
+
+    def __init__(
+        self,
+        client: GitHubClient,
+        ready_label: str = DEFAULT_READY_LABEL,
+        trusted_associations: frozenset[str] = DEFAULT_TRUSTED_ASSOCIATIONS,
+    ) -> None:
+        self.client = client
+        self.ready_label = ready_label
+        self.trusted_associations = trusted_associations
+
+    # ----------------------------------------------------------------- reads #
+    def list_ready(self, repos: list[str]) -> list[Issue]:
+        """Every ready-labeled open issue across ``repos``, as ``Issue`` records.
+
+        Ordering follows the client's per-repo order; dispatch ordering by
+        priority is the dispatch policy's job, not the tracker's.
+        """
+        issues: list[Issue] = []
+        for repo in repos:
+            for raw in self.client.list_issues(repo, self.ready_label):
+                issues.append(self._to_issue(repo, raw))
+        return issues
+
+    def _to_issue(self, repo: str, raw: dict) -> Issue:
+        number = int(raw["number"])
+        labels = [lbl["name"] for lbl in raw.get("labels", [])]
+        return Issue(
+            number=number,
+            repo=repo,
+            labels=labels,
+            blocked_by=_parse_blocked_by(raw.get("body")),
+            labeled_by_trusted=self._is_labeled_by_trusted(repo, number),
+            priority=_parse_priority(labels),
+        )
+
+    def _is_labeled_by_trusted(self, repo: str, number: int) -> bool:
+        """True iff the MOST RECENT application of the ready label was made by a
+        trusted actor. Fail-closed: no attributable application → not trusted."""
+        last_association: str | None = None
+        for event in self.client.label_events(repo, number):
+            if event.get("event") != "labeled":
+                continue
+            if (event.get("label") or {}).get("name") != self.ready_label:
+                continue
+            last_association = event.get("author_association")
+        return last_association in self.trusted_associations
+
+    # ------------------------------------------------------------- mutations #
+    def add_label(self, repo: str, issue: int, label: str) -> None:
+        self.client.add_label(repo, issue, label)
+
+    def remove_label(self, repo: str, issue: int, label: str) -> None:
+        self.client.remove_label(repo, issue, label)
+
+    def comment(self, repo: str, issue: int, body: str) -> None:
+        self.client.comment(repo, issue, body)
+
+    def close(self, repo: str, issue: int) -> None:
+        self.client.close(repo, issue)
+
+
+# --------------------------------------------------------------------------- #
+# Parsing helpers — pure functions, no I/O.
+# --------------------------------------------------------------------------- #
+def _parse_blocked_by(body: str | None) -> list[int]:
+    """Issue numbers referenced in the body's "Blocked by #N" clauses.
+
+    Order-preserving and de-duplicated; bare "#N" mentions outside a "blocked by"
+    clause are ignored. A missing/empty body yields ``[]``.
+    """
+    if not body:
+        return []
+    seen: dict[int, None] = {}  # insertion-ordered set
+    for clause in _BLOCKED_BY_CLAUSE.findall(body):
+        for ref in _ISSUE_REF.findall(clause):
+            seen.setdefault(int(ref), None)
+    return list(seen)
+
+
+def _parse_priority(labels: list[str]) -> int:
+    """Numeric priority from a ``priority:<n>`` label, lowest wins; 0 if none."""
+    priorities = [
+        int(match.group(1))
+        for label in labels
+        if (match := _PRIORITY_LABEL.match(label))
+    ]
+    return min(priorities) if priorities else 0
+
+
+# --------------------------------------------------------------------------- #
+# Concrete client — shells out to `gh`. Injected `run` keeps it testable.
+# --------------------------------------------------------------------------- #
+def _default_run(argv: list[str]) -> str:
+    """Run ``argv`` with no shell and return stdout (text). Raises on non-zero.
+
+    List argv (never a shell string), matching the no-injection-surface pattern
+    the breakglass/main subprocess helpers use — the repo/label/body values are
+    never interpreted by a shell.
+    """
+    proc = run(argv, stdout=PIPE, stderr=PIPE, text=True, check=False)
+    if proc.returncode != 0:
+        raise RuntimeError(f"{argv[0]} failed ({proc.returncode}): {proc.stderr[:200]}")
+    return proc.stdout
+
+
+class GhCliClient:
+    """:class:`GitHubClient` backed by the ``gh`` CLI.
+
+    ``repo_owner`` is the GitHub owner/org the sub-project repos live under, so a
+    bare repo name (``"infra"``) becomes the ``--repo owner/infra`` slug ``gh``
+    wants. ``run`` is the subprocess runner (defaults to the real no-shell one);
+    tests inject a fake to capture argv without spawning ``gh``.
+    """
+
+    def __init__(self, repo_owner: str, run: Callable[[list[str]], str] = _default_run) -> None:
+        self.repo_owner = repo_owner
+        self._run = run
+
+    def _slug(self, repo: str) -> str:
+        return f"{self.repo_owner}/{repo}"
+
+    def list_issues(self, repo: str, label: str) -> list[dict]:
+        out = self._run([
+            "gh", "issue", "list", "--repo", self._slug(repo),
+            "--label", label, "--state", "open",
+            "--json", "number,labels,body", "--limit", "100",
+        ])
+        return _loads_list(out)
+
+    def label_events(self, repo: str, number: int) -> list[dict]:
+        out = self._run([
+            "gh", "api",
+            f"repos/{self._slug(repo)}/issues/{number}/timeline",
+            "--paginate",
+            "-H", "Accept: application/vnd.github+json",
+        ])
+        events = _loads_list(out)
+        return [e for e in events if e.get("event") == "labeled"]
+
+    def add_label(self, repo: str, number: int, label: str) -> None:
+        self._run([
+            "gh", "issue", "edit", str(number), "--repo", self._slug(repo),
+            "--add-label", label,
+        ])
+
+    def remove_label(self, repo: str, number: int, label: str) -> None:
+        self._run([
+            "gh", "issue", "edit", str(number), "--repo", self._slug(repo),
+            "--remove-label", label,
+        ])
+
+    def comment(self, repo: str, number: int, body: str) -> None:
+        self._run([
+            "gh", "issue", "comment", str(number), "--repo", self._slug(repo),
+            "--body", body,
+        ])
+
+    def close(self, repo: str, number: int) -> None:
+        self._run(["gh", "issue", "close", str(number), "--repo", self._slug(repo)])
+
+
+def _loads_list(out: str) -> list[dict]:
+    """Parse ``gh`` JSON stdout into a list of dicts. Empty stdout → ``[]``."""
+    text = out.strip()
+    if not text:
+        return []
+    return json.loads(text)
--- a/app/afk/types.py
+++ b/app/afk/types.py
@@ -0,0 +1,134 @@
+"""Shared types for the AFK loop — the contract every module builds against.
+
+Stdlib only (``dataclasses`` + ``enum``), matching the breakglass code: no
+pydantic, modern ``X | None`` unions, precise field types. Every other module in
+``app.afk`` imports its inputs/outputs from here so the pieces stay aligned; the
+module-level docstring in ``__init__`` lists which functions consume which type.
+
+Nothing here has behaviour — these are pure data carriers and closed enums. Keep
+it that way: logic lives in ``dispatch_policy`` / ``run_state_machine`` / the
+client modules, never on the dataclasses.
+"""
+from dataclasses import dataclass
+from enum import Enum
+
+
+# --------------------------------------------------------------------------- #
+# Enums — closed vocabularies the state machine and clients speak in.
+# --------------------------------------------------------------------------- #
+class ThreadStatus(Enum):
+    """Liveness of a T3 thread, as projected from the orchestration snapshot.
+
+    ``RUNNING`` — the agent is still working the turn; ``IDLE`` — the turn
+    finished cleanly (it has gone quiet); ``ERROR`` — the thread/turn failed.
+    """
+
+    RUNNING = "running"
+    IDLE = "idle"
+    ERROR = "error"
+
+
+class CIStatus(Enum):
+    """CI verdict for a pushed commit. ``PENDING`` covers both "no run yet" and
+    "in progress" — the state machine waits on either."""
+
+    PENDING = "pending"
+    GREEN = "green"
+    RED = "red"
+
+
+class Phase(Enum):
+    """Where a single issue's run is in its lifecycle. Ordered: each phase is a
+    gate the run passes through on the way to ``DONE``. ``phase_checklist``
+    renders these; the loop advances through them as evidence arrives."""
+
+    WORKTREE = "worktree"      # isolated workspace created
+    TESTS_RED = "tests_red"    # failing test written first (TDD red)
+    GREEN = "green"            # implementation makes tests pass (TDD green)
+    PUSHED = "pushed"          # commit(s) pushed to master
+    CI = "ci"                  # CI pipeline running on the pushed commit
+    DEPLOYED = "deployed"      # deploy/rollout reached the cluster
+    DONE = "done"              # verified complete; issue can be closed
+
+
+class Action(Enum):
+    """The decision ``run_state_machine.next_action`` returns for one tick.
+
+    ``WAIT`` — nothing to do yet, poll again; ``CLOSE_SUCCESS`` — run is green,
+    CI passed, close the issue; ``ESCALATE_PREPUSH`` — the agent errored/stalled
+    before pushing anything, hand back to a human; ``FIX_FORWARD`` — CI went red
+    on a pushed commit, dispatch another corrective turn; ``FREEZE_ESCALATE`` —
+    fix-forward budget exhausted (attempts or wall-clock), stop and escalate.
+    """
+
+    WAIT = "wait"
+    CLOSE_SUCCESS = "close_success"
+    ESCALATE_PREPUSH = "escalate_prepush"
+    FIX_FORWARD = "fix_forward"
+    FREEZE_ESCALATE = "freeze_escalate"
+
+
+# --------------------------------------------------------------------------- #
+# Data carriers.
+# --------------------------------------------------------------------------- #
+@dataclass
+class Issue:
+    """A tracker issue the loop might dispatch.
+
+    ``labeled_by_trusted`` records whether the gating label was applied by a
+    trusted identity — the loop must never dispatch an issue made ready by an
+    untrusted actor (prompt-injection / drive-by). ``blocked_by`` lists issue
+    numbers that must close first; ``priority`` orders the ready set (lower runs
+    first, matching tracker conventions).
+    """
+
+    number: int
+    repo: str
+    labels: list[str]
+    blocked_by: list[int]
+    labeled_by_trusted: bool
+    priority: int
+
+
+@dataclass
+class DispatchDecision:
+    """An issue the dispatch policy selected to run now, with a human-readable
+    ``reason`` (logged + surfaced in notifications, never parsed)."""
+
+    issue: Issue
+    reason: str
+
+
+@dataclass
+class Config:
+    """Loop configuration. DISABLED BY DEFAULT: a Config with ``kill_switch=True``
+    and an empty ``allowlist`` dispatches nothing. Both fields are required here
+    (no dataclass default), so the caller or loader must pass the disabled values;
+    enabling is a deliberate manual step (see ``config.from_env`` / ``from_configmap``).
+    """
+
+    allowlist: list[str]
+    kill_switch: bool
+    in_progress_label: str = "agent-in-progress"
+    ready_label: str = "ready-for-agent"
+    budget_usd: float = 100.0
+    fix_forward_max_attempts: int = 5
+    fix_forward_max_seconds: int = 3600
+
+
+@dataclass
+class RunState:
+    """Everything the state machine needs to decide one issue's next move.
+
+    Assembled each tick from the orchestration snapshot (``thread_status``), the
+    CI watcher (``ci_status``), and the loop's own bookkeeping (``pushed``,
+    ``fix_forward_attempts``, ``elapsed_seconds``). ``thread_status`` /
+    ``ci_status`` are ``None`` when not yet known (no snapshot entry / nothing
+    pushed to check yet).
+    """
+
+    thread_status: ThreadStatus | None
+    ci_status: CIStatus | None
+    pushed: bool
+    fix_forward_attempts: int
+    elapsed_seconds: float
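
Review note on the `Config` gates above: the commit message says arming requires BOTH gates (kill switch cleared AND the repo enrolled). A minimal sketch of that two-gate check follows. `may_dispatch` is a hypothetical helper, not a function in this diff, and the field defaults shown are an assumption mirroring the "ships disabled" description (the dataclass in this diff declares both fields without defaults).

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    # Trimmed copy with only the two gate fields. Defaults are an assumption
    # for illustration; the dataclass in the diff requires both arguments.
    allowlist: list[str] = field(default_factory=list)
    kill_switch: bool = True


def may_dispatch(config: Config, repo: str) -> bool:
    """Hypothetical gate: the kill switch is off AND ``repo`` is enrolled."""
    return (not config.kill_switch) and repo in config.allowlist


disabled = Config()
only_enrolled = Config(allowlist=["infra"])              # kill switch still on
only_unlocked = Config(kill_switch=False)                # nothing enrolled
armed = Config(allowlist=["infra"], kill_switch=False)   # both gates cleared
```

Clearing only one gate leaves every repo disarmed, which is the property the two-gate design is meant to give: a single mistaken setting cannot arm the loop.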
--- a/app/afk/watcher.py
+++ b/app/afk/watcher.py
@@ -0,0 +1,342 @@
+"""CronJob entrypoint: drive ONE in-flight AFK run by a single tick.
+
+The watcher is the *second half* of the loop — the part that drives a run the
+poller already started through to a terminal state. Given one in-flight run
+(``InFlightRun``: the issue, the T3 thread to poll, the pushed commit if any,
+and the fix-forward bookkeeping), one ``tick``:
+
+  1. **assemble a ``RunState``** from the live edges + the run's bookkeeping:
+       * ``thread_status`` — from ``t3_client.snapshot()``, by finding this run's
+         thread and mapping T3's ``running``/``idle``/``error`` to a
+         ``ThreadStatus`` (missing thread, or any unrecognised status, folds to
+         ``None`` → "no status yet" → the state machine WAITs; we never escalate
+         or close on a status we don't understand);
+       * ``ci_status`` — ``ci_watcher.status(repo, commit)`` *only* when a commit
+         is pushed (no commit ⇒ nothing to check ⇒ ``None``);
+       * ``pushed`` / ``fix_forward_attempts`` / ``elapsed_seconds`` — straight
+         from the run.
+  2. **decide** via the pure ``run_state_machine.next_action`` (it owns the
+     lifecycle policy; the watcher owns only the I/O the decision implies).
+  3. **act** on the returned ``Action``:
+       * ``CLOSE_SUCCESS`` → ``tracker.close`` + drop the in-progress label +
+         DONE checklist + ``done`` doorbell. The run landed.
+       * ``ESCALATE_PREPUSH`` / ``FREEZE_ESCALATE`` → drop the in-progress label,
+         add the ``ready-for-human`` label, post the checklist, ring the
+         ``needs-human`` / ``frozen`` doorbell. The run is handed to a human; the
+         issue is left OPEN (not closed) with the work in place.
+       * ``FIX_FORWARD`` → dispatch a corrective turn (``t3_client.dispatch``),
+         bump the fix-forward attempt count, refresh the checklist, and keep the
+         run in flight (NOT terminal: no label churn, no doorbell — the notifier
+         only speaks terminal kinds). The new thread id rides back on the result
+         so the next tick polls the corrective turn.
+       * ``WAIT`` → just refresh the progress checklist and keep waiting.
+
+Every adapter (T3, tracker, CI, notifier) is injected behind a structural
+Protocol, so production wires the real clients and the tests wire the in-memory
+fakes; this module opens no socket and reads no message bodies. (The pilot keeps
+T3 ``state.sqlite`` message-body reads out of the core loop — snapshot status +
+CI status are all the state machine needs — so this watcher never execs into the
+pod; that observability nicety is a separate, optional concern.)
+
+DISABLED BY DEFAULT applies transitively: the poller never starts a run while
+the loop is off (``config.kill_switch`` / empty allowlist — see ``config.py``),
+so with the shipped defaults there is never an ``InFlightRun`` to tick.
+"""
+from dataclasses import dataclass
+from typing import Protocol
+
+from . import phase_checklist, run_state_machine
+from .notifier import KIND_DONE, KIND_FROZEN, KIND_NEEDS_HUMAN
+from .poller import T3Port as _DispatchPort  # dispatch(repo, issue, prompt) -> id
+from .types import Action, CIStatus, Config, Issue, Phase, RunState, ThreadStatus
+
+# T3 snapshot status string -> ThreadStatus. Anything not in here (a status T3
+# adds later, or a malformed entry) maps to None — "no usable status yet" — so
+# the state machine waits rather than acting on something it can't interpret.
+_THREAD_STATUS_BY_STRING: dict[str, ThreadStatus] = {
+    "running": ThreadStatus.RUNNING,
+    "idle": ThreadStatus.IDLE,
+    "error": ThreadStatus.ERROR,
+}
+
+# Action -> the terminal doorbell kind to ring. Only the terminal actions appear;
+# WAIT / FIX_FORWARD are non-terminal and ring nothing (the notifier rejects a
+# non-terminal kind on purpose — see ``notifier.TERMINAL_KINDS``).
+_TERMINAL_KIND_BY_ACTION: dict[Action, str] = {
+    Action.CLOSE_SUCCESS: KIND_DONE,
+    Action.ESCALATE_PREPUSH: KIND_NEEDS_HUMAN,
+    Action.FREEZE_ESCALATE: KIND_FROZEN,
+}
+
+# Default label applied when a run is handed back to a human. Mirrors the
+# tracker's ``ready-for-agent`` convention; overridable per-Watcher.
+DEFAULT_READY_FOR_HUMAN_LABEL = "ready-for-human"
+
+
+# --------------------------------------------------------------------------- #
+# Injected adapter Protocols — structural, so the real clients and the test
+# fakes both satisfy them with no subclassing. Only the methods the watcher
+# actually calls appear. ``poller.T3Port`` is reused here as ``_DispatchPort``.
+# --------------------------------------------------------------------------- #
+class SnapshotPort(_DispatchPort, Protocol):
+    """T3 surface the watcher needs: ``dispatch`` (for the corrective turn) plus
+    ``snapshot`` (for thread liveness)."""
+
+    def snapshot(self) -> dict: ...
+
+
+class TrackerPort(Protocol):
+    """The slice of ``tracker.Tracker`` the watch tick needs."""
+
+    def add_label(self, repo: str, issue: int, label: str) -> None: ...
+    def remove_label(self, repo: str, issue: int, label: str) -> None: ...
+    def comment(self, repo: str, issue: int, body: str) -> None: ...
+    def close(self, repo: str, issue: int) -> None: ...
+
+
+class CIPort(Protocol):
+    """The slice of ``ci_watcher.CIWatcher`` the watch tick needs."""
+
+    def status(self, repo: str, commit: str) -> CIStatus: ...
+
+
+class NotifierPort(Protocol):
+    """The slice of ``notifier.Notifier`` the watch tick needs."""
+
+    def notify(self, kind: str, issue: Issue, thread_id: str | None, detail: str) -> None: ...
+
+
+@dataclass
+class InFlightRun:
+    """One run the watcher is driving, as the loop tracks it between ticks.
+
+    ``thread_id`` is the T3 thread to poll this tick; ``commit`` is the pushed
+    commit CI watches (``None`` until the agent has pushed). ``fix_forward_attempts``
+    and ``elapsed_seconds`` are the loop's own bookkeeping, fed straight into the
+    assembled ``RunState`` — ``pushed`` is derived as ``commit is not None``.
+    """
+
+    issue: Issue
+    thread_id: str
+    commit: str | None
+    fix_forward_attempts: int = 0
+    elapsed_seconds: float = 0.0
+
+
+@dataclass
+class TickResult:
+    """The outcome of one watch tick.
+
+    ``action`` is the state machine's verdict; ``terminal`` is True iff the run
+    reached an end state (closed or handed to a human) and should no longer be
+    ticked. ``thread_id`` / ``fix_forward_attempts`` carry the (possibly updated)
+    bookkeeping the caller threads into the next ``InFlightRun`` — they change
+    only on a FIX_FORWARD (new corrective thread, incremented attempts) and are
+    otherwise echoed back unchanged.
+    """
+
+    action: Action
+    terminal: bool
+    thread_id: str
+    fix_forward_attempts: int
+
+
+class Watcher:
+    """Drives one in-flight run per ``tick`` over injected adapters.
+
+    The three escalation-vs-success decisions live in the pure
+    ``run_state_machine``; this class only performs the I/O each decision
+    implies. ``ready_for_human_label`` is the label stamped on a run handed back
+    to a human (default :data:`DEFAULT_READY_FOR_HUMAN_LABEL`).
+    """
+
+    def __init__(
+        self,
+        t3_client: SnapshotPort,
+        tracker: TrackerPort,
+        ci_watcher: CIPort,
+        notifier: NotifierPort,
+        ready_for_human_label: str = DEFAULT_READY_FOR_HUMAN_LABEL,
+    ) -> None:
+        self._t3 = t3_client
+        self._tracker = tracker
+        self._ci = ci_watcher
+        self._notifier = notifier
+        self._ready_for_human_label = ready_for_human_label
+
+    def tick(self, run: InFlightRun, config: Config) -> TickResult:
+        """Drive ``run`` one step (see module docstring)."""
+        state = self._assemble_state(run)
+        action = run_state_machine.next_action(state, config)
+
+        if action is Action.CLOSE_SUCCESS:
+            return self._close_success(run, config)
+        if action in (Action.ESCALATE_PREPUSH, Action.FREEZE_ESCALATE):
+            return self._escalate(run, state, action, config)
+        if action is Action.FIX_FORWARD:
+            return self._fix_forward(run, state)
+        # WAIT: still in flight — just show progress and poll again next tick.
+        return self._wait(run, state, action)
+
+    # ----------------------------------------------------------------- #
+    # RunState assembly.
+    # ----------------------------------------------------------------- #
+    def _assemble_state(self, run: InFlightRun) -> RunState:
+        thread_status = self._thread_status(run.thread_id)
+        # Query CI only when there is a commit to check: an unpushed run has no
+        # pipeline, so ``ci_status`` stays ``None``. This also avoids a needless
+        # API call, and the tests assert that CI is not queried in that case.
+        ci_status = (
+            self._ci.status(run.issue.repo, run.commit)
+            if run.commit is not None
+            else None
+        )
+        return RunState(
+            thread_status=thread_status,
+            ci_status=ci_status,
+            pushed=run.commit is not None,
+            fix_forward_attempts=run.fix_forward_attempts,
+            elapsed_seconds=run.elapsed_seconds,
+        )
+
+    def _thread_status(self, thread_id: str) -> ThreadStatus | None:
+        """This thread's liveness from the fleet snapshot, or ``None`` when the
+        thread is absent or its status string is one we don't recognise."""
+        for thread in self._t3.snapshot().get("threads") or []:
+            if isinstance(thread, dict) and thread.get("id") == thread_id:
+                return _THREAD_STATUS_BY_STRING.get(str(thread.get("status")))
+        return None
+
+    # ----------------------------------------------------------------- #
+    # Per-action handlers.
+    # ----------------------------------------------------------------- #
+    def _close_success(self, run: InFlightRun, config: Config) -> TickResult:
+        """Landed: close the issue, drop the lock, post DONE, ring the doorbell."""
+        self._post_checklist(run, Phase.DONE)
+        self._tracker.remove_label(
+            run.issue.repo, run.issue.number, config.in_progress_label
+        )
+        self._tracker.close(run.issue.repo, run.issue.number)
+        self._notify(run, Action.CLOSE_SUCCESS, "Run landed: pushed and CI green.")
+        return _terminal(Action.CLOSE_SUCCESS, run)
+
+    def _escalate(
+        self, run: InFlightRun, state: RunState, action: Action, config: Config
+    ) -> TickResult:
+        """Hand back to a human: drop the lock, add ready-for-human, post the
+        checklist, ring the matching doorbell. The issue stays OPEN."""
+        self._post_checklist(run, _phase_for(state))
+        self._tracker.remove_label(
+            run.issue.repo, run.issue.number, config.in_progress_label
+        )
+        self._tracker.add_label(
+            run.issue.repo, run.issue.number, self._ready_for_human_label
+        )
+        self._notify(run, action, _escalation_detail(action, state))
+        return _terminal(action, run)
+
+    def _fix_forward(self, run: InFlightRun, state: RunState) -> TickResult:
+        """CI red with budget left: dispatch a corrective turn and stay in flight.
+
+        Not terminal — no doorbell (the notifier only speaks terminal kinds) and
+        no label churn (the in-progress lock stays put). The corrective dispatch
+        spawns a fresh thread; its id and the incremented attempt count ride back
+        so the next tick tracks the right thread.
+        """
+        attempts = run.fix_forward_attempts + 1
+        new_thread_id = self._t3.dispatch(
+            run.issue.repo, run.issue.number, _fix_forward_prompt(run)
+        )
+        self._post_checklist(run, Phase.CI, fix_forward_attempts=attempts)
+        return TickResult(
+            action=Action.FIX_FORWARD,
+            terminal=False,
+            thread_id=new_thread_id,
+            fix_forward_attempts=attempts,
+        )
+
+    def _wait(self, run: InFlightRun, state: RunState, action: Action) -> TickResult:
+        """Still working: refresh the progress checklist, change nothing else."""
+        self._post_checklist(run, _phase_for(state))
+        return TickResult(
+            action=action,
+            terminal=False,
+            thread_id=run.thread_id,
+            fix_forward_attempts=run.fix_forward_attempts,
+        )
+
+    # ----------------------------------------------------------------- #
+    # I/O helpers.
+    # ----------------------------------------------------------------- #
+    def _post_checklist(
+        self, run: InFlightRun, phase: Phase, *, fix_forward_attempts: int | None = None
+    ) -> None:
+        attempts = run.fix_forward_attempts if fix_forward_attempts is None else fix_forward_attempts
+        body = phase_checklist.render(
+            phase,
+            {
+                "repo": run.issue.repo,
+                "issue": run.issue.number,
+                "thread_id": run.thread_id,
+                "fix_forward_attempts": attempts,
+            },
+        )
+        self._tracker.comment(run.issue.repo, run.issue.number, body)
+
+    def _notify(self, run: InFlightRun, action: Action, detail: str) -> None:
+        self._notifier.notify(
+            _TERMINAL_KIND_BY_ACTION[action], run.issue, run.thread_id, detail
+        )
+
+
+# --------------------------------------------------------------------------- #
+# Pure helpers.
+# --------------------------------------------------------------------------- #
+def _terminal(action: Action, run: InFlightRun) -> TickResult:
+    """A terminal :class:`TickResult` echoing the run's bookkeeping unchanged."""
+    return TickResult(
+        action=action,
+        terminal=True,
+        thread_id=run.thread_id,
+        fix_forward_attempts=run.fix_forward_attempts,
+    )
+
+
+def _phase_for(state: RunState) -> Phase:
+    """Best-effort current lifecycle phase from the evidence in ``state``.
+
+    The checklist is decoration only (the loop reads no agent message bodies), so
+    this maps the observable signals (pushed? CI verdict?) onto the closest
+    phase: nothing pushed ⇒ still working toward the implementation (GREEN);
+    pushed and CI green (but not yet closed) ⇒ DEPLOYED; pushed otherwise ⇒ CI.
+    DONE is rendered by the close path, not here.
+    """
+    if not state.pushed:
+        return Phase.GREEN
+    if state.ci_status is CIStatus.GREEN:
+        return Phase.DEPLOYED
+    return Phase.CI
+
+
+def _escalation_detail(action: Action, state: RunState) -> str:
+    """Human-readable escalation reason for the doorbell + logs (never parsed)."""
+    if action is Action.ESCALATE_PREPUSH:
+        return (
+            "Agent stalled or errored before pushing any commit "
+            f"(thread {state.thread_status.value if state.thread_status else 'unknown'}). "
+            "Handed back for a human."
+        )
+    return (
+        "Fix-forward budget exhausted with CI still red "
+        f"({state.fix_forward_attempts} attempts, {state.elapsed_seconds:.0f}s). "
+        "Frozen for a human."
+    )
+
+
+def _fix_forward_prompt(run: InFlightRun) -> str:
+    """The corrective-turn prompt: point the agent at the red CI on its commit."""
+    return (
+        f"CI is RED on your pushed commit {run.commit} for issue #{run.issue.number} "
+        f"in `{run.issue.repo}`. Investigate the failing run, fix the cause, and "
+        "push the fix to master. Then watch CI again until it is green."
+    )