Commit graph

16 commits

Author SHA1 Message Date
Viktor Barzin
a7b4f7ba32 [beadboard] Add agent-dispatch and agent-status API routes
## Context

The Dispatch-to-Agent button in the right-panel (previous commit) calls
two endpoints that did not yet exist server-side:

- `POST /api/agent-dispatch` — resolve the bead via the same Dolt pool
  the existing `/api/beads/read` route uses, build a prompt from it, and
  forward to `claude-agent-service`'s `/execute` endpoint with a bearer
  token. This is the server-side trust boundary: the bearer token lives
  in env (never in the browser), and the bead is re-read from Dolt
  (never trusted from the client payload) so a malicious client cannot
  inject a prompt.

- `GET /api/agent-status` — proxy `claude-agent-service`'s `/health`
  endpoint (which already returns `{status, busy}`), with a 2 s
  in-memory cache so that 5 s UI polls from multiple open tabs don't
  hammer the service.

The claude-agent-service serialises jobs behind an `asyncio.Lock` — a
second `/execute` while one is running returns HTTP 409 "Agent is busy".
We surface that 409 through to the browser unchanged so the UI can show
the right toast/status line without re-mapping status codes.

```
client ──► /api/agent-dispatch ──► claude-agent-service /execute
             │                        (asyncio.Lock guards)
             └─► reads bead from Dolt pool (same path as /api/beads/read)
             └─► buildDispatchPrompt(bead)
             └─► POST {prompt, agent, budget, timeout} + Bearer token

client ──► /api/agent-status (2s cache) ──► /health  ──► {busy: bool}
```

## This change

### `src/app/api/agent-dispatch/route.ts`

- `POST {taskId: string}` handler.
- Validates JSON body and non-empty taskId (→ 400).
- Early 500 if `CLAUDE_AGENT_SERVICE_URL` or `CLAUDE_AGENT_BEARER_TOKEN`
  is missing — fail fast, fail loud.
- Resolves the bead via `readIssuesFromDisk({preferBd: true})`, which
  uses the existing Dolt client (and falls back to `issues.jsonl`).
  Filtering by id after is acceptable at BeadBoard's scale (~hundreds
  of beads) and avoids introducing a new single-bead query path.
- 400 when the bead is missing, or when `acceptance_criteria?.trim()` is
  empty — defense in depth alongside the UI disable. The button should
  already hide in these cases, but a curl'd POST must still be rejected.
- Forwards to `${CLAUDE_AGENT_SERVICE_URL}/execute` with agent
  `beads-task-runner`, max_budget_usd 5, timeout_seconds 900 (matches
  the values in the task spec).
- Passes through 409 verbatim. Other upstream errors collapse to 502.
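
The request checks and status mapping listed above can be sketched as three pure functions. This is a minimal illustration, not the route's code: the names `parseDispatchBody`, `isDispatchable`, and `mapUpstreamStatus`, the error strings, and the choice to map any upstream 2xx to 200 are all assumptions.

```typescript
// Hypothetical helpers mirroring the checks described for /api/agent-dispatch.
type DispatchCheck =
  | { ok: true; taskId: string }
  | { ok: false; status: 400; error: string };

// Reject malformed JSON and a missing or blank taskId with a 400.
function parseDispatchBody(raw: string): DispatchCheck {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    return { ok: false, status: 400, error: "Invalid JSON body" };
  }
  const taskId = (body as { taskId?: unknown } | null)?.taskId;
  if (typeof taskId !== "string" || taskId.trim() === "") {
    return { ok: false, status: 400, error: "taskId is required" };
  }
  return { ok: true, taskId };
}

// A bead is dispatchable only if it exists and has non-blank acceptance criteria.
function isDispatchable(
  bead: { acceptance_criteria?: string | null } | undefined,
): boolean {
  return Boolean(bead?.acceptance_criteria?.trim());
}

// 409 passes through verbatim; every other upstream failure collapses to 502.
// Mapping any 2xx to 200 is an assumption of this sketch.
function mapUpstreamStatus(upstream: number): number {
  if (upstream >= 200 && upstream < 300) return 200;
  if (upstream === 409) return 409;
  return 502;
}
```

Keeping these decisions free of I/O would also make them testable without mocking `readIssuesFromDisk`.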

### `src/app/api/agent-status/route.ts`

- `GET` handler, module-level snapshot cache with 2 s TTL to avoid
  hammering `/health`.
- In-flight de-dup: a single pending `fetchRemoteStatus()` is shared
  across concurrent requests so we only hit the upstream once per
  window even under bursty load.
- When `CLAUDE_AGENT_SERVICE_URL` is unset, returns `{busy: false}` and
  skips the fetch entirely — this is how the dev server boots before
  the service env is configured.
- HTTP 503 from upstream is interpreted as `busy: true` (future-proofing
  in case the service swaps 409 for 503 on overload).
- Any network error degrades gracefully to `{busy: false}` — the 409
  path on `/api/agent-dispatch` is the authoritative gate.
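
The snapshot cache and in-flight de-dup described above can be sketched roughly as follows. `createStatusCache` and the injected clock are illustrative names, not the route's identifiers. Caching the degraded `{busy: false}` result after a network error is an assumption of this sketch.

```typescript
// Sketch of a TTL snapshot cache with in-flight de-duplication.
type AgentStatus = { busy: boolean };

function createStatusCache(
  fetchRemote: () => Promise<AgentStatus>,
  ttlMs = 2000,
  now: () => number = Date.now,
): () => Promise<AgentStatus> {
  let snapshot: { value: AgentStatus; at: number } | null = null;
  let inFlight: Promise<AgentStatus> | null = null;

  return function getStatus(): Promise<AgentStatus> {
    // Serve the cached snapshot while it is younger than the TTL.
    if (snapshot && now() - snapshot.at < ttlMs) {
      return Promise.resolve(snapshot.value);
    }
    // Share one pending upstream call across concurrent callers.
    if (inFlight) return inFlight;

    inFlight = fetchRemote()
      // Any network error degrades to not-busy (assumed to be cached too).
      .catch((): AgentStatus => ({ busy: false }))
      .then((value) => {
        snapshot = { value, at: now() };
        inFlight = null;
        return value;
      });
    return inFlight;
  };
}
```

Injecting `now` lets a test advance time without sleeping. Two calls made before the first resolves receive the same promise, so the upstream is hit once per window.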

### Test coverage

- `tests/api/agent-dispatch-route.test.ts` (3 cases): invalid JSON body,
  missing taskId, missing env returns 500.
- `tests/api/agent-status-route.test.ts` (3 cases): unset service URL
  returns `{busy: false}` without a fetch, `/health busy:true` proxies
  through, HTTP 503 maps to busy=true. Uses a `globalThis.fetch` stub
  and cache-busting query params on dynamic import so each case starts
  from a fresh module snapshot.
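
A `globalThis.fetch` stub of the kind mentioned above might look like this sketch. The helper names `makeHealthStub` and `installFetchStub` are hypothetical, and the stubbed response carries only the fields a status proxy would plausibly read.

```typescript
// Hypothetical test helpers; names are illustrative, not the suite's own.
type StubResponse = { ok: boolean; status: number; json: () => Promise<unknown> };
type FetchStub = (url: string) => Promise<StubResponse>;

// View globalThis through a narrow type so the sketch does not depend on DOM typings.
const globals = globalThis as unknown as { fetch?: unknown };

// Build a stub that answers every request with a fixed status and body,
// recording requested URLs so a test can assert that no fetch happened.
function makeHealthStub(status: number, body: unknown) {
  const calls: string[] = [];
  const stub: FetchStub = async (url) => {
    calls.push(url);
    return { ok: status >= 200 && status < 300, status, json: async () => body };
  };
  return { stub, calls };
}

// Install the stub and return a function that restores the original fetch.
function installFetchStub(stub: FetchStub): () => void {
  const original = globals.fetch;
  globals.fetch = stub;
  return () => {
    globals.fetch = original;
  };
}
```

Each case would install a stub, import the route module with a unique query string so the module-level cache starts empty, and call the returned restore function when done.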

## What is NOT in this change

- End-to-end happy-path coverage (bead loads, fetch returns 200, route
  yields job_id) would require mocking `readIssuesFromDisk`, which is a
  bigger refactor than this commit warrants. Live integration testing
  covers the happy path once the pipeline is wired.
- No retry/backoff on upstream failures. Upstream errors other than 409
  are returned as 502; the client decides whether to retry.
- No auth on the routes themselves — they inherit the Next.js app's
  session model (same as all other `/api/*` routes today).

## Test Plan

### Automated

```
$ node --import tsx --test tests/api/agent-dispatch-route.test.ts \
    tests/api/agent-status-route.test.ts
# tests 6 pass 6 fail 0 duration_ms 1560
```

All six route cases pass. Typecheck output shows only the pre-existing
`OrchestratorChatMessage` gap in `left-panel.tsx`. Lint output is
unchanged from `main` (1 pre-existing `no-require-imports` error in
`src/lib/bb-pi-bootstrap.ts`).

### Manual Verification

1. Export env:
   ```
   export CLAUDE_AGENT_SERVICE_URL=http://claude-agent-service.claude-agent.svc.cluster.local:8080
   export CLAUDE_AGENT_BEARER_TOKEN=$(vault kv get -field=api_bearer_token secret/claude-agent-service)
   ```
2. `npm run dev`
3. `curl -s http://localhost:3000/api/agent-status` → `{"busy":false}`
   when the service is idle (or unreachable, since network errors
   degrade to not-busy); `{"busy":true}` if a job is running.
4. `curl -X POST http://localhost:3000/api/agent-dispatch \
       -H 'content-type: application/json' \
       -d '{"taskId":"beadboard-xyz"}'`
   - 400 if bead does not exist or lacks acceptance criteria.
   - 200 `{"job_id":"..."}` on success.
   - 409 `{"error":"Agent is busy"}` when an agent is already running.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:07:21 +00:00
zenchantlive
1c4b5ab401 Cleanup: Runtime artifacts, hard-coded paths, PR 14 bug fixes 2026-03-05 15:57:33 -08:00
ZenchantLive
e16ddcb532 test(coord): expand registry/api/reservation coverage 2026-03-03 18:48:13 -08:00
ZenchantLive
dcca324bfb feat(api): add agent mail and reservations routes 2026-03-03 18:32:11 -08:00
ZenchantLive
b5db7a7753 checkpoint: pre-split branch cleanup 2026-03-03 16:43:42 -08:00
zenchantlive
835018c183 Add beads: Skill v4 epic (1bg), Quality gates (n1h), Brainstorm epics (jq5, 2e6), memory nodes 2026-03-01 22:56:18 -08:00
zenchantlive
c8c91736b8 fix: remove buildProjectContext usage causing build error 2026-03-01 21:22:46 -08:00
zenchantlive
c246ceaf21 feat(ux): consolidate Launch Swarm + telemetry UX with minimized strip
- Removed broken LaunchSwarmDialog (formula-based) from TopBar/LeftPanel
- All Rocket buttons (TopBar, LeftPanel, DAG nodes, social cards) now open
  AssignmentPanel (archetype-based) which actually works
- Every Rocket clears taskId first so assignMode && !taskId condition passes
- Conversation button priority: taskId always shows conversation, not assign panel
- Added TelemetryStrip: minimized right sidebar with status dots when non-telemetry
  panel (conversation/assignment) is active
- Live feed has minimize button → restores last taskId or assignMode
- DAG nodes: Signal icon → restores telemetry feed
- Social button on DAG nodes: single router.push to avoid race (setView + setTaskId)
- Fixed social card message button: opens right panel with drawer:closed (no popup)

Co-Authored-By: Oz <oz-agent@warp.dev>
2026-03-01 18:17:58 -08:00
zenchantlive
654f73f83d feat(swarm): modify unified-shell to render swarm layout 2026-02-20 17:25:05 -08:00
zenchantlive
1925d625b6 feat(swarm): add GET /api/swarm/archetypes route with passing test 2026-02-20 17:10:26 -08:00
zenchantlive
4ee550c333 feat(telemetry): complete bb-buff.1.3 - Backend Liveness Refactor
STORY:
The session backend needed to aggregate agent health from a live
telemetry stream rather than static bead metadata. This refactor
makes liveness signals real-time and accurate.

COLLABORATION:
We extended the ActivityEvent model with a native 'heartbeat' kind,
updated extendActivityLease() to emit through the activity bus, and
refactored getAgentLivenessMap() to prioritize heartbeat activity
history over stale bead metadata.

DELIVERABLES:
- ActivityEvent extended with 'heartbeat' kind
- extendActivityLease() emits heartbeats through activity bus
- getAgentLivenessMap() prefers telemetry over static metadata
- Registry APIs support projectRoot injection for testing
- Tests verify preference logic via TDD

VERIFICATION:
- 93/93 tests PASSING
- Heartbeat override verified in isolated temp projects

CLOSES: bb-buff.1.3
BLOCKS: bb-buff.3.2, bb-buff.3.3, bb-buff.2.1
2026-02-15 21:14:05 -08:00
zenchantlive
bfe4f853f0 feat(observability): chronological timeline and agent productivity APIs
We added the third major surface to the BeadBoard workspace: the Chronological Timeline. This provides the 'Audit' layer of our operational hierarchy.

Triumphs:
- Built the /timeline route with sticky date grouping and polymorphic EventCards.
- Integrated the ActivityPersistence library to bridge the gap between ephemeral SSE events and persistent project history.
- Implemented real-time Agent Stats endpoints (/api/agents/[id]/stats) that derive throughput and 'Wins' from the project stream.

Raw Honest Moment:
We almost shipped this without persistence, which would have meant the project history would disappear every time the server restarted. Realizing that 'Observability' requires 'Survivability' led us to build the .beadboard/activity.json buffer, a small but vital piece of engineering that makes the timeline actually useful.
2026-02-14 00:21:02 -08:00
zenchantlive
b4cb09a6cc Merge main into master and unify realtime + project-context test matrix 2026-02-11 21:06:38 -08:00
zenchantlive
3f2ae384f5 Add realtime watcher+SSE transport with tests and lock-retry read path 2026-02-11 21:05:27 -08:00
zenchantlive
c836be46cf feat: add Windows project registry API and persistence 2026-02-11 20:35:36 -08:00
zenchantlive
2c80265258 Add bd exec bridge and mutation API routes with tests 2026-02-11 19:46:02 -08:00