Next Session: v5 Skill Validation + Full Critique

Objective

Do not implement new features before this review is complete. Use this session to pressure-test the finished beadboard-driver v5 skill end-to-end, critique it hard, and produce a prioritized fix list.

Scope

Validate and critique all of:

skills/beadboard-driver/SKILL.md
skills/beadboard-driver/project.template.md
skills/beadboard-driver/references/*.md
skills/beadboard-driver/scripts/*.mjs
skills/beadboard-driver/scripts/lib/driver-lib.mjs
skills/beadboard-driver/tests/*.contract.test.mjs
tests/skills/beadboard-driver/*.test.ts

Ground Rules

Treat this as an adversarial review, not a celebration.
Evidence before assertions: every finding must cite command output or file evidence.
Prefer identifying regressions, ambiguities, missing guarantees, and operator confusion risks.
For every critique finding, include a concrete fix proposal.

Session Steps

Step 1: Context Recovery

cd /mnt/c/Users/Zenchant/codex/beadboard
git log --oneline -12
bd show beadboard-maf
bd ready

Step 2: Run Full Gates (Baseline)

npm run typecheck
npm run lint
npm run test

Capture exact pass/fail state and any warnings.
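
One way to capture that state is sketched below. It assumes a bash-compatible shell; `run_gate`, `GATE_LOG_DIR`, and the `gate-logs` directory are hypothetical names chosen for illustration and are not part of the repo.

```shell
# Hypothetical helper: run one gate, keep its full output, record PASS/FAIL.
# GATE_LOG_DIR and run_gate are illustrative names, not repo conventions.
GATE_LOG_DIR="${GATE_LOG_DIR:-gate-logs}"

run_gate() {
  local name="$1"; shift
  local rc=0
  mkdir -p "$GATE_LOG_DIR"
  # Capture stdout and stderr together so warnings are preserved in the log.
  "$@" >"$GATE_LOG_DIR/$name.log" 2>&1 || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "$name PASS" >>"$GATE_LOG_DIR/summary.txt"
  else
    echo "$name FAIL (exit $rc)" >>"$GATE_LOG_DIR/summary.txt"
  fi
  # Always return 0 so a failing gate does not stop the remaining gates.
  return 0
}
```

For example, `run_gate typecheck npm run typecheck`, followed by `run_gate lint npm run lint` and `run_gate test npm run test`, would leave one log per gate plus a `summary.txt` that can be cited as evidence in the report.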

Step 3: Run Skill-Local Contract Suite Explicitly

node skills/beadboard-driver/tests/run-tests.mjs

Step 4: Manual Runbook Dry-Run Against SKILL.md

Walk through SKILL.md steps exactly as written and verify each command exists/is actionable.

Required checks:

Preflight commands run cleanly (or fail with useful remediation)
Mail delegate validation behaves as documented
Runbook commands use real flags (--assignee, slot hook flow, etc.)
No deprecated command surfaces remain
project.md lifecycle guidance is clear for first vs later agents
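
Before walking the runbook, it may help to confirm that the tools it calls resolve on PATH at all. The sketch below is a minimal check, assuming a bash-compatible shell; `check_tools` is a hypothetical helper name, and it only tests that a command exists, not that its flags are current.

```shell
# Hypothetical helper: report which CLI tools resolve on PATH.
# Returns non-zero if any tool is missing.
check_tools() {
  local tool
  local missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "MISSING: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

A call such as `check_tools bd node npm` prints one line per tool, which can be pasted into the findings as evidence. Flag-level accuracy (for example `--assignee`) still has to be verified by running each command as SKILL.md documents it.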

Step 5: Documentation Quality Critique

Critique every major doc on:

Cold-start clarity (can a new agent execute without guessing?)
Command accuracy (flags/surfaces real and current)
Consistency across docs (no contradictions)
Operational safety (state, mail, evidence, closeout)
Cognitive load (too verbose vs too vague)

Step 6: Test Coverage Critique

Identify missing coverage, especially:

Global install assumptions (bd, bb/beadboard)
Linux/WSL path discovery edge cases
Mail delegate misconfiguration and mismatch paths
bb-mail-shim lifecycle and invalid message ID behavior
project.template.md contract assumptions not exercised by tests

Step 7: Produce Findings Artifact

Create a single markdown report under:

docs/reviews/YYYY-MM-DD-beadboard-driver-v5-audit.md

Required report structure:

Executive verdict (ship-ready / conditionally-ready / not-ready)
Findings by severity (Critical, High, Medium, Low)
Evidence per finding (commands + file refs)
Proposed fixes per finding
Suggested bead breakdown for remediation
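
A skeleton for that report could be generated as sketched below. This assumes a bash-compatible shell; `new_audit_report` is a hypothetical helper, and the heading wording is one possible reading of the structure listed above. Evidence and proposed fixes are meant to be written under each individual finding.

```shell
# Hypothetical helper: write an empty report skeleton with the required sections.
# Arguments (both optional): target directory, date in YYYY-MM-DD form.
new_audit_report() {
  local dir="${1:-docs/reviews}"
  local day="${2:-$(date +%F)}"
  local out="$dir/${day}-beadboard-driver-v5-audit.md"
  # Refuse to overwrite an existing report.
  if [ -e "$out" ]; then
    echo "already exists: $out" >&2
    return 1
  fi
  mkdir -p "$dir"
  cat >"$out" <<'EOF'
# beadboard-driver v5 Audit

## Executive Verdict

## Findings by Severity

### Critical

### High

### Medium

### Low

## Suggested Bead Breakdown
EOF
  echo "$out"
}
```

Running `new_audit_report` with no arguments from the repo root creates the file under `docs/reviews/` using today's date and prints its path.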

Step 8: Create Remediation Beads

From findings, create actionable beads using:

beadboard-<new-epic>.x.x naming format
explicit Scope, Out of Scope, Success Criteria
correct dependency order

Step 9: Session Closeout

Update bead notes with evidence summary
If a reusable lesson emerged, create a canonical memory bead; otherwise note that no new memory was created
Update this file (NEXT_SESSION_PROMPT.md) with the next concrete action

Deliverable Definition of Done

This session is done only when all of the following are true:

Gates executed with captured output.
Full skill critique written to docs/reviews/....
Remediation bead set created with dependency graph.
Clear go/no-go verdict stated with evidence.
