|
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
OpenAI-compatible chat completions endpoint so existing OpenAI-API
clients (fire-planner's examples/llm_extract.py and others) can target
this service without rewriting their client.
Behaviour:
- POST /v1/chat/completions accepts the OpenAI chat-completions request
shape (model, messages, max_tokens?, temperature?, stream?).
- Reuses the existing Bearer auth from /execute.
- Synthesises a single prompt body from system+user messages
("System instructions:\n... --- Request:\n...") so the agent treats
them as the user's request rather than seeing raw JSON.
- Internally shares the execution path with /execute by extracting
_invoke_claude_subprocess(). Holds execution_lock for the duration;
returns 503 (not 409) when busy, since OpenAI callers have no
job-id model to retry against.
- Returns the OpenAI chat-completions envelope with the final
assistant text extracted from `claude -p --output-format json`
(falls back to raw stdout if parsing fails).
- stream=true -> 400 {"error": "streaming not supported"}.
- Underlying failure (non-zero exit, timeout, exception) -> 503
{"error": "execution failed", "detail": "<one line>"}.
Model -> agent mapping is hardcoded to `recruiter-triage` for all
models for v1 (broadest tool surface among current agents). Budget
is hardcoded to $2.00/call; timeout 900s. Revisit when a true
general-purpose agent lands.
Tests: 9 new tests covering happy path, streaming rejection, missing
auth, wrong token, job failure, empty messages, JSON-parse fallback,
prompt synthesis, and busy-503. All 20 tests (11 existing + 9 new)
pass; ruff clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| __init__.py | ||
| main.py | ||