claude-agent-service

Viktor Barzin 07dcfca333 All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details openai-compat: add /v1/chat/completions endpoint OpenAI-compatible chat completions endpoint so existing OpenAI-API clients (fire-planner's examples/llm_extract.py and others) can target this service without rewriting their client. Behaviour: - POST /v1/chat/completions accepts the OpenAI chat-completions request shape (model, messages, max_tokens?, temperature?, stream?). - Reuses the existing Bearer auth from /execute. - Synthesises a single prompt body from system+user messages ("System instructions:\n... --- Request:\n...") so the agent treats them as the user's request rather than seeing raw JSON. - Internally shares the execution path with /execute by extracting _invoke_claude_subprocess(). Holds execution_lock for the duration; returns 503 (not 409) when busy, since OpenAI callers have no job-id model to retry against. - Returns the OpenAI chat-completions envelope with the final assistant text extracted from `claude -p --output-format json` (falls back to raw stdout if parsing fails). - stream=true -> 400 {"error": "streaming not supported"}. - Underlying failure (non-zero exit, timeout, exception) -> 503 {"error": "execution failed", "detail": "<one line>"}. Model -> agent mapping is hardcoded to `recruiter-triage` for all models for v1 (broadest tool surface among current agents). Budget is hardcoded to $2.00/call; timeout 900s. Revisit when a true general-purpose agent lands. Tests: 9 new tests covering happy path, streaming rejection, missing auth, wrong token, job failure, empty messages, JSON-parse fallback, prompt synthesis, and busy-503. All 20 tests (11 existing + 9 new) pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-29 06:24:20 +00:00
..
__init__.py	Initial extraction from monorepo	2026-05-07 17:07:12 +00:00
main.py	openai-compat: add /v1/chat/completions endpoint	2026-05-29 06:24:20 +00:00

ci/woodpecker/push/woodpecker Pipeline was successful

Details

openai-compat: add /v1/chat/completions endpoint

OpenAI-compatible chat completions endpoint so existing OpenAI-API
clients (fire-planner's examples/llm_extract.py and others) can target
this service without rewriting their client.

Behaviour:
- POST /v1/chat/completions accepts the OpenAI chat-completions request
  shape (model, messages, max_tokens?, temperature?, stream?).
- Reuses the existing Bearer auth from /execute.
- Synthesises a single prompt body from system+user messages
  ("System instructions:\n... --- Request:\n...") so the agent treats
  them as the user's request rather than seeing raw JSON.
- Internally shares the execution path with /execute by extracting
  _invoke_claude_subprocess(). Holds execution_lock for the duration;
  returns 503 (not 409) when busy, since OpenAI callers have no
  job-id model to retry against.
- Returns the OpenAI chat-completions envelope with the final
  assistant text extracted from `claude -p --output-format json`
  (falls back to raw stdout if parsing fails).
- stream=true -> 400 {"error": "streaming not supported"}.
- Underlying failure (non-zero exit, timeout, exception) -> 503
  {"error": "execution failed", "detail": "<one line>"}.

Model -> agent mapping is hardcoded to `recruiter-triage` for all
models for v1 (broadest tool surface among current agents). Budget
is hardcoded to $2.00/call; timeout 900s. Revisit when a true
general-purpose agent lands.

Tests: 9 new tests covering happy path, streaming rejection, missing
auth, wrong token, job failure, empty messages, JSON-parse fallback,
prompt synthesis, and busy-503. All 20 tests (11 existing + 9 new)
pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-29 06:24:20 +00:00

__init__.py

Initial extraction from monorepo

2026-05-07 17:07:12 +00:00

main.py

openai-compat: add /v1/chat/completions endpoint

2026-05-29 06:24:20 +00:00