goclaw/internal/http
Duc Nguyen 23d0b5eb0b fix(providers): auto-clamp max_tokens on model rejection (#267)
* fix(providers): auto-clamp max_tokens on model rejection + fix verify for reasoning models

When OpenAI-compat models reject max_tokens as too large (e.g. gpt-3.5-turbo
supports 4096 but we send 8192), parse the model's stated limit from the 400
error, clamp the value, and retry once. This fixes agent creation for models
with lower output token limits without hardcoding model names.
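
The clamp-and-retry flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `apiError`, `parseTokenLimit`, `sendWithClamp`, and the error wording matched by `limitPattern` are all assumptions made for the example, and real providers may phrase the rejection differently.

```go
package main

import (
	"errors"
	"regexp"
	"strconv"
)

// limitPattern matches one ASSUMED error phrasing, for example:
// "max_tokens is too large: 8192. This model supports at most 4096 completion tokens".
// Treat it as a placeholder; actual provider messages may differ.
var limitPattern = regexp.MustCompile(`supports at most (\d+)`)

// parseTokenLimit extracts the model's stated output limit from an error message.
func parseTokenLimit(msg string) (int, bool) {
	m := limitPattern.FindStringSubmatch(msg)
	if m == nil {
		return 0, false
	}
	n, err := strconv.Atoi(m[1])
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

// apiError is a hypothetical stand-in for a provider's HTTP error response.
type apiError struct {
	Status  int
	Message string
}

func (e *apiError) Error() string { return e.Message }

// sendWithClamp calls send once. If the provider answers 400 and states a
// limit lower than the requested max_tokens, it clamps the value and retries
// exactly once. Any other failure is returned unchanged.
func sendWithClamp(body map[string]any, send func(map[string]any) error) error {
	err := send(body)
	var apiErr *apiError
	if err == nil || !errors.As(err, &apiErr) || apiErr.Status != 400 {
		return err
	}
	limit, ok := parseTokenLimit(apiErr.Message)
	if !ok {
		return err
	}
	cur, isInt := body["max_tokens"].(int)
	if !isInt || cur <= limit {
		return err // nothing to clamp, so surface the original error
	}
	body["max_tokens"] = limit
	return send(body) // single retry; a second rejection is returned as-is
}
```

Reading the limit from the error keeps the logic independent of model names, which is the point made above: a model that accepts 4096 tokens and one that accepts 16384 are handled by the same path.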

Also increase the provider verify endpoint's max_tokens from 1 to 50 so
reasoning models (gpt-5, o-series) have enough headroom for internal
reasoning during the check call.

Closes #248, closes #245

* refactor(providers): extract chat retry closure + fix clamp log key

- Extract the duplicated retry closure into chatRequestFn() (DRY)
- Fix the clamp log reading the wrong key: body["max_tokens"] is nil for
  reasoning models that use max_completion_tokens, so the log now goes
  through the clampedLimit() helper
- Remove the unnecessary _ = resp in the provider verify endpoint
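
For illustration, a helper like clampedLimit() might look like the sketch below. Only the name comes from this commit; the signature, the `map[string]any` body type, and the lookup order are assumptions.

```go
package main

// clampedLimit returns the token limit actually present in the request body.
// Reasoning models are assumed to carry it under "max_completion_tokens";
// other models use "max_tokens". The result is nil if neither key is set.
func clampedLimit(body map[string]any) any {
	if v, ok := body["max_completion_tokens"]; ok {
		return v
	}
	return body["max_tokens"]
}
```

Logging `clampedLimit(body)` instead of `body["max_tokens"]` avoids a nil value in the log line when the request uses max_completion_tokens.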

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-19 08:41:20 +07:00