PR #590 • reviewer artifact

OpenAI compatibility fix, shown as behavior, not prose.

This PR has no product UI. The reviewer surface is wire behavior: how long tool-call IDs are serialized and how prefixed OpenAI-compatible model IDs are classified. The visuals below show the exact before/after deltas that matter for issue #532.

Primary bug
Legacy or replayed tool IDs could alias after a 40-char prefix cut.
Secondary bug
Prefixed models like openai/o3-mini bypassed capability gates.
Fix shape
Hash long IDs at the serializer boundary and normalize model family once.
Scope
Covers main loop, resumed history, memory flush, and subagent replay through one path.

Before: prefix cut preserved length, not uniqueness.

The original defensive fallback capped IDs by slicing the first 40 characters. That stops the HTTP 400, but it still lets two distinct long IDs collapse into the same wire value if they only diverge after character 40.

Why this was still a blocker

Fresh live tool calls were already hashed upstream, but resumed history and replay paths still depended on the serializer. That meant the bug could survive in exactly the sessions reviewers care about: legacy, resumed, or replayed state.

Why the serializer fix is the right layer

Every OpenAI-compatible request path uses the same request builder. Fixing the boundary closes the main loop and the side loops together instead of patching each caller independently.

Before

Shared-prefix IDs collide after truncation

Collision risk remains on replay

Two distinct IDs from a replayed or deduped session can become the same outbound value. The request is shorter, but the assistant tool calls and tool results are no longer uniquely identifiable.

{
  "tool_calls": [
    { "id": "call_0123456789abcdef0123456789abcdef012" },
    { "id": "call_0123456789abcdef0123456789abcdef012" }
  ],
  "messages": [
    { "role": "tool", "tool_call_id": "call_0123456789abcdef0123456789abcdef012" },
    { "role": "tool", "tool_call_id": "call_0123456789abcdef0123456789abcdef012" }
  ]
}
  • Distinct legacy IDs differ only after character 40.
  • Prefix slicing makes both values identical on the wire.
  • Parallel tool results can no longer be correlated safely.
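The aliasing can be reproduced in a few lines. This is a minimal sketch, not the repository's code: slicePrefix is a hypothetical stand-in for the old fallback, and the two sample IDs are invented to share their first 40 characters.

```go
package main

import "fmt"

// maxToolCallIDLen is the 40-character limit described above.
const maxToolCallIDLen = 40

// slicePrefix is a hypothetical stand-in for the old fallback:
// keep the first 40 bytes and drop the rest.
func slicePrefix(id string) string {
	if len(id) <= maxToolCallIDLen {
		return id
	}
	return id[:maxToolCallIDLen]
}

func main() {
	// Two distinct legacy IDs (invented for illustration) that only
	// diverge after the 40th character.
	a := "call_0123456789abcdef0123456789abcdef012_first"
	b := "call_0123456789abcdef0123456789abcdef012_second"

	fmt.Println(a == b)                           // false: the originals differ
	fmt.Println(slicePrefix(a) == slicePrefix(b)) // true: the wire values alias
}
```

Both sliced values come out as call_0123456789abcdef0123456789abcdef012, which is the duplicate shown in the JSON above.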
After

Long IDs shorten to stable hashed values

40-char cap without aliasing

IDs longer than 40 characters are now re-encoded as deterministic call_-prefixed hashes. The same original ID always maps to the same shortened value, and different long IDs stay distinct unless the hash itself collides.

{
  "tool_calls": [
    { "id": "call_6625d2d3e5ec2c6c86f70fd9a7064cc77a1" },
    { "id": "call_60b8d9101dbdde451723a0888a84317d14f" }
  ],
  "messages": [
    { "role": "tool", "tool_call_id": "call_6625d2d3e5ec2c6c86f70fd9a7064cc77a1" },
    { "role": "tool", "tool_call_id": "call_60b8d9101dbdde451723a0888a84317d14f" }
  ]
}
  • Still meets the 40-character API limit.
  • Preserves assistant/tool-result correlation because the same helper is used for both fields.
  • Protects replay and legacy sessions, not just newly generated tool calls.
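One way such a helper could look is sketched below. The name shortenToolCallID, the choice of SHA-256, and the hex encoding are all assumptions made for illustration; the PR's actual helper and hash may differ, so this sketch is not expected to reproduce the sample IDs shown above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	maxToolCallIDLen = 40      // API limit described above
	callPrefix       = "call_" // prefix kept on shortened IDs
)

// shortenToolCallID is a hypothetical helper. IDs within the limit pass
// through unchanged; longer IDs become "call_" plus the first 35 hex
// characters of their SHA-256 digest (an assumed hash choice).
func shortenToolCallID(id string) string {
	if len(id) <= maxToolCallIDLen {
		return id
	}
	sum := sha256.Sum256([]byte(id))
	digest := hex.EncodeToString(sum[:]) // 64 hex characters
	return callPrefix + digest[:maxToolCallIDLen-len(callPrefix)]
}

func main() {
	// Invented IDs that share their first 40 characters.
	a := "call_0123456789abcdef0123456789abcdef012_first"
	b := "call_0123456789abcdef0123456789abcdef012_second"

	sa, sb := shortenToolCallID(a), shortenToolCallID(b)
	fmt.Println(len(sa), len(sb))           // both fit the 40-character cap
	fmt.Println(sa != sb)                   // distinct inputs keep distinct wire IDs
	fmt.Println(sa == shortenToolCallID(a)) // same input, same output
}
```

Calling the same function for both the assistant tool_calls id and the tool message tool_call_id is what keeps the two fields correlated.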

Before: prefixed model IDs skipped the intended reasoning-model rules.

The request builder already supports provider-prefixed IDs such as openai/o3-mini. The bug was that capability checks ran on the full string, so prefixed reasoning models could keep temperature and send max_tokens instead of max_completion_tokens.

Before

Prefixed reasoning models look like unknown families

Wrong request shape for openai/o3-mini

Because the gate looked at the full model string, the builder missed both the reasoning-model token routing and the temperature skip.

{
  "model": "openai/o3-mini",
  "max_tokens": 123,
  "temperature": 0.7
}
  • Contradicted the “model-level, not provider-specific” finding.
  • Left OpenRouter-style OpenAI IDs outside the intended guardrail.
After

Capability checks use normalized model family

Same rules for bare and prefixed IDs

The builder now strips transport prefixes once, then applies the same capability logic to both bare and prefixed model IDs.

{
  "model": "openai/o3-mini",
  "max_completion_tokens": 123
}
  • openai/o3-mini now follows the same route as o3-mini.
  • openai/gpt-5.4 still keeps temperature, matching the flagship-model rule.
  • Azure and non-Azure paths are validated as model-based, not URL-based.
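The normalize-once shape can be sketched as follows. Everything here is hypothetical: modelFamily, isReasoningFamily, and buildRequestBody are illustrative names, the o1/o3/o4 prefix rule is an assumed stand-in for the real capability table, and the max_tokens branch for non-reasoning models is a simplification, since the text above only states that openai/gpt-5.4 keeps temperature.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// modelFamily strips a provider or transport prefix such as "openai/"
// so capability checks see the bare model name.
func modelFamily(model string) string {
	if i := strings.LastIndex(model, "/"); i >= 0 {
		return model[i+1:]
	}
	return model
}

// isReasoningFamily is an assumed rule for illustration only: treat
// o1, o3, and o4 models (and their dash-suffixed variants) as reasoning models.
func isReasoningFamily(family string) bool {
	for _, p := range []string{"o1", "o3", "o4"} {
		if family == p || strings.HasPrefix(family, p+"-") {
			return true
		}
	}
	return false
}

// buildRequestBody normalizes the model once, then picks the token
// field and decides whether to send temperature.
func buildRequestBody(model string, maxTokens int, temperature float64) map[string]any {
	body := map[string]any{"model": model} // the original ID still goes on the wire
	if isReasoningFamily(modelFamily(model)) {
		body["max_completion_tokens"] = maxTokens
		return body // temperature is skipped for reasoning models
	}
	body["max_tokens"] = maxTokens // simplified: non-reasoning token field is assumed
	body["temperature"] = temperature
	return body
}

func main() {
	for _, m := range []string{"o3-mini", "openai/o3-mini", "openai/gpt-5.4"} {
		out, _ := json.Marshal(buildRequestBody(m, 123, 0.7))
		fmt.Println(string(out))
	}
}
```

Because the prefix is stripped in one place, the bare and prefixed o3-mini requests take the same branch, while the model field itself is sent unchanged.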

Tests and reviewer-facing outcomes

The changes above were verified with focused provider and agent tests, plus a full-suite spot check to expose unrelated red tests explicitly rather than hide them under a green subset.

Passed
  • go test ./internal/providers ./internal/agent
  • go test ./internal/providers -run 'TestTruncateToolCallID|TestBuildRequestBody' -count=1
  • go test ./internal/agent -run '^TestUniquifyToolCallIDs$' -count=1
Known unrelated failure

go test ./... still fails in pkg/browser at TestResolveRemoteCDP_DefaultPort with the error "/json/version returned HTTP 404". The failure predates the provider changes and is unrelated to the OpenAI compatibility patch.