mirror of
https://github.com/tiennm99/litellm.git
synced 2026-06-17 12:48:57 +00:00
f76938af5e
* Update CLAUDE.md with qwen3 tool_calls bug fix instructions (#18922)

* fix(ollama): set finish_reason to "tool_calls" when tool_calls present

  When qwen3 models return tool_calls through Ollama, the finish_reason was
  incorrectly left as "stop" instead of being set to "tool_calls". This caused
  clients to miss the tool_calls in the response.

  Added _get_finish_reason helper method following OpenAI provider's pattern,
  and fixed both streaming and non-streaming response paths.

  Fixes: https://github.com/BerriAI/litellm/issues/18922

* fix(ollama): pass tools directly without model capability check

  The previous code tried to check model capability via get_model_info(),
  which made network calls to localhost:11434. When Ollama is remote, this
  fails and falls back to JSON format, breaking tool calling.

  Ollama 0.4+ supports native tool calling - let Ollama handle model
  capability detection instead of LiteLLM.

  Fixes #18922

* fix(ollama): transform tool_calls response to OpenAI format

  Ollama returns tool_calls with arguments as dict, but OpenAI format requires
  arguments to be a JSON string. Also ensures 'type': 'function' field is
  present.

  Completes the fix for #18922

* fix(ollama): set finish_reason to "tool_calls" when tool_calls present

  Fixes #18922

  Two issues addressed:

  1. Remove broken model capability check
     - get_model_info() fails when Ollama runs on remote server
     - Broken fallback triggered JSON prompt injection
     - Now passes tools directly - Ollama 0.4+ handles detection

  2. Set finish_reason correctly
     - Was hardcoded to "stop" even with tool_calls present
     - Clients use this to know how to process the response
     - Now returns "tool_calls" when tool_calls are in response

  Both streaming and non-streaming responses are fixed.

  Tests:
  - All 14 existing Ollama tests pass
  - Added 3 focused tests for the fixes
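The two response-side fixes described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the function names `normalize_tool_calls` and `pick_finish_reason` and the input shape are hypothetical, chosen only to show the idea of serializing dict arguments to a JSON string, ensuring `"type": "function"`, and reporting `"tool_calls"` as the finish reason when tool calls are present.

```python
import json
from typing import Any, Optional


def normalize_tool_calls(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: convert Ollama-style tool calls to an OpenAI-style shape.

    Assumes each call looks like {"function": {"name": ..., "arguments": {...}}}.
    """
    normalized = []
    for call in tool_calls:
        function = call.get("function", {})
        arguments = function.get("arguments", {})
        # OpenAI format expects arguments as a JSON string, not a dict.
        if not isinstance(arguments, str):
            arguments = json.dumps(arguments)
        normalized.append(
            {
                **call,
                "type": "function",  # make sure the type field is present
                "function": {**function, "arguments": arguments},
            }
        )
    return normalized


def pick_finish_reason(
    tool_calls: Optional[list[dict[str, Any]]],
    done_reason: Optional[str] = None,
) -> str:
    """Hypothetical helper: report "tool_calls" whenever tool calls are present."""
    if tool_calls:
        return "tool_calls"
    # Otherwise fall back to the reported reason, defaulting to "stop".
    return done_reason or "stop"


# Usage with a made-up Ollama-style tool call.
ollama_calls = [{"function": {"name": "get_weather", "arguments": {"city": "Hanoi"}}}]
openai_calls = normalize_tool_calls(ollama_calls)
finish_reason = pick_finish_reason(openai_calls)
```

Applying the same two helpers to both the streaming and non-streaming paths would keep the behavior consistent, which is the property the commit message says was fixed.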
4 lines
52 B
TOML
version = 1
revision = 3
requires-python = ">=3.13"