* Update CLAUDE.md with qwen3 tool_calls bug fix instructions (#18922)
* fix(ollama): set finish_reason to "tool_calls" when tool_calls present
When qwen3 models return tool_calls through Ollama, the finish_reason
was incorrectly left as "stop" instead of being set to "tool_calls".
This caused clients to miss the tool_calls in the response.
Added a _get_finish_reason helper method following the OpenAI provider's
pattern, and fixed both the streaming and non-streaming response paths.
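A minimal sketch of the finish_reason rule described above, not the actual helper from this change. The function name, signature, and the done_reason fallback are assumptions made for illustration:

```python
from typing import Any, Dict, Optional


def get_finish_reason(message: Dict[str, Any], done_reason: Optional[str] = None) -> str:
    """Hypothetical stand-in for the _get_finish_reason helper.

    `message` is assumed to be the "message" object of an Ollama chat
    response; `done_reason` is assumed to be the reason Ollama reported.
    """
    # A non-empty tool_calls list takes priority over whatever Ollama reported.
    if message.get("tool_calls"):
        return "tool_calls"
    # Otherwise keep the reported reason, defaulting to "stop".
    return done_reason or "stop"
```

The same rule would apply to each streamed chunk as well as to a complete non-streaming response.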
Fixes: https://github.com/BerriAI/litellm/issues/18922
* fix(ollama): pass tools directly without model capability check
The previous code tried to check model capability via get_model_info()
which made network calls to localhost:11434. When Ollama is remote,
this fails and falls back to JSON format, breaking tool calling.
Ollama 0.4+ supports native tool calling - let Ollama handle
model capability detection instead of LiteLLM.
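As a rough illustration of passing tools straight through, the sketch below builds a chat request body without any capability lookup. The function name and payload layout are assumptions for illustration, not the code in this change:

```python
from typing import Any, Dict, List, Optional


def build_chat_payload(
    model: str,
    messages: List[Dict[str, Any]],
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical request builder: forwards tools unchanged."""
    payload: Dict[str, Any] = {"model": model, "messages": messages, "stream": False}
    # No get_model_info() call and no JSON-format fallback:
    # the tools list is sent as-is and Ollama decides whether the model supports it.
    if tools:
        payload["tools"] = tools
    return payload
```

Because nothing here contacts the server, the result is the same whether Ollama runs locally or on a remote host.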
Fixes #18922
* fix(ollama): transform tool_calls response to OpenAI format
Ollama returns tool_calls with arguments as a dict, but the OpenAI format
requires arguments to be a JSON string. Also ensures the 'type': 'function'
field is present.
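The conversion could look roughly like the sketch below. The function name is hypothetical, and generating an id when Ollama does not supply one is an assumption rather than something this change states:

```python
import json
import uuid
from typing import Any, Dict, List


def to_openai_tool_calls(ollama_tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical converter from Ollama-style to OpenAI-style tool calls."""
    converted = []
    for call in ollama_tool_calls:
        function = call.get("function", {})
        arguments = function.get("arguments", {})
        # OpenAI clients expect arguments as a JSON string, not a dict.
        if not isinstance(arguments, str):
            arguments = json.dumps(arguments)
        converted.append(
            {
                # Assumption: synthesize an id if the response has none.
                "id": call.get("id") or f"call_{uuid.uuid4().hex}",
                "type": "function",
                "function": {"name": function.get("name", ""), "arguments": arguments},
            }
        )
    return converted
```

Arguments that already arrive as a string are passed through untouched, so they are not encoded twice.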
Completes the fix for #18922
* fix(ollama): set finish_reason to "tool_calls" when tool_calls present
Fixes #18922
Two issues addressed:
1. Remove broken model capability check
   - get_model_info() fails when Ollama runs on a remote server
   - Broken fallback triggered JSON prompt injection
   - Now passes tools directly; Ollama 0.4+ handles detection
2. Set finish_reason correctly
   - Was hardcoded to "stop" even with tool_calls present
   - Clients use this to know how to process the response
   - Now returns "tool_calls" when tool_calls are in the response
Both streaming and non-streaming responses are fixed.
Tests:
- All 14 existing Ollama tests pass
- Added 3 focused tests for the fixes