Replace copy.deepcopy with model_dump + model_validate in streaming
iterator logging to handle Pydantic ValidatorIterator objects that
cannot be pickled when tool_choice uses allowed_tools mode.
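
A minimal sketch of why the dump-and-rebuild approach helps, using only the standard library. `ToolChoice`, `dump`, `validate`, and `safe_copy` are hypothetical stand-ins for a Pydantic model and its `model_dump` / `model_validate` methods; a generator plays the role of the unpicklable `ValidatorIterator`. This is not the code from the commit.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass
class ToolChoice:
    """Hypothetical stand-in for a Pydantic model with a lazy iterator field."""

    mode: str
    tools: Iterable[str]  # may be a one-shot iterator that cannot be pickled

    def dump(self) -> Dict[str, Any]:
        # Stand-in for model_dump(): materialize lazy fields into plain data.
        # Note: this consumes a one-shot iterator held by the source object.
        return {"mode": self.mode, "tools": list(self.tools)}

    @classmethod
    def validate(cls, data: Dict[str, Any]) -> "ToolChoice":
        # Stand-in for model_validate(): rebuild an independent object.
        return cls(mode=data["mode"], tools=list(data["tools"]))


def safe_copy(obj: ToolChoice) -> ToolChoice:
    """Copy via plain data instead of copy.deepcopy."""
    return ToolChoice.validate(obj.dump())


original = ToolChoice(
    mode="allowed_tools",
    tools=(name for name in ["search", "calc"]),
)

try:
    copy.deepcopy(original)  # deepcopy cannot handle generator objects
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

copied = safe_copy(original)
```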
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Add openai metadata field in the request
* Add docs related to openai metadata
* Add utils
* test_completion_openai_metadata[True]
* Added support for thought signature for gemini 3 in responses api (#16872)
* Added support for thought signature for gemini 3
* Update docs with all supported endpoints and cost tracking
* Added config based routing support for batches and files
* fix lint errors
* Litellm anthropic image url support (#16868)
* Add image as url support to anthropic
* fix mypy errors
* fix tests
* Fix: Populate spend_logs_metadata in batch and files endpoints (#16921)
* Add spend-logs-metadata to the metadata
* Add tests for spend logs metadata in batches
* use better names
* Remove support for penalty param for gemini 3 (#16907)
* Remove support for penalty param
* remove hallucinated model names
* fix mypy/test errors
* fix tests
* fix too many lines error
* fix too many lines error
* Add config for cicd test case
* Fix final tests
* fix batch tests
* fix batch tests
## Problem
The `extra_body` parameter in `litellm.responses()` and `litellm.aresponses()`
was being accepted but never passed to the HTTP request sent to the LLM provider.
This prevented users from sending custom/experimental parameters to provider APIs.
## Changes
- Added `data.update(extra_body)` in `async_response_api_handler` (line 2138)
- Added `data.update(extra_body)` in `response_api_handler` (line 2012)
- Added tests to `test_openai_responses_api.py` for extra_body functionality
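
The merge itself can be sketched roughly as below. `build_request_data` is a hypothetical helper name used only for illustration; the handlers in the commit call `data.update(extra_body)` inline, and the guard for a missing `extra_body` is an assumption.

```python
from typing import Any, Dict, Optional


def build_request_data(
    request_params: Dict[str, Any],
    extra_body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: merge extra_body into the outgoing JSON payload."""
    data = dict(request_params)  # copy so the caller's dict is not mutated
    if extra_body:
        # Same call the commit adds; extra_body keys override existing keys.
        data.update(extra_body)
    return data


payload = build_request_data(
    {"model": "gpt-4o", "input": "hello"},
    extra_body={"custom_param": "value"},
)
```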
## Testing
- Tests verify extra_body params are passed in both sync and async modes
- Existing Responses API tests continue to pass
- Manually verified with OpenAI API that custom params are sent correctly
## Impact
Users can now pass custom/experimental parameters via extra_body:
```python
# Inside an async function (aresponses is a coroutine)
response = await litellm.aresponses(
    model="gpt-4o",
    input="hello",
    extra_body={"custom_param": "value"},  # Now works!
)
```
This aligns with the OpenAI SDK pattern and matches behavior in other
LiteLLM endpoints (completion, embedding, etc.) that already support extra_body.
This commit fixes two bugs in Responses API streaming tests:
1. **Usage field naming bug**: Tests were using `input_tokens` and
`output_tokens` but the Usage object uses `prompt_tokens` and
`completion_tokens`.
2. **Missing cost in streaming usage**: When `include_cost_in_streaming_usage`
was enabled, the cost was calculated and added to ResponseAPIUsage, but was
lost during the transformation to the Usage object.
Changes:
- Updated test assertions to use correct field names (prompt_tokens, completion_tokens)
- Added cost preservation logic in FakeStreamerResponsesAPIIterator
- Modified _transform_response_api_usage_to_chat_usage() to preserve cost attribute
All streaming tests now pass successfully.
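
A rough sketch of the cost-preserving transformation, under stated assumptions: `ResponseAPIUsageSketch`, `UsageSketch`, and `transform_usage` are hypothetical stand-ins for the real usage types and for `_transform_response_api_usage_to_chat_usage()`, and the field layout is inferred from the description above rather than taken from the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResponseAPIUsageSketch:
    """Hypothetical stand-in for the Responses API usage object."""

    input_tokens: int
    output_tokens: int
    total_tokens: int
    cost: Optional[float] = None


@dataclass
class UsageSketch:
    """Hypothetical stand-in for the chat-style Usage object."""

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost: Optional[float] = None


def transform_usage(usage: ResponseAPIUsageSketch) -> UsageSketch:
    # Map Responses API field names onto the chat-style names.
    chat_usage = UsageSketch(
        prompt_tokens=usage.input_tokens,
        completion_tokens=usage.output_tokens,
        total_tokens=usage.total_tokens,
    )
    # Carry the cost across so it is not lost in the transformation.
    if usage.cost is not None:
        chat_usage.cost = usage.cost
    return chat_usage


converted = transform_usage(
    ResponseAPIUsageSketch(
        input_tokens=10, output_tokens=5, total_tokens=15, cost=0.0002
    )
)
```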
* fix: use fastuuid helper across the codebase
First batch of changes, simple drop in replacement.
* second batch of changes
* fixed: script mistake on helper file
* fix: ensure /responses/cancel works for non admins
* test: cancel endpoint
* fix responses API cancel endpoint
* test fix
* TestGoogleAIStudioResponsesAPITest