Cost tracking was failing for Responses API when using custom deployment names
with base_model configuration. The issue occurred because:
- Chat Completions API stores model_info in 'metadata'
- Responses API stores model_info in 'litellm_metadata'
- Cost calculator only checked 'metadata', missing Responses API costs
Changes:
- Updated _get_base_model_from_metadata() to check both metadata locations
- Added comprehensive unit tests covering all scenarios
- Maintains backward compatibility (metadata takes precedence)
Fixes#16772
- Test multiple tool calls without explicit indices receive sequential indices
- Verify Delta class assigns indices 0, 1, 2... instead of defaulting all to 0
- Add comprehensive assertions for tool call details preservation
- Cover provider-agnostic streaming response scenarios
* Check content and order of trimmed messages
* Assert tool calls are preserved if below max_tokens
* Unreverse order of tool calls
* Return tool calls alongside other messages
* Write test for trimming untokenizable field
* Return original messages in case of exception
* fix: add flag for disabling use_aiohttp_transport
* feat: add _create_async_transport
* feat: fixes for transport
* add httpx-aiohttp
* feat: fixes for transport
* refactor: fixes for transport
* build: fix deps
* fixes: test fixes
* fix: ensure aiohttp does not auto set content type
* test: test fixes
* feat: add LiteLLMAiohttpTransport
* fix: fixes for responses API handling
* test: fixes for responses API handling
* test: fixes for responses API handling
* feat: fixes for transport
* fix: base embedding handler
* test: test_async_http_handler_force_ipv4
* test: fix failing deepeval test
* fix: add YARL for bedrock urls
* fix: issues with transport
* fix: comment out linting issues
* test fix
* test: XAI is unstable
* test: fixes for using respx
* test: XAI fixes
* test: XAI fixes
* test: infinity testing fixes
* docs(config_settings.md): document param
* test: test_openai_image_edit_litellm_sdk
* test: remove deprecated test
* bump respx==0.22.0
* test: test_xai_message_name_filtering
* test: fix anthropic test after bumping httpx
* use n 4 for mapped tests (#11109)
* fix: use 1 session per event loop
* test: test_client_session_helper
* fix: linting error
* fix: resolving GET requests on httpx 0.28.1
* test fixes proxy unit tests
* fix: add ssl verify settings
* fix: proxy unit tests
* fix: refactor
* tests: basic unit tests for aiohttp transports
* tests: fixes xai
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* added tests
messages_with_counts: Made tolerance explicit for each test. But they match the new implementation(which beats the old)
* new token counter impl
* compare old and new implementation in test
* delete old token counter
* moved tests to /tests/litellm/litellm_core_utils
* use existing types
* docstrings
* warn about using default params on unknown model.
* created type for the token_counter_function
* check key == "content"
* throw error on invalid detail-type, ignore type-warning.
* fix imports
* feat(utils.py): support global flag for 'check_provider_endpoints'
enables setting this for `/models` on proxy
* feat(utils.py): add caching to 'get_valid_models'
Prevents checking endpoint repeatedly
* fix(utils.py): ensure mutations don't impact cached results
* test(test_utils.py): add unit test to confirm cache invalidation logic
* feat(utils.py): get_valid_models - support passing litellm params dynamically
Allows for checking endpoints based on received credentials
* test: update test
* feat(model_checks.py): pass router credentials to get_valid_models - ensures it checks correct credentials
* refactor(utils.py): refactor for simpler functions
* fix: fix linting errors
* fix(utils.py): fix test
* fix(utils.py): set valid providers to custom_llm_provider, if given
* test: update test
* fix: fix ruff check error
* fix: initial commit for adding provider model discovery to gemini
* feat(gemini/): add model discovery for gemini/ route
* docs(set_keys.md): update docs to show you can check available gemini models as well
* feat(anthropic/): add model discovery for anthropic api key
* feat(xai/): add model discovery for XAI
enables checking what models an xai key can call
* ci: bump ci config yml
* fix(topaz/common_utils.py): fix linting error
* fix: fix linting error for python38
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
* feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_Content transformation
Closes https://github.com/BerriAI/litellm/issues/8777
* fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7
Resolves https://github.com/BerriAI/litellm/issues/8777
* feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock
* fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py
simpler logic
* feat(bedrock/): fix streaming to return blocks in consistent format
* fix: fix linting error
* test: fix test
* feat(factory.py): fix bedrock thinking block translation on tool calling
allows passing the thinking blocks back to bedrock for tool calling
* fix(types/utils.py): don't exclude provider_specific_fields on model dump
ensures consistent responses
* fix: fix linting errors
* fix(convert_dict_to_response.py): pass reasoning_content on root
* fix: test
* fix(streaming_handler.py): add helper util for setting model id
* fix(streaming_handler.py): fix setting model id on model response stream chunk
* fix(streaming_handler.py): fix linting error
* fix(streaming_handler.py): fix linting error
* fix(types/utils.py): add provider_specific_fields to model stream response
* fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response
* fix(streaming_handler.py): fix check
* fix: fix test
* fix(types/utils.py): ensure messages content is always openai compatible
* fix(types/utils.py): fix delta object to always be openai compatible
only introduce new params if variable exists
* test: fix bedrock nova tests
* test: skip flaky test
* test: skip flaky test in ci/cd
* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header
allow tag based routing + spend tracking via request headers
* docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking
* docs(tag_routing.md): add to docs
* fix(utils.py): only pass str values for openai metadata param
* fix(utils.py): drop non-str values for metadata param to openai
preview-feature, otel span was being sent in
* use class ResetBudgetJob
* refactor reset budget job
* update reset_budget job
* refactor reset budget job
* fix LiteLLM_UserTable
* refactor reset budget job
* add telemetry for reset budget job
* dd - log service success/failure on DD
* add detailed reset budget reset info on DD
* initialize_scheduled_background_jobs
* refactor reset budget job
* trigger service failure hook when fails to reset a budget for team, key, user
* fix resetBudgetJob
* unit testing for ResetBudgetJob
* test_duration_in_seconds_basic
* testing for triggering service logging
* fix logs on test teams fail
* remove unused imports
* fix import duration in s
* duration_in_seconds
* fix(litellm_logging.py): support saving applied guardrails in logging object
allows list of applied guardrails to be logged for proxy admin's knowledge
* feat(spend_tracking_utils.py): log applied guardrails to spend logs
makes it easy for admin to know what guardrails were applied on a request
* ci(config.yml): uninstall posthog from ci/cd
* test: fix tests
* test: update test
* fix(utils.py): handle key error in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* refactor _add_callbacks_from_db_config
* fix check for _custom_logger_exists_in_litellm_callbacks
* move loc of test utils
* run ci/cd again
* test_add_custom_logger_callback_to_specific_event_with_duplicates_callbacks
* fix _custom_logger_class_exists_in_success_callbacks
* unit testing for test_add_callbacks_from_db_config
* test_custom_logger_exists_in_callbacks_individual_functions
* fix config.yml
* fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison