Commit Graph

96 Commits

Author SHA1 Message Date
Ishaan Jaff c3e673b627 [Feat] Add github co-pilot as a new LLM API provider (#12325)
* Litellm dev 03 05 2025 contributor prs (#9079)

* feat: add support for copilot provider

* test: add tests for github copilot

* chore: clean up github copilot authenticator

* test: add test for github copilot authenticator

* test: add test for github copilot for sonnet 3.7 thought model

* Fix #7629 - Add tzdata package to Dockerfile (#8915)

* Add tzdata package to Dockerfile

* Move tzdata to python requirement.txt

* feat: add support for copilot provider (#8577)

* feat: add support for copilot provider

* test: add tests for github copilot

* chore: clean up github copilot authenticator

* test: add test for github copilot authenticator

* test: add test for github copilot for sonnet 3.7 thought model

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* feat: add model information for copilot models

* fix: fix linting errors

* test: remove integration test for github_copilot + fix missing mock

* fix: use print to make sure the logger message is shown

* test: remove debug print

* fix lint (#11112)

* Add init files to make test directories Python packages and update import paths in test_token_counter.py (#11119)

* Update litellm/model_prices_and_context_window_backup.json

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

---------

Co-authored-by: Son H. Nguyen <nhs.000.dev@gmail.com>
Co-authored-by: subnet.dev <50828879+subnet-dev@users.noreply.github.com>
Co-authored-by: Son H. Nguyen <33925625+nhs000@users.noreply.github.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* refactor github copilot

* test_github_copilot_transformation.py

* test_github_copilot_authenticator.py

* add GitHub Copilot

* fix order

* doc fix

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Son H. Nguyen <nhs.000.dev@gmail.com>
Co-authored-by: subnet.dev <50828879+subnet-dev@users.noreply.github.com>
Co-authored-by: Son H. Nguyen <33925625+nhs000@users.noreply.github.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2025-07-04 13:12:16 -07:00
Joost van Doorn 453591ed7c Fix: Fix custom ca bundle support in aiohttp transport (#12281)
* Unify usage of get_ssl_configuration

* Fix doc
2025-07-04 12:48:20 -07:00
Ishaan Jaff 39955129f5 fix mapped tests (#12320)
* fix - use flush llm client cache

* faster mapped tests

* test_async_multiple_response_ids_routing

* fix tests

* test_ateam_member_update_admin_requires_premium

* regular mapped tests

* Revert "Fix: Initialize JSON logging for all loggers when JSON_LOGS=True (#12206)"

This reverts commit 2c60c316ec.

* reset num workers
2025-07-04 10:04:43 -07:00
Low Jian Sheng c2d5682a90 Fix gemini tool call sequence (#11999)
* fix gemini tool call sequence

* modify tests
2025-07-04 07:29:39 -07:00
Ishaan Jaff 7319d6b003 test llamafile 2025-07-03 22:35:22 -07:00
Krish Dholakia bba75aa12b Add 'audio_url' message type support for VLLM (#12270)
* fix(openai.py): add audio_url content type for vllm

Fixes https://github.com/BerriAI/litellm/issues/12196

* test: fix test
2025-07-02 20:37:45 -07:00
Nathan Brake 14feb5e454 feat: Turn Mistral to use llm_http_handler (#12245)
* Enhance Mistral API: Add support for parallel tool calls and refine name handling in tool messages. Plus, introduce a new test for parallel tool calls in the Mistral model.

* tests

* make mypy happy

* Refine name handling in Mistral chat transformation: clarify conditions for removing the 'name' field based on message role and content.

* refactor: streamline Mistral integration by removing deprecated references and adding a new handler

- Removed "mistral" from the list of compatible providers in constants.
- Updated the completion function in main.py to utilize the new Mistral handler.
- Deleted outdated Mistral chat and embedding files.
- Introduced a new handler for Mistral chat completions, implementing the llm_http_handler pattern.
- Added integration tests for the Mistral handler to ensure proper API base and key handling.

* lint

* fix: remove unneeded handler object

* add tests

* Address PR comments
2025-07-02 14:01:56 -07:00
Nathan Brake c2dbc9c64b fix: mistral transform_response handling for empty string content (#12202)
* Enhance Mistral API: Add support for parallel tool calls and refine name handling in tool messages. Plus, introduce a new test for parallel tool calls in the Mistral model.

* tests

* make mypy happy

* Refine name handling in Mistral chat transformation: clarify conditions for removing the 'name' field based on message role and content.

* handle mistral returning '' instead of None
2025-07-01 22:31:50 -07:00
Ishaan Jaff 7471a30dcd Revert "Fix: Preserve full path structure for Gemini custom api_base (#12215)" (#12227)
This reverts commit f47254ecab.
2025-07-01 20:41:39 -07:00
Ciprian Tomoiaga b3b4c65ac4 Fix default parameters for ollama-chat (#12201) 2025-07-01 17:59:04 -07:00
Cole McIntosh f47254ecab Fix: Preserve full path structure for Gemini custom api_base (#12215)
* Fix: Preserve full path structure for Gemini custom api_base (Fixes #11959)

This fix addresses an issue where custom api_base URLs (like Cloudflare AI Gateway)
were not working correctly with Google AI Studio (Gemini) models.

The problem was that the _check_custom_proxy method was simply appending the endpoint
to the custom base URL, resulting in malformed URLs like:
https://gateway.ai.cloudflare.com/v1/my-id/my-gateway/google-ai-studio:generateContent

Instead of the correct format:
https://gateway.ai.cloudflare.com/v1/my-id/my-gateway/google-ai-studio/v1beta/models/gemini-2.5-flash:generateContent

Changes:
- Modified _check_custom_proxy to preserve the full path structure from the original URL
- Extracts the path from the original Google AI URL and appends it to the custom base
- Maintains backward compatibility for Vertex AI models (unchanged behavior)
- Added comprehensive tests to verify the fix works correctly

Fixes #11959
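
The path-preservation idea described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the PR: `apply_custom_api_base` is a hypothetical helper name, and the Google AI Studio host in the example is assumed for illustration. It takes the path (and query string, if any) from the original provider URL and appends it to the custom base.

```python
from urllib.parse import urlsplit


def apply_custom_api_base(original_url: str, custom_api_base: str) -> str:
    """Re-root original_url onto custom_api_base, keeping its path and query.

    Hypothetical helper for illustration; not the library's implementation.
    """
    parts = urlsplit(original_url)
    url = custom_api_base.rstrip("/") + parts.path
    if parts.query:
        url += "?" + parts.query
    return url


# Assumed original URL shape for a Google AI Studio request.
original = (
    "https://generativelanguage.googleapis.com"
    "/v1beta/models/gemini-2.5-flash:generateContent"
)
gateway = "https://gateway.ai.cloudflare.com/v1/my-id/my-gateway/google-ai-studio"
print(apply_custom_api_base(original, gateway))
```

Under these assumptions the printed URL matches the "correct format" quoted in the message above.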

* Fix: Update test to match actual Gemini URL format and fix double colon issue

- Fixed test expectation to include the full model path with 'gemini/' prefix
- Fixed double colon issue in Vertex AI URL construction when using custom api_base
- All tests now pass successfully
2025-07-01 17:56:22 -07:00
Krish Dholakia 9582c88eab Non-anthropic (gemini/openai/etc.) models token usage returned when calling /v1/messages (#12184)
* fix(proxy_server.py): handle empty config yaml

Fixes https://github.com/BerriAI/litellm/issues/12163

* fix(gemini/common_utils.py): replace models/ as expected, instead of using 'strip'

Fixes https://github.com/BerriAI/litellm/issues/12160
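
The pitfall behind this fix is general Python behaviour: `str.strip` treats its argument as a set of characters to remove from both ends, not as a prefix. The sketch below illustrates the difference; `strip_models_prefix` is a hypothetical helper name and `gemini-pro` is only an example model name.

```python
def strip_models_prefix(model_name: str) -> str:
    """Remove a leading 'models/' only when it is an actual prefix."""
    prefix = "models/"
    if model_name.startswith(prefix):
        return model_name[len(prefix):]
    return model_name


# strip() removes any of the characters m, o, d, e, l, s, / from both ends,
# so the trailing "o" of the model name is eaten as well.
print("models/gemini-pro".strip("models/"))       # gemini-pr
print(strip_models_prefix("models/gemini-pro"))   # gemini-pro
```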

* fix(anthropic/experimental_pass_through/messages/transformation.py): check for env var when selecting api key

* fix(anthropic/transformation.py): return tool_use content block start on anthropic bridge

Closes https://github.com/BerriAI/litellm/issues/12158

* fix(anthropic/streaming_iterator.py): fix setting index in block

ensure index is set just once and increments correctly when a new block is created

* fix(anthropic/adapters/handler.py): update logging obj with stream options value if set

* feat(anthropic/streaming_iterator.py): return usage from chat completion to messages bridge

enables usage tracking for non-anthropic models

Closes https://github.com/BerriAI/litellm/issues/12132

* fix(streaming_iterator.py): safely access usage chunk

* fix: suppress linting error

* test: update tests

* fix: fix streaming errors
2025-07-01 17:41:48 -07:00
Krish Dholakia ee9dd158dd Fix - handle empty config.yaml + Fix gemini /models - replace models/ as expected, instead of using 'strip' (#12189)
* fix(proxy_server.py): handle empty config yaml

Fixes https://github.com/BerriAI/litellm/issues/12163

* fix(gemini/common_utils.py): replace models/ as expected, instead of using 'strip'

Fixes https://github.com/BerriAI/litellm/issues/12160

* fix(anthropic/experimental_pass_through/messages/transformation.py): check for env var when selecting api key

* docs(config_settings.md): add api key to docs
2025-06-30 21:56:03 -07:00
Young Han 042f6b187e [Bug Fix] Fix Error code: 307 for LlamaAPI Streaming Chat (#11946)
* fix: add follow_redirects to avoid 307 error code

* feat: add error 307 test case

* feat: add follow_redirects=True to AsyncClient
2025-06-30 16:52:42 -07:00
Krish Dholakia f7af8902b0 /v1/messages - Remove hardcoded model name on streaming + Tags - enable setting custom header tags (#12131)
* fix(anthropic/experimental_pass_through): use given model name when returning streaming chunks

don't hardcode model name on streaming

confusing for user

* fix(anthropic/streaming_iterator.py): remove scope of import

* feat(litellm_logging.py): allow admin to specify additional headers for using as spend tags

Closes https://github.com/BerriAI/litellm/issues/12129

* test(test_litellm_logging.py): add unit tests

* feat(openweb_ui.md): add custom tag tutorial to docs

* docs(cost_tracking.md): add tag based usage UI screenshot

* test: update test

* fix: fix import
2025-06-28 21:49:35 -07:00
Krish Dholakia ee6e76e1f9 Bedrock Passthrough cost tracking (/invoke + /converse routes - streaming + non-streaming) (#12123)
* refactor(passthrough_endpoints-success-handler): refactor llm passthrough logging logic

isolate the llm translation work to enable cost tracking on sdk

* feat: initial implementation of passthrough SDK cost calculation

enables bedrock passthrough cost tracking to work

* feat(cost_calculator.py): working cost calculation for bedrock passthrough

* feat(litellm_logging.py): consider allm_passthrough in cost tracking

allows async calls (e.g. via proxy) to work

* feat(bedrock/passthrough): working event stream decoding for bedrock passthrough calls + logging instrumentation for passthrough sdk calls (log on stream completion)

Enables bedrock streaming cost calculation

* feat(litellm_logging.py): support streaming passthrough cost tracking

* feat(passthrough/main.py): working async streaming cost calculation

Closes https://github.com/BerriAI/litellm/issues/11359

* feat(proxy_server.py): fix passthrough routing when llm router enabled

* feat: further fixes

* feat(bedrock/): working bedrock passthrough cost tracking (non-streaming)

* feat(litellm_logging.py): working usage tracking for bedrock passthrough calls

ensures tokens are logged

* feat(bedrock/passthrough): add converse passthrough cost tracking support

* feat(base_llm/passthrough): remove redundant function

* refactor(litellm_logging.py): refactor function to be below 50 LOC

* test: update test

* test: remove redundant test
2025-06-27 20:01:12 -07:00
Ishaan Jaff ebf6395bc1 [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM (#12119)
* add ELEVENLABS as a provider

* add deepgram to main.py

* add ElevenLabsException

* add ElevenLabsAudioTranscriptionConfig

* add transform_audio_transcription_response

* TestElevenLabsAudioTranscription

* add elevenlabs/scribe_v1 to model cost map

* add ElevenLabsAudioTranscriptionConfig

* add AudioTranscriptionRequestData

* add ElevenLabs transform

* use AudioTranscriptionRequestData

* refactoring fixes

* add ProcessedAudioFile util for reading audio files

* test_elevenlabs_diarize_parameter_passthrough

* docs eleven labs

* docs fixes

* fix code qa checks

* fixes - audio transcription

* ui - add ElevenLabs logo

* add elevenlabs logo

* docs - ElevenLabs

* test fix elevenlabs
2025-06-27 17:50:49 -07:00
Kishan fc17da0aef [Bug Fix] Anthropic - Token Usage Null Handling in calculate_usage (#12068)
* [Bug Fix] Anthropic - Token Usage Null Handling in calculate_usage (BerriAI/litellm#11920)

* [Fix] Missed a null check and used a cast instead by error
2025-06-27 10:00:23 -07:00
Ishaan Jaff 22ff3da3cf [Fix] Allow using HTTP_ Proxy settings with trust_env (#12066)
* allow using trust_env

* add docs on how to use HTTP_PROXY

* docs AIOHTTP_TRUST_ENV

* test_aiohttp_transport_trust_env_setting

* docs fix
2025-06-26 08:37:22 -07:00
Ishaan Jaff 5b8e300150 [Feat] gemini-cli integration - Add Logging + Cost tracking for stream + non-stream Vertex / Google AI Studio routes (#12058)
* add google generate content to call types

* Revert "add google generate content to call types"

This reverts commit 6f57dde293e50674c8fca0feac6e01c27d9e1c96.

* add CallTypesLiteral for gemini

* allow passing model to vertex pass through logging handler

* update logging handler

* fix checking if stream

* add async streaming logging for vtx

* refactor _transform_google_generate_content_to_openai_model_response

* fix logging_obj

* fixes _handle_non_streaming_google_genai_generate_content_response_logging

* logging callback tests

* ruff check fixes

* test _is_streaming_request

* test_ensure_initialize_azure_sdk_client_always_used

* fix BaseGoogleGenAIGenerateContentStreamingIterator

* fix - linting errors

* req - add google-genai
2025-06-25 22:26:20 -07:00
Krish Dholakia 1a4ad8bf18 Update mistral 'supports_response_schema' field + Fix ollama embedding (#12024)
* build(model_prices_and_context_window.json): update all mistral models (besides codestral-mamba) to indicate support for response schema

Closes https://github.com/BerriAI/litellm/issues/12012

* fix(route_llm_request.py): if llm router is not initialized, go straight through to litellm sdk

Fixes https://github.com/BerriAI/litellm/issues/12008

* test: add unit test

* fix(ollama_embeddings): fix unnecessary await

Fixes https://github.com/BerriAI/litellm/issues/11997

* test: update ollama embedding tests
2025-06-25 07:20:13 -07:00
Krish Dholakia 24c2cd1bd9 Anthropic /v1/messages - Custom LLM Server support (#12016)
* fix(handler.py): support routing custom llm's to chat completion handler

Adds custom llm support for anthropic

* test(test_anthropic_experimental_pass_through_messages_handler.py): add unit test confirming custom llm respected

* docs(custom_llm_server.md): document anthropic custom llm translation

* test(volcengine.py): map thinking in extra body

Fixes https://github.com/BerriAI/litellm/issues/11879

* feat(main.py): support `azure/responses/<deployment-name>` model string

this allows us to route the model correctly

Closes https://github.com/BerriAI/litellm/issues/11879

* docs(azure_responses.md): document calling azure responses api models via chat completions bridge

Closes https://github.com/BerriAI/litellm/issues/11917

* fix: fix custom provider check

* test: update tests
2025-06-24 22:00:44 -07:00
Ishaan Jaff d6cc384780 [Feat] OpenAI/Azure OpenAI - Add support for creating vector stores on LiteLLM (#12021)
* add create/acreate vector store

* add azure config

* add _base_validate_azure_environment

* fix base test

* add get_base_create_vector_store_args

* use base llm for headers responses api

* add _get_base_azure_url

* fix AzureOpenAIVectorStoreConfig

* TestAzureOpenAIVectorStore

* fix azure openai vector store

* fix test comment

* fix unused imports

* test_validate_environment_azure_api_key_within_secret_str

* test_azure_transformation.py
2025-06-24 20:46:48 -07:00
Ishaan Jaff 1467a99aab [Fix] Magistral small system prompt diverges too much from the official recommendation (#12007)
* fix mistral _get_mistral_reasoning_system_prompt

* fix test_get_mistral_reasoning_system_prompt
2025-06-24 13:45:58 -07:00
Krish Dholakia a89397a798 Litellm dev 06 23 2025 p1 (#11989)
* fix(litellm_logging.py): fix using router model id for logging calls

Fixes https://github.com/BerriAI/litellm/issues/11975#issuecomment-2995882238

* test(test_litellm_logging.py): add unit test for custom price tracking

* fix(vertex_ai/): don't send invalid format parameter to vertex

causes calls to fail

* fix(vertex_ai_context_caching.py): if cached content present and tools in message, cache tools as well

gemini throws errors if tools passed in alongside cached content

* test: add unit tests

* fix: fix linting errors

* test: test_vertex_ai_common_utils.py

update test

* fix(streaming_handler.py): unset response cost when creating model response
2025-06-23 22:33:06 -07:00
Ishaan Jaff 3d542846b2 test fix 2025-06-23 20:03:26 -07:00
Ishaan Jaff 4f98862ef9 test_azure_common_utils.py 2025-06-23 18:34:48 -07:00
hsuyuming 180ed14918 fix: fix test_get_azure_ad_token_with_oidc_token testcase issue, because (#11988)
the CI/CD pipeline sets AZURE_CLIENT_SECRET as an environment variable, so
we need to set it to None in this case
2025-06-23 18:11:06 -07:00
Cole McIntosh 02a095d4db feat: implement Perplexity citation tokens and search queries cost calculation (#11938)
* feat: add citation_cost_per_token and search_queries_cost_per_1000 fields to ModelInfoBase

- Add citation_cost_per_token field to ModelInfoBase for Perplexity citation token costs
- Add search_queries_cost_per_1000 field to ModelInfoBase for Perplexity search query costs
- Update _get_model_info_helper to include these fields in model info responses
- Enables proper cost calculation for Perplexity-specific usage metrics

* feat: update Perplexity sonar-deep-research model pricing configuration

- Update input/output token costs (priced per million tokens)
- Add reasoning token cost (priced per million tokens)
- Add citation_cost_per_token (priced per million tokens, same rate as input)
- Add search_queries_cost_per_1000 at $0.005 per 1000 search queries
- Remove deprecated search_context_cost_per_query structure
- Aligns with Perplexity's updated pricing model for deep research capabilities

* feat: implement Perplexity-specific cost calculator

- Create cost_per_token function for Perplexity provider
- Calculate standard input/output token costs
- Add citation token cost calculation using citation_cost_per_token rate
- Add reasoning token cost calculation with fallback to completion_tokens_details
- Add search query cost calculation using search_queries_cost_per_1000 rate
- Return separate prompt_cost and completion_cost for accurate billing
- Handles all Perplexity-specific usage metrics: citation_tokens, num_search_queries, reasoning_tokens
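
The calculation listed above can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's code: the function name is hypothetical, the rates in the usage lines are placeholders rather than real pricing, reasoning tokens are assumed to be billed in addition to `completion_tokens`, and the search query cost is assumed to land on the completion side.

```python
from typing import Tuple


def perplexity_cost_sketch(
    prompt_tokens: int,
    completion_tokens: int,
    citation_tokens: int = 0,
    reasoning_tokens: int = 0,
    num_search_queries: int = 0,
    *,
    input_cost_per_token: float,
    output_cost_per_token: float,
    citation_cost_per_token: float = 0.0,
    reasoning_cost_per_token: float = 0.0,
    search_queries_cost_per_1000: float = 0.0,
) -> Tuple[float, float]:
    """Return (prompt_cost, completion_cost) for one response."""
    # Prompt side: standard input tokens plus citation tokens.
    prompt_cost = (
        prompt_tokens * input_cost_per_token
        + citation_tokens * citation_cost_per_token
    )
    # Completion side: output tokens, reasoning tokens, and search queries
    # (assigning search cost to this side is an assumption of the sketch).
    completion_cost = (
        completion_tokens * output_cost_per_token
        + reasoning_tokens * reasoning_cost_per_token
        + (num_search_queries / 1000) * search_queries_cost_per_1000
    )
    return prompt_cost, completion_cost


# Placeholder rates, chosen only to make the arithmetic easy to follow.
prompt_cost, completion_cost = perplexity_cost_sketch(
    1000, 500, citation_tokens=200, reasoning_tokens=100, num_search_queries=2,
    input_cost_per_token=1e-6, output_cost_per_token=2e-6,
    citation_cost_per_token=1e-6, reasoning_cost_per_token=3e-6,
    search_queries_cost_per_1000=5.0,
)
print(prompt_cost, completion_cost)
```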

* feat: integrate Perplexity cost calculator with main cost calculation system

- Import perplexity_cost_per_token function in main cost calculator
- Add perplexity provider case to cost_per_token function
- Enables automatic routing of Perplexity cost calculations to provider-specific logic
- Maintains compatibility with existing cost calculation patterns
- Supports all Perplexity-specific cost metrics through unified interface

* feat: enhance Perplexity response transformation to extract cost-related fields

- Override transform_response method to extract Perplexity-specific usage fields
- Add _enhance_usage_with_perplexity_fields method to process API responses
- Extract citation_tokens from citations array using character-based estimation (~4 chars/token)
- Extract num_search_queries from both usage field and root level with priority handling
- Create usage object when none exists to ensure cost fields are always captured
- Handle empty citations and missing fields gracefully
- Enables automatic extraction of cost metrics from Perplexity API responses
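
The character-based estimate mentioned above (~4 characters per token) could look like the sketch below. The helper name and the choice to round down are assumptions; the commit message only states the ratio.

```python
from typing import List

CHARS_PER_TOKEN = 4  # rough heuristic quoted in the commit message


def estimate_citation_tokens(citations: List[str]) -> int:
    """Estimate token count from total citation length (rounding down is assumed)."""
    total_chars = sum(len(c) for c in citations if isinstance(c, str))
    return total_chars // CHARS_PER_TOKEN


print(estimate_citation_tokens(["a" * 8, "b" * 12]))  # 5
print(estimate_citation_tokens([]))                    # 0
```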

* test: add comprehensive test suite for Perplexity cost calculation features

Add 82 comprehensive tests across 3 test files:

- test_perplexity_cost_calculator.py (59 tests):
  * Cost calculation with citation tokens, search queries, reasoning tokens
  * Various combinations and edge cases
  * Integration with main cost calculator
  * Model info access and validation
  * Zero values and missing fields handling

- test_perplexity_chat_transformation.py (12 tests):
  * Citation token extraction from API responses
  * Search query extraction from usage and root fields
  * Priority handling and field aggregation
  * Empty citations and missing fields handling
  * Token estimation accuracy validation

- test_perplexity_integration.py (11 tests):
  * End-to-end cost calculation workflows
  * High-volume and edge case scenarios
  * Model info integration validation
  * Case-insensitive provider matching
  * Transformation preservation of existing fields

Ensures reliability and correctness of all Perplexity cost features with comprehensive coverage of happy path, edge cases, and error conditions.

* fix: remove unused Union import from Perplexity transformation

- Remove unused typing.Union import from litellm/llms/perplexity/chat/transformation.py
- Fixes F401 linting error: 'typing.Union imported but unused'
- Maintains only necessary imports: Any, List, Optional, Tuple

* Fix JSON schema validation and use web_search_requests field

- Add citation_cost_per_token and search_queries_cost_per_1000 to JSON schema
- Update Perplexity transformation to use web_search_requests in PromptTokensDetailsWrapper
- Update Perplexity cost calculator to read from web_search_requests field
- Maintain backward compatibility while using standard LiteLLM fields

* Fix type errors in Perplexity cost calculator

- Add null checks for token counts and cost values to prevent None multiplication errors
- Use .get() with fallback values instead of direct dictionary access
- Ensure all arithmetic operations handle None values safely

This fixes the failing job 44517525148 type errors.

* Refactor Perplexity cost calculation tests to improve accuracy and consistency

- Replace absolute difference assertions with math.isclose for better precision in cost comparisons
- Update tests to utilize PromptTokensDetailsWrapper for handling web search requests
- Ensure all test cases correctly reflect the new structure of usage fields, enhancing clarity and maintainability
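
Why `math.isclose` is a better fit than a fixed absolute difference for cost values: per-token costs are very small, so a fixed absolute tolerance can pass even when one value is double the other. A short sketch (the `costs_match` name and the tolerance are illustrative choices):

```python
import math


def costs_match(actual: float, expected: float) -> bool:
    """Compare two costs with a relative tolerance."""
    return math.isclose(actual, expected, rel_tol=1e-9)


tiny_a, tiny_b = 1e-9, 2e-9
print(abs(tiny_a - tiny_b) < 1e-6)   # True: the absolute check hides a 2x error
print(costs_match(tiny_a, tiny_b))   # False
print(costs_match(0.1 + 0.2, 0.3))   # True: float rounding noise is tolerated
```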

* fix: address type hinting issues in PerplexityChatConfig usage handling

- Add type ignore comments to model_response.usage assignments to resolve type checking errors
- Ensures compatibility with type definitions while maintaining existing functionality

* Update model pricing configuration in JSON backup

- Add citation_cost_per_token and search_queries_cost_per_1000 fields to enhance cost tracking
- Remove deprecated search_context_cost_per_query structure to streamline pricing model
- Aligns with recent updates in Perplexity's pricing strategy

* Update search queries cost structure in model_prices_and_context_window.json to use search_context_cost_per_query

* Refactor search queries cost structure in model_prices_and_context_window_backup.json and update related code to use search_queries_cost_per_query. Remove deprecated search_queries_cost_per_1000 references across model info and tests.

* Enhance cost calculation in cost_calculator.py by introducing a safe float casting function to handle potential None and invalid values. Update cost calculations for input, citation, output, reasoning, and search query tokens to use this new function, ensuring more robust handling of model pricing data.
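
A safe float cast of the kind described here might be sketched as follows. The name `safe_float` and the default of 0.0 are assumptions; the actual helper in cost_calculator.py may differ.

```python
from typing import Any


def safe_float(value: Any, default: float = 0.0) -> float:
    """Cast value to float, falling back to default for None or invalid input."""
    if value is None:
        return default
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


print(safe_float("1.5"))   # 1.5
print(safe_float(None))    # 0.0
print(safe_float("n/a"))   # 0.0
```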

* Refactor cost calculation in cost_calculator.py to support both legacy and current search cost keys. Enhance handling of search cost values by accommodating both dictionary and float formats, ensuring robust cost computation for search queries.
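
One way to accept both key names and both value shapes is sketched below. The lookup order, the helper name, and the `search_context_size_low` sub-key are assumptions made for illustration; only the two top-level key names appear in this commit log.

```python
from typing import Any, Dict


def resolve_search_cost_per_query(
    model_info: Dict[str, Any],
    size_key: str = "search_context_size_low",  # assumed sub-key for dict values
) -> float:
    """Return a per-query search cost from either key, as a dict or a number."""
    for key in ("search_context_cost_per_query", "search_queries_cost_per_query"):
        raw = model_info.get(key)
        if isinstance(raw, dict):
            raw = raw.get(size_key)
        if isinstance(raw, (int, float)):
            return float(raw)
    return 0.0


print(resolve_search_cost_per_query({"search_context_cost_per_query": 0.005}))
print(resolve_search_cost_per_query(
    {"search_context_cost_per_query": {"search_context_size_low": 0.005}}
))
```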

* Update test cases to reflect changes in cost structure, renaming search_queries_cost_per_query to search_context_cost_per_query for consistency with recent refactor. Ensure assertions in tests align with updated cost keys.

* Update test_perplexity_integration.py to rename search_queries_cost_per_query to search_context_cost_per_query, ensuring consistency with recent cost structure changes. Adjust assertions to align with updated cost keys.
2025-06-23 14:15:25 -07:00
Ishaan Jaff ef7f8cce93 [Bug Fix] Perplexity - LiteLLM doesn't support 'web_search_options' for Perplexity's Sonar Pro model (#11983)
* TestPerplexityWebSearch

* use supports_web_search

* Update tests/test_litellm/llms/perplexity/test_perplexity.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-23 10:21:50 -07:00
Juan Cruz-Benito 962fd67227 Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space (#11854)
* Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space

* Revert "Fixing watsonx error: 'model_id' or 'model' cannot be specified in the request body for models in a deployment space"

This reverts commit 9d16a3000b24d255f70ab74c01eb6267354df151.

* Implementing feedback from code review
2025-06-23 10:14:10 -07:00
Ishaan Jaff cde20cf825 fix - checking proxy settings (#11947) 2025-06-23 09:30:40 -07:00
hsuyuming e3ba888c63 fix: make response api support Azure Authentication method (#11941)
* fix: make response api support Azure Authentication method
1. Support diverse Azure authentication methods
2. Use distinct headers for API key and Azure AD token based on this
   documentation (https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/responses?tabs=rest-api#generate-a-text-response)
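
The header distinction in point 2 can be sketched as below: Azure OpenAI accepts an API key in the `api-key` header and an Azure AD (Entra ID) token as a bearer token in `Authorization`. The function name and the precedence given to the token are assumptions of this sketch, not the PR's code.

```python
from typing import Dict, Optional


def build_azure_auth_headers(
    api_key: Optional[str] = None,
    azure_ad_token: Optional[str] = None,
) -> Dict[str, str]:
    """Pick the auth header for a request (token precedence is assumed)."""
    if azure_ad_token:
        return {"Authorization": f"Bearer {azure_ad_token}"}
    if api_key:
        return {"api-key": api_key}
    raise ValueError("Either api_key or azure_ad_token must be provided")


print(build_azure_auth_headers(api_key="my-key"))
print(build_azure_auth_headers(azure_ad_token="my-token"))
```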

* fix: fix path issue

* fix lint error

* rename test_transformation.py to test_azure_transformation.py

* change litellm_params as Optional type
2025-06-23 08:43:20 -07:00
Johnny.H e8ce3995ca fix aws bedrock claude tool call index (#11842) 2025-06-20 23:21:08 -07:00
Ishaan Jaff 99d851544a [Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API (#11934)
* fix get_complete_url

* fixes _is_azure_v1_api_version

* test_azure_responses_api_preview_api_version

* TestAzureResponsesAPIConfig

* add azure/codex-mini

* fix azure/codex-mini

* Update litellm/llms/azure/responses/transformation.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix linting

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-20 18:08:44 -07:00
Ishaan Jaff 75298af605 [Bug Fix] Cost tracking and logging via the /v1/messages API are not working when using Claude Code (#11928)
* add test_anthropic_messages_litellm_router_streaming_with_logging to base tests

* move test

* fixes for base ant tests

* working bedrock ant logging

* use BaseAnthropicMessagesStreamingIterator

* use common iterator for messages streaming

* TestAnthropicDirectAPI

* test_anthropic_claude3_transformation.py

* fix code QA checks

* fix logging for anthropic messages in SLP

* fix TestAnthropicOpenAIAPI

* remove hard coded usage for adapter

* test_anthropic_messages_litellm_router_streaming_with_logging
2025-06-20 18:08:35 -07:00
Krish Dholakia cf83b541e5 Volcengine - thinking param support + Azure - handle more gpt custom naming patterns (#11914)
* fix(volcengine.py): add thinking param support

Closes https://github.com/BerriAI/litellm/issues/11879

* fix(gpt_transformation.py): handle azure custom names - e.g. `gpt-4-1`

Closes https://github.com/BerriAI/litellm/issues/11834
2025-06-20 09:40:33 -07:00
Krish Dholakia 308e82d885 LiteLLM SDK <-> Proxy improvement (don't transform message client-side) + Bedrock - handle qs:.. in base64 file data + Tag Management - support adding public model names (#11908)
* fix(factory.py): handle qs:.. in mime type

Fixes https://github.com/BerriAI/litellm/issues/11839

* feat(litellm_proxy/): don't transform messages client-side

leave litellm proxy messages untouched - allow proxy to handle transformation

prevents double transformation

* feat(tag_management_endpoints.py): support adding models to tag by adding model_name

Closes https://github.com/BerriAI/litellm/issues/11884

* test(test_tag_management_endpoints.py): add unit tests for adding new model by public model name

* test: update test
2025-06-19 22:34:18 -07:00
Krish Dholakia 40cc61c8f3 build(model_prices_and_context_window.json): mark all gemini-2.5 mode… (#11907)
* build(model_prices_and_context_window.json): mark all gemini-2.5 models as supporting pdf input

Closes https://github.com/BerriAI/litellm/issues/11881

* fix(anthropic_transformation.py): set custom llm provider custom property

Fixes https://github.com/BerriAI/litellm/issues/11861

* test: add unit test for checking supports_reasoning

* test: add test for vertex ai flow

* feat(bedrock/anthropic): ensure thinking param correctly passed for bedrock/invoke
2025-06-19 21:07:25 -07:00
Nathan Brake 1c4fdb4a8f Enhance Mistral API: Add support for parallel tool calls (#11770)
* Enhance Mistral API: Add support for parallel tool calls and refine name handling in tool messages. Plus, introduce a new test for parallel tool calls in the Mistral model.

* tests

* make mypy happy

* Refine name handling in Mistral chat transformation: clarify conditions for removing the 'name' field based on message role and content.
2025-06-19 20:12:39 -07:00
Ishaan Jaff d4b34549bc [Fix] Networking - allow using CA Bundles (#11906)
* fix _get_ssl_context

* fixes for using HTTP handler
2025-06-19 20:09:08 -07:00
Pascal Lim ad2e2302e2 feat: add workload identity federation between GCP and AWS (#10210) 2025-06-19 18:31:58 -07:00
Ishaan Jaff 29bf89cf9c fix(vertex_ai): Handle missing tokenCount in promptTokensDetails (#11… (#11896)
* fix(vertex_ai): Handle missing tokenCount in promptTokensDetails (#11581)

This PR is a Solution to the Error converting to a valid response block='tokenCount'. File an issue if litellm error - https://github.com/BerriAI/litellm/issues

It happens because vertex_ai sometimes does not send the token count for the audio modality.
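
A defensive way to read such usage metadata is to treat a missing `tokenCount` as zero instead of indexing the key directly. The sketch below assumes each `promptTokensDetails` entry is a dict with a `modality` and an optional `tokenCount`; the helper name is hypothetical.

```python
from typing import Any, Dict, List


def tokens_by_modality(prompt_tokens_details: List[Dict[str, Any]]) -> Dict[str, int]:
    """Sum token counts per modality, treating a missing tokenCount as 0."""
    totals: Dict[str, int] = {}
    for detail in prompt_tokens_details:
        modality = detail.get("modality", "UNKNOWN")
        totals[modality] = totals.get(modality, 0) + detail.get("tokenCount", 0)
    return totals


details = [{"modality": "TEXT", "tokenCount": 12}, {"modality": "AUDIO"}]
print(tokens_by_modality(details))  # {'TEXT': 12, 'AUDIO': 0}
```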

* test_vertex_ai_usage_metadata_missing_token_count

---------

Co-authored-by: Nishith Jain <167524748+KingNish24@users.noreply.github.com>
2025-06-19 13:54:02 -07:00
Ishaan Jaff 08b2b4f5f5 [Feat] Enable Tool Calling for meta_llama (#11895)
* Enable Tool Calling for `meta_llama` (#11825)

* feat: enable tools and function_call features

* fix: ignore pydantic warnings for StreamingChoices from llama-api

* docs: add tool calling examples

* docs: change default models to Maverick

* docs: fix output of tool use

* test_map_openai_params

---------

Co-authored-by: Young Han <110819238+seyeong-han@users.noreply.github.com>
2025-06-19 13:44:22 -07:00
Krish Dholakia 0d09c8ec96 Litellm dev 06 18 2025 p1 (#11872)
* fix(spend_tracking_utils.py): add user agent tags from standard logging payload, in spend logs payload

* feat(litellm_logging.py): identify user agent tags as `User-Agent: ..` and allow admin to disable storing user agent as tag

* fix(azure_ai/): pass content type header in azure ai request

Fixes https://github.com/BerriAI/litellm/issues/11227

* test: add unit test

* fix(router.py): fix passing dynamic credentials to retrieve batch

Fixes batch retrieval when using router

* test: add more unit tests
2025-06-18 21:24:36 -07:00
Krish Dholakia 7f8b2579a2 Minor Fixes (#11868)
* fix(litellm_pre_call_utils.py): add user agent tags to spend logs in standard logging payload logic

avoid clash when tag based routing is enabled

* test: remove redundant test

* test: rename oidc test to run earlier

quicker debugging

* fix(azure.py): return more detailed error message

* fix(azure/common_utils.py): use default scope, if scope is none

fixes oidc test

* fix: always default to cognitiveservices.azure.com

* test: update test
2025-06-18 14:12:59 -07:00
Krish Dholakia e5fd313a48 Completion-To-Responses Bridge: Support passing image URLs (#11833)
* fix(completion_to_responses_bridge_transformation.py): support passing image URLs to responses api models

Fixes https://github.com/BerriAI/litellm/issues/11820

* fix(base_aws_llm.py): if boto3 present, try and get the configured region name

Closes https://github.com/BerriAI/litellm/issues/8847

* fix: fix imports

* fix: fix linting errors
2025-06-18 12:48:17 -07:00
Ishaan Jaff 6ffebe7394 [Fix] v1/messages endpoint always uses us-central1 with vertex_ai-anthropic models (#11831)
* fix - vertex location

* test_validate_environment_uses_vertex_ai_location
2025-06-18 07:00:04 -07:00
X4tar 2740c8d77d Fix vertex ai claude thinking params (#11796)
* fix: vertex_ai/claude-sonnet-4 thinking params cannot be accepted

* CHORE: add unit test

---------

Co-authored-by: wick.hu <wick.hu@momenta.ai>
2025-06-17 22:35:31 -07:00
Abinand P 99c2a7fb70 feat: update the feature of ollama_embeddings to work on a sync api (#11746)
* feat: update the feature of ollama_embeddings to work on a sync api

Signed-off-by: Abinand P <abinand0911@gmail.com>

* lint:fixing of the lint file

Signed-off-by: Abinand P <abinand0911@gmail.com>

* fix:test

Signed-off-by: Abinand P <abinand0911@gmail.com>

* chore: added test for ollama embedding and refactored handler

Signed-off-by: Abinand P <abinand0911@gmail.com>

* fix:lint error

Signed-off-by: Abinand P <abinand0911@gmail.com>

---------

Signed-off-by: Abinand P <abinand0911@gmail.com>
2025-06-16 19:07:33 -07:00