3254 Commits

Author SHA1 Message Date
Krish Dholakia 270d612029 Merge branch 'main' into litellm_dev_09_10_2025_p1 2025-09-19 22:01:57 -07:00
Ishaan Jaffer a6795a6560 test fix 2025-09-19 18:05:36 -07:00
Ishaan Jaffer eabc9cd415 test test_e2e_bedrock_embedding 2025-09-19 18:03:20 -07:00
Ishaan Jaffer ed5c9f1c69 test fixes for mapped tests 2025-09-19 18:01:36 -07:00
Krish Dholakia 6142c3ac3d Merge pull request #14738 from BerriAI/litellm_dev_09_19_2025_p1
UI SSO - consider token info endpoint on generic SSO route for access control groups
2025-09-19 17:58:37 -07:00
Ishaan Jaffer dec15a80a2 test: mcp test fix 2025-09-19 17:52:32 -07:00
Ishaan Jaffer 15898c89e1 test: test_azure_openai_gpt_5_responses_api 2025-09-19 17:45:34 -07:00
Ishaan Jaffer c3f150b13d mcp test fix 2025-09-19 17:41:15 -07:00
Ishaan Jaffer 0ba4c7753a test fix 2025-09-19 17:33:37 -07:00
Krrish Dholakia 7694e9f6a9 test: refactoring + testing 2025-09-19 17:20:15 -07:00
Ishaan Jaffer d739d226ed fix: test 2025-09-19 16:28:09 -07:00
Ishaan Jaff 90ee9e4587 [Feat] Dynamic Rate Limiter v3 - fixes to ensure priority routing works as expected (#14734)
* fix: dynamic limiter v3

* fix: dynamic limiter v3

* feat: add dynamic limiter v3

* feat: add dynamic limiter v3

* feat: add dynamic limiter v3 in init litellm_logging

* feat: add dynamic limiter v3 in init litellm_logging

* fix: priority rate limiting

* Potential fix for code scanning alert no. 3397: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix: priority rate limiting

* fix: ruff

* fix: mypy lint

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-09-19 16:04:45 -07:00
Felipe Garé a696ffe4a6 Litellm gemini batch (#14733)
* feat: add Vertex AI support for file content retrieval

- Extended `custom_llm_provider` to include "vertex_ai" in `afile_content` function.
- Implemented file content retrieval logic for Vertex AI in `VertexAIFilesHandler`.
- Added helper method to extract bucket and object from URL-encoded file_id.
- Created comprehensive unit and integration tests for Vertex AI file handling.
- Updated transformation logic to ensure compatibility with Vertex AI file responses.

* fix: update Vertex AI file transformation logic

- Modified the transformation logic in `VertexAIFilesConfig` to return a newline-separated JSON string for batch JSONL files instead of an array of JSON strings.

* fix: enhance Vertex AI output handling in transformation logic

- Updated the transformation logic in `VertexAIBatchTransformation` to utilize the new `OutputInfo` TypedDict for retrieving the GCS output directory.
- Added `OutputInfo` class to type definitions for better structure and clarity in Vertex AI responses.
2025-09-19 15:22:52 -07:00
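The two ideas in the commit above (splitting a URL-encoded file_id into bucket and object, and returning JSONL as one newline-separated string) can be illustrated with a minimal sketch. The function names `split_gcs_file_id` and `to_jsonl` are hypothetical, and the sketch assumes the file_id is a URL-encoded `gs://bucket/object` URI; the actual `VertexAIFilesHandler` code may differ.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import unquote


def split_gcs_file_id(file_id: str) -> Tuple[str, str]:
    """Hypothetical helper: decode a URL-encoded gs:// URI into (bucket, object)."""
    decoded = unquote(file_id)
    prefix = "gs://"
    if not decoded.startswith(prefix):
        raise ValueError(f"expected a gs:// URI, got {decoded!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket, sep, object_name = decoded[len(prefix):].partition("/")
    if not bucket or not sep or not object_name:
        raise ValueError(f"missing bucket or object in {decoded!r}")
    return bucket, object_name


def to_jsonl(records: List[Dict[str, Any]]) -> str:
    """Serialize records as one JSON document per line (JSONL), not a JSON array."""
    return "\n".join(json.dumps(record) for record in records)
```

Returning a single newline-separated string means a caller can write the content straight to a `.jsonl` file without re-joining a list.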
Max Falk 12da4039b9 fix: Prevent AttributeError for _get_tags_from_request_kwargs (#14735)
* fix: avoid NoneType AttributeError when extracting tags

I've been running into this error:
```
21:47:08 - LiteLLM:ERROR: litellm_logging.py:2396 - LiteLLM.LoggingError: [Non-Blocking] Exception occurred while success logging
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/litellm_logging.py", line 2312, in async_success_handler
    await callback.async_log_success_event(
    ...<6 lines>...
    )
  File "/usr/lib/python3.13/site-packages/litellm/router_strategy/budget_limiter.py", line 396, in async_log_success_event
    request_tags = _get_tags_from_request_kwargs(kwargs)
  File "/usr/lib/python3.13/site-packages/litellm/router_strategy/tag_based_routing.py", line 144, in _get_tags_from_request_kwargs
    return _metadata.get("tags", [])
           ^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
```

This makes the function more resilient without resorting to try/except.

* add tests

Signed-off-by: Max Falk <gmdfalk@gmail.com>

---------

Signed-off-by: Max Falk <gmdfalk@gmail.com>
2025-09-19 15:21:02 -07:00
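The failure in the traceback above comes from calling `.get` on a `metadata` value that is `None`. A minimal sketch of a None-safe lookup follows; it is an illustration of the general pattern, not the merged implementation, and the function name and the `"metadata"` key layout are assumptions.

```python
from typing import Any, Dict, List, Optional


def get_tags_from_request_kwargs(
    request_kwargs: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Hypothetical None-safe variant: never raises when metadata is absent or None."""
    if request_kwargs is None:
        return []
    # `or {}` covers both a missing key and an explicit None value,
    # which a plain .get("metadata", {}) would not.
    metadata = request_kwargs.get("metadata") or {}
    return metadata.get("tags") or []
```

The key point is that `dict.get(key, default)` only falls back to the default when the key is missing, so an explicit `None` value still needs the `or {}` guard.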
Krish Dholakia f471cf002d Merge pull request #14715 from timelfrink/fix/issue-14120-gemini-2.5-flash-image-preview
Fix: gemini-2.5-flash-image-preview model routing for image generation
2025-09-19 07:43:03 -07:00
Tim Elfrink 5323ca8346 Fix Gemini 2.5 Flash Image Preview response parsing
- Add response_modalities configuration to request format
- Fix response parsing to use camelCase 'inlineData' instead of snake_case 'inline_data'
- Update test to validate proper request format and response parsing
- All existing Gemini image generation tests pass
2025-09-19 11:20:03 +02:00
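The camelCase fix described in the commit above can be sketched as follows. This is an illustrative parser under the assumption that the response JSON follows the `candidates` / `content` / `parts` layout with image bytes under `inlineData.data`; the function name `extract_inline_images` is hypothetical.

```python
from typing import Any, Dict, List


def extract_inline_images(response_json: Dict[str, Any]) -> List[str]:
    """Collect base64 image payloads from parts that carry camelCase 'inlineData'."""
    images: List[str] = []
    for candidate in response_json.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            # Only the camelCase key is read; a snake_case 'inline_data'
            # key would be silently skipped, which is the bug being fixed.
            inline = part.get("inlineData")
            if inline and "data" in inline:
                images.append(inline["data"])
    return images
```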
Tim Elfrink 3a98fd6096 Remove hardcoded model name and fix breaking change
- Reverted GEMINI_2_5_FLASH_IMAGE_PREVIEW_MODEL constant usage
- Made endpoint selection conditional for gemini-2.5-flash-image-preview only
- Preserved existing Imagen models functionality with :predict endpoint
- Fixed potential breaking change that would affect 6 other Gemini image models
2025-09-19 09:29:14 +02:00
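The conditional endpoint selection described in the commit above might look roughly like this. It is a sketch only: the function name `select_endpoint_suffix` is hypothetical, and the substring match on the model name is an assumption about how the check is done.

```python
def select_endpoint_suffix(model: str) -> str:
    """Pick the URL suffix for an image generation request (hypothetical helper)."""
    # Only the one preview model is routed to :generateContent;
    # every other image model keeps the existing :predict endpoint.
    if "gemini-2.5-flash-image-preview" in model:
        return ":generateContent"
    return ":predict"
```

Scoping the special case to a single model name keeps the default path unchanged for the other image models, which is the breaking change the commit says it avoids.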
Krish Dholakia aa54994994 Merge pull request #14666 from michaeltansg/feat/add-bangkok-timezone
Added Indochina Time timezone support for budget resets
2025-09-18 23:41:27 -07:00
Krish Dholakia f5838594e0 Merge pull request #14675 from tcx4c70/fix/response-price
fix: cost calculation for responses
2025-09-18 23:41:08 -07:00
Krish Dholakia ad6ba8f5c5 Merge pull request #14695 from uc4w6c/fix/mcp-gateway-tools-list
Fix/mcp gateway tools list
2025-09-18 23:40:14 -07:00
Krish Dholakia d5a839d971 Merge pull request #14700 from BerriAI/litellm_contributor_prs_09_18_2025_p2
Update Bedrock documentation for Titan V2 encoding_format support + Anthropic - account for 1h vs. 5m cache creation token cost difference + UI - add langsmith_sampling_rate as a dynamic param
2025-09-18 23:38:29 -07:00
Krrish Dholakia 92e841e311 fix: fix test 2025-09-18 23:37:38 -07:00
Tim Elfrink f5e6246143 Add test for gemini-2.5-flash-image-preview fix
- Test validates correct endpoint routing to :generateContent
- Mock HTTP responses to avoid API limits
- Verify request format uses Gemini contents structure
- Ensure image generation functionality works correctly
2025-09-19 08:06:01 +02:00
Ishaan Jaff 80bd8e007f fix contributor PR linting failing (#14710)
* validate fix

* fix linting error
2025-09-18 20:03:27 -07:00
Krish Dholakia 9a6b1651d2 Merge pull request #14707 from ARajan1084/bedrock-guardrail-silent-failure-correction
fix: Bedrock guardrail silent failure correction
2025-09-18 20:03:17 -07:00
Krish Dholakia 664c83cfb5 Merge branch 'litellm_contributor_prs_09_18_2025_p2' into litellm_dev_09_17_2025_p2_v2 2025-09-18 19:50:55 -07:00
= 14b3cc2f95 Update test_bedrock_guardrails.py 2025-09-18 19:48:16 -07:00
= ed4bd504ea Revert "removed unnecessary test"
This reverts commit ad663e5240.
2025-09-18 19:45:13 -07:00
= ad663e5240 removed unnecessary test 2025-09-18 19:37:15 -07:00
Ishaan Jaffer 725cf3627d fix: license check.ini 2025-09-18 19:28:07 -07:00
Krrish Dholakia 4d87199266 fix(prometheus.py): fix spend metrics 2025-09-18 19:12:07 -07:00
Krrish Dholakia aa7839e4cb fix: fix test 2025-09-18 19:02:52 -07:00
Krish Dholakia f387803655 Merge pull request #14658 from ARajan1084/bedrock-custom-guardrail-fix
fix: check for AWS exceptions despite a 200 response
2025-09-18 18:34:32 -07:00
Alexsander Hamir 60800698f2 feature: generic object pool (#14702)
* add: generic object pool & tests

Introduced a reusable object pool that can be applied across the codebase.
Note: memory growth is managed via eviction settings—using a hard cap could
reduce performance, so eviction is the preferred safeguard.

* fix: simpler tests
2025-09-18 18:32:45 -07:00
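The eviction-over-hard-cap idea from the object pool commit above can be sketched like this. The class below is not the pool added in that PR; `ObjectPool`, `max_idle_seconds`, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Generic, Tuple, TypeVar

T = TypeVar("T")


class ObjectPool(Generic[T]):
    """Illustrative pool: no hard size cap, idle objects are evicted by age."""

    def __init__(
        self,
        factory: Callable[[], T],
        max_idle_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory
        self._max_idle = max_idle_seconds
        self._clock = clock
        # (release_time, object) pairs, oldest on the left.
        self._idle: Deque[Tuple[float, T]] = deque()

    def acquire(self) -> T:
        """Return a pooled object if one is fresh enough, else build a new one."""
        self._evict_stale()
        if self._idle:
            _, obj = self._idle.pop()  # reuse the most recently released object
            return obj
        return self._factory()

    def release(self, obj: T) -> None:
        """Hand an object back; it is never rejected for capacity reasons."""
        self._idle.append((self._clock(), obj))

    def _evict_stale(self) -> None:
        cutoff = self._clock() - self._max_idle
        while self._idle and self._idle[0][0] < cutoff:
            self._idle.popleft()

    def __len__(self) -> int:
        return len(self._idle)
```

Because `release` never blocks or discards on a size limit, bursts are absorbed cheaply, and memory is reclaimed later when stale entries age out.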
Krish Dholakia 63c26d7a4f Merge branch 'litellm_contributor_prs_09_18_2025_p2' into fix/issue-14685-bedrock-titan-v2-encoding-format 2025-09-18 17:54:33 -07:00
Alexsander Hamir 59409429d4 fix: reduced __init__ overhead by 7% (#14689)
* fix: avoid redundant __init__ calls on hot path

Previously, imports on the request hot path caused __init__ to run
excessively for every request. This change ensures initialization
happens once, reducing cpu overhead.

* fix: remove redundant __init__ import

The current implementation no longer requires an import at the top of the function.

* fix: placed on core utils for future reuse

* test: add coverage & remove inline import

A general import-checking tool across all endpoints would be a large PR.
This commit focuses on a smaller, targeted fix for the discussed case.

* added import check to CI
2025-09-18 17:18:05 -07:00
Ishaan Jaff 4c983f985a [Feat] Add Bedrock Twelve Labs embedding provider support (#14697)
* fix: add 12 labs to bedrock embedding

* fix: get_bedrock_embedding_provider

* test: test_text_embedding

* fix: 12 labs embedding transform

* fix: refactor 12 labs transform logic

* fix: test_e2e_bedrock_embedding

* fix: test_e2e_bedrock_embedding

* feat: add bedrock twelvelabs pricing

* DOCS: docs bedrock embedding

* DOCS: 12 labs bedrock overview

* fix: bedrock embeddings 12 labs
2025-09-18 17:16:45 -07:00
katsuhiro muto ec61a7152a Support for is_streamed_request with Datadog (#14673) 2025-09-18 15:55:16 -07:00
= a3f0a3c05f Update test_bedrock_guardrails.py 2025-09-18 15:51:42 -07:00
= 911918474a removed duplicate code 2025-09-18 15:49:16 -07:00
Yuta Saito 7f8b1d0708 test: fix failing tests after conflict resolution 2025-09-19 07:48:37 +09:00
= 917c8bb43c Update test_bedrock_guardrails.py 2025-09-18 15:42:41 -07:00
= a86b9a1808 check for AWS exceptions despite a 200 response 2025-09-18 15:42:36 -07:00
Mubashir Osmani a7a6381926 fix: flaky passthrough tests (#14692)
* fix: flaky passthrough tests

* Revert "fix: flaky passthrough tests"

This reverts commit ffe692e017600a8853ab7c31f95485958ab74c5f.

* fix: serialize prisma objects
2025-09-18 15:35:14 -07:00
Yuta Saito 654f1d3290 fix: stop including spec_version in MCP server registration inserts 2025-09-19 07:06:15 +09:00
Yuta Saito 6c291093e9 fix: remove adding Mcp-Protocol-Version header (#14069)
The Mcp-Protocol-Version header is already handled in the MCP Python SDK, so the explicit addition on LiteLLM Proxy was redundant.
2025-09-19 07:05:20 +09:00
Tim Elfrink b100328435 Add test coverage for Bedrock Titan V2 encoding_format parameter
- Test encoding_format='float' parameter mapping and response handling
- Test encoding_format='base64' parameter mapping to binary format
- Verify parameter transformation and response processing
- Mock AWS API responses for both float and binary formats
- Ensure OpenAI compatibility with new encoding_format support
2025-09-18 20:24:41 +02:00
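The parameter mapping exercised by the tests above (OpenAI-style `encoding_format` to a Titan V2 embedding type) could be sketched as below. The function name is hypothetical, and the `embeddingTypes` request field is an assumption about the Titan V2 request body rather than something stated in the commit.

```python
from typing import Dict, List

# 'float' passes through; 'base64' is mapped to the binary format,
# as the commit message above describes.
_ENCODING_FORMAT_TO_EMBEDDING_TYPE: Dict[str, str] = {
    "float": "float",
    "base64": "binary",
}


def map_encoding_format(encoding_format: str) -> Dict[str, List[str]]:
    """Hypothetical helper: translate encoding_format into request parameters."""
    try:
        embedding_type = _ENCODING_FORMAT_TO_EMBEDDING_TYPE[encoding_format]
    except KeyError:
        raise ValueError(f"unsupported encoding_format: {encoding_format!r}") from None
    # Assumed request field name; check the provider documentation before relying on it.
    return {"embeddingTypes": [embedding_type]}
```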
Ishaan Jaffer e733b619db fix: test_user_email_in_all_required_metrics 2025-09-18 11:23:13 -07:00
Sameer Kankute 36bedc69ff Add TwelveLabs marengo model (#14674) 2025-09-18 11:21:35 -07:00
Sameer Kankute d213a2e066 correct the guardContent name (#14684)
* correct the guardContent name

* correct the guardContent name

* fix model required error in test

* Add correct model
2025-09-18 11:00:19 -07:00