Commit Graph

27370 Commits

Author SHA1 Message Date
YutaSaito f487f4e3a9 feat: add dynamic OAuth2 metadata discovery for MCP servers (#16676)
* feat: add dynamic OAuth2 metadata discovery for MCP servers

* fix: lint error
2025-11-14 18:14:43 -08:00
yuneng-jiang 6063a75155 Remove Description Field from LLM Credentials (#16608) 2025-11-14 17:48:44 -08:00
Ishaan Jaffer 7faff1a7c0 pkg lock 2025-11-14 17:43:43 -08:00
Ishaan Jaffer 936bed056b security fix 2025-11-14 17:43:00 -08:00
Ishaan Jaff 8a43fbe8f7 Revert "[Feat] VertexAI - Add BGE Embeddings support (#16033)" (#16677)
This reverts commit 7133488282.
2025-11-14 17:41:06 -08:00
Ishaan Jaffer f8c022c8f5 bump new schema 2025-11-14 17:39:54 -08:00
Ishaan Jaffer 6b532b31a6 bump: version 1.79.3 → 1.79.4 2025-11-14 17:38:46 -08:00
Ishaan Jaffer 87b7780182 fix LiteLLM_DailyTagSpend add request_id 2025-11-14 17:35:54 -08:00
Ishaan Jaffer 8fdb12a44b fix bedrock agentocre 2025-11-14 17:31:34 -08:00
Ishaan Jaffer c18f411a5e test_encrypt_response_id_success 2025-11-14 17:28:15 -08:00
Ishaan Jaffer 74763f6cfc fix _map_reasoning_effort 2025-11-14 17:27:32 -08:00
Ishaan Jaff bffc36794f docs fix spend tracking (#16675) 2025-11-14 17:22:21 -08:00
Ishaan Jaff efa4ec9294 [Docs] Add docs on APIs for model access management (#16673)
* docs access groups

* docs ui for access groups

* fix code snippets

* docs model access
2025-11-14 17:07:54 -08:00
Cesar Garcia 0a0b5eee47 fix: Resolve pytest module name collision for test_transformation.py files (#16661)
Fixes #16613

The issue was caused by two test files having the same module name
(test_transformation.py) in different directories, which caused pytest
to fail with an import file mismatch error.

Changes:
- Renamed tests/test_litellm/llms/xai/responses/test_transformation.py
  to test_xai_responses_transformation.py
- Renamed tests/test_litellm/llms/openai_like/chat/test_transformation.py
  to test_openai_like_chat_transformation.py

Both files now have unique, descriptive names that reflect their
specific test purposes and prevent module name collisions.
2025-11-14 16:50:40 -08:00
Emerson Gomes 1dac777346 Add Vertex Kimi-K2-Thinking (#16671)
* Add Vertex Kimi-K2-Thinking

* Update model_prices_and_context_window.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update litellm/model_prices_and_context_window_backup.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-14 16:49:48 -08:00
Ishaan Jaffer e240ccef63 check if DB model 2025-11-14 16:41:32 -08:00
Ishaan Jaffer d9df1e4ffd code qa fix 2025-11-14 16:41:28 -08:00
Ishaan Jaff b360cc957e [Feat] Model Management API - Add API Endpoint for creating model access group (#16663)
* add NewModelGroupRequest

* add endpoint for create_model_group

* fix model_access_group_management_router

* add UpdateModelGroupRequest, info and delete

* fix model management tag

* fix validate_models_exist

* fix get_all_access_groups_from_db

* test_create_duplicate_access_group_fails

* test fixes

* fix working create access groups

* fix access group management endpoints

* add is db model checks for model access groups
2025-11-14 16:40:43 -08:00
yuneng-jiang f1ff195bd8 Add Model uses endpoint info (#16664) 2025-11-14 16:08:27 -08:00
Ishaan Jaff 2bd6d0d82b [Feat] Bedrock Batches - Add support for custom KMS encryption keys in Bedrock Batch operations (#16662)
* add s3_encryption_key_id

* add s3EncryptionKeyId to BedrockS3OutputDataConfig

* use s3EncryptionKeyId in bedrock output

* docs s3_encryption_key_id

* test_bedrock_batch_with_encryption_key_in_post_request
2025-11-14 16:00:43 -08:00
yuneng-jiang 7e22f4abc6 Normalize table action columns (#16657) 2025-11-14 14:11:01 -08:00
fzowl b1922e19f8 Voyageai pricing and doc update (#16641)
* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Updating the available VoyageAI models in the docs

* Updating the available VoyageAI models in the docs

* Updating the model prices and the docs
2025-11-14 14:09:11 -08:00
Cesar Garcia 65061bafc7 feat(openai): Add support for reasoning_effort='none' in GPT-5.1 (#16658)
* feat(openai): Add support for reasoning_effort='none' in GPT-5.1

OpenAI's GPT-5.1 introduced a new reasoning effort parameter 'none'
which replaces the previous 'minimal' setting for faster, lower-latency
responses. This is now the default setting for GPT-5.1.

Changes:
- Updated REASONING_EFFORT type to include 'none' value
- Added GPT-5.1, GPT-5-mini, and GPT-5-nano to documentation
- Updated docs to reflect 'none' as GPT-5.1's default reasoning effort
- Added test to verify reasoning_effort='none' passes through correctly

Fixes #16633

* feat(responses): Add support for reasoning_effort='none' in Responses API transformation
2025-11-14 13:41:49 -08:00
Alexsander Hamir c7847125c2 [Perf] Embeddings: Use router's O(1) lookup and shared sessions (#16344)
* Refactor proxy embeddings to use shared processor

- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks

- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses

- tighten token array decoding logic by using router deployment lookups and the unified error handler

* Fix: Correctly process embedding requests with token arrays

The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.

This was caused by a combination of three distinct issues:

1.  In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.

2.  In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.

3.  In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.

Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.

* test: align proxy embedding assertions

Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.

* Update proxy exception test

The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.

* testing: unsure of this change

I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it.

* fix: remove unrelated change

This change was not related to the embeddings refactor and actually belonged to a different branch.
2025-11-14 09:21:45 -08:00
Sameer Kankute 52a42e1728 Add all imagen variants in fal ai in model map (#16579) 2025-11-13 22:31:49 -08:00
Sameer Kankute 13993d6ea3 Add fal-ai/flux/schnell support (#16580) 2025-11-13 22:31:31 -08:00
Krrish Dholakia 266744a5bd docs: add contribution guide for new guardrails 2025-11-13 22:29:42 -08:00
Dmitrii Tunikov a22b2b0a67 fix(mcp): Fix Gemini conversation format issue with MCP auto-execution (#16592)
When using MCP tools with require_approval='never' and Gemini models,
the follow-up call after tool execution was failing with:

  'Please ensure that function call turn comes immediately after a user
   turn or after a function response turn.'

This was caused by adding an empty assistant message between the user
message and function calls, which violates Gemini's conversation format
requirements.

Changes:
- Only add assistant message to follow-up input if it contains actual content
- Allow function calls to come directly after user messages (as Gemini requires)
- Add explanatory comments about Gemini's format requirements

This fix allows MCP auto-execution to work correctly with Gemini models
while maintaining compatibility with other models.

Fixes: #[issue-number-if-any]
2025-11-13 22:20:50 -08:00
Tomáš Dvořák eca226286a fix: parse failed chunks for Groq (#16595)
* fix: parse failed chunks for Groq

Ref: #13960
Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>

* chore: formatting

Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>

---------

Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>
2025-11-13 22:07:15 -08:00
Otavio Brito aedfe8f7a1 remove generic exception handling (#16599) 2025-11-13 22:03:47 -08:00
yuneng-jiang 379aa7b79a Pagination for /spend/logs/session/ui endpoint (#16603) 2025-11-13 22:03:00 -08:00
yuneng-jiang 01065a1284 Fixed inconsistent button sizes and variants (#16600) 2025-11-13 22:02:07 -08:00
YutaSaito 331be4f57b fix: avoid crashing when MCP server record lacks credentials (#16601) 2025-11-13 22:01:11 -08:00
pnookala-godaddy 44bb18a6ba fix: forward OpenAI organization for image generation (#16607) 2025-11-13 21:51:27 -08:00
yuneng-jiang 792339200a Migrate Add Model Fields to backend (#16620) 2025-11-13 21:48:57 -08:00
Ishaan Jaff 21ba491656 [UI] Add RunwayML on Admin UI supported models/providers (#16606)
* add runway.png

* add gen4_turbo
2025-11-13 21:46:35 -08:00
sep-grindr 40a9d72be7 fix(ui): remove misleading 'Custom' option mention from OpenAI endpoint tooltips (#16622)
The tooltip for OpenAI api_base select fields incorrectly mentioned 'choose Custom to enter your own' but there was no Custom option available in the dropdown. This fix updates the tooltip text to accurately reflect the available options.

Affected providers:
- OpenAI
- OpenAI_Text
2025-11-13 21:43:57 -08:00
Ishaan Jaff 4a486dc669 [Bug fix] Fixes SambaNova API rejecting requests when message content is passed as a list format (#16612)
* add runwayml_models

* test_call_with_end_user_over_budget

* TestSambanovaContentListHandling

* add _transform_messages for sambanova
2025-11-13 17:03:14 -08:00
Ishaan Jaffer 3feae855bd fix mapped test 2025-11-13 17:00:09 -08:00
Ishaan Jaffer 3c662eadb4 add runwayml/eleven_multilingual_v2 pricing 2025-11-13 16:45:35 -08:00
Ishaan Jaffer ee8b1cfabc test_call_with_end_user_over_budget 2025-11-13 16:26:02 -08:00
Ishaan Jaffer 32885087c2 add runwayml_models 2025-11-13 16:25:27 -08:00
Ishaan Jaffer 8be374c98d fix docker v 2025-11-13 16:12:48 -08:00
Ishaan Jaff 124ba463f8 [Feat] RunwayML - Add support for /audio/speech eleven_multilingual_v2 endpoint (#16604)
* init RunwayMLTextToSpeechConfig

* add RunwayMLTextToSpeechConfig

* add  RunwayMLTextToSpeechConfig

* test_runwayml_tts_async

* runway ml speech

* fix voices

* fix test

* docs runway lm

* add runwayml here

* fix RunwayMLTextToSpeechConfig

* test_openai_voice_mapping_to_runwayml
2025-11-13 14:32:09 -08:00
Nicholas Couture 4be372eb48 fix: support Anthropic tool_use and tool_result in token counter (#16351)
* fix: support Anthropic tool_use and tool_result in token counter

* refactor(token_counter): add dynamic field inference for Anthropic content blocks

* test: Add additional tests

* make format

* Fix lint error

* Fix mypy narrow type lint errors
2025-11-13 14:30:46 -08:00
Ishaan Jaff 911a009869 [Docs] LiteLLM Quick start - show how model resolution works (#16602)
* docs nderstanding Model Configuration

* docs fix
2025-11-13 13:28:01 -08:00
Ishaan Jaff 7133488282 [Feat] VertexAI - Add BGE Embeddings support (#16033)
* Support for Custom Vertex AI Models via PSC Endpoint with api_base (#15953)

* Support for Custom Vertex AI Models via PSC Endpoint with api_base

* Add docs related psc

* remove not needed files

* remove print statemnt

* fix mypy errors

* add TextEmbeddingBGEInput

* add VertexBGEConfig

* add BGE handling

* test_vertex_ai_bge_embedding_with_custom_api_base

* fix request transform vertex BGE

* test_vertex_ai_bge_embedding_with_custom_api_base

* tes BGE

* test_is_bge_model_detection

* docs cleanup

* handling BGE URL

* fix VertexBGEConfig

* test_vertex_ai_bge_with_endpoint_id_pattern

* docs vertex BGE

* docs

* docs fix

* fix VertexAIModelRoute

* from ..common_utils import VertexAIError, get_vertex_base_model_name
add

* fix VertexAIGemmaModels

* fix get_vertex_base_model_name

* test_vertex_ai_bge_psc_endpoint_url_construction

---------

Co-authored-by: Sameer Kankute <sameer@berri.ai>
2025-11-13 12:41:00 -08:00
YutaSaito 8e0b66a814 fix: exclude unauthorized MCP servers from allowed server list (#16551)
* fix: exclude unauthorized MCP servers from allowed server list

* fix: test after resolving merge conflicts
2025-11-13 12:33:54 -08:00
Sameer Kankute ea80510f78 [Feat] Day-0, Add gpt-5.1 and gpt-5.1-codex family support (#16598)
* Add day 0 support for gpt-5.1 models

* Add gpt-5.1-codex day 0 support

* update pricing values
2025-11-13 10:55:54 -08:00
Jón Levy 555d7b8be8 feat(bedrock): Add bearer token authentication support for AgentCore (#16556) v1.79.3.dev7 2025-11-13 08:17:36 -08:00