Commit Graph

27389 Commits

Author SHA1 Message Date
Cesar Garcia cc72037cec feat(openai): Add verbosity parameter support for GPT-5 family models (#16660)
OpenAI's GPT-5 model family supports a verbosity parameter to control
the length and detail of responses. This parameter accepts three values:
'low', 'medium', or 'high'.

Changes:
- Added verbosity parameter to completion() and acompletion() signatures
- Added verbosity to DEFAULT_CHAT_COMPLETION_PARAM_VALUES in constants.py
- Added verbosity to get_optional_params() in utils.py
- Added verbosity to GPT-5 supported params list
- Updated OpenAI docs with verbosity usage examples
- Added comprehensive test for verbosity parameter

Supported models: gpt-5, gpt-5.1, gpt-5-mini, gpt-5-nano, gpt-5-codex, gpt-5-pro
2025-11-14 19:38:27 -08:00
Rob Geada d35d9008c9 Ensure detector-id is passed as header to IBM detector server (#16649) 2025-11-14 19:35:49 -08:00
Luka Pečnik d84e97f211 fix: preserve $defs for Anthropic tools input schema (#16648)
* fix: preserve $defs for Anthropic tools input schema

* fix: preserve $defs for Anthropic tools input schema

* fix: preserve $defs for Anthropic tools input schema
2025-11-14 19:35:27 -08:00
yuneng-jiang 473886d35a user_alias in read and update path (#16669) 2025-11-14 19:34:28 -08:00
pnookala-godaddy f599a462c1 openai(video): use GET for /videos/{id}/content by returning empty params; add tests to assert GET (#16672) 2025-11-14 19:33:37 -08:00
Ishaan Jaffer 63994e302e test_call_with_key_over_model_budget 2025-11-14 19:05:00 -08:00
Ishaan Jaffer acad73018d fix pkg lock 2025-11-14 18:59:49 -08:00
Ishaan Jaffer 39c1a970a1 fix 2025-11-14 18:56:30 -08:00
Krrish Dholakia 9ced18b695 perf: use reusable http client 2025-11-14 18:56:14 -08:00
Krrish Dholakia 938ec7c39a fix: fix linting error 2025-11-14 18:54:24 -08:00
Ishaan Jaffer 11cf22e7b8 add "provider_specific_fields": null 2025-11-14 18:49:35 -08:00
Ishaan Jaffer a69011883a fix logging_testing 2025-11-14 18:48:20 -08:00
Ishaan Jaffer efb80ffab7 fix pkg lock 2025-11-14 18:43:41 -08:00
Ishaan Jaffer a1286fb609 security fix 2025-11-14 18:43:41 -08:00
yuneng-jiang 09e226b140 [Feature] UI - New Callbacks table (#16512)
* New Callbacks table

* Change Action Buttons to use Icons

* Changed to follow our existing pattern

* Removed unused import
2025-11-14 18:36:59 -08:00
Ishaan Jaffer 9e8653ad3c fix prisma client 2025-11-14 18:25:27 -08:00
Krrish Dholakia 54e7792933 fix: remove dailytag request id migration from agents_table.sql 2025-11-14 18:24:01 -08:00
Krish Dholakia 8097fafc05 Agents - support agent registration + discovery (A2A spec) (#16615)
* fix: initial commit adding types

* refactor: refactor to include agent registry

* feat(agents/): endpoints.py

working endpoint for agent discovery

* feat(agent_endpoints/endpoints.py): add permission management logic to agents endpoint

* feat: public endpoint for showing publicly discoverable agents

* feat: make /public/agent_hub discoverable

* feat(agent_endpoints/endpoints.py): working create agent endpoint

adds dynamic agent registration to the proxy

* feat: working crud endpoints

* feat: working multi-instance create/delete agents

* feat(migration.sql): add migration for agents table
2025-11-14 18:23:30 -08:00
Ishaan Jaffer 65468353d1 provider_specific_fields 2025-11-14 18:17:50 -08:00
YutaSaito f487f4e3a9 feat: add dynamic OAuth2 metadata discovery for MCP servers (#16676)
* feat: add dynamic OAuth2 metadata discovery for MCP servers

* fix: lint error
2025-11-14 18:14:43 -08:00
yuneng-jiang 6063a75155 Remove Description Field from LLM Credentials (#16608) 2025-11-14 17:48:44 -08:00
Ishaan Jaffer 7faff1a7c0 pkg lock 2025-11-14 17:43:43 -08:00
Ishaan Jaffer 936bed056b security fix 2025-11-14 17:43:00 -08:00
Ishaan Jaff 8a43fbe8f7 Revert "[Feat] VertexAI - Add BGE Embeddings support (#16033)" (#16677)
This reverts commit 7133488282.
2025-11-14 17:41:06 -08:00
Ishaan Jaffer f8c022c8f5 bump new schema 2025-11-14 17:39:54 -08:00
Ishaan Jaffer 6b532b31a6 bump: version 1.79.3 → 1.79.4 2025-11-14 17:38:46 -08:00
Ishaan Jaffer 87b7780182 fix LiteLLM_DailyTagSpend add request_id 2025-11-14 17:35:54 -08:00
Ishaan Jaffer 8fdb12a44b fix bedrock agentocre 2025-11-14 17:31:34 -08:00
Ishaan Jaffer c18f411a5e test_encrypt_response_id_success 2025-11-14 17:28:15 -08:00
Ishaan Jaffer 74763f6cfc fix _map_reasoning_effort 2025-11-14 17:27:32 -08:00
Ishaan Jaff bffc36794f docs fix spend tracking (#16675) 2025-11-14 17:22:21 -08:00
Ishaan Jaff efa4ec9294 [Docs] Add docs on APIs for model access management (#16673)
* docs access groups

* docs ui for access groups

* fix code snippets

* docs model access
2025-11-14 17:07:54 -08:00
Cesar Garcia 0a0b5eee47 fix: Resolve pytest module name collision for test_transformation.py files (#16661)
Fixes #16613

The issue was caused by two test files having the same module name
(test_transformation.py) in different directories, which caused pytest
to fail with an import file mismatch error.

Changes:
- Renamed tests/test_litellm/llms/xai/responses/test_transformation.py
  to test_xai_responses_transformation.py
- Renamed tests/test_litellm/llms/openai_like/chat/test_transformation.py
  to test_openai_like_chat_transformation.py

Both files now have unique, descriptive names that reflect their
specific test purposes and prevent module name collisions.
2025-11-14 16:50:40 -08:00
Emerson Gomes 1dac777346 Add Vertex Kimi-K2-Thinking (#16671)
* Add Vertex Kimi-K2-Thinking

* Update model_prices_and_context_window.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update litellm/model_prices_and_context_window_backup.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-14 16:49:48 -08:00
Ishaan Jaffer e240ccef63 check if DB model 2025-11-14 16:41:32 -08:00
Ishaan Jaffer d9df1e4ffd code qa fix 2025-11-14 16:41:28 -08:00
Ishaan Jaff b360cc957e [Feat] Model Management API - Add API Endpoint for creating model access group (#16663)
* add NewModelGroupRequest

* add endpoint for create_model_group

* fix model_access_group_management_router

* add UpdateModelGroupRequest, info and delete

* fix model management tag

* fix validate_models_exist

* fix get_all_access_groups_from_db

* test_create_duplicate_access_group_fails

* test fixes

* fix working create access groups

* fix access group management endpoints

* add is db model checks for model access groups
2025-11-14 16:40:43 -08:00
yuneng-jiang f1ff195bd8 Add Model uses endpoint info (#16664) 2025-11-14 16:08:27 -08:00
Ishaan Jaff 2bd6d0d82b [Feat] Bedrock Batches - Add support for custom KMS encryption keys in Bedrock Batch operations (#16662)
* add s3_encryption_key_id

* add s3EncryptionKeyId to BedrockS3OutputDataConfig

* use s3EncryptionKeyId in bedrock output

* docs s3_encryption_key_id

* test_bedrock_batch_with_encryption_key_in_post_request
2025-11-14 16:00:43 -08:00
yuneng-jiang 7e22f4abc6 Normalize table action columns (#16657) 2025-11-14 14:11:01 -08:00
fzowl b1922e19f8 Voyageai pricing and doc update (#16641)
* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Updating the available VoyageAI models in the docs

* Updating the available VoyageAI models in the docs

* Updating the model prices and the docs
2025-11-14 14:09:11 -08:00
Cesar Garcia 65061bafc7 feat(openai): Add support for reasoning_effort='none' in GPT-5.1 (#16658)
* feat(openai): Add support for reasoning_effort='none' in GPT-5.1

OpenAI's GPT-5.1 introduced a new reasoning effort parameter 'none'
which replaces the previous 'minimal' setting for faster, lower-latency
responses. This is now the default setting for GPT-5.1.

Changes:
- Updated REASONING_EFFORT type to include 'none' value
- Added GPT-5.1, GPT-5-mini, and GPT-5-nano to documentation
- Updated docs to reflect 'none' as GPT-5.1's default reasoning effort
- Added test to verify reasoning_effort='none' passes through correctly

Fixes #16633

* feat(responses): Add support for reasoning_effort='none' in Responses API transformation
2025-11-14 13:41:49 -08:00
Alexsander Hamir c7847125c2 [Perf] Embeddings: Use router's O(1) lookup and shared sessions (#16344)
* Refactor proxy embeddings to use shared processor

- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks

- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses

- tighten token array decoding logic by using router deployment lookups and the unified error handler

* Fix: Correctly process embedding requests with token arrays

The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.

This was caused by a combination of three distinct issues:

1.  In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.

2.  In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.

3.  In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.

Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.

* test: align proxy embedding assertions

Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.

* Update proxy exception test

The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.

* testing: unsure of this change

I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it.

* fix: remove unrelated change

This change was not related to the embeddings refactor and actually belonged to a different branch.
2025-11-14 09:21:45 -08:00
Sameer Kankute 52a42e1728 Add all imagen variants in fal ai in model map (#16579) 2025-11-13 22:31:49 -08:00
Sameer Kankute 13993d6ea3 Add fal-ai/flux/schnell support (#16580) 2025-11-13 22:31:31 -08:00
Krrish Dholakia 266744a5bd docs: add contribution guide for new guardrails 2025-11-13 22:29:42 -08:00
Dmitrii Tunikov a22b2b0a67 fix(mcp): Fix Gemini conversation format issue with MCP auto-execution (#16592)
When using MCP tools with require_approval='never' and Gemini models,
the follow-up call after tool execution was failing with:

  'Please ensure that function call turn comes immediately after a user
   turn or after a function response turn.'

This was caused by adding an empty assistant message between the user
message and function calls, which violates Gemini's conversation format
requirements.

Changes:
- Only add assistant message to follow-up input if it contains actual content
- Allow function calls to come directly after user messages (as Gemini requires)
- Add explanatory comments about Gemini's format requirements

This fix allows MCP auto-execution to work correctly with Gemini models
while maintaining compatibility with other models.

Fixes: #[issue-number-if-any]
2025-11-13 22:20:50 -08:00
Tomáš Dvořák eca226286a fix: parse failed chunks for Groq (#16595)
* fix: parse failed chunks for Groq

Ref: #13960
Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>

* chore: formatting

Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>

---------

Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>
2025-11-13 22:07:15 -08:00
Otavio Brito aedfe8f7a1 remove generic exception handling (#16599) 2025-11-13 22:03:47 -08:00
yuneng-jiang 379aa7b79a Pagination for /spend/logs/session/ui endpoint (#16603) 2025-11-13 22:03:00 -08:00