litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-24 15:38:19 +00:00

Author	SHA1	Message	Date
Cesar Garcia	cc72037cec	feat(openai): Add verbosity parameter support for GPT-5 family models (#16660) OpenAI's GPT-5 model family supports a verbosity parameter to control the length and detail of responses. This parameter accepts three values: 'low', 'medium', or 'high'. Changes: - Added verbosity parameter to completion() and acompletion() signatures - Added verbosity to DEFAULT_CHAT_COMPLETION_PARAM_VALUES in constants.py - Added verbosity to get_optional_params() in utils.py - Added verbosity to GPT-5 supported params list - Updated OpenAI docs with verbosity usage examples - Added comprehensive test for verbosity parameter Supported models: gpt-5, gpt-5.1, gpt-5-mini, gpt-5-nano, gpt-5-codex, gpt-5-pro	2025-11-14 19:38:27 -08:00
Rob Geada	d35d9008c9	Ensure detector-id is passed as header to IBM detector server (#16649)	2025-11-14 19:35:49 -08:00
Luka Pečnik	d84e97f211	fix: preserve $defs for Anthropic tools input schema (#16648) * fix: preserve $defs for Anthropic tools input schema * fix: preserve $defs for Anthropic tools input schema * fix: preserve $defs for Anthropic tools input schema	2025-11-14 19:35:27 -08:00
yuneng-jiang	473886d35a	user_alias in read and update path (#16669)	2025-11-14 19:34:28 -08:00
pnookala-godaddy	f599a462c1	openai(video): use GET for /videos/{id}/content by returning empty params; add tests to assert GET (#16672)	2025-11-14 19:33:37 -08:00
Ishaan Jaffer	63994e302e	test_call_with_key_over_model_budget	2025-11-14 19:05:00 -08:00
Ishaan Jaffer	acad73018d	fix pkg lock	2025-11-14 18:59:49 -08:00
Ishaan Jaffer	39c1a970a1	fix	2025-11-14 18:56:30 -08:00
Krrish Dholakia	9ced18b695	perf: use reusable http client	2025-11-14 18:56:14 -08:00
Krrish Dholakia	938ec7c39a	fix: fix linting error	2025-11-14 18:54:24 -08:00
Ishaan Jaffer	11cf22e7b8	add "provider_specific_fields": null	2025-11-14 18:49:35 -08:00
Ishaan Jaffer	a69011883a	fix logging_testing	2025-11-14 18:48:20 -08:00
Ishaan Jaffer	efb80ffab7	fix pkg lock	2025-11-14 18:43:41 -08:00
Ishaan Jaffer	a1286fb609	security fix	2025-11-14 18:43:41 -08:00
yuneng-jiang	09e226b140	[Feature] UI - New Callbacks table (#16512) * New Callbacks table * Change Action Buttons to use Icons * Changed to follow our existing pattern * Removed unused import	2025-11-14 18:36:59 -08:00
Ishaan Jaffer	9e8653ad3c	fix prisma client	2025-11-14 18:25:27 -08:00
Krrish Dholakia	54e7792933	fix: remove dailytag request id migration from agents_table.sql	2025-11-14 18:24:01 -08:00
Krish Dholakia	8097fafc05	Agents - support agent registration + discovery (A2A spec) (#16615) * fix: initial commit adding types * refactor: refactor to include agent registry * feat(agents/): endpoints.py working endpoint for agent discovery * feat(agent_endpoints/endpoints.py): add permission management logic to agents endpoint * feat: public endpoint for showing publicly discoverable agents * feat: make /public/agent_hub discoverable * feat(agent_endpoints/endpoints.py): working create agent endpoint adds dynamic agent registration to the proxy * feat: working crud endpoints * feat: working multi-instance create/delete agents * feat(migration.sql): add migration for agents table	2025-11-14 18:23:30 -08:00
Ishaan Jaffer	65468353d1	provider_specific_fields	2025-11-14 18:17:50 -08:00
YutaSaito	f487f4e3a9	feat: add dynamic OAuth2 metadata discovery for MCP servers (#16676) * feat: add dynamic OAuth2 metadata discovery for MCP servers * fix: lint error	2025-11-14 18:14:43 -08:00
yuneng-jiang	6063a75155	Remove Description Field from LLM Credentials (#16608)	2025-11-14 17:48:44 -08:00
Ishaan Jaffer	7faff1a7c0	pkg lock	2025-11-14 17:43:43 -08:00
Ishaan Jaffer	936bed056b	security fix	2025-11-14 17:43:00 -08:00
Ishaan Jaff	8a43fbe8f7	Revert "[Feat] VertexAI - Add BGE Embeddings support (#16033)" (#16677) This reverts commit `7133488282`.	2025-11-14 17:41:06 -08:00
Ishaan Jaffer	f8c022c8f5	bump new schema	2025-11-14 17:39:54 -08:00
Ishaan Jaffer	6b532b31a6	bump: version 1.79.3 → 1.79.4	2025-11-14 17:38:46 -08:00
Ishaan Jaffer	87b7780182	fix LiteLLM_DailyTagSpend add request_id	2025-11-14 17:35:54 -08:00
Ishaan Jaffer	8fdb12a44b	fix bedrock agentocre	2025-11-14 17:31:34 -08:00
Ishaan Jaffer	c18f411a5e	test_encrypt_response_id_success	2025-11-14 17:28:15 -08:00
Ishaan Jaffer	74763f6cfc	fix _map_reasoning_effort	2025-11-14 17:27:32 -08:00
Ishaan Jaff	bffc36794f	docs fix spend tracking (#16675)	2025-11-14 17:22:21 -08:00
Ishaan Jaff	efa4ec9294	[Docs] Add docs on APIs for model access management (#16673) * docs access groups * docs ui for access groups * fix code snippets * docs model access	2025-11-14 17:07:54 -08:00
Cesar Garcia	0a0b5eee47	fix: Resolve pytest module name collision for test_transformation.py files (#16661) Fixes #16613 The issue was caused by two test files having the same module name (test_transformation.py) in different directories, which caused pytest to fail with an import file mismatch error. Changes: - Renamed tests/test_litellm/llms/xai/responses/test_transformation.py to test_xai_responses_transformation.py - Renamed tests/test_litellm/llms/openai_like/chat/test_transformation.py to test_openai_like_chat_transformation.py Both files now have unique, descriptive names that reflect their specific test purposes and prevent module name collisions.	2025-11-14 16:50:40 -08:00
Emerson Gomes	1dac777346	Add Vertex Kimi-K2-Thinking (#16671) * Add Vertex Kimi-K2-Thinking * Update model_prices_and_context_window.json Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update litellm/model_prices_and_context_window_backup.json Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-14 16:49:48 -08:00
Ishaan Jaffer	e240ccef63	check if DB model	2025-11-14 16:41:32 -08:00
Ishaan Jaffer	d9df1e4ffd	code qa fix	2025-11-14 16:41:28 -08:00
Ishaan Jaff	b360cc957e	[Feat] Model Management API - Add API Endpoint for creating model access group (#16663) * add NewModelGroupRequest * add endpoint for create_model_group * fix model_access_group_management_router * add UpdateModelGroupRequest, info and delete * fix model management tag * fix validate_models_exist * fix get_all_access_groups_from_db * test_create_duplicate_access_group_fails * test fixes * fix working create access groups * fix access group management endpoints * add is db model checks for model access groups	2025-11-14 16:40:43 -08:00
yuneng-jiang	f1ff195bd8	Add Model uses endpoint info (#16664)	2025-11-14 16:08:27 -08:00
Ishaan Jaff	2bd6d0d82b	[Feat] Bedrock Batches - Add support for custom KMS encryption keys in Bedrock Batch operations (#16662) * add s3_encryption_key_id * add s3EncryptionKeyId to BedrockS3OutputDataConfig * use s3EncryptionKeyId in bedrock output * docs s3_encryption_key_id * test_bedrock_batch_with_encryption_key_in_post_request	2025-11-14 16:00:43 -08:00
yuneng-jiang	7e22f4abc6	Normalize table action columns (#16657)	2025-11-14 14:11:01 -08:00
fzowl	b1922e19f8	Voyageai pricing and doc update (#16641) * Refresh VoyageAI models and prices and context * Refresh VoyageAI models and prices and context * Refresh VoyageAI models and prices and context * Updating the available VoyageAI models in the docs * Updating the available VoyageAI models in the docs * Updating the model prices and the docs	2025-11-14 14:09:11 -08:00
Cesar Garcia	65061bafc7	feat(openai): Add support for reasoning_effort='none' in GPT-5.1 (#16658) * feat(openai): Add support for reasoning_effort='none' in GPT-5.1 OpenAI's GPT-5.1 introduced a new reasoning effort parameter 'none' which replaces the previous 'minimal' setting for faster, lower-latency responses. This is now the default setting for GPT-5.1. Changes: - Updated REASONING_EFFORT type to include 'none' value - Added GPT-5.1, GPT-5-mini, and GPT-5-nano to documentation - Updated docs to reflect 'none' as GPT-5.1's default reasoning effort - Added test to verify reasoning_effort='none' passes through correctly Fixes #16633 * feat(responses): Add support for reasoning_effort='none' in Responses API transformation	2025-11-14 13:41:49 -08:00
Alexsander Hamir	c7847125c2	[Perf] Embeddings: Use router's O(1) lookup and shared sessions (#16344 ) * Refactor proxy embeddings to use shared processor - allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks - route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses - tighten token array decoding logic by using router deployment lookups and the unified error handler * Fix: Correctly process embedding requests with token arrays The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected. This was caused by a combination of three distinct issues: 1. In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup. 2. In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers. 3. In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists. Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. 
The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions. * test: align proxy embedding assertions Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload. * Update proxy exception test The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list. * testing: unsure of this change I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it. * fix: remove unrelated change This change was not related to the embeddings refactor and actually belonged to a different branch.	2025-11-14 09:21:45 -08:00
Sameer Kankute	52a42e1728	Add all imagen variants in fal ai in model map (#16579)	2025-11-13 22:31:49 -08:00
Sameer Kankute	13993d6ea3	Add fal-ai/flux/schnell support (#16580)	2025-11-13 22:31:31 -08:00
Krrish Dholakia	266744a5bd	docs: add contribution guide for new guardrails	2025-11-13 22:29:42 -08:00
Dmitrii Tunikov	a22b2b0a67	fix(mcp): Fix Gemini conversation format issue with MCP auto-execution (#16592) When using MCP tools with require_approval='never' and Gemini models, the follow-up call after tool execution was failing with: 'Please ensure that function call turn comes immediately after a user turn or after a function response turn.' This was caused by adding an empty assistant message between the user message and function calls, which violates Gemini's conversation format requirements. Changes: - Only add assistant message to follow-up input if it contains actual content - Allow function calls to come directly after user messages (as Gemini requires) - Add explanatory comments about Gemini's format requirements This fix allows MCP auto-execution to work correctly with Gemini models while maintaining compatibility with other models. Fixes: #[issue-number-if-any]	2025-11-13 22:20:50 -08:00
Tomáš Dvořák	eca226286a	fix: parse failed chunks for Groq (#16595) * fix: parse failed chunks for Groq Ref: #13960 Signed-off-by: Tomas Dvorak <toomas2d@gmail.com> * chore: formatting Signed-off-by: Tomas Dvorak <toomas2d@gmail.com> --------- Signed-off-by: Tomas Dvorak <toomas2d@gmail.com>	2025-11-13 22:07:15 -08:00
Otavio Brito	aedfe8f7a1	remove generic exception handling (#16599)	2025-11-13 22:03:47 -08:00
yuneng-jiang	379aa7b79a	Pagination for /spend/logs/session/ui endpoint (#16603)	2025-11-13 22:03:00 -08:00
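
Note on commit c7847125c2: its message says the token-array check was improved to validate that all elements in the input subarray are integers. The sketch below illustrates what such a check could look like. The helper name `is_token_array_input` and its exact rules (non-empty lists, booleans rejected) are assumptions for illustration only and are not taken from the LiteLLM source.

```python
from typing import Any


def is_token_array_input(value: Any) -> bool:
    """Hypothetical helper: True if `value` is a non-empty list whose
    every item is a non-empty list made up only of integers."""
    if not isinstance(value, list) or not value:
        return False
    for sub in value:
        if not isinstance(sub, list) or not sub:
            return False
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(tok, int) and not isinstance(tok, bool) for tok in sub):
            return False
    return True


if __name__ == "__main__":
    print(is_token_array_input([[1, 2, 3], [4, 5]]))  # list of token-id lists
    print(is_token_array_input(["plain text input"]))  # list of strings
    print(is_token_array_input([[1, "2", 3]]))         # mixed subarray
```

Checking every element, rather than only the first, is what keeps a mixed subarray such as `[1, "2", 3]` from being treated as token ids.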

27389 Commits