litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-07-03 17:08:43 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	92ebf5b918	fix(router.py): fix print statement	2025-08-11 17:46:14 -07:00
Ishaan Jaff	9f78287000	[Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support (#13510 ) * add AzureOpenAIGPT5Config * add AzureOpenAIGPT5Config * add AzureOpenAIGPT5Config * add AzureOpenAIGPT5Config * test_azure_gpt5_supports_reasoning_effort * test_azure_gpt5_reasoning * test_azure_gpt5_reasoning * ruff check fixes * docs azure gpt5	2025-08-11 15:40:53 -07:00
Ishaan Jaff	1cd827874f	[Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API (#13475 ) * test_openai_gpt5_reasoning * test_openai_gpt5_reasoning_effort_parameter * add OpenAIGPT5ResponsesAPIConfig * test_openai_gpt5_reasoning_effort_parameter * fixes	2025-08-10 09:55:36 -07:00
Krish Dholakia	9f6f96d76c	Litellm dev 08 07 2025 p1 (#13418 ) * fix(router.py): support base model for model group usage allows model group info to show accurate cost information for azure models * fix(router.py): fix changes * test: add unit tests * build(pyproject.toml): bump openai version requirements support custom tool from responses api Closes https://github.com/BerriAI/litellm/issues/13391 * docs(responses_api.md): add verbosity + free-form function calling parameters * docs(responses_api.md): add cfg + minimal reasoning to docs Closes https://github.com/BerriAI/litellm/issues/13391 * docs(responses_api.md): add proxy examples to docs * refactor: fix ruff error	2025-08-09 16:30:04 -07:00
Sannan Nasir	0e53b1feab	Add digitalocean provider (#12169 ) * Add digitalocean provider * Add digitalocean provider * Revert "Add digitalocean provider" This reverts commit 96dda40f45b3d12ea03e861d060ec81460b7759e. * changes * fixes * Update transformation * refactoring * rename provider to Gradient AI * fixes * Incorporte review comments * revert changes * fix typo * revert change * incorporated review comments * Revert "Incorporte review comments" This reverts commit 37bd51bd54ef4fd52ccc12866e47f8de9476d597. * changes * Revert "Revert "Incorporte review comments" This reverts commit 37bd51bd54ef4fd52ccc12866e47f8de9476d597." This reverts commit 68c8a198ee0d6441c3a52f6c6a49c9c95a4cb0a8. * changes * fixes * Update provider_specific_fields.tsx	2025-08-09 16:26:33 -07:00
Ishaan Jaff	f60a9cf908	[Bug]: Fix JWTs access not working with model groups (#13474 ) * fix can_team_access_model * test_find_team_with_model_access_model_group	2025-08-09 16:14:51 -07:00
Jugal D. Bhatt	67833590d6	[Proxy changes] Litellm add model price reload schedule for multi-pod (#13470 ) * added mcp guardrails doc in mcp.md * add button to reload models * Added button changes * added button for scheduling reload * add multi pod support to reloading the model price json * fix ruff	2025-08-09 16:12:13 -07:00
Krish Dholakia	1c8761111f	Router - reduce p99 latency w/ redis enabled by 50% + OTEL - track pre_call hook latency (#13362 ) * feat(proxy/utils.py): track pre-call hooks in OTEL some pre call hooks can cause latency in high traffic - make sure this is tracked * fix(router.py): move redis call on deployment_callback_on_success to pipeline operation reduces p99 latency by half when redis is enabled * fix(parallel_request_limiter_v3.py): only run check if any item has rate limits set Prevents unnecessary latency added by rate limit checks * test: add unit tests * Latency Improvements: only track tpm/rpm usage when set on deployment+ LLM Caching - use an in-memory cache to reduce redis calls + OTEL - track time spent on LLM caching (#13472) * fix(router.py): only track usage for deployments with tpm/rpm set ensures additional latency avoided for non-tpm/rpm models * fix(caching_handler.py): log time spent on request get cache to OTEL enables easy debugging of call latency * fix(caching_handler.py): use dual cache object for in-memory caching + trace redis call within caching handler * fix(caching_handler.py): working in-memory cache for redis calls ensures dual cache works when redis cache setup for llm calls makes calls quicker by only checking redis when in-memory cache missed for llm api call * test: remove redundant test * test: add unit tests	2025-08-09 16:09:51 -07:00
Ishaan Jaff	60306d34a0	[Bug Fix] Allow using Swagger for /chat/completions (#13469 ) * fix get_openapi_schema * fixes for ProxyChatCompletionRequest * TestSwaggerChatCompletions * fix working request body * fix - add "messages" * fix messages * TestSwaggerChatCompletions * test_messages_field_has_example * ruff check fix	2025-08-09 15:35:45 -07:00
Jugal D. Bhatt	1270df08a4	[Proxy + UI] Litellm add reload model api and button (#13464 ) * added mcp guardrails doc in mcp.md * add button to reload models * Added button changes * remove the model_reload	2025-08-09 13:52:56 -07:00
Jugal D. Bhatt	10a1fe21c5	[LLM Translation] Litellm azure o series drop params (#13353 ) * added route check * fix ruff * Added support for dropping o_series params * Added ruff fix * fix tests	2025-08-09 13:52:45 -07:00
Ishaan Jaff	eb4bd26f24	[Bug Fix] - Get Routes (#13466 ) * fixes get_routes_for_mounted_app * fix - use _safe_get_endpoint_name * fix code QA check * test_get_routes_for_mounted_app_with_static_files * test fixes	2025-08-09 12:52:23 -07:00
Ishaan Jaff	825ea65b96	[Bug Fix] Responses API - Responses API failed if input containing ResponseReasoningItem (#13465 ) * add test_responses_api_multi_turn_with_reasoning_and_structured_output * fix transform_responses_api_request	2025-08-09 11:20:34 -07:00
Ishaan Jaff	a843e876a8	[Feat] Working e2e flow for Responses API session management with media (#13456 ) * add MultimodalContent on chat UI * add multi modal img on chat ui * utils for responses API imgs * add code snippet with imgs * chat UI add imgs * add imge upload * chat ui allow adding images * fix chat send button * fix button styles * fix clear chat * fixes session management * fixes for session management * QA fix _should_check_cold_storage_for_full_payload * test_should_check_cold_storage_for_full_payload	2025-08-08 18:28:10 -07:00
Ishaan Jaff	3b65733af8	[Bug fix] - Error creating standard logging object - can't register atexit after shutdownLitellm fixes standard logging payload (#13436 ) * fix: _generate_cold_storage_object_key * _get_configured_cold_storage_custom_logger * test_e2e_generate_cold_storage_object_key_runtime_error_handled	2025-08-08 12:38:26 -07:00
Jugal D. Bhatt	51c2ff7c15	fix user membership issue (#13433 )	2025-08-08 12:00:58 -07:00
Ishaan Jaff	3a35c82884	[Feat] Add `reasoning_effort` to OpenAIGPT5Config (#13434 ) * add reasoning_effort toi OpenAIGPT5Config * test_gpt5_supports_reasoning_effort	2025-08-08 11:57:12 -07:00
Thiago Salvatore	c2ad858c83	fix(access group): allow access group on mcp tool retrieval (#13425 ) * fix(access group): allow access group on mcp tool retrieval * fix(test): fix broken tests and add test case for access group * fix(mypy): fix typing issues	2025-08-08 08:55:46 -07:00
Ishaan Jaff	9761ba7c7a	[Bug Fix] Responses api session management for streaming responses (#13396 ) * fix proxy config * fix(responses api): fix streaming ID consistency and tool format handling (#12640) * fix(responses): ensure streaming chunk IDs use consistent encoding format Fixes streaming ID inconsistency where streaming responses used raw provider IDs while non-streaming responses used properly encoded IDs with provider context. Changes: - Updated LiteLLMCompletionStreamingIterator to accept provider context - Added _encode_chunk_id() method using same logic as non-streaming responses - Modified chunk transformation to encode all streaming item_ids with resp_ prefix - Updated handlers to pass custom_llm_provider and litellm_metadata to streaming iterator Impact: - Streaming chunk IDs now format: resp_<base64_encoded_provider_context> - Enables session continuity when using streaming response IDs as previous_response_id - Allows provider detection and load balancing with streaming responses - Maintains backward compatibility with existing streaming functionality 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(types): add explicit Optional[str] type annotation for model_id This resolves MyPy type checking error where model_id could be None but wasn't explicitly typed as Optional[str]. * fix(types): handle None case for litellm_metadata access Prevents 'Item None has no attribute get' error by checking for None before accessing litellm_metadata dictionary. * test: add comprehensive tests for streaming ID consistency Adds unit and E2E tests to verify streaming chunk IDs are properly encoded with consistent format across streaming responses. 
## Tests Added ### Unit Test (test_reasoning_content_transformation.py) - `test_streaming_chunk_id_encoding()`: Validates the `_encode_chunk_id()` method correctly encodes chunk IDs with `resp_` prefix and provider context ### E2E Tests (test_e2e_openai_responses_api.py) - `test_streaming_id_consistency_across_chunks()`: Tests that all streaming chunk IDs are properly encoded across multiple chunks in a real streaming response - `test_streaming_response_id_as_previous_response_id()`: Tests the core use case - using streaming response IDs for session continuity with `previous_response_id` ## Key Testing Approach - Uses Gemini (non-OpenAI model) to test the transformation logic rather than OpenAI passthrough, since the streaming ID consistency issue occurs when LiteLLM transforms responses rather than just passing through to native OpenAI responses API - Tests validate that streaming chunk IDs now use same encoding as non-streaming responses - Verifies session continuity works with streaming responses Addresses @ishaan-jaff's request for unit tests covering the streaming ID consistency fix. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(lint): remove unused imports in transformation.py Removes unused imports to fix CI linting errors: - GenericResponseOutputItem - OutputFunctionToolCall * test: remove E2E tests from openai_endpoints_tests Remove streaming ID consistency E2E tests as requested by @ishaan-jaff. Keep only the mock/unit test in test_reasoning_content_transformation.py * revert: remove streaming chunk ID encoding to original behavior This reverts the streaming chunk ID encoding changes to understand the original issue better. Original behavior was: - Streaming chunks: raw provider IDs - Streaming final response: raw IDs (PROBLEM!) - Non-streaming final response: encoded IDs (correct) The real issue: streaming final response IDs were not encoded, breaking session continuity. 
* fix(responses): encode streaming final response IDs to match OpenAI behavior Fixes streaming ID inconsistency to match OpenAI's Responses API behavior: - Streaming chunks: raw message IDs (like OpenAI's msg_xxx) - Final response: encoded IDs (like OpenAI's resp_xxx) This enables session continuity by ensuring streaming final response IDs have the same encoded format as non-streaming responses, allowing them to be used as previous_response_id in follow-up requests. Changes: - Add custom_llm_provider and litellm_metadata to LiteLLMCompletionStreamingIterator - Update handlers to pass provider context to streaming iterator - Apply _update_responses_api_response_id_with_model_id to final streaming response - Keep streaming chunks as raw IDs to match OpenAI format Impact: - Session continuity works with streaming responses - Load balancing can detect provider from streaming final response IDs - Format matches OpenAI's Responses API exactly 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * test: update unit test to match correct OpenAI-compatible behavior Updates the unit test to verify streaming chunk IDs are raw (not encoded) to match OpenAI's responses API format: - Streaming chunks: raw message IDs (like msg_xxx) - Final response: encoded IDs (like resp_xxx) This reflects the correct behavior implemented in the fix. --------- Co-authored-by: Claude <noreply@anthropic.com> * cleanup * TestBaseResponsesAPIStreamingIterator --------- Co-authored-by: Javier de la Torre <jatorre@carto.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-08-07 20:13:24 -07:00
Ishaan Jaff	7695882d8a	test_supports_tool_choice	2025-08-07 16:56:45 -07:00
Ishaan Jaff	2037037258	[Bug Fix] OpenAI gpt-5 series does not support "max_tokens" parameter and `temperature` values that are not = 1 (#13390 ) * add OpenAIGPT5Config * add map_openai_params for gpt5 * add OpenAIGPT5Config * add OpenAI gpt 5 transform * docs gpt 5 openai	2025-08-07 16:35:00 -07:00
Ishaan Jaff	e8c081b8ff	test_stream_chunk_builder_litellm_usage_chunks	2025-08-07 15:22:52 -07:00
Ishaan Jaff	dbb651ea95	remove old mapped test	2025-08-07 13:51:50 -07:00
Ishaan Jaff	621b3dca7b	[Bug Fix] Mistral Tool Calling - Grammar error: at 3(11): failed to compile JSON schema (#13389 ) * test_claude_tool_use_with_gemini * add _remove_json_schema_refs * add _clean_tool_schema_for_mistral * fixes mistral tool calls * _remove_json_schema_refs * fix - vertex, remove hardcoded test	2025-08-07 13:50:22 -07:00
Ishaan Jaff	984f91f4f5	test_completion_gemini_stream	2025-08-07 13:24:00 -07:00
Ishaan Jaff	08ac2aeb6d	Revert "Fix SSO Logout | Create Unified Login Page with SSO and Username/Password Options (#12703 )" (#13387 ) This reverts commit `a752d7acc9`.	2025-08-07 13:13:05 -07:00
Ishaan Jaff	4d941c914e	[Feat] Responses API Session Handling - Multi media support (#13347 ) * rename ResponsesSessionHandler * use ResponsesSessionHandler * test session handler * refactor ResponsesSessionHandler * fix get_proxy_server_request_from_spend_log * use constant for LITELLM_TRUNCATED_PAYLOAD_FIELD * add _should_check_cold_storage_for_full_payload * add get_class_type_for_custom_logger_name * get_active_custom_logger_for_callback_name * add get_proxy_server_request_from_cold_storage to CustomLogger * add ColdStorageHandler * start using cold storage integration * add get_proxy_server_request_from_cold_storage * fixes from manual testing * s3 v2 fix getting region name * ChatCompletionImageUrlObject * use _get_configured_cold_storage_custom_logger * fixes for _should_check_cold_storage_for_full_payload * fix _download_object_from_s3 * test_s3_v2_with_cold_storage * add cold_storage_object_key to StandardLoggingMetadata * use get_proxy_server_request_from_cold_storage_with_object_key * add cold_storage_object_key to SpendLogsMetadata * add cold_storage_object_key * get_proxy_server_request_from_cold_storage_with_object_key * use get_proxy_server_request_from_cold_storage_with_object_key * test responses API * add get_proxy_server_request_from_cold_storage_with_object_key * session handler fixes * test session handler * fix ruff checks * _download_object_from_s3 * cleanup * test * lint fix * test_e2e_cold_storage_successful_retrieval * test_e2e_generate_cold_storage_object_key_successful * test_async_gcs_pub_sub_v1 * test fix * test fix * test fix * test_standard_logging_metadata_has_cold_storage_object_key_field * test_sanitize_request_body_for_spend_logs_payload_basic * test_transform_input_image_item_to_image_item_with_image_data	2025-08-07 10:59:53 -07:00
Anand Khinvasara	96dca4eff8	fix: 12152 - Redacted sensitive information logged in bedrock guardrails (#13356 )	2025-08-07 08:42:11 -07:00
Edward D'Amato	30fc5b871c	feat(integrations): allow setting of braintrust callback base url (#13368 ) * feat(integrations): allow setting of braintrust callback base url * chore(misc): remove extra additions due to merge	2025-08-07 08:40:11 -07:00
Ishaan Jaff	dfada882f1	vtx test fix gemini-2.5-flash-lite	2025-08-07 00:11:10 -07:00
yeahyung	a92bf8173e	Fix create, search vector store error (#13285 ) * (#13284) add avector_store_create to route_type which doesn't require model * (#13284) exclude hidden params in metadata when create vector store * (#13284) fix lint error * (#13284) keep metadata None if metadata is None(not empty dict) * (#13284) add test code * (#13284) change test code name * (#13284) add avector_store_search to route_type which doesn't require model	2025-08-06 11:15:17 -07:00
Jugal D. Bhatt	b1a8968895	[MCP Gateway] fix auth on ui for bearer servers (#13312 ) * fix auth on ui for bearer servers * add tests and fixes * fix tests	2025-08-06 09:46:10 -07:00
Ishaan Jaff	eeed03a78f	test fix: gcp deprecated gemini-1.5-flash	2025-08-06 08:43:45 -07:00
Krish Dholakia	0da25fadc0	Exclude none fields on `/chat/completion` - fixes n8n bug + Allow calling `/v1/models` when end user over budget (#13320 ) * fix(proxy_server.py): exclude none fields before returning Fixes https://github.com/BerriAI/litellm/issues/13055 * test: add unit tests * feat(auth_checks.py): allow info routes to work when end user over budget Fixes https://github.com/BerriAI/litellm/issues/13286	2025-08-05 21:39:46 -07:00
zjx20	92c525ddfe	feat(JinaAI): support multimodal embedding models (#13181 ) * feat(JinaAI): support multimodal embedding models * add test case * add test * fix test	2025-08-05 19:21:56 -07:00
Krish Dholakia	324cfe8bdc	fix(streaming_handler.py): include cost in streaming usage object (#13319 ) Fixes https://github.com/BerriAI/litellm/issues/12689	2025-08-05 18:38:31 -07:00
Jugal D. Bhatt	b6fcda2f8a	[LLM Translation] Fix model group on clientside auth with API calls (#13314 ) * fix unsupported operand type(s) for +=: 'NoneType' and 'str' on clientside auth creds for responses * fix the client side auth to use correct metadata * add more tests * fix tests	2025-08-05 17:46:47 -07:00
Ishaan Jaff	b455ada161	[Bug Fix] [Bug]: New Databricks Foundation Models databricks-gpt-oss-20b and databricks-gpt-oss-120b failed with error: litellm.APIConnectionError: 'signature' (#13318 ) * test_transform_choices_without_signature * fix ChatCompletionThinkingBlock * extract_reasoning_content	2025-08-05 17:46:40 -07:00
Ishaan Jaff	dab8ba03e3	[Feat] - When using custom tags on prometheus allow using wildcard patterns (#13316 ) * _tag_matches_wildcard_configured_pattern * test_get_custom_labels_from_tags_wildcard_patterns * docs Custom Tags * docs how custom tags work * fix	2025-08-05 17:46:13 -07:00
Ishaan Jaff	0ccc493455	[Bug]: Fix Mimetype Resolution Error in Bedrock Document Understanding (#13309 ) * fix _validate_format for BedrockImageProcessor * add test * fix _validate_format for bedrock * _get_document_format * test_bedrock_get_document_format_fallback_mimes * fix: add fallback method for mime type detection	2025-08-05 17:07:10 -07:00
Jugal D. Bhatt	32501c85f5	fix unsupported operand type(s) for +=: 'NoneType' and 'str' on clientside auth creds for responses (#13293 )	2025-08-05 13:16:16 -07:00
Jugal D. Bhatt	609fa9f5ca	[LLM Translation + Coding tools] Added litellm claude code count tokens support (#13261 ) * Added litellm claude code count tokens support * fix mypy * create helper * Revert construct * revert construct * fix return * Add reutrn none * change to factory approach * refactor to BaseModelInfo * enum fix	2025-08-05 10:57:24 -07:00
Jugal D. Bhatt	29a8c583c2	added redis iam auth (#13275 )	2025-08-05 10:56:34 -07:00
Ishaan Jaff	5a02eb473b	test_function_calling_with_tool_response	2025-08-05 09:55:47 -07:00
Krish Dholakia	416da066eb	fix(main.py): handle tool being a pydantic object (#13274 ) * fix(main.py): handle tool being a pydantic object Fixes https://github.com/BerriAI/litellm/issues/13064 * fix(prompt_templates/common_utils.py): fix unpack defs deepcopy issue Fixes https://github.com/BerriAI/litellm/issues/13151 * fix(utils.py): handle tools is none	2025-08-04 23:44:02 -07:00
Krish Dholakia	eb49f987de	Ensure disable_llm_api_endpoints works + Add wildcard model support for 'team-byok' model (#13278 ) * fix(route_checks.py): ensure disable llm api endpoints is correctly set * fix(route_checks.py): raise httpexception raise expected exceptions * fix(router.py): handle team only wildcard models fixes issue where team only wildcard models were not considered during auth checks * fix(router.py): handle team only wildcard models fixes issue where team only wildcard models were not considered during auth checks	2025-08-04 23:19:51 -07:00
Jugal D. Bhatt	efd34966dc	[LLM Translation] Support /v1/models/{model_id} retrieval (#13268 ) * added model id endpoint * fix test * add route to internal users * make the functions reusable * fixed mypy	2025-08-04 18:03:59 -07:00
Jugal D. Bhatt	de7108b5f8	input cost per token higher than 1 test (#13270 )	2025-08-04 18:02:03 -07:00
Ishaan Jaff	ba1882fdd5	[Bug Fix] Prometheus - fix for `litellm_input_tokens_metric`, `litellm_output_tokens_metric` - Note this updates the metric name (#13271 ) * fixes for litellm_tokens_metric * test_prometheus_token_metrics_with_prometheus_config	2025-08-04 17:22:21 -07:00
Pascal Bro	a17d483c89	Add GCS bucket caching support (#13122 )	2025-08-04 16:09:33 -07:00

2637 Commits