litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-17 22:48:35 +00:00

Author	SHA1	Message	Date
Ishaan Jaffer	50328d15d4	test_process_chunk_with_response_completed_event	2025-11-26 11:52:05 -08:00
Sameer Kankute	82dc0354ce	Litellm sameer nov 3 stable branch (#16963) * Add openai metadata field in the request * Add docs related to openai metadata * Add utils * test_completion_openai_metadata[True] * Added support for thought signature for gemini 3 in responses api (#16872) * Added support for thought signature for gemini 3 * Update docs with all supported endpoints and cost tracking * Added config based routing support for batches and files * fix lint errors * Litellm anthropic image url support (#16868) * Add image as url support to anthropic * fix mypy errors * fix tests * Fix: Populate spend_logs_metadata in batch and files endpoints (#16921) * Add spend-logs-metadata to the metadata * Add tests for spend logs metadata in batches * use better names * Remove support for penalty param for gemini 3 (#16907) * Remove support for penalty param * remove hallucinated model names * fix mypy/test errors * fix tests * fix too many lines error * fix too many lines error * Add config for cicd test case * Fix final tests * fix batch tests * fix batch tests	2025-11-22 09:35:05 -08:00
Ishaan Jaffer	1d2bdaebb6	test_openai_streaming_logging	2025-11-08 11:49:36 -08:00
Ishaan Jaffer	4fb521c251	test_basic_openai_responses_api_non_streaming_with_logging	2025-11-08 11:12:35 -08:00
Ishaan Jaffer	bbcdf6f996	test_basic_openai_responses_api_non_streaming_with_logging	2025-11-08 10:47:48 -08:00
Ishaan Jaffer	0148b8d2f7	test_basic_openai_responses_api_non_streaming_with_logging	2025-11-08 10:36:47 -08:00
Ishaan Jaffer	89157f4b5c	test_basic_openai_responses_api_streaming_with_logging	2025-11-08 10:30:41 -08:00
Ishaan Jaffer	6bb963dca2	test_basic_openai_responses_api_non_streaming_with_logging	2025-11-08 10:13:27 -08:00
Ishaan Jaffer	f9d95b71bb	fix _get_assembled_streaming_response	2025-11-08 10:08:03 -08:00
Ishaan Jaffer	61cf169926	test_basic_openai_responses_api_streaming	2025-11-06 16:18:30 -08:00
Cesar Garcia	1fec48499f	fix: Pass extra_body to provider in Responses API requests (#16320 ) ## Problem The `extra_body` parameter in `litellm.responses()` and `litellm.aresponses()` was being accepted but never passed to the HTTP request sent to the LLM provider. This prevented users from sending custom/experimental parameters to provider APIs. ## Changes - Added `data.update(extra_body)` in `async_response_api_handler` (line 2138) - Added `data.update(extra_body)` in `response_api_handler` (line 2012) - Added tests to `test_openai_responses_api.py` for extra_body functionality ## Testing - Tests verify extra_body params are passed in both sync and async modes - Existing Responses API tests continue to pass - Manually verified with OpenAI API that custom params are sent correctly ## Impact Users can now pass custom/experimental parameters via extra_body: ```python litellm.aresponses( model="gpt-4o", input="hello", extra_body={"custom_param": "value"} # Now works! ) ``` This aligns with the OpenAI SDK pattern and matches behavior in other LiteLLM endpoints (completion, embedding, etc.) that already support extra_body.	2025-11-06 14:54:40 -08:00
Ishaan Jaffer	4a83ae0695	test_aresponses_service_tier_and_safety_identifier	2025-11-04 18:05:30 -08:00
Cesar Garcia	78ed5126a5	fix: Fix Responses API streaming tests usage field names and cost (#16236 ) This commit fixes two bugs in Responses API streaming tests: 1. Usage field naming bug: Tests were using `input_tokens` and `output_tokens` but the Usage object uses `prompt_tokens` and `completion_tokens`. 2. Missing cost in streaming usage: When `include_cost_in_streaming_usage` was enabled, the cost was calculated and added to ResponseAPIUsage, but was lost during the transformation to the Usage object. Changes: - Updated test assertions to use correct field names (prompt_tokens, completion_tokens) - Added cost preservation logic in FakeStreamerResponsesAPIIterator - Modified _transform_response_api_usage_to_chat_usage() to preserve cost attribute All streaming tests now pass successfully.	2025-11-04 15:57:59 -08:00
Krish Dholakia	74ae7aed44	build: Squashed commit of the following: (#16176 ) commit bb0b050fb01633d83c1c2932f8e9c11432911847 Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Sat Nov 1 20:00:01 2025 -0700 test: update tests commit b2da4bdac23868e69a9452805b231f8830e49912 Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Oct 22 14:58:01 2025 -0700 fix(langfuse_otel_attributes.py): log tools and other optional params commit 75bee1f2748f32b230467de0b085c55bf1d687a9 Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Oct 22 14:42:05 2025 -0700 feat(langfuse_otel/): working request/response logging on spans Closes https://github.com/BerriAI/litellm/issues/13764 commit a3e4fa5b81e82f71c74fb9e7dc859c6cb40495f5 Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Oct 22 14:20:39 2025 -0700 fix: initial commit fixing langfuse request/response logging with OTEL commit 09fc9deac844004104822810e42975cd9c68f0e3 Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Oct 22 13:33:52 2025 -0700 fix(litellm_logging.py): for responses api - return a unified usage object for logging ensures logging integrations all pull the right usage information	2025-11-02 09:46:40 -08:00
Sameer Kankute	f804ab6de5	Add LLM provider response headers to Responses API (#16091) * Add llm headers to responses api * fix mock test	2025-11-01 13:25:56 -07:00
Ishaan Jaffer	33371d18f4	test fix claude-sonnet-4-5-20250929	2025-10-28 19:05:13 -07:00
Ishaan Jaffer	1b49dba1dd	fix claude-sonnet-4-5	2025-10-28 17:37:08 -07:00
Ishaan Jaffer	a1d3790198	TestAzureResponsesAPITest	2025-10-25 16:22:52 -07:00
Ishaan Jaffer	f0ae2bef4f	TestAzureResponsesAPITest	2025-10-25 16:09:04 -07:00
Ishaan Jaffer	667f2613de	TestAzureOpenAIVectorStore	2025-10-25 14:06:28 -07:00
Ishaan Jaffer	eb425550a2	test responses API fixes	2025-10-25 10:58:39 -07:00
Ishaan Jaff	d91efa7a7b	[Bug Fix]: ErrorEvent ValidationError when OpenAI Responses API returns nested error structure (#15804 ) * add ErrorEventError nested field * test_openai_responses_api_token_limit_error * test_openai_responses_api_token_limit_error	2025-10-22 14:18:46 -07:00
Sameer Kankute	44495c0117	fix encrypted content error (#15782)	2025-10-21 23:29:48 -07:00
Ishaan Jaffer	97626f2d02	fix test	2025-10-11 11:30:19 -07:00
Ishaan Jaffer	ce6102d54f	test_azure_responses_api_status_error	2025-10-11 10:41:15 -07:00
Ishaan Jaffer	86881a8fc1	test_azure_responses_api_status_error	2025-10-11 10:21:07 -07:00
Krrish Dholakia	5336fcc000	fix(azure/responses): always remove status unsupported parameter	2025-10-06 18:08:57 -07:00
Ishaan Jaff	708c0bd78d	[Feat] Return Cost for Responses API Streaming requests (#15053) * test_basic_openai_responses_api_streaming * _transform_chat_completion_usage_to_responses_usage * ResponseAPIUsage.cost * test fixes for anthropic cost with /responses * fix mypy typing	2025-09-29 19:47:04 -07:00
Alexsander Hamir	eaa04cd8ce	fix: use fastuuid helper (#14903 ) * fix: use fastuuid helper across the codebase First batch of changes, simple drop in replacement. * second batch of changes * fixed: script mistake on helper file	2025-09-25 15:47:01 -07:00
Ishaan Jaff	8e22cf5d65	[Fix] /responses API - add cancel endpoint + allow non-admins to use this as an llm api endpoint (#14594 ) * fix: ensure /responses/cancel works for non admins * test: cancel endpoint * fix responses API cancel endpoint * test fix * TestGoogleAIStudioResponsesAPITest	2025-09-15 18:49:54 -07:00
Sameer Kankute	110ce543c2	[Feat] Add cancel endpoint support for openai and azure (#14561) * Add cancel endpoint support for openai and azure * fix lint error * fix cancel url construction azure * readd changes	2025-09-15 07:08:56 -07:00
Sameer Kankute	ad9f54a192	Move test to test_litellm/ folder	2025-09-05 10:08:34 +05:30
Krish Dholakia	94c1b21ae7	Merge branch 'main' into litellm_responses_structured_output	2025-09-04 12:23:26 -07:00
Sameer Kankute	fc9560573b	[BUG] Fix response api for reasoning item in input for litellm proxy (#14200) * fix response api for litellm proxy * Add test for checking if status is getting removed * add test in correct file * remove hardcoded fields * Make the handling simpler * fix lint error	2025-09-04 10:36:48 -07:00
Sameer Kankute	8d67392e99	Merge branch 'main' into litellm_responses_structured_output	2025-09-04 22:35:30 +05:30
Sameer Kankute	5f79e8aac6	Litellm passthrough cost tracking chat completion (#14256 ) * feat: add structured output for sdk * Add support for cost tracking for chat completion in passthrough * remove not required changes	2025-09-04 09:57:48 -07:00
Sameer Kankute	8e363fe78c	feat: add structured output for sdk	2025-09-03 17:58:55 +05:30
Ishaan Jaff	37d885e46a	[Fix] LiteLLM does not support new web_search tool (Responses API) (#14083 ) * test_basic_openai_responses_with_websearch * fix: ResponsesAPIResponse * fix: StandardBuiltInToolCostTracking	2025-08-29 18:33:52 -07:00
Ishaan Jaff	b9132968b2	[Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS (#13905 ) * [Performance] Reduce Significant CPU overhead from litellm_logging.py (#13895) * fix: litellm.configured_cold_storage_logger * fix Session Management - Non-OpenAI Models docs * ruff fix * test fix * create LoggingWorker * add GLOBAL_LOGGING_WORKER for async task handling * fix logging tests * add conftest * fix conftest * test fix location of encode bedrock runtime modelid arn * fix conftest.py * tuning LoggingWorker * conftest.py * fix conftest batches/ * test_async_chat_azure * event_loop * test_bedrock_streaming_passthrough_test2 * fix GLOBAL_LOGGING_WORKER * logging worker * add flush for global logging worker * Revert "fix GLOBAL_LOGGING_WORKER" This reverts commit d254f508f48935652f054777652938ad71976cce. * fix conftest clear_queue * fix conftest clear_queue * setup_and_teardown for llm translation * docs AWS_REGION * test_async_chat_azure * change test DIR * run ci/cd again * use 1 job for litellm_router_unit_testing * fix space * fix litellm_router_unit_testing * test_aaarouter_dynamic_cooldown_message_retry_time * litellm_router_unit_testing * conftest.py clearing qu * fixes litellm_router_unit_testing * fixes clear_queue * fix router_unit_tests * remove conftest * add back conftest for router * fix event loop test * test fix * fixes for LoggingWorker * ruff fix	2025-08-23 13:13:23 -07:00
Krrish Dholakia	cbb161f10b	test: handle internal server errors	2025-08-23 11:00:18 -07:00
Ishaan Jaff	76f1064229	[Bug Fix] litellm incompatible with newest release of openAI v1.100.0 (#13728 ) * fix imports OpenAI SDK * ResponseText fixes * fixes ResponseText * fix imports * catch AttributeError * fix import * use openai==1.100.1 * fix build from PIP * fix lint test * Print OpenAI version * fix Install dependencies	2025-08-18 18:26:17 -07:00
Ishaan Jaff	1cd827874f	[Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API (#13475 ) * test_openai_gpt5_reasoning * test_openai_gpt5_reasoning_effort_parameter * add OpenAIGPT5ResponsesAPIConfig * test_openai_gpt5_reasoning_effort_parameter * fixes	2025-08-10 09:55:36 -07:00
Ishaan Jaff	825ea65b96	[Bug Fix] Responses API - Responses API failed if input containing ResponseReasoningItem (#13465 ) * add test_responses_api_multi_turn_with_reasoning_and_structured_output * fix transform_responses_api_request	2025-08-09 11:20:34 -07:00
Ishaan Jaff	9761ba7c7a	[Bug Fix] Responses api session management for streaming responses (#13396 ) * fix proxy config * fix(responses api): fix streaming ID consistency and tool format handling (#12640) * fix(responses): ensure streaming chunk IDs use consistent encoding format Fixes streaming ID inconsistency where streaming responses used raw provider IDs while non-streaming responses used properly encoded IDs with provider context. Changes: - Updated LiteLLMCompletionStreamingIterator to accept provider context - Added _encode_chunk_id() method using same logic as non-streaming responses - Modified chunk transformation to encode all streaming item_ids with resp_ prefix - Updated handlers to pass custom_llm_provider and litellm_metadata to streaming iterator Impact: - Streaming chunk IDs now format: resp_<base64_encoded_provider_context> - Enables session continuity when using streaming response IDs as previous_response_id - Allows provider detection and load balancing with streaming responses - Maintains backward compatibility with existing streaming functionality 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(types): add explicit Optional[str] type annotation for model_id This resolves MyPy type checking error where model_id could be None but wasn't explicitly typed as Optional[str]. * fix(types): handle None case for litellm_metadata access Prevents 'Item None has no attribute get' error by checking for None before accessing litellm_metadata dictionary. * test: add comprehensive tests for streaming ID consistency Adds unit and E2E tests to verify streaming chunk IDs are properly encoded with consistent format across streaming responses. 
## Tests Added ### Unit Test (test_reasoning_content_transformation.py) - `test_streaming_chunk_id_encoding()`: Validates the `_encode_chunk_id()` method correctly encodes chunk IDs with `resp_` prefix and provider context ### E2E Tests (test_e2e_openai_responses_api.py) - `test_streaming_id_consistency_across_chunks()`: Tests that all streaming chunk IDs are properly encoded across multiple chunks in a real streaming response - `test_streaming_response_id_as_previous_response_id()`: Tests the core use case - using streaming response IDs for session continuity with `previous_response_id` ## Key Testing Approach - Uses Gemini (non-OpenAI model) to test the transformation logic rather than OpenAI passthrough, since the streaming ID consistency issue occurs when LiteLLM transforms responses rather than just passing through to native OpenAI responses API - Tests validate that streaming chunk IDs now use same encoding as non-streaming responses - Verifies session continuity works with streaming responses Addresses @ishaan-jaff's request for unit tests covering the streaming ID consistency fix. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(lint): remove unused imports in transformation.py Removes unused imports to fix CI linting errors: - GenericResponseOutputItem - OutputFunctionToolCall * test: remove E2E tests from openai_endpoints_tests Remove streaming ID consistency E2E tests as requested by @ishaan-jaff. Keep only the mock/unit test in test_reasoning_content_transformation.py * revert: remove streaming chunk ID encoding to original behavior This reverts the streaming chunk ID encoding changes to understand the original issue better. Original behavior was: - Streaming chunks: raw provider IDs - Streaming final response: raw IDs (PROBLEM!) - Non-streaming final response: encoded IDs (correct) The real issue: streaming final response IDs were not encoded, breaking session continuity. 
* fix(responses): encode streaming final response IDs to match OpenAI behavior Fixes streaming ID inconsistency to match OpenAI's Responses API behavior: - Streaming chunks: raw message IDs (like OpenAI's msg_xxx) - Final response: encoded IDs (like OpenAI's resp_xxx) This enables session continuity by ensuring streaming final response IDs have the same encoded format as non-streaming responses, allowing them to be used as previous_response_id in follow-up requests. Changes: - Add custom_llm_provider and litellm_metadata to LiteLLMCompletionStreamingIterator - Update handlers to pass provider context to streaming iterator - Apply _update_responses_api_response_id_with_model_id to final streaming response - Keep streaming chunks as raw IDs to match OpenAI format Impact: - Session continuity works with streaming responses - Load balancing can detect provider from streaming final response IDs - Format matches OpenAI's Responses API exactly 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * test: update unit test to match correct OpenAI-compatible behavior Updates the unit test to verify streaming chunk IDs are raw (not encoded) to match OpenAI's responses API format: - Streaming chunks: raw message IDs (like msg_xxx) - Final response: encoded IDs (like resp_xxx) This reflects the correct behavior implemented in the fix. --------- Co-authored-by: Claude <noreply@anthropic.com> * cleanup * TestBaseResponsesAPIStreamingIterator --------- Co-authored-by: Javier de la Torre <jatorre@carto.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-08-07 20:13:24 -07:00
Ishaan Jaff	f3749709b8	Bug Fix - Responses API raises error with Gemini Tool Calls in `input` (#13260 ) * add _transform_responses_api_function_call_to_chat_completion_message * test_responses_api_with_tool_calls * TestFunctionCallTransformation * fixes for responses API testing google ai studio * TestGoogleAIStudioResponsesAPITest * test_responses_api_with_tool_calls * test_responses_api_with_tool_calls * test_basic_openai_responses_streaming_delete_endpoint	2025-08-04 12:01:33 -07:00
Ishaan Jaff	dae72003a7	[Bug Fix] OpenAI / Azure Responses API - Add `service_tier` , `safety_identifier` supported params (#13258 ) * test_aresponses_service_tier_and_safety_identifier * add service_tier + safety_identifier * fix get_supported_openai_params * add safety_identifier + service_tier for responses()	2025-08-04 10:51:53 -07:00
Jugal D. Bhatt	eb8a338d9b	[MCP Guardrails] move pre and during hooks to ProxyLogging (#13109) * move pre and during hooks to ProxyLogging * fix lint * fix ruff * fix tests	2025-07-30 13:58:41 -07:00
Ishaan Jaff	66a139a86a	test_basic_openai_responses_api_streaming	2025-07-19 15:30:03 -07:00
Ishaan Jaff	274baac9df	test_mcp_tools_with_responses_api	2025-07-03 14:53:30 -07:00
Ishaan Jaff	03a589d323	fix - MCP deepwiki mcp is unstable, move to stable mcp	2025-07-03 14:24:32 -07:00
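
Commit 1fec48499f above describes its fix as adding `data.update(extra_body)` to the sync and async Responses API handlers, so that caller-supplied parameters reach the provider request. A minimal sketch of that merge step follows. The helper name `build_request_body` and its signature are hypothetical and are not LiteLLM's actual code; only the `data.update(extra_body)` call comes from the commit message.

```python
from typing import Any, Dict, Optional


def build_request_body(
    model: str,
    input_text: str,
    extra_body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body dict (hypothetical helper, for illustration only)."""
    data: Dict[str, Any] = {"model": model, "input": input_text}
    if extra_body:
        # Same merge call the commit message names: data.update(extra_body)
        data.update(extra_body)
    return data


body = build_request_body("gpt-4o", "hello", extra_body={"custom_param": "value"})
print(body)
```

Because `dict.update` overwrites existing keys, any key in `extra_body` that collides with a standard field replaces it in this sketch; whether the real handlers guard against that is not stated in the log.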

88 Commits