51 Commits

Author SHA1 Message Date
Ishaan Jaff 37d885e46a [Fix] LiteLLM does not support new web_search tool (Responses API) (#14083)
* test_basic_openai_responses_with_websearch

* fix: ResponsesAPIResponse

* fix: StandardBuiltInToolCostTracking
2025-08-29 18:33:52 -07:00
Ishaan Jaff b9132968b2 [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS (#13905)
* [Performance] Reduce Significant CPU overhead from litellm_logging.py (#13895)

* fix: litellm.configured_cold_storage_logger

* fix Session Management - Non-OpenAI Models docs

* ruff fix

* test fix

* create LoggingWorker

* add GLOBAL_LOGGING_WORKER for async task handling

* fix logging tests

* add conftest

* fix conftest

* test fix location of encode bedrock runtime modelid arn

* fix conftest.py

* tuning LoggingWorker

* conftest.py

* fix conftest batches/

* test_async_chat_azure

* event_loop

* test_bedrock_streaming_passthrough_test2

* fix GLOBAL_LOGGING_WORKER

* logging worker

* add flush for global logging worker

* Revert "fix GLOBAL_LOGGING_WORKER"

This reverts commit d254f508f48935652f054777652938ad71976cce.

* fix conftest clear_queue

* fix conftest clear_queue

* setup_and_teardown for llm translation

* docs AWS_REGION

* test_async_chat_azure

* change test DIR

* run ci/cd again

* use 1 job for litellm_router_unit_testing

* fix space

* fix litellm_router_unit_testing

* test_aaarouter_dynamic_cooldown_message_retry_time

* litellm_router_unit_testing

* conftest.py clearing qu

* fixes litellm_router_unit_testing

* fixes clear_queue

* fix router_unit_tests

* remove conftest

* add back conftest for router

* fix event loop test

* test fix

* fixes for LoggingWorker

* ruff fix
2025-08-23 13:13:23 -07:00
Krrish Dholakia cbb161f10b test: handle internal server errors 2025-08-23 11:00:18 -07:00
Ishaan Jaff 76f1064229 [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 (#13728)
* fix imports OpenAI SDK

* ResponseText fixes

* fixes ResponseText

* fix imports

* catch AttributeError

* fix import

* use openai==1.100.1

* fix build from PIP

* fix lint test

* Print OpenAI version

* fix Install dependencies
2025-08-18 18:26:17 -07:00
Ishaan Jaff 1cd827874f [Bug Fix] - Allow using reasoning_effort for gpt-5 model family and reasoning for Responses API (#13475)
* test_openai_gpt5_reasoning

* test_openai_gpt5_reasoning_effort_parameter

* add OpenAIGPT5ResponsesAPIConfig

* test_openai_gpt5_reasoning_effort_parameter

* fixes
2025-08-10 09:55:36 -07:00
Ishaan Jaff 825ea65b96 [Bug Fix] Responses API - Responses API failed if input containing ResponseReasoningItem (#13465)
* add test_responses_api_multi_turn_with_reasoning_and_structured_output

* fix transform_responses_api_request
2025-08-09 11:20:34 -07:00
Ishaan Jaff 9761ba7c7a [Bug Fix] Responses api session management for streaming responses (#13396)
* fix proxy config

* fix(responses api): fix streaming ID consistency and tool format handling (#12640)

* fix(responses): ensure streaming chunk IDs use consistent encoding format

Fixes streaming ID inconsistency where streaming responses used raw provider IDs
while non-streaming responses used properly encoded IDs with provider context.

Changes:
- Updated LiteLLMCompletionStreamingIterator to accept provider context
- Added _encode_chunk_id() method using same logic as non-streaming responses
- Modified chunk transformation to encode all streaming item_ids with resp_ prefix
- Updated handlers to pass custom_llm_provider and litellm_metadata to streaming iterator

Impact:
- Streaming chunk IDs now use the format: resp_<base64_encoded_provider_context>
- Enables session continuity when using streaming response IDs as previous_response_id
- Allows provider detection and load balancing with streaming responses
- Maintains backward compatibility with existing streaming functionality

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(types): add explicit Optional[str] type annotation for model_id

This resolves MyPy type checking error where model_id could be None
but wasn't explicitly typed as Optional[str].

* fix(types): handle None case for litellm_metadata access

Prevents 'Item None has no attribute get' error by checking for None
before accessing litellm_metadata dictionary.
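
A minimal sketch of the two typing fixes above, under stated assumptions: the helper name `get_model_id` and the `model_info` / `id` keys are hypothetical and only illustrate the pattern (an explicit `Optional[str]` annotation plus a None check before calling `.get`). They are not taken from the LiteLLM source.

```python
from typing import Any, Dict, Optional


def get_model_id(litellm_metadata: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return a model id from the metadata, or None when it is unavailable.

    Hypothetical helper: the key names below are assumptions for illustration.
    """
    # Guard first: calling .get on None is what produces the
    # "Item None has no attribute get" type-checker error.
    if litellm_metadata is None:
        return None

    # "model_info" may be missing or explicitly None, so fall back to {}.
    model_info: Dict[str, Any] = litellm_metadata.get("model_info") or {}

    # Explicit Optional[str] annotation, since the id may be absent.
    model_id: Optional[str] = model_info.get("id")
    return model_id
```

With this shape, callers that receive no metadata get `None` back instead of an `AttributeError`.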

* test: add comprehensive tests for streaming ID consistency

Adds unit and E2E tests to verify streaming chunk IDs are properly encoded
with consistent format across streaming responses.

## Tests Added

### Unit Test (test_reasoning_content_transformation.py)
- `test_streaming_chunk_id_encoding()`: Validates the `_encode_chunk_id()` method
  correctly encodes chunk IDs with `resp_` prefix and provider context

### E2E Tests (test_e2e_openai_responses_api.py)
- `test_streaming_id_consistency_across_chunks()`: Tests that all streaming chunk IDs
  are properly encoded across multiple chunks in a real streaming response
- `test_streaming_response_id_as_previous_response_id()`: Tests the core use case -
  using streaming response IDs for session continuity with `previous_response_id`

## Key Testing Approach
- Uses **Gemini** (non-OpenAI model) to test the transformation logic rather than
  OpenAI passthrough, since the streaming ID consistency issue occurs when LiteLLM
  transforms responses rather than just passing through to native OpenAI responses API
- Tests validate that streaming chunk IDs now use same encoding as non-streaming responses
- Verifies session continuity works with streaming responses

Addresses @ishaan-jaff's request for unit tests covering the streaming ID consistency fix.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(lint): remove unused imports in transformation.py

Removes unused imports to fix CI linting errors:
- GenericResponseOutputItem
- OutputFunctionToolCall

* test: remove E2E tests from openai_endpoints_tests

Remove streaming ID consistency E2E tests as requested by @ishaan-jaff.
Keep only the mock/unit test in test_reasoning_content_transformation.py

* revert: remove streaming chunk ID encoding to original behavior

This reverts the streaming chunk ID encoding changes to understand the original issue better.
Original behavior was:
- Streaming chunks: raw provider IDs
- Streaming final response: raw IDs (PROBLEM!)
- Non-streaming final response: encoded IDs (correct)

The real issue: streaming final response IDs were not encoded, breaking session continuity.

* fix(responses): encode streaming final response IDs to match OpenAI behavior

Fixes streaming ID inconsistency to match OpenAI's Responses API behavior:
- Streaming chunks: raw message IDs (like OpenAI's msg_xxx)
- Final response: encoded IDs (like OpenAI's resp_xxx)

This enables session continuity by ensuring streaming final response IDs
have the same encoded format as non-streaming responses, allowing them
to be used as previous_response_id in follow-up requests.

Changes:
- Add custom_llm_provider and litellm_metadata to LiteLLMCompletionStreamingIterator
- Update handlers to pass provider context to streaming iterator
- Apply _update_responses_api_response_id_with_model_id to final streaming response
- Keep streaming chunks as raw IDs to match OpenAI format

Impact:
- Session continuity works with streaming responses
- Load balancing can detect provider from streaming final response IDs
- Format matches OpenAI's Responses API exactly
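
The behavior described above (raw IDs on streaming chunks, an encoded `resp_` ID on the final response) can be sketched roughly as follows. This is an illustration only: `encode_response_id`, `decode_response_id`, and the JSON payload layout are assumptions made for this example, not the actual `_update_responses_api_response_id_with_model_id` implementation or its wire format.

```python
import base64
import json
from typing import Any, Dict, Optional

PREFIX = "resp_"


def encode_response_id(raw_id: str, provider: str, model_id: Optional[str] = None) -> str:
    """Wrap a raw provider ID with provider context (hypothetical layout)."""
    context = {"provider": provider, "model_id": model_id, "response_id": raw_id}
    token = base64.urlsafe_b64encode(json.dumps(context).encode("utf-8"))
    return PREFIX + token.decode("ascii")


def decode_response_id(response_id: str) -> Dict[str, Any]:
    """Recover provider context; pass through IDs that were never encoded."""
    passthrough = {"provider": None, "model_id": None, "response_id": response_id}
    if not response_id.startswith(PREFIX):
        return passthrough
    try:
        decoded = json.loads(base64.urlsafe_b64decode(response_id[len(PREFIX):]))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON: treat as a native ID.
        return passthrough
    return decoded if isinstance(decoded, dict) else passthrough


# Streaming chunks keep the raw ID; only the final response ID is encoded,
# so it can be sent back later as previous_response_id.
final_id = encode_response_id("msg_123", provider="gemini", model_id="model-a")
context = decode_response_id(final_id)
```

Decoding the final ID returns the provider and model context, which is what would let a follow-up request that passes it as `previous_response_id` be routed to the same deployment.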

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test: update unit test to match correct OpenAI-compatible behavior

Updates the unit test to verify streaming chunk IDs are raw (not encoded)
to match OpenAI's responses API format:
- Streaming chunks: raw message IDs (like msg_xxx)
- Final response: encoded IDs (like resp_xxx)

This reflects the correct behavior implemented in the fix.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* cleanup

* TestBaseResponsesAPIStreamingIterator

---------

Co-authored-by: Javier de la Torre <jatorre@carto.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-07 20:13:24 -07:00
Ishaan Jaff f3749709b8 Bug Fix - Responses API raises error with Gemini Tool Calls in input (#13260)
* add _transform_responses_api_function_call_to_chat_completion_message

* test_responses_api_with_tool_calls

* TestFunctionCallTransformation

* fixes for responses API testing google ai studio

* TestGoogleAIStudioResponsesAPITest

* test_responses_api_with_tool_calls

* test_responses_api_with_tool_calls

* test_basic_openai_responses_streaming_delete_endpoint
2025-08-04 12:01:33 -07:00
Ishaan Jaff dae72003a7 [Bug Fix] OpenAI / Azure Responses API - Add service_tier , safety_identifier supported params (#13258)
* test_aresponses_service_tier_and_safety_identifier

* add service_tier + safety_identifier

* fix get_supported_openai_params

* add safety_identifier + service_tier for responses()
2025-08-04 10:51:53 -07:00
Jugal D. Bhatt eb8a338d9b [MCP Guardrails] move pre and during hooks to ProxyLoggin (#13109)
* move pre and during hooks to ProxyLoggin

* fix lint

* fix ruff

* fix tests
2025-07-30 13:58:41 -07:00
Ishaan Jaff 66a139a86a test_basic_openai_responses_api_streaming 2025-07-19 15:30:03 -07:00
Ishaan Jaff 274baac9df test_mcp_tools_with_responses_api 2025-07-03 14:53:30 -07:00
Ishaan Jaff 03a589d323 fix - MCP deepwiki mcp is unstable, move to stable mcp 2025-07-03 14:24:32 -07:00
Jugal D. Bhatt 88834b8550 [Bump] Litellm responses format (#12253)
* Add responses format changes

* Add check

* Add check

* added more testing
2025-07-02 16:32:06 -07:00
Ishaan Jaff 4e7115bc34 Bug Fix - responses api fix got multiple values for keyword argument 'litellm_trace_id' (#12225)
* fix - handling trace id arg on responses api

* test_async_response_api_handler_merges_trace_id_without_error

* test_anthropic_with_responses_api
2025-07-01 18:12:22 -07:00
Ishaan Jaff 99d851544a [Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API (#11934)
* fix get_complete_url

* fixes _is_azure_v1_api_version

* test_azure_responses_api_preview_api_version

* TestAzureResponsesAPIConfig

* add azure/codex-mini

* fix azure/codex-mini

* Update litellm/llms/azure/responses/transformation.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix linting

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-20 18:08:44 -07:00
Ishaan Jaff cda759c8e7 [Bug Fix]: Fix gemini - web search error with responses API (#11894)
* feat - add websearch tools to responses to chat transform

* test_basic_google_ai_studio_responses_api_with_tools

* fix web search to responses api

* linting fixes
2025-06-19 14:11:35 -07:00
Ishaan Jaff 80501b8268 [Feat] Day-0 Support for OpenAI Re-usable prompts Responses API (#11782)
* add prompt to responses params

* add OpenAI PromptObject

* add prompt param to responses api

* test_get_optional_params_responses_api

* test_openai_responses_litellm_router_with_prompt

* docs Reusable Prompts
2025-06-16 21:28:50 -07:00
Ishaan Jaff 4dc9626fd5 [Feat] New LLM API Endpoint - Add List input items for Responses API (#11602)
* (feat) add list_input_items

* add alist_input_items to router

* add GET input_items for responses API

* test_basic_openai_list_input_items_endpoint

* TestTransformListInputItemsRequest

* test_ensure_initialize_azure_sdk_client_always_used
2025-06-10 15:47:16 -07:00
Ishaan Jaff 86cdb8382b [Feat] Use aiohttp transport by default - 97% lower median latency (#11097)
* fix: add flag for disabling use_aiohttp_transport

* feat: add _create_async_transport

* feat: fixes for transport

* add httpx-aiohttp

* feat: fixes for transport

* refactor: fixes for transport

* build: fix deps

* fixes: test fixes

* fix: ensure aiohttp does not auto set content type

* test: test fixes

* feat: add LiteLLMAiohttpTransport

* fix: fixes for responses API handling

* test: fixes for responses API handling

* test: fixes for responses API handling

* feat: fixes for transport

* fix: base embedding handler

* test: test_async_http_handler_force_ipv4

* test: fix failing deepeval test

* fix: add YARL for bedrock urls

* fix: issues with transport

* fix: comment out linting issues

* test fix

* test: XAI is unstable

* test: fixes for using respx

* test: XAI fixes

* test: XAI fixes

* test: infinity testing fixes

* docs(config_settings.md): document param

* test: test_openai_image_edit_litellm_sdk

* test: remove deprecated test

* bump respx==0.22.0

* test: test_xai_message_name_filtering

* test: fix anthropic test after bumping httpx

* use n 4 for mapped tests (#11109)

* fix: use 1 session per event loop

* test: test_client_session_helper

* fix: linting error

* fix: resolving GET requests on httpx 0.28.1

* test fixes proxy unit tests

* fix: add ssl verify settings

* fix: proxy unit tests

* fix: refactor

* tests: basic unit tests for aiohttp transports

* tests: fixes xai

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-05-23 22:55:35 -07:00
Ishaan Jaff dd4a65b83a Feat: add MCP to Responses API and bump openai python sdk (#11029)
* feat: add MCP to responses API

* feat: bump openai version to 1.75.0

* docs MCP + responses API

* fixes: type checking

* fixes: type checking

* build: use latest openai 1.81.0

* fix: linting error

* fix: linting error

* fix: test

* fix: linting errors

* fix: test

* fix: test

* fix: linting

* Revert "fix: linting"

This reverts commit ebb19ff8cb1f8fcc3e224390e351676daccb33de.

* fix: linting
2025-05-22 07:24:10 -07:00
Krrish Dholakia 66cf75cd5d test: handle internal server errors 2025-05-01 16:47:30 -07:00
Ishaan Jaff a69fa1dc1e [Bug Fix] Responses API - fix for handling multiturn responses API sessions (#10415)
* fix for handling previous_response_id sessions on responses API

* fix test_decode_previous_response_id_to_original_previous_response_id
2025-04-29 17:22:47 -07:00
Ishaan Jaff dc9b058dbd [Feat] Add support for GET Responses Endpoint - OpenAI, Azure OpenAI (#10235)
* Added get responses API (#10234)

* test_basic_openai_responses_get_endpoint

* transform_get_response_api_request

* test_basic_openai_responses_get_endpoint

---------

Co-authored-by: Prathamesh Saraf <pratamesh1867@gmail.com>
2025-04-23 15:19:29 -07:00
Ishaan Jaff 0dba2886f0 fix test 2025-04-22 18:37:56 -07:00
Ishaan Jaff 868cdd0226 [Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI (#10205)
* add transform_delete_response_api_request to base responses config

* add transform_delete_response_api_request

* add delete_response_api_handler

* fixes for deleting responses, response API

* add adelete_responses

* add async test_basic_openai_responses_delete_endpoint

* test_basic_openai_responses_delete_endpoint

* working delete for streaming on responses API

* fixes azure transformation

* TestAnthropicResponsesAPITest

* fix code check

* fix linting

* fixes for get_complete_url

* test_basic_openai_responses_streaming_delete_endpoint

* streaming fixes
2025-04-22 18:27:03 -07:00
Ishaan Jaff 653570824a Bug Fix - Responses API, Loosen restrictions on allowed environments for computer use tool (#10168)
* loosen allowed types on ComputerToolParam

* test_basic_computer_use_preview_tool_call
2025-04-19 14:40:32 -07:00
Ishaan Jaff 0717369ae6 [Feat] Expose Responses API on LiteLLM UI Test Key Page (#10166)
* add /responses API on UI

* add makeOpenAIResponsesRequest

* add makeOpenAIResponsesRequest

* fix add responses API on UI

* fix endpoint selector

* responses API render chunks on litellm chat ui

* fixes to streaming iterator

* fix render responses completed events

* fixes for MockResponsesAPIStreamingIterator

* transform_responses_api_request_to_chat_completion_request

* fix for responses API

* test_basic_openai_responses_api_streaming

* fix base responses api tests
2025-04-19 13:18:54 -07:00
Ishaan Jaff 3d5022bd79 [Feat] Support for all litellm providers on Responses API (works with Codex) - Anthropic, Bedrock API, VertexAI, Ollama (#10132)
* transform request

* basic handler for LiteLLMCompletionTransformationHandler

* complete transform litellm to responses api

* fixes to test

* fix stream=True

* fix streaming iterator

* fixes for transformation

* fixes for anthropic codex support

* fix pass response_api_optional_params

* test anthropic responses api tools

* update responses types

* working codex with litellm

* add session handler

* fixes streaming iterator

* fix handler

* add litellm codex example

* fix code quality

* test fix

* docs litellm codex

* litellm codexdoc

* docs openai codex with litellm

* docs litellm openai codex

* litellm codex

* linting fixes for transforming responses API

* fix import error

* fix responses api test

* add sync iterator support for responses api
2025-04-18 19:53:59 -07:00
Ishaan Jaff d3e04eac7f [Feat] Unified Responses API - Add Azure Responses API support (#10116)
* initial commit for azure responses api support

* update get complete url

* fixes for responses API

* working azure responses API

* working responses API

* test suite for responses API

* azure responses API test suite

* fix test with complete url

* fix test refactor

* test fix metadata checks

* fix code quality check
2025-04-17 16:47:59 -07:00
Ishaan Jaff b04cf226aa test_openai_o1_pro_response_api_streaming 2025-03-20 13:04:49 -07:00
Ishaan Jaff d915ab3f07 test_openai_o1_pro_response_api 2025-03-20 09:18:38 -07:00
Ishaan Jaff 7fee847ffc test_openai_o1_pro_incomplete_response 2025-03-20 09:14:59 -07:00
Ishaan Jaff 3b632ac825 test_async_bad_request_bad_param_error 2025-03-13 15:57:19 -07:00
Ishaan Jaff ee47016300 test_openai_responses_litellm_router_with_metadata 2025-03-12 18:55:02 -07:00
Ishaan Jaff c82ef41dc4 test_openai_responses_litellm_router_no_metadata 2025-03-12 18:18:07 -07:00
Ishaan Jaff d808fa3c23 test_openai_responses_litellm_router 2025-03-12 16:13:48 -07:00
Ishaan Jaff daea59a7b4 test openai responses streaming 2025-03-12 11:32:26 -07:00
Ishaan Jaff accdaa4a74 fix ResponseAPILoggingUtils 2025-03-12 11:12:09 -07:00
Ishaan Jaff d6351c3433 test_basic_openai_responses_api 2025-03-12 10:07:03 -07:00
Ishaan Jaff 047879c004 add aresponses 2025-03-12 09:22:44 -07:00
Ishaan Jaff 3bf2fda128 add conftest 2025-03-12 09:17:27 -07:00
Ishaan Jaff c2dbcb798f working streaming logging + cost tracking 2025-03-12 07:27:53 -07:00
Ishaan Jaff dd7ac41e33 validate_responses_match 2025-03-11 22:42:49 -07:00
Ishaan Jaff b790f0a5c6 log input of response API 2025-03-11 22:34:18 -07:00
Ishaan Jaff 51dc24a405 _transform_response_api_usage_to_chat_usage 2025-03-11 22:26:44 -07:00
Ishaan Jaff aa40cb5b26 working ResponsesAPIStreamingIterator 2025-03-11 19:47:43 -07:00
Ishaan Jaff 8da714104b ResponsesAPIStreamingResponse 2025-03-11 17:48:15 -07:00
Ishaan Jaff 52b43f672b working test_basic_openai_responses_api 2025-03-11 17:35:43 -07:00
Ishaan Jaff d6c82327e6 working import litellm.responses 2025-03-11 14:32:32 -07:00