Commit Graph

88 Commits

Author SHA1 Message Date
Ishaan Jaffer 50328d15d4 test_process_chunk_with_response_completed_event 2025-11-26 11:52:05 -08:00
Sameer Kankute 82dc0354ce Litellm sameer nov 3 stable branch (#16963)
* Add openai metadata field in the request

* Add docs related to openai metadata

* Add utils

* test_completion_openai_metadata[True]

* Added support for thought signature for gemini 3 in responses api (#16872)

* Added support for thought signature for gemini 3

* Update docs with all supported endpoints and cost tracking

* Added config based routing support for batches and files

* fix lint errors

* Litellm anthropic image url support (#16868)

* Add image as url support to anthropic

* fix mypy errors

* fix tests

* Fix: Populate spend_logs_metadata in batch and files endpoints (#16921)

* Add spend-logs-metadata to the metadata

* Add tests for spend logs metadata in batches

* use better names

* Remove support for penalty param for gemini 3 (#16907)

* Remove support for penalty param

* remove hallucinated model names

* fix mypy/test errors

* fix tests

* fix too many lines error

* fix too many lines error

* Add config for cicd test case

* Fix final tests

* fix batch tests

* fix batch tests
2025-11-22 09:35:05 -08:00
Ishaan Jaffer 1d2bdaebb6 test_openai_streaming_logging 2025-11-08 11:49:36 -08:00
Ishaan Jaffer 4fb521c251 test_basic_openai_responses_api_non_streaming_with_logging 2025-11-08 11:12:35 -08:00
Ishaan Jaffer bbcdf6f996 test_basic_openai_responses_api_non_streaming_with_logging 2025-11-08 10:47:48 -08:00
Ishaan Jaffer 0148b8d2f7 test_basic_openai_responses_api_non_streaming_with_logging 2025-11-08 10:36:47 -08:00
Ishaan Jaffer 89157f4b5c test_basic_openai_responses_api_streaming_with_logging 2025-11-08 10:30:41 -08:00
Ishaan Jaffer 6bb963dca2 test_basic_openai_responses_api_non_streaming_with_logging 2025-11-08 10:13:27 -08:00
Ishaan Jaffer f9d95b71bb fix _get_assembled_streaming_response 2025-11-08 10:08:03 -08:00
Ishaan Jaffer 61cf169926 test_basic_openai_responses_api_streaming 2025-11-06 16:18:30 -08:00
Cesar Garcia 1fec48499f fix: Pass extra_body to provider in Responses API requests (#16320)
## Problem
The `extra_body` parameter in `litellm.responses()` and `litellm.aresponses()`
was being accepted but never passed to the HTTP request sent to the LLM provider.
This prevented users from sending custom/experimental parameters to provider APIs.

## Changes
- Added `data.update(extra_body)` in `async_response_api_handler` (line 2138)
- Added `data.update(extra_body)` in `response_api_handler` (line 2012)
- Added tests to `test_openai_responses_api.py` for extra_body functionality

## Testing
- Tests verify extra_body params are passed in both sync and async modes
- Existing Responses API tests continue to pass
- Manually verified with OpenAI API that custom params are sent correctly

## Impact
Users can now pass custom/experimental parameters via extra_body:
```python
litellm.aresponses(
    model="gpt-4o",
    input="hello",
    extra_body={"custom_param": "value"}  # Now works!
)
```

This aligns with the OpenAI SDK pattern and matches behavior in other
LiteLLM endpoints (completion, embedding, etc.) that already support extra_body.
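
A minimal sketch of the merge step listed under Changes, assuming the request payload is a plain dict. `build_request_data` is a hypothetical helper used only for illustration and is not a LiteLLM function; the only part taken from this commit is the `data.update(extra_body)` call.

```python
from typing import Any, Dict, Optional


def build_request_data(
    model: str,
    input_text: str,
    extra_body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the JSON payload sent to the provider."""
    data: Dict[str, Any] = {"model": model, "input": input_text}
    if extra_body:
        # Same merge the commit describes: extra_body keys are copied into
        # the payload and override any existing key with the same name.
        data.update(extra_body)
    return data


payload = build_request_data(
    "gpt-4o", "hello", extra_body={"custom_param": "value"}
)
print(payload)
```

Because `dict.update` overwrites existing keys, a key in `extra_body` that collides with a standard field replaces it in this sketch.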
2025-11-06 14:54:40 -08:00
Ishaan Jaffer 4a83ae0695 test_aresponses_service_tier_and_safety_identifier 2025-11-04 18:05:30 -08:00
Cesar Garcia 78ed5126a5 fix: Fix Responses API streaming tests usage field names and cost (#16236)
This commit fixes two bugs in Responses API streaming tests:

1. **Usage field naming bug**: Tests were using `input_tokens` and
   `output_tokens` but the Usage object uses `prompt_tokens` and
   `completion_tokens`.

2. **Missing cost in streaming usage**: When `include_cost_in_streaming_usage`
   was enabled, the cost was calculated and added to ResponseAPIUsage, but was
   lost during the transformation to the Usage object.

Changes:
- Updated test assertions to use correct field names (prompt_tokens, completion_tokens)
- Added cost preservation logic in FakeStreamerResponsesAPIIterator
- Modified _transform_response_api_usage_to_chat_usage() to preserve cost attribute

All streaming tests now pass successfully.
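
A rough sketch of the two fixes above, assuming simplified usage shapes. `SketchResponsesUsage`, `SketchChatUsage`, and `to_chat_usage` are hypothetical stand-ins, not the LiteLLM classes or the actual `_transform_response_api_usage_to_chat_usage()` implementation; only the field names and the idea of carrying `cost` across come from this commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchResponsesUsage:
    """Hypothetical stand-in for the Responses API usage shape."""
    input_tokens: int
    output_tokens: int
    total_tokens: int
    cost: Optional[float] = None


@dataclass
class SketchChatUsage:
    """Hypothetical stand-in for the chat-style Usage object."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost: Optional[float] = None


def to_chat_usage(usage: SketchResponsesUsage) -> SketchChatUsage:
    """Rename the token fields and keep the cost if one was computed."""
    chat_usage = SketchChatUsage(
        prompt_tokens=usage.input_tokens,
        completion_tokens=usage.output_tokens,
        total_tokens=usage.total_tokens,
    )
    if usage.cost is not None:
        # Without this step the cost would be dropped by the conversion.
        chat_usage.cost = usage.cost
    return chat_usage


converted = to_chat_usage(SketchResponsesUsage(10, 5, 15, cost=0.0003))
print(converted)
```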
2025-11-04 15:57:59 -08:00
Krish Dholakia 74ae7aed44 build: Squashed commit of the following: (#16176)
commit bb0b050fb01633d83c1c2932f8e9c11432911847
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Sat Nov 1 20:00:01 2025 -0700

    test: update tests

commit b2da4bdac23868e69a9452805b231f8830e49912
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Oct 22 14:58:01 2025 -0700

    fix(langfuse_otel_attributes.py): log tools and other optional params

commit 75bee1f2748f32b230467de0b085c55bf1d687a9
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Oct 22 14:42:05 2025 -0700

    feat(langfuse_otel/): working request/response logging on spans

    Closes https://github.com/BerriAI/litellm/issues/13764

commit a3e4fa5b81e82f71c74fb9e7dc859c6cb40495f5
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Oct 22 14:20:39 2025 -0700

    fix: initial commit fixing langfuse request/response logging with OTEL

commit 09fc9deac844004104822810e42975cd9c68f0e3
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Oct 22 13:33:52 2025 -0700

    fix(litellm_logging.py): for responses api - return a unified usage object for logging

    ensures logging integrations all pull the right usage information
2025-11-02 09:46:40 -08:00
Sameer Kankute f804ab6de5 Add LLM provider response headers to Responses API (#16091)
* Add llm headers to responses api

* fix mock test
2025-11-01 13:25:56 -07:00
Ishaan Jaffer 33371d18f4 test fix claude-sonnet-4-5-20250929 2025-10-28 19:05:13 -07:00
Ishaan Jaffer 1b49dba1dd fix claude-sonnet-4-5 2025-10-28 17:37:08 -07:00
Ishaan Jaffer a1d3790198 TestAzureResponsesAPITest 2025-10-25 16:22:52 -07:00
Ishaan Jaffer f0ae2bef4f TestAzureResponsesAPITest 2025-10-25 16:09:04 -07:00
Ishaan Jaffer 667f2613de TestAzureOpenAIVectorStore 2025-10-25 14:06:28 -07:00
Ishaan Jaffer eb425550a2 test responses API fixes 2025-10-25 10:58:39 -07:00
Ishaan Jaff d91efa7a7b [Bug Fix]: ErrorEvent ValidationError when OpenAI Responses API returns nested error structure (#15804)
* add ErrorEventError nested field

* test_openai_responses_api_token_limit_error

* test_openai_responses_api_token_limit_error
2025-10-22 14:18:46 -07:00
Sameer Kankute 44495c0117 fix encrypted content error (#15782) 2025-10-21 23:29:48 -07:00
Ishaan Jaffer 97626f2d02 fix test 2025-10-11 11:30:19 -07:00
Ishaan Jaffer ce6102d54f test_azure_responses_api_status_error 2025-10-11 10:41:15 -07:00
Ishaan Jaffer 86881a8fc1 test_azure_responses_api_status_error 2025-10-11 10:21:07 -07:00
Krrish Dholakia 5336fcc000 fix(azure/responses): always remove status
unsupported parameter
2025-10-06 18:08:57 -07:00
Ishaan Jaff 708c0bd78d [Feat] Return Cost for Responses API Streaming requests (#15053)
* test_basic_openai_responses_api_streaming

* _transform_chat_completion_usage_to_responses_usage

* ResponseAPIUsage.cost

* test fixes for anthropic cost with /responses

* fix mypy typing
2025-09-29 19:47:04 -07:00
Alexsander Hamir eaa04cd8ce fix: use fastuuid helper (#14903)
* fix: use fastuuid helper across the codebase

First batch of changes, simple drop in replacement.

* second batch of changes

* fixed: script mistake on helper file
2025-09-25 15:47:01 -07:00
Ishaan Jaff 8e22cf5d65 [Fix] /responses API - add cancel endpoint + allow non-admins to use this as an llm api endpoint (#14594)
* fix: ensure /responses/cancel works for non admins

* test: cancel endpoint

* fix responses API cancel endpoint

* test fix

* TestGoogleAIStudioResponsesAPITest
2025-09-15 18:49:54 -07:00
Sameer Kankute 110ce543c2 [Feat]Add cancel endpoint support for openai and azure (#14561)
* Add cancel endpoint support for openai and azure

* fix lint error

* fix cancel url construction azure

* readd changes
2025-09-15 07:08:56 -07:00
Sameer Kankute ad9f54a192 Move test to test_litellm/ folder 2025-09-05 10:08:34 +05:30
Krish Dholakia 94c1b21ae7 Merge branch 'main' into litellm_responses_structured_output 2025-09-04 12:23:26 -07:00
Sameer Kankute fc9560573b [BUG] Fix response api for reasoning item in input for litellm proxy (#14200)
* fix response api for litellm proxy

* Add test for checking if status is getting removed

* add test in correct file

* remove hardcoded fields

* Make the handling simpler

* fix lint error:
2025-09-04 10:36:48 -07:00
Sameer Kankute 8d67392e99 Merge branch 'main' into litellm_responses_structured_output 2025-09-04 22:35:30 +05:30
Sameer Kankute 5f79e8aac6 Litellm passthrough cost tracking chat completion (#14256)
* feat: add structured output for sdk

* Add support for cost tracking for chat completion in passthrough

* remove not required changes
2025-09-04 09:57:48 -07:00
Sameer Kankute 8e363fe78c feat: add structured output for sdk 2025-09-03 17:58:55 +05:30
Ishaan Jaff 37d885e46a [Fix] LiteLLM does not support new web_search tool (Responses API) (#14083)
* test_basic_openai_responses_with_websearch

* fix: ResponsesAPIResponse

* fix: StandardBuiltInToolCostTracking
2025-08-29 18:33:52 -07:00
Ishaan Jaff b9132968b2 [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS (#13905)
* [Performance] Reduce Significant CPU overhead from litellm_logging.py (#13895)

* fix: litellm.configured_cold_storage_logger

* fix Session Management - Non-OpenAI Models docs

* ruff fix

* test fix

* create LoggingWorker

* add GLOBAL_LOGGING_WORKER for async task handling

* fix logging tests

* add conftest

* fix conftest

* test fix location of encode bedrock runtime modelid arn

* fix conftest.py

* tuning LoggingWorker

* conftest.py

* fix conftest batches/

* test_async_chat_azure

* event_loop

* test_bedrock_streaming_passthrough_test2

* fix GLOBAL_LOGGING_WORKER

* logging worker

* add flush for global logging worker

* Revert "fix GLOBAL_LOGGING_WORKER"

This reverts commit d254f508f48935652f054777652938ad71976cce.

* fix conftest clear_queue

* fix conftest clear_queue

* setup_and_teardown for llm translation

* docs AWS_REGION

* test_async_chat_azure

* change test DIR

* run ci/cd again

* use 1 job for litellm_router_unit_testing

* fix space

* fix litellm_router_unit_testing

* test_aaarouter_dynamic_cooldown_message_retry_time

* litellm_router_unit_testing

* conftest.py clearing qu

* fixes litellm_router_unit_testing

* fixes clear_queue

* fix router_unit_tests

* remove conftest

* add back conftest for router

* fix event loop test

* test fix

* fixes for LoggingWorker

* ruff fix
2025-08-23 13:13:23 -07:00
Krrish Dholakia cbb161f10b test: handle internal server errors 2025-08-23 11:00:18 -07:00
Ishaan Jaff 76f1064229 [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 (#13728)
* fix imports OpenAI SDK

* ResponseText fixes

* fixes ResponseText

* fix imports

* catch AttributeError

* fix import

* use openai==1.100.1

* fix build from PIP

* fix lint test

* Print OpenAI version

* fix Install dependencies
2025-08-18 18:26:17 -07:00
Ishaan Jaff 1cd827874f [Bug Fix] - Allow using reasoning_effort for gpt-5 model family and reasoning for Responses API (#13475)
* test_openai_gpt5_reasoning

* test_openai_gpt5_reasoning_effort_parameter

* add OpenAIGPT5ResponsesAPIConfig

* test_openai_gpt5_reasoning_effort_parameter

* fixes
2025-08-10 09:55:36 -07:00
Ishaan Jaff 825ea65b96 [Bug Fix] Responses API - Responses API failed if input containing ResponseReasoningItem (#13465)
* add test_responses_api_multi_turn_with_reasoning_and_structured_output

* fix transform_responses_api_request
2025-08-09 11:20:34 -07:00
Ishaan Jaff 9761ba7c7a [Bug Fix] Responses api session management for streaming responses (#13396)
* fix proxy config

* fix(responses api): fix streaming ID consistency and tool format handling (#12640)

* fix(responses): ensure streaming chunk IDs use consistent encoding format

Fixes streaming ID inconsistency where streaming responses used raw provider IDs
while non-streaming responses used properly encoded IDs with provider context.

Changes:
- Updated LiteLLMCompletionStreamingIterator to accept provider context
- Added _encode_chunk_id() method using same logic as non-streaming responses
- Modified chunk transformation to encode all streaming item_ids with resp_ prefix
- Updated handlers to pass custom_llm_provider and litellm_metadata to streaming iterator

Impact:
- Streaming chunk IDs now format: resp_<base64_encoded_provider_context>
- Enables session continuity when using streaming response IDs as previous_response_id
- Allows provider detection and load balancing with streaming responses
- Maintains backward compatibility with existing streaming functionality

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(types): add explicit Optional[str] type annotation for model_id

This resolves MyPy type checking error where model_id could be None
but wasn't explicitly typed as Optional[str].

* fix(types): handle None case for litellm_metadata access

Prevents 'Item None has no attribute get' error by checking for None
before accessing litellm_metadata dictionary.

* test: add comprehensive tests for streaming ID consistency

Adds unit and E2E tests to verify streaming chunk IDs are properly encoded
with consistent format across streaming responses.

## Tests Added

### Unit Test (test_reasoning_content_transformation.py)
- `test_streaming_chunk_id_encoding()`: Validates the `_encode_chunk_id()` method
  correctly encodes chunk IDs with `resp_` prefix and provider context

### E2E Tests (test_e2e_openai_responses_api.py)
- `test_streaming_id_consistency_across_chunks()`: Tests that all streaming chunk IDs
  are properly encoded across multiple chunks in a real streaming response
- `test_streaming_response_id_as_previous_response_id()`: Tests the core use case -
  using streaming response IDs for session continuity with `previous_response_id`

## Key Testing Approach
- Uses **Gemini** (non-OpenAI model) to test the transformation logic rather than
  OpenAI passthrough, since the streaming ID consistency issue occurs when LiteLLM
  transforms responses rather than just passing through to native OpenAI responses API
- Tests validate that streaming chunk IDs now use same encoding as non-streaming responses
- Verifies session continuity works with streaming responses

Addresses @ishaan-jaff's request for unit tests covering the streaming ID consistency fix.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(lint): remove unused imports in transformation.py

Removes unused imports to fix CI linting errors:
- GenericResponseOutputItem
- OutputFunctionToolCall

* test: remove E2E tests from openai_endpoints_tests

Remove streaming ID consistency E2E tests as requested by @ishaan-jaff.
Keep only the mock/unit test in test_reasoning_content_transformation.py

* revert: remove streaming chunk ID encoding to original behavior

This reverts the streaming chunk ID encoding changes to understand the original issue better.
Original behavior was:
- Streaming chunks: raw provider IDs
- Streaming final response: raw IDs (PROBLEM!)
- Non-streaming final response: encoded IDs (correct)

The real issue: streaming final response IDs were not encoded, breaking session continuity.

* fix(responses): encode streaming final response IDs to match OpenAI behavior

Fixes streaming ID inconsistency to match OpenAI's Responses API behavior:
- Streaming chunks: raw message IDs (like OpenAI's msg_xxx)
- Final response: encoded IDs (like OpenAI's resp_xxx)

This enables session continuity by ensuring streaming final response IDs
have the same encoded format as non-streaming responses, allowing them
to be used as previous_response_id in follow-up requests.

Changes:
- Add custom_llm_provider and litellm_metadata to LiteLLMCompletionStreamingIterator
- Update handlers to pass provider context to streaming iterator
- Apply _update_responses_api_response_id_with_model_id to final streaming response
- Keep streaming chunks as raw IDs to match OpenAI format

Impact:
- Session continuity works with streaming responses
- Load balancing can detect provider from streaming final response IDs
- Format matches OpenAI's Responses API exactly
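
A minimal sketch of how a final response ID could carry provider context, assuming a `key:value;key:value` string that is base64-encoded behind a `resp_` prefix. The field layout and the helpers `encode_response_id` and `decode_response_id` are hypothetical and do not reproduce `_update_responses_api_response_id_with_model_id`; the sketch also assumes the raw ID contains no `;` character.

```python
import base64
from typing import Dict

RESPONSE_ID_PREFIX = "resp_"


def encode_response_id(provider: str, model_id: str, raw_id: str) -> str:
    """Pack provider context and the raw provider ID into one opaque ID."""
    context = f"provider:{provider};model_id:{model_id};response_id:{raw_id}"
    encoded = base64.urlsafe_b64encode(context.encode("utf-8")).decode("ascii")
    return RESPONSE_ID_PREFIX + encoded


def decode_response_id(encoded_id: str) -> Dict[str, str]:
    """Recover the context so a follow-up request can be routed."""
    if not encoded_id.startswith(RESPONSE_ID_PREFIX):
        raise ValueError("not an encoded response ID")
    payload = base64.urlsafe_b64decode(encoded_id[len(RESPONSE_ID_PREFIX):])
    fields = payload.decode("utf-8").split(";")
    # Split each field on the first colon only, so values may contain colons.
    return dict(field.split(":", 1) for field in fields)


final_id = encode_response_id("gemini", "model-1", "msg_abc123")
print(decode_response_id(final_id))
```

Under this assumption, a caller would pass `final_id` as `previous_response_id`, and the decoded provider and model ID would tell the router where the session lives.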

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test: update unit test to match correct OpenAI-compatible behavior

Updates the unit test to verify streaming chunk IDs are raw (not encoded)
to match OpenAI's responses API format:
- Streaming chunks: raw message IDs (like msg_xxx)
- Final response: encoded IDs (like resp_xxx)

This reflects the correct behavior implemented in the fix.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* cleanup

* TestBaseResponsesAPIStreamingIterator

---------

Co-authored-by: Javier de la Torre <jatorre@carto.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-07 20:13:24 -07:00
Ishaan Jaff f3749709b8 Bug Fix - Responses API raises error with Gemini Tool Calls in input (#13260)
* add _transform_responses_api_function_call_to_chat_completion_message

* test_responses_api_with_tool_calls

* TestFunctionCallTransformation

* fixes for responses API testing google ai studio

* TestGoogleAIStudioResponsesAPITest

* test_responses_api_with_tool_calls

* test_responses_api_with_tool_calls

* test_basic_openai_responses_streaming_delete_endpoint
2025-08-04 12:01:33 -07:00
Ishaan Jaff dae72003a7 [Bug Fix] OpenAI / Azure Responses API - Add service_tier , safety_identifier supported params (#13258)
* test_aresponses_service_tier_and_safety_identifier

* add service_tier + safety_identifier

* fix get_supported_openai_params

* add safety_identifier + service_tier for responses()
2025-08-04 10:51:53 -07:00
Jugal D. Bhatt eb8a338d9b [MCP Guardrails] move pre and during hooks to ProxyLoggin (#13109)
* move pre and during hooks to ProxyLoggin

* fix lint

* fix ruff

* fix tests
2025-07-30 13:58:41 -07:00
Ishaan Jaff 66a139a86a test_basic_openai_responses_api_streaming 2025-07-19 15:30:03 -07:00
Ishaan Jaff 274baac9df test_mcp_tools_with_responses_api 2025-07-03 14:53:30 -07:00
Ishaan Jaff 03a589d323 fix - MCP deepwiki mcp is unstable, move to stable mcp 2025-07-03 14:24:32 -07:00