24 Commits

Author SHA1 Message Date
yuneng-jiang 8ca744036a [Fix] Malformed messages returning 500 instead of 400
The existing AttributeError detection in proxy error handling only
checked one level deep in the exception chain (__cause__, __context__,
original_exception). In practice, the AttributeError from malformed
messages gets wrapped in multiple layers (AttributeError ->
OpenAIException -> APIConnectionError), so the check never found it.

Extracted the check into _has_attribute_error_in_chain() which walks
the full exception chain recursively (depth-capped at 10 to prevent
infinite loops from circular references).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 23:01:25 -07:00
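
The chain walk described in commit 8ca744036a can be sketched roughly as follows. This is a minimal illustration, not the project's actual `_has_attribute_error_in_chain()`: the function name, the `WrapperError` class, and the constant name are hypothetical, and only the three link attributes and the depth cap of 10 come from the commit message.

```python
from typing import Optional

# Depth cap of 10 is taken from the commit message; the constant name is an assumption.
MAX_CHAIN_DEPTH = 10

# The three link attributes named in the commit message.
LINK_ATTRIBUTES = ("__cause__", "__context__", "original_exception")


def has_attribute_error_in_chain(exc: Optional[BaseException], depth: int = 0) -> bool:
    """Return True if an AttributeError appears anywhere in the exception chain."""
    if exc is None or depth > MAX_CHAIN_DEPTH:
        # Stop at the end of a chain, or when the cap is hit (e.g. circular links).
        return False
    if isinstance(exc, AttributeError):
        return True
    for attr in LINK_ATTRIBUTES:
        linked = getattr(exc, attr, None)
        if isinstance(linked, BaseException) and has_attribute_error_in_chain(
            linked, depth + 1
        ):
            return True
    return False


class WrapperError(Exception):
    """Hypothetical wrapper that keeps the wrapped error on original_exception."""

    def __init__(self, message: str, original_exception: Optional[BaseException] = None):
        super().__init__(message)
        self.original_exception = original_exception


# Three layers, mirroring AttributeError -> wrapper -> wrapper from the message.
inner = AttributeError("'str' object has no attribute 'get'")
middle = WrapperError("provider error", original_exception=inner)
outer = WrapperError("connection error")
outer.__cause__ = middle

found = has_attribute_error_in_chain(outer)
```

A check that inspects only `outer.__cause__` would see `middle` and stop there, which matches the one-level-deep behaviour the commit describes; walking every link finds `inner` two levels down.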
Sameer Kankute 36ec80d90c Fix azure model router 2026-03-12 12:40:37 +05:30
Sameer Kankute 5b83aae715 feat(azure_ai): show actual model used in Azure Model Router response
- Azure Model Router transform_response: let parent extract actual model from raw response
- common_request_processing: skip model override for Azure Model Router requests
- proxy_server: skip streaming chunk model restamp for Azure Model Router
- Add _is_azure_model_router_request helper
- Add tests for non-streaming and streaming

Made-with: Cursor
2026-03-12 11:41:19 +05:30
Ishaan Jaff 7befe3c78f feat(proxy): add key_alias, key_hash, requested_model DD APM span tags (#22710)
* feat(proxy): add key_alias, key_hash, requested_model tags to DD APM spans

* refactor(proxy): consolidate DD APM tag helpers into DDSpanTagger class

* refactor(proxy): move DDSpanTagger to its own file litellm/proxy/dd_span_tagger.py
2026-03-03 20:22:59 -08:00
Ishaan Jaff 9546d9b482 _add_dd_apm_tags_for_litellm_call_id (#22219) 2026-02-26 16:42:23 -08:00
Ishaan Jaff c343bfffda fix(router): emit x-litellm-overhead-duration-ms header for streaming requests (#22027)
* fix(router): preserve _hidden_params in FallbackStreamWrapper so x-litellm-overhead-duration-ms is emitted for streaming requests

* test(router): add regression test for FallbackStreamWrapper _hidden_params preservation
2026-02-24 11:56:16 -08:00
Harshit Jain 9fc3c77c42 fix: ensure arrival_time is set before calculating queue time 2026-02-23 17:04:47 +05:30
yuneng-jiang fd3ca081cc use cached keys and teams for router settings 2026-02-06 15:07:29 -08:00
yuneng-jiang 400e560ee5 Merge remote-tracking branch 'origin' into litellm_router_search_fix 2026-02-06 14:08:55 -08:00
Ishaan Jaffer 35e29c2bcd Revert "Merge pull request #18790 from BerriAI/litellm_key_team_routing_3"
This reverts commit ae26d8e68a, reversing
changes made to 864e8c6543.
2026-01-31 17:58:46 -08:00
yuneng-jiang a9eae5937f Override router settings 2026-01-31 16:04:52 -08:00
yuneng-jiang c9261c9f37 fix model name during fallback 2026-01-31 11:46:58 -08:00
Sameer Kankute 844c766c65 Merge pull request #18763 from BerriAI/litellm_staging_01_07_2026
Staging - 01/07/2026
2026-01-09 17:01:58 +05:30
yuneng-jiang 51759424a6 Key and Team Routing Setting 2026-01-07 17:17:30 -08:00
Kris Xia 91b5c66cf2 fix(proxy): return json error response instead of sse format for initial streaming errors (#18757)
* adding signoz integration to observability docs

* Fixing build

* Adding timeout for flaky test

* Fixing e2e

* fix(proxy): return json error response instead of sse format for initial streaming errors

when the first chunk of a streaming response contains an error,
return a standard json error response instead of sse format.
this ensures clients receive properly formatted error responses
before the stream actually begins.

- rename create_streaming_response to create_response
- add logic to detect error in first chunk and return JSONResponse
- add _extract_error_from_sse_chunk helper function
- update all call sites to use the new function name
- update tests to reflect the function rename

* test(proxy): add comprehensive tests for error extraction from sse chunks

- Add new test class TestExtractErrorFromSSEChunk with 10 test cases
- Update existing tests to verify JSONResponse returned for initial streaming errors
- Add tests for error code as string, bytes input, invalid JSON, and edge cases
- Verify correct error format extraction from SSE chunks

---------

Co-authored-by: Goutham Karthi <goutham@signoz.io>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: YutaSaito <36355491+uc4w6c@users.noreply.github.com>
2026-01-07 21:26:47 +05:30
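
The first-chunk error detection described in commit 91b5c66cf2 might look something like the sketch below. It is an assumption-laden illustration rather than the project's `_extract_error_from_sse_chunk`: the function name, the `data:` line handling, and the expected `{"error": {...}}` payload shape are all hypothetical. The cases it covers (bytes input, invalid JSON, string error codes) are the ones the commit's test list mentions.

```python
import json
from typing import Any, Dict, Optional, Union


def extract_error_from_sse_chunk(chunk: Union[str, bytes]) -> Optional[Dict[str, Any]]:
    """Return the error object carried by an SSE chunk, or None if there is none."""
    if isinstance(chunk, bytes):
        chunk = chunk.decode("utf-8", errors="replace")

    for line in chunk.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore comments, event names, and blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue  # end-of-stream sentinel carries no JSON
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON, so it cannot be a structured error
        # Assumed shape: {"error": {"message": ..., "code": ...}}
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            return parsed["error"]
    return None


first_chunk = b'data: {"error": {"message": "invalid request", "code": "400"}}\n\n'
error = extract_error_from_sse_chunk(first_chunk)
```

A caller could inspect only the first chunk this way: if `error` is not `None`, it can return a plain JSON error response with the matching status code instead of opening the event stream.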
Ishaan Jaff 1123cfa928 [Feat] AI Gateway - Add support for Platform Fee / Margins (#18427)
* init cost_margin_config

* feat: add cost margin

* init types

* LITELLM_SETTINGS_SAFE_DB_OVERRIDES

* feat _apply_cost_margin

* ui endpoint

* ui provider margins

* add margin

* refactored ui

* test cost margins

* refactored ui

* provider discounts

* add cost_breakdown to spendLogs

* add CostBreakdownViewer

* fix cost breakdown

* docs fix

* doc margins

* docs margins
2025-12-25 11:07:27 +05:30
Sameer Kankute caaf8a6784 Fix x-litellm-key-spend update 2025-12-12 11:44:51 +05:30
Krish Dholakia 1eb06f8031 Revert "fix: respect guardrail mock_response during during_call to return blo…" (#17332)
This reverts commit 6de6107673.
2025-12-01 15:40:28 -08:00
YutaSaito 6de6107673 fix: respect guardrail mock_response during during_call to return blocked output (#17247) 2025-12-01 09:59:01 -08:00
Ishaan Jaff a6c57cb5bd [Feat] Cost Tracking - specify a global vendor discount for costs. (#15546)
* fix cost_discount_config

* add CostBreakdown

* fix: set_cost_breakdown

* test_cost_discount_vertex_ai

* docs fix

* docs fix discounts

* docs fix

* docs custom pricing

* docs fix

* fixes for getting cost breakdown in response headers

* test - response headers with discount
2025-10-14 20:07:04 -07:00
Alexsander Hamir eaa04cd8ce fix: use fastuuid helper (#14903)
* fix: use fastuuid helper across the codebase

First batch of changes, simple drop in replacement.

* second batch of changes

* fixed: script mistake on helper file
2025-09-25 15:47:01 -07:00
Ishaan Jaff 98d57b5d27 [Feat] Allow using x-litellm-stream-timeout header for stream timeout in requests (#14147)
* fix: allow passing stream_timeout header

* fix: _get_stream_timeout_from_request

* test_add_litellm_data_to_request_with_stream_timeout_header

* docs: LiteLLM Headers

* test_add_litellm_data_to_request_with_stream_timeout_header
2025-09-01 15:59:14 -07:00
Ishaan Jaff 8a4b163453 [Feat] DD Trace - Add instrumentation for streaming chunks (#11338)
* fix: add tracing for litellm.completion

* fix: NULL span add trace

* fix: add tracing for litellm.completion streaming

* fix: add tracing for litellm.completion streaming

* fix: use a constant for str
2025-06-02 16:48:39 -07:00
Krish Dholakia ef42461c1e Litellm fix GitHub action testing (#11163)
* test: add __init__.py files

* refactor: rename test folder to avoid naming conflict

* test: update workflows

* test: update tests

* test: update imports

* test: update tests

* test: remove unused import

* ci(test-litellm.yml): add pytest retry to github workflow

* test: fix test
2025-05-26 14:41:42 -07:00