Commit Graph

683 Commits

Author SHA1 Message Date
Jugal D. Bhatt 36229dc69f [LLM Translation] Fix Model Usage not having text tokens (#13234)
* fix + test

* remove test comments

* fix mypy

* fix mypy

* fix tests
2025-08-04 21:06:49 +05:30
Ishaan Jaff 44900e781a testing fixes - vertex ai deprecated claude 3 sonnet models 2025-08-01 21:23:52 -07:00
Ishaan Jaff 9d6098e8cc fix vertex deprecated old model 2025-08-01 16:46:16 -07:00
Krish Dholakia 78997c2e35 Anthropic - working mid-stream fallbacks (#13149)
* fix(router.py): add acompletion_streaming_iterator inside router

allows router to catch errors mid-stream for fallbacks

Work for https://github.com/BerriAI/litellm/issues/6532

* fix(router.py): working mid-stream fallbacks

* fix(router.py): more iterations

* fix(router.py): working mid-stream fallbacks with fallbacks set on router

* fix(router.py): pass prior content back in new request as assistant prefix message

* fix(router.py): add a system prompt to help guide non-prefix supporting models to use the continued text correctly

* fix(common_utils.py): support converting `prefix: true` for non-prefix supporting models

* fix: reduce LOC in function

* test(test_router.py): add unit tests for new function

* test: add basic unit test

* fix(router.py): ensure return type of fallback stream is compatible with CustomStreamWrapper

prevent client code from breaking

* fix: cleanup

* test: update test

* fix: fix linting error
2025-07-31 21:22:49 -07:00
Jugal D. Bhatt 5db4862cbf [MCP Gateway] Litellm mcp client list fail (#13114)
* fix headers

* fix test

* fix ruff

* added try except for catching errors which lead to client failures

* fix mypy

* fix ruff

* fix tests

* fix python error

* fix test

* fix test

* fixed the MCP Call Tool result
2025-07-30 15:23:19 -07:00
Krrish Dholakia ae947e63ce test: update test 2025-07-29 22:07:07 -07:00
Krrish Dholakia 378db1b62d test: remove o1-preview 2025-07-28 17:47:57 -07:00
Krish Dholakia 1737cf4257 VertexAI - camelcase optional params for image generation + Anthropic - streaming, always ensure assistant role set on only first chunk (#12889)
* fix(vertex_ai/image_generation): transform `_` param to camelcase

Fixes https://github.com/BerriAI/litellm/issues/12690

* test(test_vertex_image_generation.py): add unit tests

* fix(streaming_handler.py): assert only 1 assistant chunk in stream

Fixes https://github.com/BerriAI/litellm/issues/12616

* fix(streaming_handler.py): fix check
2025-07-27 10:09:43 -07:00
Ishaan Jaff 2c38dc0de7 test_router_auto_router 2025-07-26 13:33:53 -07:00
sings-to-bees-on-wednesdays eb96fb78bc fix(auth_utils): make header comparison case-insensitive (#12950)
If the user specified in the configuration e.g. "user_header_name:
X-OpenWebUI-User-Email", here we were looking for a dict key
"X-OpenWebUI-User-Email" when the dict actually contained
"x-openwebui-user-email".

Switch to iteration and case insensitive string comparison instead to
fix this.

This fixes customer budget enforcement when the customer ID is passed
in as a header rather than as a "user" value in the body.
2025-07-24 22:06:12 -07:00
Ishaan Jaff b8e404dd95 [Feat] Backend Router - Add Auto-Router powered by semantic-router (#12955)
* add router.json

* test_router_auto_router

* async_pre_routing_hook

* fixes for auto router

* add async_pre_routing_hook

* add LiteLLMRouterEncoder

* update test auto_router_embedding_model

* add auto_router_embedding_model

* add AutoRouter

* fix async_pre_routing_hook

* update async_pre_routing_hook

* fix auto router

* fix router.json

* working router init

* working embedding encoder

* working auto router

* test_router_auto_router

* test auto router

* add semantic-router as optional for litellm

* add extras

* semantic_router==0.1.10

* ruff fix

* use aiohttp==3.10.11

* python-dotenv==1.0.1

* test auto router

* test_router_auto_router

* semantic_router

* test_is_auto_router_deployment

* fix check

* fix docker build step

* add semantic_router

* Revert "add semantic_router"

This reverts commit 537b67288798731a119d811f643b682086377ee9.
2025-07-24 18:32:56 -07:00
Ishaan Jaff 99031bf8b6 ci/cd new release 2025-07-23 13:50:36 -07:00
Ishaan Jaff 461cd0c30a test_completion_cost_deepseek 2025-07-23 13:16:12 -07:00
Ishaan Jaff 79a0841719 test_router_content_policy_fallbacks 2025-07-23 13:04:28 -07:00
Ishaan Jaff 642cfa26b0 remove deprecated 2025-07-22 20:59:34 -07:00
Ishaan Jaff bf300f8ca7 Revert "Litellm dev 07 21 2025 p1 (#12848)"
This reverts commit e4e10aa4ed.
2025-07-22 18:28:36 -07:00
Ishaan Jaff 1910cf8496 test fix vertex ai 2025-07-22 18:06:38 -07:00
Krish Dholakia e4e10aa4ed Litellm dev 07 21 2025 p1 (#12848)
* fix(main.py): fix async retryer

Fixes https://github.com/BerriAI/litellm/issues/12830

* fix(forward_clientside_headers_by_model_group.py): filter out 'content-type' from forwardable headers

clientside content-type != proxy content type, can cause requests to hang

* test(tests/): update tests
2025-07-21 22:09:39 -07:00
Ishaan Jaff 49d40a1c3d test_router_provider_wildcard_routing 2025-07-21 21:33:40 -07:00
Tomáš Dvořák 270e3d75db fix(watsonx): use correct parameter name for tool choice (#9980)
Closes BerriAI/litellm#9979
2025-07-21 19:01:10 -07:00
Ishaan Jaff 4a7b9dee5f test fix - anthropic deprecated claude 2 2025-07-21 18:22:39 -07:00
Ishaan Jaff 7bd5ce595d test_provider_budgets_e2e_test_expect_to_fail 2025-07-19 16:00:25 -07:00
Ishaan Jaff 48dede9367 test_redis_proxy_batch_redis_get_cache 2025-07-19 15:58:25 -07:00
Krish Dholakia ab09d0621d Litellm gemini grounding metadata stream (#12673)
* fix(prompt_templates/factory.py): handle anthropic cache control on individual tool results

Fixes issue where cache control on individual tool result was being ignored

* test(test_vertex_And_google_ai_studio_gemini.py): initial unit test covering translation for grounding metadata on streaming chunk

* fix(vertex_and_google_ai_studio.py): ensure grounding metadata is preserved on streaming

Closes https://github.com/BerriAI/litellm/issues/10237

* fix(core_helpers.py): include usage in expected openai keys
2025-07-19 11:52:12 -07:00
Jugal D. Bhatt be60d12ff7 [LLM Translation - Redis] fix: redis caching for embedding response models (#12750)
* fix: redis caching for embedding responses

* add helper

* add mypy fixes

* lint fix

* review changes

* remove file

* fix ruff

* add if check

* add if check
2025-07-18 16:31:10 -07:00
Jugal D. Bhatt a46b9d376f [Prometheus] Move Prometheus to enterprise folder (#12659)
* fix tools fetch for keys

* add promethues to enterprise

* remove old prom

* remove old prom

* fix tests

* safe imports

* add if

* fix enterprise test

* rename imports

* added label import

* added label import

* move tests to enterprise

* fix tests

* add log

* build: update versions

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-07-18 11:54:47 -07:00
Ishaan Jaff d9943f9812 fix cohere InternalServerError error mapping 2025-07-16 16:13:34 -07:00
Krrish Dholakia 8a1a90bc7a test: update test 2025-07-16 10:28:08 -07:00
Krrish Dholakia 9cac629ca6 test: update test 2025-07-16 09:13:16 -07:00
Krrish Dholakia 85fe1d35e1 test: update test, remove old gemini models 2025-07-15 22:31:49 -07:00
Ishaan Jaff 84261f3ac8 test_create_delete_assistants 2025-07-15 21:35:25 -07:00
Ishaan Jaff f0e87d4eb0 test test_redis_caching_multiple_namespaces (#12552) 2025-07-12 12:06:16 -07:00
Ishaan Jaff 7eb1a68854 fix test_qdrant_semantic_cache_acompletion 2025-07-11 22:07:19 -07:00
Ishaan Jaff 7ac187b269 test_pre_call_hook_team_rpm_limits 2025-07-11 15:52:21 -07:00
Ishaan Jaff 47ba271d58 test_http_parsing_utils.py 2025-07-10 18:20:41 -07:00
Krish Dholakia bda9eecd45 Litellm dev 07 05 2025 p3 (#12349)
* refactor(aim.py): refactor to support adding aim guardrails on UI

* fix(base.py): add ui_friendly_name to config model

* feat(ui/): support loading new guardrails from backend api call

removes need to onboard each guardrail to ui

* fix: don't show optional params if not set and don't show ui_friendly_name (internal param0

* fix(ui/add_guardrail_form.tsx): ensure dynamic provider value is used

* fix(ui/): just one-time update the provider map dictionary

* fix(ui/): show masked api base / api key on guardrail update

* refactor(aporia_ai/): refactor to show on UI

* feat(aporia_ai/): add aporia ai guardrail to UI

* refactor(guardrails_ai/): refactor to add via UI

* refactor(lasso.py): refactor to enable adding lasso guardrails via UI

* feat(pangea.py): add pangea guardrail on UI

* feat(panw): add panw prisma airs through UI

* test: update tests

* fix: fix ruff linting error

* test: update tests

* fix: add missing docs

* fix: fix guardrail init

* fix: suppress linting errors

* fix(proxy_server.py): fix linting error
2025-07-05 18:44:00 -07:00
Krish Dholakia 380eb31103 feat(vertex_ai/): add new deepseek-ai api service (#12312)
* feat(vertex_ai/): add new deepseek-ai api service

Closes https://github.com/BerriAI/litellm/issues/12192

* test: cleanup test
2025-07-05 10:38:37 -07:00
Ishaan Jaff 19e26a5c60 test_default_api_base 2025-07-04 18:26:54 -07:00
Ishaan Jaff 59f3771799 test_text_completion_stream - hf 2025-07-03 16:00:51 -07:00
Ishaan Jaff 437f4765b4 test_completion_mistral_api_mistral_large_function_call_with_streaming 2025-07-03 14:58:28 -07:00
Krish Dholakia c0319d0d01 Litellm dev fix gemini web search tracking (#12288)
* feat(stream_chunk_builder_utils.py): correctly return web_search_requests on stream chunk builder

* fix(types/utils.py): handle prompttokendetails

* fix(stream_chunk_builder_utils.py): fix ruff check error

* test: try-except rate limit error

* fix: fix import
2025-07-03 12:27:14 -07:00
Ishaan Jaff 75bb22a868 fix huggingface/deepseek-ai/DeepSeek-R1 2025-07-03 12:13:51 -07:00
Ishaan Jaff 5630147e80 Revert "Revert "fix tests (#12286)""
This reverts commit 12f157513b.
2025-07-03 12:08:27 -07:00
Ishaan Jaff 12f157513b Revert "fix tests (#12286)"
This reverts commit 99ce3a24cc.
2025-07-03 12:04:23 -07:00
célina 99ce3a24cc fix tests (#12286) 2025-07-03 10:57:19 -07:00
Krrish Dholakia a198d4a39f test: change mistral model
service tier exceeded
2025-07-02 21:11:02 -07:00
Ishaan Jaff 6b623f9c98 test whitelisted models 2025-06-28 14:46:16 -07:00
Ishaan Jaff 041db0268c [Bug fix] Router - handle cooldown_time = 0 for deployments (#12108)
* fix get cooldown time

* fixes for _should_run_cooldown_logic

* test_cooldown_time_zero_uses_zero_not_default

* Update litellm/router_utils/cooldown_cache.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update litellm/router_utils/cooldown_handlers.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-27 17:50:35 -07:00
Krish Dholakia 7f8b2579a2 Minor Fixes (#11868)
* fix(litellm_pre_call_utils.py): add user agent tags to spend logs in standard logging payload logic

avoid clash when tag based routing is enabled

* test: remove redundant test

* test: rename oidc test to run earlier

quicker debuging

* fix(azure.py): return more detailed error message

* fix(azure/common_utils.py): use default scope, if scope is none

fixes oidc test

* fix: always default to cognitiveservices.azure.com

* test: update test
2025-06-18 14:12:59 -07:00
Krish Dholakia 0319adbf5d feat(speech/): working gemini tts support via openai's /v1/speech endpoint (#11832)
* feat(speech/): working gemini tts support via openai's `/v1/speech` endpoint

Enables calling gemini models via `/v1/speech`

* feat(speech_to_completion_bridge/): voice param support

enables passing voice param to gemini models

* fix: fix ruff checks

* fix: fix checks
2025-06-18 10:36:25 -07:00