Commit Graph

27174 Commits

Author SHA1 Message Date
YutaSaito 6eb74bd62a Feat/persist mcp credentials in db (#16308)
* feat: persist mcp credentials in db

* feat: remove Auth Value field from MCP Tool Testing Playground

* fix: test
2025-11-07 19:22:49 -08:00
Sameer Kankute b6f792f301 Added support for desabling thoughts by setting budget to 0 (#16347) 2025-11-07 19:19:16 -08:00
Jack Cherng 2ab34f9a52 Fix HostedVLLMRerankConfig will not be used (#16352)
* Fix HostedVLLMRerankConfig will not be used

Signed-off-by: Jun-Fei Cherng <jfcherng@realtek.com>

* Fix no usage statistics in rerank with hosted_vllm

Signed-off-by: Jun-Fei Cherng <jfcherng@realtek.com>

* Revise typo in comment

Signed-off-by: Jun-Fei Cherng <jfcherng@realtek.com>

---------

Signed-off-by: Jun-Fei Cherng <jfcherng@realtek.com>
2025-11-07 19:11:59 -08:00
Sumit Tembe 0a527bd1d8 Fix cache_read_input_token_cost for gemini-2.5-flash (#16354) 2025-11-07 19:11:10 -08:00
Sameer Kankute fd95909d9f Add cohere embed v4 model support (#16358) 2025-11-07 19:10:42 -08:00
Will Chen c495320b87 Propagate cache creation/read token costs for model info to fix Anthropic long context cost calculations (#16376)
* Fix

* fix

* fix

* fix
2025-11-07 19:06:43 -08:00
Cesar Garcia d65a29b88d docs: fix image generation response format from 'image' to 'images' (#16378)
Update documentation to reflect actual API response format:
- Change singular 'image' field to plural 'images' array
- Add complete ImageURLListItem structure with index and type fields
- Update all code examples to use message.images instead of message.image
- Fix streaming examples to access images[0]["image_url"]["url"]

The documentation was incorrectly showing 'image' (singular object)
but the actual implementation returns 'images' (array of ImageURLListItem).

Related to issue #16227
2025-11-07 19:06:03 -08:00
Ishaan Jaffer a8533dc5c4 Revert "Added xai responses support (#16310)"
This reverts commit ee50f09e73.
2025-11-07 18:39:38 -08:00
Ishaan Jaffer 9a022b8277 fix _handle_logging_completed_response 2025-11-07 18:37:34 -08:00
Ishaan Jaffer 9856fb75d1 fix mistral stream test 2025-11-07 18:31:02 -08:00
Ishaan Jaffer 8b281c2f03 fix linting 2025-11-07 18:25:55 -08:00
Ishaan Jaffer 8cd43de931 fix qa check 2025-11-07 18:25:55 -08:00
Ishaan Jaff f1534c1650 Revert "[Fix] UI - Revert Changes for Test Key Multiple Model Select (#16372)" (#16386)
This reverts commit 99a8a304c1.
2025-11-07 18:24:01 -08:00
Ishaan Jaffer 4dcbfbbb89 ui fix linting 2025-11-07 18:20:31 -08:00
Ishaan Jaffer 736b6b3e1e TestVertexAIRerankTransform 2025-11-07 18:18:48 -08:00
Ishaan Jaff a978680714 [UI] Guardrails - allow updating guardrails through UI. Ensure litellm_params actually get updated in memory (#16384)
* fix safe dumps

* add patterns.json

* add PrebuiltPattern

* add test patterns

* fix edit and view

* fix backend handling

* fix CF ui edit

* fix init

* add _has_guardrail_params_changed

* fix sync_guardrail_from_db

* fix _has_guardrail_params_changed

* fix patch_guardrail

* add unsaved change check

* fix ContentFilterManager
2025-11-07 18:15:59 -08:00
Ishaan Jaff 674d4b4cab [Feat] Guardrails - LiteLLM Content Filter, Allow Viewing/Editing Content Filter Settings (#16383)
* fix safe dumps

* add patterns.json

* add PrebuiltPattern

* add test patterns

* fix edit and view

* fix backend handling

* fix CF ui edit

* fix init
2025-11-07 18:15:09 -08:00
Ishaan Jaff ca229fe030 [Feat] LiteLLM Guardrail - UI Fix, ensure you can see UI Friendly name for PII Patterns (#16382)
* fix safe dumps

* add patterns.json

* add PrebuiltPattern

* add test patterns
2025-11-07 18:14:58 -08:00
Krrish Dholakia 532ebf43d0 docs(moderation.md): fix moderation quick start docs 2025-11-07 16:25:08 -08:00
Ishaan Jaffer 514e3e01f9 ui new build 2025-11-07 15:39:07 -08:00
Ishaan Jaffer 5b5122125c guardrailLogoMap fix 2025-11-07 15:37:01 -08:00
Ishaan Jaffer 4621a23a89 add litellm logo jpg 2025-11-07 15:36:49 -08:00
Ishaan Jaffer bc2ff66f5a litellm_logo 2025-11-07 15:36:19 -08:00
Sameer Kankute faae0ff0dc Fix Azure DALL-E-3 health check content policy violation by using safe default prompt (#16329)
* Add custom health check prompt support

* Add constant for health check prompt

* Add constant for health check prompt
2025-11-07 15:30:56 -08:00
Krrish Dholakia 9059905d25 docs(openai/videos.md): document proxy usage on openai docs for video gen 2025-11-07 15:27:18 -08:00
Ishaan Jaff 2bd85dc455 [Feat] Add DD Agent Host support for datadog callback (#16379)
* add DD_AGENT_HOST

* docs DD Agent

* test_datadog_agent_configuration

* DD_AGENT_HOST
2025-11-07 15:18:23 -08:00
yuneng-jiang 5fb0940b9b [Feature] UI - Surface SSO Create errors on create flow (#16369)
* Surface backend SSO issues on create SSO flow

* Remove unused import

* Fix flaky test
2025-11-07 14:44:50 -08:00
Ishaan Jaff a6b0993405 [Feat] Secret Manager - Hashicorp, add auth via approle (#16374)
* add _verify_required_credentials_exist and _auth_via_approle

* test_hashicorp_secret_manager_approle_auth

* docs hcorp auth
2025-11-07 14:39:33 -08:00
Ishaan Jaffer ea4048324b docs fix 2025-11-07 14:39:24 -08:00
Ishaan Jaff 9288c8543c fix docker (#16342) 2025-11-07 14:38:20 -08:00
Cesar Garcia 8c58f65c62 Fix MyPy type errors for aembedding call_type (#16360)
Add "aembedding" to Literal type hints in ProxyLogging methods:
- pre_call_hook overloads (lines 872, 893, 913)
- during_call_hook (line 1052)
- _process_guardrail_callback (line 803)

Add type: ignore comments where ProxyLogging calls CustomLogger
callbacks (lines 1021, 1106) to handle type mismatch between
ProxyLogging's broader Literal (includes "aembedding") and
CustomLogger's narrower Literal (doesn't include "aembedding").

Related to PR #16328 which changed embeddings endpoint to use
call_type="aembedding" for async operations.
2025-11-07 14:37:14 -08:00
Alan Ponnachan 5b01fe0a81 fix(vertex_ai): Correctly map 429 Resource Exhausted to RateLimitError (#16363) 2025-11-07 14:36:20 -08:00
Emerson Gomes 940a72ceb0 Add Vertex MiniMAX m2 (#16373) 2025-11-07 14:27:42 -08:00
yuneng-jiang 99a8a304c1 [Fix] UI - Revert Changes for Test Key Multiple Model Select (#16372)
* Revert "Initial changes for supporting prompts to multiple models"

This reverts commit 0d8dee4401a410531ddc4a29ec11dc17f7807c4b.

* Add test for the single model select
2025-11-07 13:39:17 -08:00
Chen Qian 50cd988c57 fix databricks streaming (#16368) 2025-11-07 13:12:44 -08:00
yuneng-jiang 92bc7db594 Remove encoding_format in api calls for test key embedding models (#16367) 2025-11-07 10:58:30 -08:00
Xingyao Wang 4860cdbfd5 Fix: Azure GPT-5 incorrectly routed to O-series config (temperature parameter unsupported) (#16246)
* Fix Azure GPT-5 incorrectly routing to O-series config

GPT-5 models support reasoning but are NOT O-series models and DO support
temperature parameter. The previous routing logic in get_provider_responses_api_config()
was incorrectly sending Azure GPT-5 requests to AzureOpenAIOSeriesResponsesAPIConfig
which removes temperature from supported params.

This fix explicitly excludes GPT-5 models from O-series routing, ensuring they
use the standard AzureOpenAIResponsesAPIConfig which properly supports temperature.

Fixes: Azure GPT-5 throwing UnsupportedParamsError for temperature parameter
Tested: Added comprehensive unit tests for GPT-5 and O-series routing

* Apply suggestion from @xingyaoww

* Apply suggestion from @xingyaoww

* Improve Azure routing logic to use broader 'gpt' check for temperature support

Based on feedback from @krrishdholakia, updated the routing logic to check
for 'gpt' in model name instead of specifically 'gpt-5'. This approach is:

- More future-proof: covers all GPT models (gpt-3.5, gpt-4, gpt-5, future models)
- Simpler: single check for all GPT variants
- More maintainable: won't need updates for each new GPT model

Changes:
- litellm/utils.py: Changed from is_gpt5 to is_gpt_model check
- tests: Added comprehensive test for all GPT model variants (gpt-3.5 through gpt-5)

All tests pass:
- GPT models (gpt-3.5-turbo, gpt-4, gpt-4o, gpt-5) -> AzureOpenAIResponsesAPIConfig (supports temperature)
- O-series models (o1, o3) -> AzureOpenAIOSeriesResponsesAPIConfig (no temperature)

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-11-07 10:24:27 -08:00
yuneng-jiang 0a90283fec Change usage page to have a parent date picker (#16264) 2025-11-07 10:00:37 -08:00
huangyf 20d1bed514 fix lobal.anthropic.claude-haiku-4-5-20251001-v1:0 supports_reasoning (#16263) 2025-11-06 19:35:57 -08:00
Sameer Kankute 94e1a1ecac Use vertex creds passed via arguments (#16266) 2025-11-06 19:35:22 -08:00
Jason Roberts 5733f711fd feat(guardrails): panw prisma airs guardrail deduplication and enhanced session tracking (#16273)
* feat(guardrails): Add deduplication and session tracking

- Implement deduplication logic to prevent duplicate scans (via call_id; add _check_and_mark_scanned) caused by LiteLLM callback system
- Add session tracking using litellm_trace_id as AI Session ID for Prisma AIRS SCM logging
- Extract helper methods _extract_prompt_from_request maintainability
- Use httpxSpecialProvider import (LoggingCallback -> GuardrailCallback)
- Add comprehensive tests for deduplication and session tracking (7 new tests)
- Update documentation with multi-turn conversation tracking examples

* docs: update PANW Prisma AIRS multi-turn conversation example to use industry-standard terminology

- Clearer example for conversation tracking
- Updated terminology from 'AI Session ID' to 'Prisma AIRS AI Session ID' for clarity

* fix: remove unused asyncio import

* fix: correct mypy type ignore comment
2025-11-06 19:34:37 -08:00
Andrii Kislitsyn c497b6f239 Add retry-after header support for errors 502, 503, 504 (#16288)
* retry-after-header-support-for-502-503-504-initial

* retry-after-header-support-for-502-503-504-tests-and-linters
2025-11-06 19:33:29 -08:00
Shimon Mimoun 277e370b3c inti (#16313) 2025-11-06 19:30:41 -08:00
Aleksei Terin 514969d7f7 fix: pass aws_region_name in litellm_params (#16321) 2025-11-06 19:29:38 -08:00
Cesar Garcia 16325024df fix: Use valid CallTypes enum value in embeddings endpoint (#16328)
* Fix embeddings endpoint call_type to use valid CallTypes enum value

Fixed bug where the `/embeddings` endpoint was passing `call_type="embeddings"`
to guardrail hooks, but "embeddings" is not a valid value in the CallTypes enum.

Changed to use `call_type="aembedding"` (async embedding) which is the correct
CallTypes enum value and matches the route_type used in the same function.

Added unit tests to verify:
- "embeddings" is not a valid CallTypes enum value
- "aembedding" is the correct valid value
- The fix prevents ValueError when guardrails are enabled

Fixes #16240

* Inline embeddings call type regression check

* Ensure embedding test preserves proxy metadata
2025-11-06 19:25:00 -08:00
yuneng-jiang 29e8d857f7 Fix SSO Proxy Base URL input validation and remove normalizing / (#16332) 2025-11-06 19:24:05 -08:00
Sameer Kankute 83998d3573 Update the fireworks url in tests and doc (#16346) 2025-11-06 19:22:21 -08:00
Ishaan Jaffer 93426ba331 ui new build 2025-11-06 19:20:40 -08:00
Ishaan Jaffer 9c59d42fc1 bump extras pkg 2025-11-06 19:17:03 -08:00
Ishaan Jaffer d6271ceefb test fixes mock tests 2025-11-06 19:11:19 -08:00