Commit Graph

2585 Commits

Author SHA1 Message Date
Ishaan Jaff f3749709b8 Bug Fix - Responses API raises error with Gemini Tool Calls in input (#13260)
* add _transform_responses_api_function_call_to_chat_completion_message

* test_responses_api_with_tool_calls

* TestFunctionCallTransformation

* fixes for responses API testing google ai studio

* TestGoogleAIStudioResponsesAPITest

* test_responses_api_with_tool_calls

* test_responses_api_with_tool_calls

* test_basic_openai_responses_streaming_delete_endpoint
2025-08-04 12:01:33 -07:00
Ishaan Jaff dae72003a7 [Bug Fix] OpenAI / Azure Responses API - Add service_tier , safety_identifier supported params (#13258)
* test_aresponses_service_tier_and_safety_identifier

* add service_tier + safety_identifier

* fix get_supported_openai_params

* add safety_identifier + service_tier for responses()
2025-08-04 10:51:53 -07:00
Jugal D. Bhatt 36229dc69f [LLM Translation] Fix Model Usage not having text tokens (#13234)
* fix + test

* remove test comments

* fix mypy

* fix mypy

* fix tests
2025-08-04 21:06:49 +05:30
Krish Dholakia 3119064e94 Prompt Management - add prompts on UI (#13240)
* fix(create_key_button.tsx): add prompts on UI

* feat(key_management_endpoints.py): support adding prompt to key via `/key/update`

* fix(key_info_view.tsx): show existing prompts on key in key_info_view.tsx

* fix(key_edit_view.tsx): UX - disable premium feature for non-premium users

prevent accidental clicking

* fix(create_key_button.tsx): disable premium features behind flag, prevent errors

* feat(prompts.tsx): add new ui component to view created prompts

enables viewing prompts created on config

* feat(prompt_info.tsx): add component for viewing the prompt information

* feat(prompt_endpoints.py): support converting dotprompt to json structure + accept json structure in promptmanager

allows prompt manager to work with api endpoints

* test(test_prompt_manager.py): add unit tests for json data input

* feat(dotprompt/__init__.py): add prompt data to dotpromptmanager

* fix(prompt_endpoints.py): working crud endpoints for prompt management

* feat(prompts/): support `prompt_file` for dotprompt

allows to precisely point to the prompt file a prompt should use

* feat(proxy/utils.py): resolve prompt id correctly

resolves user sent prompt id with internal prompt id

* feat(schema.prisma): initial pr with db schema for prompt management table

allows post endpoints to work with backend

* feat(prompt_endpoints.py): use db in patch_prompt endpoint

* feat(prompt_endpoints.py): use db for update_prompt endpoint

* feat(prompt_endpoints.py): use db on prompt delete endpoint

* build(schema.prisma): add prompt tale to schema.prisma in litellm-proxy-extras

* build(migration.sql): add new sql migration file

* fix(init_prompts.py): fix init

* feat(prompt_info_view.tsx): show the raw prompt template on ui

allows developer to know the prompt template they'll be calling

* feat(add_prompt_form.tsx): working ui add prompt flow

allows user to add prompts to litellm via ui

* build(ui/): styling fixes

* build(ui/): prompts.tsx

styling improvements

* fix(add_prompt_form.tsx): styling improvements

* build(prompts.tsx): styling improvements

* build(ui/): styling improvements

* build(ui/): fix ui error

* fix: fix ruff check

* docs: document new api params

* test: update tests
2025-08-02 22:33:37 -07:00
Ishaan Jaff 2dd9361cd9 Revert "Revert "Fix SSO Logout | Create Unified Login Page with SSO and Username/Password Options (#12703)""
This reverts commit 5fe37b6f72060add859a22ddda0665cd1635f98f.
2025-08-02 12:25:22 -07:00
Ishaan Jaff af9031ba41 Revert "Fix SSO Logout | Create Unified Login Page with SSO and Username/Password Options (#12703)"
This reverts commit a752d7acc9.
2025-08-02 12:25:22 -07:00
Krish Dholakia 342fd2d8b6 Revert "fix: role chaining and session name with webauthentication for aws be…" (#13230)
This reverts commit 0ac093b59e.
2025-08-02 10:11:58 -07:00
Alexander Yastrebov 825923e7be litellm/proxy: preserve model order of /v1/models and /model_group/info (#13178)
Closes #12644

Signed-off-by: Alexander Yastrebov <alexander.yastrebov@zalando.de>
2025-08-02 08:57:38 -07:00
Richard Tweed 0ac093b59e fix: role chaining and session name with webauthentication for aws bedrock (#13205)
* fix(bedrock): prevent duplicate role assumption in EKS/IRSA environments

Fixes issue where AWS role assumption would fail in EKS/IRSA environments
when trying to assume the same role that's already being used.

The problem occurred when:
1. EKS/IRSA automatically assumes a role (e.g., LitellmRole)
2. LiteLLM tries to assume the same role again, causing AccessDenied errors
3. Different models with different roles would fail due to incorrect role context

Changes:
- Added check in _auth_with_aws_role() to detect if already using target role
- Skip role assumption if current identity matches target role
- Return current credentials instead of attempting duplicate assumption
- Added comprehensive test coverage for the fix

This ensures proper role chaining works in EKS/IRSA environments where:
- Service Account can assume Role A
- Role A can assume Role B for different models/accounts

Resolves the AccessDenied errors reported in bedrock usage scenarios.

* fix(bedrock): simplify role assumption for EKS/IRSA environments

Fixes AWS Bedrock role assumption in EKS/IRSA environments by properly
handling ambient credentials when no explicit credentials are provided.

The issue occurred because commit 197e7efa8f
introduced changes that broke role assumption in EKS/IRSA environments.

Changes:
- Simplified _auth_with_aws_role() to use ambient credentials when no
  explicit AWS credentials are provided (aws_access_key_id and
  aws_secret_access_key are both None)
- This allows web identity tokens in EKS/IRSA to work automatically
  through boto3's credential chain
- Maintains backward compatibility for explicit credential scenarios

Added comprehensive test coverage:
- test_eks_irsa_ambient_credentials_used: Verifies ambient credentials work
- test_explicit_credentials_used_when_provided: Ensures explicit creds still work
- test_partial_credentials_still_use_ambient: Edge case handling
- test_cross_account_role_assumption: Multi-account scenarios
- test_role_assumption_with_custom_session_name: Custom session names
- test_role_assumption_ttl_calculation: TTL calculation verification
- test_role_assumption_error_handling: Error propagation
- test_multiple_role_assumptions_in_sequence: Sequential role assumptions

This fix ensures that in EKS/IRSA environments:
1. Service accounts can assume their initial role via web identity
2. That role can then assume other roles across accounts as configured
3. Different models can use different roles without conflicts

* fix(bedrock): add automatic IRSA detection for EKS environments

- Detect AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN environment variables
- Automatically use web identity token flow when IRSA is detected
- Read web identity token from file and pass to existing auth method
- Add test coverage for IRSA environment detection
- Fixes authentication errors in EKS with IRSA when no explicit credentials provided

* fix(bedrock): skip role assumption when IRSA role matches requested role

- Detect when AWS_ROLE_ARN environment variable matches the requested role
- Skip unnecessary role assumption when already running as the target role
- Use existing env vars authentication method for IRSA credentials
- Add test coverage for same-role IRSA scenario
- Fixes 'not authorized to perform: sts:AssumeRole' errors when trying to assume the same role

* fix(bedrock): use boto3's native IRSA support for cross-account role assumption

- Replace custom web identity token handling with boto3's built-in IRSA support
- boto3 automatically reads AWS_WEB_IDENTITY_TOKEN_FILE and assumes initial role
- Then use standard assume_role for cross-account access
- Update test to mock boto3 STS client instead of internal methods
- Fixes 'OIDC token could not be retrieved from secret manager' error

* fix(bedrock): improve IRSA error handling and add debug logging

- Add debug logging to show current identity and role assumption attempts
- Provide clearer error messages for trust policy issues
- Fix region handling in IRSA flow
- Re-raise exceptions instead of silently falling through
- This helps diagnose cross-account role assumption permission issues

* fix(bedrock): manually assume IRSA role with correct session name for cross-account scenarios

- When doing cross-account role assumption, manually assume the IRSA role first with the desired session name
- This ensures the session name in the assumed role ARN matches what's expected in trust policies
- For same-account scenarios, continue using boto3's automatic IRSA support
- Updated tests to handle the new flow
- This fixes the issue where cross-account trust policies require specific session names

* fix: Fix linting issues in base_aws_llm.py

- Fix f-string without placeholders (F541)
- Refactor _auth_with_aws_role to reduce statements count (PLR0915)
  - Extract _handle_irsa_cross_account helper method
  - Extract _handle_irsa_same_account helper method
  - Extract _extract_credentials_and_ttl helper method

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-08-02 08:55:35 -07:00
Sameer Kankute 1e33dc50a0 add Perplexity citation annotations support (#13225) 2025-08-02 08:47:35 -07:00
Ishaan Jaff 44900e781a testing fixes - vertex ai deprecated claude 3 sonnet models 2025-08-01 21:23:52 -07:00
Jugal D. Bhatt 900c7f45c0 [MCP Gateway] Litellm mcp pre and during guardrails (#13188)
* add guardrail support

* add guardrail support

* guardrails for MCP

* added changes

* add mcp guardrails

* added test

* add ui

* fix guardrail form

* working with cursor

* remvoe print

* fix mcp servertests

* fix mypy and remove console logs

* fix mypy and remove console logs

* fix mypy tests
2025-08-01 20:02:25 -07:00
Jugal D. Bhatt a4c11600a9 [LLM] fix model reload on model update (#13216)
* fix model reload on model update

* remove the flag
2025-08-01 18:08:02 -07:00
Jugal D. Bhatt 3867813277 [Proxy]fix key mgmt (#13148)
* fix key mgmt

* Add unit test
2025-08-01 17:17:15 -07:00
Ishaan Jaff 9d6098e8cc fix vertex deprecated old model 2025-08-01 16:46:16 -07:00
Ishaan Jaff 1358978abb test_recreate_prisma_client_successful_disconnect 2025-08-01 15:38:48 -07:00
Jugal D. Bhatt bfabf2709a [LLM translation] Fix bedrock computer use #13143 (#13150)
* fix json test

* fix pr

* fix bedrock computer use tool

* added unit test

* fix failing prisma tesT

* fix prisma connect
2025-08-01 15:02:44 -07:00
Krrish Dholakia d158a0344d test: update unit tests 2025-08-01 13:37:51 -07:00
Krrish Dholakia 72fd4e3d55 test: remove bad unit tests 2025-08-01 13:34:23 -07:00
Krrish Dholakia e3c9fc458d test: update tests 2025-08-01 09:19:31 -07:00
Krrish Dholakia 461b615bde test: update tests 2025-08-01 09:12:44 -07:00
Krrish Dholakia 952c2b5215 test: update test 2025-08-01 09:07:53 -07:00
Krrish Dholakia fe24c270de Prompt Management - add local dotprompt file support 2025-07-31 22:28:29 -07:00
Jason Roberts 04c299410e Fix/panw prisma airs post call hook (#13185)
* fix(guardrails): Fix PANW Prisma AIRS post-call hook method name

- Changed async_post_call_hook to async_post_call_success_hook to match proxy calling convention
- Added event_hook parameter to initialization to ensure proper hook registration
- Fixes post-call response scanning for PANW Prisma AIRS guardrails

Resolves issue where post-call hooks were not being invoked due to method name mismatch.

* Update PANW Prisma AIRS tests to use correct method name
2025-07-31 21:50:32 -07:00
Krish Dholakia 78997c2e35 Anthropic - working mid-stream fallbacks (#13149)
* fix(router.py): add acompletion_streaming_iterator inside router

allows router to catch errors mid-stream for fallbacks

Work for https://github.com/BerriAI/litellm/issues/6532

* fix(router.py): working mid-stream fallbacks

* fix(router.py): more iterations

* fix(router.py): working mid-stream fallbacks with fallbacks set on router

* fix(router.py): pass prior content back in new request as assistant prefix message

* fix(router.py): add a system prompt to help guide non-prefix supporting models to use the continued text correctly

* fix(common_utils.py): support converting `prefix: true` for non-prefix supporting models

* fix: reduce LOC in function

* test(test_router.py): add unit tests for new function

* test: add basic unit test

* fix(router.py): ensure return type of fallback stream is compatible with CustomStreamWrapper

prevent client code from breaking

* fix: cleanup

* test: update test

* fix: fix linting error
2025-07-31 21:22:49 -07:00
Krish Dholakia c7e4435bdc Fix - using managed files w/ OTEL + UI - add model group alias on UI (#13171)
* fix(router.py): safe deep copy kwargs

OTEL adds a parent_otel_span which cannot be deepcopied

* fix: use safe deep copy in other places as well

* test: add script to check and ban copy.deepcopy of kwargs

enforce safe_deep_copy usage

* build(ui/): new component for adding model group alias on UI

* fix(proxy_server.py): support updating model_group_alias via /config/update

allows ui component to work

* fix(router.py): update model_group_alias in router settings based on db value

* fix: fix code qa error
2025-07-31 21:22:04 -07:00
Cole McIntosh 0666ede8e3 fix: correct patch path in langfuse test for MAX_LANGFUSE_INITIALIZED_CLIENTS (#13192)
The test was failing because it was trying to patch MAX_LANGFUSE_INITIALIZED_CLIENTS
at the wrong path. The constant is imported from litellm.constants into the langfuse
module namespace, so we need to use patch.object on the imported module reference.

Changes:
- Import langfuse module explicitly for patching
- Use patch.object instead of patch string path
- This fixes the AttributeError that was causing CI failures
2025-07-31 17:11:28 -07:00
Ishaan Jaff ee70d593c1 [Feat] Allow redacting message / response content for specific logging integrations - DD LLM Observability (#13158)
* fix redact_standard_logging_payload

* add StandardCustomLoggerInitParams

* allow defining DatadogLLMObsInitParams

* fix init DataDogLLMObsLogger

* fix import

* update redact_standard_logging_payload_from_model_call_details

* test_dd_llms_obs_redaction

* docs DD logging

* docs DD

* docs DD

* Redacting Messages, Response docs DD LLM Obs

* fix redaction logic

* fix create_llm_obs_payload

* fix logging response

* fixes

* ruff fix

* fix test

* test_dd_llms_obs_redaction

* test_create_llm_obs_payload

* redact_standard_logging_payload_from_model_call_details

* img - dd_llm_obs

* docs DD

* fix linting

* fix linting

* fix mypy

* test_create_llm_obs_payload

* test_create_llm_obs_payload

* fix mock_env_vars

* fix _handle_anthropic_messages_response_logging
2025-07-31 16:44:16 -07:00
Ishaan Jaff 115d2480c1 [Bug Fix] Infra - ensure that stale Prisma clients disconnect DB connection (#13140)
* ensure original client is disconnected when re-creating

* test_recreate_prisma_client_successful_disconnect

* test_recreate_prisma_client_successful_disconnect
2025-07-31 16:43:26 -07:00
Ishaan Jaff cbb922b1bb [Bug Fix] Gemini-CLI Integration - ensure tool calling works as expected on generateContent (#13189)
* transform_generate_content_request

* add tools in GenerateContentRequestDict

* add generate_content_handler tool calling

* google_generate_content_endpoint_testing

* test_mock_stream_generate_content_with_tools

* test_validate_post_request_parameters

* fixes for generate_content_handler

* fix VertexAIGoogleGenAIConfig

* fixes veretx ai

* google_generate_content_endpoint_testing

* test_async_streaming_with_logging

* load_vertex_ai_credentials

* test_vertex_anthropic.py
2025-07-31 16:42:57 -07:00
Anand Khinvasara 212a339954 fix: support negative indexes in cache_control_injection_points for Anthropic Claude (#10226) (#13187) 2025-07-31 15:50:53 -07:00
Jugal D. Bhatt 524a1ffd5f [Proxy Startup]fix db config through envs (#13111)
* fix db config through envs

* add helper

* fix ruff

* fix imports

* add unit tests in db config changes
2025-07-31 13:52:56 -07:00
Ishaan Jaff 79be436c2b [Feat] Background Health Checks - Allow disabling background health checks for a specific (#13186)
* disable background health checks for specific models

* test_background_health_check_skip_disabled_models

* Disable Background Health Checks For Specific Models
2025-07-31 13:48:35 -07:00
Ishaan Jaff 65ca4f66f6 Revert "add framework name to UserAgent header in AWS Bedrock API call (#13159)"
This reverts commit 77f506e860.
2025-07-30 23:12:36 -07:00
Ishaan Jaff fad453bbf3 test_chat_completion_ratelimit 2025-07-30 23:12:11 -07:00
0x-fang 77f506e860 add framework name to UserAgent header in AWS Bedrock API call (#13159) 2025-07-30 22:44:22 -07:00
Krish Dholakia c6a8733234 build(config.yml): migrate build_and_test to ci/cd pg db (#13166) 2025-07-30 18:19:36 -07:00
Ishaan Jaff cf4c639dad test fix xai - it goes through base llm tests already 2025-07-30 18:18:49 -07:00
Krrish Dholakia 09cc748871 test: handle api instability 2025-07-30 16:32:23 -07:00
Ishaan Jaff 090e2ffb5a Revert "[LLM translation] Fix bedrock computer use (#13143)"
This reverts commit 840dd2e7c7.
2025-07-30 16:03:35 -07:00
Jugal D. Bhatt 5db4862cbf [MCP Gateway] Litellm mcp client list fail (#13114)
* fix headers

* fix test

* fix ruff

* added try except for catching errors which lead to client failures

* fix mypy

* fix ruff

* fix tests

* fix python error

* fix test

* fix test

* fixed the MCP Call Tool result
2025-07-30 15:23:19 -07:00
Jugal D. Bhatt eb8a338d9b [MCP Guardrails] move pre and during hooks to ProxyLoggin (#13109)
* move pre and during hooks t o ProxyLoggin

* fix lint

* fix ruff

* fix tests
2025-07-30 13:58:41 -07:00
Jugal D. Bhatt 840dd2e7c7 [LLM translation] Fix bedrock computer use (#13143)
* Add support for bedrock computer use

* remove print

* split bedrock tools

* add hosted tools

* fix tool use

* fix tool use

* fix function calling

* fix converse transformation

* fix tests

* fix llm translation test

* fix computer use
2025-07-30 12:27:12 -07:00
Jugal D. Bhatt e324f76859 [MCP Gateway] add health check endpoints for MCP (#13106)
* add health check endpoints for MCP

* add import

* Clean up endpopints

* fix ruff
2025-07-30 20:40:44 +05:30
Johnny.H 97d89584c1 fix tool aws bedrock call index when the function only have optional arg (#13115) 2025-07-29 22:07:24 -07:00
Krrish Dholakia ae947e63ce test: update test 2025-07-29 22:07:07 -07:00
Krish Dholakia ea6b4b08d3 move to use_prisma_migrate by default + resolve team-only models on auth checks + UI - add sagemaker on UI (#13117)
* fix(proxy_cli.py): make use_prisma_migrate proxy default

Fixes https://github.com/BerriAI/litellm/issues/13046

 Prisma migrate deploy prevents resetting db

* fix(auth_checks.py): resolve team only models while doing auth checks on model access groups

Fixes issue where key had access via an access group, but team only model could not be called

* test(test_router.py): add unit testing

* feat(provider_specific_fields.tsx): add aws sagemaker on UI
2025-07-29 21:56:18 -07:00
Krrish Dholakia 7e5bc8af28 test: update test 2025-07-29 21:35:44 -07:00
Krish Dholakia 1c182919b5 Revert "[LLM translation] Add support for bedrock computer use (#12948)" (#13118)
This reverts commit 760d747465.
2025-07-29 21:33:46 -07:00
Krrish Dholakia f544a4e238 test: update test 2025-07-29 21:08:36 -07:00