* Refactor proxy embeddings to use shared processor
- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks
- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses
- tighten token array decoding logic by using router deployment lookups and the unified error handler
* Fix: Correctly process embedding requests with token arrays
The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.
This was caused by a combination of three distinct issues:
1. In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.
2. In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.
3. In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.
Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.
* test: align proxy embedding assertions
Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.
* Update proxy exception test
The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.
* testing: unsure of this change
I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it.
* fix: remove unrelated change
This change was not related to the embeddings refactor and actually belonged to a different branch.
* Fix embeddings endpoint call_type to use valid CallTypes enum value
Fixed bug where the `/embeddings` endpoint was passing `call_type="embeddings"`
to guardrail hooks, but "embeddings" is not a valid value in the CallTypes enum.
Changed to use `call_type="aembedding"` (async embedding) which is the correct
CallTypes enum value and matches the route_type used in the same function.
Added unit tests to verify:
- "embeddings" is not a valid CallTypes enum value
- "aembedding" is the correct valid value
- The fix prevents ValueError when guardrails are enabled
Fixes#16240
* Inline embeddings call type regression check
* Ensure embedding test preserves proxy metadata
* fix: use fastuuid helper across the codebase
First batch of changes, simple drop in replacement.
* second batch of changes
* fixed: script mistake on helper file
* disable background health checks for specific models
* test_background_health_check_skip_disabled_models
* Disable Background Health Checks For Specific Models
* fix(db_spend_update_writer.py): fix db query
* fix(litellm_pre_call_utils.py): support passing anthropic-beta headers when 'forward_client_headers_to_llm_api' is True
allows user to pass along extra headers to vertex ai anthropic models
* docs(config_settings.md): update docs
commit 440bc027251d8180174d762d83d271d0f7b68cc5
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 23:04:11 2025 -0700
fix: fix check
commit 89a7451cb9ee26ff9f642335714dcc6f449d1fc2
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 22:42:30 2025 -0700
fix: fix test
commit 1322e3b3497e5d334fdcaa18f0cf7a98ea758df4
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 20:52:40 2025 -0700
style: add more tooltips
commit 172738b98b7864aabcacf3334a394098b300283f
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 20:51:09 2025 -0700
feat(team_member_view.tsx): add a tooltip
commit 895eb28deb9127985e30b5e859e5bca8530951c9
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:46:49 2025 -0700
fix(teams.tsx): support setting team member budget on create
commit 003cc54a6dd0f65030c4f39a8487adc771b62e11
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:40:49 2025 -0700
fix(team_member_view.tsx): style improvements
commit a627a044f21df788f80d92a4081212072be91632
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:40:01 2025 -0700
fix(team_member_view.tsx): handle scientific notation in string
commit c5a3b7bd8419f6394e1b490849555d02d473baed
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:34:25 2025 -0700
feat(team_membership_view.tsx): show team member spend + max budget on UI
commit e986d12ad5b07c676f4cac5e16745939d7473dee
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:28:06 2025 -0700
feat(team_member_view.tsx): show team member spend + budget on team info
commit 8e398607b25f8a8f0bab41964810b5dd27c5e3f2
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:18:16 2025 -0700
feat(team_info.tsx): show team member budget on team info
commit 1f56886b5913dafefc0c00fbe741c0c9c01144a6
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:15:30 2025 -0700
feat(team_endpoints.py): get team budget table on team info
allows user to see max budget set for team members
commit 0a4320bbfa406c24ad32a420f82152da7bdd7323
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 18:10:06 2025 -0700
feat(team_endpoints.py): return team member budget on team info
allows ui to display this to admin / team member
commit 6a4e29f87b333ae9977e8f878960e63becd89150
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 17:57:20 2025 -0700
fix(team_endpoints.py): support updating team budget on UI
commit 53f0fff34032977433dfe6935ce0a684a4141fd8
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 17:38:17 2025 -0700
feat(proxy/_types.py): return team member spend
update pydantic object to include spend
Allows showing spend of team member within team on UI
commit ef2a1a43ecf7fecfb904042cbf47b3d56246edcb
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 16:31:42 2025 -0700
feat(team_endpoints.py): support 'team_member_budget' param on `/team/update`
enables budget working across all team members
commit 512999f1249b00a02a30f049a0cfa36e829ff989
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 16:20:04 2025 -0700
test: add unit tests for default team member budget
commit 90fa3f61a2d63e12b9f3e1da9775f5c8b7294b5f
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 15:37:51 2025 -0700
feat(team_endpoints.py): support using default team member budget id, if set
allows all team members to use the same budget id
commit acef5324b1a0935a482c71060f610c3d8823e8c3
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 15:22:30 2025 -0700
feat(team_endpoints.py): support `team_member_budget` param on `/team/new`
Allow creating 1 budget for all users within team (makes it easier to increase/reduce budget if needed for all team members)
commit 2e867ac70fbd8768e7c27cf3b078e6dc10e566b9
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Fri Jun 20 13:45:06 2025 -0700
fix(ui_sso.py): ensure user is added to team, if set via default internal settings
allows users signed up via SSO to be added to default team
* fix(internal_user_endpoints.py): support user with `+` in email on user info
ensures user is correctly parsed from input
* fix(factory.py): support vertex function call args as None
handles empty string in args for vertex gemini calls
* docs(langfuse_integration.md): pin langfuse sdk version on docs
* fix(vertex_ai/): return empty dict, instead of none when empty string given
* refactor: reduce function size
* fix: fix linting errors
* fix: revert check
* fix(internal_user_endpoints.py): fix check
* test: update tests
* test: update tests
* feat(handle_jwt.py): map user to team when added via jwt auth
makes it easy to ensure user belongs to team
* test: test_openai_image_edit_litellm_sdk
* use n 4 for mapped tests (#11109)
* Fix/background health check (#10887)
* fix: improve health check logic by deep copying model list on each iteration
* test: add async test for background health check reflecting model list changes
* fix: validate health check interval before executing background health check
* fix: specify type for health check results dictionary
* fix(user_api_key_auth.py): handle user custom auth set with no custom settings
* bump: version 0.1.21 → 0.2.0
* ci(config.yml): run enterprise and litellm tests separately
* fix: fix linting error
* docs: add missing docs
* [Feat] Add content policy violation error mapping for image editd (#11113)
* feat: add image edit mapping for content policy violations
* test fix
* Expose `/list` and `/info` endpoints for Audit Log events (#11102)
* feat(audit_logging_endpoints.py): expose list endpoint to show all audit logs
make it easier for user to retrieve individual endpoints
* feat(enterprise/): add audit logging endpoint
* feat(audit_logging_endpoints.py): expose new GET `/audit/{id}` endpoint
make it easier to retrieve view individual audit logs
* feat(key_management_event_hooks.py): correctly show the key of the user who initiated the change
* fix(key_management_event_hooks.py): add key rotations as an audit log event
'
* test(test_audit_logging_endpoints.py): add simple unit testing for audit log endpoint
* fix: testing fixes
* fix: fix ruff check
* [Feat] Use aiohttp transport by default - 97% lower median latency (#11097)
* fix: add flag for disabling use_aiohttp_transport
* feat: add _create_async_transport
* feat: fixes for transport
* add httpx-aiohttp
* feat: fixes for transport
* refactor: fixes for transport
* build: fix deps
* fixes: test fixes
* fix: ensure aiohttp does not auto set content type
* test: test fixes
* feat: add LiteLLMAiohttpTransport
* fix: fixes for responses API handling
* test: fixes for responses API handling
* test: fixes for responses API handling
* feat: fixes for transport
* fix: base embedding handler
* test: test_async_http_handler_force_ipv4
* test: fix failing deepeval test
* fix: add YARL for bedrock urls
* fix: issues with transport
* fix: comment out linting issues
* test fix
* test: XAI is unstable
* test: fixes for using respx
* test: XAI fixes
* test: XAI fixes
* test: infinity testing fixes
* docs(config_settings.md): document param
* test: test_openai_image_edit_litellm_sdk
* test: remove deprecated test
* bump respx==0.22.0
* test: test_xai_message_name_filtering
* test: fix anthropic test after bumping httpx
* use n 4 for mapped tests (#11109)
* fix: use 1 session per event loop
* test: test_client_session_helper
* fix: linting error
* fix: resolving GET requests on httpx 0.28.1
* test fixes proxy unit tests
* fix: add ssl verify settings
* fix: proxy unit tests
* fix: refactor
* tests: basic unit tests for aiohttp transports
* tests: fixes xai
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* test: cleanup redundant test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: JuHyun Bae <jhyun0408@nate.com>
* fix: improve health check logic by deep copying model list on each iteration
* test: add async test for background health check reflecting model list changes
* fix: validate health check interval before executing background health check
* fix: specify type for health check results dictionary
* fix(key_management_endpoints.py): initial commit with logic to get all keys for teams user is an admin for
* fix(key_managements_endpoints.py): return all keys for teams user is an admin for
* fix(key_management_endpoints.py): add query param to ensure user opts into seeing all team keys (not just their own)
* fix(regenerate_key_modal.tsx): fix key regenerate
* fix(proxy_server.py): fix model metrics check on none api base
* test(test_key_generate_prisma.py): remove redundant test
* test(test_proxy_utils.py): add unit test covering new management endpoint helper util
* fix: fix test
* test(test_proxy_server.py): fix test
* test: add more unit testing for team member add
* fix(team_endpoints.py): add validation check to prevent same user from being added to team again
prevents duplicates
* fix(team_endpoints.py): raise error if `/team/member_delete` called on member that's not in team
prevent being able to call delete on same member multiple times
* test: update initial tests
* test: fix test
* test: update test to handle no member duplication
* docs(reliability.md): add doc on disabling fallbacks per request
* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param
Allows setting dynamic model timeouts from vercel's AI sdk
* test(test_proxy_server.py): add simple unit test for reading request timeout
* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read
* feat(main.py): support passing metadata to openai in preview
Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371
* fix(main.py): fix passing openai metadata
* docs(request_headers.md): document new request headers
* build: Merge branch 'main' into litellm_dev_01_27_2025_p3
* test: loosen test
* fix(internal_user_endpoints.py): fix team list sort - handle team_alias being set + None
* fix(key_management_endpoints.py): allow team admin to create key for member via admin ui
Fixes https://github.com/BerriAI/litellm/issues/7482
* fix(proxy_server.py): allow querying info on specific model group via `/model_group/info`
allows client-side user to get model info from proxy
* fix(proxy_server.py): add docstring on `/model_group/info` showing how to filter by model name
* test(test_proxy_utils.py): add unit test for returning model group info filtered
* fix(proxy_server.py): fix query param
* fix(test_Get_model_info.py): handle no whitelisted bedrock modells
* fix get_standard_logging_object_payload
* fix async_post_call_failure_hook
* fix post_call_failure_hook
* fix change
* fix _is_proxy_only_error
* fix async_post_call_failure_hook
* fix getting request body
* remove redundant code
* use a well named original function name for auth errors
* fix logging auth fails on DD
* fix using request body
* use helper for _handle_logging_proxy_only_error
* fix(key_management_endpoints.py): fix user-membership check when creating team key
* docs: add deprecation notice on original `/v1/messages` endpoint + add better swagger tags on pass-through endpoints
* fix(gemini/): fix image_url handling for gemini
Fixes https://github.com/BerriAI/litellm/issues/6897
* fix(teams.tsx): fix member add when role is 'user'
* fix(team_endpoints.py): /team/member_add
fix adding several new members to team
* test(test_vertex.py): remove redundant test
* test(test_proxy_server.py): fix team member add tests
* fix(ollama.py): fix get model info request
Fixes https://github.com/BerriAI/litellm/issues/6703
* feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param
* docs(anthropic.md): document all supported openai params for anthropic
* test: fix tests
* fix: fix tests
* feat(jina_ai/): add rerank support
Closes https://github.com/BerriAI/litellm/issues/6691
* test: handle service unavailable error
* fix(handler.py): refactor together ai rerank call
* test: update test to handle overloaded error
* test: fix test
* Litellm router trace (#6742)
* feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks
* feat(router.py): log trace id across retry/fallback logic
allows grouping llm logs for the same request
* test: fix tests
* fix: fix test
* fix(transformation.py): only set non-none stop_sequences
* Litellm router disable fallbacks (#6743)
* bump: version 1.52.6 → 1.52.7
* feat(router.py): enable dynamically disabling fallbacks
Allows for enabling/disabling fallbacks per key
* feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key
* test: fix test
* fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error
* test: handle gemini error
* test: fix test
* fix: new run
* add langsmith_api_key to StandardCallbackDynamicParams
* create a file for langsmith types
* langsmith add key / team based logging
* add key based logging for langsmith
* fix langsmith key based logging
* fix linting langsmith
* remove NOQA violation
* add unit test coverage for all helpers in test langsmith
* test_langsmith_key_based_logging
* docs langsmith key based logging
* run langsmith tests in logging callback tests
* fix logging testing
* test_langsmith_key_based_logging
* test_add_callback_via_key_litellm_pre_call_utils_langsmith
* add debug statement langsmith key based logging
* test_langsmith_key_based_logging
* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards
* test: fix test
* test: reset test name
* test: update conftest to reload proxy server module between tests
* ci(config.yml): move langfuse out of local_testing
reduce ci/cd time
* ci(config.yml): cleanup langfuse ci/cd tests
* fix: update test to not use global proxy_server app module
* ci: move caching to a separate test pipeline
speed up ci pipeline
* test: update conftest to check if proxy_server attr exists before reloading
* build(conftest.py): don't block on inability to reload proxy_server
* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well
* fix(encrypt_decrypt_utils.py): use function to get salt key
* test: mark flaky test
* test: handle anthropic overloaded errors
* refactor: create separate ci/cd pipeline for proxy unit tests
make ci/cd faster
* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs
* ci(config.yml): generate prisma binaries for proxy unit tests
* test: readd vertex_key.json
* ci(config.yml): remove `-s` from proxy_unit_test cmd
speed up test
* ci: remove any 'debug' logging flag
speed up ci pipeline
* test: fix test
* test(test_braintrust.py): rerun
* test: add delay for braintrust test