* Refactor proxy embeddings to use shared processor
- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks
- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses
- tighten token array decoding logic by using router deployment lookups and the unified error handler
* Fix: Correctly process embedding requests with token arrays
The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.
This was caused by a combination of three distinct issues:
1. In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.
2. In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.
3. In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.
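Items 2 and 3 above can be sketched roughly as follows. This is an illustration only, not the code in `proxy_server.py` or `litellm_pre_call_utils.py`: the helper names `is_token_array` and `should_check_enforced_params` are hypothetical, and excluding `bool` from the integer check is an assumption made for this example.

```python
from typing import Any, List, Optional


def is_token_array(item: Any) -> bool:
    """Return True only if item is a non-empty list made up entirely of ints."""
    if not isinstance(item, list) or len(item) == 0:
        return False
    # bool is a subclass of int in Python, so it is excluded explicitly
    # (an assumption for this sketch, not confirmed by the commit).
    return all(isinstance(tok, int) and not isinstance(tok, bool) for tok in item)


def should_check_enforced_params(enforced_params: Optional[List[str]]) -> bool:
    """Trigger the enforced_params check only for a non-empty list."""
    return bool(enforced_params)


# A token subarray qualifies; a mixed or empty subarray does not.
token_input = [[1, 2, 3], [4, 5]]
all_token_arrays = all(is_token_array(sub) for sub in token_input)
```

With this shape, an empty `enforced_params` list and a missing one are treated the same way: neither triggers the check.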
Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.
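The relaxed assertion pattern can be sketched as below. This is a minimal stand-alone example rather than the actual test: the mock, the `fake_proxy_call` coroutine, the model name, and the extra kwargs values are all hypothetical placeholders.

```python
import asyncio
from unittest.mock import AsyncMock

# Hypothetical stand-in for the mocked router method.
aembedding = AsyncMock(return_value={"data": []})


async def fake_proxy_call() -> None:
    # The pipeline forwards internal kwargs alongside model and input.
    await aembedding(
        model="example-embedding-model",
        input=[[1, 2, 3]],
        litellm_call_id="hypothetical-call-id",
        request_timeout=600,
    )


asyncio.run(fake_proxy_call())

# Step 1: assert the call happened, without pinning the full kwargs list.
aembedding.assert_called_once()

# Step 2: verify only the arguments the test actually cares about.
called_kwargs = aembedding.call_args.kwargs
assert called_kwargs["model"] == "example-embedding-model"
assert called_kwargs["input"] == [[1, 2, 3]]
```

Because only `model` and `input` are checked, adding another internal kwarg to the call does not break the test, while a wrong model or a mangled token array still does.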
* test: align proxy embedding assertions
Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.
* Update proxy exception test
The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.
* testing: unsure of this change
I don't remember why I changed this. Reverting to see whether any tests fail, since the manual test passes without it.
* fix: remove unrelated change
This change was not related to the embeddings refactor and actually belonged to a different branch.
* add supported_db_objects
* add _should_load_db_object
* add docs on storing MCP objects in DB
* test_should_load_db_object_with_supported_db_objects
* type fix
* added mcp guardrails doc in mcp.md
* add button to reload models
* Added button changes
* added button for scheduling reload
* add multi pod support to reloading the model price json
* fix ruff
* feat(proxy_server.py): working guardrails on streaming output
ensures guardrail actually raises an error if flagged during streaming output
* test: add unit tests
* feat(advanced_settings.tsx): add guardrails option as ui component on model add
enables setting guardrails on model add
* feat(add_model_tab.tsx): fix add model form
* feat(model_info_view.tsx): support adding guardrails on model update
* fix(add_model_tab.tsx): working health check when guardrails selected
* fix(proxy_server.py): fix yield
* fix(proxy_setting_endpoints.py): require 'store model in db' to be enabled before setting user default settings
* test(test_proxy_server.py): update test
* fix(reset_budget_job.py): initial commit adding reset budget logic for team members
* test: update unit testing
* test(test_proxy_budget_reset.py): validate team member budget was reset
* test(test_reset_budget_job.py): update unit tests
* test: update tests
* fix(proxy_server.py): handle empty config yaml
Fixes https://github.com/BerriAI/litellm/issues/12163
* fix(gemini/common_utils.py): replace the 'models/' prefix as expected, instead of using 'strip'
Fixes https://github.com/BerriAI/litellm/issues/12160
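For context on why `strip` is the wrong tool here: `str.strip(chars)` removes any leading or trailing characters that appear in `chars`, not a literal prefix. The sketch below shows the difference. The helper `remove_models_prefix` and the example model name are hypothetical, and the actual fix in `gemini/common_utils.py` may be written differently.

```python
def remove_models_prefix(model: str) -> str:
    """Drop a literal leading 'models/' and leave the rest untouched."""
    prefix = "models/"
    if model.startswith(prefix):
        return model[len(prefix):]
    return model


name = "models/embedding-001"

# strip() treats "models/" as a set of characters {m, o, d, e, l, s, /},
# so it also eats the leading "em" of "embedding".
stripped = name.strip("models/")

# Removing the literal prefix keeps the model name intact.
cleaned = remove_models_prefix(name)
```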
* fix(anthropic/experimental_pass_through/messages/transformation.py): check for env var when selecting api key
* docs(config_settings.md): add api key to docs
* fix(team_endpoints.py): prevent overwriting current list of team models on new model add
* fix(networking.tsx): fix default proxy base url
* fix(proxy_server.py): include team only models when retrieving all deployments on `/v2/model/info` helper util
ensures team only models are shown to user
* fix(router.py): check model name by team public model name when team id given
Fixes issue where team member could not see team only models when clicking into that team on `Models + Endpoints`
* fix(team_member_view.tsx): fix rendering team member budget, when budget is set
* test: update tests
* test: update unit test
* test(test_router.py): initial unit test confirming router.afile_content uses dynamic api key / api base
* fix(managed_files.py): filter deployments for only those within file id mapping
ensure call works - only route to models where the file was written
* fix(proxy_server.py): fix loading in model ids from config, if config id is int
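One way an integer id can be handled, sketched under assumptions: a YAML value such as `id: 123` is parsed as an `int`, which will not compare equal to the string `"123"` in later lookups. The helper `normalize_model_id` and the `model_info` dict shape used here are hypothetical and are not taken from the actual fix.

```python
from typing import Any, Dict


def normalize_model_id(model_info: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of model_info whose 'id', if present, is a string."""
    normalized = dict(model_info)
    model_id = normalized.get("id")
    if model_id is not None and not isinstance(model_id, str):
        # Assumption: ids are compared as strings elsewhere in the proxy.
        normalized["id"] = str(model_id)
    return normalized


from_config = {"id": 123}
normalized_info = normalize_model_id(from_config)
```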
* fix(router.py): return all model file id mappings on create_file
If there are multiple deployments, this ensures all the file id mappings are bubbled up
Fixes issue when trying to use load-balanced deployments, where only 1 file id mapping was being stored
* fix(internal_user_endpoints.py): don't apply default internal user params if role is admin
prevent internal user restrictions from being applied to admin
* fix(proxy_server.py): fix model info v2 endpoint check - handle user_id being none
* fix(team_endpoints.py): ensure team doesn't lose all model access if set as empty string and new team model added
* fix(proxy_server.py): ensure model with team id is only added as valid for team which has that id
* fix(proxy_cli.py): check for module not found error on proxy import
Fixes https://github.com/BerriAI/litellm/issues/11836
* feat(proxy_server.py): utility function to get all models across all teams user is in
Allows user to see all team models on UI
* feat(proxy_server.py): return models accessible via team IDs in `/v2/model/info`
Allows UI to tell user which team they can use to access model
* feat(columsn.tsx): initial PR to add 'accessible via Team ID's on model hub
allows user to know what teams they can access a model through
* Revert "feat(columsn.tsx): initial PR to add 'accessible via Team ID's on model hub"
This reverts commit f844c79383ec6739ed712f59e33a524a26b3d35a.
* fix(proxy_server.py): backend model info endpoint improvements
* UI Improvements for Default User access (#11952)
* feat(ui/): add a 'current team' and 'view' filters to the models page
allow user to see all the models they have access to within a specific team
* feat: working ui for seeing models in teams
* fix(model_dashboard.tsx): make current team filter more prominent
* style(model_dashboard.tsx): add a helpful note telling user how to create a model for the team they've selected
* style(model_dashboard.tsx): only show helpful note when current view is team, not for global
* fix(team_dropdown.tsx): allow searching by team id on create key modal
* feat(create_key_button.tsx): add helpful message when team selection is required
* fix: fix linting checks
* fix: fix ui linting error
* docs(team_endpoints.py): document new param