* litellm_proxy_unit_testing_part1
* test proxy unit test
* litellm_proxy_unit_testing_key_generation
* test_async_call_with_key_over_model_budget
* test_aasync_call_with_key_over_model_budget
* Refactor proxy embeddings to use shared processor
- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks
- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses
- tighten token array decoding logic by using router deployment lookups and the unified error handler
* Fix: Correctly process embedding requests with token arrays
The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.
This was caused by a combination of three distinct issues:
1. In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.
2. In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.
3. In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.
Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.
* test: align proxy embedding assertions
Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.
* Update proxy exception test
The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.
* testing: unsure of this change
I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it.
* fix: remove unrelated change
This change was not related to the embeddings refactor and actually belonged to a different branch.
* Fix embeddings endpoint call_type to use valid CallTypes enum value
Fixed bug where the `/embeddings` endpoint was passing `call_type="embeddings"`
to guardrail hooks, but "embeddings" is not a valid value in the CallTypes enum.
Changed to use `call_type="aembedding"` (async embedding) which is the correct
CallTypes enum value and matches the route_type used in the same function.
Added unit tests to verify:
- "embeddings" is not a valid CallTypes enum value
- "aembedding" is the correct valid value
- The fix prevents ValueError when guardrails are enabled
Fixes#16240
* Inline embeddings call type regression check
* Ensure embedding test preserves proxy metadata
* Add gemini api key in the custom api url
* Update tests
* Use api key n the header
* Use api key n the header
* fix mypy error
* fix mypy error
* fix test gemini auth
* fix(presidio.py): handle content as a list of texts
covers openai + anthropic messages api
* fix(presidio.py): safe get messages
* test: add unit testing for presidio guardrails
* fix(unified_guardrail.py): initial commit
* fix(enkryptai.py): implement apply_guardrail to enkrypt guardrail
* fix(unified_guardrail.py): support unified guardrail on input
* feat(unified_guardrail.py): add post call success hook implementation
allows us to just have 1 place to handle llm translation to guardrail api spec
* refactor: refactor initial unified guardrail component
* refactor: more refactoring
* feat(responses/): add guardrails to responses api
allows existing guardrails to work for new llm endpoints
* docs(adding_guardrail_support.md): document new guardrail endpoint support
* test: add unit tests
* feat(image_generation/): add guardrail support for image generation endpoint
* feat(openai/text_completion): support guardrails on `/v1/completions` API
* docs: document guardrails support on new endpoints
* docs: clarify when guardrails run
* feat(openai/speech): add guardrail support for input
* docs(rerank/): add guardrail support on input query
* fix: fix ruff check
* fix: use fastuuid helper across the codebase
First batch of changes, simple drop in replacement.
* second batch of changes
* fixed: script mistake on helper file
- Apply Black formatting to all Bedrock CountTokens files
- Clean up imports and remove unused variables in tests
- Fix indentation and simplify test structure
- Fix pyright type error with type ignore annotation
- All tests continue to pass after cleanup
- Add endpoint integration test in test_proxy_token_counter.py
- Add unit tests for transformation logic in bedrock/count_tokens/
- Test model extraction from request body vs endpoint path
- Test input format detection (converse vs invokeModel)
- Test request transformation from Anthropic to Bedrock format
- All tests follow existing codebase patterns and pass successfully