Files
litellm/tests/proxy_unit_tests
Alexsander Hamir c7847125c2 [Perf] Embeddings: Use router's O(1) lookup and shared sessions (#16344)
* Refactor proxy embeddings to use shared processor

- allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks

- route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses

- tighten token array decoding logic by using router deployment lookups and the unified error handler

* Fix: Correctly process embedding requests with token arrays

The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected.

This was caused by a combination of three distinct issues:

1.  In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup.

2.  In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers.

3.  In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists.

Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions.

* test: align proxy embedding assertions

Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload.

* Update proxy exception test

The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list.

* testing: unsure of this change

I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it.

* fix: remove unrelated change

This change was not related to the embeddings refactor and actually belonged to a different branch.
2025-11-14 09:21:45 -08:00
..
2025-10-25 10:19:24 -07:00
2025-10-25 10:19:24 -07:00
2025-10-02 18:48:11 -07:00
2025-09-27 16:57:27 -07:00
2025-10-25 10:19:24 -07:00
2025-10-25 10:19:24 -07:00