litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-22 19:36:22 +00:00

Author	SHA1	Message	Date
Ishaan Jaff	41566722af	[Feat] UI - Prompt Management - Allow testing prompts with Chat UI (#16898 ) * TestPromptRequest * add prompts/test endpoint for testing prompt * TestPromptTestEndpoint * feat: working v1 of this ui * workig prompt endpoints * add chat ui for prompts * add conversation panel * add init chat ui	2025-11-21 08:53:18 -08:00
Sameer Kankute	34cc532d8d	Make sure that user inherits team permissions (#16639 )	2025-11-18 20:14:42 -08:00
Ishaan Jaff	06eeb28c8f	Litellm ci cd fixes 2 (#16693 ) * litellm_proxy_unit_testing_part1 * test proxy unit test * litellm_proxy_unit_testing_key_generation * test_async_call_with_key_over_model_budget * test_aasync_call_with_key_over_model_budget	2025-11-15 14:12:44 -08:00
Ishaan Jaffer	666913f76d	test_async_call_with_key_over_model_budget	2025-11-15 09:26:29 -08:00
Ishaan Jaffer	63994e302e	test_call_with_key_over_model_budget	2025-11-14 19:05:00 -08:00
Ishaan Jaffer	9e8653ad3c	fix prisma client	2025-11-14 18:25:27 -08:00
Alexsander Hamir	c7847125c2	[Perf] Embeddings: Use router's O(1) lookup and shared sessions (#16344 ) * Refactor proxy embeddings to use shared processor - allow ProxyBaseLLMRequestProcessing to accept the aembedding route so embeddings requests reuse the base pipeline hooks - route embeddings requests through base_process_llm_request, sharing logging, hook execution, retries, and header handling with chat/responses - tighten token array decoding logic by using router deployment lookups and the unified error handler * Fix: Correctly process embedding requests with token arrays The `test_embedding_input_array_of_tokens` test was failing due to a regression that caused embedding requests with token arrays to be processed incorrectly. This prevented the `aembedding` function from being called as expected. This was caused by a combination of three distinct issues: 1. In `litellm/proxy/common_request_processing.py`, the `function_setup` utility was called with `aembedding` as the `original_function` for embedding routes. This has been corrected to `embedding` to ensure proper request setup. 2. In `litellm/proxy/proxy_server.py`, a `TypeError` occurred because the `get_deployment` method was called with the `model_name` keyword argument instead of the expected `model_id`. This has been corrected. Additionally, the check for token arrays was improved to validate that all elements in the input subarray are integers. 3. In `litellm/proxy/litellm_pre_call_utils.py`, the check for the `enforced_params` enterprise feature was too strict. It blocked valid requests even when the `enforced_params` list was empty. The condition has been adjusted to trigger the check only for non-empty lists. Finally, the `test_embedding_input_array_of_tokens` assertion was updated to be more robust. The previous `assert_called_once_with` was overly strict, causing failures when unrelated internal parameters were added to the function call. The test now first asserts that `aembedding` is called and then separately verifies the `model` and `input` arguments. This makes the test more resilient to future changes without sacrificing its ability to catch regressions. * test: align proxy embedding assertions Update the embedding proxy test to match the new request pipeline: keep the data the proxy builds, expect the extra control kwargs, let the post-call hook return the actual response, and assert the normalized 'embeddings' hook type. This proves the refactor still forwards metadata and returns the mocked payload. * Update proxy exception test The proxy now forwards additional kwargs (request_timeout, litellm_call_id, litellm_logging_obj) to llm_router.aembedding. The test needs to accept these to match the real call signature and keep validating the error path instead of the kwargs list. * testing: unsure of this change I don't remember why I changed this, will revert and see if any tests fail since the manual test isn't failing without it. * fix: remove unrelated change This change was not related to the embeddings refactor and actually belonged to a different branch.	2025-11-14 09:21:45 -08:00
Ishaan Jaffer	ee8b1cfabc	test_call_with_end_user_over_budget	2025-11-13 16:26:02 -08:00
yuneng-jiang	cb27d6c456	[Fix] UI - Delete Callbacks Failing (#16473 ) * Temp commit for branch switching * Created normalize callback name util function and tests	2025-11-12 18:43:37 -08:00
Ishaan Jaff	5c9f50d584	[AI Gateway] - End User Budgets - Allow pointing max_end_user budget to an id, so the default ID applies to all end users (#16456 ) * add _apply_budget_limits_to_end_user_params * add _apply_budget_limits_to_end_user_params * add _apply_budget_limits_to_end_user_params * test_default_budget_applied_to_end_user_without_budget * docs fix * fix config	2025-11-11 08:20:13 -08:00
Cesar Garcia	16325024df	fix: Use valid CallTypes enum value in embeddings endpoint (#16328 ) * Fix embeddings endpoint call_type to use valid CallTypes enum value Fixed bug where the `/embeddings` endpoint was passing `call_type="embeddings"` to guardrail hooks, but "embeddings" is not a valid value in the CallTypes enum. Changed to use `call_type="aembedding"` (async embedding) which is the correct CallTypes enum value and matches the route_type used in the same function. Added unit tests to verify: - "embeddings" is not a valid CallTypes enum value - "aembedding" is the correct valid value - The fix prevents ValueError when guardrails are enabled Fixes #16240 * Inline embeddings call type regression check * Ensure embedding test preserves proxy metadata	2025-11-06 19:25:00 -08:00
Ishaan Jaffer	b5d81a5d9c	test_completion_text_003_prompt_array, test_key_generate_with_secret_manager_call	2025-11-06 17:18:01 -08:00
Petre Alexandru	911e802969	feat: add parallel execution handling in during_call_hook (#16279 )	2025-11-05 18:35:25 -08:00
yuneng-jiang	5d158775b1	[Fix] Litellm non root docker Model Hub Table fix (#16282 ) * Fix model hub table 404 on non-root docker * Adding test	2025-11-05 18:30:20 -08:00
Sameer Kankute	c45fad3855	Fix: Send Gemini API key via x-goog-api-key header with custom api_base (#16085 ) * Add gemini api key in the custom api url * Update tests * Use api key n the header * Use api key n the header * fix mypy error * fix mypy error * fix test gemini auth	2025-11-05 07:12:13 -08:00
Bowen Liang	4e12e3f90d	fix typo of orginal (#16255 )	2025-11-04 18:55:44 -08:00
steve-gore-snapdocs	88240c4cba	Fix Anthropic token counting for VertexAI (#16171 ) * transform anthropic messages in gemini handler * initial * linting * remove extra testt * maintain consistency * more tests * Revert "transform anthropic messages in gemini handler" This reverts commit 805e60fd2887991bb4b4554b9394437b874835f9. * don't lint file we aren't changing * cleanup * cleanup * Cleanup	2025-11-02 09:02:07 -08:00
Ishaan Jaffer	cb57455172	test_foward_litellm_user_info_to_backend_llm_call	2025-10-27 13:48:23 -07:00
Krish Dholakia	2bd41dc034	Guardrails - Responses API, Image Gen, Text completions, Audio transcriptions, Audio Speech, Rerank, Anthropic Messages API support via the unified `apply_guardrails` function (#15706 ) * fix(presidio.py): handle content as a list of texts covers openai + anthropic messages api * fix(presidio.py): safe get messages * test: add unit testing for presidio guardrails * fix(unified_guardrail.py): initial commit * fix(enkryptai.py): implement apply_guardrail to enkrypt guardrail * fix(unified_guardrail.py): support unified guardrail on input * feat(unified_guardrail.py): add post call success hook implementation allows us to just have 1 place to handle llm translation to guardrail api spec * refactor: refactor initial unified guardrail component * refactor: more refactoring * feat(responses/): add guardrails to responses api allows existing guardrails to work for new llm endpoints * docs(adding_guardrail_support.md): document new guardrail endpoint support * test: add unit tests * feat(image_generation/): add guardrail support for image generation endpoint * feat(openai/text_completion): support guardrails on `/v1/completions` API * docs: document guardrails support on new endpoints * docs: clarify when guardrails run * feat(openai/speech): add guardrail support for input * docs(rerank/): add guardrail support on input query * fix: fix ruff check	2025-10-25 13:38:57 -07:00
Ishaan Jaffer	0bedf1c0a7	fix tests	2025-10-25 10:19:24 -07:00
Carlo Alberto Ferraris	8b1424166b	attempt to avoid/minimize deadlocks (#15281 ) Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>	2025-10-24 12:22:38 -07:00
Ishaan Jaff	f55745fc5e	[Fix] Forward anthropic-beta headers to Bedrock, VertexAI (#15700 ) * [Fix] Forward anthropic-beta headers to Bedrock and other cross-provider scenarios (#15623) * add_provider_specific_headers_to_request * fix add_provider_specific_headers_to_request * test_provider_specific_header_multi_provider * test_provider_specific_header_in_request --------- Co-authored-by: Jack Venberg <jack.venberg@rover.com>	2025-10-18 16:26:32 -07:00
Nagailic Sergiu (Nikro)	6842d705d5	fix(token-counter): extract model_info from deployment for custom_tokenizer (#15657 ) (#15680 )	2025-10-17 19:38:45 -07:00
Achintya Rajan	264f1cded1	Merge branch 'main' into litellm_view_key_pagination_calls_fix	2025-10-06 18:10:57 -07:00
Krrish Dholakia	63cb2764fe	test: fix raise	2025-10-04 16:11:22 -07:00
=	6ba077593f	Update test_key_generate_prisma.py	2025-10-04 14:36:19 -07:00
=	5e03ef7382	fixes bloated key alias network calls with lean endpoint	2025-10-04 14:32:15 -07:00
Ishaan Jaffer	9c29f35c4b	test_end_user_jwt_auth	2025-10-02 18:48:11 -07:00
Ishaan Jaffer	ce57f59531	test_gemini_pass_through_endpoint	2025-09-27 17:17:12 -07:00
Ishaan Jaffer	0ec7dace79	test_embedding	2025-09-27 16:57:27 -07:00
Ishaan Jaffer	3c5e0abaf2	async_log_success_event	2025-09-27 14:17:13 -07:00
Ishaan Jaffer	6aa35ec999	test text-embedding-ada-002	2025-09-27 12:41:35 -07:00
Ishaan Jaffer	c27beb74b9	test fix	2025-09-27 12:40:34 -07:00
Ishaan Jaffer	284a8549a1	test_chat_completion	2025-09-27 11:43:20 -07:00
Ishaan Jaffer	3baa3aff1b	test fix	2025-09-27 10:38:35 -07:00
Mubashir Osmani	625ed3f8cf	fix: prisma client state retries (#14925 ) * added qwen models and gpt-5-codex * fix flaky test * fix failing test * Added retries to prisma client state * fix: prisma client state retries in pods * Revert "fix failing test" This reverts commit dbec4988a2627257fd05b905e216225664517f32. * Revert "fix flaky test" This reverts commit b0ac2f2dc35ca433af0c82f3cda770d6981caff4. * Revert "added qwen models and gpt-5-codex" This reverts commit 9a8a8f2d47ab4dc8aecb0cd9a6a4f82ed81bb056. * Revert "fix: prisma client state retries in pods" This reverts commit 04e58e5ca1a489916e3b49e9b674f5c6713fd7cd. * fix lint * Revert "fix lint" This reverts commit 5303d52a5e3bee7e131dcabd098e94f0613a7bb9. * fixed lint	2025-09-25 21:54:00 -07:00
Alexsander Hamir	eaa04cd8ce	fix: use fastuuid helper (#14903 ) * fix: use fastuuid helper across the codebase First batch of changes, simple drop in replacement. * second batch of changes * fixed: script mistake on helper file	2025-09-25 15:47:01 -07:00
Mubashir Osmani	a7a6381926	fix: flaky passthrough tests (#14692 ) * fix: flaky passthrough tests * Revert "fix: flaky passthrough tests" This reverts commit ffe692e017600a8853ab7c31f95485958ab74c5f. * fix: serialize prisma objects	2025-09-18 15:35:14 -07:00
Krish Dholakia	bfaab8ad7e	Merge pull request #14557 from timelfrink/fix/issue-14478-bedrock-count-tokens-endpoint Implement AWS Bedrock CountTokens API support	2025-09-17 23:51:06 -07:00
Tim Elfrink	c234b13275	Apply code formatting and linting fixes - Apply Black formatting to all Bedrock CountTokens files - Clean up imports and remove unused variables in tests - Fix indentation and simplify test structure - Fix pyright type error with type ignore annotation - All tests continue to pass after cleanup	2025-09-18 08:28:17 +02:00
Tim Elfrink	e74ac35b5d	Add comprehensive tests for Bedrock CountTokens functionality - Add endpoint integration test in test_proxy_token_counter.py - Add unit tests for transformation logic in bedrock/count_tokens/ - Test model extraction from request body vs endpoint path - Test input format detection (converse vs invokeModel) - Test request transformation from Anthropic to Bedrock format - All tests follow existing codebase patterns and pass successfully	2025-09-18 08:16:56 +02:00
Mubashir Osmani	8b804303ed	fix: ci/cd tests + lint errors (#14646 ) * fix: lint errors + tests * fixed ci tests * fixed tests --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2025-09-17 17:06:43 -07:00
Sameer Kankute	69c01488bd	remove not needed names (#14641 )	2025-09-17 14:26:48 -07:00
Krish Dholakia	635dc72211	Merge pull request #14604 from Sameerlite/litellm_gemini_api_base_update Litellm gemini api base update	2025-09-16 22:38:44 -07:00
Alexsander Hamir	02db2e8ae8	[Performance] RPS Improvement +500 RPS when sending the `user` field (#14616 ) * perf tool * fix: cache type issue * fix: exception hanging & cache setting 1. Removed unhandled exceptions 2. Set cache value to dict	2025-09-16 16:18:23 -07:00
Sameerlite	f08fc45a0f	add base url support for gemini	2025-09-16 15:15:24 +05:30
Sameer Kankute	1a123b2cd5	Litellm gemini cli bug fix (#14451 ) * Fix gemini cli error * Add reasoning request support * Added better handling * remove other PR code * refactored code for better structure following --------- Co-authored-by: sameer@berri.ai <sameer@berri.ai>	2025-09-12 11:55:26 -07:00
Krrish Dholakia	c45ede7187	test: update test	2025-09-09 21:31:34 -07:00
Ishaan Jaff	2cc85936ed	Revert "Security fix - prevent proxy_admin_viewer from modifying other user's credentials + remove hardcoded sensitive keys from test repo" (#14362 )	2025-09-08 18:40:54 -07:00
Krrish Dholakia	06d472f205	test: fix tests	2025-09-06 21:59:02 -07:00

243 Commits