litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-24 15:38:19 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	5596728cae	Merge pull request #24753 from BerriAI/litellm_dev_03_27_2026_p1 Fix returned model when batch completions is used - return picked model, not comma-separated list	2026-03-30 17:53:48 -07:00
Krrish Dholakia	4c00a14ce0	fix: fix ci/cd + handle oidc jwt tokens	2026-03-30 16:12:58 -07:00
Yuneng Jiang	6522d282b5	[Fix] Correct kwarg name in test_user_api_key_auth tests PR #24755 renamed `azure_api_key_header` to `AZURE_AI_API_KEY_header` in the test file but did not update the actual function signatures of `get_api_key()` and `_user_api_key_auth_builder()`, causing TypeError on all affected test cases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 14:51:09 -07:00
Krrish Dholakia	bc829d51f2	test: test	2026-03-28 19:17:38 -07:00
Yuneng Jiang	7100ed5d0a	[Fix] Test isolation for agent health checks and documentation test path resolution Fix agent health check tests failing with 500 errors in parallel CI by mocking prisma_client to None. Fix documentation validation tests using CWD-relative paths that break depending on the working directory. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:00:22 -07:00
ryan-crabbe-berri	2eb3c20e76	Merge pull request #24718 from BerriAI/litellm_ryan-march-26 litellm ryan march 26	2026-03-28 09:01:11 -07:00
Krrish Dholakia	32adda8a49	fix: return winning model name instead of comma-separated list for fastest_response When fastest_response=true with comma-separated models, the response model field was stamped with the entire comma-separated string. Now uses the x-litellm-model-group header from the winning response to return the correct model name. Made-with: Cursor	2026-03-27 22:34:26 -07:00
ryan-crabbe-berri	726a34627c	Merge pull request #24717 from BerriAI/litellm_fix-user-cache-invalidation fix(jwt): invalidate user cache after role/team sync updates	2026-03-27 19:50:41 -07:00
Ryan Crabbe	dd11e77852	fix: add explicit TTL to cache writes and test coverage for user cache invalidation Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache calls in sync_user_role_and_teams for consistency with all other user cache writes. Add 3 tests covering cache invalidation on role change, team change, and no-op when nothing changes.	2026-03-27 19:45:13 -07:00
ryan-crabbe-berri	5b651048f2	Merge pull request #24706 from BerriAI/litellm_fix-jwt-none-guard fix(auth): guard JWTHandler.is_jwt() against None token	2026-03-27 18:06:24 -07:00
yuneng-jiang	846e4b44b6	Merge pull request #24682 from michelligabriele/fix/budget-spend-counters fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters	2026-03-27 16:59:23 -07:00
Ryan Crabbe	8e3755931d	test(auth): add regression tests for JWTHandler.is_jwt(None) Add None-token test cases to both proxy_unit_tests and test_litellm to cover the guard added in the previous commit. Also add -> bool return type annotation to is_jwt().	2026-03-27 16:51:08 -07:00
Ryan Crabbe	e24819afef	fix(sso): pass decoded JWT access token to role mapping during SSO login During SSO login, bearer tokens are stripped from the OAuth response before role mapping runs. Custom role claims encoded inside the JWT access token are lost, so map_jwt_role_to_litellm_role() returns None and the user falls back to internal_user_viewer. process_sso_jwt_access_token() now returns the decoded JWT payload, and a new _sync_user_role_from_jwt_role_map() receives it so jwt_litellm_role_map works correctly during SSO login.	2026-03-27 13:50:30 -07:00
michelligabriele	d533b432fd	fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters Budget checks on API keys, teams, and team members were not enforced in multi-pod deployments because user_api_key_cache is intentionally in-memory-only. Each pod tracked spend independently, so with N pods the effective budget was N × max_budget. Introduces a separate spend_counter_cache (DualCache wired to redis_usage_cache) with atomic increment/read helpers: - increment_spend_counters(): awaited in cost callback (not create_task) to update both in-memory and Redis before the next auth check - get_current_spend(): reads Redis first (cross-pod authoritative), falls back to in-memory, then to cached object .spend from DB Budget check functions (_virtual_key_max_budget_check, _team_max_budget_check, _check_team_member_budget) now read spend via get_current_spend() instead of cached object .spend fields. When Redis is not configured, falls back to in-memory-only counters (same as current single-instance behavior). Fixes #23714	2026-03-27 20:39:52 +01:00
yuneng-jiang	1b111d23f3	Merge pull request #24688 from Sameerlite/litellm_litellm_team-model-group-name-routing-fix fix(team-routing): preserve sibling deployment candidates for team public models	2026-03-27 12:00:34 -07:00
Sameer Kankute	92a07e2d6e	fix(proxy): address Greptile review feedback - Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs) - Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring) - Remove dead code in _is_valid_user_id (space exception was unreachable) - Update tests accordingly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 20:38:36 +05:30
Sameer Kankute	8112fbf274	fix(proxy): sanitize user_id input and block dangerous env var keys Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 20:38:36 +05:30
Sameer Kankute	2321d77599	fix(router): address remaining Greptile review comments - Cache LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS at module level to avoid hot-path secret lookups - Add clarifying comments for should_include_deployment team isolation logic - Add negative assertion for update_team.assert_not_called() in test - Add docstring clarification for _get_team_deployments helper pattern - Add explicit assertion message in test_get_model_list_alias_optimization Made-with: Cursor	2026-03-27 20:11:28 +05:30
Sameer Kankute	1a0b30aaac	Fix greptile reviews and mock test	2026-03-27 20:11:28 +05:30
Sameer Kankute	c6cc0341f6	Fix greptile reviews and mock test	2026-03-27 20:11:28 +05:30
Sameer Kankute	316a742945	Fix greptile comments	2026-03-27 20:11:28 +05:30
Sameer Kankute	173695f5e0	Fix greptile comments	2026-03-27 20:11:27 +05:30
Sameer Kankute	e8fb7762b3	perf(routing): optimize team model checks and improve test coverage - Use O(1) team index lookup instead of map_team_model in alias guard - Fix MockPrismaClient to validate where clause filters - Add comment explaining DB query trade-off for team deployments Made-with: Cursor	2026-03-27 20:11:27 +05:30
Sameer Kankute	8aa58bdcaa	fix(routing): prevent stale model_aliases from interfering with team routing - Skip model_aliases rewrite if model resolves to team deployments - Add test coverage for sibling-preservation branch - Update MockPrismaClient to support sibling deployment scenarios Made-with: Cursor	2026-03-27 20:11:27 +05:30
Sameer Kankute	f5b7298854	fix(management): query DB directly for sibling deployments on rename - Add clarifying comments to test assertions - Query prisma DB instead of in-memory router to avoid stale state - Prevents incorrect deletion of old public name when siblings exist Made-with: Cursor	2026-03-27 20:11:27 +05:30
Sameer Kankute	aeb932d707	fix(team-routing): keep team model routing on public names Remove team model_alias rewrites and resolve team deployments by team_public_model_name with team_id so sibling deployments stay in the routing candidate pool, with explicit logs showing candidate selection before load balancing. Made-with: Cursor	2026-03-27 20:11:27 +05:30
Sameer Kankute	5534b40ab3	fix(team-routing): use deterministic team model group names Use a deterministic internal model_name for team-scoped deployments so sibling deployments with the same public model share a routing group. This makes team alias writes idempotent and preserves multi-deployment failover/load balancing behavior. Made-with: Cursor	2026-03-27 20:11:27 +05:30
Ryan Crabbe	0aadf51342	fix(proxy): ignore return_to in SSO when control_plane_url is not configured Instead of returning a 400 error when return_to is passed without control_plane_url configured, silently ignore it and proceed with the normal same-origin SSO flow.	2026-03-23 21:54:29 -07:00
Krrish Dholakia	26d162ccf4	fix(test): add user_api_key_project_alias to spend logs expected keys Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-23 18:12:50 -07:00
michelligabriele	fa7ccf0893	fix(test): add request_data param to test mock + black formatting	2026-03-23 15:43:05 +01:00
michelligabriele	4625ccbaa2	fix(proxy): anchor metadata dict in _process_response/_process_error so pop() mutates the real dict	2026-03-23 15:39:23 +01:00
michelligabriele	d8fd9a20ed	fix(proxy): address Greptile review — streaming request_data, OCR backward compat, test coverage - Pass request_data to end-of-stream process_output_streaming_response call - Restore inputs.update() in OCR handler for third-party guardrail providers - Add streaming end-to-end test for guardrail logging passthrough	2026-03-23 15:39:23 +01:00
michelligabriele	ae454fd700	fix(proxy): OpenAI Moderation post-call guardrail response not captured for logging Two independent bugs prevented post-call OpenAI Moderation guardrail results from reaching downstream logging callbacks (Langfuse, Datadog). Bug 1: process_output_response() created a throwaway request_data dict, so guardrail info written by @log_guardrail_information was discarded. Fixed by threading the real request_data from the unified guardrail dispatcher through all 13 BaseTranslation handlers, with litellm_metadata injection preserved for third-party guardrails (Zscaler, Prompt Security). Also extended to process_output_streaming_response for consistency. Bug 2: The @log_guardrail_information decorator collapsed the full moderation API response (categories, scores, flagged status) to "allow". Fixed by overriding _process_response/_process_error on OpenAIModerationGuardrail to stash and log the full response, following the established Model Armor pattern.	2026-03-23 15:39:22 +01:00
yuneng-jiang	9963b31e07	Revert "fix(proxy): restore per-entity breakdown in aggregated daily activity endpoint" This reverts commit `9c3fab24ad`.	2026-03-21 21:37:29 -07:00
yuneng-jiang	e3d4c29d37	Merge pull request #24323 from BerriAI/litellm_ryan_march_20 litellm ryan march 20	2026-03-21 15:57:28 -07:00
yuneng-jiang	72fba093c8	Merge remote-tracking branch 'origin/main' into litellm_dev_sameer_16_march_week	2026-03-21 15:11:29 -07:00
yuneng-jiang	2b889f1627	Merge pull request #23471 from michelligabriele/fix/aggregated-activity-entity-breakdown fix(proxy): restore per-entity breakdown in aggregated daily activity endpoint	2026-03-21 14:59:41 -07:00
Krish Dholakia	f911d8d865	Merge pull request #23818 from BerriAI/litellm_oss_staging_03_17_2026 fix(fireworks): skip #transform=inline for base64 data URLs (#23729)	2026-03-21 14:54:39 -07:00
yuneng-jiang	262534a3a5	Merge branch 'main' into litellm_dev_sameer_16_march_week	2026-03-21 14:30:57 -07:00
Ishaan Jaff	2ea9e207bd	Litellm ishaan march 20 (#24303 ) * feat(redis): add circuit breaker to RedisCache to fast-fail when Redis is down (#24181) * feat(redis): add circuit breaker env var constants * feat(redis): add RedisCircuitBreaker and apply guard decorator to all async ops * fix(dual_cache): fall back to L1 instead of re-raising on Redis increment failures * test(caching): add circuit breaker unit tests * fix(redis): fast-fail concurrent HALF_OPEN probes — only one probe at a time * fix(dual_cache): return None fallback when in_memory_cache is absent and Redis fails * test(caching): add regression tests for HALF_OPEN concurrency and None fallback * Fix blocking sync next in __anext__ (#24177) * Fix blocking sync next * Update tests/test_litellm/litellm_core_utils/test_streaming_handler.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix PEP 479 regression in __anext__ sync iterator exhaustion asyncio.to_thread re-raises thread exceptions inside a coroutine, where PEP 479 converts StopIteration to RuntimeError before any except clause can catch it. Add _next_sync_or_exhausted() module-level helper that catches StopIteration in the thread and returns a sentinel instead, then raise StopAsyncIteration in the coroutine. Also rewrites the non-blocking test to use asyncio.gather() instead of asyncio.create_task() (which returned None on Python 3.9 / pytest-asyncio in CI), and adds an exhaustion regression test that drains the wrapper fully and asserts no RuntimeError leaks out. --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * feat: add git-subdir source type to claude-code/plugins API (#24223) Support a third plugin source type `git-subdir` alongside the existing `github` and `url` types, as documented in the official Claude Code plugin marketplaces spec. 
New format: {"source": "git-subdir", "url": "...", "path": "subdir/path"} - Validates url and path fields are present and non-empty - Rejects absolute paths, '..' segments, backslashes, and percent-encoded traversal sequences (including double-encoded variants via regex check) - Extracts path validation into _validate_git_subdir_path() helper - Updates Pydantic field description to document all three source types - Adds isValidUrl() check for url/git-subdir source types in the UI form - Adds "Git Subdir" option to the UI form with a required Path field - Adds unit tests covering success, update, missing/empty fields, path traversal variants, and unknown source type Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] add extract_header and extract_footer to Mistral OCR supported params (#24213) * docs: add git-subdir source type to claude-code plugin marketplace docs (#24289) * fix(ui): swap J/K keyboard navigation in log details drawer (#24279) (#24286) J should navigate down (next) and K should navigate up (previous), matching vim/standard conventions. 
* fix: use async_set_cache in user_api_key_auth hot path (#24302) * fix: use async_set_cache in auth hot path to avoid blocking event loop * test: assert no blocking set_cache call in _user_api_key_auth_builder * test: broaden blocking call check to all sync DualCache methods * test: fix regression test to actually catch blocking cache calls * fix: ruff lint unused variable + UI build MessageManager error - litellm/caching/redis_cache.py: remove unused variable 'e' in circuit breaker exception handler (F841) - add_plugin_form.tsx: use MessageManager.error() instead of undefined message.error() for git URL validation Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs: add REDIS_CIRCUIT_BREAKER env vars to config_settings reference Add REDIS_CIRCUIT_BREAKER_FAILURE_THRESHOLD and REDIS_CIRCUIT_BREAKER_RECOVERY_TIMEOUT to the environment variables reference table so test_env_keys.py passes. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Vincenzo Barrea <manamana88@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Robert Kirscht <rkirscht242@gmail.com> Co-authored-by: Imgyu Kim <kimimgo@gmail.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-21 12:40:11 -07:00
Sameer Kankute	4f1e484a9b	Merge branch 'main' into litellm_dev_sameer_16_march_week Resolve conflicts in common_request_processing.py (keep main streaming, post_call_success_hook try/finally, deferred logging; retain skip_pre_call_logic) and utils.py (defer + internal-call skip + sync success callbacks for all calls). Tighten _has_post_call_guardrails for event_hook=None; align deferred guardrail test. Sync model_prices_and_context_window_backup.json. Pyright: narrow ignores for passthrough StreamingResponse and post_call hook. Made-with: Cursor	2026-03-22 00:29:38 +05:30
Krrish Dholakia	509d2e9ac3	Fix PR review issues: gpt-4-0314 prompt caching, case-insensitive data URL check, test I/O mocking - Remove incorrect supports_prompt_caching from gpt-4-0314 (predates the feature) - Make data-URL detection case-insensitive in Gemini tool call result conversion - Mock show_banner/generate_feedback_box in max_budget tests to prevent real I/O Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-21 11:30:29 -07:00
Krish Dholakia	a5b7e49713	Merge branch 'main' into litellm_oss_staging_03_17_2026	2026-03-21 10:40:48 -07:00
Sameer Kankute	49abf98a27	Merge branch 'main' into litellm_oss_staging_03_17_2026	2026-03-21 21:16:49 +05:30
Sameer Kankute	a427807796	Merge branch 'main' into litellm_dev_sameer_16_march_week	2026-03-21 21:16:07 +05:30
Sameer Kankute	5b5c998dbd	Merge branch 'main' into litellm_oss_staging_03_19_2026	2026-03-21 20:31:08 +05:30
ryan-crabbe	1da02b66f6	Merge branch 'main' into litellm_audit_log_s3_export	2026-03-20 16:39:54 -07:00
ryan-crabbe	59b4a05782	Merge branch 'main' into litellm_ryan_march_18	2026-03-20 13:36:37 -07:00
yuneng-jiang	5927a77a14	Merge branch 'main' into fix/aggregated-activity-entity-breakdown	2026-03-20 11:50:59 -07:00
yuneng-jiang	f884e4ac66	Merge branch 'main' into fix/team-member-budget-duration-on-create	2026-03-20 11:48:08 -07:00

1761 Commits