PR #24755 renamed `azure_api_key_header` to `AZURE_AI_API_KEY_header` in
the test file but did not update the actual function signatures of
`get_api_key()` and `_user_api_key_auth_builder()`, causing TypeError
on all affected test cases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When fastest_response=true is used with comma-separated models, the
response model field was stamped with the entire comma-separated string.
The response now takes the model name from the x-litellm-model-group
header of the winning response.
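A minimal sketch of the header-based resolution described above. Only the x-litellm-model-group header name comes from this change; the function name `resolve_response_model` and the fallback to the requested string when the header is missing are assumptions for illustration.

```python
from typing import Mapping, Optional

MODEL_GROUP_HEADER = "x-litellm-model-group"


def resolve_response_model(
    requested_model: str,
    winning_headers: Optional[Mapping[str, str]],
) -> str:
    """Prefer the model group reported by the winning response.

    Hypothetical helper: falls back to the requested string when the
    header is absent (an assumption, not confirmed behavior).
    """
    if winning_headers:
        group = winning_headers.get(MODEL_GROUP_HEADER)
        if group:
            return group
    return requested_model
```

With a request for "gpt-4o,claude-3" and a winning response whose header names "claude-3", the sketch returns "claude-3" rather than the full comma-separated string.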
Made-with: Cursor
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.
process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.
Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB
Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.
When Redis is not configured, spend tracking falls back to in-memory-only
counters (same as current single-instance behavior).
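The read order described above can be sketched roughly as follows. This is an illustration only: `CounterStore` and `get_current_spend_sketch` are hypothetical names, and a plain in-process object stands in for both Redis and the per-pod cache, so it does not reflect the real DualCache API.

```python
import asyncio
from typing import Dict, Optional


class CounterStore:
    """Hypothetical counter store; stands in for Redis or a per-pod cache."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}

    async def increment(self, key: str, amount: float) -> float:
        self._values[key] = self._values.get(key, 0.0) + amount
        return self._values[key]

    async def get(self, key: str) -> Optional[float]:
        return self._values.get(key)


async def get_current_spend_sketch(
    key: str,
    shared: Optional[CounterStore],  # cross-pod store; None if not configured
    local: CounterStore,             # this pod's in-memory counters
    cached_spend: float,             # .spend from the cached DB object
) -> float:
    # 1. The shared store is authoritative across pods.
    if shared is not None:
        value = await shared.get(key)
        if value is not None:
            return value
    # 2. Otherwise use this pod's own counter.
    value = await local.get(key)
    if value is not None:
        return value
    # 3. Last resort: the spend recorded on the cached object.
    return cached_spend


async def demo() -> tuple:
    shared, pod_a, pod_b = CounterStore(), CounterStore(), CounterStore()
    # Two pods each record spend locally and in the shared store.
    await pod_a.increment("key:abc", 3.0)
    await shared.increment("key:abc", 3.0)
    await pod_b.increment("key:abc", 4.0)
    await shared.increment("key:abc", 4.0)
    return (
        await get_current_spend_sketch("key:abc", shared, pod_a, 0.0),
        await get_current_spend_sketch("key:abc", None, pod_a, 0.0),
        await get_current_spend_sketch("key:new", None, pod_a, 1.5),
    )
```

In `demo()`, pod A sees the combined spend of both pods when the shared store is present, and only its own spend when it is not, which is the N × max_budget gap this change closes.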
Fixes #23714
- Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs)
- Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring)
- Remove dead code in _is_valid_user_id (space exception was unreachable)
- Update tests accordingly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config.
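A rough sketch of the two checks described above. The three blocked keys are the ones named in this message; the helper names, the exact-match comparison, and the 256-character limit are assumptions chosen for illustration rather than the values used in the codebase.

```python
from typing import Dict, Optional

# Keys named in the commit message; the real blocklist may contain more.
BLOCKED_ENV_KEYS = frozenset({"PATH", "LD_PRELOAD", "PYTHONPATH"})
MAX_USER_ID_LENGTH = 256  # hypothetical limit


def filter_config_env(config_env: Dict[str, str]) -> Dict[str, str]:
    """Drop config-supplied variables that could hijack process startup."""
    return {k: v for k, v in config_env.items() if k not in BLOCKED_ENV_KEYS}


def is_acceptable_user_id(user_id: Optional[str]) -> bool:
    """Reject empty, over-long, or control-character-bearing user IDs."""
    if not user_id or len(user_id) > MAX_USER_ID_LENGTH:
        return False
    # ASCII control characters: 0x00-0x1F and DEL (0x7F).
    return not any(ord(ch) < 32 or ord(ch) == 127 for ch in user_id)
```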
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Cache LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS at module level to avoid hot-path secret lookups
- Add clarifying comments for should_include_deployment team isolation logic
- Add negative assertion update_team.assert_not_called() in test
- Add docstring clarification for _get_team_deployments helper pattern
- Add explicit assertion message in test_get_model_list_alias_optimization
Made-with: Cursor
- Use O(1) team index lookup instead of map_team_model in alias guard
- Fix MockPrismaClient to validate where clause filters
- Add comment explaining DB query trade-off for team deployments
Made-with: Cursor
- Skip model_aliases rewrite if model resolves to team deployments
- Add test coverage for sibling-preservation branch
- Update MockPrismaClient to support sibling deployment scenarios
Made-with: Cursor
- Add clarifying comments to test assertions
- Query prisma DB instead of in-memory router to avoid stale state
- Prevents incorrect deletion of old public name when siblings exist
Made-with: Cursor
Remove team model_alias rewrites and resolve team deployments by team_public_model_name with team_id so sibling deployments stay in the routing candidate pool, with explicit logs showing candidate selection before load balancing.
Made-with: Cursor
Use a deterministic internal model_name for team-scoped deployments so sibling deployments with the same public model share a routing group. This makes team alias writes idempotent and preserves multi-deployment failover/load balancing behavior.
Made-with: Cursor
Instead of returning a 400 error when return_to is passed without
control_plane_url configured, silently ignore it and proceed with
the normal same-origin SSO flow.
Two independent bugs prevented post-call OpenAI Moderation guardrail
results from reaching downstream logging callbacks (Langfuse, Datadog).
Bug 1: process_output_response() created a throwaway request_data dict,
so guardrail info written by @log_guardrail_information was discarded.
Fixed by threading the real request_data from the unified guardrail
dispatcher through all 13 BaseTranslation handlers, with litellm_metadata
injection preserved for third-party guardrails (Zscaler, Prompt Security).
Also extended to process_output_streaming_response for consistency.
Bug 2: The @log_guardrail_information decorator collapsed the full
moderation API response (categories, scores, flagged status) to "allow".
Fixed by overriding _process_response/_process_error on
OpenAIModerationGuardrail to stash and log the full response, following
the established Model Armor pattern.
* feat(redis): add circuit breaker to RedisCache to fast-fail when Redis is down (#24181)
* feat(redis): add circuit breaker env var constants
* feat(redis): add RedisCircuitBreaker and apply guard decorator to all async ops
* fix(dual_cache): fall back to L1 instead of re-raising on Redis increment failures
* test(caching): add circuit breaker unit tests
* fix(redis): fast-fail concurrent HALF_OPEN probes — only one probe at a time
* fix(dual_cache): return None fallback when in_memory_cache is absent and Redis fails
* test(caching): add regression tests for HALF_OPEN concurrency and None fallback
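For readers unfamiliar with the pattern, the circuit breaker behavior listed above (fast-fail while open, a single probe in HALF_OPEN) might look roughly like the sketch below. `CircuitBreakerSketch`, its method names, and its defaults are hypothetical; this is not the RedisCircuitBreaker implementation.

```python
import time
from typing import Callable


class CircuitBreakerSketch:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self.state = self.CLOSED
        self._failures = 0
        self._opened_at = 0.0
        self._probe_in_flight = False

    def allow_request(self) -> bool:
        if self.state == self.CLOSED:
            return True
        if self.state == self.OPEN:
            if self._clock() - self._opened_at < self.recovery_timeout:
                return False  # fast-fail while the backend is presumed down
            self.state = self.HALF_OPEN
            self._probe_in_flight = False
        # HALF_OPEN: admit exactly one probe; reject concurrent callers.
        if self._probe_in_flight:
            return False
        self._probe_in_flight = True
        return True

    def record_success(self) -> None:
        self.state = self.CLOSED
        self._failures = 0
        self._probe_in_flight = False

    def record_failure(self) -> None:
        self._failures += 1
        self._probe_in_flight = False
        # A failed probe, or too many failures, (re)opens the circuit.
        if self.state == self.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = self.OPEN
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before each Redis operation, then report the outcome with `record_success()` or `record_failure()`; when the check fails, the caller can fall back to the in-memory layer instead of waiting on a timeout.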
* Fix blocking sync next in __anext__ (#24177)
* Fix blocking sync next
* Update tests/test_litellm/litellm_core_utils/test_streaming_handler.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix PEP 479 regression in __anext__ sync iterator exhaustion
asyncio.to_thread re-raises thread exceptions inside a coroutine, where
PEP 479 converts StopIteration to RuntimeError before any except clause
can catch it. Add _next_sync_or_exhausted() module-level helper that
catches StopIteration in the thread and returns a sentinel instead, then
raise StopAsyncIteration in the coroutine.
Also rewrites the non-blocking test to use asyncio.gather() instead of
asyncio.create_task() (which returned None on Python 3.9 / pytest-asyncio
in CI), and adds an exhaustion regression test that drains the wrapper
fully and asserts no RuntimeError leaks out.
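The sentinel approach described above can be sketched as follows. The names `_next_or_sentinel` and `AsyncWrapperSketch` are illustrative and not the actual helper or wrapper class; the sketch only shows StopIteration being caught inside the worker thread so it never crosses into the coroutine.

```python
import asyncio
from typing import Any, Iterable, Iterator, List

_EXHAUSTED = object()  # unique marker meaning "iterator is done"


def _next_or_sentinel(it: Iterator[Any]) -> Any:
    """Runs in the worker thread; never lets StopIteration escape."""
    try:
        return next(it)
    except StopIteration:
        return _EXHAUSTED


class AsyncWrapperSketch:
    """Wraps a blocking sync iterator for use with `async for`."""

    def __init__(self, sync_iterable: Iterable[Any]) -> None:
        self._it = iter(sync_iterable)

    def __aiter__(self) -> "AsyncWrapperSketch":
        return self

    async def __anext__(self) -> Any:
        # next() may block, so run it off the event loop (Python 3.9+).
        item = await asyncio.to_thread(_next_or_sentinel, self._it)
        if item is _EXHAUSTED:
            raise StopAsyncIteration
        return item


async def drain(items: Iterable[Any]) -> List[Any]:
    return [x async for x in AsyncWrapperSketch(items)]
```

Draining the wrapper to exhaustion, as the regression test does, should end the `async for` loop cleanly with no RuntimeError.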
---------
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: add git-subdir source type to claude-code/plugins API (#24223)
Support a third plugin source type `git-subdir` alongside the existing
`github` and `url` types, as documented in the official Claude Code
plugin marketplaces spec.
New format: {"source": "git-subdir", "url": "...", "path": "subdir/path"}
- Validates url and path fields are present and non-empty
- Rejects absolute paths, '..' segments, backslashes, and percent-encoded
  traversal sequences (including double-encoded variants via regex check)
- Extracts path validation into _validate_git_subdir_path() helper
- Updates Pydantic field description to document all three source types
- Adds isValidUrl() check for url/git-subdir source types in the UI form
- Adds "Git Subdir" option to the UI form with a required Path field
- Adds unit tests covering success, update, missing/empty fields,
  path traversal variants, and unknown source type
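The path rules listed above could be checked along these lines. This is a sketch under assumptions: `validate_subdir_path_sketch` and the regex are hypothetical and may be stricter or looser than the real _validate_git_subdir_path() helper.

```python
import re

# Matches %2e, %2f, %5c and their multiply-encoded forms (%252e, %25252f, ...).
_ENCODED_TRAVERSAL = re.compile(r"%(?:25)*(?:2e|2f|5c)", re.IGNORECASE)


def validate_subdir_path_sketch(path: str) -> str:
    """Return the path unchanged if it looks safe, else raise ValueError."""
    if not path or not path.strip():
        raise ValueError("path must be non-empty")
    if "\\" in path:
        raise ValueError("backslashes are not allowed")
    if path.startswith("/"):
        raise ValueError("absolute paths are not allowed")
    if _ENCODED_TRAVERSAL.search(path):
        raise ValueError("percent-encoded traversal sequences are not allowed")
    if any(segment == ".." for segment in path.split("/")):
        raise ValueError("'..' segments are not allowed")
    return path
```

For example, "plugins/my-plugin" passes, while "a/../b", "/etc", and "%252e%252e/x" are rejected.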
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* [FEAT] add extract_header and extract_footer to Mistral OCR supported params (#24213)
* docs: add git-subdir source type to claude-code plugin marketplace docs (#24289)
* fix(ui): swap J/K keyboard navigation in log details drawer (#24279) (#24286)
J should navigate down (next) and K should navigate up (previous),
matching vim/standard conventions.
* fix: use async_set_cache in user_api_key_auth hot path (#24302)
* fix: use async_set_cache in auth hot path to avoid blocking event loop
* test: assert no blocking set_cache call in _user_api_key_auth_builder
* test: broaden blocking call check to all sync DualCache methods
* test: fix regression test to actually catch blocking cache calls
* fix: ruff lint unused variable + UI build MessageManager error
- litellm/caching/redis_cache.py: remove unused variable 'e' in circuit
  breaker exception handler (F841)
- add_plugin_form.tsx: use MessageManager.error() instead of undefined
  message.error() for git URL validation
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* docs: add REDIS_CIRCUIT_BREAKER env vars to config_settings reference
Add REDIS_CIRCUIT_BREAKER_FAILURE_THRESHOLD and
REDIS_CIRCUIT_BREAKER_RECOVERY_TIMEOUT to the environment variables
reference table so test_env_keys.py passes.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Vincenzo Barrea <manamana88@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Robert Kirscht <rkirscht242@gmail.com>
Co-authored-by: Imgyu Kim <kimimgo@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
- Remove incorrect supports_prompt_caching from gpt-4-0314 (predates the feature)
- Make data-URL detection case-insensitive in Gemini tool call result conversion
- Mock show_banner/generate_feedback_box in max_budget tests to prevent real I/O
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>