Revert path fixes for documentation tests that CircleCI never ran
(test_exception_types, test_general_setting_keys, test_readme_providers,
test_standard_logging_payload). Update the GHA workflow to run only the
4 tests CircleCI actually executed: test_env_keys, test_router_settings,
test_api_docs, test_circular_imports.
Add 2 missing router_settings keys (enable_health_check_routing,
health_check_staleness_threshold) and 27 missing general_settings keys
to config_settings.md so test_router_settings passes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.
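A minimal sketch of the path fix, assuming the tests locate documentation files relative to the test file instead of the working directory. The helper name `path_from_anchor` and the directory layout in the usage note are hypothetical:

```python
from pathlib import Path


def path_from_anchor(anchor_file: str, levels_up: int, *parts: str) -> Path:
    """Build a path relative to the directory containing anchor_file.

    Walking up from the file's own location makes the result independent
    of the process working directory.
    """
    root = Path(anchor_file).resolve().parent
    for _ in range(levels_up):
        root = root.parent
    return root.joinpath(*parts)
```

In a test module this would be called as `path_from_anchor(__file__, 2, "docs", "config_settings.md")`, where the number of levels depends on where the test lives in the repository.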
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.
process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
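The role lookup described above can be sketched as follows. This is an illustration only: `map_role_from_payload`, `resolve_sso_role`, and the claim name `roles` are hypothetical stand-ins, and the real functions may take different arguments.

```python
from typing import Any, Dict, Optional

# Role the source says users fall back to when no mapping applies.
FALLBACK_ROLE = "internal_user_viewer"


def map_role_from_payload(
    payload: Optional[Dict[str, Any]],
    claim: str,
    role_map: Dict[str, str],
) -> Optional[str]:
    """Return the first mapped role found in the decoded JWT payload, or None."""
    if not payload:
        # This is the pre-fix situation: the payload never reached role mapping.
        return None
    claimed = payload.get(claim)
    if isinstance(claimed, str):
        claimed = [claimed]
    if not isinstance(claimed, list):
        return None
    for value in claimed:
        if value in role_map:
            return role_map[value]
    return None


def resolve_sso_role(
    payload: Optional[Dict[str, Any]],
    claim: str,
    role_map: Dict[str, str],
) -> str:
    """Apply the role map, falling back to the viewer role when nothing matches."""
    return map_role_from_payload(payload, claim, role_map) or FALLBACK_ROLE
```

Passing the decoded payload through explicitly is what lets the mapping see custom claims even after the bearer token itself has been stripped from the OAuth response.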
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.
Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in the cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB
Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.
When Redis is not configured, spend tracking falls back to in-memory-only
counters (same as current single-instance behavior).
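The read and write order described above can be sketched with plain dictionaries. This is not the actual implementation: the `SpendCounters` class is hypothetical, a shared dict stands in for Redis, and the real helpers are async and rely on an atomic Redis increment.

```python
from typing import Dict, Optional


class SpendCounters:
    """Per-pod counters with an optional shared store standing in for Redis."""

    def __init__(self, shared: Optional[Dict[str, float]] = None) -> None:
        self.local: Dict[str, float] = {}
        self.shared = shared  # None models "Redis not configured"

    def increment_spend(self, key: str, cost: float) -> None:
        # Update the pod-local counter and, if present, the shared counter.
        self.local[key] = self.local.get(key, 0.0) + cost
        if self.shared is not None:
            self.shared[key] = self.shared.get(key, 0.0) + cost

    def get_current_spend(self, key: str, db_spend: float = 0.0) -> float:
        # Shared store first, then local counter, then the DB-backed value.
        if self.shared is not None and key in self.shared:
            return self.shared[key]
        if key in self.local:
            return self.local[key]
        return db_spend


def is_over_budget(counters: SpendCounters, key: str, max_budget: float) -> bool:
    return counters.get_current_spend(key) >= max_budget
```

With two pods sharing one store, two 6.0 charges against a 10.0 budget are seen as 12.0 by both pods. With no shared store, each pod sees only its own 6.0, which is the N × max_budget problem described above.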
Fixes #23714
Background health checks now feed deployment health state into the
router candidate-filtering pipeline. Unhealthy deployments are excluded
proactively instead of waiting for request failures to trigger cooldown.
Gated by `enable_health_check_routing: true` in general_settings.
Off by default, so existing users see no behavior change.
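A rough sketch of proactive candidate filtering, under several assumptions not stated in the source: health state is stored as `(is_healthy, checked_at)` pairs, results older than a staleness threshold are ignored, unknown deployments are kept, and the full list is returned if filtering would leave no candidates. The function name `filter_by_health` is hypothetical.

```python
import time
from typing import Dict, List, Optional, Tuple


def filter_by_health(
    deployment_ids: List[str],
    health_state: Dict[str, Tuple[bool, float]],
    staleness_threshold: float,
    enabled: bool = False,
    now: Optional[float] = None,
) -> List[str]:
    """Drop deployments whose latest fresh health check reported unhealthy."""
    if not enabled:
        # Feature is off by default: candidates pass through untouched.
        return list(deployment_ids)
    now = time.time() if now is None else now
    kept = []
    for deployment_id in deployment_ids:
        state = health_state.get(deployment_id)
        if state is None:
            kept.append(deployment_id)  # never checked: assume usable
            continue
        is_healthy, checked_at = state
        if now - checked_at > staleness_threshold or is_healthy:
            kept.append(deployment_id)  # stale result is ignored, healthy is kept
    # Assumption: never return an empty pool because of health data alone.
    return kept or list(deployment_ids)
```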
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs)
- Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring)
- Remove dead code in _is_valid_user_id (space exception was unreachable)
- Update tests accordingly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config.
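The two checks in this commit might look roughly like the sketch below. The length limit of 256, the helper names, and the choice to silently skip blocked keys are assumptions; only the three key names come from the message above.

```python
from typing import Dict

MAX_USER_ID_LENGTH = 256  # hypothetical limit; the real value is not stated

# Keys named in the commit message; the real blocklist may be longer.
BLOCKED_ENV_KEYS = frozenset({"PATH", "LD_PRELOAD", "PYTHONPATH"})


def is_valid_user_id(user_id: object) -> bool:
    """Reject non-strings, over-long values, and ASCII control characters."""
    if not isinstance(user_id, str) or len(user_id) > MAX_USER_ID_LENGTH:
        return False
    return not any(ord(ch) < 32 or ord(ch) == 127 for ch in user_id)


def filter_environment_variables(env_vars: Dict[str, str]) -> Dict[str, str]:
    """Return only the config-supplied variables that are safe to export."""
    return {
        key: value
        for key, value in env_vars.items()
        if key not in BLOCKED_ENV_KEYS  # exact, case-sensitive match
    }
```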
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wildcard openrouter/* deployments pass custom_llm_provider=openrouter with
the full openrouter/provider/model id; OpenRouter expects provider/model.
Strip the outer openrouter/ only when the remainder contains a slash so
native ids like openrouter/auto stay intact.
Adds regression test for proxy wildcard path.
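The stripping rule can be illustrated with a small standalone function. The name `normalize_openrouter_model` and the example model ids are hypothetical; the actual change lives in the provider-resolution path.

```python
def normalize_openrouter_model(model: str) -> str:
    """Strip the outer 'openrouter/' only when a provider/model id remains."""
    prefix = "openrouter/"
    if model.startswith(prefix):
        remainder = model[len(prefix):]
        if "/" in remainder:
            # e.g. openrouter/provider/model -> provider/model
            return remainder
    # Native ids such as openrouter/auto, or ids without the prefix, pass through.
    return model
```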
Made-with: Cursor
Reset _ENABLE_TEAM_STALE_ALIAS_BYPASS to None in both test functions
to ensure test isolation and prevent ordering-dependent failures.
Made-with: Cursor
- Cache LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS at module level to avoid hot-path secret lookups
- Add clarifying comments for should_include_deployment team isolation logic
- Add negative assertion update_team.assert_not_called() in test
- Add docstring clarification for _get_team_deployments helper pattern
- Add explicit assertion message in test_get_model_list_alias_optimization
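A sketch of the module-level caching pattern, assuming the flag is resolved lazily on first use. `os.getenv` stands in for the secret lookup, and the accessor and reset helper names are hypothetical:

```python
import os
from typing import Optional

# None means "not resolved yet"; tests reset it to None for isolation.
_ENABLE_TEAM_STALE_ALIAS_BYPASS: Optional[bool] = None


def is_team_stale_alias_bypass_enabled() -> bool:
    """Resolve the flag once, then serve it from the module-level cache."""
    global _ENABLE_TEAM_STALE_ALIAS_BYPASS
    if _ENABLE_TEAM_STALE_ALIAS_BYPASS is None:
        raw = os.getenv("LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS", "")
        _ENABLE_TEAM_STALE_ALIAS_BYPASS = raw.strip().lower() == "true"
    return _ENABLE_TEAM_STALE_ALIAS_BYPASS


def reset_bypass_cache() -> None:
    """Clear the cached value so the next call re-reads the environment."""
    global _ENABLE_TEAM_STALE_ALIAS_BYPASS
    _ENABLE_TEAM_STALE_ALIAS_BYPASS = None
```

Because the value persists for the life of the process, any test that changes the environment variable has to clear the cache first, otherwise results depend on test ordering.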
Made-with: Cursor
- Add deduplication guard in _update_team_model_index to prevent duplicate indices
- Add wildcard comment in map_team_model for clarity
- Add monkeypatch to test_team_alias_stale_bypass_disabled_by_default for determinism
- Extract _get_team_deployments helper to centralize DB access pattern
- Add clarifying comments for team_public_model_name assignment ordering
Made-with: Cursor
- Use O(1) team index lookup instead of map_team_model in alias guard
- Fix MockPrismaClient to validate where clause filters
- Add comment explaining DB query trade-off for team deployments
Made-with: Cursor
- Skip model_aliases rewrite if model resolves to team deployments
- Add test coverage for sibling-preservation branch
- Update MockPrismaClient to support sibling deployment scenarios
Made-with: Cursor
- Add clarifying comments to test assertions
- Query prisma DB instead of in-memory router to avoid stale state
- Prevents incorrect deletion of old public name when siblings exist
Made-with: Cursor
Remove team model_alias rewrites and resolve team deployments by team_public_model_name with team_id so sibling deployments stay in the routing candidate pool, with explicit logs showing candidate selection before load balancing.
Made-with: Cursor
Use a deterministic internal model_name for team-scoped deployments so sibling deployments with the same public model share a routing group. This makes team alias writes idempotent and preserves multi-deployment failover/load balancing behavior.
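One way to derive such a name is to hash the team id together with the public model name. The sketch below is illustrative only: the `team_<id>_<digest>` format and the function name are hypothetical, not the scheme this commit uses.

```python
import hashlib


def team_internal_model_name(team_id: str, public_model_name: str) -> str:
    """Derive a stable internal name from (team_id, public_model_name)."""
    digest = hashlib.sha256(
        f"{team_id}:{public_model_name}".encode("utf-8")
    ).hexdigest()[:16]
    return f"team_{team_id}_{digest}"
```

Because the name depends only on its inputs, every sibling deployment for the same team and public model lands in the same routing group, and repeating the alias write produces the same value.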
Made-with: Cursor
- Add cancel/retrieve overrides in AzureOpenAIFineTuningAPI to normalize responses
- Expand _AZURE_STATUS_MAP to handle all known Azure statuses
- Add "pending" to OpenAIFileObject.status allowed values
- Fix async test mock to return awaitable LiteLLMFineTuningJob
- Add test_openai_file_object_accepts_pending_status
Made-with: Cursor