8047 Commits

Author SHA1 Message Date
Yuneng Jiang 7851567091 [Fix] Scope documentation workflow to match CircleCI and add missing router settings
Revert path fixes for documentation tests that CircleCI never ran
(test_exception_types, test_general_setting_keys, test_readme_providers,
test_standard_logging_payload). Update the GHA workflow to run only the
4 tests CircleCI actually executed: test_env_keys, test_router_settings,
test_api_docs, test_circular_imports.

Add 2 missing router_settings keys (enable_health_check_routing,
health_check_staleness_threshold) and 27 missing general_settings keys
to config_settings.md so test_router_settings passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:23:53 -07:00
Yuneng Jiang 7100ed5d0a [Fix] Test isolation for agent health checks and documentation test path resolution
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:00:22 -07:00
ryan-crabbe-berri 2eb3c20e76 Merge pull request #24718 from BerriAI/litellm_ryan-march-26
litellm ryan march 26
2026-03-28 09:01:11 -07:00
ryan-crabbe-berri 726a34627c Merge pull request #24717 from BerriAI/litellm_fix-user-cache-invalidation
fix(jwt): invalidate user cache after role/team sync updates
2026-03-27 19:50:41 -07:00
Ryan Crabbe dd11e77852 fix: add explicit TTL to cache writes and test coverage for user cache invalidation
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
2026-03-27 19:45:13 -07:00
ryan-crabbe-berri 5b651048f2 Merge pull request #24706 from BerriAI/litellm_fix-jwt-none-guard
fix(auth): guard JWTHandler.is_jwt() against None token
2026-03-27 18:06:24 -07:00
yuneng-jiang fe080a86b2 Merge pull request #24705 from BerriAI/litellm_auto_schema_sync
[Infra] Automated schema.prisma sync and drift detection
2026-03-27 17:08:23 -07:00
yuneng-jiang 846e4b44b6 Merge pull request #24682 from michelligabriele/fix/budget-spend-counters
fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
2026-03-27 16:59:23 -07:00
Ryan Crabbe 8e3755931d test(auth): add regression tests for JWTHandler.is_jwt(None)
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
2026-03-27 16:51:08 -07:00
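The None guard that these regression tests cover can be sketched roughly as below. This is a minimal illustration, assuming is_jwt() recognizes a JWT by its three dot-separated segments; the function here is a standalone stand-in, not the actual JWTHandler method.

```python
from typing import Optional


def is_jwt(token: Optional[str]) -> bool:
    """Return True if the token looks like a JWT (header.payload.signature).

    Assumption: the three-segment check is illustrative; the real
    JWTHandler.is_jwt() may apply a different test.
    """
    if token is None:
        # Without this guard, calling .split() on None raises AttributeError.
        return False
    return len(token.split(".")) == 3
```

With the guard in place, a missing token is reported as "not a JWT" instead of raising, so callers can fall through to other auth paths.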
Yuneng Jiang a074d1d68b [Infra] Mirror litellm_table_patch source changes (no binaries)
Cherry-pick source-only changes from litellm_table_patch, excluding
build artifacts from the incident response period.

- Remove destructive DROP COLUMN migration (20260311180521_schema_sync)
- Remove now-unnecessary restore migration (20260327232350)
- Bump litellm-proxy-extras 0.4.60 → 0.4.61
- Add regression test to block future DROP COLUMN migrations
- Fix double error handling in getTeamPermissionsCall

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:45:12 -07:00
Ryan Crabbe e24819afef fix(sso): pass decoded JWT access token to role mapping during SSO login
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.

process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
2026-03-27 13:50:30 -07:00
michelligabriele d533b432fd fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.

Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB

Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.

When Redis is not configured, falls back to in-memory-only counters
(same as current single-instance behavior).

Fixes #23714
2026-03-27 20:39:52 +01:00
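The read order described in this commit (Redis first, then in-memory, then the DB-backed cached value) can be sketched as below. This is a simplified, synchronous illustration: the SpendCounters class and its method names are hypothetical, plain dicts stand in for Redis and the per-pod cache, and a real Redis increment would be atomic where the dict update here is not.

```python
from typing import Dict, Optional


class SpendCounters:
    """Hypothetical stand-in for the spend_counter_cache described above."""

    def __init__(self, redis: Optional[Dict[str, float]] = None) -> None:
        # Shared across pods; None models "Redis is not configured".
        self.redis = redis
        # Local to this pod only.
        self.in_memory: Dict[str, float] = {}

    def increment_spend(self, key: str, cost: float) -> None:
        # Update both layers before returning, so the next budget check
        # already sees this request's cost.
        self.in_memory[key] = self.in_memory.get(key, 0.0) + cost
        if self.redis is not None:
            self.redis[key] = self.redis.get(key, 0.0) + cost

    def get_current_spend(self, key: str, db_spend: float = 0.0) -> float:
        # 1. Redis is the cross-pod authoritative value.
        if self.redis is not None and key in self.redis:
            return self.redis[key]
        # 2. Fall back to this pod's own counter.
        if key in self.in_memory:
            return self.in_memory[key]
        # 3. Fall back to the spend stored on the cached DB object.
        return db_spend


# Two "pods" sharing one Redis: each sees the combined spend.
shared_redis: Dict[str, float] = {}
pod_a = SpendCounters(shared_redis)
pod_b = SpendCounters(shared_redis)
pod_a.increment_spend("key:abc", 6.0)
pod_b.increment_spend("key:abc", 6.0)
combined = pod_a.get_current_spend("key:abc")
```

In this sketch each pod alone recorded 6.0, but both read 12.0 through the shared counter, which is the behavior needed for a budget check to hold across N pods.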
yuneng-jiang 1b111d23f3 Merge pull request #24688 from Sameerlite/litellm_litellm_team-model-group-name-routing-fix
fix(team-routing): preserve sibling deployment candidates for team public models
2026-03-27 12:00:34 -07:00
yuneng-jiang 8c2c6a40a6 Merge pull request #24689 from Sameerlite/litellm_litellm_remove-200k-pricing-opus-sonnet-46
fix(pricing): remove above_200k_tokens price tiers for claude-opus-4-6 and claude-sonnet-4-6
2026-03-27 10:26:17 -07:00
Sameer Kankute b4d0e3213f Fix the Pricing changes for claude models 2026-03-27 22:44:40 +05:30
yuneng-jiang 2ac1efdc0d Merge pull request #24603 from Sameerlite/litellm_openrouter-wildcard-strip-prefix
fix(openrouter): strip routing prefix for wildcard proxy deployments
2026-03-27 10:11:01 -07:00
yuneng-jiang b6506bf40f Merge pull request #24610 from Sameerlite/litellm_lyria-3-cost-map-doc
feat(gemini): Lyria 3 preview models in cost map and docs
2026-03-27 10:10:03 -07:00
yuneng-jiang 4bf5e66dbf Merge pull request #24624 from Sameerlite/litellm_sanitize-proxy-inputs
fix(proxy): sanitize user_id input and block dangerous env var keys
2026-03-27 10:08:32 -07:00
yuneng-jiang 53ac4c5459 Merge pull request #24661 from Sameerlite/litellm_filter-metadata-user-id
fix(anthropic): strip undocumented keys from metadata before sending to API
2026-03-27 10:00:36 -07:00
yuneng-jiang 695304d758 Merge pull request #24662 from Sameerlite/litellm_gemini-retrieve-file-url-normalize
feat(gemini): normalize AI Studio file retrieve URL
2026-03-27 09:59:46 -07:00
yuneng-jiang 210e30f055 Merge pull request #24665 from Sameerlite/litellm_gemini-3-1-flash-live-preview
feat(gemini): add gemini-3.1-flash-live-preview to model cost map
2026-03-27 09:58:44 -07:00
yuneng-jiang f3fe6d1c0a Merge pull request #24687 from Sameerlite/litellm_litellm_azure-finetuning-fixes
feat(fine-tuning): fix Azure OpenAI fine-tuning job creation
2026-03-27 09:58:12 -07:00
Sameer Kankute 931c88f567 Fix test 2026-03-27 21:21:43 +05:30
Sameer Kankute 09675ef205 Fix test 2026-03-27 21:12:37 +05:30
Sameer Kankute 7675488640 feat(router): add health-check-driven routing behind opt-in flag
Background health checks now feed deployment health state into the
router candidate-filtering pipeline. Unhealthy deployments are excluded
proactively instead of waiting for request failures to trigger cooldown.

Gated by `enable_health_check_routing: true` in general_settings.
Off by default — zero behavior change for existing users.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:57:08 +05:30
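As a rough illustration of the candidate filtering described in this commit, the sketch below drops deployments that a background health check has marked unhealthy, but only when the opt-in flag is set. The function name, the shape of a deployment, and the unhealthy_ids set are assumptions made for the example; the commit does not say what happens when every deployment is unhealthy, so the sketch simply returns what is left.

```python
from typing import Dict, List, Set


def filter_healthy_deployments(
    deployments: List[Dict[str, str]],
    unhealthy_ids: Set[str],
    enable_health_check_routing: bool = False,
) -> List[Dict[str, str]]:
    """Hypothetical helper: exclude unhealthy deployments from routing candidates."""
    if not enable_health_check_routing:
        # Flag off (the default): behavior is unchanged.
        return deployments
    return [d for d in deployments if d["id"] not in unhealthy_ids]


candidates = [{"id": "d1"}, {"id": "d2"}, {"id": "d3"}]
unhealthy = {"d2"}
default_result = filter_healthy_deployments(candidates, unhealthy)
opt_in_result = filter_healthy_deployments(
    candidates, unhealthy, enable_health_check_routing=True
)
```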
Sameer Kankute b212b340ab feat(gemini): normalize AI Studio file retrieve URL and harden tests
Made-with: Cursor
2026-03-27 20:43:19 +05:30
Sameer Kankute 38e8003297 fix(anthropic): strip undocumented keys from metadata before sending to API 2026-03-27 20:42:16 +05:30
Sameer Kankute 92a07e2d6e fix(proxy): address Greptile review feedback
- Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs)
- Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring)
- Remove dead code in _is_valid_user_id (space exception was unreachable)
- Update tests accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
Sameer Kankute 8112fbf274 fix(proxy): sanitize user_id input and block dangerous env var keys
Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
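The two checks in this commit can be sketched as below. This is a minimal illustration under stated assumptions: the length limit of 256 is invented for the example (the commit does not give the value), the blocklist contains only the three keys named in the message, and silently dropping blocked keys is an assumed policy; the real code may log or reject instead.

```python
from typing import Dict

MAX_USER_ID_LENGTH = 256  # assumed limit; the commit does not state the value

# Only the keys named in the commit message; the real blocklist may be longer.
BLOCKED_ENV_KEYS = {"PATH", "LD_PRELOAD", "PYTHONPATH"}


def is_valid_user_id(user_id: str) -> bool:
    """Reject over-long ids and ids containing ASCII control characters."""
    if len(user_id) > MAX_USER_ID_LENGTH:
        return False
    # Control characters are 0x00-0x1F plus 0x7F (DEL).
    return not any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in user_id)


def filter_environment_variables(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env without the blocked keys."""
    return {k: v for k, v in env.items() if k not in BLOCKED_ENV_KEYS}
```

For example, a config that sets both `LD_PRELOAD` and an API key would, in this sketch, have only the API key applied to the process environment.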
Sameer Kankute cdc1dd5c37 Fix the tests 2026-03-27 20:36:01 +05:30
Sameer Kankute cc73ae776a feat(gemini): add Lyria 3 preview models to cost map and docs
Made-with: Cursor
2026-03-27 20:36:00 +05:30
Sameer Kankute 9d7fc307b8 fix(openrouter): strip LiteLLM prefix when proxy sets custom_llm_provider
Wildcard openrouter/* deployments pass custom_llm_provider=openrouter with
the full openrouter/provider/model id; OpenRouter expects provider/model.
Strip the outer openrouter/ only when the remainder contains a slash so
native ids like openrouter/auto stay intact.

Adds regression test for proxy wildcard path.

Made-with: Cursor
2026-03-27 20:35:17 +05:30
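The stripping rule in this commit is small enough to sketch directly. The function name below is hypothetical; the logic follows the message: remove the outer openrouter/ only when what remains still contains a slash.

```python
def strip_openrouter_prefix(model: str) -> str:
    """Hypothetical helper mirroring the rule described in the commit message."""
    prefix = "openrouter/"
    if model.startswith(prefix):
        remainder = model[len(prefix):]
        # "provider/model" remains: the outer prefix was added by the proxy.
        if "/" in remainder:
            return remainder
    # Native ids such as "openrouter/auto" (or non-openrouter ids) stay intact.
    return model
```

The slash test is what distinguishes a proxy-added prefix (openrouter/provider/model) from a native OpenRouter id (openrouter/auto).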
Sameer Kankute 00a810e92d feat(openai): round-trip Responses API reasoning_items in chat completions
Made-with: Cursor
2026-03-27 20:25:08 +05:30
yuneng-jiang d3568efad0 Merge pull request #24611 from Sameerlite/Sameerlite/order-fallback2
feat(router): order-based fallback across deployment priority levels
2026-03-27 20:15:30 +05:30
Sameer Kankute 1fac58abb3 fix(tests): reset module-level cache in stale alias bypass tests
Reset _ENABLE_TEAM_STALE_ALIAS_BYPASS to None in both test functions
to ensure test isolation and prevent ordering-dependent failures

Made-with: Cursor
2026-03-27 20:11:28 +05:30
Sameer Kankute 2321d77599 fix(router): address remaining Greptile review comments
- Cache LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS at module level to avoid hot-path secret lookups
- Add clarifying comments for should_include_deployment team isolation logic
- Add negative assertion for update_team.assert_not_called() in test
- Add docstring clarification for _get_team_deployments helper pattern
- Add explicit assertion message in test_get_model_list_alias_optimization

Made-with: Cursor
2026-03-27 20:11:28 +05:30
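The module-level caching in the first bullet, and the reason tests then need to reset it, can be sketched as below. This is an illustration only: os.environ stands in for whatever secret lookup the real code uses, the accessor and reset helper names are hypothetical, and treating the string "true" as enabled is an assumption.

```python
import os
from typing import Optional

# None means "not looked up yet"; True/False is the cached answer.
_ENABLE_TEAM_STALE_ALIAS_BYPASS: Optional[bool] = None


def is_stale_alias_bypass_enabled() -> bool:
    """Read the flag once, then serve it from the module-level cache."""
    global _ENABLE_TEAM_STALE_ALIAS_BYPASS
    if _ENABLE_TEAM_STALE_ALIAS_BYPASS is None:
        raw = os.environ.get("LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS", "")
        _ENABLE_TEAM_STALE_ALIAS_BYPASS = raw.strip().lower() == "true"
    return _ENABLE_TEAM_STALE_ALIAS_BYPASS


def reset_stale_alias_bypass_cache() -> None:
    """Hypothetical test helper: force the next call to re-read the flag."""
    global _ENABLE_TEAM_STALE_ALIAS_BYPASS
    _ENABLE_TEAM_STALE_ALIAS_BYPASS = None
```

Because the cached value outlives any single test, a test that changes the environment variable without resetting the cache sees whatever an earlier test left behind, which is the ordering-dependent failure the test-isolation commit addresses.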
Sameer Kankute 592ac98ddc fix(router): address Greptile P1/P2 review comments
- Add deduplication guard in _update_team_model_index to prevent duplicate indices
- Add wildcard comment in map_team_model for clarity
- Add monkeypatch to test_team_alias_stale_bypass_disabled_by_default for determinism
- Extract _get_team_deployments helper to centralize DB access pattern
- Add clarifying comments for team_public_model_name assignment ordering

Made-with: Cursor
2026-03-27 20:11:28 +05:30
Sameer Kankute 1a0b30aaac Fix greptile reviews and mock test 2026-03-27 20:11:28 +05:30
Sameer Kankute c6cc0341f6 Fix greptile reviews and mock test 2026-03-27 20:11:28 +05:30
Sameer Kankute 9a0a216195 Fix code qa issues 2026-03-27 20:11:28 +05:30
Sameer Kankute 316a742945 Fix greptile comments 2026-03-27 20:11:28 +05:30
Sameer Kankute 173695f5e0 Fix greptile comments 2026-03-27 20:11:27 +05:30
Sameer Kankute e8fb7762b3 perf(routing): optimize team model checks and improve test coverage
- Use O(1) team index lookup instead of map_team_model in alias guard
- Fix MockPrismaClient to validate where clause filters
- Add comment explaining DB query trade-off for team deployments

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute 8aa58bdcaa fix(routing): prevent stale model_aliases from interfering with team routing
- Skip model_aliases rewrite if model resolves to team deployments
- Add test coverage for sibling-preservation branch
- Update MockPrismaClient to support sibling deployment scenarios

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute f5b7298854 fix(management): query DB directly for sibling deployments on rename
- Add clarifying comments to test assertions
- Query prisma DB instead of in-memory router to avoid stale state
- Prevents incorrect deletion of old public name when siblings exist

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute 248fb8bc90 fix(router): address remaining Greptile P0/P1 issues
- Update map_team_model test to expect public name return
- Only remove old public name if no sibling deployments use it

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute aeb932d707 fix(team-routing): keep team model routing on public names
Remove team model_alias rewrites and resolve team deployments by team_public_model_name with team_id so sibling deployments stay in the routing candidate pool, with explicit logs showing candidate selection before load balancing.

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute 5534b40ab3 fix(team-routing): use deterministic team model group names
Use a deterministic internal model_name for team-scoped deployments so sibling deployments with the same public model share a routing group. This makes team alias writes idempotent and preserves multi-deployment failover/load balancing behavior.

Made-with: Cursor
2026-03-27 20:11:27 +05:30
Sameer Kankute e635cee712 feat(fine-tuning): address greptile review feedback (greploop iteration 5)
- Remove unused FineTuningJob import from test
- Document "canceling" → "cancelled" mapping in _AZURE_STATUS_MAP

Made-with: Cursor
2026-03-27 20:04:41 +05:30
Sameer Kankute 528bac5a27 feat(fine-tuning): address greptile review feedback (greploop iteration 4)
- Add cancel/retrieve overrides in AzureOpenAIFineTuningAPI to normalize responses
- Expand _AZURE_STATUS_MAP to handle all known Azure statuses
- Add "pending" to OpenAIFileObject.status allowed values
- Fix async test mock to return awaitable LiteLLMFineTuningJob
- Add test_openai_file_object_accepts_pending_status

Made-with: Cursor
2026-03-27 20:04:41 +05:30