4316 Commits

Author SHA1 Message Date
Yuneng Jiang 2b374a2abf Merge main and resolve Snowflake test conflict
Main rewrote the same tests we moved. Resolution: keep the tests only
in the unit test directory, adopting main's improved patterns (AsyncMock,
assert_called_once, stronger content assertions on streaming).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 18:06:37 -07:00
Yuneng Jiang 19f8b58046 [Test] Move mocked Snowflake chat completion tests to unit test directory
Move test_chat_completion_snowflake and test_chat_completion_snowflake_stream
from tests/llm_translation/ to tests/test_litellm/llms/snowflake/chat/ so
they run as part of `make test-unit` without requiring API credentials.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 18:03:00 -07:00
Krrish Dholakia 5596728cae Merge pull request #24753 from BerriAI/litellm_dev_03_27_2026_p1
Fix returned model when batch completions is used - return picked model, not comma-separated list
2026-03-30 17:53:48 -07:00
Krrish Dholakia ee4ded4c44 Merge pull request #24445 from quora/fix/missing-content-part-added
fix(responses): emit content_part.added event for non-OpenAI models
2026-03-30 17:52:40 -07:00
Krrish Dholakia 4c00a14ce0 fix: fix ci/cd + handle oidc jwt tokens 2026-03-30 16:12:58 -07:00
Yuneng Jiang 6522d282b5 [Fix] Correct kwarg name in test_user_api_key_auth tests
PR #24755 renamed `azure_api_key_header` to `AZURE_AI_API_KEY_header` in
the test file but did not update the actual function signatures of
`get_api_key()` and `_user_api_key_auth_builder()`, causing TypeError
on all affected test cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 14:51:09 -07:00
Yuneng Jiang 4f03273285 Merge remote-tracking branch 'origin/main' into litellm_/objective-mendel 2026-03-30 13:16:15 -07:00
Yuneng Jiang fc32f91ffd [Fix] Rename test file so router code coverage check detects it
The router_code_coverage.py script only scans test files with "router" in the filename.
test_health_check_routing.py was invisible to this check, causing _async_filter_health_check_unhealthy_deployments
and _filter_health_check_unhealthy_deployments to appear untested.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:13:41 -07:00
Krrish Dholakia a671083596 Merge pull request #24755 from BerriAI/litellm_test_cleanup
Litellm test cleanup
2026-03-28 22:04:26 -07:00
Krrish Dholakia 92db2df2f6 Merge pull request #23794 from ndgigliotti/feat/bedrock-structured-output-cost-json
Bedrock: move native structured output model list to cost JSON, add Sonnet 4.6
2026-03-28 20:04:47 -07:00
Krrish Dholakia bc829d51f2 test: test 2026-03-28 19:17:38 -07:00
Krrish Dholakia 5cd8ca2365 refactor: refactor testing 2026-03-28 18:39:32 -07:00
Krrish Dholakia 44b03e6138 fix: fix azure tests 2026-03-28 18:19:41 -07:00
Krrish Dholakia bc33cf662f Merge pull request #24449 from J-Byron/feat/prometheus-org-budget-metrics
Feat/prometheus org budget metrics
2026-03-28 17:32:17 -07:00
Yuneng Jiang 7100ed5d0a [Fix] Test isolation for agent health checks and documentation test path resolution
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:00:22 -07:00
ryan-crabbe-berri 2eb3c20e76 Merge pull request #24718 from BerriAI/litellm_ryan-march-26
litellm ryan march 26
2026-03-28 09:01:11 -07:00
Krrish Dholakia 32adda8a49 fix: return winning model name instead of comma-separated list for fastest_response
When fastest_response=true with comma-separated models, the response
model field was stamped with the entire comma-separated string. Now
uses the x-litellm-model-group header from the winning response to
return the correct model name.

Made-with: Cursor
2026-03-27 22:34:26 -07:00
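
The behaviour described in 32adda8a49 can be illustrated with a minimal sketch. It is not the code from that commit: the helper name `resolve_response_model` and the fallback to the requested string when the header is absent are assumptions made for illustration. Only the header name `x-litellm-model-group` comes from the commit message.

```python
from typing import Mapping, Optional

# Header name taken from the commit message above.
MODEL_GROUP_HEADER = "x-litellm-model-group"


def resolve_response_model(
    requested_model: str,
    winning_headers: Optional[Mapping[str, str]],
) -> str:
    """Hypothetical helper: choose the model name to report to the caller.

    `requested_model` may be a comma-separated list such as "model-a,model-b".
    `winning_headers` are the headers of whichever response finished first.
    """
    if winning_headers:
        winner = winning_headers.get(MODEL_GROUP_HEADER)
        if winner:
            return winner
    # Assumption: without the header, keep the requested string unchanged.
    return requested_model


if __name__ == "__main__":
    print(resolve_response_model("model-a,model-b", {MODEL_GROUP_HEADER: "model-b"}))
```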
ryan-crabbe-berri 726a34627c Merge pull request #24717 from BerriAI/litellm_fix-user-cache-invalidation
fix(jwt): invalidate user cache after role/team sync updates
2026-03-27 19:50:41 -07:00
Ryan Crabbe dd11e77852 fix: add explicit TTL to cache writes and test coverage for user cache invalidation
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
2026-03-27 19:45:13 -07:00
ryan-crabbe-berri 5b651048f2 Merge pull request #24706 from BerriAI/litellm_fix-jwt-none-guard
fix(auth): guard JWTHandler.is_jwt() against None token
2026-03-27 18:06:24 -07:00
yuneng-jiang 846e4b44b6 Merge pull request #24682 from michelligabriele/fix/budget-spend-counters
fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
2026-03-27 16:59:23 -07:00
Ryan Crabbe 8e3755931d test(auth): add regression tests for JWTHandler.is_jwt(None)
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
2026-03-27 16:51:08 -07:00
Ryan Crabbe e24819afef fix(sso): pass decoded JWT access token to role mapping during SSO login
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.

process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
2026-03-27 13:50:30 -07:00
michelligabriele d533b432fd fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.

Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB

Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.

When Redis is not configured, falls back to in-memory-only counters
(same as current single-instance behavior).

Fixes #23714
2026-03-27 20:39:52 +01:00
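
The read order described in d533b432fd (Redis first, then in-memory, then the cached object's `.spend`) can be sketched as follows. This is a simplified, synchronous illustration rather than the commit's implementation: plain dictionaries stand in for the Redis and in-memory layers, and the name `read_spend_with_fallback` is hypothetical.

```python
from typing import Mapping, Optional


def read_spend_with_fallback(
    counter_key: str,
    redis_counters: Optional[Mapping[str, float]],
    memory_counters: Mapping[str, float],
    cached_object_spend: float,
) -> float:
    """Hypothetical sketch of the three-level fallback.

    `redis_counters` is None when Redis is not configured.
    """
    # 1. Redis is shared by every pod, so it is treated as authoritative.
    if redis_counters is not None and counter_key in redis_counters:
        return redis_counters[counter_key]
    # 2. Otherwise use this pod's own in-memory counter.
    if counter_key in memory_counters:
        return memory_counters[counter_key]
    # 3. Finally fall back to the spend value loaded from the database.
    return cached_object_spend


if __name__ == "__main__":
    # Two pods each saw 6.0 of spend locally; Redis holds the combined 12.0.
    redis = {"spend:key:abc": 12.0}
    local = {"spend:key:abc": 6.0}
    print(read_spend_with_fallback("spend:key:abc", redis, local, 0.0))
```

With a `max_budget` of 10.0, a check against the local value (6.0) would pass on each pod, while a check against the shared value (12.0) would not, which is the N × max_budget problem the commit message describes.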
yuneng-jiang 1b111d23f3 Merge pull request #24688 from Sameerlite/litellm_litellm_team-model-group-name-routing-fix
fix(team-routing): preserve sibling deployment candidates for team public models
2026-03-27 12:00:34 -07:00
yuneng-jiang 8c2c6a40a6 Merge pull request #24689 from Sameerlite/litellm_litellm_remove-200k-pricing-opus-sonnet-46
fix(pricing): remove above_200k_tokens price tiers for claude-opus-4-6 and claude-sonnet-4-6
2026-03-27 10:26:17 -07:00
Sameer Kankute b4d0e3213f Fix the Pricing changes for claude models 2026-03-27 22:44:40 +05:30
yuneng-jiang 2ac1efdc0d Merge pull request #24603 from Sameerlite/litellm_openrouter-wildcard-strip-prefix
fix(openrouter): strip routing prefix for wildcard proxy deployments
2026-03-27 10:11:01 -07:00
yuneng-jiang b6506bf40f Merge pull request #24610 from Sameerlite/litellm_lyria-3-cost-map-doc
feat(gemini): Lyria 3 preview models in cost map and docs
2026-03-27 10:10:03 -07:00
yuneng-jiang 4bf5e66dbf Merge pull request #24624 from Sameerlite/litellm_sanitize-proxy-inputs
fix(proxy): sanitize user_id input and block dangerous env var keys
2026-03-27 10:08:32 -07:00
yuneng-jiang 695304d758 Merge pull request #24662 from Sameerlite/litellm_gemini-retrieve-file-url-normalize
feat(gemini): normalize AI Studio file retrieve URL
2026-03-27 09:59:46 -07:00
yuneng-jiang 210e30f055 Merge pull request #24665 from Sameerlite/litellm_gemini-3-1-flash-live-preview
feat(gemini): add gemini-3.1-flash-live-preview to model cost map
2026-03-27 09:58:44 -07:00
yuneng-jiang f3fe6d1c0a Merge pull request #24687 from Sameerlite/litellm_litellm_azure-finetuning-fixes
feat(fine-tuning): fix Azure OpenAI fine-tuning job creation
2026-03-27 09:58:12 -07:00
Sameer Kankute 931c88f567 Fix test 2026-03-27 21:21:43 +05:30
Sameer Kankute 09675ef205 Fix test 2026-03-27 21:12:37 +05:30
Sameer Kankute 7675488640 feat(router): add health-check-driven routing behind opt-in flag
Background health checks now feed deployment health state into the
router candidate-filtering pipeline. Unhealthy deployments are excluded
proactively instead of waiting for request failures to trigger cooldown.

Gated by `enable_health_check_routing: true` in general_settings.
Off by default — zero behavior change for existing users.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:57:08 +05:30
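
As a rough illustration of the opt-in filtering described in 7675488640, the sketch below drops deployments marked unhealthy only when the flag is on. The function name, the deployment shape (a dict with an `id` key), and the set of unhealthy ids are assumptions for this example. The router's actual data structures are not shown in the commit message.

```python
from typing import Dict, List, Set


def filter_unhealthy(
    deployments: List[Dict[str, str]],
    unhealthy_ids: Set[str],
    enable_health_check_routing: bool = False,
) -> List[Dict[str, str]]:
    """Hypothetical sketch: exclude unhealthy deployments when the flag is set."""
    if not enable_health_check_routing:
        # Flag off (the default): candidates are returned unchanged.
        return list(deployments)
    return [d for d in deployments if d["id"] not in unhealthy_ids]


if __name__ == "__main__":
    candidates = [{"id": "a"}, {"id": "b"}]
    print(filter_unhealthy(candidates, {"b"}, enable_health_check_routing=True))
```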
Sameer Kankute b212b340ab feat(gemini): normalize AI Studio file retrieve URL and harden tests
Made-with: Cursor
2026-03-27 20:43:19 +05:30
Sameer Kankute 92a07e2d6e fix(proxy): address Greptile review feedback
- Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs)
- Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring)
- Remove dead code in _is_valid_user_id (space exception was unreachable)
- Update tests accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
Sameer Kankute 8112fbf274 fix(proxy): sanitize user_id input and block dangerous env var keys
Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
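
The two checks described in 8112fbf274 might look roughly like the sketch below. It is an illustration under stated assumptions, not the commit's code: the length limit of 256, the rejection of empty ids, the exact-match (case-sensitive) key comparison, and both function names are hypothetical. Only the three blocked keys are taken from the commit message.

```python
from typing import Dict

MAX_USER_ID_LENGTH = 256  # assumed limit; the commit message gives no number
# Keys named in the commit message as dangerous to override via config.
BLOCKED_ENV_KEYS = frozenset({"PATH", "LD_PRELOAD", "PYTHONPATH"})


def is_valid_user_id(user_id: str) -> bool:
    """Hypothetical check: bounded length and no ASCII control characters."""
    if not user_id or len(user_id) > MAX_USER_ID_LENGTH:
        return False
    # Reject ASCII control characters (0-31) and DEL (127).
    return not any(ord(ch) < 32 or ord(ch) == 127 for ch in user_id)


def drop_blocked_env_vars(env: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical filter: remove blocked keys before applying config env vars."""
    return {k: v for k, v in env.items() if k not in BLOCKED_ENV_KEYS}


if __name__ == "__main__":
    print(is_valid_user_id("user-123"))
    print(drop_blocked_env_vars({"PATH": "/tmp", "MY_API_BASE": "http://localhost"}))
```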
Sameer Kankute cdc1dd5c37 Fix the tests 2026-03-27 20:36:01 +05:30
Sameer Kankute cc73ae776a feat(gemini): add Lyria 3 preview models to cost map and docs
Made-with: Cursor
2026-03-27 20:36:00 +05:30
Sameer Kankute 9d7fc307b8 fix(openrouter): strip LiteLLM prefix when proxy sets custom_llm_provider
Wildcard openrouter/* deployments pass custom_llm_provider=openrouter with
the full openrouter/provider/model id; OpenRouter expects provider/model.
Strip the outer openrouter/ only when the remainder contains a slash so
native ids like openrouter/auto stay intact.

Adds regression test for proxy wildcard path.

Made-with: Cursor
2026-03-27 20:35:17 +05:30
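
The stripping rule described in 9d7fc307b8 can be sketched in a few lines. The function name is hypothetical and this is not the commit's implementation. It only encodes the rule stated in the message: remove the outer `openrouter/` when what remains still contains a slash.

```python
def strip_outer_openrouter_prefix(model: str) -> str:
    """Hypothetical sketch of the prefix rule from the commit message."""
    prefix = "openrouter/"
    if model.startswith(prefix):
        remainder = model[len(prefix):]
        # Strip only when the remainder still looks like provider/model.
        if "/" in remainder:
            return remainder
    # Native ids such as openrouter/auto, and non-OpenRouter ids, stay intact.
    return model


if __name__ == "__main__":
    print(strip_outer_openrouter_prefix("openrouter/provider/model"))
    print(strip_outer_openrouter_prefix("openrouter/auto"))
```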
Sameer Kankute 00a810e92d feat(openai): round-trip Responses API reasoning_items in chat completions
Made-with: Cursor
2026-03-27 20:25:08 +05:30
yuneng-jiang d3568efad0 Merge pull request #24611 from Sameerlite/Sameerlite/order-fallback2
feat(router): order-based fallback across deployment priority levels
2026-03-27 20:15:30 +05:30
Sameer Kankute 2321d77599 fix(router): address remaining Greptile review comments
- Cache LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS at module level to avoid hot-path secret lookups
- Add clarifying comments for should_include_deployment team isolation logic
- Add negative assertion for update_team.assert_not_called() in test
- Add docstring clarification for _get_team_deployments helper pattern
- Add explicit assertion message in test_get_model_list_alias_optimization

Made-with: Cursor
2026-03-27 20:11:28 +05:30
Sameer Kankute 1a0b30aaac Fix greptile reviews and mock test 2026-03-27 20:11:28 +05:30
Sameer Kankute c6cc0341f6 Fix greptile reviews and mock test 2026-03-27 20:11:28 +05:30
Sameer Kankute 316a742945 Fix greptile comments 2026-03-27 20:11:28 +05:30
Sameer Kankute 173695f5e0 Fix greptile comments 2026-03-27 20:11:27 +05:30
Sameer Kankute e8fb7762b3 perf(routing): optimize team model checks and improve test coverage
- Use O(1) team index lookup instead of map_team_model in alias guard
- Fix MockPrismaClient to validate where clause filters
- Add comment explaining DB query trade-off for team deployments

Made-with: Cursor
2026-03-27 20:11:27 +05:30