4435 Commits

Author SHA1 Message Date
yuneng-jiang a4341ccf83 ci/cd changes for debugging 2025-12-03 21:00:49 -08:00
yuneng-jiang 3de84b3f8b e2e tests 2025-12-03 20:34:41 -08:00
yuneng-jiang d215576477 Add auto redirect to SSO to new login page 2025-12-03 17:07:12 -08:00
yuneng-jiang 73824c278a Merge pull request #17443 from BerriAI/litellm_v2_login
[Feature] New Login Page
2025-12-03 16:23:47 -08:00
yuneng-jiang 37c598441f Change is_sso_configured to auto_redirect_to_sso 2025-12-03 15:48:50 -08:00
yuneng-jiang 8a1cf104e0 Merge remote-tracking branch 'origin' into litellm_ui_config_add_sso 2025-12-03 15:36:33 -08:00
yuneng-jiang ee63105e16 Merge pull request #17446 from BerriAI/litellm_ui_e2e_cicd_fix
[Fix] Change e2e test to look for Virtual Keys instead of API Keys
2025-12-03 15:35:22 -08:00
yuneng-jiang de4b79851f Change e2e test to look for Virtual Keys instead of API Keys 2025-12-03 15:26:25 -08:00
Ishaan Jaff 100cfc11ac [Bug Fix] Parallel Request Limiter with /messages (#17426)
* fix: use standard_logging_object for parallel request limiter

* fix test parallel request limiter
2025-12-03 14:13:28 -08:00
yuneng-jiang 9bb292f478 V2 login route 2025-12-03 12:41:45 -08:00
yuneng-jiang e6620fcdad Ruff checks 2025-12-03 11:01:10 -08:00
yuneng-jiang b3c0ea5414 Merge remote-tracking branch 'origin' into litellm_login_route_refactor 2025-12-03 10:40:11 -08:00
Sameer Kankute ece1e49fda Merge pull request #17414 from BerriAI/litellm_ragflow_vector_store
Add vector store support for ragflow
2025-12-03 20:48:52 +05:30
Sameer Kankute fcc2855baa Merge pull request #17403 from BerriAI/litellm_streaming_gemini_3_fix
Fix gemini 3 last chunk thinking block
2025-12-03 20:48:13 +05:30
Sameer Kankute 7e9c1ffb33 Merge pull request #17407 from BerriAI/litellm_enforce_enforce_user_param
Enforce support of enforce_user_param to openai post endpoints
2025-12-03 20:45:21 +05:30
Sameer Kankute 1fbe310444 Merge pull request #17405 from BerriAI/litellm_gemini_thought_sig_tool_beta
Make thought sign in tool call id as a beta feat
2025-12-03 20:43:53 +05:30
Sameer Kankute 8eaabb4ad7 Add vector store support for ragflow 2025-12-03 15:29:47 +05:30
Sameer Kankute 52090c3f3e Merge pull request #17350 from BerriAI/litellm_rag_chat_completion_api
Add ragflow support for chat completions API
2025-12-03 13:29:32 +05:30
Krish Dholakia 8edcc4ecc3 Guardrails API - add streaming support (#17400)
* fix(initial-commit): adding a way to get the right response type based on the api route

* feat(unified_guardrail.py): support streaming guardrails

* test: update tests

* fix: fix linting errors

* test: update tests
2025-12-02 22:52:09 -08:00
Sameer Kankute 54e29e7828 Enforce support of enforce_user_param to openai post endpoints 2025-12-03 12:19:21 +05:30
rioiart 1ac2655b17 Fix/organization max budget not enforced (#17334)
* test: add failing tests for organization budget enforcement bug

Add comprehensive tests exposing that organization-level budgets are
retrieved but never enforced during request authentication. Tests verify:

1. Basic org budget exceeded scenario (team under budget, org over)
2. Multiple teams collectively exceeding org budget
3. Organization budget fields exist but are never checked
4. Inconsistency between team budget enforcement (works) and org (doesn't)

Tests intentionally fail to document the bug. Will be fixed in next commit.

Related to organization_max_budget not being enforced in auth_checks.py

* fix: enforce organization budget in auth checks

Add organization budget enforcement to common_checks() in auth_checks.py.
Previously, organization_max_budget was retrieved from DB but never checked,
allowing teams to collectively exceed their organization's budget limit.

Changes:
- Add _organization_max_budget_check() function following team budget pattern
- Call org budget check after team budget check in common_checks()
- Add "organization_budget" to budget_alerts type literals
- Update tests to verify org budget is enforced

Budget hierarchy is now properly enforced:
  Organization Budget (hard ceiling)
    └─ Team Budget (sub-allocation)
        └─ Team Member Budget (per-user within team)
            └─ Key Budget (per-key)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: add organization_id to budget alerts, fix enum comparison and linting of newly added code

- Add organization_id field to CallInfo class for better alert context
- Include organization_id in budget alerts (token, soft, team, org)
- Fix event_group enum comparison (was comparing enum to string)
- Add OrganizationBudgetAlert class for organization budget alerting
- Add organization_budget to test parameterizations
- Apply Black formatting to slack_alerting.py

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-02 22:46:03 -08:00
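
The budget hierarchy described in the commit above can be illustrated as a chain of checks in which any scope over its limit blocks the request. This is only a minimal sketch, not the code in auth_checks.py: `BudgetExceededError`, `check_budget_chain`, the tuple layout, and the strict `>` comparison are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


class BudgetExceededError(Exception):
    """Hypothetical error type for this sketch; not LiteLLM's exception."""


def check_budget_chain(scopes: List[Tuple[str, float, Optional[float]]]) -> None:
    """Raise on the first scope whose spend is over its max budget.

    Each entry is (scope_name, spend, max_budget). A max_budget of None
    means no limit is configured for that scope, so it is skipped.
    """
    for name, spend, max_budget in scopes:
        if max_budget is not None and spend > max_budget:
            raise BudgetExceededError(
                f"{name} budget exceeded: {spend} > {max_budget}"
            )


# The team is under its own budget, but the organization is over its ceiling.
# Scopes are checked in list order (team first, then organization).
scopes = [
    ("team", 40.0, 50.0),
    ("organization", 120.0, 100.0),
    ("key", 5.0, None),
]

try:
    check_budget_chain(scopes)
    blocked = False
except BudgetExceededError:
    blocked = True
```

With these made-up numbers the request is blocked by the organization entry even though the team entry passes, which is the scenario the failing tests in the commit were written to expose.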
Matt Greathouse f22bc0aab2 Support Deepseek 3.2 with Reasoning (#17384)
* Add openrouter/deepseek/deepseek-v3.2

* Added deepseek-provided v3.2

* Allow reasoning effort param for openrouter models that support it

* Added tests
2025-12-02 22:00:19 -08:00
Richard Song 099ccf56a7 Refactor add_schema_to_components to move definitions to components/schemas and add corresponding unit test (#17389) 2025-12-02 21:57:07 -08:00
Sameer Kankute 40c203e32b Make thought sign in tool call id as a beta feat 2025-12-03 11:04:53 +05:30
Sameer Kankute 209e9e05aa Fix gemini 3 last chunk thinking block 2025-12-03 10:13:59 +05:30
yuneng-jiang 7fea97a0c0 Add is_sso_configured to UI Config 2025-12-02 17:53:16 -08:00
Ishaan Jaff 427074ac6e Fix: Datadog callback regression when ddtrace is installed (#17393)
* fix DD agent host logging

* docs fix

* test_datadog_agent_configuration

* test_datadog_ignores_ddtrace_agent_host
2025-12-02 17:27:50 -08:00
flozonn 31cad8e6e6 feat: Add Nova lite 2 reasoning support with reasoningConfig (#17371) 2025-12-02 16:33:07 -08:00
Ishaan Jaff 6c188c5ae2 [Feat] New model/provider - Adds support for Google Cloud Chirp3 HD on /speech (#17391)
* docs vertex tts

* place vertex ai types in file

* use VertexAITextToSpeechConfig

* use vertex_voice_dict

* refactor docs

* docs vertex ai chirp

* TestVertexAITextToSpeechConfig

* new provider vertex ai chirp3

* test_litellm_speech_vertex_ai_chirp

* add vertex_ai/chirp cost tracking
2025-12-02 15:36:23 -08:00
Leslie Cheng de4ff120eb 🐛 Fix proxy caching between requests in aiohttp transport (#17122)
* write a regression test

* impl fix

* add test for host case

* use the host as cache key
2025-12-02 14:37:45 -08:00
yuneng-jiang 6ee9d9c344 /login route refactor 2025-12-02 11:19:27 -08:00
kothamah 12530b375f Litellm bedrock OpenAI model support (#17368)
* Update constants.py

added constants

* Update base_aws_llm.py

added steps

* Update invoke_handler.py

added openai support

* Update base_invoke_transformation.py

added

* Update test_bedrock_completion.py

added
2025-12-02 09:19:53 -08:00
Sameer Kankute 397aceced8 Merge pull request #17342 from BerriAI/litellm_fix_mcp_auth_header_forwarding
Fix: litellm user auth not passing issue
2025-12-02 22:20:33 +05:30
Sameer Kankute 18a9af3488 Merge pull request #17291 from BerriAI/litellm_fix_correct_attribute_error_code_raise
Fix 500 error for malformed request
2025-12-02 22:17:48 +05:30
Ishaan Jaff 1bb9e1bde8 [Feat] Add vllm batch+files API support (#15823)
* add OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS

* fix use OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS

* add _get_batch_job_total_usage_from_file_content

* fixes for vLLM + TwelveLabs async invoke

* fix: vLLM Batch APIs

* afile_retrieve

* test_hosted_vllm_full_workflow

* fix SERVER_URL for test
2025-12-02 08:41:50 -08:00
Sameer Kankute 831ad45c4d Add ragflow support 2025-12-02 18:18:08 +05:30
Sameer Kankute 082c8af37f Fix: litellm user auth not passing issue 2025-12-02 11:25:32 +05:30
Krish Dholakia 4c7a988454 Guardrail API V2 - user api key metadata, session id, specify input type (request/response), image support (#17338)
* refactor(generic_guardrail_api.py): refactor to update to new guardrail api logic

* refactor: refactor llm api integrations to support passing in text as a list[str] instead of one at a time

* refactor: fix linting errors

* refactor: pass request type to guardrail api

allows request vs. response processing to occur

* feat: pass user api key dict information to the guardrail api

* fix: pass user api key dict information to the guardrail api

* feat: pass litellm call id + trace id, if present

* docs: update docs
2025-12-01 20:11:58 -08:00
Korbinian Koch 6e8e3b30f9 Update Databricks model pricing and add new models (including databricks pricing test). (#17277)
* update databricks pricing and add DBU<>USD test

* Refactor test_databricks_pricing.py

Removed unnecessary sys.path modification and cleaned up comments.
2025-12-01 20:06:47 -08:00
codgician e09e309371 feat(github-copilot): Add Embedding API support (#17278) 2025-12-01 20:05:28 -08:00
Boxuan Li 89458573a2 Add context window exception mapping for Together AI (#17284) 2025-12-01 20:02:59 -08:00
YutaSaito da5b81c1ff feat: add experimental latest-user filtering for Bedrock (#17282)
* feat: add experimental latest-user filtering for Bedrock

* doc: add experimental bedrock latest-message flag
2025-12-01 20:02:28 -08:00
Cesar Garcia 01dfc3561a Fix AttributeError when metadata is null in request body (#17263) (#17306)
Handle the case where metadata is explicitly set to null/None in the
request body. This was causing a 401 error with "'NoneType' object
has no attribute 'get'" when calling /v1/batches with metadata: null.

The fix uses `or {}` instead of a default dict value since the key
exists but has a None value.
2025-12-01 19:58:27 -08:00
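
The difference between a default dict value and `or {}` that the commit above relies on can be shown in a few lines. This is a minimal sketch of the general Python behavior; `get_metadata` and the sample request body are hypothetical names for illustration, not the actual LiteLLM code.

```python
from typing import Any, Dict


def get_metadata(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the request metadata, treating an explicit null as empty.

    dict.get() only falls back to its default when the key is missing,
    so a body containing "metadata": None needs the `or {}` guard.
    """
    return body.get("metadata") or {}


# Hypothetical request body with metadata explicitly set to null.
body = {"model": "example-model", "metadata": None}

# The default argument is ignored because the key exists.
with_default = body.get("metadata", {})

# The `or {}` form returns a usable dict, so .get() calls on it are safe.
safe = get_metadata(body)
user = safe.get("user_id")
```

Here `with_default` is still `None`, which is what produced the "'NoneType' object has no attribute 'get'" error, while `safe` is an empty dict.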
Cesar Garcia 965406c643 feat(provider): add Z.AI (Zhipu AI) as built-in provider (#17307)
* feat(provider): add Z.AI (Zhipu AI) as built-in provider

Add support for Z.AI GLM models as a native OpenAI-compatible provider.

- Add "zai" to openai_compatible_providers list
- Add ZAI enum to LlmProviders
- Add provider URL resolution for https://api.z.ai/api/paas/v4
- Add 8 GLM models with pricing to model cost maps:
  - glm-4.6 (200K context, $0.6/$2.2 per 1M tokens)
  - glm-4.5, glm-4.5v, glm-4.5-x, glm-4.5-air, glm-4.5-airx
  - glm-4-32b-0414-128k
  - glm-4.5-flash (free tier)
- Add unit tests for provider integration

Closes #17289

* docs: add Z.AI provider documentation

- Add zai.md with usage examples, model list, and pricing
- Add to sidebars.js navigation
2025-12-01 19:56:47 -08:00
idola9 71efcb7115 Refactor Noma guardrail to use shared Responses transformation and include system instructions (#17315)
* Support system prompts in noma guardrails

* Use litellm util to convert chat completions to responses api
2025-12-01 19:56:14 -08:00
rioiart 98a244450e Fix sso users not added to entra synced team (#17331)
* test: add failing tests for SSO user not added to Entra-synced teams bug

Adds tests reproducing the bug where new SSO users with teams=None
(from NewUserResponse) are not added to Entra ID synced teams because
add_missing_team_member() returns early when teams is None.

Tests demonstrate:
- NewUserResponse with teams=None fails to add user to teams (bug)
- LiteLLM_UserTable with teams=[] correctly adds user to teams (control)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: treat None as empty list in add_missing_team_member for new SSO users

Fixed bug where new SSO users logging in via Microsoft SSO were not added
to their Entra-synced teams. The issue was an early return when
user_info.teams is None (default for NewUserResponse). Now treats None
as an empty list so new users are properly added to all their SSO teams.

Location: litellm/proxy/management_endpoints/ui_sso.py:438-440

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-01 19:53:30 -08:00
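
The early-return bug described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code in ui_sso.py: `missing_team_ids` and its parameters are hypothetical, and the real add_missing_team_member() works with LiteLLM's own user and team objects.

```python
from typing import List, Optional


def missing_team_ids(
    user_teams: Optional[List[str]], sso_team_ids: List[str]
) -> List[str]:
    """Return the SSO team ids the user does not belong to yet.

    A brand-new user may have teams=None. Treating None as an empty list
    (instead of returning early) means every SSO team counts as missing.
    """
    current = user_teams or []
    return [team_id for team_id in sso_team_ids if team_id not in current]


# New SSO user: teams is None, so all synced teams should be added.
new_user_missing = missing_team_ids(None, ["team-a", "team-b"])

# Existing user already in team-a: only team-b is missing.
existing_user_missing = missing_team_ids(["team-a"], ["team-a", "team-b"])
```

In this sketch a `None` teams value and an empty list behave the same way, which matches the control case the commit's tests use (`teams=[]` adds the user correctly).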
rioiart 70126d9130 Fix/new org team validate against org (#17333)
* fix: skip user budget/model validation for org-scoped teams

When creating a team with organization_id, budget and model constraints
should be validated against the organization's limits, not the user's
personal limits. This allows org admins with restrictive personal
budgets to create teams within their organization's more generous limits.

Adds 4 unit tests to verify:
- Org-scoped teams bypass user budget validation
- Org-scoped teams bypass user model validation
- Standalone teams still validate against user limits

* fix: enforce user budget/model limits for standalone teams in update_team

- Add user-level budget and model validation to update_team endpoint for standalone teams,
  matching the existing pattern in new_team
- Org-scoped teams correctly bypass user validation and use organization limits instead
- Add 5 new comprehensive tests covering standalone/org team budget/model validation

* fix: Add direct TPM/RPM org limit validation and consolidate user team limit checks

- Add direct TPM/RPM comparison against org limits in _check_org_team_limits()
- Consolidate budget/models/TPM/RPM user validation into _check_user_team_limits() helper
- Ensure user limits only apply to standalone teams (organization_id=None)
- Org-scoped teams now validate TPM/RPM against org limits (not user limits)
- Add 8 tests for TPM/RPM validation scenarios (org and user limits)
- Reduce code duplication between new_team() and update_team()
2025-12-01 19:51:42 -08:00
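
The rule described in the commit above (org-scoped teams are validated against organization limits, standalone teams against the creating user's limits) can be sketched like this. The function names, parameters, and the choice to treat a missing value as "no constraint" are assumptions for illustration; they are not the actual _check_org_team_limits() or _check_user_team_limits() helpers.

```python
from typing import Optional


def within_limit(requested: Optional[float], ceiling: Optional[float]) -> bool:
    """Assumption for this sketch: a missing value on either side passes."""
    if requested is None or ceiling is None:
        return True
    return requested <= ceiling


def validate_team_budget(
    requested_budget: Optional[float],
    organization_id: Optional[str],
    user_max_budget: Optional[float],
    org_max_budget: Optional[float],
) -> bool:
    """Pick the ceiling by team type, then compare the requested budget."""
    if organization_id is not None:
        ceiling = org_max_budget   # org-scoped team: organization limit applies
    else:
        ceiling = user_max_budget  # standalone team: user limit applies
    return within_limit(requested_budget, ceiling)


# An org admin with a small personal budget creating a team inside a
# larger organization budget (all numbers are made up).
org_scoped_ok = validate_team_budget(500.0, "org-1", 10.0, 1000.0)
standalone_ok = validate_team_budget(500.0, None, 10.0, 1000.0)
```

The same request passes for the org-scoped team and fails for the standalone one, which is the behavior the commit's unit tests are described as covering.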
Ishaan Jaff 1cdfb3da8f [Bug Fix] - Fix litellm_enterprise ensure imported routes exist (#17337)
* test_enterprise_routes.py

* test_enterprise_routes_all_imports_exist
2025-12-01 19:14:12 -08:00
Sameer Kankute 289c13ca5d Merge pull request #17260 from abi-jey/main
fix: GA path for azure openai realtime models
2025-12-02 08:34:10 +05:30
Sameer Kankute dbf1cd591d Merge pull request #17271 from colinlin-stripe/cherry-pick-invoke-headers
[fix] extra_headers in messages api bedrock invoke
2025-12-02 08:26:54 +05:30