litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 07:33:58 +00:00

Author	SHA1	Message	Date
yuneng-jiang	73824c278a	Merge pull request #17443 from BerriAI/litellm_v2_login [Feature] New Login Page	2025-12-03 16:23:47 -08:00
yuneng-jiang	37c598441f	Change is_sso_configured to auto_redirect_to_sso	2025-12-03 15:48:50 -08:00
yuneng-jiang	8a1cf104e0	Merge remote-tracking branch 'origin' into litellm_ui_config_add_sso	2025-12-03 15:36:33 -08:00
Ishaan Jaff	100cfc11ac	[Bug Fix] Parallel Request Limiter with /messages (#17426 ) * fix: use standard_logging_object for parallel request limiter * fix test parallel request limtier	2025-12-03 14:13:28 -08:00
yuneng-jiang	9bb292f478	V2 login route	2025-12-03 12:41:45 -08:00
yuneng-jiang	e6620fcdad	Ruff checks	2025-12-03 11:01:10 -08:00
yuneng-jiang	b3c0ea5414	Merge remote-tracking branch 'origin' into litellm_login_route_refactor	2025-12-03 10:40:11 -08:00
Sameer Kankute	ece1e49fda	Merge pull request #17414 from BerriAI/litellm_ragflow_vector_store Add vector store support for ragflow	2025-12-03 20:48:52 +05:30
Sameer Kankute	7e9c1ffb33	Merge pull request #17407 from BerriAI/litellm_enforce_enforce_user_param Enforce support of enforce_user_param to openai post endpoints	2025-12-03 20:45:21 +05:30
Sameer Kankute	1fbe310444	Merge pull request #17405 from BerriAI/litellm_gemini_thought_sig_tool_beta Make thought sign in tool call id as a beta feat	2025-12-03 20:43:53 +05:30
Sameer Kankute	8eaabb4ad7	Add vector store support for ragflow	2025-12-03 15:29:47 +05:30
Sameer Kankute	52090c3f3e	Merge pull request #17350 from BerriAI/litellm_rag_chat_completion_api Add ragflow support for chat completions API	2025-12-03 13:29:32 +05:30
Krish Dholakia	8edcc4ecc3	Guardrails API - add streaming support (#17400 ) * fix(initial-commit): adding a way to get the right response type based on the api route * feat(unified_guardrail.py): support streaming guardrails * test: update tests * fix: fix linting errors * test: update tests	2025-12-02 22:52:09 -08:00
Sameer Kankute	54e29e7828	Enforce support of enforce_user_param to openai post endpoints	2025-12-03 12:19:21 +05:30
rioiart	1ac2655b17	Fix/organization max budget not enforced (#17334 ) * test: add failing tests for organization budget enforcement bug Add comprehensive tests exposing that organization-level budgets are retrieved but never enforced during request authentication. Tests verify: 1. Basic org budget exceeded scenario (team under budget, org over) 2. Multiple teams collectively exceeding org budget 3. Organization budget fields exist but are never checked 4. Inconsistency between team budget enforcement (works) and org (doesn't) Tests intentionally fail to document the bug. Will be fixed in next commit. Related to organization_max_budget not being enforced in auth_checks.py * fix: enforce organization budget in auth checks Add organization budget enforcement to common_checks() in auth_checks.py. Previously, organization_max_budget was retrieved from DB but never checked, allowing teams to collectively exceed their organization's budget limit. Changes: - Add _organization_max_budget_check() function following team budget pattern - Call org budget check after team budget check in common_checks() - Add "organization_budget" to budget_alerts type literals - Update tests to verify org budget is enforced Budget hierarchy is now properly enforced: Organization Budget (hard ceiling) └─ Team Budget (sub-allocation) └─ Team Member Budget (per-user within team) └─ Key Budget (per-key) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: add organization_id to budget alerts, fix enum comparison and linting of newly added code - Add organization_id field to CallInfo class for better alert context - Include organization_id in budget alerts (token, soft, team, org) - Fix event_group enum comparison (was comparing enum to string) - Add OrganizationBudgetAlert class for organization budget alerting - Add organization_budget to test parameterizations - Apply Black formatting to slack_alerting.py --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-12-02 22:46:03 -08:00
Matt Greathouse	f22bc0aab2	Support Deepseek 3.2 with Reasoning (#17384 ) * Add openrouter/deepseek/deepseek-v3.2 * Added deepseek-provided v3.2 * Allow reasoning effort param for openrouter models that support it * Added tests	2025-12-02 22:00:19 -08:00
Richard Song	099ccf56a7	Refactor add_schema_to_components to move definitions to components/schemas and add corresponding unit test (#17389 )	2025-12-02 21:57:07 -08:00
Sameer Kankute	40c203e32b	Make thought sign in tool call id as a beta feat	2025-12-03 11:04:53 +05:30
yuneng-jiang	7fea97a0c0	Add is_sso_configured to UI Config	2025-12-02 17:53:16 -08:00
flozonn	31cad8e6e6	feat: Add Nova lite 2 reasoning support with reasoningConfig (#17371 )	2025-12-02 16:33:07 -08:00
Leslie Cheng	de4ff120eb	🐛 Fix proxy caching between requests in aiohttp transport (#17122 ) * write a regression test * impl fix * add test for host case * use the host as cache key	2025-12-02 14:37:45 -08:00
yuneng-jiang	6ee9d9c344	/login route refactor	2025-12-02 11:19:27 -08:00
Sameer Kankute	831ad45c4d	Add ragflow support	2025-12-02 18:18:08 +05:30
Sameer Kankute	082c8af37f	Fix: litellm user auth not passing issue	2025-12-02 11:25:32 +05:30
Krish Dholakia	4c7a988454	Guardrail API V2 - user api key metadata, session id, specify input type (request/response), image support (#17338 ) * refactor(generic_guardrail_api.py): refactor to update to new guardrail api logic * refactor: refactor llm api integrations to support passing in text as a list[str] instead of one at a time * refactor: fix linting errors * refactor: pass request type to guardrail api allows request vs. response processing to occur * feat: pass user api key dict information to the guardrail api * fix: pass user api key dict information to the guardrail api * feat: pass litellm call id + trace id, if present * docs: update docs	2025-12-01 20:11:58 -08:00
Korbinian Koch	6e8e3b30f9	Update Databricks model pricing and add new models (including databricks pricing test). (#17277 ) * update databricks pricing and add DBU<>USD test * Refactor test_databricks_pricing.py Removed unnecessary sys.path modification and cleaned up comments.	2025-12-01 20:06:47 -08:00
codgician	e09e309371	feat(github-copilot): Add Embedding API support (#17278 )	2025-12-01 20:05:28 -08:00
Boxuan Li	89458573a2	Add context window exception mapping for Together AI (#17284 )	2025-12-01 20:02:59 -08:00
Cesar Garcia	01dfc3561a	Fix AttributeError when metadata is null in request body (#17263 ) (#17306 ) Handle the case where metadata is explicitly set to null/None in the request body. This was causing a 401 error with "'NoneType' object has no attribute 'get'" when calling /v1/batches with metadata: null. The fix uses `or {}` instead of a default dict value since the key exists but has a None value.	2025-12-01 19:58:27 -08:00
Cesar Garcia	965406c643	feat(provider): add Z.AI (Zhipu AI) as built-in provider (#17307 ) * feat(provider): add Z.AI (Zhipu AI) as built-in provider Add support for Z.AI GLM models as a native OpenAI-compatible provider. - Add "zai" to openai_compatible_providers list - Add ZAI enum to LlmProviders - Add provider URL resolution for https://api.z.ai/api/paas/v4 - Add 8 GLM models with pricing to model cost maps: - glm-4.6 (200K context, $0.6/$2.2 per 1M tokens) - glm-4.5, glm-4.5v, glm-4.5-x, glm-4.5-air, glm-4.5-airx - glm-4-32b-0414-128k - glm-4.5-flash (free tier) - Add unit tests for provider integration Closes #17289 * docs: add Z.AI provider documentation - Add zai.md with usage examples, model list, and pricing - Add to sidebars.js navigation	2025-12-01 19:56:47 -08:00
idola9	71efcb7115	Refactor Noma guardrail to use shared Responses transformation and include system instructions (#17315 ) * Support system prompts in noma guardrails * Use litellm util to covert chat completions to responses api	2025-12-01 19:56:14 -08:00
rioiart	98a244450e	Fix sso users not added to entra synced team (#17331 ) * test: add failing tests for SSO user not added to Entra-synced teams bug Adds tests reproducing the bug where new SSO users with teams=None (from NewUserResponse) are not added to Entra ID synced teams because add_missing_team_member() returns early when teams is None. Tests demonstrate: - NewUserResponse with teams=None fails to add user to teams (bug) - LiteLLM_UserTable with teams=[] correctly adds user to teams (control) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: treat None as empty list in add_missing_team_member for new SSO users Fixed bug where new SSO users logging in via Microsoft SSO were not added to their Entra-synced teams. The issue was an early return when user_info.teams is None (default for NewUserResponse). Now treats None as an empty list so new users are properly added to all their SSO teams. Location: litellm/proxy/management_endpoints/ui_sso.py:438-440 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-12-01 19:53:30 -08:00
rioiart	70126d9130	Fix/new org team validate against org (#17333 ) * fix: skip user budget/model validation for org-scoped teams When creating a team with organization_id, budget and model constraints should be validated against the organization's limits, not the user's personal limits. This allows org admins with restrictive personal budgets to create teams within their organization's more generous limits. Adds 4 unit tests to verify: - Org-scoped teams bypass user budget validation - Org-scoped teams bypass user model validation - Standalone teams still validate against user limits * fix: enforce user budget/model limits for standalone teams in update_team - Add user-level budget and model validation to update_team endpoint for standalone teams, matching the existing pattern in new_team - Org-scoped teams correctly bypass user validation and use organization limits instead - Add 5 new comprehensive tests covering standalone/org team budget/model validation * fix: Add direct TPM/RPM org limit validation and consolidate user team limit checks - Add direct TPM/RPM comparison against org limits in _check_org_team_limits() - Consolidate budget/models/TPM/RPM user validation into _check_user_team_limits() helper - Ensure user limits only apply to standalone teams (organization_id=None) - Org-scoped teams now validate TPM/RPM against org limits (not user limits) - Add 8 tests for TPM/RPM validation scenarios (org and user limits) - Reduce code duplication between new_team() and update_team()	2025-12-01 19:51:42 -08:00
Ishaan Jaff	1cdfb3da8f	[Bug Fix] - Fix `litellm_enterprise` ensure imported routes exist (#17337 ) * test_enterprise_routes.py * test_enterprise_routes_all_imports_exist	2025-12-01 19:14:12 -08:00
Sameer Kankute	289c13ca5d	Merge pull request #17260 from abi-jey/main fix: GA path for azure openai realtime models	2025-12-02 08:34:10 +05:30
Sameer Kankute	dbf1cd591d	Merge pull request #17271 from colinlin-stripe/cherry-pick-invoke-headers [fix] extra_headers in messages api bedrock invoke	2025-12-02 08:26:54 +05:30
Ishaan Jaff	860cdc81d3	[Fix] Fix Watsonx Audio Transcription API (#17326 ) * """ add * fix transform_audio_transcription_request * fix tests * test_watsonx_transcription_request_body	2025-12-01 18:26:56 -08:00
Elias	37ecb03d4f	Add support of audio transcription for OVHcloud (#17305 )	2025-12-01 18:26:39 -08:00
Krish Dholakia	1eb06f8031	Revert "fix: respect guardrail mock_response during during_call to return blo…" (#17332 ) This reverts commit `6de6107673`.	2025-12-01 15:40:28 -08:00
Ishaan Jaff	24f847b84c	[Feat] JWT Auth - AI Gateway, allow using regular OIDC flow with user info endpoints (#17324 ) * feat: allow fetching OIDC user info * test: use test_auth_builder_with_oidc_userinfo_enabled gets user info when enabled * fix tool permission doc * docs fix diagram	2025-12-01 13:59:00 -08:00
Ishaan Jaff	ce0dc0c8b9	[Feat] WatsonX - allow passing zen_api_key dynamically (#16655 ) * test_watsonx_zen_api_key_from_client * zen api key * docs using zen api key	2025-12-01 12:55:47 -08:00
Colin Lin	661bccbc39	fixed flaky test by sorting list	2025-12-01 14:26:14 -05:00
Colin Lin	e420b633a1	add tests	2025-12-01 14:25:37 -05:00
orgersh92	7808a610f8	Fix session consistency, move Lasso API version away from source code (#17316 ) * store and fetch lasso-conversation id from cache * include gateway/v# in the baseUrl to allow simpler version migrations in the future * add tests for cached conversation ID	2025-12-01 10:03:51 -08:00
YutaSaito	6de6107673	fix: respect guardrail mock_response during during_call to return blocked output (#17247 )	2025-12-01 09:59:01 -08:00
Sameer Kankute	353c779c34	Merge pull request #17301 from BerriAI/litellm_claude_code_beta_fix Remove not compatible beta header from Bedrock	2025-12-01 21:39:37 +05:30
Sameer Kankute	983ba7aa0f	Remove not compatible beta header from claude code	2025-12-01 17:22:04 +05:30
Sameer Kankute	7dac498efb	Add passthrough cost tracking for veo	2025-12-01 14:33:03 +05:30
Sameer Kankute	7f42b9b987	Merge pull request #17193 from BerriAI/litellm_twelvelabs_int Added support for twelvelabs pegasus	2025-11-28 22:09:00 +05:30
Sameer Kankute	9d058398df	Fix pegasus response and add doc	2025-11-28 21:41:25 +05:30
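
Note on commit 01dfc3561a: its message says the fix uses `or {}` instead of a default dict value, because the `metadata` key exists but holds None. The sketch below only illustrates that general Python behavior. The function names and the request payload are hypothetical and are not taken from the litellm source.

```python
from typing import Any, Dict


def get_metadata_with_default(body: Dict[str, Any]) -> Any:
    # A dict default only applies when the key is absent,
    # so an explicit None value is returned unchanged.
    return body.get("metadata", {})


def get_metadata_or_empty(body: Dict[str, Any]) -> Dict[str, Any]:
    # `or {}` also covers a key that is present but set to None.
    return body.get("metadata") or {}


# Hypothetical request body with metadata explicitly set to null.
request_body = {"model": "example-model", "metadata": None}

print(get_metadata_with_default(request_body))  # None
print(get_metadata_or_empty(request_body))      # {}
```

With the first form, a later call such as `.get(...)` on the result would raise the "'NoneType' object has no attribute 'get'" error quoted in the commit message. The second form always yields a dict.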

1560 Commits