Commit Graph

28119 Commits

Author SHA1 Message Date
yuneng-jiang a4341ccf83 ci/cd changes for debugging 2025-12-03 21:00:49 -08:00
yuneng-jiang 3de84b3f8b e2e tests 2025-12-03 20:34:41 -08:00
yuneng-jiang d215576477 Add auto redirect to SSO to new login page 2025-12-03 17:07:12 -08:00
yuneng-jiang 73824c278a Merge pull request #17443 from BerriAI/litellm_v2_login
[Feature] New Login Page
2025-12-03 16:23:47 -08:00
yuneng-jiang c5670839b6 Merge pull request #17399 from BerriAI/litellm_ui_config_add_sso
[Feature] Add auto_redirect_to_sso to UI Config
2025-12-03 15:59:06 -08:00
yuneng-jiang 37c598441f Change is_sso_configured to auto_redirect_to_sso 2025-12-03 15:48:50 -08:00
yuneng-jiang 8a1cf104e0 Merge remote-tracking branch 'origin' into litellm_ui_config_add_sso 2025-12-03 15:36:33 -08:00
yuneng-jiang ee63105e16 Merge pull request #17446 from BerriAI/litellm_ui_e2e_cicd_fix
[Fix] Change e2e test to look for Virtual Keys instead of API Keys
2025-12-03 15:35:22 -08:00
yuneng-jiang de4b79851f Change e2e test to look for Virtual Keys instead of API Keys 2025-12-03 15:26:25 -08:00
dependabot[bot] 462d423d86 Bump mcp from 1.10.1 to 1.23.0 in /.circleci (#17363)
Bumps [mcp](https://github.com/modelcontextprotocol/python-sdk) from 1.10.1 to 1.23.0.
- [Release notes](https://github.com/modelcontextprotocol/python-sdk/releases)
- [Changelog](https://github.com/modelcontextprotocol/python-sdk/blob/main/RELEASE.md)
- [Commits](https://github.com/modelcontextprotocol/python-sdk/compare/v1.10.1...v1.23.0)

---
updated-dependencies:
- dependency-name: mcp
  dependency-version: 1.23.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-03 15:19:34 -08:00
yuneng-jiang 43da0793d3 Fixed typo 2025-12-03 15:13:19 -08:00
yuneng-jiang b14b4a7112 New login screen using v2/login 2025-12-03 15:05:29 -08:00
Ishaan Jaff 100cfc11ac [Bug Fix] Parallel Request Limiter with /messages (#17426)
* fix: use standard_logging_object for parallel request limiter

* fix test parallel request limiter
2025-12-03 14:13:28 -08:00
Ishaan Jaffer 9b3d8302cf docs fix stable 2025-12-03 14:12:50 -08:00
yuneng-jiang 5f43e7a2d2 New login page WIP 2025-12-03 13:18:33 -08:00
yuneng-jiang 9bb292f478 V2 login route 2025-12-03 12:41:45 -08:00
yuneng-jiang 8aa939dd67 Merge pull request #17317 from BerriAI/litellm_ui_cred_refresh
[Feature] Delete Credential Enhancements
2025-12-03 12:20:55 -08:00
yuneng-jiang 857614d586 Merge pull request #17436 from BerriAI/litellm_ui_model_page_scrollbars
[Fix] Remove second scrollbar when sidebar is expanded + tooltip z index
2025-12-03 12:20:00 -08:00
yuneng-jiang 9783f0ff6c Remove second scrollbar when sidebar is expanded + tooltip z index 2025-12-03 12:14:48 -08:00
Alexsander Hamir 56328e6535 [Refactor#2] litellm/init – Lazy-load utils to reduce memory + import time (#17171)
* fix: lazy load utils.py imports

Lazy-load most functions and response types from utils.py to avoid loading
tiktoken and other heavy dependencies at import time. This significantly
reduces memory usage when importing completion from litellm.
2025-12-03 11:40:16 -08:00
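
The lazy-loading idea described in the commit above can be sketched with a module-level `__getattr__` (PEP 562). This is an illustration only, not the code from the commit: `make_lazy_module`, `_LAZY_ATTRS`, and the `lazy_demo` module name are hypothetical, and the stdlib modules `json` and `decimal` stand in for heavy dependencies such as tiktoken.

```python
import importlib
import types

# Hypothetical mapping: public name -> (module that defines it, attribute name).
# Stdlib modules stand in for expensive imports here.
_LAZY_ATTRS = {
    "dumps": ("json", "dumps"),
    "Decimal": ("decimal", "Decimal"),
}


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache the value so later lookups bypass __getattr__ entirely.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", _LAZY_ATTRS)
```

In a real package the `__getattr__` function would normally be defined directly in the package's `__init__.py`; building the module with `types.ModuleType` just keeps the sketch self-contained.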
yuneng-jiang dcb7552b79 Merge pull request #17379 from BerriAI/litellm_login_route_refactor
[Refactor] /login route
2025-12-03 11:32:16 -08:00
Felipe Garé 82a8134d7a fixing optional parameter default value (#17434) 2025-12-03 11:24:32 -08:00
yuneng-jiang 8df8a7ef95 Merge remote-tracking branch 'origin' into litellm_ui_cred_refresh 2025-12-03 11:04:09 -08:00
yuneng-jiang e6620fcdad Ruff checks 2025-12-03 11:01:10 -08:00
Felipe Garé 5ecebe2a57 adding status parameter as optional for FileObject (#17431) 2025-12-03 11:00:18 -08:00
Alexsander Hamir 0a7602bb7c fix: prevent memory leak in aiohttp connection pooling (#17388)
* fix: prevent memory leak in aiohttp connection pooling

Add connection limits to aiohttp TCPConnector to prevent unbounded
connection growth that causes memory leaks. Without these limits,
aiohttp's _wrap_create_connection can accumulate connections
indefinitely in long-running processes.

Changes:
- Set default limit of 300 total connections and 50 per host
- Apply limits to shared proxy session initialization
- Apply limits to HTTP handler transport creation
- Configurable via AIOHTTP_CONNECTOR_LIMIT and
  AIOHTTP_CONNECTOR_LIMIT_PER_HOST environment variables
- Set to 0 for unlimited (not recommended for production)

This fix covers:
- All standard LLM provider API calls (OpenAI, Anthropic, etc.)
- Proxy server shared session
- Most guardrail HTTP calls

Impact: Prevents memory exhaustion in high-traffic deployments and
long-running proxy servers that make thousands of API calls.

Testing: Verified connection limits are applied correctly and
existing functionality remains unchanged.
2025-12-03 10:43:27 -08:00
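
As a rough sketch of the configuration described in the commit above, the helper below reads the two environment variables and falls back to the stated defaults (300 total, 50 per host). The helper name `connector_limit_kwargs` and its handling of empty or negative values are assumptions, not taken from the commit. The returned keys match the `limit` and `limit_per_host` parameters of `aiohttp.TCPConnector`, where 0 means unlimited.

```python
import os

# Defaults as stated in the commit message.
DEFAULT_LIMIT = 300
DEFAULT_LIMIT_PER_HOST = 50


def connector_limit_kwargs(env=None):
    """Return keyword arguments suitable for aiohttp.TCPConnector.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env

    def read(name, default):
        raw = env.get(name)
        if raw is None or raw.strip() == "":
            return default  # unset or empty: use the default (assumed behavior)
        value = int(raw)
        if value < 0:
            raise ValueError(f"{name} must be >= 0, got {value}")
        return value  # 0 means "unlimited" for aiohttp

    return {
        "limit": read("AIOHTTP_CONNECTOR_LIMIT", DEFAULT_LIMIT),
        "limit_per_host": read(
            "AIOHTTP_CONNECTOR_LIMIT_PER_HOST", DEFAULT_LIMIT_PER_HOST
        ),
    }


# Usage, assuming aiohttp is installed:
#   connector = aiohttp.TCPConnector(**connector_limit_kwargs())
```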
yuneng-jiang b3c0ea5414 Merge remote-tracking branch 'origin' into litellm_login_route_refactor 2025-12-03 10:40:11 -08:00
Cesar Garcia 5e791464af docs: add Microsoft Agent Lightning to projects (#17422)
Add Agent Lightning, Microsoft's open-source framework for training
AI agents with RL, APO, and SFT. Uses LiteLLM Proxy for LLM routing
and trace collection.
2025-12-03 09:07:02 -08:00
Krrish Dholakia be5dd234bf docs: fix list 2025-12-03 08:01:26 -08:00
Sameer Kankute ece1e49fda Merge pull request #17414 from BerriAI/litellm_ragflow_vector_store
Add vector store support for ragflow
2025-12-03 20:48:52 +05:30
Sameer Kankute fcc2855baa Merge pull request #17403 from BerriAI/litellm_streaming_gemini_3_fix
Fix gemini 3 last chunk thinking block
2025-12-03 20:48:13 +05:30
Sameer Kankute 7967da77c0 Merge pull request #17419 from BerriAI/litellm_fix_bedrock_models_model_map
Fix bedrock models in model map
2025-12-03 20:45:45 +05:30
Sameer Kankute 7e9c1ffb33 Merge pull request #17407 from BerriAI/litellm_enforce_enforce_user_param
Enforce support of enforce_user_param to openai post endpoints
2025-12-03 20:45:21 +05:30
Sameer Kankute 1fbe310444 Merge pull request #17405 from BerriAI/litellm_gemini_thought_sig_tool_beta
Make thought signature in tool call ID a beta feature
2025-12-03 20:43:53 +05:30
Sameer Kankute c9c7823f43 Fix bedrock models in model map 2025-12-03 17:31:20 +05:30
Sameer Kankute dad0b2c111 Fix unused imports 2025-12-03 15:32:42 +05:30
Sameer Kankute 8eaabb4ad7 Add vector store support for ragflow 2025-12-03 15:29:47 +05:30
Sameer Kankute 52090c3f3e Merge pull request #17350 from BerriAI/litellm_rag_chat_completion_api
Add ragflow support for chat completions API
2025-12-03 13:29:32 +05:30
Krish Dholakia 8edcc4ecc3 Guardrails API - add streaming support (#17400)
* fix(initial-commit): adding a way to get the right response type based on the api route

* feat(unified_guardrail.py): support streaming guardrails

* test: update tests

* fix: fix linting errors

* test: update tests
2025-12-02 22:52:09 -08:00
Krish Dholakia 74ba18df55 Litellm chainguard fixes 12 02 2025 p1 (#17406)
* build: update dockerfile non root

* build: update build

* build: update non root

* build: dockerfile fixes

* build: ensure dockerfile + dockerfile.database also work
2025-12-02 22:50:13 -08:00
Sameer Kankute 54e29e7828 Enforce support of enforce_user_param to openai post endpoints 2025-12-03 12:19:21 +05:30
rioiart 1ac2655b17 Fix/organization max budget not enforced (#17334)
* test: add failing tests for organization budget enforcement bug

Add comprehensive tests exposing that organization-level budgets are
retrieved but never enforced during request authentication. Tests verify:

1. Basic org budget exceeded scenario (team under budget, org over)
2. Multiple teams collectively exceeding org budget
3. Organization budget fields exist but are never checked
4. Inconsistency between team budget enforcement (works) and org (doesn't)

Tests intentionally fail to document the bug. Will be fixed in next commit.

Related to organization_max_budget not being enforced in auth_checks.py

* fix: enforce organization budget in auth checks

Add organization budget enforcement to common_checks() in auth_checks.py.
Previously, organization_max_budget was retrieved from DB but never checked,
allowing teams to collectively exceed their organization's budget limit.

Changes:
- Add _organization_max_budget_check() function following team budget pattern
- Call org budget check after team budget check in common_checks()
- Add "organization_budget" to budget_alerts type literals
- Update tests to verify org budget is enforced

Budget hierarchy is now properly enforced:
  Organization Budget (hard ceiling)
    └─ Team Budget (sub-allocation)
        └─ Team Member Budget (per-user within team)
            └─ Key Budget (per-key)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: add organization_id to budget alerts, fix enum comparison and linting of newly added code

- Add organization_id field to CallInfo class for better alert context
- Include organization_id in budget alerts (token, soft, team, org)
- Fix event_group enum comparison (was comparing enum to string)
- Add OrganizationBudgetAlert class for organization budget alerting
- Add organization_budget to test parameterizations
- Apply Black formatting to slack_alerting.py

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-02 22:46:03 -08:00
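
The check ordering described in the commit above (team budget first, then organization budget) might look roughly like the following. All names here (`BudgetLevel`, `BudgetExceededError`, `check_budgets`, `first_exceeded`) are hypothetical, and the `spend >= max_budget` comparison is an assumption; the actual logic lives in `auth_checks.py` and may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


class BudgetExceededError(Exception):
    """Raised when a level's spend has reached its budget (hypothetical type)."""

    def __init__(self, level: str, spend: float, max_budget: float):
        super().__init__(
            f"{level} budget exceeded: spend={spend} max_budget={max_budget}"
        )
        self.level = level


@dataclass
class BudgetLevel:
    name: str
    spend: float
    max_budget: Optional[float] = None  # None means no limit is configured


def check_budgets(levels: Sequence[BudgetLevel]) -> None:
    """Check each level in the given order and raise on the first one exceeded."""
    for level in levels:
        # Assumption: reaching the budget exactly counts as exceeded.
        if level.max_budget is not None and level.spend >= level.max_budget:
            raise BudgetExceededError(level.name, level.spend, level.max_budget)


def first_exceeded(levels: Sequence[BudgetLevel]) -> Optional[str]:
    """Return the name of the first exceeded level, or None if all pass."""
    try:
        check_budgets(levels)
    except BudgetExceededError as exc:
        return exc.level
    return None
```

For example, a request from a team that is under its own budget is still rejected when the organization's combined spend has reached the organization limit, which is the first scenario listed in the commit's tests.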
Fabian Reinold c173a4a275 Helm Chart: add ingress-only labels (#17348)
* feat(helm): add ingress-only labels

* feat(helm): add ingress configuration tests

* chore(helm): bump chart version
2025-12-02 22:30:54 -08:00
Cesar Garcia 86350fe6d7 docs: add Google ADK and Harbor to projects (#17352)
Both frameworks integrate with LiteLLM:
- Google ADK uses LiteLLM for model-agnostic agent building
- Harbor uses LiteLLM for agent evaluation across providers
2025-12-02 22:27:04 -08:00
Cesar Garcia 4c6604b0da Cleanup: Remove orphan docs pages and Docusaurus template files (#17356)
* docs: update getting started page

- Add Core Functions table with link to full list
- Add Responses API section
- Add Async section with acompletion() example
- Add "Switch Providers with One Line" example
- Clarify Basic Usage supports multiple endpoints
- Update models to current versions (openai/gpt-4o, anthropic/claude-sonnet-4)
- Use provider/model format throughout
- Fix deprecated import: from openai.error -> from openai
- Keep original structure: community key, More details links, observability env vars

* Cleanup: Remove orphan docs pages and Docusaurus template files

- Remove orphan getting_started.md (not linked in sidebar)
- Remove Docusaurus template intro.md
- Remove tutorial-basics/ directory (Docusaurus template)
- Remove tutorial-extras/ directory (Docusaurus template)
2025-12-02 22:25:26 -08:00
Deepak Tammali e289f5e454 feat: make streaming chunk size configurable in bedrock converse and invoke handlers (#17357) 2025-12-02 22:14:48 -08:00
Jonathan Yang 43dd9e4a90 fix: replace deprecated .dict() with .model_dump() in streaming_handler (#17359)
Replace Pydantic v1 `.dict()` method with v2 `.model_dump()` to fix
PydanticDeprecatedSince20 warnings. The `.dict()` method is deprecated
in Pydantic v2 and will be removed in v3.

Fixes #5987
2025-12-02 22:12:55 -08:00
Jonathan Yang 17faea96bb fix: conditionally pass enable_cleanup_closed to aiohttp TCPConnector (#17367)
* fix: conditionally pass enable_cleanup_closed to aiohttp TCPConnector

Fixes deprecation warning on Python 3.12.7+ and 3.13.1+ where
enable_cleanup_closed is no longer needed since the underlying
CPython SSL connection leak bug was fixed.

See: https://github.com/python/cpython/pull/118960

* chore: add aiohttp source reference to AIOHTTP_NEEDS_CLEANUP_CLOSED
2025-12-02 22:09:57 -08:00
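
The version gate described in the commit above can be sketched as a small predicate. The function names `needs_cleanup_closed` and `cleanup_closed_kwargs` are hypothetical. The version ranges follow the commit message (fixed in Python 3.12.7+ and 3.13.1+), which leaves 3.13.0 as the one 3.13 release still affected.

```python
import sys


def needs_cleanup_closed(version_info=None):
    """Return True when the interpreter still has the SSL connection leak.

    Fixed in CPython 3.12.7+ and 3.13.1+ per the commit message, so the flag
    is only useful on older 3.x releases and on 3.13.0.
    """
    version = tuple((version_info or sys.version_info)[:3])
    return version < (3, 12, 7) or (3, 13, 0) <= version < (3, 13, 1)


def cleanup_closed_kwargs(version_info=None):
    """Build the optional keyword argument for aiohttp.TCPConnector."""
    if needs_cleanup_closed(version_info):
        return {"enable_cleanup_closed": True}
    return {}  # omit the argument entirely to avoid the deprecation warning


# Usage, assuming aiohttp is installed:
#   connector = aiohttp.TCPConnector(**cleanup_closed_kwargs())
```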
Mariano Hielpos 566adebdec update model_prices_and_context_window.json (#17376)
* update model_prices_and_context_window.json

* update

* update
2025-12-02 22:06:51 -08:00
Ali Saleh 6b5ad5d5a6 docs: Update Instructions For Phoenix Integration (#17373) 2025-12-02 22:03:54 -08:00