litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 00:48:01 +00:00

Author	SHA1	Message	Date
Sameer Kankute	3fdd67ff23	Delete docs/my-website/blog/debug_cost_discrepancy/index.md	2026-04-15 21:35:05 +05:30
Sameer Kankute	639135e365	Update docs/my-website/blog/debug_cost_discrepancy/index.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-13 11:33:24 +05:30
Sameer Kankute	5e830e0d55	docs(troubleshoot): add cost discrepancy debugging guide - New troubleshoot page and blog post with step-by-step comparison workflow - Screenshots under static/img/cost-discrepancy-debug - Link from spend tracking; sidebar entry under Troubleshooting - Flowchart SVG: Path B connectors below box; clarify LiteLLM schedules customer calls when stuck Made-with: Cursor	2026-04-13 11:27:16 +05:30
Sameer Kankute	fa605d85c0	Merge pull request #25616 from BerriAI/main merge main	2026-04-13 08:43:43 +05:30
ishaan-berri	fdd7500904	blog: add back arrow to blog post pages (#25587) * blog: add back arrow to post pages * blog: style back arrow — fixed top-left below navbar	2026-04-11 19:15:45 -07:00
ishaan-berri	1edf41c26f	Merge pull request #25585 from BerriAI/litellm_dev_04_11_2026_p1 Litellm dev 04 11 2026 p1	2026-04-11 18:46:57 -07:00
Ishaan Jaffer	35f4b47ff8	apply content guidelines: scale/resilience narrative, FAQ, Key Takeaways, Conclusion CTA	2026-04-11 18:12:32 -07:00
Ishaan Jaffer	14eed24471	add redis circuit breaker blog post with React diagrams	2026-04-11 18:02:59 -07:00
Ishaan Jaffer	8e616ecdf4	add BlogPostPage swizzle: hide sidebar, add hiring CTA on every post	2026-04-11 18:02:56 -07:00
Ishaan Jaffer	dac44fb443	blog list styles: clean typography, marquee animation, hero layout	2026-04-11 18:02:52 -07:00
Ishaan Jaffer	85cb7db8b9	blog list page: Ramp-style flat list with hero, provider marquee, hiring CTA	2026-04-11 18:02:48 -07:00
Ishaan Jaffer	05d516482f	restyle blog list page to match engineering blog aesthetic	2026-04-11 18:02:44 -07:00
Krrish Dholakia	e08e3bf748	docs: clarify how to get benchmarking script	2026-04-11 17:31:03 -07:00
Krrish Dholakia	12bca649fc	docs: refactor benchmarking docs to be clearer	2026-04-11 17:30:09 -07:00
Yuneng Jiang	909247785e	Merge remote-tracking branch 'origin' into litellm_internal_staging_04_11_2026	2026-04-11 15:41:03 -07:00
Sameer Kankute	c13be44e44	feat(guardrails): optional skip system message in unified guardrail inputs (#25481 ) * feat(guardrails): optional skip system message in unified guardrail inputs Made-with: Cursor * feat(dashboard): skip_system_message_in_guardrail in guardrail UI Add a tri-state control (inherit / yes / no) when creating or editing guardrails so admins can set litellm_params.skip_system_message_in_guardrail without YAML. Table edit merges existing litellm_params before PUT to avoid wiping content-filter and other provider fields. Document the dashboard flow in the guardrails quick start with a screenshot. Made-with: Cursor * fix(guardrails): type structured_messages as AllMessageValues for mypy Use AllMessageValues in openai_messages_without_system and cast adapter request messages so GenericGuardrailAPIInputs matches TypedDict. Made-with: Cursor	2026-04-11 08:53:24 -07:00
ishaan-berri	831083b565	Merge pull request #25525 from BerriAI/feat/anthropic-advisor-tool feat(anthropic): support advisor_20260301 tool type	2026-04-10 16:39:34 -07:00
Krrish Dholakia	4e12d3c562	docs: document april townhall announcements (#25537) * docs: document april townhall announcements * docs: cleanup blog post	2026-04-10 16:12:06 -07:00
Ishaan Jaffer	d6e2a74c0f	docs: move advisor tool doc to completion/ guides section in sidebar	2026-04-10 15:08:25 -07:00
Ishaan Jaffer	ed973c049f	docs: add Advisor Tool documentation page	2026-04-10 13:15:54 -07:00
Yuneng Jiang	ce0b57b4ff	[Docs] Add missing MCP per-user token env vars to config_settings MCP_PER_USER_TOKEN_DEFAULT_TTL and MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS were added in #25441 but not documented, causing test_env_keys.py to fail.	2026-04-09 21:04:34 -07:00
Krrish Dholakia	3a6db708ce	docs: add Docker Image Security Guide for cosign verification and deployment best practices (#25439) - New doc page covering all signed image variants, verification commands, CI/CD enforcement (K8s Sigstore Policy Controller, GCP Binary Authorization, AWS/EKS, GitHub Actions), digest pinning, and safe upgrade patterns - Added to sidebar under Setup & Deployment - Cross-linked from the existing deploy.md cosign section Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-04-09 11:50:15 -07:00
Abhijoy Sarkar	c688d9d6bc	Add PromptGuard guardrail integration (#24268 ) * Add PromptGuard guardrail integration Add PromptGuard as a first-class guardrail vendor in LiteLLM's proxy, supporting prompt injection detection, PII redaction, topic filtering, entity blocklists, and hallucination detection via PromptGuard's /api/v1/guard API endpoint. Backend: - Add PROMPTGUARD to SupportedGuardrailIntegrations enum - Implement PromptGuardGuardrail (CustomGuardrail subclass) with apply_guardrail handling allow/block/redact decisions - Add Pydantic config model with api_key, api_base, ui_friendly_name - Auto-discovered via guardrail_hooks/promptguard/__init__.py registries Frontend: - Add PromptGuard partner card to Guardrail Garden with eval scores - Add preset configuration for quick setup - Add logo to guardrailLogoMap Tests: - 30 unit tests covering configuration, allow/block/redact actions, request payload construction, error handling, config model, and registry wiring * Fix redact path and init ordering per review feedback - P1: Update structured_messages (not just texts) when PromptGuard returns a redact decision, so PII redaction is effective for the primary LLM message path - P2: Validate credentials before allocating the HTTPX client so resources aren't acquired if PromptGuardMissingCredentials is raised - Add tests for structured_messages redaction and texts-only redaction * Harden PromptGuard integration: fail-open, event hooks, images, docs - Add block_on_error config (default fail-closed, configurable fail-open) - Declare supported_event_hooks (pre_call, post_call) like other vendors - Forward images from GenericGuardrailAPIInputs to PromptGuard API - Wrap API call in try/except for resilient error handling - Add comprehensive documentation page with config examples - Register docs page in sidebar alongside other guardrail providers - Expand test suite from 32 to 40 tests covering new functionality * Fix dict[str, Any] -> Dict[str, Any] for Python 3.8 compat 
* Address remaining Greptile feedback: timeout, redact guard - Add explicit 10s timeout to async_handler.post() to prevent indefinite hangs when PromptGuard API is unresponsive - Guard redact path: only update inputs["texts"] when the key was originally present, avoiding phantom key injection - Add test: redact with structured_messages only does not create texts key (41 tests total) * Fix CI lint: black formatting, add PromptGuardConfigModel to LitellmParams - Reformat promptguard.py to match CI black version (parenthesization) - Add PromptGuardConfigModel as base class of LitellmParams for proper Pydantic schema validation, consistent with all other guardrail vendors - Use litellm_params.block_on_error directly (now a typed field) * Address Greptile review: redact path, null decision, error context - P1: Filter _extract_texts_from_messages to user-role messages only, preventing system/assistant content from being injected into texts - P1: Strengthen test_redact_updates_structured_messages assertion from weak `in` check to strict equality, catching the injection bug - P2: Use `result.get("decision") or "allow"` to handle explicit null decision values (not just absent keys) - P2: Wrap bare exception re-raise in GuardrailRaisedException so the caller knows which guardrail failed (block_on_error=True path) - P2: Add static Promptguard entry in guardrail_provider_map so the preset works before populateGuardrailProviderMap is called - Add test for explicit null decision treated as allow * Fix black formatting: collapse f-string in error message	2026-04-09 08:12:24 -07:00
michelligabriele	cd9c511df6	feat(proxy): add credential overrides per team/project via model_config metadata (#24438)	2026-04-09 07:22:27 -07:00
Krrish Dholakia	f42ffed2bd	Litellm oss staging 04 02 2026 p1 (#25055 ) * fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700) The WIF credential dispatch in load_auth() only handled identity_pool and aws credential types. When credential_source.executable was present (used for Azure Managed Identity via Workload Identity Federation), it fell through to identity_pool.Credentials which rejected it with MalformedError. Add dispatch to google.auth.pluggable.Credentials for executable-type credential sources, following the same pattern as the existing identity_pool and aws helpers. Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF with executable credential sources. * feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447) * feat(logging): add component and logger fields to JSON logs for 3rd party filtering * Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions * Feat - Add organization into the metrics metadata for org_id & org_alias (#24440) * Add org_id and org_alias label names to Prometheus metric definitions * Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata * Populate user_api_key_org_alias in pre-call metadata * Pass org_id and org_alias into per-request Prometheus metric labels * Add test for org labels on per-request Prometheus metrics * chore: resolve test mockdata * Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata * Add org labels to failure path and verify flag behavior in test * Fix test: build flag-off enum_values without org fields * Gate org labels behind feature flag in get_labels() instead of static metric lists * Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown * Use explicit metric allowlist for org label injection instead of team heuristic * Fix duplicate org label guard, move _org_label_metrics to class 
constant * Reset custom_prometheus_metadata_labels after duplicate label assertion * fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths * fix: emit org labels by default, no opt-in flag required * fix: write org_alias to metadata unconditionally in proxy_server.py * fix: 429s from batch creation being converted to 500 (#24703) * add us gov models (#24660) * add us gov models * added max tokens * Litellm dev 04 02 2026 p1 (#25052) * fix: replace hardcoded url * fix: Anthropic web search cost not tracked for Chat Completions The ModelResponse branch in response_object_includes_web_search_call() only checked url_citation annotations and prompt_tokens_details, missing Anthropic's server_tool_use.web_search_requests field. This caused _handle_web_search_cost() to never fire for Anthropic Claude models. Also routes vertex_ai/claude-* models to the Anthropic cost calculator instead of the Gemini one, since Claude on Vertex uses the same server_tool_use billing structure as the direct Anthropic API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071) When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for Anthropic because the handler did not pass logging_obj to client.post(), so track_llm_api_timing could not set llm_api_duration_ms. Pass logging_obj=logging_obj at all four post() call sites (make_call, make_sync_call, acompletion, completion). Add test to ensure make_call passes logging_obj to client.post. 
Made-with: Cursor * sap - add additional parameters for grounding - additional parameter for grounding added for the sap provider * sap - fix models * (sap) add filtering, masking, translation SAP GEN AI Hub modules * (sap) add tests and docs for new SAP modules * (sap) add support of multiple modules config * (sap) code refactoring * (sap) rename file * test(): add safeguard tests * (sap) update tests * (sap) update docs, solve merge conflict in transformation.py * (sap) linter fix * (sap) Align embedding request transformation with current API * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) mock commit * (sap) run black formater * (sap) add literals to models, add negative tests, fix test for tool transformation * (sap) fix formating * (sap) fix models * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) commit for rerun bot review * (sap) minor improve * (sap) fix after bot review * (sap) lint fix * docs(sap): update documentation * fix(sap): change creds priority * fix(sap): change creds priority * fix(sap): fix sap creds unit test * fix(sap): linter fix * fix(sap): linter fix * linter fix * (sap) update logic of fetching creds, add additional tests * (sap) clean up code * (sap) fix after review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) add a 
possibility to put the service key by both variants * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) update test * (sap) update service key resolve function * (sap) run black formater * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) lint fix * (sap) lint fix * feat: support service_tier in gemini * chore: add a service_tier field mapping from openai to gemini * fix: use x-gemini-service-tier header in response * docs: add service_tier to gemini docs * chore: add defaut/standard mapping, and some tests * chore: tidying up some case insensitivity * chore: remove unnecessary guard * fix: remove redundant test file * fix: handle 'auto' case-insensitively * fix: return service_tier on final steamed chunk * chore: black * feat: enable supports_service_tier to gemini models * Fix get_standard_logging_metadata tests * Fix test_get_model_info_bedrock_models * Fix test_get_model_info_bedrock_models * Fix remaining tests * Fix mypy issues * Fix tests * Fix merge conflicts * Fix code qa * Fix code qa * Fix code qa * Fix greptile review --------- Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com> Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com> Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com> Co-authored-by: Lin Xu <lin.xu03@sap.com> Co-authored-by: Mark McDonald <macd@google.com> Co-authored-by: Sameer Kankute <sameer@berri.ai>	2026-04-08 21:37:10 -07:00
Kedar Thakkar	233870d7b2	Add Ramp as a built-in generic API callback with docs (#23769)	2026-04-08 20:06:48 -07:00
Sameer Kankute	65829f79d7	docs: document LITELLM_MCP_STDIO_EXTRA_COMMANDS in env reference Required by tests/documentation_tests/test_env_keys.py for os.getenv usage in constants. Made-with: Cursor	2026-04-08 21:31:51 +05:30
yuneng-jiang	096893ea97	Merge pull request #25273 from BerriAI/litellm_pin_cosign_pub_to_commit [Infra] Pin cosign.pub verification to initial commit hash	2026-04-07 15:40:46 -07:00
milan-berri	bf8b615b64	fix(auth): support selective jwt override oauth2 routing (#25252) Allow JWT tokens matching routing_overrides to use OAuth2 introspection without enabling global OAuth2 while keeping OAuth2 routing limited to LLM/info routes. Add regression coverage for management-route boundary and tighten opaque-token assertions; update docs to reflect selective-mode route scope. Made-with: Cursor	2026-04-07 13:52:47 -07:00
Yuneng Jiang	ce75fde727	Merge remote main into litellm_pin_cosign_pub_to_commit	2026-04-07 10:27:00 -07:00
Yuneng Jiang	30565581be	[Infra] Pin cosign.pub verification to initial commit hash Pin all cosign public key references to the immutable commit hash (`0112e53`) that first introduced the key, instead of fetching it from the release tag. This addresses the concern that an attacker with push access could replace the key on main/tags and re-sign tampered images. Docs now show two verification methods: commit hash (recommended) and release tag (convenience), with explanation of why the hash is stronger. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 22:53:23 -07:00
ishaan-berri	7a9a9f0c79	fix: batch-limit stale managed object cleanup to prevent 300K row UPD… (#25258 ) * fix: batch-limit stale managed object cleanup to prevent 300K row UPDATE (#25257) * Add STALE_OBJECT_CLEANUP_BATCH_SIZE constant Configurable batch limit (default 1000) for stale managed object cleanup, preventing unbounded UPDATE queries from hitting 300K+ rows at once. * Batch-limit stale managed object cleanup with single bounded SQL query Two fixes to _cleanup_stale_managed_objects: 1. Replace unbounded update_many with a single execute_raw using a subquery LIMIT, capping each poll cycle to STALE_OBJECT_CLEANUP_BATCH_SIZE rows. Zero rows loaded into Python memory — everything stays in Postgres. Uses the same PostgreSQL raw-SQL pattern as spend_log_cleanup.py (the proxy requires PostgreSQL per schema.prisma). 2. Extract _expire_stale_rows as a separate method for testability. Keeps the file_purpose='response' filter to avoid incorrectly expiring long-running batch or fine-tune jobs that legitimately exceed the staleness cutoff. * docs: add STALE_OBJECT_CLEANUP_BATCH_SIZE to env vars reference * test: remove deprecated embed-english-v2.0 cohere embedding tests	2026-04-06 19:11:55 -07:00
yuneng-jiang	39c1042258	[Docs] Add cosign Docker image verification steps to security blog posts (#25122 ) * docs(blog): add cosign Docker image verification instructions Add steps for verifying Docker images with cosign to three security blog posts: CI/CD v2, Security Townhall, and Security Update. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(proxy): add cosign verification to Docker/Helm/Terraform deploy page Add image signature verification steps to the main deployment doc so users pulling Docker images know how to verify them with cosign. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: fixes * Update index.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * [Docs] Scope cosign signing docs to GHCR and specify starting version Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [Docs] Add starting version callout to ci_cd_v2 blog post Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Krrish Dholakia <krrish+github@berri.ai> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-06 09:59:27 -07:00
ishaan-berri	c5686b9726	[Nit] Small docs fix, fixing img + folder name (#25171) * fix toolsets img * docs fix	2026-04-04 18:14:32 -07:00
ishaan-berri	9088b46b90	Litellm docs 1 83 3 (#25166) * doc fix * docs fix * docs fix * doc fix * docs * docs fix	2026-04-04 17:54:47 -07:00
ishaan-berri	693ad49719	Litellm ishaan march23 - MCP Toolsets + GCP Caching fix (#25146 ) (#25155 ) * Litellm ishaan march23 - MCP Toolsets + GCP Caching fix (#25146) * feat(mcp): MCP Toolsets — curated tool subsets from one or more MCP servers (#24335) * feat(mcp): add LiteLLM_MCPToolsetTable and mcp_toolsets to ObjectPermissionTable * feat(mcp): add prisma migration for MCPToolset table * feat(mcp): add MCPToolset Python types * feat(mcp): add toolset_db.py with CRUD helpers for MCPToolset * feat(mcp): add toolset CRUD endpoints to mcp_management_endpoints * fix(mcp): skip allow_all_keys servers when explicit mcp_servers permission is set (toolset scope fix) * feat(mcp): add _apply_toolset_scope and toolset route handling in server.py * fix(mcp): resolve toolset names in responses API before fetching tools * feat(mcp): add mcp_toolsets field to LiteLLM_ObjectPermissionTable type * feat(mcp): register LiteLLM_MCPToolsetTable in prisma client initialization * feat(mcp): validate mcp_toolsets in key-vs-team permission check * feat(mcp): register toolset routes in proxy_server.py * feat(mcp): add MCPToolset and MCPToolsetTool TypeScript types * feat(mcp): add fetchMCPToolsets, createMCPToolset, updateMCPToolset, deleteMCPToolset API functions * feat(mcp): add useMCPToolsets React Query hook * feat(mcp): add toolsets (purple) as third option type in MCPServerSelector * feat(mcp): extract toolsets from combined MCP field in key form * feat(mcp): extract toolsets from combined MCP field in team form * feat(mcp): show toolsets section in MCPServerPermissions read view * feat(mcp): pass mcp_toolsets through object_permissions_view * feat(mcp): add MCPToolsetsTab component for creating and managing toolsets * feat(mcp): add Toolsets tab to mcp_servers.tsx * feat(mcp): pass mcpToolsets to playground chat and responses API calls * feat(mcp): generate correct server_url for toolsets in playground API calls * docs(mcp): add MCP Toolsets documentation * docs(mcp): add 
mcp_toolsets to sidebar * fix(mcp): replace x-mcp-toolset-id header with ContextVar to prevent client forgery * fix(mcp): use ContextVar + StreamingResponse for toolset MCP routes (fixes SSE streaming) * fix(mcp): cache toolset permission lookups to avoid per-request DB calls * test(mcp): add tests for toolset scope enforcement, ContextVar isolation, and access control * fix(mcp): cache toolset name lookups in MCPServerManager to avoid per-request DB calls * fix(mcp): prevent body_iter deadlock + use cached toolset lookup in responses API - _stream_mcp_asgi_response: add done callback to handler_task that puts the EOF sentinel on body_queue when the task exits, preventing body_iter from hanging forever if the handler raises after headers are sent. - litellm_proxy_mcp_handler: replace raw get_mcp_toolset_by_name() DB call with global_mcp_server_manager.get_toolset_by_name_cached() so toolset resolution uses the 60s TTL cache added for this purpose instead of hitting the DB on every responses-API request. * fix(mcp): toolset access control, asyncio fix, and real unit tests - server.py: _apply_toolset_scope now enforces that non-admin keys must have the requested toolset_id in their mcp_toolsets grant list; admin keys always bypass the check. - mcp_management_endpoints.py: three access-control fixes: * fetch_mcp_toolsets: non-admin keys with mcp_toolsets=None now return [] instead of all toolsets (only admins get 'all' when the field is absent) * fetch_mcp_toolset: non-admin keys that haven't been granted the requested toolset_id now get 403 instead of the full result * add_mcp_toolset: duplicate toolset_name now returns 409 Conflict instead of an opaque 500 - proxy_server.py: use asyncio.get_running_loop() instead of get_event_loop() inside an already-running coroutine (Python 3.10+). 
- test_mcp_toolset_scope.py: replace four hollow tests that only asserted local variable properties with real tests that call the production fetch_mcp_toolsets() and handle_streamable_http_mcp() functions with mocked dependencies. * fix(mcp): add mcp_toolsets to ObjectPermissionBase, fix multi-toolset overwrite, fix delete 404, allow standalone key toolsets * fix(mcp): add auth check on toolset resolution in responses API; union mcp_servers in _merge_toolset_permissions * fix(mcp): handle RecordNotFoundError in update_mcp_toolset; union direct servers with toolset servers * fix(mcp): use _user_has_admin_view; deny None mcp_toolsets for non-admin; use direct RecordNotFoundError import; fix docstring * fix(mcp): add @default(now()) to MCPToolsetTable.updated_at; fix test for non-admin toolset access * fix: use UniqueViolationError import; guard _ensure_eof for error/cancel only * fix(mcp): preserve mcp_access_groups in toolset scope, use shared Redis cache for toolset perms - Remove mcp_access_groups=[] from _apply_toolset_scope (server.py) and the responses API toolset path (litellm_proxy_mcp_handler.py). A key's access-group grants remain valid even when the request is scoped to a single toolset; clearing them silently revoked legitimate entitlements. - Switch resolve_toolset_tool_permissions and get_toolset_by_name_cached to use user_api_key_cache (Redis-backed DualCache in production) instead of per-instance in-memory dicts. Cache entries are now shared across workers, eliminating the per-worker stale-toolset-permission window flagged as a P1 by Greptile. - Use union merge (set union of tool names per server) when applying toolset permissions in the responses API path so direct-server tool restrictions are not overwritten by toolset permissions. 
* fix(mcp): return 404 when edit_mcp_toolset target does not exist * fix(mcp): align mcp_toolsets default to None in LiteLLM_ObjectPermissionTable * fix(mcp): admin toolset visibility, in-place tool name mutation, test helper coercion * fix(mcp): treat None/[] team mcp_toolsets as no restriction in key validation * fix(mcp): allow_all_keys backward compat, blocked_tools API write-path, efficient startup query * fix(mcp): use _mcp_active_toolset_id ContextVar to detect toolset scope, avoiding DB-default false-positive * fix(mcp): remove dead toolset cache stubs, log invalidation failures, align schema updated_at defaults * fix(mcp): deserialise MCPToolset from Redis cache hit, replace fastapi import in test * fix(mcp): evict name-cache on toolset mutation, 409 on rename conflict, warning-level list errors * fix(redis): regenerate GCP IAM token per connection for async cluster (#24426) * fix(redis): regenerate GCP IAM token per connection for async cluster clients Async RedisCluster was generating the IAM token once at startup and storing it as a static password. After the 1-hour GCP token TTL, any new connection (including to newly-discovered cluster nodes) would fail to authenticate. Fix: introduce GCPIAMCredentialProvider that implements redis-py's CredentialProvider protocol. It calls _generate_gcp_iam_access_token() on every new connection, matching what the sync redis_connect_func already does. async_redis.RedisCluster accepts a credential_provider kwarg which is invoked per-connection. * refactor(redis): move GCPIAMCredentialProvider to its own file Extract GCPIAMCredentialProvider and _generate_gcp_iam_access_token into litellm/_redis_credential_provider.py. _redis.py imports them from there, keeping the public API unchanged. 
* fix: address Greptile review issues - GCPIAMCredentialProvider now inherits from redis.credentials.CredentialProvider so redis-py's async path calls get_credentials_async() properly - move _redis_credential_provider import to top of _redis.py (PEP 8) - remove dead else-branch that silently no-oped (gcp_service_account from redis_kwargs.get() was always None since it's popped by _get_redis_client_logic) - remove mid-function 'from litellm import get_secret_str' inline import - remove unused 'call' import from test_redis.py * chore: retrigger CI/review * chore: sync schema.prisma copies from root * chore: sync schema.prisma copies from root * fix(proxy_server): use bounded asyncio.Queue with maxsize to prevent unbounded growth * fix(a2a/pydantic_ai): make api_base Optional to match base class signature * fix(a2a/pydantic_ai): make api_base Optional in handler and guard against None * fix(mcp): remove unused get_all_mcp_servers import * fix(mcp): remove unused MCPToolset import * refactor(mcp): extract toolset permission logic to reduce statement count below PLR0915 limit * fix(tests): update reload_servers_from_database tests to mock prisma directly --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(toolset_db): lazy-import prisma to avoid ImportError when prisma not installed * fix(tests): update UI tests for toolset tab and updated empty state text * fix(tests): add get_mcp_server_by_name to fake_manager stub --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2026-04-04 16:23:21 -07:00
ishaan-berri	51876292a0	Litellm ishaan april4 2 (#25150 ) * feat(router): integrate allowed_fails_policy into health check failures (#24988) * feat(router): integrate allowed_fails_policy into health check failures Health check failures now increment the same per-deployment failure counters used by allowed_fails_policy, so users can control how many health check failures of each error type are required before a deployment enters cooldown. - ahealth_check() preserves the original exception in its return dict - run_with_timeout() returns a litellm.Timeout on health check timeout - _perform_health_check() propagates exceptions to unhealthy endpoints - _write_health_state_to_router_cache() calls _set_cooldown_deployments for each unhealthy endpoint that has an exception - When allowed_fails_policy is set, the binary health check filter is bypassed so cooldown is the sole routing exclusion mechanism - Safety net: if all deployments are in cooldown with enable_health_check_routing=True, the cooldown filter is bypassed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(router): add health_check_ignore_transient_errors flag When enabled, health check failures with 429 (rate limit) or 408 (timeout) status codes are skipped from the cooldown pipeline. These are transient load issues, not broken deployments. Auth errors (401), 404, and 5xx errors still increment counters and trigger cooldown as before. Config (general_settings): health_check_ignore_transient_errors: true Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(router): also exclude 429/408 from health state cache when ignore_transient_errors set The previous fix only skipped cooldown counter increments. The health state cache was still marking 429/408 endpoints as is_healthy=False, causing the binary health check filter to exclude them from routing. 
Now, when health_check_ignore_transient_errors=True, 429/408 endpoints are also excluded from the unhealthy list passed to build_deployment_health_states(), so the binary filter treats them as unaffected (not unhealthy). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(router): add health check driven routing guide New standalone page covering the full health check routing feature: allowed_fails_policy integration, health_check_ignore_transient_errors, architecture SVG, step-by-step setup, and gotchas (TTL, AllowedFails semantics). Replaces the inline section in health.md with a link to the new page. Added to the Routing & Load Balancing sidebar. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(health-check-routing): fix three CI failures - Add "exception" to ILLEGAL_DISPLAY_PARAMS in health_check.py so the exception object is stripped before the health endpoint serializes results to JSON (fixes TypeError: 'URL' object is not iterable) - Add allowed_fails_policy = None to FakeRouter stubs in test_router_health_check_routing.py (fixes AttributeError) - Add health_check_ignore_transient_errors to config_settings.md router settings reference table (fixes documentation test) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix litellm/tests/proxy_unit_tests/test_proxy_server.py * fix(router): address greptile review comments - Narrow cooldown safety-net bypass: only fires when allowed_fails_policy is set (cooldown is health-check driven). Without a policy, cooldowns are from real request failures and must not be bypassed. - Restore cooldown deployments DEBUG log that was accidentally removed. - Fix test_health TypeError: move exception extraction to a separate exceptions_by_model_id dict returned alongside endpoints, so exception objects never appear in the endpoint dicts that get JSON-serialized by the /health response. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(health-check-routing): properly isolate exceptions from health response Return exceptions_by_model_id as a separate third value from _perform_health_check / perform_health_check so exception objects (which contain non-JSON-serializable httpx URL types) never appear in the endpoint dicts that get serialized by the /health response. Callers updated: _health_endpoints.py, shared_health_check_manager.py, proxy_server.py background loop. All use the exceptions dict only for cooldown integration, not for display. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(shared-health-check): fix remaining 2-value return sites and update type annotation * fix(health-check-routing): fix P0 cooldown integration never firing The cooldown loop was reading endpoint.get("exception") which is always None because exceptions are now returned via exceptions_by_model_id, not stored in endpoint dicts. Fixed to use _exceptions.get(model_id). Also fixes the transient-error filter to use _exceptions instead of endpoint.get("exception"), and fixes all remaining 2-value return sites in shared_health_check_manager.py. Tests updated to pass exceptions via exceptions_by_model_id parameter instead of endpoint dicts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(health-check-routing): fix P1 transient-error filter broken on cache hits When SharedHealthCheckManager returns cached results, exceptions_by_model_id is always {} so the transient-error filter defaulted to status 500 for all endpoints, incorrectly marking 429/408 endpoints as unhealthy. Fix: store integer exception_status on each unhealthy endpoint dict in _perform_health_check. _get_endpoint_exception_status() uses the live exception object when available (direct path) and falls back to the stored integer (cache-hit path). The integer is JSON-serializable and survives the shared cache round-trip. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(health-check-routing): gate cooldown loop behind allowed_fails_policy Without the policy, cooldown is not the routing exclusion mechanism. Firing _set_cooldown_deployments for all enable_health_check_routing users was a backwards-incompatible change — 401s would immediately cooldown deployments that the binary filter would have recovered on the next cycle. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * revert: undo allowed_fails_policy gate on cooldown loop Cooldown integration via health checks is intentional for all enable_health_check_routing users, not just those with allowed_fails_policy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(docs+tests): fix health_check_ignore_transient_errors doc section and test coverage - Move health_check_ignore_transient_errors from router_settings to general_settings in config_settings.md (code reads it from general_settings) - Remove duplicate enable_health_check_routing / health_check_staleness_threshold entries that were incorrectly listed under router_settings - Replace TestHealthCheckEndpointExceptionPropagation tests with ones that exercise the real _perform_health_check code path via mocked ahealth_check, verifying exceptions appear in exceptions_by_model_id and NOT in endpoint dicts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(tests+docs): fix tuple unpacking and docs test failures - Update test mocks that return (healthy, unhealthy) to return (healthy, unhealthy, {}) to match the new 3-value signature - Update test unpackings of perform_shared_health_check to use healthy, unhealthy, _ = ... 
- Add health_check_ignore_transient_errors to router_settings section in config_settings.md (it is a Router constructor param, so the doc test requires it there; it also lives in general_settings for proxy use) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix CodeQL errors * fix(tests): fix 2-value unpackings of _perform_health_check in test_health_check.py * fix(tests): fix mock _perform_health_check returning 2-tuple instead of 3 * fix team routing --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add distributed lock for key rotation job (#23364) * fix: add distributed lock for key rotation job * fix: address Greptile review feedback on key rotation lock (#23834) * fix: address Greptile review feedback on key rotation lock * fix req changes greptile * feat(proxy): Optional on_error for guardrail pipeline (API / technical failures) (#24831) * guardrails fallback * docs * docs: add LITELLM_KEY_ROTATION_LOCK_TTL_SECONDS to environment variables reference * fix(mypy): accept Union[Dict, Any] in _get_deployment_order and use typed list to fix min() type error * fix(mypy): use Optional[str] for api_base in PydanticAI provider to match superclass signature --------- Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com> Co-authored-by: Shivam Rawat <shivam@berri.ai> Co-authored-by: yuneng-jiang <yuneng@berri.ai>	2026-04-04 23:09:42 +00:00
ishaan-berri	b53cfe729a	Litellm ishaan march30 (#24887 ) (#25151 ) * fix(pricing): add unversioned vertex_ai/claude-haiku-4-5 entry Missing unversioned entry causes cost tracking to return $0.00 for all requests using vertex_ai/claude-haiku-4-5. All other Vertex AI Claude models have both versioned and unversioned entries. * fix(router): skip misleading tags error when no candidates (e.g. cooldown) Return early from get_deployments_for_tag when healthy_deployments is empty so tag-based routing does not raise no_deployments_with_tag_routing after cooldown filters all deployments. Adds regression test. Made-with: Cursor * feat(oci): add embedding support and update model catalog - Add OCIEmbeddingConfig for OCI GenAI embedding models - Add 16 new chat models (Cohere, Meta Llama, xAI Grok, Google Gemini) - Add 8 embedding models (Cohere embed v3.0, v4.0) - Update documentation with embedding examples - Update pricing for all new models * test(oci): add unit tests for OCI embedding support - 17 unit tests covering OCIEmbeddingConfig - Tests for URL generation, param mapping, request/response transform - Tests for model pricing JSON completeness * style(oci): format with black and ruff * fix(oci): correct embedding request body format OCI embedText API expects inputs, truncate, and inputType at the top level of the request body, not nested under embedTextDetails. Fixed transformation and updated tests accordingly. Verified with real OCI API: 3/3 embedding models working. 
* docs: clarify tag routing early return and test intent Made-with: Cursor * fix(oci): address code review findings from Greptile - P1: Fix signing URL mismatch with custom api_base by accepting api_base parameter in transform_embedding_request - P2: Remove encoding_format from supported params (OCI does not support it, was silently dropped) - P2: Raise ValueError for token-array inputs instead of silently converting to string representation - Add test for token-list rejection * fix(mcp): add STS AssumeRole support for MCP SigV4 authentication MCPSigV4Auth only supported static AWS credentials or the boto3 default credential chain. Production Kubernetes environments typically authenticate via IAM role assumption (sts:AssumeRole), which was not possible. Add aws_role_name and aws_session_name parameters to the MCP SigV4 auth stack. When aws_role_name is provided, MCPSigV4Auth calls sts:AssumeRole to obtain temporary credentials before signing requests. Explicit keys, if also provided, are used as the source identity for the STS call; otherwise ambient credentials (pod role, instance profile) are used. * fix: stop logging credential values and add missing redaction patterns Replaces raw credential values in debug/error log messages with boolean presence checks or type names. Adds PEM block, GCP token, JWT, SAS token, and service-account blob patterns to the redaction filter. Fixes private_key pattern to capture full PEM blocks instead of stopping at the first whitespace. Addresses: Vertex AI credential JSON (including RSA private key) being logged to stderr on health check failures. 
* fix: log only field names for UserAPIKeyAuth, not full object * style: apply black formatting to experimental_mcp_client/client.py * style: fix black/isort formatting and mypy error in proxy_server.py - Fix black formatting in experimental_mcp_client/client.py (done in prev commit) - Fix black/isort formatting in key_management_endpoints.py, proxy_server.py, transformation.py - Fix mypy: iterate over optional list safely (access_group_ids or []) in proxy_server.py * fix(test): patch check_migration.verbose_logger directly to fix xdist ordering issue When test_proxy_cli.py tests run before test_check_migration.py in the same xdist worker, litellm.proxy.db.check_migration is already in sys.modules. Patching litellm._logging.verbose_logger has no effect on the already-bound reference. Patch the correct target (check_migration.verbose_logger) and import the module before patching so the order doesn't matter. * fix(mypy): make api_base Optional in PydanticAIProviderConfig to match base class signature --------- Co-authored-by: Ihsan Soydemir <soydemir.ihsan@gmail.com> Co-authored-by: Milan <milan@berri.ai> Co-authored-by: Daniel Gandolfi <danielgandolfi@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: user <70670632+stuxf@users.noreply.github.com> Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>	2026-04-04 14:44:07 -07:00
ryan-crabbe-berri	eb780a85bb	Merge pull request #25032 from BerriAI/litellm_docs-default-team-params docs: document default_team_params in config reference	2026-04-03 16:07:46 -07:00
ishaan-berri	c6aa3ea452	Litellm ishaan april1 try2 (#25110 ) * Litellm ishaan april1 (#25103) * fix(proxy): enforce upperbound key params on key/update and add custom_key_update hook The /key/update endpoint did not enforce upperbound_key_generate_params, allowing users to bypass configured limits (tpm_limit, rpm_limit, max_budget, duration, budget_duration) by updating an existing key instead of generating a new one. Extract the upperbound enforcement logic from _common_key_generation_helper() into a standalone _enforce_upperbound_key_params() function and call it from both the generate and update paths. For updates, None values are skipped (not filled with defaults) since they mean "don't change this field". Also adds a custom_key_update config option and user_custom_key_update global, mirroring the existing custom_key_generate pattern, so custom key validation logic can fire during key updates as well. * fix(proxy): invoke custom_key_update hook in bulk update path The user_custom_key_update hook was only called in update_key_fn (single key update) but not in _process_single_key_update (bulk update path), allowing custom validation to be bypassed via the /key/update/bulk endpoint. Mirror the hook invocation in both paths. * fix(proxy): pass UpdateKeyRequest to hook in bulk path, not BulkUpdateKeyRequestItem Move the custom_key_update hook invocation to after UpdateKeyRequest is constructed so the hook receives the same type in both single and bulk update paths. Previously the bulk path passed BulkUpdateKeyRequestItem (5 fields only), which would cause AttributeError for hooks accessing fields like tpm_limit or models. * fix(bedrock): promote cache usage to message_delta for Claude Code (#24850) Ensure Bedrock/Anthropic-compatible streaming exposes cache usage where Claude Code reads it by promoting message_stop usage onto message_delta and preserving usage fields in fake-streamed message_delta events. 
Made-with: Cursor * fix(search): Support self-hosted Firecrawl response format in search transform (#24866) The `transform_search_response` method only handled Firecrawl Cloud (v2) response format where `data` is a dict with `web`/`news` keys. Self-hosted Firecrawl (v1) returns `data` as a flat list of result objects, causing an `AttributeError: 'list' object has no attribute 'get'`. Detect the response format by checking if `data` is a list (self-hosted) or dict (cloud) and handle both cases. Cloud format: {"data": {"web": [...], "news": [...]}} Self-hosted: {"success": true, "data": [{"url": "...", "title": "...", ...}]} Co-authored-by: Synergy <synergyoclaw@gmail.com> * feat: add environment and user tracking to prompt management (#24855) * feat: add environment and user tracking to prompt management - Add environment (development/staging/production) and created_by columns to LiteLLM_PromptTable - Update unique constraint to [prompt_id, version, environment] - All CRUD endpoints support environment filtering and user tracking - Redesigned prompt detail page with environment tabs and version history - UI: environment filter on list page, environment selector in editor - 8 new tests for environment and user tracking Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Black formatting and add environments to PromptInfoResponse TypeScript type Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address Greptile review findings - P1: delete_prompt scopes in-memory cleanup to environment when provided - P2: dotprompt_content parsed directly regardless of environment flag - P2: use distinct for environments query - P2: fix double-fetch on initial mount in prompt_info.tsx - fix: remove unsupported select kwarg from find_many Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address remaining Greptile review comments - Remove unused useCallback import (index.tsx) - Remove unused ENV_COLORS variable 
(prompt_info.tsx) - P1: in-memory fallback in get_prompt_versions now respects environment filter - P1: reset selectedEnv when promptId changes to avoid stale state - Cyclic imports are pre-existing pattern, not introduced by this PR Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: scope patch_prompt to environment using primary key - Add environment query param to patch_prompt endpoint - Look up target row by composite key (prompt_id + version + environment) - Update by primary key (id) to target exactly one row - Fixes Greptile finding: patch with multiple environments no longer ambiguous Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use actual start_time for failed request spend logs (#24906) async_post_call_failure_hook set both start_time and end_time to datetime.now(), making all failed requests show duration=0. Use the actual start_time from litellm_logging_obj instead, so spend logs reflect the real request duration on timeout and other failures. 
Fixes #24888 * feat(bedrock): add nova canvas image edit support (#24869) * feat(bedrock): add nova canvas image edit support * fix(bedrock): support PathLike inputs for nova image edit * chore: sync schema.prisma copies from root * fix(mypy): correct type-ignore code for delta_usage arg-type * fix(mypy): cast status_code to str, suppress intentional str yield * fix(lint): extract _create_content_block_chunks to fix PLR0915 * fix(lint): extract helpers to fix PLR0915 in prompt endpoints --------- Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: redhelix <amin.lalji@gmail.com> Co-authored-by: Synergy <synergyoclaw@gmail.com> Co-authored-by: Talha Anwar <37379131+talhaanwarch@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: madhu19991 <madhu@thunkai.com> Co-authored-by: Srikanth @adobe <devarakondasrikanth@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(test): update model armor streaming test to handle string or int error code --------- Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: redhelix <amin.lalji@gmail.com> Co-authored-by: Synergy <synergyoclaw@gmail.com> Co-authored-by: Talha Anwar <37379131+talhaanwarch@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: madhu19991 <madhu@thunkai.com> Co-authored-by: Srikanth @adobe <devarakondasrikanth@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2026-04-03 14:57:44 -07:00
ishaan-berri	fc885af994	docs(blog): add security hardening April 2026 post (#25101) (#25102)	2026-04-03 13:06:14 -07:00
yuneng-jiang	3604b600d3	[Infra] Merge internal dev branch with main (#25036 ) * fix(proxy): enforce key-level model allowlist for custom auth custom_auth_run_common_checks only runs common_checks (team/user/project model checks). Custom auth now also enforces key-level model restrictions via can_key_call_model. Move the custom-auth key-access regression tests to test_user_api_key_auth.py and keep test_custom_auth_end_user_budget.py focused on end-user budget behavior. Made-with: Cursor * fix(proxy): gate custom-auth key model checks behind opt-in Keep key-level model allowlist enforcement in custom auth behind `custom_auth_run_common_checks` to preserve backwards compatibility, and update tests to verify default non-enforcement and opt-in enforcement behavior. Made-with: Cursor * test(proxy): isolate custom auth default check from shared settings state Patch `proxy_server.general_settings` to an empty dict in the default custom-auth key-access test so it remains deterministic under shared module state. Made-with: Cursor * test(proxy): strengthen custom auth post-check assertions Tighten custom auth regression tests by asserting exact can_key_call_model args and remove an unused common_checks mock from the default behavior path. Made-with: Cursor * fix(agentcore): parse A2A JSON-RPC responses in AgentCore provider * fix(prompt-templates): ensure_alternating_roles handles tool-call chains * feat(auth): add JWT claim routing overrides for OAuth2 validation Made-with: Cursor * docs(auth): document JWT-to-OAuth2 routing overrides Add generic docs for running JWT and OAuth2 together, including routing_overrides YAML examples and list-based selector behavior for iss/client_id/aud. Made-with: Cursor --------- Co-authored-by: Milan <milan@berri.ai> Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>	2026-04-02 16:38:01 -07:00
Ryan Crabbe	c19a63e2bf	docs: clarify that models sub-field only applies to SSO auto-created teams	2026-04-02 16:00:20 -07:00
Ryan Crabbe	59b09102b9	docs: add default_team_params to config reference and update examples - Add default_team_params to litellm_settings reference table in config_settings.md with all sub-fields documented - Update self_serve.md and msft_sso.md examples to include team_member_permissions, tpm_limit, and rpm_limit - Fix misleading comment that implied default_team_params only applies to SSO auto-created teams — it applies to all /team/new calls	2026-04-02 15:51:28 -07:00
Krrish Dholakia	06df8edf92	docs: cleanup (#25026)	2026-04-02 15:18:24 -07:00
Krrish Dholakia	cae8613660	Announce April Townhall (#25021) * fix: replace hardcoded url * docs: announce april townhall	2026-04-02 14:10:49 -07:00
yuneng-jiang	068e6e2a9e	Merge pull request #24951 from BerriAI/litellm_remove_neon_cli [Fix] Remove Neon CLI and Pin All JS Dependencies	2026-04-02 12:47:46 -07:00
David Chen	d1df4e838b	Litellm fix update bedrock models (#24947) * update bedrock models in tests * updated more tests and model_prices_and_context_window * fix model id and pricing * replace more sonnet models * update tests * git push * update pricing * flaky total cost * monkey patch * relax the cost change * fix and revert some changes * revert the pricing * chore: move cost/pricing changes to bedrock-cost-fixes branch * chore: split Bedrock file-api beta stripping to separate branch Removes strip_unsupported_file_api_betas_for_bedrock_invoke from this branch; see litellm_bedrock_invoke_strip_file_api_betas for that fix. Made-with: Cursor	2026-04-01 19:22:54 -07:00
Yuneng Jiang	006d481025	[Fix] Remove neon CLI dependency and pin all JS dependencies Remove @neondatabase/api-client and neonctl to address CVE-2026-25639 (axios supply chain vulnerability). Pin all JS dependencies to exact versions across all package.json files to prevent future supply chain attacks via semver range resolution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 16:15:32 -07:00
ryan-crabbe-berri	2f1cfb0548	Merge pull request #24751 from BerriAI/litellm_ryan-march-28 litellm ryan march 28	2026-03-31 17:25:30 -07:00

6085 Commits