litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-29 19:04:28 +00:00

Author	SHA1	Message	Date
Sameer Kankute	1971c22b43	Add documentation for this new feat	2026-02-11 12:36:10 +05:30
Ishaan Jaff	3407006120	[Docs] Add docs guide for using policies (#20914 ) * init schema with TAGS * ui: add policy test * resolvePoliciesCall * add_policy_sources_to_metadata + headers * types Policy * preview Impact * def _describe_match_reason( * match based on TAGs * TestTagBasedAttachments * test fixes * add policy_resolve_router * add_guardrails_from_policy_engine * TestMatchAttribution * refactor * fix * fix: address Greptile review feedback on policy resolve endpoints - Track unnamed keys/teams as separate counts instead of inflating affected_keys_count with duplicate "(unnamed key)" placeholders. Added unnamed_keys_count and unnamed_teams_count to response. - Push alias pattern matching to DB via _build_alias_where() which converts exact patterns to Prisma "in" and suffix wildcards to "startsWith" filters. - Gate sync_policies_from_db/sync_attachments_from_db behind force_sync query param (default false) to avoid 2 DB round-trips on every /policies/resolve request. - Remove worktree-only conftest.py that cleared sys.modules at import time — no longer needed since code moved to main repo. - Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: eliminate duplicate DB queries and fix header delimiter ambiguity - Fetch teams table once in estimate_attachment_impact and reuse for both tag-based and alias-based lookups (was querying teams twice when both tag_patterns and team_patterns were provided). - Convert tag/team filter functions from async DB queries to sync filters that operate on pre-fetched data (_filter_keys_by_tags, _filter_teams_by_tags). - Fix comma ambiguity in x-litellm-policy-sources header: use '; ' as entry delimiter since matched_via values can contain commas. - Use '+' as the within-value separator in matched_via reason strings (e.g. "tag:healthcare+team:health-team") to avoid conflict with header delimiters. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs v1 guide with UI imgs * docs fix --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 18:52:31 -08:00
Alexsander Hamir	b7993b14cf	Add semgrep & Fix OOMs (#20912)	2026-02-10 17:50:14 -08:00
Alexsander Hamir	ebce0e5f8c	[Release - 02/10/2026] v1.81.10-nightly	2026-02-10 16:26:30 -08:00
Ishaan Jaffer	f311fba194	fix	2026-02-10 15:24:46 -08:00
Ishaan Jaff	f8619e2000	[Stability] Investigate + fix issue where model cost map became poorly formatted (#20895 ) * init: GetModelCostMap * fix * docs * docs fix * docs fixes * docs fix * test model cost map resilience * MODEL_COST_MAP_MIN_MODEL_COUNT * validate_model_cost_map * test_should_have_minimum_models_in_backup * docs fix * docs fix * fix * dos fix * docs fix * docs fix * docs fix * docs fix * validate_model_cost_map * fix * cleanup	2026-02-10 15:17:01 -08:00
Sameer Kankute	3de892b8ca	Merge pull request #20860 from BerriAI/litellm_perplexity_research_api_support [Feat] Perplexity research api support	2026-02-10 18:22:30 +05:30
Sameer Kankute	2eb52db3e9	Add documentation for perplexity	2026-02-10 17:44:00 +05:30
Praveena Mundolimoole	ab670a74f4	Add support for extra fields in Generic SSO via GENERIC_USER_EXTRA_ATTRIBUTES (#20761 ) * Add chat completion support for websearch * Add chat completion tool calls support and response transformation * Add new methods in chat completion * Add chat completion tool format * Add callback for websearch in completion method * Add test for web search * Potential fix for code scanning alert no. 4046: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Update litellm/integrations/websearch_interception/tools.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: empty guardrails/policies arrays should not trigger enterprise license check (#20567) * fix: empty guardrails/policies arrays should not trigger enterprise license check (#20304) The UI sends empty arrays for enterprise-only fields (guardrails, policies, logging) even when the user has not configured these features. The backend `is not None` check treated `[]` as a truthy intent to use the feature, falsely requiring an enterprise license for basic team operations. Backend: Add `and updated_kv[field] != [] and updated_kv[field] != {}` guards in `_update_metadata_fields` so empty collections are skipped. UI: Conditionally omit guardrails, logging, and policies from the payload when empty instead of defaulting to `[]`. Fixes #20304 * fix: allow clearing fields with empty collections while skipping enterprise check Address PR review feedback: 1. Move the empty-collection guard into _update_metadata_field (singular) so that empty lists/dicts skip only the premium license check but still get written into metadata. This lets users intentionally clear a previously-set field (e.g. guardrails: []) without being blocked, while the UI's default empty arrays still don't trigger a false enterprise error. 2. 
Remove sys.path hack from test file; use standard imports that work with pytest discovery. 3. Add tests verifying that empty collections are moved into metadata (field clearing works) even though they bypass the premium check. Fixes #20304 * fix critical CVE vulnerabliltes (#20683) * fix: add hook to handle db case (#20635) * Add team policy mapping for zguard (#20608) * support policy mapping on team key level * update document * update document * address comments * update document * add unit test for new feature * add more test case * feat: add support for anthropic_messages call type in prompt caching (#19233) * feat: add support for anthropic_messages call type in prompt caching * test: move anthropic_messages prompt caching test to main router test file * add tutorial on using claude code with prompt cache routing * docs: add SDK proxy authentication (OAuth2/JWT auto-refresh) documentation (#20680) Adds documentation for the litellm.proxy_auth feature that automatically obtains and refreshes OAuth2/JWT tokens when connecting to a LiteLLM Proxy. * Fixes #20582 (#20663) * fix: show error details instead of Data Not Available for failed requests (#20656) * fix(ui): add null guard for models in API keys table (#20655) The VirtualKeysTable crashed when rendering keys with null or undefined models field. The className expression tried to access .length on null, throwing a TypeError that broke the entire keys table. Added Array.isArray() guard before accessing .length on the models value. 
Fixes #20611 * Fix: Spend logs pickle error with Pydantic models and redaction (#20685) * docs: add callback registration optimization to v1.81.9 release notes (#20681) * docs: add callback registration optimization to v1.81.9 release notes * Update v1.81.9.md --------- Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com> * Fix spend logs pickle error with Pydantic models Replace copy.deepcopy() with Pydantic-safe serialization to avoid "cannot pickle '_thread.RLock' object" errors when request/response redaction is enabled. Changes: - Add _convert_to_json_serializable_dict() helper that uses model_dump() for Pydantic models instead of pickle - Replace copy.deepcopy() calls in request and response redaction paths with the new helper function - Recursively handles nested dicts, lists, and Pydantic models Root cause: Pydantic v2 BaseModel instances contain internal _thread.RLock objects for thread-safety. When copy.deepcopy() attempts to pickle these objects, it fails because threading primitives cannot be pickled. Fixes #20647 * chore: remove unused copy import Remove unused copy import that was causing lint failure. The copy.deepcopy() calls were replaced with _convert_to_json_serializable_dict() helper function in the previous commit, making the copy module no longer needed. --------- Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com> Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com> * fix(vertex_ai): propagate extra_headers anthropic-beta to request body (#20666) Vertex AI requires Anthropic beta flags in the request body (anthropic_beta array), not as HTTP headers. The Bedrock handler already extracts user-specified beta headers from the headers dict, but the Vertex handler was missing this, causing extra_headers like interleaved-thinking-2025-05-14 to be silently dropped. 
This extracts anthropic-beta values from optional_params extra_headers and merges them into the anthropic_beta request body field, and also removes extra_headers from the request body since the parent's transform_request spreads optional_params into data. * fix(streaming): preserve interleaved thinking/redacted blocks * test(streaming): build thinking chunks with typed Delta/StreamingChoices * Fix video list pagination cursors not encoded with provider metadata first_id and last_id in the video list response were returned as raw provider IDs while data[].id was properly wrapped with encode_video_id_with_provider(). This caused pagination to break when clients passed unencoded cursors back as the `after` parameter. - Encode first_id/last_id in transform_video_list_response - Decode the `after` param in transform_video_list_request via extract_original_video_id() - Add 6 unit tests covering encoding, decoding, passthrough, and full round-trip pagination Fixes #20708 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(responses): preserve streamed tool deltas when id is omitted * fix(responses): guard ambiguous tool-call index reuse * Add compaction for vertex ai * Add all new feat for v1/messages * Add inference_geo as supported messages param * Add inference based costing * Add inference_geo as supported messages param * Add support for fast param * Add fast mode for other providers * Add documentation for Fast Mode * add missing indexes on VerificationToken table * Fix structured response of tool call * Add tests for WebSearch interception with chat completions API * Add doc for chat completion web search * Fix: is_web_search_tool_chat_completion * Fix double json import * Add new vercel ai anthropic models * Fix: base_model name for body and deplyment name in URL * Add output_config as supported param * Add response schema for vercel ai sonnet 4.5 * handle when litellm_parrams might be none * Fix : 
litellm/tests/test_litellm/llms/bedrock/chat/invoke_transformations/test_bedrock_chat_invoke_transformations_anthropic_claude3_transformation.py * fix: Missing return statement for async streaming * Fix: get_supported_anthropic_messages_params * Fix mypy issues * Fix mypy issues * Add support for extra fields in Generic SSO via GENERIC_USER_EXTRA_ATTRIBUTES Enables extraction of additional fields from the Generic SSO userinfo endpoint response beyond the standard 8 fields (id, email, name, etc.). Custom handlers can now access these fields via CustomOpenID.extra_fields dict. Changes: - Add extra_fields: Optional[Dict[str, Any]] to CustomOpenID type - Add GENERIC_USER_EXTRA_ATTRIBUTES env var (comma-separated field names) - Extract specified fields using get_nested_value() with dot notation support - Add 4 test cases covering basic, nested, and missing field scenarios - Update custom_sso.py example showing how to access extra_fields Backward compatible: extra_fields is None when env var not set * docs: Add documentation for GENERIC_USER_EXTRA_ATTRIBUTES Document the new GENERIC_USER_EXTRA_ATTRIBUTES environment variable for Generic SSO - Add to admin_ui_sso.md: explanation and usage examples - Add to config_settings.md: environment variable reference - Add to custom_sso.md: code example showing how to access extra_fields - Includes examples for nested field paths with dot notation --------- Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Varun Chawla <34209028+veeceey@users.noreply.github.com> Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com> Co-authored-by: jwang-gif <j.wang@zscaler.com> Co-authored-by: nuernber <benjamin.nuernberger@jpl.nasa.gov> Co-authored-by: Cesar Garcia <128240629+Chesars@users.noreply.github.com> 
Co-authored-by: John Lathouwers <john.lathouwers@oracle.com> Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com> Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com> Co-authored-by: Elias Högbom Aronsson <elias.aronson@gmail.com> Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: tshushan <tshushan@outbrain.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Carlo Alberto Ferraris <cafxx@mercari.com>	2026-02-10 16:00:28 +05:30
Sameer Kankute	cb8ce09b0d	Add support for langchain_aws via litellm passthrough	2026-02-10 13:37:17 +05:30
Ishaan Jaff	36e0361187	[UI] M2M OAuth2 UI Flow (#20794 ) * add has_client_credentials * MCPOAuth2TokenCache * init MCP Oauth2 constants * MCPOAuth2TokenCache * resolve_mcp_auth * test fixes * docs fix * address greptile review: min TTL, env-configurable constants, tests, docs - Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s) - Make all MCP OAuth2 constants env-configurable via os.getenv() - Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py) - Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections - Update FAQ in mcp.md to reflect M2M support - Add E2E test script and config Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix mypy lint * fix oauth2 * ui feat fixes * test M2M * test fix * ui feats * ui fixes * ui fix client ID * fix: backend endpoints * docs fix * fixes greptile --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 19:28:02 -08:00
Ishaan Jaff	19024e0602	[Feat] MCP Oauth2 Fixes - Add support for MCP M2M Oauth2 support (#20788 ) * add has_client_credentials * MCPOAuth2TokenCache * init MCP Oauth2 constants * MCPOAuth2TokenCache * resolve_mcp_auth * test fixes * docs fix * address greptile review: min TTL, env-configurable constants, tests, docs - Fix zero-TTL edge case: floor at MCP_OAUTH2_TOKEN_CACHE_MIN_TTL (10s) - Make all MCP OAuth2 constants env-configurable via os.getenv() - Move test file to follow 1:1 mapping convention (test_oauth2_token_cache.py) - Add MCP OAuth doc page (mcp_oauth.md) with M2M and PKCE sections - Update FAQ in mcp.md to reflect M2M support - Add E2E test script and config Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix mypy lint * fix oauth2 * remove old files * docs fix * address greptile comments * fix: atomic lock creation + validate JSON response shape - Use dict.setdefault() for atomic per-server lock creation - Add isinstance(body, dict) check before accessing token response fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace asserts with proper guards, wrap HTTP errors with context - Replace `assert` statements with `if/raise ValueError` (asserts can be disabled with python -O in production) - Wrap `httpx.HTTPStatusError` to provide a clear error message with server_id and status code - Add tests for HTTP error and non-dict JSON response error paths - Remove unused imports Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 17:35:11 -08:00
Ishaan Jaff	4555ed37c5	fix(callbacks): allow MAX_CALLBACKS override via env var (#20781 ) * fix(callbacks): allow MAX_CALLBACKS override via env var (#20778) * fix(callbacks): allow MAX_CALLBACKS override via env var - Move MAX_CALLBACKS from logging_callback_manager.py to constants.py - Add LITELLM_MAX_CALLBACKS env var override (default: 30) - Add troubleshooting doc explaining the limit and override Fixes issue where large deployments with 60+ teams using guardrails would hit the hardcoded MAX_CALLBACKS=30 limit and fail to start. * docs: add max_callbacks to sidebar navigation --------- Co-authored-by: shin-bot-litellm <shin-bot-litellm@users.noreply.github.com> * fix callbacks issue --------- Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai> Co-authored-by: shin-bot-litellm <shin-bot-litellm@users.noreply.github.com>	2026-02-09 12:11:32 -08:00
Ishaan Jaffer	f2ba120c43	docs fix	2026-02-09 10:59:57 -08:00
Ishaan Jaff	9532ad0fab	docs fix (#20768)	2026-02-09 10:03:43 -08:00
Sameer Kankute	6b2bcdb870	Merge pull request #20483 from BerriAI/litellm_completion_websearch [Feat] Chat completion - Add Websearch support using LiteLLM /search (using web search interception hook)	2026-02-09 17:52:52 +05:30
Sameer Kankute	ef55d37bf0	Merge branch 'main' into litellm_v1_messages_claude_4_6	2026-02-09 17:14:36 +05:30
Sameer Kankute	f929461fc6	Merge pull request #20702 from emerzon/fix/issue-20698-stream-chunk-thinking-blocks fix(streaming): preserve interleaved thinking/redacted_thinking blocks	2026-02-09 16:32:43 +05:30
Sameer Kankute	2f33445054	Merge pull request #20738 from BerriAI/main merge main	2026-02-09 15:07:39 +05:30
Sameer Kankute	7fa4d090ec	Add doc for chat completion web search	2026-02-09 13:51:26 +05:30
Sameer Kankute	319453d059	Add documentation for Fast Mode	2026-02-09 11:39:35 +05:30
Sameer Kankute	20440bcadc	Add inference based costing	2026-02-09 10:51:50 +05:30
Sameer Kankute	d41df6053a	Add all new feat for v1/messages	2026-02-09 10:35:11 +05:30
Varun Chawla	c9c6a5edc9	Fix: Spend logs pickle error with Pydantic models and redaction (#20685 ) * docs: add callback registration optimization to v1.81.9 release notes (#20681) * docs: add callback registration optimization to v1.81.9 release notes * Update v1.81.9.md --------- Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com> * Fix spend logs pickle error with Pydantic models Replace copy.deepcopy() with Pydantic-safe serialization to avoid "cannot pickle '_thread.RLock' object" errors when request/response redaction is enabled. Changes: - Add _convert_to_json_serializable_dict() helper that uses model_dump() for Pydantic models instead of pickle - Replace copy.deepcopy() calls in request and response redaction paths with the new helper function - Recursively handles nested dicts, lists, and Pydantic models Root cause: Pydantic v2 BaseModel instances contain internal _thread.RLock objects for thread-safety. When copy.deepcopy() attempts to pickle these objects, it fails because threading primitives cannot be pickled. Fixes #20647 * chore: remove unused copy import Remove unused copy import that was causing lint failure. The copy.deepcopy() calls were replaced with _convert_to_json_serializable_dict() helper function in the previous commit, making the copy module no longer needed. --------- Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com> Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>	2026-02-07 23:02:29 -08:00
Cesar Garcia	1fecae0399	docs: add SDK proxy authentication (OAuth2/JWT auto-refresh) documentation (#20680 ) Adds documentation for the litellm.proxy_auth feature that automatically obtains and refreshes OAuth2/JWT tokens when connecting to a LiteLLM Proxy.	2026-02-07 22:57:04 -08:00
nuernber	55a89f279f	feat: add support for anthropic_messages call type in prompt caching (#19233 ) * feat: add support for anthropic_messages call type in prompt caching * test: move anthropic_messages prompt caching test to main router test file * add tutorial on using claude code with prompt cache routing	2026-02-07 22:51:06 -08:00
jwang-gif	c9df996b77	Add team policy mapping for zguard (#20608 ) * support policy mapping on team key level * update document * update document * address comments * update document * add unit test for new feature * add more test case	2026-02-07 22:44:17 -08:00
Harshit Jain	3b043ee8bf	fix critical CVE vulnerabliltes (#20683)	2026-02-07 22:23:01 -08:00
ryan-crabbe	f39c1e9045	docs: add middleware performance blog post (#20677 ) * docs: add middleware performance blog post * docs: add Krrish, Ishaan, and author details to middleware blog post	2026-02-07 17:36:53 -08:00
ryan-crabbe	94db421e67	docs: add callback registration optimization to v1.81.9 release notes (#20681 ) * docs: add callback registration optimization to v1.81.9 release notes * Update v1.81.9.md --------- Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>	2026-02-07 17:09:45 -08:00
Ishaan Jaffer	d8528fbfdb	docs fix	2026-02-07 16:00:28 -08:00
Ishaan Jaffer	02cfc87bdb	fidocs fix	2026-02-07 15:58:13 -08:00
Alexsander Hamir	5de7fe2897	docs: add LiteLLM Observatory section to v1.81.9 release notes (#20675 ) - Add paragraph on release validation, extensibility, and 100% coverage goal - Include OOMs and CPU regressions as issues surfaced under sustained load	2026-02-07 15:24:15 -08:00
yuneng-jiang	0531254899	reorganize admin UI docs	2026-02-07 15:20:24 -08:00
yuneng-jiang	ea255e2bd0	UI contributing and trouble shooting docs	2026-02-07 15:11:49 -08:00
yuneng-jiang	6e984122ba	Adding to release notes + sidebar	2026-02-07 14:43:46 -08:00
yuneng-jiang	ca24f56b39	Merge remote-tracking branch 'origin' into docs_yj_feb7	2026-02-07 14:40:21 -08:00
Alexsander Hamir	0f7104f8a5	docs: polish LiteLLM Observatory blog post (#20670)	2026-02-07 14:35:28 -08:00
yuneng-jiang	5876441aa2	warning placement	2026-02-07 14:35:11 -08:00
yuneng-jiang	b29572cebc	adjusting to add email integration prereq	2026-02-07 14:33:04 -08:00
yuneng-jiang	f7fbcefd26	UI team soft budget docs	2026-02-07 14:27:53 -08:00
Ishaan Jaffer	8bce48daa4	docs fix	2026-02-07 14:22:04 -08:00
Ishaan Jaff	caf51a4ca9	Litellm docs rc fixes (#20667 ) * docs * review 1 * docs fix * docs * docs fix * docs	2026-02-07 13:32:15 -08:00
Ishaan Jaff	f2ba3cc6e1	[Docs] 1.81.9 stability (#20665 ) * docs * review 1 * docs fix	2026-02-07 13:15:23 -08:00
Sameer Kankute	f5ed7826a4	Merge pull request #20637 from BerriAI/litellm_blog_claude_4_6 Update opus 4.6 blog with adaptive thinking	2026-02-07 13:09:56 +05:30
Sameer Kankute	8741512183	Update opus 4.6 blog with adaptive thinking	2026-02-07 13:07:20 +05:30
Ishaan Jaffer	36be0044dc	docs	2026-02-06 18:30:17 -08:00
Ishaan Jaff	1b24a0fdd7	docs (#20626)	2026-02-06 18:24:21 -08:00
Alexsander Hamir	0d7465694d	Add OpenAI/Azure release test suite with HTTP client lifecycle regression detection (#20622)	2026-02-06 18:03:05 -08:00
Krish Dholakia	ba74e6d9d2	Add http support to custom code guardrails + Unified guardrails for MCP + Agent guardrail support (#20619 ) * fix: fix styling * fix(custom_code_guardrail.py): add http support for custom code guardrails allows users to call external guardrails on litellm with minimal code changes (no custom handlers) Test guardrail integrations more easily * feat(a2a/): add guardrails for agent interactions allows the same guardrails for llm's to be applied to agents as well * fix(a2a/): support passing guardrails to a2a from the UI * style(code-editor): allow editing custom code guardrails on ui + add examples of pre/post calls for custom code guardrails * feat(mcp/): support custom code guardrails for mcp calls allows custom code guardrails to work on mcp input * feat(chatui.tsx): support guardrails on mcp tool calls on playground	2026-02-06 17:34:32 -08:00

5460 Commits