litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 07:33:58 +00:00

Author	SHA1	Message	Date
yuneng-jiang	5c1f7d99bf	Merge pull request #25731 from BerriAI/docs_guardrail fallbacks image	2026-04-14 18:13:12 -07:00
shivam	65ce89dc67	update	2026-04-14 18:02:41 -07:00
shivam	19629004f5	fallbacks image	2026-04-14 17:58:11 -07:00
Yuneng Jiang	05ad48236f	[Docs] Regenerate v1.83.3-stable release notes from v1.82.3-stable baseline The previous v1.83.3 changelog was generated against v1.83.0-nightly and missed ~3 weeks of work. This regenerates it against the previous stable release and restructures the LLM API Endpoints section to group by API type (Responses, Batch, Count Tokens, Video Generation, Pass-Through, etc.) matching the convention used in v1.82.3, v1.82.0, and v1.81.14. Adds ~25 previously uncited PRs, cross-section duplications for cross-cutting changes, and a verified first-time-contributors list.	2026-04-14 17:19:42 -07:00
Ryan Crabbe	3aae15f5d8	[Docs] Use GitHub avatar for Ryan Crabbe in release notes Replace the expiring LinkedIn CDN image URL with a stable GitHub avatar URL for v1.83.3 and v1.83.7.rc.1 release notes.	2026-04-14 16:22:07 -07:00
Yuneng Jiang	966be2982a	[Docs] Add missed content PRs to v1.83.7.rc.1 and update runbook - Add 8 content PRs that merged directly to the release branch outside the listed staging PRs: #23769 (Ramp callback), #25252 (JWT OAuth2 override), #25254 (AWS GovCloud mode), #25258 (batch-limit cleanup), #25334 (router custom_llm_provider), #25345 (Triton embeddings), #25347 (tag-based routing), #25358 (Baseten pricing attribution) - Add @kedarthakkar to new contributors (first-ever PR via #23769) - Update RELEASE_NOTES_GENERATION_INSTRUCTIONS: require walking git log range between release tags in addition to staging PRs, and verify new-contributor status per author rather than trusting the GH release body floor	2026-04-14 16:13:09 -07:00
Yuneng Jiang	4a1da629fa	[Fix] Correct pip install versions for v1.83.3-stable and v1.83.7.rc.1 docs PyPI publishes 1.83.3 and 1.83.7 (no .post1 / rc1 suffixes) — align the pip install commands with the actual published versions.	2026-04-14 16:00:27 -07:00
Yuneng Jiang	8eec2c69b7	[Docs] Add release notes for v1.83.3-stable and v1.83.7.rc.1 - Retitle existing v1.83.3 preview file to v1.83.3-stable (same commit) - Add new v1.83.7.rc.1 preview release notes - Update RELEASE_NOTES_GENERATION_INSTRUCTIONS runbook with guidance on resolving staging PRs to their underlying commits	2026-04-14 15:58:13 -07:00
ishaan-berri	0e43050a01	Merge pull request #25650 from BerriAI/litellm_dev_04_13_2026_p1 feat: add litellm.compress() — BM25-based prompt compression with ret…	2026-04-14 12:24:47 -07:00
Sameer Kankute	1a9a31e4a2	Merge pull request #25665 from BerriAI/litellm_oss_staging_04_13_2026_p1 litellm oss staging 04/13/2026	2026-04-14 23:50:08 +05:30
Jonas Neubert	e724e5e07d	add NO_OPENAPI env var to disable /openapi.json endpoint (#25547)	2026-04-14 23:37:49 +05:30
Ashton Sidhu	6343148c95	Hiddenlayer Integration: Add V2 Integration (#22708 ) * Serialize error message to a string; only scan last message * Update litellm/proxy/guardrails/guardrail_hooks/hiddenlayer/hiddenlayer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Add v2 of hiddenlayer guardrail implementation * Update litellm/proxy/guardrails/guardrail_hooks/hiddenlayer/hiddenlayer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix potential header issue * linting * Add image support --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-14 23:37:49 +05:30
ishaan-berri	4a71583951	Merge pull request #25348 from BerriAI/litellm_gemini-veo-video-resolution-pricing2 feat(gemini): Veo Lite pricing, video resolution usage and tiered cost	2026-04-14 10:23:22 -07:00
yuneng-jiang	8427534f13	Merge pull request #25647 from BerriAI/litellm_yj_apr_11 [Infra] Merge dev branch with main	2026-04-13 17:28:38 -07:00
yuneng-jiang	a306092d47	Merge pull request #25463 from BerriAI/litellm_oss_staging_04_09_2026 Litellm oss staging 04 09 2026	2026-04-13 17:25:53 -07:00
ishaan-berri	548225ef31	Merge pull request #25586 from BerriAI/litellm_ishaan_april11 Litellm ishaan april11	2026-04-13 14:55:50 -07:00
Krrish Dholakia	26c7412339	feat: add litellm.compress() — BM25-based prompt compression with retrieval tool (#25637 ) * feat: add litellm.compress() for BM25-based context compression Adds a compress() utility that reduces context size for LLM calls using BM25 relevance scoring (with optional semantic embeddings via litellm.embedding()). Messages below a token threshold pass through unchanged; messages above are scored, ranked, and the lowest-relevance ones replaced with stubs. Originals are cached and a retrieval tool is injected so the model can recover dropped content on demand. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(compress): truncate high-scoring messages instead of fully stubbing them When a relevant message was too large to fit in the token budget it was replaced with a stub, leaving the LLM with no real content to work with. Now the highest-scoring overflow message is truncated (first 70% + last 30% of words) to fill the remaining budget, so the LLM always receives actual content rather than just a retrieval pointer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(bm25): add prefix expansion so query terms match inflected doc tokens "cook" now matches "cooking", "auth" matches "authentication", etc. Without this, short query terms scored 0 against longer inflected forms in documents, causing the wrong message to be kept. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add routing correctness test and eval harness for litellm.compress() - test_simple_compression: parametrized test verifying BM25 routes the right message based on query ("How to cook?" 
keeps cooking, "Fix auth" keeps auth content) - eval_compression.py: end-to-end eval harness comparing baseline vs compressed model performance on HumanEval-style coding problems Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(eval): add SWE-bench Lite compression eval harness Uses princeton-nlp/SWE-bench_Lite_bm25_27K which bundles ~27k tokens of BM25-retrieved repo context per problem — large enough to meaningfully stress litellm.compress() without Docker or GitHub API calls. Proxy eval metrics (no test runner needed): - has_diff: model produced a valid unified diff - file_overlap: fraction of gold-patch files in generated patch - exact_file_match: generated patch touches exactly the right files Run: python tests/eval_swe_bench.py --model gpt-4o --problems 10 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(eval): robust dataset loading + sys.path fix for worktree imports - Add HuggingFace API fallback so the SWE-bench loader doesn't need the `datasets` library (avoids pyarrow/numpy binary compat issues) - Insert repo root into sys.path so compression module resolves from worktrees - Use direct import of litellm_compress to avoid __getattr__ issues Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * improve compression quality: line-based truncation, multi-message budget, 70% default target - Switch truncate_message from word-based to line-based splitting to preserve code structure (function boundaries, indentation) - Allow multiple messages to be truncated instead of burning entire budget on one overflow message - Raise default compression target from 50% to 70% of trigger for better quality/cost tradeoff - Add --compression-target CLI arg to SWE-bench eval harness - Move tests to canonical locations (tests/test_litellm/, scripts/) - Add docs page and sidebar entries for compress() Eval results (5 problems, Opus, trigger=10k): Hunk overlap delta improved from -0.417 to -0.221 Content similarity now matches baseline (+0.006) Cost 
savings: 72% Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add SWE-bench performance results to compress() docs Include benchmark table from Opus eval (5 problems, trigger=10k) showing 72% cost savings with file-level quality fully preserved. Add metric explanations and eval runner examples. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(eval): use tolerance-based hunk overlap metric The exact line-number matching was too brittle — LLM-generated patches often target the right code region but with slightly offset line numbers. Switch to hunk-level overlap with a 10-line tolerance window so nearby edits count as matches. This better reflects actual patch quality. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add compression_interception callback for LiteLLM Proxy Add a proxy callback that automatically compresses incoming /v1/messages payloads above a configurable token threshold, runs the retrieval tool loop server-side, and returns the final response. This brings compress() support to proxy deployments (e.g. Claude Code via /v1/messages). - New callback: litellm/integrations/compression_interception/ - Proxy config: compression_interception_params in litellm_settings - Support for input_type param in compress() (openai vs anthropic) - Docs: proxy setup instructions with YAML config example - Tests: 139-line unit test suite for the interception handler Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Revert "feat: add compression_interception callback for LiteLLM Proxy" This reverts commit 72bd5cb152ca1df07f14a14e14a2816e188874a8. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-13 12:23:54 -07:00
Krrish Dholakia	d319cd8cc6	fix: blog dark mode - text invisible on dark background (#25620 ) The blog CSS selectors for dark mode used descendant selectors like [data-theme='dark'] .blog-wrapper which never matched because both data-theme and .blog-wrapper are applied to the same <html> element by Docusaurus. Fixed by using compound selectors (no space): [data-theme='dark'].blog-wrapper. Also added missing dark-mode overrides for: - pre/code blocks in blog posts - link colors in blog posts - marquee items, separators, and labels on blog list page - pagination links on blog list page - meta text and author separators on blog list page Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-04-13 09:08:57 -07:00
Sameer Kankute	fa605d85c0	Merge pull request #25616 from BerriAI/main merge main	2026-04-13 08:43:43 +05:30
Yuneng Jiang	41849a540d	document new env var and fix type hint - Add LITELLM_OIDC_ALLOWED_CREDENTIAL_DIRS to the environment variables reference so the documentation test passes. - Annotate the values variable in _reject_os_environ_references so it accepts both dict.values() and list iterables.	2026-04-11 22:17:32 -07:00
Yuneng Jiang	6baee0dfcb	address review feedback - Log a warning when dropping callback params that carry os.environ/ references so operators notice the misconfiguration. - Require absolute paths in oidc/file/ and correct the documented example to use the leading-slash form. - Drop the unused return value from _reject_os_environ_references.	2026-04-11 21:52:39 -07:00
Yuneng Jiang	06a0d4498a	fix: tighten handling of environment references in request parameters - Reject os.environ/ references supplied via /health/test_connection request params instead of resolving them; config-sourced values are already resolved before reaching the endpoint. - Skip os.environ/ references in dynamic callback params loaded from per-request metadata. - Constrain oidc/file/ to an allowed credential directory allowlist (defaults to /var/run/secrets and /run/secrets, overridable via LITELLM_OIDC_ALLOWED_CREDENTIAL_DIRS).	2026-04-11 21:41:41 -07:00
ishaan-berri	fdd7500904	blog: add back arrow to blog post pages (#25587) * blog: add back arrow to post pages * blog: style back arrow — fixed top-left below navbar	2026-04-11 19:15:45 -07:00
ishaan-berri	1edf41c26f	Merge pull request #25585 from BerriAI/litellm_dev_04_11_2026_p1 Litellm dev 04 11 2026 p1	2026-04-11 18:46:57 -07:00
ishaan-berri	329a526b9d	Merge pull request #25579 from BerriAI/feat/anthropic-advisor-tool feat(advisor): advisor tool orchestration loop for non-Anthropic providers	2026-04-11 18:32:44 -07:00
Ishaan Jaffer	dd87f3be5b	docs(advisor): move supported providers to top, focus how it works on litellm native loop	2026-04-11 18:27:18 -07:00
Ishaan Jaffer	a8bc7bfcd4	docs(advisor): add how it works section with mermaid diagram + non-native provider table	2026-04-11 18:23:33 -07:00
Ishaan Jaffer	35f4b47ff8	apply content guidelines: scale/resilience narrative, FAQ, Key Takeaways, Conclusion CTA	2026-04-11 18:12:32 -07:00
Ishaan Jaffer	14eed24471	add redis circuit breaker blog post with React diagrams	2026-04-11 18:02:59 -07:00
Ishaan Jaffer	8e616ecdf4	add BlogPostPage swizzle: hide sidebar, add hiring CTA on every post	2026-04-11 18:02:56 -07:00
Ishaan Jaffer	dac44fb443	blog list styles: clean typography, marquee animation, hero layout	2026-04-11 18:02:52 -07:00
Ishaan Jaffer	85cb7db8b9	blog list page: Ramp-style flat list with hero, provider marquee, hiring CTA	2026-04-11 18:02:48 -07:00
Ishaan Jaffer	05d516482f	restyle blog list page to match engineering blog aesthetic	2026-04-11 18:02:44 -07:00
Krrish Dholakia	e08e3bf748	docs: clarify how to get benchmarking script	2026-04-11 17:31:03 -07:00
Krrish Dholakia	12bca649fc	docs: refactor benchmarking docs to be clearer	2026-04-11 17:30:09 -07:00
Yuneng Jiang	909247785e	Merge remote-tracking branch 'origin' into litellm_internal_staging_04_11_2026	2026-04-11 15:41:03 -07:00
Sameer Kankute	c13be44e44	feat(guardrails): optional skip system message in unified guardrail inputs (#25481 ) * feat(guardrails): optional skip system message in unified guardrail inputs Made-with: Cursor * feat(dashboard): skip_system_message_in_guardrail in guardrail UI Add a tri-state control (inherit / yes / no) when creating or editing guardrails so admins can set litellm_params.skip_system_message_in_guardrail without YAML. Table edit merges existing litellm_params before PUT to avoid wiping content-filter and other provider fields. Document the dashboard flow in the guardrails quick start with a screenshot. Made-with: Cursor * fix(guardrails): type structured_messages as AllMessageValues for mypy Use AllMessageValues in openai_messages_without_system and cast adapter request messages so GenericGuardrailAPIInputs matches TypedDict. Made-with: Cursor	2026-04-11 08:53:24 -07:00
Yuneng Jiang	9a0487553d	Merge remote-tracking branch 'origin' into litellm_oss_staging_04_09_2026	2026-04-10 16:41:27 -07:00
ishaan-berri	831083b565	Merge pull request #25525 from BerriAI/feat/anthropic-advisor-tool feat(anthropic): support advisor_20260301 tool type	2026-04-10 16:39:34 -07:00
Krrish Dholakia	4e12d3c562	docs: document april townhall announcements (#25537) * docs: document april townhall announcements * docs: cleanup blog post	2026-04-10 16:12:06 -07:00
Ishaan Jaffer	d6e2a74c0f	docs: move advisor tool doc to completion/ guides section in sidebar	2026-04-10 15:08:25 -07:00
Ishaan Jaffer	ed973c049f	docs: add Advisor Tool documentation page	2026-04-10 13:15:54 -07:00
Yuneng Jiang	a889dea8cc	[Docs] Add missing MCP per-user token env vars to config_settings MCP_PER_USER_TOKEN_DEFAULT_TTL and MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS were added in #25441 but not documented, causing test_env_keys.py to fail.	2026-04-09 23:58:36 -07:00
Krrish Dholakia	a6d81e1575	docs: add Docker Image Security Guide for cosign verification and deployment best practices (#25439 ) - New doc page covering all signed image variants, verification commands, CI/CD enforcement (K8s Sigstore Policy Controller, GCP Binary Authorization, AWS/EKS, GitHub Actions), digest pinning, and safe upgrade patterns - Added to sidebar under Setup & Deployment - Cross-linked from the existing deploy.md cosign section Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-04-09 23:58:35 -07:00
Yuneng Jiang	ce0b57b4ff	[Docs] Add missing MCP per-user token env vars to config_settings MCP_PER_USER_TOKEN_DEFAULT_TTL and MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS were added in #25441 but not documented, causing test_env_keys.py to fail.	2026-04-09 21:04:34 -07:00
Krrish Dholakia	3a6db708ce	docs: add Docker Image Security Guide for cosign verification and deployment best practices (#25439 ) - New doc page covering all signed image variants, verification commands, CI/CD enforcement (K8s Sigstore Policy Controller, GCP Binary Authorization, AWS/EKS, GitHub Actions), digest pinning, and safe upgrade patterns - Added to sidebar under Setup & Deployment - Cross-linked from the existing deploy.md cosign section Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-04-09 11:50:15 -07:00
stuxf	a6c30b30bf	build: migrate packaging, CI, and Docker from Poetry to uv (#25007 ) * build: migrate packaging metadata to uv * ci: move automation and local tooling to uv * docker: migrate image builds and runtime setup to uv * docs: update install and deployment guidance for uv * chore: align auxiliary scripts and tests with uv * test: harden test_litellm isolation * fix: keep release and health check images self-contained * build: pin uv tooling and health check deps * test: isolate bedrock image request formatting from suite state * test: cover sandbox executor requirements flow * ci: fix circleci no-op command steps * ci: fix circleci publish workflow parsing * fix: stabilize remaining uv migration CI checks * ci: increase matrix test timeout headroom * fix: restore published docker and license coverage * fix: restore proxy runtime build parity * fix: restore proxy extras parity and venv migrations * ci: persist uv path across circleci steps * fix: keep psycopg binary in default test env * docker: preserve prisma cache across stages * test: run local proxy checks through uv python * build: restore runtime deps moved into ci * build: refresh uv lock after upstream merge * fix: restore module import in test_check_migration after merge The conflict resolution imported only the function but the test body references check_migration as a module throughout. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching - Move google-generativeai, Pillow, tenacity back to ci group (they are lazily imported and bloat the base SDK install needlessly) - Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant in Docker where system Node.js is already installed via apk) - Remove all nodejs-wheel node replacement and venv npm patching blocks from Dockerfiles since the wheel is no longer installed - Add --no-default-groups to CodSpeed benchmark workflow so the benchmark environment matches the old minimal pip install footprint - Apply standard uv two-phase Docker pattern: copy metadata first, install deps (cached layer), then copy source and install project - Replace CircleCI enterprise no-op with proper uv sync command Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate uv.lock after removing nodejs-wheel-binaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): use cache/restore instead of cache to prevent cache poisoning The old workflow used actions/cache/restore (read-only). The uv migration changed it to actions/cache (read-write), which zizmor flags as a cache poisoning risk. Restore the safer read-only variant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert The setup-uv action enables caching by default, which zizmor flags as a cache poisoning risk. Disable it since we already use a read-only cache/restore step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): disable setup-uv cache in publish workflow Silences zizmor cache-poisoning alert. Publishing workflow runs infrequently on protected branches so caching adds no real benefit. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): remove duplicate verbose_logger mock in test_check_migration The logger was patched twice — first via mocker.patch() then via mocker.patch.object(autospec=True). The second call fails because autospec cannot inspect an already-mocked attribute. Remove the redundant first patch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): free disk space before Docker build in test-server-root-path The Dockerfile.non_root build ran out of disk on the CI runner. Remove Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 11:46:23 -07:00
Abhijoy Sarkar	c688d9d6bc	Add PromptGuard guardrail integration (#24268 ) * Add PromptGuard guardrail integration Add PromptGuard as a first-class guardrail vendor in LiteLLM's proxy, supporting prompt injection detection, PII redaction, topic filtering, entity blocklists, and hallucination detection via PromptGuard's /api/v1/guard API endpoint. Backend: - Add PROMPTGUARD to SupportedGuardrailIntegrations enum - Implement PromptGuardGuardrail (CustomGuardrail subclass) with apply_guardrail handling allow/block/redact decisions - Add Pydantic config model with api_key, api_base, ui_friendly_name - Auto-discovered via guardrail_hooks/promptguard/__init__.py registries Frontend: - Add PromptGuard partner card to Guardrail Garden with eval scores - Add preset configuration for quick setup - Add logo to guardrailLogoMap Tests: - 30 unit tests covering configuration, allow/block/redact actions, request payload construction, error handling, config model, and registry wiring * Fix redact path and init ordering per review feedback - P1: Update structured_messages (not just texts) when PromptGuard returns a redact decision, so PII redaction is effective for the primary LLM message path - P2: Validate credentials before allocating the HTTPX client so resources aren't acquired if PromptGuardMissingCredentials is raised - Add tests for structured_messages redaction and texts-only redaction * Harden PromptGuard integration: fail-open, event hooks, images, docs - Add block_on_error config (default fail-closed, configurable fail-open) - Declare supported_event_hooks (pre_call, post_call) like other vendors - Forward images from GenericGuardrailAPIInputs to PromptGuard API - Wrap API call in try/except for resilient error handling - Add comprehensive documentation page with config examples - Register docs page in sidebar alongside other guardrail providers - Expand test suite from 32 to 40 tests covering new functionality * Fix dict[str, Any] -> Dict[str, Any] for Python 3.8 compat 
* Address remaining Greptile feedback: timeout, redact guard - Add explicit 10s timeout to async_handler.post() to prevent indefinite hangs when PromptGuard API is unresponsive - Guard redact path: only update inputs["texts"] when the key was originally present, avoiding phantom key injection - Add test: redact with structured_messages only does not create texts key (41 tests total) * Fix CI lint: black formatting, add PromptGuardConfigModel to LitellmParams - Reformat promptguard.py to match CI black version (parenthesization) - Add PromptGuardConfigModel as base class of LitellmParams for proper Pydantic schema validation, consistent with all other guardrail vendors - Use litellm_params.block_on_error directly (now a typed field) * Address Greptile review: redact path, null decision, error context - P1: Filter _extract_texts_from_messages to user-role messages only, preventing system/assistant content from being injected into texts - P1: Strengthen test_redact_updates_structured_messages assertion from weak `in` check to strict equality, catching the injection bug - P2: Use `result.get("decision") or "allow"` to handle explicit null decision values (not just absent keys) - P2: Wrap bare exception re-raise in GuardrailRaisedException so the caller knows which guardrail failed (block_on_error=True path) - P2: Add static Promptguard entry in guardrail_provider_map so the preset works before populateGuardrailProviderMap is called - Add test for explicit null decision treated as allow * Fix black formatting: collapse f-string in error message	2026-04-09 08:12:24 -07:00
michelligabriele	cd9c511df6	feat(proxy): add credential overrides per team/project via model_config metadata (#24438)	2026-04-09 07:22:27 -07:00
Krrish Dholakia	f42ffed2bd	Litellm oss staging 04 02 2026 p1 (#25055 ) * fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700) The WIF credential dispatch in load_auth() only handled identity_pool and aws credential types. When credential_source.executable was present (used for Azure Managed Identity via Workload Identity Federation), it fell through to identity_pool.Credentials which rejected it with MalformedError. Add dispatch to google.auth.pluggable.Credentials for executable-type credential sources, following the same pattern as the existing identity_pool and aws helpers. Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF with executable credential sources. * feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447) * feat(logging): add component and logger fields to JSON logs for 3rd party filtering * Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions * Feat - Add organization into the metrics metadata for org_id & org_alias (#24440) * Add org_id and org_alias label names to Prometheus metric definitions * Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata * Populate user_api_key_org_alias in pre-call metadata * Pass org_id and org_alias into per-request Prometheus metric labels * Add test for org labels on per-request Prometheus metrics * chore: resolve test mockdata * Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata * Add org labels to failure path and verify flag behavior in test * Fix test: build flag-off enum_values without org fields * Gate org labels behind feature flag in get_labels() instead of static metric lists * Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown * Use explicit metric allowlist for org label injection instead of team heuristic * Fix duplicate org label guard, move _org_label_metrics to class 
constant * Reset custom_prometheus_metadata_labels after duplicate label assertion * fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths * fix: emit org labels by default, no opt-in flag required * fix: write org_alias to metadata unconditionally in proxy_server.py * fix: 429s from batch creation being converted to 500 (#24703) * add us gov models (#24660) * add us gov models * added max tokens * Litellm dev 04 02 2026 p1 (#25052) * fix: replace hardcoded url * fix: Anthropic web search cost not tracked for Chat Completions The ModelResponse branch in response_object_includes_web_search_call() only checked url_citation annotations and prompt_tokens_details, missing Anthropic's server_tool_use.web_search_requests field. This caused _handle_web_search_cost() to never fire for Anthropic Claude models. Also routes vertex_ai/claude-* models to the Anthropic cost calculator instead of the Gemini one, since Claude on Vertex uses the same server_tool_use billing structure as the direct Anthropic API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071) When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for Anthropic because the handler did not pass logging_obj to client.post(), so track_llm_api_timing could not set llm_api_duration_ms. Pass logging_obj=logging_obj at all four post() call sites (make_call, make_sync_call, acompletion, completion). Add test to ensure make_call passes logging_obj to client.post. 
Made-with: Cursor * sap - add additional parameters for grounding - additional parameter for grounding added for the sap provider * sap - fix models * (sap) add filtering, masking, translation SAP GEN AI Hub modules * (sap) add tests and docs for new SAP modules * (sap) add support of multiple modules config * (sap) code refactoring * (sap) rename file * test(): add safeguard tests * (sap) update tests * (sap) update docs, solve merge conflict in transformation.py * (sap) linter fix * (sap) Align embedding request transformation with current API * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) mock commit * (sap) run black formater * (sap) add literals to models, add negative tests, fix test for tool transformation * (sap) fix formating * (sap) fix models * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) commit for rerun bot review * (sap) minor improve * (sap) fix after bot review * (sap) lint fix * docs(sap): update documentation * fix(sap): change creds priority * fix(sap): change creds priority * fix(sap): fix sap creds unit test * fix(sap): linter fix * fix(sap): linter fix * linter fix * (sap) update logic of fetching creds, add additional tests * (sap) clean up code * (sap) fix after review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) add a 
possibility to put the service key by both variants * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) update test * (sap) update service key resolve function * (sap) run black formater * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) lint fix * (sap) lint fix * feat: support service_tier in gemini * chore: add a service_tier field mapping from openai to gemini * fix: use x-gemini-service-tier header in response * docs: add service_tier to gemini docs * chore: add defaut/standard mapping, and some tests * chore: tidying up some case insensitivity * chore: remove unnecessary guard * fix: remove redundant test file * fix: handle 'auto' case-insensitively * fix: return service_tier on final steamed chunk * chore: black * feat: enable supports_service_tier to gemini models * Fix get_standard_logging_metadata tests * Fix test_get_model_info_bedrock_models * Fix test_get_model_info_bedrock_models * Fix remaining tests * Fix mypy issues * Fix tests * Fix merge conflicts * Fix code qa * Fix code qa * Fix code qa * Fix greptile review --------- Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com> Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com> Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com> Co-authored-by: Lin Xu <lin.xu03@sap.com> Co-authored-by: Mark McDonald <macd@google.com> Co-authored-by: Sameer Kankute <sameer@berri.ai>	2026-04-08 21:37:10 -07:00
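
Note on c688d9d6bc (PromptGuard): the commit message mentions using `result.get("decision") or "allow"` so that an explicit null decision is treated as allow, not only an absent key. The sketch below illustrates the difference between the two lookups on a plain dict. The function names and the sample payloads are hypothetical and are not taken from the LiteLLM source.

```python
from typing import Any, Dict, Optional


def decision_with_default(result: Dict[str, Any]) -> Optional[str]:
    # dict.get's default is used only when the key is absent;
    # an explicit {"decision": None} still yields None.
    return result.get("decision", "allow")


def decision_with_or(result: Dict[str, Any]) -> str:
    # `or` also replaces falsy values (None, ""), so an explicit
    # null decision falls back to "allow".
    return result.get("decision") or "allow"


if __name__ == "__main__":
    for payload in ({}, {"decision": None}, {"decision": "block"}):
        print(payload, decision_with_default(payload), decision_with_or(payload))
```

The `or` form also maps an empty string to "allow", which may or may not be desired in a given integration.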

6111 Commits
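
Note on 06a0d4498a and 6baee0dfcb (oidc/file/ handling): the commit messages describe constraining `oidc/file/` paths to an allowlist of credential directories (defaulting to `/var/run/secrets` and `/run/secrets`) and requiring absolute paths. One way such a check could look is sketched below. This is an assumption-laden illustration, not the LiteLLM implementation: the function name is hypothetical, the allowlist is passed in directly because the format of `LITELLM_OIDC_ALLOWED_CREDENTIAL_DIRS` is not stated in the log, and symlinks are not resolved.

```python
import posixpath
from typing import Iterable

# Defaults as stated in the commit message for 06a0d4498a.
DEFAULT_ALLOWED_DIRS = ("/var/run/secrets", "/run/secrets")


def is_allowed_credential_path(
    path: str, allowed_dirs: Iterable[str] = DEFAULT_ALLOWED_DIRS
) -> bool:
    """Hypothetical helper: True if `path` is absolute and lies under an allowed dir."""
    # Relative paths are rejected outright.
    if not path.startswith("/"):
        return False
    # Collapse "." and ".." segments lexically (symlinks are NOT resolved here).
    normalized = posixpath.normpath(path)
    for directory in allowed_dirs:
        prefix = posixpath.normpath(directory).rstrip("/") + "/"
        # Compare against "dir/" so "/run/secrets-evil" does not match "/run/secrets".
        if normalized.startswith(prefix):
            return True
    return False


if __name__ == "__main__":
    for candidate in (
        "/var/run/secrets/token",
        "var/run/secrets/token",
        "/var/run/secrets/../../etc/passwd",
    ):
        print(candidate, is_allowed_credential_path(candidate))
```

A production check would likely also resolve symlinks before comparing, since a lexical check alone can be bypassed by a link inside an allowed directory.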