litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 03:31:23 +00:00

Author	SHA1	Message	Date
user	9622864ef2	test: set allow_client_tags on duplicate-tag merge test test_add_litellm_data_to_request_duplicate_tags tests the request/key tag merge when tags overlap. The merge requires caller-supplied tags to flow through — set allow_client_tags=True on the key so the merge path stays testable under the new default-deny regime.	2026-04-16 22:50:54 +00:00
user	af8d479482	chore(proxy): emit warning when caller-supplied tags are stripped Silent strip is the worst debug UX: admin's client sends routing tags, they disappear, admin can't figure out why. Emit a warning naming the metadata key the tags came from and telling the admin exactly which flag to set if this is intentional.	2026-04-16 22:42:14 +00:00
user	8526628a8f	test: update tag-merge tests for default-deny client tag policy Two pre-existing tests codified the pre-fix behavior where any caller- supplied metadata.tags would flow through to spend logs and routing: - test_add_key_or_team_level_spend_logs_metadata_to_request exercised the request/key/team tag merge. Set allow_client_tags=True on the key metadata so the merge path is still tested under the new regime. - test_create_file_with_nested_litellm_metadata asserted that litellm_metadata[tags] form-data propagated to the handler. Drop the tag field; the test still proves nested form-parser correctness via spend_logs_metadata and environment.	2026-04-16 22:40:43 +00:00
user	0e62addd94	fix(proxy): gate caller-supplied routing/budget tags behind allow_client_tags VERIA-28 (High) follow-up: tag-based routing and tag budget enforcement read metadata.tags directly from the request, letting an attacker reach restricted tag-routed deployments or misattribute spend to a victim team's tag. Strip metadata.tags (and litellm_metadata.tags) at the pre-call boundary unless the caller's key or team metadata opts in with allow_client_tags=True. Default-deny: existing clients that need to pass routing tags must have the flag set explicitly on their key or team. Preserves the tag-routing feature for admins who trust their callers; closes the injection path for everyone else.	2026-04-16 22:31:00 +00:00
user	d0601692b8	fix(proxy): strip user_api_key_metadata injection slots from user input Expand the pre-call metadata strip to also remove user_api_key_metadata and user_api_key_team_metadata. The proxy writes these fields into data[_metadata_variable_name] with admin-authoritative values, but only into that one metadata key; the caller's value in the OTHER metadata key (metadata vs litellm_metadata) would otherwise persist and be picked up by _get_admin_metadata, letting a caller supply their own 'admin' config to disable guardrails, opt out of global policies, etc. VERIA-28 (High): Security Policy and Guardrail Bypass via Unsanitized Request Metadata. Add regression test at the proxy boundary verifying the strip, and extend the guardrail test to cover the post-strip admin-config path.	2026-04-16 21:48:36 +00:00
user	22572eafaf	fix: merge admin metadata from both metadata and litellm_metadata Greptile P2: _get_admin_metadata used 'litellm_metadata or metadata', meaning a caller sending a non-empty litellm_metadata would shadow admin config the proxy had injected into data['metadata']. Admin exemptions would be silently ignored. Check both keys and prefer whichever contains admin fields. Add regression test covering the shadowing scenario.	2026-04-16 21:29:13 +00:00
user	413f89892b	test: update dynamic callback params test for turn_off_message_logging removal Verify turn_off_message_logging is no longer extracted from request kwargs since it is now admin-only.	2026-04-16 21:07:00 +00:00
user	34e9be1ba7	fix: merge team metadata in admin helper, remove turn_off_message_logging from dynamic params Include user_api_key_team_metadata alongside user_api_key_metadata in _get_admin_metadata() so team-level guardrail settings are respected. Key-level settings take precedence over team-level. Remove turn_off_message_logging from _supported_callback_params so it cannot be set via request metadata. Admin controls logging globally or via key/team configuration. Update tests to verify user-injected guardrail flags are ignored while admin-configured flags are respected.	2026-04-16 21:06:59 +00:00
user	3cd5796fc7	refactor: extract admin metadata helper, hoist loop-invariant tag resolution Extract _get_admin_metadata() in CustomGuardrail to deduplicate metadata lookup. Hoist tag resolution above the deployment loop in budget limiter. Update stale comment in tag routing.	2026-04-16 21:06:59 +00:00
user	74a49b527c	fix(proxy): read guardrail config from admin metadata, fix tag routing consistency Read guardrail control flags (disable_global_guardrails, opted_out_global_guardrails) from admin-configured key metadata instead of the request body. This ensures callers cannot override admin security policies. Fix tag-based routing to enforce strict tag checks regardless of whether the request includes tags. Fix budget limiter to use the same dynamic metadata key resolution as the tag router for consistent tag extraction.	2026-04-16 21:06:59 +00:00
shin-berri	7279dca929	Merge pull request #25898 from BerriAI/litellm_llmTranslationOomMitigation_staging [Infra] Reduce llm_translation_testing parallelism and tolerate worker restarts	2026-04-16 13:31:05 -07:00
Yuneng Jiang	ebac729146	[Infra] CI: reduce llm_translation_testing parallelism and tolerate worker restarts Workers in llm_translation_testing have been crashing mid-run with "Not properly terminated" (OOM), even after bumping resource_class to xlarge. Reduce xdist workers from 8 to 4 to lower peak memory, and add --max-worker-restart=5 so a crashed worker is replaced instead of failing the whole run.	2026-04-16 13:10:22 -07:00
shin-berri	65717add14	Merge pull request #25887 from BerriAI/litellm_/vigilant-cannon [Infra] Bump llm_translation_testing resource class to xlarge	2026-04-16 11:53:52 -07:00
Yuneng Jiang	72ba880905	[Infra] Bump llm_translation_testing resource class to xlarge	2026-04-16 11:50:55 -07:00
Sameer Kankute	c6c970ca43	Merge pull request #25875 from BerriAI/litellm_docs_opus_4.7 Fix version in docs	2026-04-16 22:53:14 +05:30
yuneng-jiang	21c0718850	Merge pull request #25871 from BerriAI/litellm_yj_apr15 [Infra] Merge dev branch	2026-04-16 10:11:48 -07:00
Sameer Kankute	13522ff33a	Fix version in docs	2026-04-16 22:41:32 +05:30
ishaan-berri	44c992416c	Merge pull request #25867 from BerriAI/litellm_day_0_opus_4.7_support Litellm day 0 opus 4.7 support	2026-04-16 09:42:11 -07:00
Yuneng Jiang	b26f858ab0	fix(ci): authorize langgraph-prebuilt in liccheck.ini langgraph-prebuilt was previously pulled in as a transitive of langgraph so PyPI license metadata was reported as unknown. Now that it is explicitly pinned (==1.0.8) to avoid the broken 1.0.9 release, the license checker flags it. It is published under MIT by the same langchain-ai/langgraph repository as langgraph itself.	2026-04-16 09:41:51 -07:00
Yuneng Jiang	c294bbe4f0	fix(deps): pin langgraph-prebuilt==1.0.8 to avoid broken 1.0.9 langgraph-prebuilt 1.0.9 imports ExecutionInfo and ServerInfo from langgraph.runtime, but those symbols are not exported until langgraph 1.1.0. Our pin of langgraph==1.0.10 allows langgraph-prebuilt<1.1.0,>=1.0.8, and uv resolves to 1.0.9 (the latest in range), which breaks at import time in every test that touches langgraph.prebuilt (e.g. tests/pass_through_tests/test_mcp_routes.py): ImportError: cannot import name 'ExecutionInfo' from 'langgraph.runtime' Pinning langgraph-prebuilt to 1.0.8 pairs correctly with langgraph==1.0.10 and restores the import path.	2026-04-16 09:36:05 -07:00
Sameer Kankute	07d863b8e7	Remove max support for opus 4.7	2026-04-16 21:58:03 +05:30
Sameer Kankute	f94c8dda82	Fix model names	2026-04-16 21:47:58 +05:30
Yuneng Jiang	dafa1bf97c	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr15 # Conflicts: # litellm/litellm_core_utils/litellm_logging.py # uv.lock	2026-04-16 09:17:20 -07:00
Sameer Kankute	b3d5ff5774	Fix tests + add docs	2026-04-16 21:45:31 +05:30
Sameer Kankute	a9ff4c6991	Fix add legacy support for claude code	2026-04-16 21:20:48 +05:30
Sameer Kankute	607412defb	feat(bedrock): inject thinking for clear_thinking context_management on Messages API Bedrock rejects clear_thinking_20251015 unless thinking is enabled or adaptive. Inject minimal extended thinking and interleaved-thinking beta when Claude Code sends context_management without thinking. Adds unit tests. Made-with: Cursor	2026-04-16 21:11:09 +05:30
Sameer Kankute	fb33daa09f	opus 4.7 doesn't support tool search	2026-04-16 21:11:07 +05:30
Sameer Kankute	0868a82c34	Add support for opus 4.7 with new effort levels	2026-04-16 20:45:45 +05:30
Sameer Kankute	26937a2146	Merge pull request #25831 from BerriAI/litellm_oss_staging_04_15_2026_p1 litellm oss staging 04/15/2026	2026-04-16 19:53:00 +05:30
Sameer Kankute	4b5c86b8a1	Fix code qa	2026-04-16 19:29:08 +05:30
Sameer Kankute	baf19b4413	Fix import error	2026-04-16 19:16:49 +05:30
waani	d9a8a8a42e	fix(credentials): sync in-memory credential_list after update (#25758)	2026-04-16 19:04:26 +05:30
Tim Ren	dd4a41951f	fix(utils): allowed_openai_params must not forward unset params as None (#25777 ) * feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint (#25696) * feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint - Fixes #25538 * test(proxy): add tests for _get_openapi_url --------- Co-authored-by: Progressive-engg <lov.kumari55@gmail.com> * feat(prometheus): add api_provider label to spend metric (#25693) * feat(prometheus): add api_provider label to spend metric Add `api_provider` to `litellm_spend_metric` labels so users can build Grafana dashboards that break down spend by cloud provider (e.g. bedrock, anthropic, openai, azure, vertex_ai). The `api_provider` label already exists in UserAPIKeyLabelValues and is populated from `standard_logging_payload["custom_llm_provider"]`, but was not included in the spend metric's label list. * add api_provider to requests metric + add test Address review feedback: - Add api_provider to litellm_requests_metric too (same call-site as spend metric, keeps label sets in sync) - Add test_api_provider_in_spend_and_requests_metrics following the existing pattern in test_prometheus_labels.py * fix: ensure `litellm_metadata` is attached to `pre_call` guardrail to align with `post_call` guardrail (#25641) * fix: ensure `litellm_metadata` is attached to pre_call to align with post_call * refactor: remove unused BaseTranslation._ensure_litellm_metadata * refactor: module level imports for ensure_litellm_metadata and CodeQL * fix: update based off of Codex comment * revert: undo usage of `_guardrail_litellm_metadata` * feat: add pricing entry for openrouter/google/gemini-3.1-flash-lite-preview (#25610) * fix(bedrock): skip synthetic tool injection for json_object with no schema (#25740) When response_format={"type": "json_object"} is sent without a JSON schema, _create_json_tool_call_for_response_format builds a tool with an empty schema (properties: {}). 
The model follows the empty schema and returns {} instead of the actual JSON the caller asked for. This patch: - Skips synthetic json_tool_call injection when no schema is provided. The model already returns JSON when the prompt asks for it. - Fixes finish_reason: after _filter_json_mode_tools strips all synthetic tool calls, finish_reason stays "tool_calls" instead of "stop". Callers (like the OpenAI SDK) misinterpret this as a pending tool invocation. json_schema requests with an explicit schema are unchanged. Co-authored-by: Claude <noreply@anthropic.com> * fix(utils): allowed_openai_params must not forward unset params as None `_apply_openai_param_overrides` iterated `allowed_openai_params` and unconditionally wrote `optional_params[param] = non_default_params.pop(param, None)` for each entry. If the caller listed a param name but did not actually send that param in the request, the pop returned `None` and `None` was still written to `optional_params`. The openai SDK then rejected it as a top-level kwarg: AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking' Reproducer (from #25697): allowed_openai_params = ["chat_template_kwargs", "enable_thinking"] body = {"chat_template_kwargs": {"enable_thinking": False}} Here `enable_thinking` is only present nested inside `chat_template_kwargs`, so the helper should forward `chat_template_kwargs` and leave `enable_thinking` alone. Instead it wrote `optional_params["enable_thinking"] = None`. Fix: only forward a param if it was actually present in `non_default_params`. Behavior is unchanged for the happy path (param sent → still forwarded), and the explicit `None` leakage is gone. Adds a regression test exercising the helper in isolation so the test does not depend on any provider-specific `map_openai_params` plumbing. 
Fixes #25697 --------- Co-authored-by: lovek629 <59618812+lovek629@users.noreply.github.com> Co-authored-by: Progressive-engg <lov.kumari55@gmail.com> Co-authored-by: Ori Kotek <ori.k@codium.ai> Co-authored-by: Alexander Grattan <51346343+agrattan0820@users.noreply.github.com> Co-authored-by: Mohana Siddhartha Chivukula <103447836+iamsiddhu3007@users.noreply.github.com> Co-authored-by: Amiram Mizne <amiramm@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>	2026-04-16 19:04:26 +05:30
Brendan Smith-Elion	265a960472	fix(noma-v2): fall back to key_alias for application_id in Noma dashboard (#25795 ) Noma v1 resolved application_id from user_api_key_alias when no explicit value was set (PR #16832). Noma v2 (PR #21400) was rewritten from scratch and this fallback was not ported, causing all requests from shared LiteLLM instances to appear as a single generic "litellm" application in the Noma dashboard — breaking per-user traceability. Fix: after checking dynamic_params and self.application_id, fall back to user_api_key_alias from litellm_metadata or metadata. This matches the pattern used by PromptSecurityGuardrail._resolve_key_alias_from_request_data() and restores the v1 behavior where each API key gets its own application entry in the Noma dashboard. Fixes #25794 Co-authored-by: Brendan Smith-Elion <brendan.smith-elion@arcadia.io> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 19:04:24 +05:30
Jared Everett	3cbb36aa13	fix(ollama): propagate done_reason='length' as finish_reason for max_tokens truncation (#25824 ) * fix(ollama): propagate done_reason='length' as finish_reason for max_tokens truncation Ollama returns done_reason='length' when a response is cut off by num_predict (the max_tokens limit). Previously, non-streaming responses hardcoded finish_reason='stop', and streaming used chunk.get('done_reason', 'stop') which also defaulted to 'stop' when done_reason was absent. This meant callers (e.g. the Anthropic pass-through adapter, which maps OpenAI 'length' -> Anthropic 'max_tokens') could never detect truncation, making stop_reason always appear as 'end_turn' even for cut-off responses. Fix: read done_reason from the response JSON in the non-streaming path and use `chunk.get('done_reason') or 'stop'` in the streaming path, so Ollama's actual done_reason passes through to the caller unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Update test_ollama_chat_transformation.py * Update litellm/llms/ollama/chat/transformation.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-16 19:03:41 +05:30
Darien Kindlund	6b2973b29a	fix(vertex): strip version suffix from model name in count_tokens requests (#25800) The Vertex AI count-tokens endpoint rejects model names that include version suffixes (@default, @20251001, etc.) with: "claude-sonnet-4-6@default is not supported for token counting" The same model without the suffix ("claude-sonnet-4-6") works correctly. Strip @suffix from both the model parameter and request_data["model"] in handle_count_tokens_request before sending to the API. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 19:03:40 +05:30
ryan-crabbe-berri	ed0138b50e	Merge pull request #25812 from BerriAI/litellm_fix-invalidate-orgs-on-team-mutation fix(ui): invalidate org queries after team mutations	2026-04-15 22:51:20 -07:00
ryan-crabbe-berri	18c93e0ccd	Merge pull request #25809 from BerriAI/litellm_fix_tool_test_panel_bool_rendering fix(ui): use antd Select for MCP ToolTestPanel bool inputs	2026-04-15 22:50:57 -07:00
ryan-crabbe-berri	cf4f0516be	Merge pull request #25806 from BerriAI/litellm_fix_guardrail_optional_params_bool_rendering fix(ui): render guardrail optional_params bool defaults in Select	2026-04-15 22:50:20 -07:00
Ryan Crabbe	96415a5ac2	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix-invalidate-orgs-on-team-mutation	2026-04-15 22:41:38 -07:00
Ryan Crabbe	83095c24c6	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix_tool_test_panel_bool_rendering	2026-04-15 22:41:09 -07:00
Ryan Crabbe	bbf204e602	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix_guardrail_optional_params_bool_rendering	2026-04-15 22:40:37 -07:00
ryan-crabbe-berri	2dd060b4e4	Merge pull request #25838 from BerriAI/litellm_fix-virtual-key-projected-spend-alert fix(proxy): fix virtual key projected-spend soft budget alerts	2026-04-15 22:22:33 -07:00
Yuneng Jiang	c8cfc5de21	fix(httpx): set response.request and strip content-encoding in MaskedHTTPStatusError MaskedHTTPStatusError constructs a new httpx.Response from the original error. Two bugs surfaced under real HTTP error responses: 1. The new Response was created without request=, so response.request raised RuntimeError("The .request property has not been set.") for any downstream caller (e.g. exception_mapping_utils) that inspected it. 2. The decoded response bytes were passed together with the original Content-Encoding header. On construction httpx tried to decompress the already-decoded bytes and raised httpx.DecodingError ("Error -3 while decompressing data: incorrect header check"). Set response.request to the masked Request and strip Content-Encoding (and the now-stale Content-Length) before rebuilding the Response. URL/message masking is unchanged; the new request carries the already masked URL. Also update test_logging_key_masking_gemini: the security commit `25f93bed91` moved Gemini API keys from ?key=... URL params to the x-goog-api-key header, so api_base no longer contains the key.	2026-04-15 22:03:48 -07:00
Ryan Crabbe	f639769ca9	fix(proxy): use flat soft_budget field for virtual key projected-spend alerts The projected-spend alert in _update_key_cache read from existing_spend_obj.litellm_budget_table["soft_budget"], but the nested dict is never populated for virtual keys (the combined_view SQL maps budget fields to flat top-level attributes instead). This made the check dead code — it silently short-circuited on every request, and when unblocked, crashed update_cache with a Pydantic ValidationError because _get_projected_spend_over_limit returns a date object but CallInfo.projected_exceeded_date expects str. Fixes: read from the flat existing_spend_obj.soft_budget field that IS populated, and stringify projected_exceeded_date. Also marks team soft budget email alerts as enterprise in docs. Closes #20324	2026-04-15 21:38:18 -07:00
Yuneng Jiang	070374d03a	fix(ci): authorize RestrictedPython in liccheck.ini RestrictedPython (ZPL-2.1, a BSD-style permissive license) was added as a dependency for the custom_code guardrail sandbox, but the license checker didn't recognize it. Add to authorized packages list.	2026-04-15 21:20:40 -07:00
Yuneng Jiang	fdeeed6df8	fix(ci): resolve mypy and ruff lint failures - vertex_ai_context_caching.py: add explicit Optional[str] annotation on auth_header so later branches that assign vertex_auth_header (Optional[str]) type-check against the first branch's dict assignment (which already has type: ignore[assignment]). - path_utils.py: remove unused pathlib.Path import (F401). - emulated_handler.py: extract _extract_tool_call_fields, _resolve_queries_from_args, _execute_file_search_tool_calls, and _build_follow_up_input helpers to drop aresponses_with_emulated_file_search below ruff's PLR0915 statement limit. Behavior unchanged.	2026-04-15 21:12:51 -07:00
yuneng-jiang	be1b802501	Merge pull request #25834 from stuxf/fix/path-traversal-guardrail-yaml fix(proxy): add shared path utilities, prevent directory traversal	2026-04-15 21:01:48 -07:00
yuneng-jiang	0c8b83c0a1	Merge pull request #25827 from stuxf/fix/outbound-host-validation fix(proxy): harden request parameter handling	2026-04-15 20:57:45 -07:00
user	a4faacecaf	style: move import os to module level, fix import ordering	2026-04-16 03:29:55 +00:00
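The default-deny tag policy described in commits 0e62addd94 and af8d479482 can be sketched roughly as follows. This is only an illustration under stated assumptions: `strip_client_tags` and its signature are hypothetical, not the proxy's actual helper. Only the metadata keys (`metadata`, `litellm_metadata`), the `tags` field, and the `allow_client_tags` flag come from the commit messages.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger("tag_policy_sketch")


def strip_client_tags(
    data: Dict[str, Any],
    key_metadata: Optional[Dict[str, Any]] = None,
    team_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: drop caller-supplied tags unless the key or team opts in."""
    key_metadata = key_metadata or {}
    team_metadata = team_metadata or {}

    # Opt-in: either the key or the team metadata sets allow_client_tags=True.
    if (
        key_metadata.get("allow_client_tags") is True
        or team_metadata.get("allow_client_tags") is True
    ):
        return data

    # Default-deny: return a copy with "tags" removed from both metadata keys.
    cleaned = dict(data)
    for name in ("metadata", "litellm_metadata"):
        md = cleaned.get(name)
        if isinstance(md, dict) and "tags" in md:
            cleaned[name] = {k: v for k, v in md.items() if k != "tags"}
            # Name the metadata key and the flag so the strip is not silent.
            logger.warning(
                "Stripped caller-supplied tags from '%s'; set "
                "allow_client_tags=True on the key or team to allow them.",
                name,
            )
    return cleaned
```

With empty key and team metadata, a request carrying `metadata.tags` comes back without the tags. With `{"allow_client_tags": True}` on the key, the request passes through unchanged.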


37275 Commits
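The `allowed_openai_params` fix in commit dd4a41951f (only forward a param if it was actually present in `non_default_params`) can be illustrated with a minimal sketch. The function below is a hypothetical stand-in written for this note, not the library's `_apply_openai_param_overrides`; its name and signature are assumptions. The reproducer values are taken from the commit message.

```python
from typing import Any, Dict, List


def apply_param_overrides_sketch(
    optional_params: Dict[str, Any],
    non_default_params: Dict[str, Any],
    allowed_openai_params: List[str],
) -> Dict[str, Any]:
    """Hypothetical stand-in: forward only the allowed params the caller actually sent."""
    for param in allowed_openai_params:
        # The described pre-fix behavior wrote pop(param, None) unconditionally,
        # leaking an explicit None for params that were listed but never sent.
        if param in non_default_params:
            optional_params[param] = non_default_params.pop(param)
    return optional_params


# Reproducer from the commit message (#25697).
allowed = ["chat_template_kwargs", "enable_thinking"]
body = {"chat_template_kwargs": {"enable_thinking": False}}
result = apply_param_overrides_sketch({}, dict(body), allowed)
print(result)
```

Here `enable_thinking` appears only nested inside `chat_template_kwargs`, so the sketch forwards `chat_template_kwargs` and never writes a top-level `enable_thinking` key.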