litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-17 20:48:32 +00:00

Author	SHA1	Message	Date
Tim Ren	dd4a41951f	fix(utils): allowed_openai_params must not forward unset params as None (#25777 ) * feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint (#25696) * feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint - Fixes #25538 * test(proxy): add tests for _get_openapi_url --------- Co-authored-by: Progressive-engg <lov.kumari55@gmail.com> * feat(prometheus): add api_provider label to spend metric (#25693) * feat(prometheus): add api_provider label to spend metric Add `api_provider` to `litellm_spend_metric` labels so users can build Grafana dashboards that break down spend by cloud provider (e.g. bedrock, anthropic, openai, azure, vertex_ai). The `api_provider` label already exists in UserAPIKeyLabelValues and is populated from `standard_logging_payload["custom_llm_provider"]`, but was not included in the spend metric's label list. * add api_provider to requests metric + add test Address review feedback: - Add api_provider to litellm_requests_metric too (same call-site as spend metric, keeps label sets in sync) - Add test_api_provider_in_spend_and_requests_metrics following the existing pattern in test_prometheus_labels.py * fix: ensure `litellm_metadata` is attached to `pre_call` guardrail to align with `post_call` guardrail (#25641) * fix: ensure `litellm_metadata` is attached to pre_call to align with post_call * refactor: remove unused BaseTranslation._ensure_litellm_metadata * refactor: module level imports for ensure_litellm_metadata and CodeQL * fix: update based off of Codex comment * revert: undo usage of `_guardrail_litellm_metadata` * feat: add pricing entry for openrouter/google/gemini-3.1-flash-lite-preview (#25610) * fix(bedrock): skip synthetic tool injection for json_object with no schema (#25740) When response_format={"type": "json_object"} is sent without a JSON schema, _create_json_tool_call_for_response_format builds a tool with an empty schema (properties: {}). The model follows the empty schema and returns {} instead of the actual JSON the caller asked for. This patch: - Skips synthetic json_tool_call injection when no schema is provided. The model already returns JSON when the prompt asks for it. - Fixes finish_reason: after _filter_json_mode_tools strips all synthetic tool calls, finish_reason stays "tool_calls" instead of "stop". Callers (like the OpenAI SDK) misinterpret this as a pending tool invocation. json_schema requests with an explicit schema are unchanged. Co-authored-by: Claude <noreply@anthropic.com> * fix(utils): allowed_openai_params must not forward unset params as None `_apply_openai_param_overrides` iterated `allowed_openai_params` and unconditionally wrote `optional_params[param] = non_default_params.pop(param, None)` for each entry. If the caller listed a param name but did not actually send that param in the request, the pop returned `None` and `None` was still written to `optional_params`. The openai SDK then rejected it as a top-level kwarg: AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking' Reproducer (from #25697): allowed_openai_params = ["chat_template_kwargs", "enable_thinking"] body = {"chat_template_kwargs": {"enable_thinking": False}} Here `enable_thinking` is only present nested inside `chat_template_kwargs`, so the helper should forward `chat_template_kwargs` and leave `enable_thinking` alone. Instead it wrote `optional_params["enable_thinking"] = None`. Fix: only forward a param if it was actually present in `non_default_params`. Behavior is unchanged for the happy path (param sent → still forwarded), and the explicit `None` leakage is gone. Adds a regression test exercising the helper in isolation so the test does not depend on any provider-specific `map_openai_params` plumbing. Fixes #25697 --------- Co-authored-by: lovek629 <59618812+lovek629@users.noreply.github.com> Co-authored-by: Progressive-engg <lov.kumari55@gmail.com> Co-authored-by: Ori Kotek <ori.k@codium.ai> Co-authored-by: Alexander Grattan <51346343+agrattan0820@users.noreply.github.com> Co-authored-by: Mohana Siddhartha Chivukula <103447836+iamsiddhu3007@users.noreply.github.com> Co-authored-by: Amiram Mizne <amiramm@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>	2026-04-16 19:04:26 +05:30
Krrish Dholakia	f42ffed2bd	Litellm oss staging 04 02 2026 p1 (#25055 ) * fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700) The WIF credential dispatch in load_auth() only handled identity_pool and aws credential types. When credential_source.executable was present (used for Azure Managed Identity via Workload Identity Federation), it fell through to identity_pool.Credentials which rejected it with MalformedError. Add dispatch to google.auth.pluggable.Credentials for executable-type credential sources, following the same pattern as the existing identity_pool and aws helpers. Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF with executable credential sources. * feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447) * feat(logging): add component and logger fields to JSON logs for 3rd party filtering * Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions * Feat - Add organization into the metrics metadata for org_id & org_alias (#24440) * Add org_id and org_alias label names to Prometheus metric definitions * Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata * Populate user_api_key_org_alias in pre-call metadata * Pass org_id and org_alias into per-request Prometheus metric labels * Add test for org labels on per-request Prometheus metrics * chore: resolve test mockdata * Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata * Add org labels to failure path and verify flag behavior in test * Fix test: build flag-off enum_values without org fields * Gate org labels behind feature flag in get_labels() instead of static metric lists * Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown * Use explicit metric allowlist for org label injection instead of team heuristic * Fix duplicate org label guard, move _org_label_metrics to class constant * Reset custom_prometheus_metadata_labels after duplicate label assertion * fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths * fix: emit org labels by default, no opt-in flag required * fix: write org_alias to metadata unconditionally in proxy_server.py * fix: 429s from batch creation being converted to 500 (#24703) * add us gov models (#24660) * add us gov models * added max tokens * Litellm dev 04 02 2026 p1 (#25052) * fix: replace hardcoded url * fix: Anthropic web search cost not tracked for Chat Completions The ModelResponse branch in response_object_includes_web_search_call() only checked url_citation annotations and prompt_tokens_details, missing Anthropic's server_tool_use.web_search_requests field. This caused _handle_web_search_cost() to never fire for Anthropic Claude models. Also routes vertex_ai/claude-* models to the Anthropic cost calculator instead of the Gemini one, since Claude on Vertex uses the same server_tool_use billing structure as the direct Anthropic API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071) When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for Anthropic because the handler did not pass logging_obj to client.post(), so track_llm_api_timing could not set llm_api_duration_ms. Pass logging_obj=logging_obj at all four post() call sites (make_call, make_sync_call, acompletion, completion). Add test to ensure make_call passes logging_obj to client.post. Made-with: Cursor * sap - add additional parameters for grounding - additional parameter for grounding added for the sap provider * sap - fix models * (sap) add filtering, masking, translation SAP GEN AI Hub modules * (sap) add tests and docs for new SAP modules * (sap) add support of multiple modules config * (sap) code refactoring * (sap) rename file * test(): add safeguard tests * (sap) update tests * (sap) update docs, solve merge conflict in transformation.py * (sap) linter fix * (sap) Align embedding request transformation with current API * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) mock commit * (sap) run black formater * (sap) add literals to models, add negative tests, fix test for tool transformation * (sap) fix formating * (sap) fix models * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) commit for rerun bot review * (sap) minor improve * (sap) fix after bot review * (sap) lint fix * docs(sap): update documentation * fix(sap): change creds priority * fix(sap): change creds priority * fix(sap): fix sap creds unit test * fix(sap): linter fix * fix(sap): linter fix * linter fix * (sap) update logic of fetching creds, add additional tests * (sap) clean up code * (sap) fix after review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) add a possibility to put the service key by both variants * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) update test * (sap) update service key resolve function * (sap) run black formater * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix validate credentials, add negative tests for credential fetching * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) fix after bot review * (sap) lint fix * (sap) lint fix * feat: support service_tier in gemini * chore: add a service_tier field mapping from openai to gemini * fix: use x-gemini-service-tier header in response * docs: add service_tier to gemini docs * chore: add defaut/standard mapping, and some tests * chore: tidying up some case insensitivity * chore: remove unnecessary guard * fix: remove redundant test file * fix: handle 'auto' case-insensitively * fix: return service_tier on final steamed chunk * chore: black * feat: enable supports_service_tier to gemini models * Fix get_standard_logging_metadata tests * Fix test_get_model_info_bedrock_models * Fix test_get_model_info_bedrock_models * Fix remaining tests * Fix mypy issues * Fix tests * Fix merge conflicts * Fix code qa * Fix code qa * Fix code qa * Fix greptile review --------- Co-authored-by: michelligabriele <gabriele.michelli@icloud.com> Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com> Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com> Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com> Co-authored-by: Lin Xu <lin.xu03@sap.com> Co-authored-by: Mark McDonald <macd@google.com> Co-authored-by: Sameer Kankute <sameer@berri.ai>	2026-04-08 21:37:10 -07:00
Sameer Kankute	1fac58abb3	fix(tests): reset module-level cache in stale alias bypass tests Reset _ENABLE_TEAM_STALE_ALIAS_BYPASS to None in both test functions to ensure test isolation and prevent ordering-dependent failures Made-with: Cursor	2026-03-27 20:11:28 +05:30
Sameer Kankute	592ac98ddc	fix(router): address Greptile P1/P2 review comments - Add deduplication guard in _update_team_model_index to prevent duplicate indices - Add wildcard comment in map_team_model for clarity - Add monkeypatch to test_team_alias_stale_bypass_disabled_by_default for determinism - Extract _get_team_deployments helper to centralize DB access pattern - Add clarifying comments for team_public_model_name assignment ordering Made-with: Cursor	2026-03-27 20:11:28 +05:30
Sameer Kankute	173695f5e0	Fix greptile comments	2026-03-27 20:11:27 +05:30
Alexey	71b687e00a	fix(proxy): sync normalized call_type into model_call_details for proxy-only errors	2026-03-18 23:49:07 +03:00
Krish Dholakia	244bdffd1b	Merge pull request #23509 from michelligabriele/fix/pass-through-duplicate-failure-logs fix(proxy): prevent duplicate callback logs for pass-through endpoint failures	2026-03-18 11:57:50 -07:00
yuneng-jiang	3489d1dbef	fix(tests): update outdated model names in wildcard model tests The expected model names in test_get_known_models_from_wildcard were removed from the model registry (claude-3-5-haiku-20241022, gemini-1.5-flash, gemini-1.5-pro). Updated to current model names. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 22:53:31 -07:00
michelligabriele	a4f94b241b	fix(proxy): prevent duplicate callback logs for pass-through endpoint failures Pass-through endpoint failures fired both async_failure_handler and async_post_call_failure_hook, causing duplicate logs in callback integrations. Add pass-through guards to the failure path, matching the existing success path behavior.	2026-03-13 04:32:36 +01:00
Julio Quinteros Pro	740cdc5c20	fix: use real State object in mock_request to fix _safe_get_request_headers The _safe_get_request_headers caching (commit `e7175a52`) uses request.state._cached_headers. With Mock(spec=Request), getattr on state returns a Mock (truthy), causing RedactedDict to receive a Mock instead of a dict. Using a real starlette State object fixes this. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 20:05:20 -03:00
Ishaan Jaff	29e3fd5d79	[Release Fix] (#22411 ) * fix(lint): suppress PLR0915 for 3 complex methods that exceed 50-statement limit - streaming_iterator.py: _process_event (84 statements) - transformation.py: translate_messages_to_responses_input (51 statements) - transformation.py: transform_realtime_response (54 statements) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(mypy): resolve type errors in public_endpoints, user_api_key_auth, common_utils, transformation - public_endpoints.py: fix _cached_endpoints type annotation - user_api_key_auth.py: accept Optional[str] for end_user_id parameter - common_utils.py: add NewProjectRequest/UpdateProjectRequest to Union type - transformation.py: add ChatCompletionRedactedThinkingBlock and list[Any] to content type Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(proxy-extras): bump version to 0.4.50 and sync schema - Bump litellm-proxy-extras from 0.4.49 to 0.4.50 - Sync schema.prisma with main proxy schema - Includes new LiteLLM_ClaudeCodePluginTable model - Includes new @@index([startTime, request_id]) on SpendLogs - Update version references in requirements.txt and pyproject.toml Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(router): use string id in test_add_deployment and add defensive str() in register_model - Change test to use string '100' instead of int 100 for model_info.id - Add str() conversion in register_model to prevent AttributeError on non-string keys Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(security): update minimatch to 10.2.4 to fix CVE-2026-27903 and CVE-2026-27904 - Run npm audit fix in docs/my-website - Updates minimatch from 10.2.1 to 10.2.4 (fixes HIGH severity ReDoS vulnerabilities) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): update realtime guardrail test assertions to match actual guardrail behavior - test_text_message_blocked_by_guardrail_no_ai_response: allow guardrail's own block message text in response.done (previously expected empty content) - test_voice_transcript_blocked_by_guardrail: allow guardrail to send response.cancel + block message + response.create flow (previously expected no response.create) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: revert proxy-extras version in requirements.txt and pyproject.toml The litellm-proxy-extras 0.4.50 is not published to PyPI yet, so consumer references must stay at 0.4.49. Only the source package pyproject.toml should be bumped to 0.4.50 for the publish_proxy_extras CI job. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: make transcript delta check optional in voice guardrail test The guardrail sends an error event (guardrail_violation) when blocking voice transcripts; it does not always produce transcript deltas. Remove the assertion requiring response.audio_transcript.delta since the error event is the primary signal that blocked content was handled. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Add missing env keys to documentation: LITELLM_MAX_STREAMING_DURATION_SECONDS and LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES These two environment variables were used in code but not documented in the environment variables reference section of config_settings.md, causing the test_env_keys.py CI test to fail. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Fix 13 mypy type errors across 6 files - in_flight_requests_middleware.py: Fix type: ignore error codes from [union-attr] to [attr-defined], add [arg-type] for Gauge *kwargs - transformation.py: Add [assignment] ignore for output_format reassignment, add fallback empty string for tool use id to fix arg-type - responses/main.py: Remove redundant type annotation on second secret_fields assignment to fix no-redef - streaming_iterator.py: Add [assignment] ignores for intermediate cache token assignments - handler.py: Add [typeddict-item] ignore for AnthropicMessagesRequest construction from dict - public_endpoints.py: Add [arg-type] ignore for _load_endpoints() return type mismatch with SupportedEndpoint model Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> fix: add auth overrides to spend tracking tests, fix realtime guardrail assertion, update UI minimatch - Add app.dependency_overrides for user_api_key_auth in 4 spend tracking tests that were returning 401 Unauthorized (error_code, error_message, error_code_and_key_alias, key_hash) - Fix realtime guardrail test to check ANY error event for guardrail_violation instead of just the first (OpenAI may send its own errors first) - Update ui/litellm-dashboard/package-lock.json to fix minimatch vulnerability Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Fix failing MCP e2e and create_mcp_server UI tests Test 1 (test_independent_clients_no_shared_session): - Add allow_all_keys: true to MCP servers in test config. With master_key and no DB, get_allowed_mcp_servers returned empty, causing 0 tools and 403 on tool calls. allow_all_keys bypasses per-key restrictions. - Add asyncio.sleep(0.5) between client connections to allow MCP SDK TaskGroup cleanup and avoid ExceptionGroup on connection close (MCP #915). Test 2 (create_mcp_server 'auth value is provided'): - Use userEvent.setup({ delay: null }) for instant keystrokes to avoid timeout from default typing delay on CI. - Increase per-test timeout to 15000ms for CI environments. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: stabilize proxy unit tests for parallel execution - test_response_polling_handler: add xdist_group to prevent heavy import OOM - test_db_schema_migration: use temp dir for worker isolation, sync schema.prisma index - test_custom_tokenizer_bug: use lighter tokenizer to prevent OOM in parallel Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add auth overrides to more spend tracking and model info tests - Fix test_ui_view_spend_logs_pagination missing auth override (401) - Fix test_view_spend_tags missing auth override (401) - Fix test_view_spend_tags_no_database missing auth override (401) - Fix test_empty_model_list.py to use app.dependency_overrides instead of patch() for FastAPI dependency injection auth Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): use patch.object for aiohttp transport test to work in parallel execution The @patch decorator was not intercepting the static method call in parallel xdist workers. Using patch.object on the directly-imported class is more reliable. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(security): update minimatch from 10.2.1 to 10.2.4 in Dockerfile The Docker image was explicitly pinning minimatch@10.2.1 which has HIGH severity ReDoS vulnerabilities (GHSA-7r86-cg39-jmmj, GHSA-23c5-xmqv-rm74). Update to 10.2.4 which includes fixes for both CVEs. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ui): prevent MCP and TeamInfo test timeouts on CI - Add userEvent.setup({ delay: null }) to all tests using userEvent in both files - Add timeout: 15000 to tests with significant user interaction (typing, multiple clicks) - Fixes: create_mcp_server Bearer Token test, TeamInfo cancel button test Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: stabilize parallel test execution and aiohttp transport test - test_aiohttp_handler: rewrite transport test to not rely on static method mock (consistently fails in parallel xdist workers) - test_proxy_cli: add xdist_group to prevent timeout during heavy imports - test_swagger_chat_completions: add xdist_group to prevent timeout Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(security): add serialize-javascript override to fix GHSA-5c6j-r48x-rmvq Add npm override for serialize-javascript>=7.0.3 in docs/my-website to fix HIGH severity RCE vulnerability via RegExp.flags. Also bump minimatch override to >=10.2.4. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Fix flaky tests: remove broken Vertex model, add retries for Anthropic - Remove vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas from test_partner_models_httpx_streaming - consistently returns 400 BadRequest - Add @pytest.mark.flaky(retries=6, delay=10) to test_function_call_parsing for transient Anthropic API overload errors - Add @pytest.mark.flaky(retries=6, delay=10) to test_openai_stream_options_call for transient Anthropic InternalServerError Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ci): add xdist_group(proxy_heavy) to prevent OOM in parallel proxy tests - Add pytestmark = pytest.mark.xdist_group('proxy_heavy') to test_proxy_utils.py - Change test_db_schema_migration.py from schema_migration to proxy_heavy group - Add @pytest.mark.xdist_group('proxy_heavy') to test_proxy_server.py::test_health Groups heavy proxy tests to run on same worker, avoiding worker OOM crashes. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Fix vertex AI qwen global endpoint test to mock vertexai module import The test_vertex_ai_qwen_global_endpoint_url test was failing because the VertexAIPartnerModels.completion() method tries to 'import vertexai' before any of the mocked code runs. In environments without google-cloud-aiplatform installed, this import fails with a VertexAIError(status_code=400). Fix by: - Adding patch.dict('sys.modules', {'vertexai': MagicMock()}) to mock the vertexai module import - Adding vertex_ai_location parameter to the acompletion call for completeness Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ci): add xdist_group to health endpoint and watsonx tests for parallel stability - test_health_liveliness_endpoint: add xdist_group('proxy_health') to prevent timeout - test_watsonx_gpt_oss tests: add xdist_group('watsonx_heavy') to prevent mock interference Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): pre-populate WatsonX IAM token cache to prevent parallel test interference The watsonx prompt transformation test was failing in parallel execution because litellm.module_level_client.post mock was being interfered with by other tests. Pre-populating the IAM token cache avoids the HTTP call entirely. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): add spend data polling with retries for e2e pass-through tests - test_vertex_with_spend.test.js: Replace 15s fixed wait with polling loop (up to 6 attempts, 10s apart) for spend data to appear in DB - Increase test timeout from 25s to 90s to accommodate polling - base_anthropic_messages_tool_search_test.py: Add flaky(retries=3) for streaming test that depends on live Anthropic API Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(ci): reduce parallel workers from 8 to 4 for proxy tests to prevent OOM - litellm_proxy_unit_testing_part2: -n 8 -> -n 4 - litellm_mapped_tests_proxy_part2: -n 8 -> -n 4, timeout 60 -> 120 - Worker crashes consistently caused by too many parallel proxy tests each loading the full FastAPI app and heavy dependency tree Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(db): add migration for SpendLogs composite index (startTime, request_id) The @@index([startTime, request_id]) was added to schema.prisma but had no corresponding migration. This caused test_aaaasschema_migration_check to fail because prisma migrate diff detected the missing index. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(db): add migration for MCP available_on_public_internet default change to true The schema.prisma changed the default for available_on_public_internet from false to true, but no migration was created. This caused the schema migration test to detect drift. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): increase server wait time and add retry to flaky external API tests - test_basic_python_version.py: increase server startup wait from 60s to 90s for slower CI environments (fixes installing_litellm_on_python_3_13) - test_a2a_agent.py: add flaky(retries=3, delay=5) for non-streaming test that depends on live A2A agent endpoint Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): add flaky retries to all intermittent external API tests for 0-fail CI Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix(test): add auth overrides to file endpoint tests that return 500 The test_target_storage tests were getting 500 because the FastAPI auth dependency wasn't overridden. Added app.dependency_overrides for proper auth bypass in test environment. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-02-28 09:46:35 -08:00
milan-berri	3e60ca3682	fix: populate user_id and user_info for admin users in /user/info (#22239 ) * fix: populate user_id and user_info for admin users in /user/info endpoint Fixes #22179 When admin users call /user/info without a user_id parameter, the endpoint was returning null for both user_id and user_info fields. This broke budgeting tooling that relies on /user/info to look up current budget and spend. Changes: - Modified _get_user_info_for_proxy_admin() to accept user_api_key_dict parameter - Added logic to fetch admin's own user info from database - Updated function to return admin's user_id and user_info instead of null - Updated unit test to verify admin user_id is populated The fix ensures admin users get their own user information just like regular users. * test: make mock get_data signature match real method - Updated MockPrismaClientDB.get_data() to accept all parameters that the real method accepts - Makes mock more robust against future refactors - Added datetime and Union imports - Mock now returns None when user_id is not provided	2026-02-27 19:12:16 -08:00
Ishaan Jaff	daa682e125	fix(tests): add missing start_db_health_watchdog_task mock (#21804 ) * fix(tests): add missing start_db_health_watchdog_task mock in test_proxy_server_prisma_setup * fix(tests): add missing start_db_health_watchdog_task mock in test_health_check_not_called_when_disabled	2026-02-21 12:31:52 -08:00
Shivam Rawat	c0e87f7ffb	fixed byok models for teams issue (#21408 )	2026-02-17 15:26:03 -08:00
Julio Quinteros Pro	2d41b03f8b	fix(test): mock environment variables for callback validation test The test test_proxy_config_state_post_init_callback_call was failing with: ``` ValidationError: 2 validation errors for TeamCallbackMetadata callback_vars.langfuse_public_key Input should be a valid string [type=string_type, input_value=None, input_type=NoneType] ``` Root cause: The test uses environment variable references like "os.environ/LANGFUSE_PUBLIC_KEY" which get resolved at runtime. In parallel execution with --dist=loadscope, these environment variables may not be set in all worker processes, causing the resolution to return None, which fails Pydantic validation expecting strings. Solution: Use monkeypatch to set the required environment variables before the test runs. This ensures consistent behavior across all test execution environments (local, CI, parallel workers). Fixes test failure exposed by PR #21277. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-15 20:44:17 -03:00
Sameer Kankute	9083b06ba7	Fix test_provider_specific_header_in_request	2026-02-11 16:56:26 +05:30
Harshit Jain	51d565f619	fix conflicts with main- (this PR is from upstream/main)	2026-02-07 03:10:53 +05:30
Ishaan Jaffer	35c636ba97	test_health_check_not_called_when_disabled	2026-01-10 13:55:11 -08:00
Alexsander Hamir	5534038e93	Fix CI: Revert security scan changes and add GitGuardian ignore rules (#18358 )	2025-12-22 17:03:53 -08:00
yuneng-jiang	81dc70673a	Merge remote-tracking branch 'origin' into litellm_ui_unset_values	2025-12-22 11:44:41 -08:00
Ishaan Jaffer	6112160a16	Revert "[Fix] Security - Remove example API keys with high entropy (#18255 )" This reverts commit `24edbccf5c`.	2025-12-20 20:48:11 +05:30
yuneng-jiang	ffcac2eebc	Allow deleting key expiry	2025-12-19 18:04:04 -08:00
Alexsander Hamir	24edbccf5c	[Fix] Security - Remove example API keys with high entropy (#18255 )	2025-12-19 10:09:50 -08:00
Alexsander Hamir	e9baa83a0f	[Fix] CI/CD – Clean Up Performance PR Changes & others (#17838 )	2025-12-11 12:50:03 -08:00
Petre Alexandru	911e802969	feat: add parallel execution handling in during_call_hook (#16279 )	2025-11-05 18:35:25 -08:00
Ishaan Jaffer	cb57455172	test_foward_litellm_user_info_to_backend_llm_call	2025-10-27 13:48:23 -07:00
Krish Dholakia	2bd41dc034	Guardrails - Responses API, Image Gen, Text completions, Audio transcriptions, Audio Speech, Rerank, Anthropic Messages API support via the unified `apply_guardrails` function (#15706 ) * fix(presidio.py): handle content as a list of texts covers openai + anthropic messages api * fix(presidio.py): safe get messages * test: add unit testing for presidio guardrails * fix(unified_guardrail.py): initial commit * fix(enkryptai.py): implement apply_guardrail to enkrypt guardrail * fix(unified_guardrail.py): support unified guardrail on input * feat(unified_guardrail.py): add post call success hook implementation allows us to just have 1 place to handle llm translation to guardrail api spec * refactor: refactor initial unified guardrail component * refactor: more refactoring * feat(responses/): add guardrails to responses api allows existing guardrails to work for new llm endpoints * docs(adding_guardrail_support.md): document new guardrail endpoint support * test: add unit tests * feat(image_generation/): add guardrail support for image generation endpoint * feat(openai/text_completion): support guardrails on `/v1/completions` API * docs: document guardrails support on new endpoints * docs: clarify when guardrails run * feat(openai/speech): add guardrail support for input * docs(rerank/): add guardrail support on input query * fix: fix ruff check	2025-10-25 13:38:57 -07:00
Ishaan Jaff	f55745fc5e	[Fix] Forward anthropic-beta headers to Bedrock, VertexAI (#15700 ) * [Fix] Forward anthropic-beta headers to Bedrock and other cross-provider scenarios (#15623) * add_provider_specific_headers_to_request * fix add_provider_specific_headers_to_request * test_provider_specific_header_multi_provider * test_provider_specific_header_in_request --------- Co-authored-by: Jack Venberg <jack.venberg@rover.com>	2025-10-18 16:26:32 -07:00
Mubashir Osmani	8b804303ed	fix: ci/cd tests + lint errors (#14646 ) * fix: lint errors + tests * fixed ci tests * fixed tests --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2025-09-17 17:06:43 -07:00
Krrish Dholakia	7e5bc8af28	test: update test	2025-07-29 21:35:44 -07:00
Ishaan Jaff	ff7dd1756a	[Security Bug Fix] Ensure only LLM API route fails get logged on Langfuse (and other loggers) (#12308 ) * _is_proxy_only_llm_api_error * test_proxy_only_error_true_for_llm_route * add not on change * Update tests/test_litellm/proxy/test_proxy_utils.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * add test_post_call_failure_hook_auth_error_key_info_route * test fix _is_proxy_only_llm_api_error * test_chat_completion_request_with_redaction * test_post_call_failure_hook_auth_error_llm_api_route --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-07-04 14:42:42 -07:00
Youfu Zhang	1c68c24358	introduce new environment variable NO_REDOC to opt-out Redoc (#12092 ) Signed-off-by: Youfu Zhang <zhangyoufu@gmail.com>	2025-06-27 21:26:37 -07:00
Krish Dholakia	cb90f8e613	Allow `/models` to return correct models for custom wildcard prefixes (#11784 ) * fix(model_checks.py): cleanup logic support wildcard models with non-provider prefix's for model discovery Closes https://github.com/BerriAI/litellm/pull/10358 * feat(model_checks.py): delegate wildcard prefix appending to the get_known_models_from_wildcard function remove from the 'get_provider_models' function * fix(model_checks.py): don't double add the wildcard prefix * test: update tests	2025-06-16 22:11:36 -07:00
Laurien	0c50f8bcc9	Update enduser spend and budget reset date based on budget duration (#8460 )	2025-06-08 08:39:14 -07:00
Ishaan Jaff	ea841eeb9b	[Feat] UI - show vector store permissions for Key, Team, Org (#11277 ) * fix LiteLLM_ObjectPermissionTable * fix include object_permission for list key * fix key list to inclue obj permissions * fix object permissions for vector stores on key info * add key edit view with vector stores * allow editing vector stores permissions * fixes obj permissions * feat: add obj permission on UI * fix: add object_permission:true * ui show org vector stores on org info * fix: show object permissions on /org/info * feat: allow updating obj permissions for keys * fixes: key object permissions * fixes: team object permissions * fixes: org object permissions * fix vector store selector for Orgs	2025-05-30 17:23:50 -07:00
Krish Dholakia	1caefb0ce0	fix(ui_sso.py): maintain backwards compatibility for older user id va… (#11106 ) * fix(ui_sso.py): maintain backwards compatibility for older user id variations Fixes issue in later SSO checks which only checked id from result * fix(internal_user_endpoints.py): handle trailing whitespace in new user email * fix(internal_user_endpoints.py): apply default_internal_user_settings on all new user calls (even when role not set) allows role undefined users to be assigned the correct role on sign up * feat(proxy_server.py): load default user settings from db - update litellm correctly updates the litellm module with default internal user settings ensures updated settings actually apply * test: add unit test * fix(internal_user_endpoints.py): fix internal user default param role * fix(ui_sso.py): fix linting error	2025-05-23 23:46:29 -07:00
Krrish Dholakia	b54e2ae98b	test: update unit test	2025-05-15 22:18:15 -07:00
Damian Gleumes	384a7ba94d	[Feat]: Configure LiteLLM to Parse User Headers from Open Web UI (#9802 ) * add user_header_name * docs: add per-user tracking to Open WebUI with LiteLLM doc * docs: standardize "OpenWeb UI" spelling across openweb_ui.md * docs: improve wording for openweb_ui guide * fix end_user_id not being set - move user header parsing to add_litellm_data_to_request - also set user_api_key_dict.end_user_id from user header	2025-05-15 22:01:12 -07:00
Ishaan Jaff	4ddca7a79c	Merge branch 'main' into litellm_fix_service_account_behavior	2025-04-01 12:04:28 -07:00
Ishaan Jaff	c2c5dbf24f	test_get_enforced_params	2025-04-01 08:41:53 -07:00
Ishaan Jaff	13aa7f75f6	test_enforced_params_check	2025-04-01 07:40:31 -07:00
Ishaan Jaff	55763ae276	test_end_user_transactions_reset	2025-04-01 07:13:25 -07:00
Ishaan Jaff	923ac2303b	test_end_user_transactions_reset	2025-03-31 20:55:13 -07:00
Ishaan Jaff	758182fc7f	fix typo on codebase	2025-03-27 22:36:00 -07:00
Krrish Dholakia	6ed995952f	fix: fix test	2025-03-14 20:28:50 -07:00
Krish Dholakia	51cb3c84e3	Litellm stable UI 02 17 2025 p1 (#8599 ) * fix(key_management_endpoints.py): initial commit with logic to get all keys for teams user is an admin for * fix(key_managements_endpoints.py): return all keys for teams user is an admin for * fix(key_management_endpoints.py): add query param to ensure user opts into seeing all team keys (not just their own) * fix(regenerate_key_modal.tsx): fix key regenerate * fix(proxy_server.py): fix model metrics check on none api base * test(test_key_generate_prisma.py): remove redundant test * test(test_proxy_utils.py): add unit test covering new management endpoint helper util * fix: fix test * test(test_proxy_server.py): fix test	2025-02-17 17:55:05 -08:00
Krish Dholakia	57e5ec07cc	Improved wildcard route handling on `/models` and `/model_group/info` (#8473 ) * fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix ensures wildcard route - `vertex_ai/gemini-` just returns known vertex_ai/gemini- models test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper * test(test_models.py): add e2e testing for `/model_group/info` endpoint * feat(prometheus.py): support tracking total requests by user_email on prometheus adds initial support for tracking total requests by user_email * test(test_prometheus.py): add testing to ensure user email is always tracked * test: update testing for new prometheus metric * test(test_prometheus_unit_tests.py): add user email to total proxy metric * test: update tests * test: fix spend tests * test: fix test * fix(pagerduty.py): fix linting error	2025-02-11 19:37:43 -08:00
Ishaan Jaff	81109893ec	(round 4 fixes) - Team model alias setting (#8474 ) * update team info endpoint * clean up model alias * fix model alias * fix model alias card * clean up naming on docs * fix model alias card * fix _model_in_team_aliases * team alias - fix litellm.model_alias_map * fix _update_model_if_team_alias_exists * fix test_aview_spend_per_user * Test model alias functionality with teams: * complete e2e test * test_update_model_if_team_alias_exists	2025-02-11 16:40:01 -08:00
Krish Dholakia	df93debbc7	Internal User Endpoint - vulnerability fix + response type fix (#8228 ) * fix(key_management_endpoints.py): fix vulnerability where a user could update another user's keys Resolves https://github.com/BerriAI/litellm/issues/8031 * test(key_management_endpoints.py): return consistent 403 forbidden error when modifying key that doesn't belong to user * fix(internal_user_endpoints.py): return model max budget in internal user create response Fixes https://github.com/BerriAI/litellm/issues/7047 * test: fix test * test: update test to handle gemini token counter change * fix(factory.py): fix bedrock http:// handling * docs: fix typo in lm_studio.md (#8222) * test: fix testing * test: fix test --------- Co-authored-by: foreign-sub <51928805+foreign-sub@users.noreply.github.com>	2025-02-04 06:41:14 -08:00
Krish Dholakia	1e011b66d3	Ollama ssl verify = False + Spend Logs reliability fixes (#7931 ) * fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param Fixes https://github.com/BerriAI/litellm/issues/6499 * feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args Closes https://github.com/BerriAI/litellm/issues/6499 * fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure prevents malformed logs from causing all spend tracking to break since they're constantly retried * test(test_proxy_utils.py): add test to ensure bad log is dropped * test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error * test(test_user_api_key_auth.py): add unit test to ensure end user id as str works * fix(auth_utils.py): ensure extracted end user id is always a str prevents db cost tracking errors * test(test_auth_utils.py): ensure get end user id from request body always returns a string * test: update tests * test: skip bedrock test- behaviour now supported * test: fix testing * refactor(spend_tracking_utils.py): reduce size of get_logging_payload * test: fix test * bump: version 1.59.4 → 1.59.5 * Revert "bump: version 1.59.4 → 1.59.5" This reverts commit 1182b46b2ed814064f55f438c11b590cd7248596. * fix(utils.py): fix spend logs retry logic * fix(spend_tracking_utils.py): fix get tags * fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints	2025-01-23 23:05:41 -08:00

1 2

70 Commits