litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 03:31:23 +00:00

Author	SHA1	Message	Date
Sameer Kankute	b08445837b	fix(logging): preserve ModelResponse choices format in redacted standard_logging_object + add Charity Engine provider endpoint - Fix perform_redaction to handle dict representation of ModelResponse (from model_dump()) - Preserve full choices structure when redacting, redact content/audio in place - Add _redact_standard_logging_object helper for standard_logging_object field - Update test_logging_redaction_e2e_test assertions to expect choices format - Add charity_engine to provider_endpoints_support.json Fixes: test_standard_logging_payload, test_standard_logging_payload_audio Made-with: Cursor	2026-03-10 10:22:57 +05:30
Sameer Kankute	c1b860b3c1	Revert "fix: strip empty text content blocks in /v1/messages endpoint (#23097 )" This reverts commit `2c738cc939`.	2026-03-10 09:53:19 +05:30
Krish Dholakia	dd6f0d6c55	fix: forward recognized OpenAI params from kwargs in completion() (#23224 ) Any param in DEFAULT_CHAT_COMPLETION_PARAM_VALUES that arrives via completion(**kwargs) is now automatically forwarded to get_optional_params(), even if it's not a named parameter of completion(). Previously, get_non_default_completion_params() excluded params in OPENAI_CHAT_COMPLETION_PARAMS (assuming they'd be forwarded via the named-param path), while optional_param_args only contained explicitly named params. Params like 'store' that were in the known-params list but not named params fell through both paths and were silently dropped. The fix adds a 7-line loop after building optional_param_args that forwards any kwargs present in DEFAULT_CHAT_COMPLETION_PARAM_VALUES. This means new OpenAI params only need to be added to the constants dict — no boilerplate changes to 3+ function signatures required. Fixes #23087 Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2026-03-09 20:56:27 -07:00
tristanolive	30b82c3a0c	feat(charity_engine): add Charity Engine provider (#23223 ) * feat(charity_engine): add Charity Engine provider Charity Engine is a crowdsourced distributed computing platform that donates processing power to charitable causes. Its inference API provides OpenAI-compatible chat, completions, and embeddings endpoints. * test(charity_engine): add provider config and resolution tests Verify JSONProviderRegistry config, provider list membership, model routing for charity_engine/<model>, and Router compatibility. * feat(charity_engine): add Charity Engine to LlmProviders enum Enables provider_list membership and LlmProviders.CHARITY_ENGINE resolution required by the provider and test suite. * fix(charity_engine): remove api_base_env to fix non-deterministic test The CHARITY_ENGINE_API_BASE env var could override the base_url in CI, causing test_charity_engine_provider_resolution to fail intermittently. * fix(charity_engine): remove trailing slash from base_url	2026-03-09 20:46:43 -07:00
Maxwell Calkin	2c738cc939	fix: strip empty text content blocks in /v1/messages endpoint (#23097 ) Claude's API returns assistant messages with empty text blocks ({"type": "text", "text": ""}) alongside tool_use blocks during multi-turn tool-use conversations. These blocks are rejected when sent back to the API with "text content blocks must be non-empty". Sanitization already exists for other code paths (/v1/chat/completions for both Anthropic and Bedrock), but NOT for the /v1/messages native path. This adds the same treatment by stripping empty text blocks from messages in async_anthropic_messages_handler before they are forwarded to the provider. Fixes #22930	2026-03-09 19:51:25 -07:00
Krish Dholakia	9500fc18d1	Fix TypeError: LiteLLM_Params.__init__() got multiple values for argument 'self' (#23220 ) The bug occurred when user data inadvertently contained reserved Python keywords like 'self', 'params', or '__class__' as keys. When such a dict was unpacked via **kwargs to LiteLLM_Params() or GenericLiteLLMParams(), Python raised TypeError because 'self' was passed both implicitly and as a keyword argument. The fix: - Add a Pydantic model_validator(mode='before') to GenericLiteLLMParams that filters out reserved keys ('self', 'params', '__class__') before validation - Move the max_retries str-to-int conversion into the same validator - Remove the custom __init__ methods from both GenericLiteLLMParams and LiteLLM_Params, since the validator now handles the preprocessing - Clean up unused VERTEX_CREDENTIALS_TYPES import This fix applies to all classes that inherit from GenericLiteLLMParams, including LiteLLM_Params and updateLiteLLMParams. Added comprehensive tests in tests/test_litellm/test_litellm_params_reserved_keys.py Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2026-03-09 19:33:52 -07:00
yuneng-jiang	c1d042c2a3	Fix flaky test_stream_chunk_builder_openai_audio_output_usage The test calls OpenAI's gpt-4o-audio-preview model which sometimes doesn't return usage data in the streaming response. Fixed by: - Adding @pytest.mark.flaky(retries=5, delay=2) for retry handling - Fixing usage_obj loop to check chunk.usage is not None - Skipping gracefully when OpenAI doesn't return usage data Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 17:18:00 -07:00
yuneng-jiang	6fe82d3886	Merge pull request #23211 from BerriAI/litellm_/sharp-keller [Fix] Skills API test failing with duplicate skill name 500	2026-03-09 17:13:29 -07:00
yuneng-jiang	af8f91ef66	[Fix] Use unique skill names in Skills API test to avoid duplicate-name 500s The test_create_skill test was consistently failing in CI with a 500 from Anthropic because the SKILL.md frontmatter always used the same hardcoded name (test-skill-litellm). Since test_delete_skill is permanently skipped, skills accumulate in the CI account, and re-creating with a duplicate name triggers an Internal Server Error on Anthropic's side. Fix: pass a timestamp-based unique_suffix to create_skill_zip so each run produces a distinct skill name in the zip's SKILL.md frontmatter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 17:09:15 -07:00
yuneng-jiang	4c3f873bde	Merge pull request #23198 from BerriAI/litellm_fix_nova_pro_max_tokens [Fix] Claude Agent SDK E2E Test Nova Pro max_tokens Limit	2026-03-09 15:54:00 -07:00
yuneng-jiang	d719c8a53c	Merge branch 'main' into litellm_fix_nova_pro_max_tokens	2026-03-09 15:47:53 -07:00
yuneng-jiang	2a836c7103	Fix Claude Agent SDK E2E test for Nova Pro max_tokens limit The Claude Agent SDK sends max_tokens=32000 for unrecognized model names (like "bedrock-nova-pro"), which exceeds Nova Pro's 10,000 limit. Enable modify_params in the test proxy config so LiteLLM clamps max_tokens to the model's actual limit. Also swap nova-premier to nova-pro since premier requires provisioned throughput unavailable in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:45:24 -07:00
yuneng-jiang	ffd1eb18e0	Merge remote main and resolve conflicts Kept our sync test fix, accepted upstream's xdist_group marker on the async handler test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:34:50 -07:00
yuneng-jiang	74ed6a16ac	Fix flaky test_watsonx_gpt_oss_prompt_transformation The test was flaky under pytest-xdist parallel execution because it used async acompletion (which runs completion() in a thread pool via run_in_executor) and relied on shared global state (known_tokenizer_config, iam_token_cache, module_level_client) that could be modified by other tests running in parallel. Failures were silently swallowed by a broad try/except, causing mock_post.call_count to remain 0. Fix: - Convert from async acompletion to sync completion, matching every other test in the file. The test's intent is verifying prompt transformation, not async behavior. - Use monkeypatch.setitem for known_tokenizer_config to ensure proper teardown isolation. - Remove unnecessary mock layers (async template fetchers, iam_token_cache pre-population, mock completion response) that were only needed for the async code path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:32:30 -07:00
yuneng-jiang	29ca052064	Merge remote main, resolve conflict keeping new unit tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:20:20 -07:00
yuneng-jiang	b7ac688b2b	Replace SearXNG integration tests with unit tests for request/response transformation The SearXNG search tests were failing in CI because they depend on a live SearXNG instance that returns results. Since this provider is used by a very small subset of customers, replace the flaky integration tests with deterministic unit tests that validate request payloads, URL construction, response parsing, and header configuration without requiring external infra. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:13:58 -07:00
yuneng-jiang	8ecac84789	Revert "feat(proxy): add Prisma DB pool and engine health metrics to Promethe…" This reverts commit `0bb26c3f1b`.	2026-03-09 14:55:11 -07:00
yuneng-jiang	be9d1798b2	Merge pull request #23182 from BerriAI/litellm_/exciting-swanson [Fix] Model pricing schema test missing output_cost_per_image_token_batches	2026-03-09 14:26:21 -07:00
yuneng-jiang	379ce1aae5	[Fix] Add output_cost_per_image_token_batches to model pricing schema test The gemini-3.1-flash-image-preview model introduced a new pricing field that was missing from the test's validation schema and cost_fields list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 14:17:52 -07:00
michelligabriele	c47f77a348	fix(agentcore): handle JSON responses from agents using sync return (#23165 ) * fix(agentcore): handle JSON responses from agents using sync return BedrockAgentCoreApp agents that use synchronous `return` (instead of async `yield`) respond with Content-Type: application/json instead of text/event-stream. The streaming parser only handles SSE format, silently discarding the JSON body and returning empty content to the client. This adds Content-Type detection in both sync and async streaming wrappers — when application/json is received, the response is parsed and converted to a single-chunk stream. Also extends _parse_json_response with a fallback chain supporting multiple agent response schemas (standard AgentCore, Strands framework, plain string, raw JSON fallback). * fix(agentcore): add dict-type guard to _parse_json_response Prevent AttributeError when json.loads() returns a non-dict (e.g. JSON array or primitive) by adding an isinstance check at the top of _parse_json_response. Non-dict values fall back to raw JSON string content. * fix(agentcore): handle malformed JSON and split streaming chunks - Wrap json.loads() in try/except in both sync and async streaming wrappers so malformed JSON bodies raise a structured BedrockError instead of a raw JSONDecodeError - Split the JSON-fallback streaming path into two chunks (content chunk with finish_reason=None, then stop sentinel with empty delta) to match the SSE path convention * fix(agentcore): catch IO errors in streaming JSON path + async error test - Broaden except clause to catch both json.JSONDecodeError and IO-level exceptions (httpx.ReadError, etc.) from response.read()/aread(), so all failures surface as structured BedrockError - Add async malformed-JSON test to mirror the sync test coverage	2026-03-09 10:22:36 -07:00
Aarish Alam	e21b06265a	fix fkey violation on deleting user (#23115 )	2026-03-09 08:53:11 -07:00
ohadgur	0bb26c3f1b	feat(proxy): add Prisma DB pool and engine health metrics to Prometheus (#22655 ) * feat(proxy): add Prisma DB pool and engine health metrics to Prometheus Add a PrismaMetricsCollector that periodically queries pg_stat_activity and the Prisma engine process to expose connection pool and engine health as Prometheus gauges/counters. Auto-enabled when prometheus_system is in service_callback. New metrics: - litellm_db_pool_active_connections (Gauge) - litellm_db_pool_idle_connections (Gauge) - litellm_db_pool_total_connections (Gauge) - litellm_db_pool_waiting_connections (Gauge) - litellm_db_engine_up (Gauge) - litellm_db_engine_restarts_total (Counter) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Greptile review feedback - Only increment engine_restarts counter on heavy reconnects (engine actually dead), not lightweight network-blip reconnects - Fix potential KeyError in _get_or_create_gauge/counter fallback path when REGISTRY._names_to_collectors is absent - Rename litellm_db_pool_waiting_connections to litellm_db_pool_lock_waiting_connections to clarify it measures lock contention, not pool slot queuing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: warn when prometheus_system enabled but watchdog disabled Log a warning when users have prometheus_system in service_callback but PRISMA_HEALTH_WATCHDOG_ENABLED=false, since DB pool and engine metrics won't be collected in that configuration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: retrigger CI checks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: use labeled gauge for DB pool connection metrics Replace 3 separate pool gauges (active, idle, total) with a single `litellm_db_pool_connections` gauge using a `state` label. This is more Prometheus-idiomatic and exposes all pg_stat_activity states (active, idle, idle in transaction, etc.) without ambiguity about what "total" includes. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Greptile review — stale labels and fallback re-registration - Zero out known pg_stat_activity states that are absent from the current query result, preventing stale gauge values from persisting. - Simplify _get_or_create_gauge/counter by removing the fallback loop that could re-register an already-registered metric (ValueError). - Add test for stale label clearing across collection cycles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: include "unknown" in _PG_STATES for stale label clearing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: collect immediately on start and consolidate into single query - Move sleep to end of loop so metrics appear on /metrics immediately after startup instead of after a 30s delay. - Combine pool state and lock waiting queries into a single SQL query using conditional aggregation, halving per-cycle DB overhead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent tight spin loop on collection error Move asyncio.sleep outside the try/except so it always executes even when _collect_engine_health() or _collect_pool_metrics() raises. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add multiprocess_mode to _get_or_create_gauge initialization - Include `multiprocess_mode` parameter to properly support multiprocessing in Gauge creation. - Ensure consistent behavior for labeled and unlabeled Gauges. * fix: handle invalid env var and document watchdog prerequisite - Add try/except ValueError for PRISMA_METRICS_COLLECTION_INTERVAL_SECONDS to prevent proxy startup crash on non-numeric values (e.g. 
"30s") - Document that DB metrics require both prometheus_system callback and PRISMA_HEALTH_WATCHDOG_ENABLED=true Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use defensive null coalescing for query_raw row values Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add invalid env var fallback test and fix mock signature - Add test for non-numeric PRISMA_METRICS_COLLECTION_INTERVAL_SECONDS - Add **kwargs to mock _patched_get_or_create_gauge for forward compat Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 08:49:46 -07:00
milan-berri	df2e1bca46	feat: allow JWT and OAuth2 auth to coexist on the same instance (#23153 ) When both enable_jwt_auth and enable_oauth2_auth are True, the proxy now routes tokens based on their format: - JWT tokens (3 dot-separated parts) -> JWT auth handler - Opaque tokens -> OAuth2 auth handler This enables using JWT for human users and OAuth2 for M2M (machine) clients on the same LiteLLM instance. Previously, enabling OAuth2 would intercept all tokens on LLM API routes before JWT auth could run. When only one auth method is enabled, behavior is unchanged (backward compatible).	2026-03-09 08:41:27 -07:00
Ihsan Soydemir	b1a6ba7711	feat(search): add Serper (serper.dev) as search provider (#23112 ) * Add Serper (serper.dev) as a new search provider * Add @greptileai fixes	2026-03-09 08:40:37 -07:00
Joe Reyna	36e04b6efe	fix(tests): restore litellm_params=None on mock agent in a2a invoke test (#23125 )	2026-03-09 07:16:02 -07:00
Joe Reyna	0bc1bd6871	fix(tests): use AsyncMock for prisma find_unique in agent get-by-id test (#23122 )	2026-03-09 07:13:50 -07:00
Giulio Leone	556c64875e	fix(models): set gpt-5.4-pro mode to responses instead of chat gpt-5.4-pro and gpt-5.4-pro-2026-03-05 do not support the /v1/chat/completions endpoint — OpenAI returns a 404 with "This is not a chat model". These models are responses-only, like o3-pro and o1-pro. Changes: - Set mode from "chat" to "responses" for both model entries - Update supported_endpoints to ["/v1/responses", "/v1/batch"] - Add regression test for responses API bridge routing Fixes BerriAI/litellm#23014	2026-03-09 12:10:08 +01:00
Sameer Kankute	a8301d5614	Fix: variations endpoint getting 401	2026-03-09 12:51:21 +05:30
Sameer Kankute	4b1929ce93	Fix mistral ocr failing test	2026-03-09 11:29:33 +05:30
Sameer Kankute	b20c0afb64	Fix test_anthropic_messages_openai_model_streaming_cost_injection & openrouter image gen	2026-03-09 11:29:04 +05:30
yuneng-jiang	3a1ac964f7	fix: pass organization_ids=None in get_users test calls When calling get_users() directly (not via FastAPI), Query() defaults are not resolved. Pass organization_ids=None explicitly to avoid 'Query' object has no attribute 'split' error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-07 23:15:16 -08:00
yuneng-jiang	ac5128493e	fix: repair test regressions from org admin auth changes - test_get_users_*: pass proxy admin user_api_key_dict since get_users now calls _authorize_user_list_request which checks user_role - test_validate_team_member_add_permissions_non_admin: set organization_id on mock team since _is_user_org_admin_for_team accesses it Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-07 23:10:53 -08:00
yuneng-jiang	8f33983389	Merge remote-tracking branch 'origin' into litellm_org_admin_add_user_e2e	2026-03-07 22:58:57 -08:00
yuneng-jiang	ce317148b9	feat: org admin access to team management — backend auth, UI visibility, tests - Add _is_user_org_admin_for_team() reusable helper to common_utils.py - Grant org admins access to /team/list, /team/info, /team/member_add, /team/member_delete, /team/member_update, /team/model/add, /team/model/delete, /team/permissions_list, /team/permissions_update - Make validate_membership async with org admin fallback - Add /user/list to self_managed_routes (endpoint handles own auth) - UI: org admins see Members, Member Permissions, Settings tabs in team view - UI: CreateUserButton uses useOrganizations() for org dropdown - UI: org admin delete-member respects disable_team_admin_delete_team_user - Add 16 unit tests for _is_user_org_admin_for_team, validate_membership, _user_is_org_admin route check, and privilege escalation prevention Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-07 22:43:09 -08:00
yuneng-jiang	8bf3c0c67f	fix: org admin invite user — multi-org selector, organizations list in POST body, auth check - Thread org objects {organization_id, organization_alias} instead of bare IDs from users/page.tsx → view_users.tsx → CreateUserButton so the selector can show aliases - Replace single-select org dropdown with multi-select; always shown when organizationIds is non-null; disabled/pre-selected for single-org admins; displays "Alias (id)" - handleCreate: maps organization_ids → organizations before POST, removes redundant organizationMemberAddCall (backend _add_user_to_organizations handles it) - _user_is_org_admin: also checks organizations list field in addition to singular organization_id so /user/new succeeds for org admins - Add 5 backend unit tests for _user_is_org_admin and 2 frontend tests for new form behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-07 20:34:12 -08:00
yuneng-jiang	67884c279a	fix: allow any authenticated user to call /user/available_roles Org admins and team admins opening the invite-user modal could not see the 4 global proxy roles because GET /user/available_roles has no request body, so the org-admin route check (which requires organization_id in the payload) always returned False and blocked them. Add /user/available_roles to self_managed_routes so the route-access check passes for any authenticated user. The endpoint's existing Depends(user_api_key_auth) still requires a valid API key. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-07 20:11:35 -08:00
Giulio Leone	1c3787264b	fix(bedrock): strip output_config from Bedrock Invoke requests (#23042 ) * fix(bedrock): strip output_config from Bedrock Invoke requests Bedrock Invoke API does not support the output_config parameter (added to Anthropic Messages API). Requests with output_config cause 400 errors: 'extraneous key [output_config] is not permitted'. Strip output_config in both Bedrock Invoke transformation layers (messages and chat), consistent with how output_format is already handled and how VertexAI strips both parameters. Fixes: https://github.com/BerriAI/litellm/issues/22797 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(bedrock): add output_config test for chat/invoke path Addresses review feedback — the chat/invoke_transformations path now has symmetric test coverage matching the messages/invoke_transformations path. Fixes: https://github.com/BerriAI/litellm/issues/22797 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: giulio-leone <6887247+giulio-leone@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-07 19:33:53 -08:00
Krish Dholakia	cf439c269c	Agents - add max budget + tpm/rpm limiting per agent AND per agent session (#22849 ) * feat: enforce x-litellm-trace-id in header, if required * feat: update spend for agent * refactor: update agent table to follow similar format as other entities - also add a spend column - allows us to see spend of an agent * fix: cleanup ui * feat: return spend on agent endpoints * feat: scope pr * feat(agents/): support budgets + rate limiting on agents + agent sessions * fix: address PR review feedback - Add missing tpm_limit, rpm_limit, session_tpm_limit, session_rpm_limit columns to root schema.prisma to match proxy and extras schemas - Add backwards-compatible fallback to key metadata for max_iterations so existing users don't silently lose enforcement Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: qa'ed RPM limiting on agents --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 19:12:42 -08:00
Krish Dholakia	03ca98123f	Agents health checks (#23044 ) * feat: add health check toggle to agents page Backend: - Add health_check query parameter to GET /v1/agents endpoint - When health_check=true, performs concurrent GET requests to each agent's URL and filters out agents with unreachable URLs (5s timeout) - Agents returning HTTP <500 are considered healthy; 5xx and connection errors mark agents as unhealthy UI: - Add Health Check toggle (Switch) to agents panel header - Toggle triggers re-fetch with health_check=true, filtering the agent list - Icon color changes (green/gray) to indicate toggle state - Tooltip explains behavior: 'only agents with reachable URLs are shown' Networking: - Update getAgentsList to accept optional healthCheck boolean parameter Tests: - Backend: 9 new tests covering health check filtering, _check_agent_url_health helper (no URL, 200, 404, 500, connection error cases) - UI: 3 new tests verifying toggle renders, initial fetch without health check, and fetch with health check after toggle click Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * fix: fix greptile comment re: security issue * fix: fix based on greptile feedback * fix: align health check tests with implementation - Rename test_should_return_unhealthy_when_no_url to test_should_return_healthy_when_no_url (implementation returns healthy=True for agents without a URL) - Patch get_async_httpx_client instead of httpx.AsyncClient so mocks actually intercept the HTTP calls made by _check_agent_url_health - Remove unnecessary __aenter__/__aexit__ context-manager mocks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: undo _experimental/out renames from cherry-pick Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update litellm/proxy/agent_endpoints/endpoints.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Claude Opus 4.6 
<noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-07 18:32:47 -08:00
Krish Dholakia	e7714f0ce6	Fix CVEs: bump tar/minimatch/pypdf + harden Docker SBOM patching (#23082 ) * fix(docker): bump tar/minimatch/pypdf for CVE fixes + harden SBOM patching - Bump tar 7.5.8→7.5.10, minimatch 10.2.1→10.2.4, pypdf 6.6.2→6.7.3 - Add sed-based SBOM metadata patching with properly indented find/sed - Add npm package manager cleanup (apk del / apt-get purge) to remove stale SBOM entries from image scanners - Scope \|\| true to only apk del via brace grouping { ... \|\| true; } - Guard npm root -g with non-empty assertion to prevent silent failures - Scope minimatch sed regex to ^10.x to avoid matching other major versions Addresses: CVE-2026-27903, CVE-2026-27904, GHSA-qffp-2rhf-9h96, CVE-2026-27888 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(docker): scope find to /usr/local/lib /usr/lib, drop autoremove - Replace `find /` with `find /usr/local/lib /usr/lib` to avoid traversing /proc, /sys, /dev during SBOM metadata patching - Remove `apt-get autoremove -y` from Debian-based Dockerfiles to prevent nodejs from being removed as an auto-installed dependency Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 18:31:27 -08:00
yuneng-jiang	9e531195ec	Merge pull request #23057 from BerriAI/litellm_fix_user_filter_scope [Fix] User Filter Scope - Make Org-Scoping Opt-In	2026-03-07 18:22:26 -08:00
Ishaan Jaff	fc81edc4c4	revert: undo PR #22589 and follow-up vertex anyOf fixes (#23083 ) * Revert "fix(vertex): drop bare {} schemas from anyOf before adding nullable=True (#23060)" This reverts commit `3ad9a536d3`. * Revert "Merge pull request #22589 from Chesars/fix/vertex-preserve-any-type-schema" This reverts commit `da941e4261`, reversing changes made to `f77f28a5f8`.	2026-03-07 17:49:49 -08:00
yuneng-jiang	906288a1b2	Merge pull request #23065 from BerriAI/litellm_fix_team_scoped_virtual_keys fix: scoping virtual keys in the teams view to be applying the team filter	2026-03-07 17:43:25 -08:00
Ishaan Jaff	2b8db87a35	fix(pass_through): inject cost into Anthropic streaming chunks + fix SSE parsing in tests (#23078 ) streaming_handler.py: EndpointType.ANTHROPIC was missing from the cost injection block — only VERTEX_AI was handled, so Anthropic passthrough streaming never got cost injected into message_delta chunks even with include_cost_in_streaming_usage: true. test_anthropic_passthrough.py: AnthropicResponsesStreamWrapper yields full multi-line SSE frames as single bytes objects (e.g. "event: message_delta\ndata: {...}\n\n"). The tests were checking startswith('data: ') on the whole chunk, which starts with 'event:', so every message_delta event was silently skipped. Fix: split each chunk by \n before checking for the data: prefix. Also removes the @pytest.mark.skip added with wrong diagnosis on the OpenAI model test.	2026-03-07 17:27:51 -08:00
Ishaan Jaff	a30b71c946	fix(tests): generate square PNG in image_url fixture for DALL-E 2 variation test (#23073 ) DALL-E 2 create_variation requires a square PNG. The old fixture fetched the LiteLLM logo from S3 which is non-square, causing API rejections. Replace with a programmatically-generated 1024x1024 RGBA PNG via Pillow.	2026-03-07 16:58:27 -08:00
Ishaan Jaff	34984d22ae	fix(test): update openrouter image generation assertion for gemini-2.5-flash-image (#23070 ) * fix(anthropic/skills): remove ?beta=true query param from Skills API URLs Beta access is controlled via the anthropic-beta header (already set to skills-2025-10-02), not a URL query param. The spurious ?beta=true was causing 500 errors from Anthropic's server. * fix(test): update openrouter image generation assertion to accept any image format gemini-2.5-flash-image returns JPEG, not PNG. The assertion was hardcoded to png after the model was swapped from gemini-2.5-flash-image-preview (which returned PNG) in commit `34e8e972`.	2026-03-07 16:52:04 -08:00
Ishaan Jaff	66c822435e	fix(ci): image variation openai sdk 2.24.0 compat + swap bedrock nova-premier to nova-pro (#23066 ) * fix(ci): fix image variation test for openai sdk 2.24.0 and swap nova-premier to nova-pro image_gen_tests: openai==2.24.0 (bumped Feb 25) requires BytesIO objects to have a .name attribute for MIME type detection in multipart uploads. Add .name to the fixture so create_variation works. Also guard with OPENAI_API_KEY skipif. proxy_e2e_anthropic_messages_tests: nova-premier requires provisioned throughput not available via standard on-demand cross-region inference on the CI account. Swap to nova-pro which uses standard inference profiles. * fix: remove skipif, keep only .name fix for openai sdk compat	2026-03-07 16:41:54 -08:00
Ryan Crabbe	2cd0c767ee	fix: regression test	2026-03-07 16:40:29 -08:00
Ryan Crabbe	daf7c0c3a8	fix: scoping virtual keys in the teams view to be applying the team filter globally instead of an or branch	2026-03-07 16:23:12 -08:00
Ishaan Jaff	e8a7116899	fix(tests): fix repeating chunk and audio usage streaming tests (#23061 ) - Replace ModelResponse(stream=True) with ModelResponseStream in test_unit_test_custom_stream_wrapper_repeating_chunk — stream=True stores delta as a plain dict causing AttributeError in CustomStreamWrapper - Accept MidStreamFallbackError alongside InternalServerError in the repeating-chunk safety check assertion - Add @pytest.mark.flaky(retries=3) to the live OpenAI audio output usage test	2026-03-07 16:18:51 -08:00
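
Note on commit 2b8db87a35: the message says the stream wrapper yields whole multi-line SSE frames as single bytes objects, so a startswith('data: ') check on the full chunk never matches. A minimal sketch of the described fix (split each chunk by \n, then look for the data: prefix) might look like the following. The function name `extract_data_payloads` and the sample frame are hypothetical and are not taken from test_anthropic_passthrough.py.

```python
import json
from typing import Any, Dict, List


def extract_data_payloads(chunk: bytes) -> List[Dict[str, Any]]:
    """Return the decoded JSON payload of every `data:` line in one SSE frame.

    Hypothetical helper: illustrates the per-line check described in the
    commit message, not the actual test code.
    """
    payloads: List[Dict[str, Any]] = []
    # A frame may hold several lines ("event: ...", "data: ...", blank),
    # so inspect each line rather than the chunk as a whole.
    for line in chunk.decode("utf-8").split("\n"):
        if not line.startswith("data: "):
            continue
        payloads.append(json.loads(line[len("data: "):]))
    return payloads


# Made-up frame in the shape the commit message describes.
frame = (
    b'event: message_delta\n'
    b'data: {"type": "message_delta", "usage": {"output_tokens": 5}}\n\n'
)

# The whole-chunk check misses the event because the frame starts with "event:".
whole_chunk_matches = frame.decode("utf-8").startswith("data: ")

# The per-line check finds the payload.
payloads = extract_data_payloads(frame)
```

Checking line by line keeps the parser independent of how the wrapper groups lines into chunks.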

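
Note on commit df2e1bca46: the message describes routing by token format when both enable_jwt_auth and enable_oauth2_auth are True, with JWT tokens (3 dot-separated parts) going to the JWT handler and opaque tokens to the OAuth2 handler. A rough sketch of that decision, under the assumption that it reduces to a simple string check, is shown below. The names `looks_like_jwt` and `pick_auth_handler` and the returned labels are hypothetical and do not come from the LiteLLM proxy code.

```python
def looks_like_jwt(token: str) -> bool:
    """Format check only: a JWT has three dot-separated parts.

    This does not validate the signature or claims.
    """
    return len(token.split(".")) == 3


def pick_auth_handler(
    token: str, enable_jwt_auth: bool, enable_oauth2_auth: bool
) -> str:
    """Return a label for the handler that should process the token.

    The labels ("jwt", "oauth2", "default") are placeholders for
    illustration, not identifiers from the proxy.
    """
    if enable_jwt_auth and enable_oauth2_auth:
        # Both enabled: route by token format, as the commit describes.
        return "jwt" if looks_like_jwt(token) else "oauth2"
    # Only one method enabled: no format-based routing (simplified here).
    if enable_jwt_auth:
        return "jwt"
    if enable_oauth2_auth:
        return "oauth2"
    return "default"


jwt_route = pick_auth_handler("aaa.bbb.ccc", True, True)
opaque_route = pick_auth_handler("opaque-m2m-token", True, True)
```

The format check only selects a handler; the selected handler is still responsible for actually verifying the token.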

7349 Commits