Commit Graph

230 Commits

Author SHA1 Message Date
Yuneng Jiang 08e29e0a9a [Infra] Automated schema.prisma sync and drift detection
Sync all 3 schema.prisma copies and add GHA workflows to keep them in sync automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:01:20 -07:00
yuneng-jiang bd2502eeaf [Feature] /v2/team/list: Add org admin access control, members_count, and indexes
Add org admin support to /v2/team/list so org admins can list teams
within their organizations instead of getting 401. Also enrich the
response with members_count and add missing indexes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 20:34:15 -07:00
Krrish Dholakia 57a48e3526 fix(agents.tsx): support granting agents access to subagents 2026-03-10 21:03:20 -07:00
Krish Dholakia cf439c269c Agents - add max budget + tpm/rpm limiting per agent AND per agent session (#22849)
* feat: enforce x-litellm-trace-id in header, if required

* feat: update spend for agent

* refactor: update agent table to follow similar format as other entities - also add a spend column - allows us to see spend of an agent

* fix: cleanup ui

* feat: return spend on agent endpoints

* feat: scope pr

* feat(agents/): support budgets + rate limiting on agents + agent sessions

* fix: address PR review feedback

- Add missing tpm_limit, rpm_limit, session_tpm_limit, session_rpm_limit
  columns to root schema.prisma to match proxy and extras schemas
- Add backwards-compatible fallback to key metadata for max_iterations
  so existing users don't silently lose enforcement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: qa'ed RPM limiting on agents

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 19:12:42 -08:00
yuneng-jiang 55f448abb8 bump: version 0.4.51 → 0.4.52 2026-03-06 23:39:08 -08:00
Ishaan Jaff 9a4bacd85d fix: add missing spec_path column to LiteLLM_MCPServerTable schema (#22820)
The OpenAPI-to-MCP feature (PR #21575) added spec_path to the code
(_types.py, mcp_server_manager.py) but missed adding the column to
the Prisma schema files. This causes "Could not find field spec_path"
errors when creating OpenAPI-based MCP servers via the UI or API.

Adds `spec_path String?` to LiteLLM_MCPServerTable in all three
schema files (root, litellm/proxy, litellm-proxy-extras).

Made-with: Cursor
2026-03-04 16:07:05 -08:00
Ishaan Jaff 1f412bc6d8 [Feat] Add Tool Policies for AI Gateway (#22732)
* fix: fix ui render

* fix: fix minor bugs

* refactor: use prisma functions instead of raw sql (safer)

* fix(add-new-tiles-to-tool-policies): allow developer to see what's available

* feat: ensure tool allowlist runs correctly for tool names + mcp's

* refactor: more ui improvements

* feat: working key tool blocking

* feat(tools): show tool logs

* refactor: backend code improvements

* refactor: improve log viewer for tools

* fix: address PR review feedback for tool access control

- Add missing blocked_tools column to root schema.prisma (schema drift)
- Invalidate ToolPolicyRegistry after policy mutations so changes take effect immediately
- Remove dead code: unused get_effective_policies, get_tool_policies_cached, and helpers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: race condition in permission resolution and remove duplicate allowlist check

- Use atomic update_many with object_permission_id=None to prevent concurrent
  requests from creating orphaned permission rows and losing tool blocks
- Remove duplicate allowed_tools enforcement from guardrail (already enforced
  in auth layer via check_tools_allowlist)
- Move inline uuid import to module level

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* update to account for userAgent

* UI - Add ToolDetails

* input/output policy

* LiteLLM_PolicyAttachmentTable

* LiteLLM_PolicyAttachmentTable

* fix: add _enqueue_tool_registry_upsert

* fix: tool mgmt endpoints

* tool mgmt endpoints

* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix: sync root schema.prisma and fix test_tool_registry_writer for input/output policy

- Migrate root schema.prisma LiteLLM_ToolTable from call_policy to
  input_policy/output_policy, add missing user_agent and last_used_at columns
  (now consistent with litellm/proxy/schema.prisma and litellm-proxy-extras)
- Fix SpendLogToolIndex comment across all three schema files
- Fix all call_policy references in test_tool_registry_writer.py:
  swapped update_tool_policy arguments, wrong get_tools_by_names return type
  assertions, _mock_tool_row setting call_policy instead of input_policy

Addresses Greptile review feedback on PR #22732.

Made-with: Cursor

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-03-03 20:22:20 -08:00
Krish Dholakia 67f90254ed feat(guardrails): team-based guardrail registration and approval workflow (#22459)
* feat(guardrails): team-based guardrail registration and approval workflow

Add team-based guardrail submission system where teams can register
Generic Guardrail API guardrails for admin review. Includes:

- POST /guardrails/register endpoint for team-scoped submissions
- Admin review endpoints (list/get/approve/reject submissions)
- Team Guardrails tab in the UI dashboard
- extra_headers support for forwarding client headers to guardrail APIs
- Prisma schema migration for status, submitted_at, reviewed_at fields
- Documentation for team-based guardrails and static/dynamic headers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(guardrails): address review feedback - SSRF, silent failure, redundant query

- Validate api_base URL scheme (http/https only) and hostname in
  register_guardrail to prevent SSRF via team submissions
- Return warning field in approve response when in-memory initialization
  fails so admins know the guardrail won't work until next sync cycle
- Eliminate redundant DB query in list_guardrail_submissions by fetching
  all team guardrails once and deriving both filtered list and summary
  counts from the single result set
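
As an illustration of the first point, a scheme and hostname check could look roughly like the sketch below. This is not the code from the PR: the helper name validate_api_base and the error messages are hypothetical, and the real check may validate the hostname more strictly than "is present".

```python
from urllib.parse import urlparse

# Only plain web schemes are accepted; file://, gopher://, etc. are rejected.
ALLOWED_SCHEMES = {"http", "https"}


def validate_api_base(api_base: str) -> str:
    """Hypothetical helper: reject URLs that are not http(s) or lack a hostname."""
    parsed = urlparse(api_base)
    if parsed.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"api_base must use http or https, got {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("api_base must include a hostname")
    return api_base
```

A scheme check alone does not stop requests to internal addresses, so a production check would likely also need to inspect the resolved host.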

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(guardrails): add pending_review status guard to reject endpoint

Prevent rejecting already-active or already-rejected guardrails, which
would create a DB/memory inconsistency (active in memory but rejected
in DB). Now mirrors the approve endpoint's status check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:06:49 -08:00
Ishaan Jaff 29e3fd5d79 [Release Fix] (#22411)
* fix(lint): suppress PLR0915 for 3 complex methods that exceed 50-statement limit

- streaming_iterator.py: _process_event (84 statements)
- transformation.py: translate_messages_to_responses_input (51 statements)
- transformation.py: transform_realtime_response (54 statements)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mypy): resolve type errors in public_endpoints, user_api_key_auth, common_utils, transformation

- public_endpoints.py: fix _cached_endpoints type annotation
- user_api_key_auth.py: accept Optional[str] for end_user_id parameter
- common_utils.py: add NewProjectRequest/UpdateProjectRequest to Union type
- transformation.py: add ChatCompletionRedactedThinkingBlock and list[Any] to content type

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(proxy-extras): bump version to 0.4.50 and sync schema

- Bump litellm-proxy-extras from 0.4.49 to 0.4.50
- Sync schema.prisma with main proxy schema
- Includes new LiteLLM_ClaudeCodePluginTable model
- Includes new @@index([startTime, request_id]) on SpendLogs
- Update version references in requirements.txt and pyproject.toml

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(router): use string id in test_add_deployment and add defensive str() in register_model

- Change test to use string '100' instead of int 100 for model_info.id
- Add str() conversion in register_model to prevent AttributeError on non-string keys

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): update minimatch to 10.2.4 to fix CVE-2026-27903 and CVE-2026-27904

- Run npm audit fix in docs/my-website
- Updates minimatch from 10.2.1 to 10.2.4 (fixes HIGH severity ReDoS vulnerabilities)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): update realtime guardrail test assertions to match actual guardrail behavior

- test_text_message_blocked_by_guardrail_no_ai_response: allow guardrail's own block
  message text in response.done (previously expected empty content)
- test_voice_transcript_blocked_by_guardrail: allow guardrail to send response.cancel
  + block message + response.create flow (previously expected no response.create)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: revert proxy-extras version in requirements.txt and pyproject.toml

The litellm-proxy-extras 0.4.50 is not published to PyPI yet, so consumer
references must stay at 0.4.49. Only the source package pyproject.toml
should be bumped to 0.4.50 for the publish_proxy_extras CI job.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: make transcript delta check optional in voice guardrail test

The guardrail sends an error event (guardrail_violation) when blocking
voice transcripts; it does not always produce transcript deltas. Remove
the assertion requiring response.audio_transcript.delta since the error
event is the primary signal that blocked content was handled.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add missing env keys to documentation: LITELLM_MAX_STREAMING_DURATION_SECONDS and LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES

These two environment variables were used in code but not documented in the
environment variables reference section of config_settings.md, causing the
test_env_keys.py CI test to fail.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix 13 mypy type errors across 6 files

- in_flight_requests_middleware.py: Fix type: ignore error codes from
  [union-attr] to [attr-defined], add [arg-type] for Gauge **kwargs
- transformation.py: Add [assignment] ignore for output_format reassignment,
  add fallback empty string for tool use id to fix arg-type
- responses/main.py: Remove redundant type annotation on second
  secret_fields assignment to fix no-redef
- streaming_iterator.py: Add [assignment] ignores for intermediate
  cache token assignments
- handler.py: Add [typeddict-item] ignore for AnthropicMessagesRequest
  construction from dict
- public_endpoints.py: Add [arg-type] ignore for _load_endpoints()
  return type mismatch with SupportedEndpoint model

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add auth overrides to spend tracking tests, fix realtime guardrail assertion, update UI minimatch

- Add app.dependency_overrides for user_api_key_auth in 4 spend tracking tests
  that were returning 401 Unauthorized (error_code, error_message,
  error_code_and_key_alias, key_hash)
- Fix realtime guardrail test to check ANY error event for guardrail_violation
  instead of just the first (OpenAI may send its own errors first)
- Update ui/litellm-dashboard/package-lock.json to fix minimatch vulnerability

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix failing MCP e2e and create_mcp_server UI tests

Test 1 (test_independent_clients_no_shared_session):
- Add allow_all_keys: true to MCP servers in test config. With master_key
  and no DB, get_allowed_mcp_servers returned empty, causing 0 tools and
  403 on tool calls. allow_all_keys bypasses per-key restrictions.
- Add asyncio.sleep(0.5) between client connections to allow MCP SDK
  TaskGroup cleanup and avoid ExceptionGroup on connection close (MCP #915).

Test 2 (create_mcp_server 'auth value is provided'):
- Use userEvent.setup({ delay: null }) for instant keystrokes to avoid
  timeout from default typing delay on CI.
- Increase per-test timeout to 15000ms for CI environments.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: stabilize proxy unit tests for parallel execution

- test_response_polling_handler: add xdist_group to prevent heavy import OOM
- test_db_schema_migration: use temp dir for worker isolation, sync schema.prisma index
- test_custom_tokenizer_bug: use lighter tokenizer to prevent OOM in parallel

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add auth overrides to more spend tracking and model info tests

- Fix test_ui_view_spend_logs_pagination missing auth override (401)
- Fix test_view_spend_tags missing auth override (401)
- Fix test_view_spend_tags_no_database missing auth override (401)
- Fix test_empty_model_list.py to use app.dependency_overrides instead of patch()
  for FastAPI dependency injection auth

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): use patch.object for aiohttp transport test to work in parallel execution

The @patch decorator was not intercepting the static method call in parallel
xdist workers. Using patch.object on the directly-imported class is more reliable.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): update minimatch from 10.2.1 to 10.2.4 in Dockerfile

The Docker image was explicitly pinning minimatch@10.2.1 which has HIGH
severity ReDoS vulnerabilities (GHSA-7r86-cg39-jmmj, GHSA-23c5-xmqv-rm74).
Update to 10.2.4 which includes fixes for both CVEs.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ui): prevent MCP and TeamInfo test timeouts on CI

- Add userEvent.setup({ delay: null }) to all tests using userEvent in both files
- Add timeout: 15000 to tests with significant user interaction (typing, multiple clicks)
- Fixes: create_mcp_server Bearer Token test, TeamInfo cancel button test

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: stabilize parallel test execution and aiohttp transport test

- test_aiohttp_handler: rewrite transport test to not rely on static method mock
  (consistently fails in parallel xdist workers)
- test_proxy_cli: add xdist_group to prevent timeout during heavy imports
- test_swagger_chat_completions: add xdist_group to prevent timeout

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(security): add serialize-javascript override to fix GHSA-5c6j-r48x-rmvq

Add npm override for serialize-javascript>=7.0.3 in docs/my-website
to fix HIGH severity RCE vulnerability via RegExp.flags.
Also bump minimatch override to >=10.2.4.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix flaky tests: remove broken Vertex model, add retries for Anthropic

- Remove vertex_ai/meta/llama-4-scout-17b-16e-instruct-maas from
  test_partner_models_httpx_streaming - consistently returns 400 BadRequest
- Add @pytest.mark.flaky(retries=6, delay=10) to test_function_call_parsing
  for transient Anthropic API overload errors
- Add @pytest.mark.flaky(retries=6, delay=10) to test_openai_stream_options_call
  for transient Anthropic InternalServerError

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): add xdist_group(proxy_heavy) to prevent OOM in parallel proxy tests

- Add pytestmark = pytest.mark.xdist_group('proxy_heavy') to test_proxy_utils.py
- Change test_db_schema_migration.py from schema_migration to proxy_heavy group
- Add @pytest.mark.xdist_group('proxy_heavy') to test_proxy_server.py::test_health

Groups heavy proxy tests to run on same worker, avoiding worker OOM crashes.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix vertex AI qwen global endpoint test to mock vertexai module import

The test_vertex_ai_qwen_global_endpoint_url test was failing because the
VertexAIPartnerModels.completion() method tries to 'import vertexai' before
any of the mocked code runs. In environments without google-cloud-aiplatform
installed, this import fails with a VertexAIError(status_code=400).

Fix by:
- Adding patch.dict('sys.modules', {'vertexai': MagicMock()}) to mock the
  vertexai module import
- Adding vertex_ai_location parameter to the acompletion call for completeness
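
The sys.modules technique can be sketched in isolation as follows. The module name fake_heavy_sdk and the function init_sdk are hypothetical stand-ins for vertexai and the code under test; only patch.dict and MagicMock come from the standard library.

```python
import sys
from unittest.mock import MagicMock, patch


def init_sdk(project: str) -> bool:
    # The import happens inside the function, as in the method described above,
    # so it fails at call time when the package is not installed.
    import fake_heavy_sdk  # hypothetical package name, assumed not installed

    fake_heavy_sdk.init(project=project)
    return True


def run_with_mocked_sdk() -> MagicMock:
    """Call init_sdk with the missing module temporarily replaced by a mock."""
    mock_sdk = MagicMock()
    # patch.dict restores sys.modules to its previous state on exit.
    with patch.dict(sys.modules, {"fake_heavy_sdk": mock_sdk}):
        assert init_sdk("demo-project") is True
    return mock_sdk
```

Because patch.dict undoes its changes on exit, the fake module does not leak into other tests running in the same process.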

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): add xdist_group to health endpoint and watsonx tests for parallel stability

- test_health_liveliness_endpoint: add xdist_group('proxy_health') to prevent timeout
- test_watsonx_gpt_oss tests: add xdist_group('watsonx_heavy') to prevent mock interference

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): pre-populate WatsonX IAM token cache to prevent parallel test interference

The watsonx prompt transformation test was failing in parallel execution because
litellm.module_level_client.post mock was being interfered with by other tests.
Pre-populating the IAM token cache avoids the HTTP call entirely.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add spend data polling with retries for e2e pass-through tests

- test_vertex_with_spend.test.js: Replace 15s fixed wait with polling loop
  (up to 6 attempts, 10s apart) for spend data to appear in DB
- Increase test timeout from 25s to 90s to accommodate polling
- base_anthropic_messages_tool_search_test.py: Add flaky(retries=3) for
  streaming test that depends on live Anthropic API
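
The polling approach can be sketched generically as below. The actual test is JavaScript; this is a Python illustration only, and poll_until with its parameters is a hypothetical helper, with defaults mirroring the 6 attempts and 10s spacing mentioned above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    attempts: int = 6,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it returns a non-None value or attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final attempt; there is nothing left to wait for.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None
```

Compared with a single fixed wait, the loop returns as soon as the data appears and only pays the full delay budget in the failure case.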

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): reduce parallel workers from 8 to 4 for proxy tests to prevent OOM

- litellm_proxy_unit_testing_part2: -n 8 -> -n 4
- litellm_mapped_tests_proxy_part2: -n 8 -> -n 4, timeout 60 -> 120
- Worker crashes consistently caused by too many parallel proxy tests
  each loading the full FastAPI app and heavy dependency tree

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(db): add migration for SpendLogs composite index (startTime, request_id)

The @@index([startTime, request_id]) was added to schema.prisma but had no
corresponding migration. This caused test_aaaasschema_migration_check to fail
because prisma migrate diff detected the missing index.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(db): add migration for MCP available_on_public_internet default change to true

The schema.prisma changed the default for available_on_public_internet from
false to true, but no migration was created. This caused the schema migration
test to detect drift.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): increase server wait time and add retry to flaky external API tests

- test_basic_python_version.py: increase server startup wait from 60s to 90s
  for slower CI environments (fixes installing_litellm_on_python_3_13)
- test_a2a_agent.py: add flaky(retries=3, delay=5) for non-streaming test
  that depends on live A2A agent endpoint

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add flaky retries to all intermittent external API tests for 0-fail CI

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(test): add auth overrides to file endpoint tests that return 500

The test_target_storage tests were getting 500 because the FastAPI auth
dependency wasn't overridden. Added app.dependency_overrides for proper
auth bypass in test environment.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-02-28 09:46:35 -08:00
Ishaan Jaff eea083fa4b fix(mcp): default available_on_public_internet to true (#22331)
* fix(mcp): default available_on_public_internet to true

MCPs were defaulting to private (available_on_public_internet=false) which
was a breaking change. This reverts the default to public (true) across:
- Pydantic models (AddMCPServerRequest, UpdateMCPServerRequest, LiteLLM_MCPServerTable)
- Prisma schema @default
- mcp_server_manager.py YAML config + DB loading fallbacks
- UI form initialValue and setFieldValue defaults

* fix(ui): add forceRender to Collapse.Panel so toggle defaults render correctly

Ant Design's Collapse.Panel lazy-renders children by default. Without
forceRender, the Form.Item for 'Available on Public Internet' isn't
mounted when the useEffect fires form.setFieldValue, causing the Switch
to visually show OFF even though the intended default is true.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mcp): update remaining schema copies and MCPServer type default to true

Missed in previous commit per Greptile review:
- schema.prisma (root)
- litellm-proxy-extras/litellm_proxy_extras/schema.prisma
- litellm/types/mcp_server/mcp_server_manager.py MCPServer class

* ui(mcp): reframe network access as 'Internal network only' restriction

Replace scary 'Available on Public Internet' toggle with 'Internal network only'
opt-in restriction. Toggle OFF (default) = all networks allowed. Toggle ON =
restricted to internal network only. Auth is always required either way.

- MCPPermissionManagement: new label/tooltip/description, invert display via
  getValueProps/getValueFromEvent so underlying available_on_public_internet
  value is unchanged
- mcp_server_view: 'Public' → 'All networks', 'Internal' → 'Internal only' (orange)
- mcp_server_columns: same badge updates

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-02-27 20:06:07 -08:00
Rahul Dhanawade 64c85dbc9f Fix/claude code plugin schema (#22271)
* fix: add missing LiteLLM_ClaudeCodePluginTable to schema.prisma

- Claude Code Plugin Marketplace endpoints (/claude-code/marketplace.json,
  /claude-code/plugins) were returning 500 errors because
  LiteLLM_ClaudeCodePluginTable model was missing from both schema.prisma files
- Prisma client was generated without this table causing AttributeError:
  'Prisma' object has no attribute 'litellm_claudecodeplugintable'
- Added missing model definition to root schema.prisma and
  litellm/proxy/schema.prisma

Fixes #21310

* test: add regression test for LiteLLM_ClaudeCodePluginTable schema

* fix: address greptile review - add @updatedAt, clean up test imports
2026-02-27 15:59:37 -08:00
yuneng-jiang ee7b73764c bump: version 0.4.48 → 0.4.49 2026-02-26 20:29:43 -08:00
yuneng-jiang 9d6f02e8b7 Merge remote-tracking branch 'origin' into litellm_spend_log_duration 2026-02-25 12:06:19 -08:00
Krish Dholakia 12c4876891 Agents - assign tools (#22064)
* feat(proxy): add max_iterations limiter for agent session loops (#22058)

Adds a new proxy hook that enforces a per-session cap on the number of
LLM calls an agentic loop can make. Callers send a session_id with each
request, and the hook counts calls per session, returning 429 when the
configured max_iterations limit is exceeded.

- Uses Redis Lua script for atomic increment (multi-instance safe)
- Falls back to in-memory cache when Redis unavailable
- Follows parallel_request_limiter_v3 pattern
- Configurable via key metadata: {"max_iterations": 25}
- Session counters auto-expire via TTL (default 1hr)
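
A minimal sketch of the in-memory fallback idea, assuming a single process: the class and method names are hypothetical, and this version is neither the Redis Lua path nor safe across multiple proxy instances.

```python
import time
from typing import Callable, Dict, Tuple


class SessionIterationLimiter:
    """Hypothetical per-session call counter with TTL-based expiry."""

    def __init__(
        self,
        max_iterations: int,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_iterations = max_iterations
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        # session_id -> (call count, expiry timestamp)
        self._counters: Dict[str, Tuple[int, float]] = {}

    def check_and_increment(self, session_id: str) -> bool:
        """Count one call; return False once the session exceeds its cap."""
        now = self._clock()
        count, expires_at = self._counters.get(
            session_id, (0, now + self.ttl_seconds)
        )
        if now >= expires_at:
            # Counter expired: start a fresh window for this session.
            count, expires_at = 0, now + self.ttl_seconds
        count += 1
        self._counters[session_id] = (count, expires_at)
        return count <= self.max_iterations
```

A caller would translate a False result into the 429 response described above.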

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add new code execution dataset

* feat(agent_endpoints/): allow giving agents keys

* fix: ui fixes

* feat: allow assigning mcp servers to agents

* fix: eliminate duplicate DB queries in MCP agent auth and N+1 in agent listing (#22110)

- Extract _get_agent_object_permission helper so _get_allowed_mcp_servers_for_agent
  and _get_agent_tool_permissions_for_server share a single DB fetch instead of
  each independently querying the same agent row (was 1+N queries per MCP request)
- Use include={"object_permission": True} on find_many in get_all_agents_from_db
  to eagerly load permissions in one query instead of N+1
- Use include={"object_permission": True} on create/update/find_unique in all
  agent CRUD operations, removing attach_object_permission_to_dict follow-up calls

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 11:44:30 -08:00
yuneng-jiang b78a30f773 [Feature] Add request_duration_ms to SpendLogs
Add a `request_duration_ms` column to `LiteLLM_SpendLogs` to track request
duration. New rows are computed at write time. Legacy rows use a COALESCE
fallback in the `/spend/logs/ui` query to compute duration on the fly from
`endTime - startTime`. The field is also sortable in the UI endpoint.
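
The fallback logic is equivalent to the following Python sketch. The real implementation does this in SQL with COALESCE; the function name effective_duration_ms is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def effective_duration_ms(
    request_duration_ms: Optional[int],
    start_time: datetime,
    end_time: datetime,
) -> int:
    """Prefer the stored duration; otherwise derive it from the timestamps."""
    if request_duration_ms is not None:
        return request_duration_ms
    # Legacy row: integer-divide the timedelta by 1 ms to get whole milliseconds.
    return (end_time - start_time) // timedelta(milliseconds=1)
```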

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-24 21:04:53 -08:00
Sameer Kankute 8decf04d8a Merge pull request #21877 from BerriAI/litellm_oss_staging_02_22_2026
Litellm oss staging 02 22 2026
2026-02-23 18:50:47 +05:30
Sameer Kankute 4934d89cc7 Merge pull request #21872 from BerriAI/litellm_dev_02_21_2026_p4
Litellm dev 02 19 2026 p2 (#21871)
2026-02-23 18:35:57 +05:30
Ephrim Stanley 7b5dc3fb9c State management fixes for CheckBatchCost 2026-02-23 07:16:25 -05:00
Krish Dholakia 76ccc9e844 Guardrail Policy Versioning (#21862)
* feat: initial commit, adding support for policy versioning on litellm

* fix(policy_registry): support policy versioning

* fix: multiple QA fixes for policy flow builder with guardrail versioning on litellm

* feat: ui improvements

* feat: add prisma migration

* fix: address greptile fixes
2026-02-21 20:14:31 -08:00
Krish Dholakia 886f1a3472 Litellm dev 02 19 2026 p2 (#21871)
* feat(ui/): new guardrails monitor 'demo


mock representation of what guardrails monitor looks like

* fix: ui updates

* style(ui/): fix styling

* feat: enable running ai monitor on individual guardrails

* feat: add backend logic for guardrail monitoring
2026-02-21 19:14:04 -08:00
yuneng-jiang f7fb4a270f Merge remote-tracking branch 'origin' into litellm_usage_perf_fix 2026-02-20 15:37:56 -08:00
Julio Quinteros Pro 81faad5d0d fix(tests): skip prisma DB test and sync root schema.prisma with spec_path field
- Add @pytest.mark.skip to test_create_audit_log_in_db which requires
  a live Prisma/PostgreSQL DB connection unavailable in CI
- Sync root schema.prisma with litellm/proxy/schema.prisma by adding
  the spec_path field to LiteLLM_MCPServerTable, fixing
  test_aaaasschema_migration_check which detected this drift

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 12:29:53 -03:00
yuneng-jiang e2e698944a perf: use SQL GROUP BY for aggregated daily activity endpoints
Replace find_many + Python-side aggregation with a single SQL GROUP BY
query via query_raw in get_daily_activity_aggregated. This collapses
rows across entities (users/teams/orgs) in the database, reducing ~150k
rows to ~2-3k grouped rows before transfer to Python.

Also adds composite indexes (entity_id, date) to all 6 daily spend
tables for faster filtered queries.
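
The shape of the change can be sketched with sqlite3 as a stand-in for the Postgres/Prisma query_raw call. The table name daily_spend, its columns, and the function name are hypothetical; only the GROUP BY idea and the (entity_id, date) index come from the description above.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with a few per-entity daily rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_spend ("
        "entity_id TEXT, date TEXT, model TEXT, spend REAL, api_requests INTEGER)"
    )
    # Composite index for queries that filter by entity and date.
    conn.execute("CREATE INDEX idx_entity_date ON daily_spend (entity_id, date)")
    conn.executemany(
        "INSERT INTO daily_spend VALUES (?, ?, ?, ?, ?)",
        [
            ("user-1", "2026-02-18", "gpt-4o", 1.0, 10),
            ("user-2", "2026-02-18", "gpt-4o", 2.0, 5),
            ("team-1", "2026-02-18", "claude", 0.5, 1),
            ("user-1", "2026-02-19", "gpt-4o", 4.0, 2),
        ],
    )
    return conn


def aggregated_daily_activity(conn: sqlite3.Connection, start: str, end: str):
    """Collapse rows across entities in the database, not in Python."""
    cur = conn.execute(
        "SELECT date, model, SUM(spend), SUM(api_requests) "
        "FROM daily_spend WHERE date BETWEEN ? AND ? "
        "GROUP BY date, model ORDER BY date, model",
        (start, end),
    )
    return cur.fetchall()
```

Only the grouped rows cross the database boundary, which is where the row-count reduction described above comes from.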

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 14:36:28 -08:00
yuneng-jiang 2eded1dda6 bump: version 0.4.43 → 0.4.44 2026-02-19 10:50:23 -08:00
yuneng-jiang c911cfbabf Merge remote-tracking branch 'origin' into litellm_key_last_active_tracking 2026-02-19 10:27:48 -08:00
yuneng-jiang 6097905e55 [Feature] Track key last active timestamp
Virtual keys only track created_at and updated_at, which don't indicate
when a key was last used. This adds a last_active field that gets updated
during the async batch spend update, giving admins visibility into which
keys are actively being used.

Changes:
- Add last_active DateTime? to VerificationToken and
  DeletedVerificationToken in all 3 schema files and Python types
- Set last_active in the batch key spend update alongside spend increment
- Add Last Active column to virtual keys UI table with info popover
  and hover tooltip showing full date/time with timezone

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-18 23:11:58 -08:00
Harshit Jain 66ce7513f6 Merge branch 'main' into litellm_project_management_apis 2026-02-19 08:40:12 +05:30
Krish Dholakia e00c181f0c Mcp user permissions (#21462)
* feat(schema.prisma): add object permissions for end users

allows controlling if end user can call specific mcp servers

* feat: cleanup for customer_endpoints support of object permission id

* fix: cleanup str

* feat(customers/): enforce end user can only call allowed mcps - if configured

* docs: document customer/end user object permission usage

* feat: enforce end user permissions on MCP tool calls

This commit implements end user permission enforcement for MCP servers:

1. Always add server prefixes to MCP tool names
   - Removed conditional logic that only added prefixes when multiple servers existed
   - Now always adds server prefix for consistent tool naming across all scenarios
   - Updated 5 locations in server.py (list_tools, get_prompts, get_resources,
     get_resource_templates, get_prompt)

2. Created MCP End User Permission Guardrail Hook
   - New guardrail hook: litellm/proxy/guardrails/guardrail_hooks/mcp_end_user_permission.py
   - Runs on post_call to validate tool calls in LLM responses
   - Extracts MCP server name from tool names (splits on first '-')
   - Checks if end_user_id has permissions for the MCP server
   - Raises GuardrailRaisedException if end user lacks permission
   - Supports both streaming and non-streaming responses

3. Added comprehensive tests
   - Test file: tests/test_litellm/proxy/guardrails/guardrail_hooks/test_mcp_end_user_permission.py
   - Tests cover: authorized/unauthorized tools, non-MCP tools, no end_user scenarios
   - Tests permission checking logic and exception raising

The hook integrates with the existing MCPRequestHandler._get_allowed_mcp_servers_for_end_user
to fetch end user permissions and enforce access control at the response level.
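
The prefix-based check in step 2 can be sketched as follows. Both function names are hypothetical, and treating an un-prefixed name as a non-MCP tool is an assumption made for this illustration, not something the commit states.

```python
from typing import Optional, Set, Tuple


def split_prefixed_tool_name(tool_name: str) -> Tuple[Optional[str], str]:
    """Split 'server-tool' on the first '-' only; later hyphens stay in the tool."""
    server, sep, tool = tool_name.partition("-")
    if not sep:
        # No separator: assumed here to be a non-MCP tool.
        return None, tool_name
    return server, tool


def is_tool_call_allowed(tool_name: str, allowed_servers: Set[str]) -> bool:
    """Allow non-MCP tools; allow MCP tools only for permitted servers."""
    server, _tool = split_prefixed_tool_name(tool_name)
    if server is None:
        return True
    # Limitation of this sketch: a non-MCP tool whose name contains a hyphen
    # is indistinguishable from a prefixed MCP tool.
    return server in allowed_servers
```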

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: remove redundant add_prefix variable assignments

Simplified the code by removing intermediate `add_prefix` variable
assignments and passing `True` directly to function calls since
we now always add server prefixes.

Changes:
- Removed `add_prefix = True` variable assignments in 5 locations
- Changed `add_prefix=add_prefix` to `add_prefix=True` in function calls
- Added inline comments to clarify the behavior

This makes the code more concise and clearer in intent.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat(auth_utils.py): support safety_identifier as a valid way of passing the end user id for responses api

* feat(llms): ensure 'tools' is correctly updated for responses api

* fix: fix greptile feedback

* feat: transformation.py

proper responses api tool handling for guardrail translation layer

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 18:53:59 -08:00
Krish Dholakia 14a35a12cb Revert "End users - Allow giving end users access to specific mcp servers (#…" (#21461)
This reverts commit 1f521be0f2.
2026-02-17 22:49:57 -08:00
Krish Dholakia 1f521be0f2 End users - Allow giving end users access to specific mcp servers (#21411)
* feat(schema.prisma): add object permissions for end users

allows controlling if end user can call specific mcp servers

* feat: cleanup for customer_endpoints support of object permission id

* fix: cleanup str

* feat(customers/): enforce end user can only call allowed mcps - if configured

* docs: document customer/end user object permission usage

* feat: address greptile comments
2026-02-17 22:45:49 -08:00
Harshit Jain 56b53cb502 Merge branch 'main' into litellm_fix-virtual-key-grace-period 2026-02-15 08:09:13 +05:30
Ishaan Jaff eb432bf911 Litellm notes 181 12 (#21231)
* docs

* docs

* docs

* docs

* fix SCHEMA

* LiteLLM_PolicyTable

* litellm-proxy-extras = {version = "0.4.39",
2026-02-14 16:39:22 -08:00
Ishaan Jaffer 54bd5632d1 new schema 2026-02-14 09:56:45 -08:00
yuneng-jiang 72848f4c08 change to model name for backwards compat 2026-02-13 16:44:49 -08:00
Harshit Jain 673b7d1fea Merge branch 'main' into litellm_fix-virtual-key-grace-period 2026-02-13 18:14:47 +05:30
Harshit Jain f77fbefc22 fix: resolve conflicts with verification e2e 2026-02-13 06:11:03 +05:30
yuneng-jiang fbfaa6c8af rename unified access group to access group 2026-02-12 12:30:22 -08:00
yuneng-jiang 658a283be3 unified access groups v0 2026-02-11 20:44:30 -08:00
Ishaan Jaff f83620157e [Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on (#20904)
* init schema with TAGS

* ui: add policy test

* resolvePoliciesCall

* add_policy_sources_to_metadata + headers

* types Policy

* preview Impact

* def _describe_match_reason(

* match based on TAGs

* TestTagBasedAttachments

* test fixes

* add policy_resolve_router

* add_guardrails_from_policy_engine

* TestMatchAttribution

* refactor

* fix

* fix: address Greptile review feedback on policy resolve endpoints

- Track unnamed keys/teams as separate counts instead of inflating
  affected_keys_count with duplicate "(unnamed key)" placeholders.
  Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
  converts exact patterns to Prisma "in" and suffix wildcards to
  "startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
  force_sync query param (default false) to avoid 2 DB round-trips
  on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
  time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.
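
The alias-pattern change above can be sketched as follows. This is a minimal illustration, not the repository's `_build_alias_where`: the function name `build_alias_where`, its signature, and the `OR` wrapping are assumptions. Only the mapping of exact patterns to a Prisma `in` filter and of suffix wildcards to `startsWith` comes from the commit message.

```python
from typing import Any, Dict, List, Optional


def build_alias_where(patterns: List[str], field: str) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for _build_alias_where (assumed signature).

    Exact patterns are grouped into one Prisma-style "in" filter; patterns
    ending in "*" become "startsWith" filters on the text before the "*".
    A "*" anywhere else is NOT interpreted here and is treated as exact.
    """
    exact = [p for p in patterns if not p.endswith("*")]
    prefixes = [p[:-1] for p in patterns if p.endswith("*")]

    conditions: List[Dict[str, Any]] = []
    if exact:
        conditions.append({field: {"in": exact}})
    for prefix in prefixes:
        conditions.append({field: {"startsWith": prefix}})

    if not conditions:
        return None  # nothing to filter on
    if len(conditions) == 1:
        return conditions[0]
    return {"OR": conditions}  # a row matches if any condition matches


where = build_alias_where(["prod-key", "dev-*"], "key_alias")
```

Here `where` would be passed as the `where` argument of a query, so matching runs in the database rather than over rows loaded into Python.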

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: eliminate duplicate DB queries and fix header delimiter ambiguity

- Fetch teams table once in estimate_attachment_impact and reuse for
  both tag-based and alias-based lookups (was querying teams twice when
  both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
  filters that operate on pre-fetched data (_filter_keys_by_tags,
  _filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
  as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
  (e.g. "tag:healthcare+team:health-team") to avoid conflict with
  header delimiters.
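
The delimiter choice above can be illustrated with a short sketch. The `name=matched_via` entry layout and both helper names are assumptions made for illustration. Only the `'; '` entry delimiter and the `'+'` separator inside `matched_via` come from the commit message.

```python
from typing import Dict, List


def format_policy_sources(sources: Dict[str, List[str]]) -> str:
    """Hypothetical helper: build an x-litellm-policy-sources style value."""
    entries = []
    for policy_name, reasons in sources.items():
        matched_via = "+".join(reasons)  # '+' joins reasons within one entry
        entries.append(f"{policy_name}={matched_via}")  # assumed entry layout
    return "; ".join(entries)  # '; ' separates entries


def parse_policy_sources(header_value: str) -> Dict[str, List[str]]:
    """Hypothetical inverse of format_policy_sources."""
    parsed: Dict[str, List[str]] = {}
    for entry in header_value.split("; "):
        if not entry:
            continue
        policy_name, _, matched_via = entry.partition("=")
        parsed[policy_name] = matched_via.split("+")
    return parsed


header = format_policy_sources(
    {"hipaa": ["tag:healthcare", "team:health-team"], "pii": ["key:a,b"]}
)
```

Because neither delimiter is a comma, a reason such as `key:a,b` survives a round trip through `parse_policy_sources` unchanged.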

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update litellm/proxy/policy_engine/policy_resolve_endpoints.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-10 17:50:37 -08:00
Carlo Alberto Ferraris 248fe65736 add missing indexes on VerificationToken table 2026-02-09 16:59:28 +09:00
yuneng-jiang 87a75900a1 adding soft_budget to deleted teams table 2026-02-07 11:05:42 -08:00
Ishaan Jaff 9b1ccc0608 [Feat] IP-Based Access Control for MCP Servers (#20620)
* update MCPAuthenticatedUser

* add available_on_public_internet for MCPs

* update claude.md

* init IPAddressUtils

* init available_on_public_internet

* add on REST endpoints

* filter with IP

* TestIsInternalIp

* _extract_mcp_headers_from_request

* init get_mcp_client_ip

* _get_general_settings

* allowed_server_ids

* address PR comments

* get_mcp_server_by_name fix

* fix server

* fix review comments

* get_public_mcp_servers

* address _get_allowed_mcp_servers

* test fix

* fix linting

* init ui types

* add ui for managing MCP private/public

* add ui

* fixes

* add to schema

* add types

* fix endpoint

* add endpoint

* update manager

* test mcp

* don't use external party for ip address
2026-02-06 17:58:24 -08:00
yuneng-jiang a17efa1c8e Add soft_budget to team table and create update endpoints 2026-02-05 14:43:48 -08:00
Sameer Kankute 3c12dda856 bump: litellm-proxy-extras 0.4.29 → 0.4.30
- Add allow_team_guardrail_config field to TeamTable and DeletedTeamTable
- Add migration 20260205091235_allow_team_guardrail_config
2026-02-05 09:21:20 +05:30
Sameer Kankute fae0554fdc Revert "add missing indexes on VerificationToken table (#20040)"
This reverts commit 1e8848ca97.
2026-02-03 15:01:28 +05:30
Harshit Jain 768f9a44b2 fix: virtual key grace period from env/UI 2026-02-03 09:50:10 +05:30
Carlo Alberto Ferraris 1e8848ca97 add missing indexes on VerificationToken table (#20040) 2026-02-02 18:22:15 +05:30
Ishaan Jaffer 47efa33f0b sync: generator client 2026-01-31 15:07:28 -08:00
Ishaan Jaff 9c5fed4f52 [Feat] LiteLLM Vector Stores - Add permission management for users, teams (#19972)
* fix: create_vector_store_in_db

* add team/user to LiteLLM_ManagedVectorStore

* add _check_vector_store_access

* add new fields

* test_check_vector_store_access

* add vector_store/list endpoints

* fix code QA checks
2026-01-28 18:55:40 -08:00
Ishaan Jaffer 9e02a38002 add schema.prisma 2026-01-23 13:16:58 -08:00