Commit Graph

460 Commits

Author SHA1 Message Date
Ryan Crabbe 01ef723c3f Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix-ag-not-resolved 2026-05-01 17:21:29 -07:00
Ryan Crabbe bf4c250d86 fix: gate key access_group override on group's own assignment
Replaces the previous intersect-with-team.access_group_ids check, which
made the override unreachable in practice (the team-gate fallback already
covered every case the intersection allowed). The override now resolves
each of the key's access_group_ids via get_access_object and accepts the
group only if its assigned_team_ids includes the key's team_id, or its
assigned_key_ids includes the key's token. This fulfills the original ask
(a key can extend a team's allow-list via a group the admin granted to
that team or that specific key) while still rejecting foreign groups
referenced by team members of other teams.
2026-05-01 16:29:33 -07:00
Ryan Crabbe f17d779666 fix: scope key access_group_ids override by team's assigned groups
A team member could set any access_group_ids on their key (e.g. a group
assigned only to a different team) and override the team's model
restriction. Intersect the key's access_group_ids with team_object.access_group_ids
in _key_access_group_grants_model so foreign groups are dropped before
model expansion. Adds a regression test that asserts expansion is never
called for foreign groups.
2026-05-01 15:54:03 -07:00
Yuneng Jiang 650821b538 Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_fix-config-update-targeted-upserts
# Conflicts:
#	tests/test_litellm/proxy/test_proxy_server.py
2026-05-01 10:38:34 -07:00
harish-berri 7c8fe86fd9 Merge branch 'litellm_internal_staging' into litellm_token_verification_query_opt 2026-04-30 17:25:12 -07:00
yuneng-jiang 15b7386859 Merge pull request #26815 from stuxf/fix/get-image-lfi-ssrf
chore(proxy): contain UI_LOGO_PATH / LITELLM_FAVICON_URL on unauthenticated asset endpoints
2026-04-30 17:10:15 -07:00
yuneng-jiang 4ff8f0e901 Merge pull request #26851 from stuxf/codex/fix-callback-env-secret-resolution
chore(proxy): block env callback refs in key metadata
2026-04-30 13:11:32 -07:00
yuneng-jiang aa76ab2df7 Merge pull request #26862 from stuxf/codex/control-field-sanitization
chore(proxy): harden request control fields
2026-04-30 13:10:58 -07:00
Michael-RZ-Berri 9637d8c17b Merge pull request #26802 from BerriAI/litellm_lazyLoadedFrontPage
[Feat / Fix] Lazy loaded imports, lazy loaded front page
2026-04-30 13:04:42 -07:00
harish-berri 8df24b5413 Merge branch 'litellm_internal_staging' into litellm_token_verification_query_opt 2026-04-30 12:04:08 -07:00
user b67a81da47 test(proxy): align favicon remote asset expectations 2026-04-30 11:46:45 -07:00
user 215f538d4f fix(static-assets): browser-load remote branding assets 2026-04-30 11:30:57 -07:00
user f48dfdbdd9 fix(proxy): require opt in for audit header fallback 2026-04-30 11:17:04 -07:00
user db00e674e2 test(proxy): cover control field hardening branches 2026-04-29 23:57:50 -07:00
user 119c70b576 fix(proxy): gate delegated audit attribution 2026-04-29 23:10:14 -07:00
user 842eea0131 chore(proxy): harden request control fields 2026-04-29 22:35:17 -07:00
user 22c01adeb2 chore(proxy): ignore invalid callback metadata rows 2026-04-29 20:05:35 -07:00
user f2f1e3a0ba chore(proxy): block env callback refs in key metadata 2026-04-29 19:54:40 -07:00
Michael Riad Zaky de75cd777e test_proxy_routes: dedupe lazy force-load to match vector_store test pattern 2026-04-29 17:20:56 -07:00
Michael Riad Zaky 0f8dd28542 lazy-load optional feature routers on first request 2026-04-29 17:20:55 -07:00
user 75d1a0116e fix(static-assets): use async_safe_get; drop SVG; serve bytes inline on cache miss
Three review items addressed:

* **Veria (Medium): SSRF via redirect.** ``fetch_validated_image_bytes``
  was calling ``validate_url(url)`` once and then fetching with the
  default httpx client, so a 3xx to an internal IP would have been
  followed unvalidated. Switched to ``async_safe_get`` (the existing
  SSRF primitive used elsewhere in the codebase) which walks each
  redirect hop, re-validates, and rejects redirects to blocked
  networks. Default ``litellm.user_url_validation`` is True so
  protection is on out of the box.

* **Greptile (P2): SVG can embed JS.** Removed ``image/svg+xml`` from
  the allowed-Content-Type set. The hardcoded response media type
  (``image/jpeg`` / ``image/x-icon``) means a real SVG body wouldn't
  render as SVG anyway in modern browsers — the allowlist entry was
  giving up XSS surface for no actual SVG-rendering benefit. If real
  SVG support is wanted later, that's a deliberate feature PR with CSP
  / nosniff bundled.

* **Greptile (P2): cache-write OSError drops validated bytes.** When
  the upstream fetch succeeded but ``open(cache_path, "wb")`` raised
  (read-only assets dir), the bytes were discarded and the default
  logo was served — a silent regression for that deployment. Now
  serve the validated bytes inline via ``Response(...)`` as a fallback
  before falling back to default.

Tests:

- Replaced low-level mocks of ``validate_url`` with mocks of
  ``async_safe_get`` directly, exercising the helper's contract
  rather than the SSRF primitive's internals.
- New ``test_rejects_svg_content_type`` confirms SVG is blocked.
- ``test_get_image_cache_logic`` fixture now sets
  ``mock_response.is_redirect = False`` so ``async_safe_get`` doesn't
  treat the Mock's truthy attribute as a redirect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 21:57:22 +00:00
user 55d393d77d fix(static-assets): unblock CI — pass headers explicitly + harden + update legacy tests
Three CI failures from the previous push, all addressed:

* ``lint`` (mypy): ``async_client.get(url, **request_kwargs)`` confused
  mypy because ``AsyncHTTPHandler.get``'s second positional arg is typed
  ``bool | None``. Switched to an explicit branch:
  ``await async_client.get(rewritten_url, headers={"host": host_header})``
  for the HTTP-rewritten case, plain ``get(rewritten_url)`` otherwise.

* ``proxy-infra`` /
  ``test_get_image_custom_local_logo_bypasses_cache``: the existing
  test set ``UI_LOGO_PATH=/app/custom_logo.jpg`` with no
  ``LITELLM_ASSETS_PATH``, asserting the path was served verbatim. That
  was the LFI behaviour the new path-containment guard closes. Updated
  the test to set ``LITELLM_ASSETS_PATH=/app`` so the path is inside an
  allowed root, and patched the helper's ``realpath`` / ``isfile`` to
  go along with the mocked filesystem. Test intent (bypass cache when
  ``UI_LOGO_PATH`` is local) is preserved.

* ``auth-and-jwt`` / ``test_get_image_cache_logic``: existing test
  built a ``Mock`` response without ``headers``, so the new
  Content-Type check tripped on ``Mock().split(";")[0]``. Two fixes:

    1. Set ``mock_response.headers = {"content-type": "image/jpeg"}``
       on the test (matches the real upstream contract — a logo CDN
       always sets a Content-Type).
    2. Make ``fetch_validated_image_bytes`` defensive: if the
       Content-Type header is missing or non-string, treat as non-image
       and fall back to default. Closes a subtle hole — pre-fix, an
       upstream that omits Content-Type entirely would have served
       arbitrary bytes under the ``image/jpeg`` wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 21:47:41 +00:00
user bdb00c43cf fix(spend-tracking): drop orphaned imports; align tests with alias contract
CI surfaced two issues from the previous commit:

1. ``general_settings`` and ``master_key`` were still imported at the top
   of ``get_logging_payload`` but had no remaining users after the
   master-key hash-detection blocks were removed. Drop the import.

2. ``tests/proxy_unit_tests/test_user_api_key_auth.py::test_x_litellm_api_key``
   and ``tests/proxy_unit_tests/test_key_generate_prisma.py::test_master_key_hashing``
   asserted ``valid_token.token == hash_token(master_key)`` — the
   pre-alias behavior. The new contract is
   ``valid_token.token == LITELLM_PROXY_MASTER_KEY_ALIAS`` (and !=
   ``hash_token(master_key)``), since the master key (and its hash)
   must not propagate to the verification-token column or any other
   downstream consumer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:53:12 +00:00
harish-berri 1d62ca0e23 Merge branch 'litellm_internal_staging' into litellm_token_verification_query_opt 2026-04-28 17:34:17 -07:00
Krrish Dholakia fd32f29e39 Revert "lazy-load optional feature routers on first request (#26534)" (#26727)
This reverts commit 21ed38971d.
2026-04-29 00:21:41 +00:00
Michael-RZ-Berri 21ed38971d lazy-load optional feature routers on first request (#26534)
Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain>
2026-04-28 17:04:40 -07:00
harish-berri 84b6bd60af update test cases to match new behaviour. The earlier test cases assumed the cache stores a pydantic object 2026-04-28 21:08:46 +00:00
Yuneng Jiang abbe5d7f85 fix(proxy): /config/update writes only sent sections, drop store_model_in_db gate
The endpoint loaded the full merged YAML+DB config and re-saved every
top-level section to LiteLLM_Config rows via save_config(), so a UI toggle
of one field persisted unrelated YAML state to DB as a side effect. It
also rejected every request when store_model_in_db was False — including
the request that would flip the flag to True (chicken-and-egg).

Replace save_config with targeted per-section upserts: read the existing
litellm_config row, merge in the request, upsert just that row. Sections
the caller did not send are not touched. Drop the blanket
store_model_in_db guard — the endpoint already requires prisma_client,
and the startup-side override at proxy_server.py:6491 picks up
general_settings.store_model_in_db=True from the DB on next restart.
2026-04-27 14:59:33 -07:00
Yuneng Jiang 4884b0b611 Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr23
# Conflicts:
#	litellm/proxy/management_endpoints/key_management_endpoints.py
2026-04-25 09:47:47 -07:00
yuneng-jiang c05de83f1c Merge pull request #26490 from BerriAI/litellm_restrict_global_spend_routes
[Fix] Restrict /global/spend/* routes to admin roles
2026-04-25 09:30:36 -07:00
Ryan Crabbe 7eab549190 test: tighten prepare_key_update_data mock and apply black
- test_prepare_key_update_data: replace bare MagicMock with
  MagicMock(spec=LiteLLM_VerificationToken) and explicitly set
  existing_key_row.metadata = {}, so reserved-field reads return real
  values instead of MagicMock-returning-MagicMock. Fixes a regression
  surfaced by the new reserved-metadata preservation logic.
- test_key_management_endpoints.py: black-format-only changes from
  recent edits.
2026-04-25 09:08:55 -07:00
Yuneng Jiang 01eee0944c [Fix] Restrict /global/spend/* routes to admin roles
The routes in `global_spend_tracking_routes` (e.g. /global/spend/report,
/global/spend/teams, /global/spend/keys) return spend aggregated across
every team, customer, and api_key in the proxy. They were included in
`internal_user_routes` and `internal_user_view_only_routes`, so non-admin
roles could read proxy-wide spend.

Drop them from both non-admin route lists. PROXY_ADMIN and
PROXY_ADMIN_VIEW_ONLY access is preserved through their existing branches
in route_checks.py, and the `get_spend_routes` permission opt-in
continues to grant access for keys that need it.

Updates two pre-existing test parametrizations whose expected results
flip from True to False, and adds parametrized coverage over every
route in `global_spend_tracking_routes` for: PROXY_ADMIN_VIEW_ONLY
allowed, INTERNAL_USER blocked, INTERNAL_USER_VIEW_ONLY blocked,
INTERNAL_USER + get_spend_routes permission allowed.
2026-04-24 22:46:07 -07:00
Yuneng Jiang 000ce70127 Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_migration_projects
# Conflicts:
#	litellm/proxy/ui_crud_endpoints/proxy_setting_endpoints.py
#	uv.lock
2026-04-24 12:52:10 -07:00
user 3737d6a1f3 fix(auth): centralize common_checks to close authorization bypass
Multiple paths through _user_api_key_auth_builder returned a
UserAPIKeyAuth without running common_checks(): OAuth2 token validation,
OAuth2 proxy header hook, JWT admin shortcut, master_key path,
pass-through custom headers, the /user/auth route, and the
allow_requests_on_db_unavailable fallback. An operator-configured key
model-access list, max_budget, team_blocked flag, or team model scope
was therefore silently skipped on those paths. The HA-fallback token
was worse: it was a full proxy-admin synthetic, so a DB outage granted
full admin to every caller.

Fix three root causes (VERIA-18):

1. Centralize common_checks in the user_api_key_auth wrapper. The
   builder paths no longer call it; the wrapper runs it once after the
   builder returns, for every path. Introduces _run_centralized_common_checks
   which gathers team/user/project/end_user/global_spend context in
   parallel via asyncio.gather. Preserves the existing
   custom_auth_run_common_checks opt-out for custom-auth deployments.

2. Narrow is_database_connection_error — drop the blanket PrismaError
   catch that routed data-layer errors (UniqueViolationError, etc.)
   into the HA fallback. Only real connectivity failures plus the
   no_db_connection marker now qualify.

3. DB-unavailable fallback issues an INTERNAL_USER token with user_id
   DB_UNAVAILABLE_FALLBACK_USER_ID instead of proxy-admin. An outage
   can no longer escalate an anonymous caller.

JWT admin / master_key tokens still grant admin via a synthesized
admin user_object (so non_proxy_admin_allowed_routes_check in
common_checks recognizes them); other common_checks branches
(team_blocked, team_model_access) now apply uniformly.
2026-04-23 00:04:42 +00:00
Ryan Crabbe f92594f2c6 fix: honor key access_group_ids when team restricts models
Two model-access gates run per request in `common_checks` and they're
asymmetric: `can_key_call_model` falls back to the key's
`access_group_ids`, but `can_team_access_model` only looks at
`team.models` + `team.access_group_ids`. A key granted a model via its
own access group on a model-restricted team is silently denied at the
team gate.

Wrap `can_team_access_model` in try/except in `common_checks`: on
`team_model_access_denied`, consult a new `_key_access_group_grants_model`
helper that expands `valid_token.access_group_ids` via the existing
`_get_models_from_access_groups` and checks via `_can_object_call_model`.
Re-raise if the key's access groups don't grant the model. Any other
exception propagates unchanged.

Effect: request allowed if `team allows X` OR `key's access group
grants X`, making the two gates symmetric.

Test: add three unit tests for `_key_access_group_grants_model`
covering: group covers model, key has no groups, group resolves but
does not cover model.
2026-04-22 14:28:58 -07:00
Yuneng Jiang 4b3f5d7f81 [Fix] conftest: flush cache instances and warn on silent skips
Addresses review feedback on the snapshot approach:

1. Class-instance mutable state
   The snapshot only covers primitives + collections + None. Class
   instances (DualCache, LLMClientCache) weren't reset between tests,
   so in-place cache mutations could leak. Can't deepcopy these — they
   hold thread locks — but they expose flush_cache(). Collect every
   module attribute whose value implements flush_cache() at conftest
   import, and invoke it per-test alongside the snapshot restore.

2. Silent skips are now warnings
   _snapshot_mutable_state and _restore_mutable_state previously
   swallowed exceptions, so if a future attr gained a property without
   a setter (or other non-round-trippable state), an isolation gap
   would have no signal. Emit warnings.warn on each failure path.

3. Docstring
   Explicitly documents what IS and IS NOT reset, and tells authors to
   use monkeypatch.setattr() for in-place mutations of instances
   without flush_cache() (ProxyLogging, JWTHandler, etc.).
2026-04-20 22:19:36 -07:00
Yuneng Jiang 5411ebedae [Fix] conftest snapshot: also reset scalar module attributes
The previous snapshot only tracked list/dict/set values. Tests mutate
scalar module attrs too — master_key, premium_user, prisma_client — and
importlib.reload used to reset those implicitly. Under the snapshot
approach they were leaking between tests, so test_active_callbacks
failed in CI with "No api key passed in." once an earlier test left
master_key set to sk-1234.

Expand the snapshot to cover primitives (str/int/float/bool/bytes/tuple)
and None-valued attributes. Complex object instances are still skipped
to avoid deepcopy issues.
2026-04-20 21:03:07 -07:00
Yuneng Jiang ccf928361b [Infra] Speed up proxy unit tests by replacing litellm reload with state snapshot
tests/proxy_unit_tests/conftest.py was calling importlib.reload(litellm) in an
autouse function-scoped fixture, which cost ~17s per test because it re-ran
the full litellm __init__ import chain. With 400+ proxy unit tests, this was
the single biggest driver of CI wall time — 18 of the top 20 slowest durations
in a typical run were just the 17s fixture setup.

Replace the reload with a snapshot-and-restore approach: snapshot the mutable
lists/dicts/sets on litellm and litellm.proxy.proxy_server once at conftest
import, then deep-copy that snapshot back before each test. Callback lists,
caches, router state, etc. still get reset between tests, but the expensive
import chain only runs once per worker.

Local measurement on test_proxy_utils.py: 188 tests in 3.50s (previously took
~15 minutes of CI wall time on a single worker).
2026-04-20 17:59:05 -07:00
Yuneng Jiang 11c3270cdc Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr17
# Conflicts:
#	litellm/__init__.py
2026-04-17 17:36:40 -07:00
yuneng-jiang b9f5be8956 Merge pull request #25922 from BerriAI/litellm_a2a_agent_acl
[Fix] Agent endpoint and routing permission checks
2026-04-17 17:19:59 -07:00
ishaan-berri 1c128a86b8 Merge pull request #25256 from BerriAI/litellm_ishaan_april6
Litellm ishaan april6
2026-04-17 16:26:45 -07:00
Ishaan Jaffer e8461b5b97 style: run black formatter on files from main merge 2026-04-17 13:02:59 -07:00
Ishaan Jaffer f31d4faa87 Merge origin/main into litellm_ishaan_april6 2026-04-17 12:36:51 -07:00
Yuneng Jiang f8f356fac3 test: add parametrized tests for api_key value handling in credential check 2026-04-16 16:08:47 -07:00
user 9622864ef2 test: set allow_client_tags on duplicate-tag merge test
test_add_litellm_data_to_request_duplicate_tags tests the request/key
tag merge when tags overlap. The merge requires caller-supplied tags to
flow through — set allow_client_tags=True on the key so the merge path
stays testable under the new default-deny regime.
2026-04-16 22:50:54 +00:00
user 8526628a8f test: update tag-merge tests for default-deny client tag policy
Two pre-existing tests codified the pre-fix behavior where any caller-
supplied metadata.tags would flow through to spend logs and routing:

- test_add_key_or_team_level_spend_logs_metadata_to_request exercised
  the request/key/team tag merge. Set allow_client_tags=True on the key
  metadata so the merge path is still tested under the new regime.

- test_create_file_with_nested_litellm_metadata asserted that
  litellm_metadata[tags] form-data propagated to the handler. Drop the
  tag field; the test still proves nested form-parser correctness via
  spend_logs_metadata and environment.
2026-04-16 22:40:43 +00:00
Yuneng Jiang dafa1bf97c Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr15
# Conflicts:
#	litellm/litellm_core_utils/litellm_logging.py
#	uv.lock
2026-04-16 09:17:20 -07:00
Tim Ren dd4a41951f fix(utils): allowed_openai_params must not forward unset params as None (#25777)
* feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint (#25696)

* feat(proxy): add NO_OPENAPI env var to disable /openapi.json endpoint - Fixes #25538

* test(proxy): add tests for _get_openapi_url

---------

Co-authored-by: Progressive-engg <lov.kumari55@gmail.com>

* feat(prometheus): add api_provider label to spend metric (#25693)

* feat(prometheus): add api_provider label to spend metric

Add `api_provider` to `litellm_spend_metric` labels so users can
build Grafana dashboards that break down spend by cloud provider
(e.g. bedrock, anthropic, openai, azure, vertex_ai).

The `api_provider` label already exists in UserAPIKeyLabelValues and
is populated from `standard_logging_payload["custom_llm_provider"]`,
but was not included in the spend metric's label list.

* add api_provider to requests metric + add test

Address review feedback:
- Add api_provider to litellm_requests_metric too (same call-site as
  spend metric, keeps label sets in sync)
- Add test_api_provider_in_spend_and_requests_metrics following the
  existing pattern in test_prometheus_labels.py

* fix: ensure `litellm_metadata` is attached to `pre_call` guardrail to align with `post_call` guardrail (#25641)

* fix: ensure `litellm_metadata` is attached to pre_call to align with post_call

* refactor: remove unused BaseTranslation._ensure_litellm_metadata

* refactor: module level imports for ensure_litellm_metadata and CodeQL

* fix: update based off of Codex comment

* revert: undo usage of `_guardrail_litellm_metadata`

* feat: add pricing entry for openrouter/google/gemini-3.1-flash-lite-preview (#25610)

* fix(bedrock): skip synthetic tool injection for json_object with no schema (#25740)

When response_format={"type": "json_object"} is sent without a JSON
schema, _create_json_tool_call_for_response_format builds a tool with an
empty schema (properties: {}). The model follows the empty schema and
returns {} instead of the actual JSON the caller asked for.

This patch:
- Skips synthetic json_tool_call injection when no schema is provided.
  The model already returns JSON when the prompt asks for it.
- Fixes finish_reason: after _filter_json_mode_tools strips all
  synthetic tool calls, finish_reason stays "tool_calls" instead of
  "stop". Callers (like the OpenAI SDK) misinterpret this as a pending
  tool invocation.

json_schema requests with an explicit schema are unchanged.

Co-authored-by: Claude <noreply@anthropic.com>

* fix(utils): allowed_openai_params must not forward unset params as None

`_apply_openai_param_overrides` iterated `allowed_openai_params` and
unconditionally wrote `optional_params[param] = non_default_params.pop(param, None)`
for each entry. If the caller listed a param name but did not actually
send that param in the request, the pop returned `None` and `None` was
still written to `optional_params`. The openai SDK then rejected it as
a top-level kwarg:

    AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'

Reproducer (from #25697):

    allowed_openai_params = ["chat_template_kwargs", "enable_thinking"]
    body = {"chat_template_kwargs": {"enable_thinking": False}}

Here `enable_thinking` is only present nested inside
`chat_template_kwargs`, so the helper should forward
`chat_template_kwargs` and leave `enable_thinking` alone. Instead it
wrote `optional_params["enable_thinking"] = None`.

Fix: only forward a param if it was actually present in
`non_default_params`. Behavior is unchanged for the happy path (param
sent → still forwarded), and the explicit `None` leakage is gone.

Adds a regression test exercising the helper in isolation so the test
does not depend on any provider-specific `map_openai_params` plumbing.

Fixes #25697

---------

Co-authored-by: lovek629 <59618812+lovek629@users.noreply.github.com>
Co-authored-by: Progressive-engg <lov.kumari55@gmail.com>
Co-authored-by: Ori Kotek <ori.k@codium.ai>
Co-authored-by: Alexander Grattan <51346343+agrattan0820@users.noreply.github.com>
Co-authored-by: Mohana Siddhartha Chivukula <103447836+iamsiddhu3007@users.noreply.github.com>
Co-authored-by: Amiram Mizne <amiramm@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-04-16 19:04:26 +05:30
Yuneng Jiang eba43b5c04 Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_migration_projects 2026-04-15 10:14:06 -07:00
user f521e27371 test(gemini): align API key expectations 2026-04-14 23:28:13 +00:00