15 Commits

Author SHA1 Message Date
user 5ba6bc0784 chore(deps): bump uv to 0.11.7 + drop dead npm sed
- UV_IMAGE across all Dockerfiles: 0.10.9 -> 0.11.7.
- Loosen `required-version` in enterprise/ and litellm-proxy-extras/
  from strict `==0.10.9` to `>=0.10.9` so the new Docker image can
  build those workspace members. Matches the main pyproject range.
- Drop the `sed` block that rewrote tar/minimatch version ranges in
  npm's bundled package.json files. The override loop above already
  swaps the vendored directories on disk; npm doesn't re-resolve at
  runtime, so the sed was cosmetic.
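
To illustrate the `required-version` change: a strict `==0.10.9` pin rejects the 0.11.7 binary shipped in the new image, while `>=0.10.9` accepts it. The following is a minimal sketch, assuming plain dotted numeric versions. `parse` and `satisfies` are hypothetical helpers written for this example and are not uv's own version logic.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '0.11.7' into (0, 11, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed: str, constraint: str) -> bool:
    """Check an installed version against an '==X' or '>=X' constraint.

    Only the two operators mentioned in the commit are handled here.
    """
    for op in ("==", ">="):
        if constraint.startswith(op):
            required = parse(constraint[len(op):])
            current = parse(installed)
            return current == required if op == "==" else current >= required
    raise ValueError(f"unsupported constraint: {constraint!r}")
```

With these helpers, `satisfies("0.11.7", "==0.10.9")` is false and `satisfies("0.11.7", ">=0.10.9")` is true, which is why the strict pin had to be loosened before the image bump.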
2026-04-24 00:36:59 +00:00
stuxf a6c30b30bf build: migrate packaging, CI, and Docker from Poetry to uv (#25007)
* build: migrate packaging metadata to uv

* ci: move automation and local tooling to uv

* docker: migrate image builds and runtime setup to uv

* docs: update install and deployment guidance for uv

* chore: align auxiliary scripts and tests with uv

* test: harden test_litellm isolation

* fix: keep release and health check images self-contained

* build: pin uv tooling and health check deps

* test: isolate bedrock image request formatting from suite state

* test: cover sandbox executor requirements flow

* ci: fix circleci no-op command steps

* ci: fix circleci publish workflow parsing

* fix: stabilize remaining uv migration CI checks

* ci: increase matrix test timeout headroom

* fix: restore published docker and license coverage

* fix: restore proxy runtime build parity

* fix: restore proxy extras parity and venv migrations

* ci: persist uv path across circleci steps

* fix: keep psycopg binary in default test env

* docker: preserve prisma cache across stages

* test: run local proxy checks through uv python

* build: restore runtime deps moved into ci

* build: refresh uv lock after upstream merge

* fix: restore module import in test_check_migration after merge

The conflict resolution imported only the function but the test body
references check_migration as a module throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching

- Move google-generativeai, Pillow, tenacity back to ci group (they are
  lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
  in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
  from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
  environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
  deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate uv.lock after removing nodejs-wheel-binaries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): use cache/restore instead of cache to prevent cache poisoning

The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert

The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): disable setup-uv cache in publish workflow

Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate verbose_logger mock in test_check_migration

The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): free disk space before Docker build in test-server-root-path

The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:46:23 -07:00
ishaan-berri 61b295238b cherry-pick: tag query fix + MCP metadata support (#25145)
* added support for metadata (#24261)

* added support for metadata

* fix: PR review - meta truthiness, BlobResourceContents mimeType, add Blob+empty meta tests

Made-with: Cursor

* pyproject to .25

* feat(teams): resolve access group models/MCPs/agents in team endpoints

Add access_group_models, access_group_mcp_server_ids, and
access_group_agent_ids to /team/info and /v2/team/list responses.
These fields contain resources inherited from access groups, kept
separate from direct assignments so the UI can distinguish the source.

Backend: _resolve_access_group_resources() helper resolves access
group resources via existing _get_*_from_access_groups() functions.

UI: Teams table and detail view show direct models as blue badges
and access-group-sourced models as green badges.

* perf(teams): single-pass access group resolution + asyncio.gather in list endpoint

- Fetch each access group object once and extract all 3 resource fields
  in a single pass instead of 3 separate calls (3N → N lookups)
- Use asyncio.gather to resolve access groups across teams concurrently
  in list_team_v2 instead of sequential awaits
- Add 5 unit tests for _resolve_access_group_resources
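
The two optimizations above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `AccessGroup`, `FAKE_STORE`, `fetch_access_group`, `resolve_resources`, and `resolve_for_teams` are hypothetical names, and the in-memory dict stands in for the real access group lookup.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class AccessGroup:
    # Hypothetical shape: one object carrying all three resource lists.
    models: list[str] = field(default_factory=list)
    mcp_server_ids: list[str] = field(default_factory=list)
    agent_ids: list[str] = field(default_factory=list)


# Hypothetical in-memory store standing in for the real lookup.
FAKE_STORE = {
    "ag-1": AccessGroup(models=["model-a"], mcp_server_ids=["mcp-1"]),
    "ag-2": AccessGroup(models=["model-b"], agent_ids=["agent-1"]),
}
LOOKUP_LOG: list[str] = []  # records every lookup so the count is visible


async def fetch_access_group(group_id: str):
    LOOKUP_LOG.append(group_id)
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return FAKE_STORE.get(group_id)


async def resolve_resources(group_ids: list[str]) -> dict[str, list[str]]:
    """Fetch each group once and read all three fields in the same pass."""
    resolved = {"models": [], "mcp_server_ids": [], "agent_ids": []}
    for group_id in group_ids:
        group = await fetch_access_group(group_id)  # N lookups, not 3N
        if group is None:
            continue
        resolved["models"].extend(group.models)
        resolved["mcp_server_ids"].extend(group.mcp_server_ids)
        resolved["agent_ids"].extend(group.agent_ids)
    return resolved


async def resolve_for_teams(teams: dict[str, list[str]]) -> dict[str, dict]:
    """Resolve all teams concurrently instead of awaiting them one by one."""
    results = await asyncio.gather(
        *(resolve_resources(ids) for ids in teams.values())
    )
    return dict(zip(teams.keys(), results))
```

Calling `asyncio.run(resolve_for_teams({"team-a": ["ag-1"], "team-b": ["ag-1", "ag-2"]}))` performs three lookups in total, one per referenced group per team, rather than nine.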

* docs: add default_team_params to config reference and update examples

- Add default_team_params to litellm_settings reference table in
  config_settings.md with all sub-fields documented
- Update self_serve.md and msft_sso.md examples to include
  team_member_permissions, tpm_limit, and rpm_limit
- Fix misleading comment that implied default_team_params only applies
  to SSO auto-created teams — it applies to all /team/new calls

* docs: clarify that models sub-field only applies to SSO auto-created teams

* fix: lazy import get_access_object to break cyclic import + short-circuit all-proxy-models display

- Remove get_access_object from module-level import in team_endpoints.py
  and use a lazy _get_access_object wrapper to avoid cyclic dependency
- Add _prisma_client is None early-exit guard in _resolve_access_group_resources
- Short-circuit UI to show "All Proxy Models" when team.models is empty
  or contains "all-proxy-models", skipping access group model resolution

* add: make organizations a select instead of read-only badges

* fix(ui): only send organization_id when changed and use raw initial value

* fix(ui): add paginated team search to usage page filter

Replace the static team dropdown on the usage page with a new
TeamMultiSelect component that uses the paginated v2/team/list
endpoint with debounced server-side search and infinite scroll.

* fix(ui): fix imports and update placeholder for team multi select

* fix(ui): wire team_id filter to key alias dropdown on Virtual Keys tab

The Key Alias dropdown on the Virtual Keys page was showing aliases from
all teams regardless of which team was selected. The team_id was never
passed through the frontend chain to the backend /key/aliases endpoint.

- Backend: add optional team_id query param to /key/aliases endpoint
- networking.tsx: add team_id param to keyAliasesCall
- useKeyAliases: accept and forward team_id to API call and query key
- filter.tsx: pass allFilters context to custom filter components
- PaginatedKeyAliasSelect: read Team ID from allFilters and pass to hook

* fix(tests): correct mock targets in TestResolveAccessGroupResources

Three tests were patching the non-existent `get_access_object` instead
of `_get_access_object` (the lazy-import wrapper), causing AttributeError.
Also added missing `prisma_client` mock so tests get past the early-exit
guard and actually exercise the resolution logic.

* fix: use direct attribute access with or [] fallback in _resolve_access_group_resources

Replace getattr(ag, "field", []) with ag.field or [] for cleaner
access and safe handling if a field is None.
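
A short sketch of why the two forms differ: the `getattr` default is used only when the attribute is missing, not when it is present and set to `None`. The object and the field name `access_model_names` are hypothetical stand-ins for the access group record.

```python
from types import SimpleNamespace

# Hypothetical access group whose field exists but is None.
ag = SimpleNamespace(access_model_names=None)

# The default is ignored because the attribute exists, so None leaks through.
via_getattr = getattr(ag, "access_model_names", [])

# `or []` replaces any falsy value, including None, with an empty list.
via_or = ag.access_model_names or []
```

Here `via_getattr` is `None`, so a later loop or `extend` over it would raise, while `via_or` is an empty list.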

* fix(ui): remove model source legend from team detail view

The blue/green color distinction is self-explanatory; the legend added
visual clutter without providing enough value.

* fix(ui): add missing access_group fields to TeamData.team_info type

The TeamData interface was missing access_group_models,
access_group_mcp_server_ids, and access_group_agent_ids fields,
causing a TypeScript build failure.

* perf(teams): batch-fetch access groups in single DB query

Replace per-ID _resolve_access_group_resources loop with a single
find_many call that deduplicates IDs across all teams. Removes the
N+1 query pattern on cold cache for the team list endpoint.
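
The batching idea can be sketched as below, assuming access groups are keyed by ID. `collect_unique_ids`, `find_many`, and `resolve_models_for_teams` are hypothetical helpers for this example; `find_many` here is a plain dict lookup standing in for a single database query, not the Prisma client call.

```python
def collect_unique_ids(teams: dict[str, list[str]]) -> list[str]:
    """Deduplicate access group IDs across all teams, keeping first-seen order."""
    seen: dict[str, None] = {}
    for ids in teams.values():
        for group_id in ids:
            seen.setdefault(group_id, None)
    return list(seen)


def find_many(ids: list[str], table: dict[str, list[str]]) -> dict[str, list[str]]:
    """Hypothetical stand-in for one batched query returning rows by ID."""
    return {group_id: table[group_id] for group_id in ids if group_id in table}


def resolve_models_for_teams(
    teams: dict[str, list[str]], table: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Issue one batched fetch, then distribute the rows back to each team."""
    groups = find_many(collect_unique_ids(teams), table)
    return {
        team: [model for gid in ids if gid in groups for model in groups[gid]]
        for team, ids in teams.items()
    }
```

However many teams share an access group, each ID is requested once, and the per-team work after the fetch is pure in-memory lookup.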

* refactor(proxy): extract helpers to fix PLR0915 violations

Extract `_apply_non_admin_alias_scope` from `key_aliases`,
`_resolve_team_access_group_resources` from `team_info`, and
`_enforce_list_team_v2_access` from `list_team_v2` to bring each
function under ruff's 50-statement limit. No behavior changes.

* test(ui): update tests to match new team_id / access-group signatures

- useKeyAliases, PaginatedKeyAliasSelect: add trailing `undefined` to
  spy matchers for the new `team_id` param on `useInfiniteKeyAliases`
  and `keyAliasesCall`.
- EntityUsage: mock new `TeamMultiSelect` child so QueryClientProvider
  is not required for team-entity tests.
- ModelsCell: replace the overflow-accordion test with one that
  verifies the new collapse-on-`all-proxy-models` behavior (no
  accordion, single badge).

* fix(ui): send null (not '') for cleared organization_id on team update

AntD <Select allowClear> returns undefined when the user clears the
selection. Coalescing to "" caused the team-update payload to carry
organization_id: "" instead of null, relying on the backend to coerce
it. Send null directly so the intent is explicit at the source.

* poetry

* chore: regen poetry.lock for litellm-proxy-extras 0.4.64 bump

* chore: update Next.js build artifacts (2026-04-04 17:55 UTC, node v22.16.0)

---------

Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>

* Tag query fix (#25094)

* feat(tag-spend): implement separate scheduler job for daily tag spend updates

* fix(docker): add g++ to build dependencies in Dockerfile

* initial test cases. TODO: check scheduler init and test cases in proxy_server related to it

* resolved QPS issue when redis transaction buffer is enabled

* resolving circular import error flagged by greptile

* fix(mypy): use Optional[str] for api_base in PydanticAI provider to match superclass signature

---------

Co-authored-by: Shivam Rawat <shivam@berri.ai>
Co-authored-by: shivam <shivam@uni.minerva.edu>
Co-authored-by: Ryan Crabbe <ryan@berri.ai>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: Harish <harishgokul01@gmail.com>
Co-authored-by: Ishaan Jaffer <ishaan@berri.ai>
2026-04-04 16:44:02 -07:00
Yuneng Jiang 85f72c9d24 [Fix] Remove unused aioboto3 dependency and botocore conflict workarounds
aioboto3 was listed as a dependency for async sagemaker calls but is not
imported anywhere in the codebase — async calls use httpx + botocore SigV4
instead. Removing it eliminates the unresolvable botocore version conflict
between boto3 and aiobotocore, along with all grep -v / --no-deps workarounds
across Dockerfiles and CI.

Also addresses Greptile review feedback: collapse redundant grpcio
python-version markers, bump pyproject.toml cryptography to 46.0.5 to
match Docker (GHSA-r6ph-v2qm-q3c2), and fix misleading .npmrc comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:25:44 -07:00
Yuneng Jiang 821a634d25 [Fix] Handle boto3/aioboto3 botocore conflict across CI and Docker builds
boto3==1.42.80 and aioboto3==15.5.0 have incompatible botocore version
ranges. No aioboto3 release supports botocore 1.42.x yet. Both uv and
pip 26.0.1 reject the resolution.

Fix: filter aioboto3 out of requirements.txt at install time, then
install aioboto3+aiobotocore with --no-deps to bypass resolution.
Added wrapt and aioitertools to requirements.txt as pinned transitive
deps of aiobotocore (skipped by --no-deps). Fixed pip stdin handling
(/dev/stdin). Applied to all 5 Dockerfiles and all CircleCI install
paths.
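
The filtering step can be sketched in Python as follows. The commit itself describes a `grep -v` style filter whose exact invocation is not shown in this message, so this is only an equivalent illustration; `filter_requirements` is a hypothetical helper and the `wrapt` version in the usage note is a placeholder.

```python
import re


def filter_requirements(lines: list[str], excluded: set[str]) -> list[str]:
    """Drop requirement lines whose package name is in `excluded`.

    Comments and blank lines are kept unchanged.
    """
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        # The package name ends at the first specifier, extra, marker, or space.
        name = re.split(r"[=<>!~\[;\s]", stripped, maxsplit=1)[0].lower()
        if name not in excluded:
            kept.append(line)
    return kept
```

For example, `filter_requirements(["boto3==1.42.80", "aioboto3==15.5.0", "wrapt==0.0.0"], {"aioboto3"})` keeps the `boto3` and `wrapt` lines, after which the excluded package would be installed separately with `--no-deps`.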

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:27:21 -07:00
Yuneng Jiang 5f63873dca [Infra] Pin all Docker build dependencies to exact versions
Pin every dependency across all Docker builds so upgrades are intentional.
Verified by building all 3 production images and diffing pip freeze against
known-good v1.83.0-nightly baselines — zero version drift.
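
A check of this kind could look like the sketch below, assuming both inputs are plain `pip freeze` output with `name==version` lines. `parse_freeze` and `version_drift` are hypothetical helpers for this example and not the script used for the verification.

```python
def parse_freeze(text: str) -> dict[str, str]:
    """Map lower-cased package names to pinned versions from pip freeze output."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not exact pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def version_drift(baseline: str, candidate: str) -> dict[str, tuple]:
    """Return {package: (baseline_version, candidate_version)} for every mismatch.

    A package present on only one side shows up with None on the other.
    """
    base, cand = parse_freeze(baseline), parse_freeze(candidate)
    drift = {}
    for name in sorted(base.keys() | cand.keys()):
        if base.get(name) != cand.get(name):
            drift[name] = (base.get(name), cand.get(name))
    return drift
```

An empty result from `version_drift` corresponds to the "zero version drift" outcome described above.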

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:05:39 -07:00
yuneng-jiang d3587b1d8e fix: bump PyJWT to 2.12.0 in all Dockerfiles and tar to 7.5.11
All Dockerfiles were pinning PyJWT 2.9.0 (Dockerfile, Dockerfile.database,
Dockerfile.dev) or had a stale wheel build for 2.9.0 (Dockerfile.non_root).
Updated to 2.12.0 to match pyproject.toml. Also bumps tar to 7.5.11 in
Dockerfile.non_root for security.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:54:54 -07:00
yuneng-jiang 6a90596377 updating Dockerfile to tar 7.5.11 2026-03-13 11:16:17 -07:00
Krish Dholakia e7714f0ce6 Fix CVEs: bump tar/minimatch/pypdf + harden Docker SBOM patching (#23082)
* fix(docker): bump tar/minimatch/pypdf for CVE fixes + harden SBOM patching

- Bump tar 7.5.8→7.5.10, minimatch 10.2.1→10.2.4, pypdf 6.6.2→6.7.3
- Add sed-based SBOM metadata patching with properly indented find/sed
- Add npm package manager cleanup (apk del / apt-get purge) to remove
  stale SBOM entries from image scanners
- Scope || true to only apk del via brace grouping { ... || true; }
- Guard npm root -g with non-empty assertion to prevent silent failures
- Scope minimatch sed regex to ^10.x to avoid matching other major versions
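
The effect of scoping the pattern to the 10.x range can be sketched in Python. The actual sed expression is not shown in this message, so the pattern below is an assumption: it presumes the metadata contains entries shaped like `"minimatch": "^10.2.1"`, and `patch_minimatch` is a hypothetical helper.

```python
import re

# Matches only caret ranges on major version 10; other majors are left alone.
MINIMATCH_10 = re.compile(r'("minimatch":\s*")\^10\.\d+\.\d+(")')


def patch_minimatch(text: str, new_version: str = "10.2.4") -> str:
    """Rewrite a ^10.x.y minimatch range to the patched version."""
    return MINIMATCH_10.sub(
        lambda m: f"{m.group(1)}^{new_version}{m.group(2)}", text
    )
```

A line such as `"minimatch": "^10.2.1"` becomes `"minimatch": "^10.2.4"`, while `"minimatch": "^9.0.5"` passes through unchanged, which is the behavior the narrower regex is meant to guarantee.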

Addresses: CVE-2026-27903, CVE-2026-27904, GHSA-qffp-2rhf-9h96, CVE-2026-27888

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docker): scope find to /usr/local/lib /usr/lib, drop autoremove

- Replace `find /` with `find /usr/local/lib /usr/lib` to avoid
  traversing /proc, /sys, /dev during SBOM metadata patching
- Remove `apt-get autoremove -y` from Debian-based Dockerfiles to
  prevent nodejs from being removed as an auto-installed dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 18:31:27 -08:00
Harshit28j 3e6c10a071 security: fix critical/high CVEs in OS-level libs and NPM transitive 2026-02-24 19:40:09 +05:30
Harshit Jain 3b043ee8bf fix critical CVE vulnerabilities (#20683) 2026-02-07 22:23:01 -08:00
Ishaan Jaffer a002907389 fix tar security issue with TAR 2026-01-31 11:46:53 -08:00
Alexsander Hamir 1544e8f971 feat: Add line_profiler support for performance analysis and fix Windows CRLF issues in Docker builds (#18773) 2026-01-07 11:36:57 -08:00
Alexsander Hamir 454ffcd9c7 fix: install runtime node for prisma (#16410)
Prisma CLI recently started bootstrapping npm@10 inside the runtime image, which now fails with a sizeCalculation cache error on the slim Python base. Installing Debian's nodejs/npm (along with libatomic1) lets Prisma reuse the system binaries so prisma generate completes again.
2025-11-08 15:48:32 -08:00
Ishaan Jaff 209362664f add Dockerfile.dev 2025-06-03 12:03:52 -07:00