The dependency license checker only read the legacy free-text
`info.license` field from PyPI. Packages that adopt PEP 639 publish
their license as an SPDX expression in `info.license_expression` and
leave the legacy field null, so the checker reported "Unknown license"
and failed CI for every newly-bumped PEP 639 dependency.
`get_package_license_from_pypi` now resolves the license in order:
`license_expression`, then legacy `license`, then the
`License :: OSI Approved :: ...` trove classifiers.
`is_license_acceptable` splits compound SPDX expressions on the
uppercase OR/AND operators (case-sensitive, so the lowercase
`-or-later` inside an identifier is not mistaken for an operator) and
strips `WITH <exception>` suffixes, requiring every component to be
acceptable. Free-text license blobs are detected and fall back to the
original whole-string matching.
The `black` and `pydantic-settings` entries in liccheck.ini that
existed solely to work around this now resolve correctly on their own
and have been removed.
Split the monolithic LiteLLM proxy into independently scalable Kubernetes components to allow separate horizontal scaling of the LLM data plane and management API surfaces
- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component — Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist
Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
* feat(audio_transcription): add NVIDIA Riva STT provider
Adds nvidia_riva as a new audio transcription provider, supporting both
NVCF-hosted and self-hosted Riva ASR deployments via gRPC streaming.
- Auto-resamples input audio to 16 kHz mono LINEAR_PCM (soundfile + numpy,
audioread fallback) so callers can send any common format.
- Maps OpenAI params: language (en -> en-US), response_format (text/json/
verbose_json), timestamp_granularities=["word"] -> enable_word_time_offsets,
word offsets converted ms -> s for verbose_json.
- Auth: NVCF when nvcf_function_id is set (SSL on by default), self-hosted
otherwise (SSL off by default), with explicit use_ssl override.
- gRPC errors wrapped via NvidiaRivaException -> litellm exception classes.
- Optional deps gated behind [stt-nvidia-riva] extra (nvidia-riva-client,
soundfile, audioread, numpy).
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(nvidia_riva): address PR review feedback
- handler: forward call-level `timeout` to streaming_response_generator
(kwarg-detected via inspect for older riva-client compat) so a stalled
Riva server cannot block the caller indefinitely.
- audio_utils: spill bytes to a tempfile before audioread.audio_open;
most audioread backends (FFmpeg, GStreamer) require a real filesystem
path and previously raised TypeError on BytesIO, breaking the mp3/m4a
fallback path.
- audio_utils: prefer soxr / scipy.signal.resample_poly for resampling
(anti-aliased polyphase) when installed, falling back to linear only
as a last resort. Avoids aliasing on 44.1/48 kHz -> 16 kHz downsamples.
- transformation: bare `es` now maps to es-ES (Castilian) instead of
es-US, matching BCP-47 conventions.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: trigger CI re-run [stabilize loop 1/3]
* Update litellm/llms/nvidia_riva/audio_transcription/transformation.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* chore: trigger CI re-run [stabilize loop 1/3]
* fix code qa
* fix lint
* fix mypy
* fix mypy
* Fix NVIDIA Riva ASR service lookup
* Fix NVIDIA Riva transcription payload logging
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: oss-pr-review-agent-shin[bot] <281797381+oss-pr-review-agent-shin[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
CI's license check fails on the new dev dep because liccheck cannot read
the PEP 639 'License-Expression' field that pytest-recording uses. Add
the package to the manually-verified allowlist (MIT, confirmed via PyPI
classifier).
Also addresses greptile P2 review comments:
- Add 'anthropic-version' to the request-header filter list so live and
mock recordings produce structurally identical cassettes.
- Replace the indentation-sensitive regex in
'_strip_nondeterministic_headers' with a YAML parse-and-rewrite so the
helper keeps working if vcrpy ever changes its serialization style.
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
langgraph-prebuilt was previously pulled in as a transitive of langgraph
so PyPI license metadata was reported as unknown. Now that it is
explicitly pinned (==1.0.8) to avoid the broken 1.0.9 release, the
license checker flags it. It is published under MIT by the same
langchain-ai/langgraph repository as langgraph itself.
RestrictedPython (ZPL-2.1, a BSD-style permissive license) was added as
a dependency for the custom_code guardrail sandbox, but the license
checker didn't recognize it. Add to authorized packages list.
* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge
* fix: restore module import in test_check_migration after merge
The conflict resolution imported only the function but the test body
references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching
- Move google-generativeai, Pillow, tenacity back to ci group (they are
lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate uv.lock after removing nodejs-wheel-binaries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): use cache/restore instead of cache to prevent cache poisoning
The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert
The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv cache in publish workflow
Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): remove duplicate verbose_logger mock in test_check_migration
The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): free disk space before Docker build in test-server-root-path
The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
aioboto3 was listed as a dependency for async sagemaker calls but is not
imported anywhere in the codebase — async calls use httpx + botocore SigV4
instead. Removing it eliminates the unresolvable botocore version conflict
between boto3 and aiobotocore, along with all grep -v / --no-deps workarounds
across Dockerfiles and CI.
Also addresses Greptile review feedback: collapse redundant grpcio
python-version markers, bump pyproject.toml cryptography to 46.0.5 to
match Docker (GHSA-r6ph-v2qm-q3c2), and fix misleading .npmrc comment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both are transitive deps of aiobotocore, added to requirements.txt in
the previous commit. aioitertools is MIT, wrapt is BSD.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hf-xet is Apache 2.0 licensed but PyPI metadata doesn't expose the
license string, so the automated checker can't determine it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* build(README.md): initial commit adding a separate folder for additional proxy files. Meant to reduce size of core package
* build(litellm-proxy-extras/): new pip package for storing migration files
allows litellm proxy to use migration files, without adding them to core repo
* build(litellm-proxy-extras/): cleanup pyproject.toml
* build: move prisma migration files inside new proxy extras package
* build(run_migration.py): update script to write to correct folder
* build(proxy_cli.py): load in migration files from litellm-proxy-extras
Closes https://github.com/BerriAI/litellm/issues/9558
* build: add MIT license to litellm-proxy-extras
* test: update test
* fix: fix schema
* bump: version 0.1.0 → 0.1.1
* build(publish-proxy-extras.sh): add script for publishing new proxy-extras version
* build(liccheck.ini): add litellm-proxy-extras to authorized packages
* fix(litellm-proxy-extras/utils.py): move prisma migrate logic inside extra proxy pkg
easier since migrations folder already there
* build(pre-commit-config.yaml): add litellm_proxy_extras to ci tests
* docs(config_settings.md): document new env var
* build(pyproject.toml): bump relevant files when litellm-proxy-extras version changed
* build(pre-commit-config.yaml): run poetry check on litellm-proxy-extras as well