* feat(audio_transcription): add NVIDIA Riva STT provider
Adds nvidia_riva as a new audio transcription provider, supporting both
NVCF-hosted and self-hosted Riva ASR deployments via gRPC streaming.
- Auto-resamples input audio to 16 kHz mono LINEAR_PCM (soundfile + numpy,
audioread fallback) so callers can send any common format.
- Maps OpenAI params: language (en -> en-US), response_format (text/json/
verbose_json), timestamp_granularities=["word"] -> enable_word_time_offsets,
word offsets converted ms -> s for verbose_json.
- Auth: NVCF when nvcf_function_id is set (SSL on by default), self-hosted
otherwise (SSL off by default), with explicit use_ssl override.
- gRPC errors wrapped via NvidiaRivaException -> litellm exception classes.
- Optional deps gated behind [stt-nvidia-riva] extra (nvidia-riva-client,
soundfile, audioread, numpy).
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(nvidia_riva): address PR review feedback
- handler: forward call-level `timeout` to streaming_response_generator
(kwarg-detected via inspect for older riva-client compat) so a stalled
Riva server cannot block the caller indefinitely.
- audio_utils: spill bytes to a tempfile before audioread.audio_open;
most audioread backends (FFmpeg, GStreamer) require a real filesystem
path and previously raised TypeError on BytesIO, breaking the mp3/m4a
fallback path.
- audio_utils: prefer soxr / scipy.signal.resample_poly for resampling
(anti-aliased polyphase) when installed, falling back to linear only
as a last resort. Avoids aliasing on 44.1/48 kHz -> 16 kHz downsamples.
- transformation: bare `es` now maps to es-ES (Castilian) instead of
es-US, matching BCP-47 conventions.
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore: trigger CI re-run [stabilize loop 1/3]
* Update litellm/llms/nvidia_riva/audio_transcription/transformation.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* chore: trigger CI re-run [stabilize loop 1/3]
* fix code qa
* fix lint
* fix mypy
* fix mypy
* Fix NVIDIA Riva ASR service lookup
* Fix NVIDIA Riva transcription payload logging
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: oss-pr-review-agent-shin[bot] <281797381+oss-pr-review-agent-shin[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
CI's license check fails on the new dev dep because liccheck cannot read
the PEP 639 'License-Expression' field that pytest-recording uses. Add
the package to the manually-verified allowlist (MIT, confirmed via PyPI
classifier).
Also addresses greptile P2 review comments:
- Add 'anthropic-version' to the request-header filter list so live and
mock recordings produce structurally identical cassettes.
- Replace the indentation-sensitive regex in
'_strip_nondeterministic_headers' with a YAML parse-and-rewrite so the
helper keeps working if vcrpy ever changes its serialization style.
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
langgraph-prebuilt was previously pulled in as a transitive of langgraph
so PyPI license metadata was reported as unknown. Now that it is
explicitly pinned (==1.0.8) to avoid the broken 1.0.9 release, the
license checker flags it. It is published under MIT by the same
langchain-ai/langgraph repository as langgraph itself.
RestrictedPython (ZPL-2.1, a BSD-style permissive license) was added as
a dependency for the custom_code guardrail sandbox, but the license
checker didn't recognize it. Add to authorized packages list.
* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge
* fix: restore module import in test_check_migration after merge
The conflict resolution imported only the function but the test body
references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching
- Move google-generativeai, Pillow, tenacity back to ci group (they are
lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate uv.lock after removing nodejs-wheel-binaries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): use cache/restore instead of cache to prevent cache poisoning
The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert
The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv cache in publish workflow
Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): remove duplicate verbose_logger mock in test_check_migration
The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): free disk space before Docker build in test-server-root-path
The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
aioboto3 was listed as a dependency for async sagemaker calls but is not
imported anywhere in the codebase — async calls use httpx + botocore SigV4
instead. Removing it eliminates the unresolvable botocore version conflict
between boto3 and aiobotocore, along with all grep -v / --no-deps workarounds
across Dockerfiles and CI.
Also addresses Greptile review feedback: collapse redundant grpcio
python-version markers, bump pyproject.toml cryptography to 46.0.5 to
match Docker (GHSA-r6ph-v2qm-q3c2), and fix misleading .npmrc comment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both are transitive deps of aiobotocore, added to requirements.txt in
the previous commit. aioitertools is MIT, wrapt is BSD.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hf-xet is Apache 2.0 licensed but PyPI metadata doesn't expose the
license string, so the automated checker can't determine it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* build(README.md): initial commit adding a separate folder for additional proxy files. Meant to reduce size of core package
* build(litellm-proxy-extras/): new pip package for storing migration files
allows litellm proxy to use migration files, without adding them to core repo
* build(litellm-proxy-extras/): cleanup pyproject.toml
* build: move prisma migration files inside new proxy extras package
* build(run_migration.py): update script to write to correct folder
* build(proxy_cli.py): load in migration files from litellm-proxy-extras
Closes https://github.com/BerriAI/litellm/issues/9558
* build: add MIT license to litellm-proxy-extras
* test: update test
* fix: fix schema
* bump: version 0.1.0 → 0.1.1
* build(publish-proxy-extras.sh): add script for publishing new proxy-extras version
* build(liccheck.ini): add litellm-proxy-extras to authorized packages
* fix(litellm-proxy-extras/utils.py): move prisma migrate logic inside extra proxy pkg
easier since migrations folder already there
* build(pre-commit-config.yaml): add litellm_proxy_extras to ci tests
* docs(config_settings.md): document new env var
* build(pyproject.toml): bump relevant files when litellm-proxy-extras version changed
* build(pre-commit-config.yaml): run poetry check on litellm-proxy-extras as well