Adds a CI job that rebuilds the admin UI from source and fails if the
committed static export at litellm/proxy/_experimental/out/ has drifted
from what npm run build produces. This prevents silently shipping stale
UI bytes and is a prerequisite for the non_root Dockerfile streamlining
work, which will stage the UI from _experimental/out/ directly instead
of rebuilding it inside the image.
Also regenerates litellm/proxy/_experimental/out/ to match a fresh
npm run build (Node 20.20.2) — the committed tree had drifted from
source prior to this commit.
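
A minimal sketch of the drift check described above, assuming the job
compares the freshly built tree against the committed one file by file.
The helper names (tree_digest, find_drift) and the hash-based comparison
are illustrative assumptions, not the workflow's actual implementation:

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_drift(committed: Path, rebuilt: Path) -> list[str]:
    """Return sorted relative paths that are missing, extra, or different."""
    old, new = tree_digest(committed), tree_digest(rebuilt)
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))
```

A CI step built on this would exit non-zero whenever find_drift returns a
non-empty list.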
Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
The required test-unit-* and related workflows triggered only on PRs
targeting main, so feature PRs routed through litellm_internal_staging or
litellm_oss_branch never dispatched the full suite. The required checks
never reported, so branch protection showed BLOCKED even when CircleCI
was green.
Expand pull_request and push branch filters to also match
litellm_internal_staging, litellm_oss_branch, and "litellm_**" (using **
so branch names containing "/" also match).
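
A minimal sketch of why ** is needed, assuming GitHub's documented filter
semantics in which * does not match "/" and ** does. The translation
below handles only literals, * and **, and the function names are
hypothetical:

```python
import re


def filter_to_regex(pattern: str) -> re.Pattern:
    """Translate a branch filter made of literals, * and ** into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")  # ** crosses "/" boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # * stops at "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def branch_matches(branch: str, patterns: list[str]) -> bool:
    """Return True if the whole branch name matches any filter pattern."""
    return any(filter_to_regex(p).fullmatch(branch) for p in patterns)
```

With these rules, "litellm_feature/foo" matches "litellm_**" but not
"litellm_*".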
Adds a GitHub Actions workflow that fails PRs to main unless the head
branch is 'litellm_internal_staging' or matches 'litellm_hotfix_*'. It
also fails merge_group events, since the merge queue is not in use.
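
The rule can be sketched as a small predicate. This is an illustration
under assumptions, not the workflow's code: the function name and
argument names are hypothetical, and fnmatchcase is used as a stand-in
for the workflow's own pattern check (its * also matches "/"):

```python
from fnmatch import fnmatchcase

ALLOWED_HEADS = ("litellm_internal_staging", "litellm_hotfix_*")


def guard(event_name: str, base_ref: str, head_ref: str) -> bool:
    """Return True when the event may proceed, False when the check should fail."""
    if event_name == "merge_group":
        return False  # merge queue is not in use, so always fail
    if event_name == "pull_request" and base_ref == "main":
        return any(fnmatchcase(head_ref, pattern) for pattern in ALLOWED_HEADS)
    return True  # PRs to other base branches are not restricted here
```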
1. exclude-newer: change from absolute "2026-04-10" to relative "3 days".
All pinned deps were published before the 3-day cutoff. Re-locked so
uv lock --check passes in test-mcp.yml and test-linting.yml.
2. test_eager_tiktoken_load: run all 10 env var values in a single
subprocess instead of spawning 10 separate processes. Each cold
import litellm takes ~78s on CI, so the old loop took ~13 min on a
single xdist worker. Now takes ~78s total.
3. proxy-db "remaining" group timeout: increase from 20 to 30 minutes.
The "remaining" group has 51 test files and was consistently timing out
at 71% across all branches (pre-existing issue, not migration-related).
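
Item 2 can be sketched as follows: one child interpreter pays the import
cost once and then loops over every value. The variable name
EXAMPLE_FLAG, the truthiness rule, and the use of json as a stand-in for
the slow import are all assumptions made for illustration; the real test
may differ:

```python
import json
import subprocess
import sys

CHILD = r"""
import json, os, sys
import json as expensive_module  # stand-in for the slow `import litellm`

results = {}
for value in json.loads(sys.argv[1]):
    os.environ["EXAMPLE_FLAG"] = value  # hypothetical variable name
    results[value] = os.environ["EXAMPLE_FLAG"].lower() in ("1", "true")
print(json.dumps(results))
"""


def check_all_values(values: list[str]) -> dict[str, bool]:
    """Evaluate every value inside one child process and return the results."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, json.dumps(values)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```

The design trade-off is isolation for speed: all values share one
interpreter, so the code under test must re-read the variable on each
iteration rather than cache it at import time.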
* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge
* fix: restore module import in test_check_migration after merge
The conflict resolution imported only the function but the test body
references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching
- Move google-generativeai, Pillow, tenacity back to ci group (they are
lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate uv.lock after removing nodejs-wheel-binaries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): use cache/restore instead of cache to prevent cache poisoning
The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert
The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv cache in publish workflow
Silences the zizmor cache-poisoning alert. The publish workflow runs
infrequently on protected branches, so caching adds no real benefit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): remove duplicate verbose_logger mock in test_check_migration
The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): free disk space before Docker build in test-server-root-path
The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
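
The test_check_migration fix above keeps a single autospec patch of the
logger. A minimal sketch of that pattern, using a hypothetical stand-in
object (fake_module) rather than the real check_migration module:

```python
import logging
import types
from unittest import mock

# Hypothetical stand-in for a module that exposes a module-level logger.
fake_module = types.SimpleNamespace(verbose_logger=logging.getLogger("example"))


def run_migration_check(module) -> None:
    """Toy function under test: it only logs one message."""
    module.verbose_logger.info("migration check ran")


def logged_calls() -> int:
    """Patch the logger exactly once, with autospec, and count info() calls."""
    with mock.patch.object(fake_module, "verbose_logger", autospec=True) as logger:
        run_migration_check(fake_module)
        return logger.info.call_count
```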
Redis caching unit tests (test_dual_cache, test_redis_batch_optimizations,
test_router_utils) require Redis secrets, which should live in CircleCI
rather than in GitHub Actions.
- Add redis_caching_unit_tests job to CircleCI config
- Delete test-unit-caching-redis.yml GHA workflow
- Remove all Redis plumbing (inputs, secrets, env vars) from
_test-unit-services-base.yml and its callers
Pin all cosign public key references to the immutable commit hash
(0112e53) that first introduced the key, instead of fetching it from
the release tag. This addresses the concern that an attacker with push
access could replace the key on main/tags and re-sign tampered images.
Docs now show two verification methods: commit hash (recommended) and
release tag (convenience), with explanation of why the hash is stronger.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Remove redundant matrix unit test workflow
All test paths in test-litellm-matrix.yml are fully covered by the
newer semantic unit test workflows (test-unit-*.yml), making the
matrix workflow redundant CI spend.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add Codecov coverage reporting to semantic unit test workflows
Add coverage collection (--cov) and Codecov OIDC upload to both
reusable base workflows and all 12 caller workflows, replacing the
coverage reporting that was previously only in the matrix workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Move id-token/pull-requests permissions to job level for multi-job workflows
For workflows with multiple jobs (llm-providers, proxy-db), move
id-token: write and pull-requests: write from workflow level to job
level so permissions are scoped to only the jobs that need them.
Removes zizmor inline suppressions that were masking the issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The proxy_e2e_azure_batches_tests workflow is consistently flaky and
does not provide reliable signal on whether changes break anything.
Remove the workflow from both CircleCI and GitHub Actions, along with
the test directory it exclusively used.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(blog): add cosign Docker image verification instructions
Add steps for verifying Docker images with cosign to three security blog posts:
CI/CD v2, Security Townhall, and Security Update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(proxy): add cosign verification to Docker/Helm/Terraform deploy page
Add image signature verification steps to the main deployment doc so
users pulling Docker images know how to verify them with cosign.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: fixes
* Update index.md
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [Docs] Scope cosign signing docs to GHCR and specify starting version
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [Docs] Add starting version callout to ci_cd_v2 blog post
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Use nullish coalescing for potentially null response body
- Create release as draft first, then publish atomically to avoid partial-release state
- Pin cosign.pub URL to release tag instead of main branch
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prepend Docker image signature verification instructions to auto-generated
release notes, using the cosign public key committed to the repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add create-release.yml workflow triggered via workflow_dispatch to create
GitHub releases with auto-generated notes. Add cosign public key for
container image signature verification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: harden npm supply chain — pin overrides, enforce npm ci, add ignore-scripts
Replace open-ended >= version overrides with exact pins matching lockfile
versions across all 6 package.json files. Remove dead overrides for packages
not present in lockfiles. Switch CI and devcontainer from npm install to
npm ci for deterministic lockfile-based installs.
Add .npmrc to all 7 JS project directories with ignore-scripts=true (blocks
postinstall remote-access-trojan (RAT) vectors like the axios@1.14.1 supply
chain attack) and min-release-age=3d (refuses packages published less than
3 days ago; requires npm >=11.10). Remove the Yarn-only resolutions field
from docs/my-website.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump sharp to 0.33.5 in docs, add docs .npmrc
sharp 0.32.x uses postinstall to download native binaries, which breaks
with ignore-scripts=true. sharp 0.33+ distributes via optionalDependencies
instead, making it compatible with the new .npmrc hardening.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove docs .npmrc to fix Vercel deploy
Vercel's build for docs/my-website uses npm install which needs
sharp 0.32.6's postinstall script. Since we don't control Vercel's
build process, remove the .npmrc from docs rather than fight it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: Dockerfile npm ci + nvm checksum verification
- Replace npm install with npm ci in Dockerfile.non_root,
Dockerfile.custom_ui, and spend-logs/Dockerfile for deterministic
lockfile-based installs
- Replace curl-pipe-bash nvm install with download-then-verify pattern
in build_admin_ui.sh, build_ui.sh, and build_ui_custom_path.sh
- Update nvm from v0.38.0 (2021) to v0.40.4 (Jan 2026) with SHA256
checksum verification before execution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: macOS sha256sum compat + clarify min-release-age scope
- Use shasum -a 256 fallback on macOS where sha256sum is unavailable
- Clarify in .npmrc comments that min-release-age only protects local
npm install, not npm ci (used in CI)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
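
The download-then-verify pattern from the nvm commits above, sketched in
Python for illustration (the build scripts themselves are shell). The
function names are hypothetical, and the expected checksum is assumed to
be pinned alongside the script that calls this:

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_run(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the pinned checksum."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"checksum mismatch for {path.name}: got {actual}")
```

The installer is executed only after verify_before_run returns without
raising, which is the property curl-pipe-bash cannot provide.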
GHA doesn't support conditional service containers, so the Postgres
container always starts, even for Redis-only jobs. Use the
integration-redis-postgres environment for any workflow that sets
enable-redis so the Postgres container gets valid credentials.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move POSTGRES_USER and POSTGRES_PASSWORD from hardcoded values to
environment secrets so no credentials appear in workflow files at all.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hardcoded postgresql://postgres:postgres@localhost connection string was
being flagged by secret scanners. Move DATABASE_URL to a GHA environment
secret (integration-postgres) so the password is never in the workflow file.
Changes:
- _test-unit-services-base.yml: DATABASE_URL now comes from secrets, environment
is derived from enable-* flags (integration-postgres, integration-redis, or
integration-redis-postgres)
- test-unit-proxy-db.yml: switched to push-only trigger (uses secrets now)
- test-unit-security.yml: switched to push-only trigger (uses secrets now)
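
A sketch of how the environment could be derived from the enable-* flags
in this commit. The mapping is inferred from the environment names, the
enable_postgres argument name is an assumption, and the no-service case
returning None is a guess rather than documented behavior:

```python
from typing import Optional


def select_environment(enable_postgres: bool, enable_redis: bool) -> Optional[str]:
    """Pick the GHA environment name from the service flags."""
    if enable_postgres and enable_redis:
        return "integration-redis-postgres"
    if enable_redis:
        return "integration-redis"
    if enable_postgres:
        return "integration-postgres"
    return None  # assumption: no environment when no service is enabled
```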
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add three new GHA workflows for tests requiring service containers, plus a
reusable base workflow that provides Postgres and cloud Redis support.
New workflows:
- test-unit-proxy-db.yml: proxy DB tests (key generation, auth checks,
remaining) using a local Postgres container with a 3-way descriptive matrix
- test-unit-caching-redis.yml: caching tests that need Redis but no provider
API keys, using cloud Redis via the integration-redis environment
- test-unit-security.yml: proxy security tests using a local Postgres container
Reusable base (_test-unit-services-base.yml):
- Local Postgres pinned by digest (postgres@sha256:705a5d5b...)
- Cloud Redis credentials scoped to the integration-redis GHA environment
- Environment binding is derived from enable-redis flag inside the base
(not caller-controllable) to prevent secret scope bypass
- Supports workers=0 for tests that cannot run in parallel
Security hardening:
- All actions pinned to commit SHAs
- persist-credentials: false on all checkouts
- permissions: contents: read only
- Postgres-only workflows (proxy-db, security) use zero secrets and trigger on
both pull_request and push to main/litellm_*
- Redis workflow triggers on push only (not pull_request) to prevent external
PRs from accessing Redis Cloud credentials
- Added ${TEST_PATH:?} guard to both _test-unit-base.yml and
_test-unit-services-base.yml to fail fast on empty test paths
- All files pass zizmor --pedantic with zero findings
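
The ${TEST_PATH:?} guard makes the shell abort when the variable is unset
or empty. A rough Python equivalent of that fail-fast behavior, with a
hypothetical helper name and error message, shown only to illustrate the
semantics:

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the variable's value, or raise if it is unset or empty."""
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        raise RuntimeError(f"{name}: parameter null or not set")
    return value
```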
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert path fixes for documentation tests that CircleCI never ran
(test_exception_types, test_general_setting_keys, test_readme_providers,
test_standard_logging_payload). Update the GHA workflow to run only the
4 tests CircleCI actually executed: test_env_keys, test_router_settings,
test_api_docs, test_circular_imports.
Add 2 missing router_settings keys (enable_health_check_routing,
health_check_staleness_threshold) and 27 missing general_settings keys
to config_settings.md so test_router_settings passes.
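
A minimal sketch of the kind of check that fails when keys are missing
from the docs. How test_router_settings actually extracts and compares
keys is not shown here, so the whole-word matching below is an
assumption and the function name is hypothetical:

```python
import re


def undocumented_keys(keys: set[str], markdown: str) -> list[str]:
    """Return keys that never appear as a whole identifier in the docs text."""
    documented = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", markdown))
    return sorted(keys - documented)
```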
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.
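
One common way to make such paths independent of the working directory
is to resolve them against the test file's own location. A minimal
sketch with a hypothetical helper; the directory layout in the usage
comment is an assumption:

```python
from pathlib import Path


def repo_path(anchor_file: str, levels_up: int, *parts: str) -> Path:
    """Resolve parts against an ancestor directory of anchor_file.

    levels_up=0 is the directory containing anchor_file, 1 is its parent,
    and so on. The current working directory does not affect the result
    when anchor_file is absolute.
    """
    base = Path(anchor_file).resolve().parents[levels_up]
    return base.joinpath(*parts)


# Typical use inside a test module (layout is hypothetical):
# docs_file = repo_path(__file__, 1, "docs", "config_settings.md")
```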
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These test suites are not pure unit tests and don't belong in Phase 1:
- litellm_utils_tests: health check tests need OPENAI_API_KEY
- pass_through_unit_tests: tests hit real Anthropic API
- router_unit_tests: tests call real OpenAI moderation endpoints
- proxy_security_tests: requires DATABASE_URL (Postgres)
- documentation_tests: requires docs directory at specific relative path
These will be re-added in later phases with proper secret scoping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename job keys from generic 'test' to descriptive names (e.g.,
'core-utils', 'proxy-auth', 'router') so GitHub checks display as
'core-utils / run' instead of 'test / test'.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace monolithic matrix workflow with individual, descriptively-named
workflow files. Each workflow uses a shared reusable base and follows
least-privilege security: zero secrets, read-only permissions, SHA-pinned
actions, persist-credentials: false, and env-var indirection to prevent
template injection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add persist-credentials: false to check-schema-sync (read-only, no push needed).
Explicitly set persist-credentials: true on sync-schema (required for git push).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sync all 3 schema.prisma copies and add GHA workflows to keep them in sync automatically.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch query suite from security-extended to security-and-quality to
match the default GitHub Advanced Security setup. Run scheduled scans
daily instead of weekly. Remove paths-ignore for _experimental/out so
build artifacts are also scanned.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>