* fix(projects): fire useProjects hook for all authenticated users, not just admins
* fix(routes): add /project/list and /project/info to internal_user_routes allowlist
* fix(projects): use members_with_roles + LiteLLM_UserTable.teams for membership checks
* feat(ui): add "Your Usage" view for admin users on usage page
Admins were forced to use the global usage view with no way to scope it
to their own activity without manually searching for themselves in the
user filter dropdown.
Adds a new "Your Usage" option (admin-only) to the usage view selector.
When selected, it locks the data to the admin's own user_id and hides
the "Filter by user" dropdown.
* feat(ui): wire my-usage view to admin's own user_id in UsagePageView
When usageView is "my-usage", effectiveUserId resolves to the logged-in
admin's own userID. The "Filter by user" dropdown is hidden in this
view (only shown for "global").
* add: screenshots for usage page Your Usage admin fix
* fix(ui): gate useProjects on admin roles to fix failing unit test
* feat(proxy): add /project/list and /project/info to internal user routes
* fix(enterprise): use members_with_roles and LiteLLM_UserTable.teams for project access checks
* remove .github screenshots and workflow file from PR
The pre-release detector in create-release.yml uses `\.dev` (literal dot
before `dev`), which matches PEP 440 canonical tags like `1.84.0.dev2`
but misses the SemVer/Docker form `1.84.0-dev.2` (hyphen-dev). Per the
release design doc's PyPI<->Docker mapping rule, both forms are valid
production-track release tags and both are pre-releases (opt-in via
`pip install --pre litellm`), so the workflow should mark them as
GitHub pre-releases either way.
Change the regex to `[-.]dev` so it accepts `.dev` and `-dev`.
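
The difference between the two patterns can be sketched in Python. The workflow itself runs this check in shell, so the helper name `is_dev_tag` and the use of Python's `re` module are illustrative assumptions, not the workflow's code:

```python
import re

# Old pattern: only a literal dot may precede "dev".
OLD_DEV = re.compile(r"\.dev")
# New pattern: a dot or a hyphen may precede "dev".
NEW_DEV = re.compile(r"[-.]dev")


def is_dev_tag(tag: str, pattern: re.Pattern = NEW_DEV) -> bool:
    """Hypothetical helper: True if the tag contains a dev marker."""
    return pattern.search(tag) is not None


if __name__ == "__main__":
    for tag in ("1.84.0.dev2", "1.84.0-dev.2", "1.84.0"):
        print(tag, "old:", is_dev_tag(tag, OLD_DEV), "new:", is_dev_tag(tag, NEW_DEV))
```

Inside a character class the dot is literal and a leading hyphen needs no escaping, so `[-.]` matches exactly those two characters.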
`prerelease: false` was hardcoded, so dispatching create-release with
`1.84.0rc1`, `1.84.0.dev42`, or legacy `v1.83.13-nightly` would publish
them as stable releases on the GitHub Releases page. Derive the flag
from the tag instead.
The detector matches `rc`, `.dev`, `nightly`, `alpha`, `beta`. PEP 440
post-releases (`1.84.0.post1`) and legacy `-stable[.patch.N]` tags are
stable maintenance releases, not pre-releases, so they intentionally do
not match.
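
A minimal Python sketch of deriving the flag from the tag, assuming the marker list given in this commit. The real workflow does this in shell and may anchor the pattern differently; `is_prerelease` is a hypothetical name:

```python
import re

# Assumed marker list, copied from the description above.
PRERELEASE = re.compile(r"rc|\.dev|nightly|alpha|beta")


def is_prerelease(tag: str) -> bool:
    """Hypothetical helper: True if the tag should be a GitHub pre-release."""
    return PRERELEASE.search(tag) is not None


if __name__ == "__main__":
    for tag in ("1.84.0rc1", "1.84.0.dev42", "v1.83.13-nightly",
                "1.84.0.post1", "v1.83.10-stable", "1.84.0"):
        print(f"{tag}: prerelease={is_prerelease(tag)}")
```

Because this is a substring search, any future tag suffix that happens to contain `rc` or `beta` would also match, which is worth keeping in mind when new suffixes are introduced.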
The tag validator required a leading `v`, so dispatching create-release
with `1.84.0` (or `1.84.0rc1`, `1.84.0.dev42`, `1.84.0.post1`) failed
even though those are the new naming convention. Make the leading `v`
optional in both create-release.yml and create-release-branch.yml so
both legacy (`v1.83.10-stable`, `v1.83.14.rc.1`, `v1.82.3.dev.9`,
`v1.82.3-stable.patch.4`, `v1.83.13-nightly`) and new PEP 440 forms are
accepted during the transition. Refresh the input descriptions to show
the new examples.
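
For illustration, a validator with an optional leading `v` might look like the sketch below. The pattern is an assumption written to accept the example tags listed above; it is not the regex used in the workflows:

```python
import re

# Hypothetical pattern: optional "v", three numeric components, then an
# optional suffix made of alphanumeric segments joined by "." or "-".
TAG_RE = re.compile(
    r"v?\d+\.\d+\.\d+(?:[.-]?[A-Za-z0-9]+(?:[.-][A-Za-z0-9]+)*)?"
)


def is_valid_tag(tag: str) -> bool:
    """True if the whole tag matches the assumed release-tag shape."""
    return TAG_RE.fullmatch(tag) is not None


if __name__ == "__main__":
    legacy = ["v1.83.10-stable", "v1.83.14.rc.1", "v1.82.3.dev.9",
              "v1.82.3-stable.patch.4", "v1.83.13-nightly"]
    new = ["1.84.0", "1.84.0rc1", "1.84.0.dev42", "1.84.0.post1"]
    for tag in legacy + new + ["1.84", "release-1.84.0"]:
        print(tag, is_valid_tag(tag))
```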
Add a new CI workflow that rejects pull requests from forks when they:
- Modify uv.lock (any change at all)
- Add new dependencies to any pyproject.toml file (root, litellm-proxy-extras, enterprise)
Security properties:
- Uses pull_request (not pull_request_target) so no secrets are exposed
- All action refs pinned to full SHA hashes
- persist-credentials: false on all checkouts
- permissions: {} (no GitHub token permissions)
- No user-controlled input in run: blocks (no script injection)
- Proper TOML parsing via stdlib tomllib (not regex on raw text)
- Only triggers when dependency files are actually changed (paths filter)
Internal PRs (from branches in the canonical repo) skip the job entirely.
Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>
Replaces the rm-and-symlink hack with a plain actions/checkout
using path: docs/my-website. The previous approach failed on this
branch because docs/my-website no longer exists in the repo (its
parent docs/ directory was also removed), so ln -s had nowhere
to create the symlink.
Also adds the same checkout step to test-unit-documentation.yml,
which was silently relying on docs/my-website existing in-tree
for test_env_keys.py and test_router_settings.py.
The file was moved to tests/enterprise/litellm_enterprise/proxy/management_endpoints/
and is covered by the CircleCI litellm_mapped_enterprise_tests job. The stale path
was causing pytest to error with 'file or directory not found'.
The `_test-unit-services-base.yml` reusable workflow attached every job
to the `integration-postgres` GHA environment to read three "secrets":
DATABASE_URL, POSTGRES_USER, POSTGRES_PASSWORD. These are not secrets —
the postgres service container is spawned per-job on localhost and
destroyed with the job, so the user/password are bootstrap values for a
throwaway container and the URL is always `postgresql://…@localhost:…`.
Each environment attachment produces a "temporarily deployed to
integration-postgres" deployment record, which the PR timeline renders
as a message per matrix shard per push. With 14 proxy-db shards that's
~14 notifications per push, drowning the PR conversation.
Changes:
* Hardcode POSTGRES_USER/POSTGRES_PASSWORD/POSTGRES_DB and the derived
DATABASE_URL in `_test-unit-services-base.yml`.
* Delete the `environment: integration-postgres` attachment.
* Delete the `secrets:` declarations on the reusable workflow and on
the two callers (test-unit-proxy-db.yml, test-unit-security.yml).
* The `services:` container still starts a fresh postgres per job;
the connection string now matches what the container boots up with.
Security review: no regression. The environment wasn't gating anything
real — no protection rules configured, no approval gates, and the
branch restriction is already enforced by `on: push: branches: [...]`
on both caller workflows. Zizmor pedantic-mode findings are identical
before and after (same 6 pre-existing findings, zero new ones).
The `integration-postgres` environment and its three "secrets" in repo
settings are now unreferenced and can be deleted from repo admin.
test_db_schema_migration.py has exactly one test, and that test is mostly
waiting on prisma subprocesses (~170s: prisma migrate deploy + prisma
migrate diff). No CPU-bound Python work inside the test body, and only
one test in the file means xdist's parallelism is unused regardless.
Previous run on commit 5df9f397e6: 10.0m wall-clock for the shard, of
which 4:56 was silence between step start and pytest banner — the cost
of 4 xdist workers each cold-starting (pytest plugin load + litellm
import + pytest-cov instrumentation) so that exactly one of them could
pick up the single test.
Switching to workers: 0 takes the serial pytest branch in the base
workflow, which already handles this case correctly (no -n, no --dist).
Single-process startup instead of 4. Expected wall-clock: ~6m.
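
The serial branch described above amounts to omitting the xdist flags when `workers` is 0. A minimal Python sketch of that selection, where `xdist_args` is a hypothetical helper and the base workflow implements the equivalent in shell:

```python
def xdist_args(workers: int, dist: str = "loadscope") -> list[str]:
    """Return the pytest-xdist flags for a shard.

    workers == 0 selects the serial path: no -n and no --dist, so pytest
    starts as a single process.
    """
    if workers <= 0:
        return []
    return ["-n", str(workers), f"--dist={dist}"]


if __name__ == "__main__":
    print(["pytest", *xdist_args(0), "test_db_schema_migration.py"])
    print(["pytest", *xdist_args(4, "worksteal"), "test_proxy_utils.py"])
```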
Two changes:
1. workers: 8 -> 4 on every non-serial proxy-db shard. ubuntu-latest is a
4-core runner; -n 8 oversubscribes 2x and workers block each other
during their cold-start imports (pytest-cov instruments every litellm
module per worker). Measured ~441% CPU locally with -n 8 on 8 cores
(i.e. ~55% effective). Matching -n to physical cores should give
~2x faster worker startup, which is where most of the ~9m wall-clock
per shard goes (7+ minutes is plugin load + xdist imports before any
test runs).
2. Revert the -k split on test_proxy_utils.py. It was split into
proxy-utils-a-h / proxy-utils-i-z as a semantic-adjacent hack; merge
back to a single proxy-utils shard. Still uses --dist=worksteal so
xdist can balance the 188 parametrized cases across workers.
Also drops the now-unused `keyword` input from _test-unit-services-base.yml
and its matching matrix field across all proxy-db entries.
Shard count: 14 -> 13 (+ the assert-shard-coverage guard).
Default GHA matrix job names join every matrix field, producing unreadable
check labels like:
'proxy-db (logging-misc, tests/proxy_unit_tests/test_proxy_reject_logging.py
tests/proxy_unit_tests/test_audit_logs_proxy.py ..., 8, loadscope, "", 15)'
Set the job's display name to '${{ matrix.test-group }}' so each check
shows just 'logging-misc', 'proxy-utils-a-h', etc.
Previous run (13.8m total) was bottlenecked by shards with 9-12m wall-clock.
Setup + xdist spawn + coverage teardown is ~3m per shard, so each shard's
pytest runtime must stay under ~4m to fit inside 7m total.
Observed per-shard pytest times (before split):
db-and-spend 9:08 (170s outlier: test_aaaasschema_migration_check)
proxy-server 7:15
logging-and-callbacks 6:45
guardrails-budget-hooks 6:37
proxy-utils 6:23
auth-and-jwt 6:54
Split 6 shards into 12, keeping key-generation and endpoints-and-responses
(already <7m). Adds a `keyword` input to _test-unit-services-base.yml so
test_proxy_utils.py can be split by -k expression (same file, two runners).
New matrix entries:
auth-and-jwt            -> auth-checks + jwt-and-keys
proxy-server            -> proxy-server-core + proxy-runtime
logging-and-callbacks   -> custom-logging + logging-misc
db-and-spend            -> schema-migration (isolated 170s test) + db-and-spend
guardrails-budget-hooks -> guardrails-hooks + budgets
proxy-utils             -> proxy-utils-a-h + proxy-utils-i-z (-k split)
The -k expression split is verified to cover every one of the 64 test
functions in test_proxy_utils.py exactly once. The assert-shard-coverage
guard still catches any file not in any shard.
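
An "exactly once" check of this kind can be sketched as below. The split rule here (first letter after the `test_` prefix) is a hypothetical stand-in: the actual -k expressions are not reproduced in this message, and pytest's -k matches substrings rather than first letters:

```python
def first_letter(name: str) -> str:
    """First character after the 'test_' prefix, lowercased ('' if none)."""
    return name.removeprefix("test_")[:1].lower()


# Hypothetical selectors standing in for the two -k expressions.
SHARDS = {
    "proxy-utils-a-h": lambda n: "a" <= first_letter(n) <= "h",
    "proxy-utils-i-z": lambda n: "i" <= first_letter(n) <= "z",
}


def coverage_errors(names, shards=SHARDS) -> dict[str, int]:
    """Map each test name NOT selected exactly once to its selection count."""
    counts = {
        name: sum(1 for pick in shards.values() if pick(name))
        for name in names
    }
    return {name: count for name, count in counts.items() if count != 1}
```

An empty result means every name lands in exactly one shard; a count of 0 flags a test that no shard would run, and a count of 2 flags one that would run twice.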
The create-branch job in create-release.yml calls the reusable
create-release-branch.yml workflow, which requires contents: write.
The top-level permissions: {} blocks the inherited default, and only
the release job overrode it, so the nested call failed with:
The nested job 'create-branch' is requesting 'contents: write',
but is only allowed 'contents: none'.
Add the permission at the calling job level so the reusable
workflow is granted what it needs.
Two fixes to proxy-db CI:
1. test_realtime_webrtc_endpoints.py's `proxy_app` fixture mutated the
module-global `proxy_server.master_key` without restoring it, leaking
state into any test that shared the same xdist worker. Under
--dist=loadscope with 2 workers (GHA proxy-endpoints), this caused the
google_endpoints tests to fail with "No api key passed in." because
user_api_key_auth saw a set master_key and a missing API key on the
test request. The fixture now saves and restores the original value.
2. Address the Greptile note that the semantic shard design has no
catch-all, so a new test file added to tests/proxy_unit_tests/ without
a matrix entry would silently skip CI. Adds an assert-shard-coverage
job that enumerates test_*.py files and fails the workflow if any are
not referenced by a matrix entry, with a clear message telling the
author which semantic shard to place it in. All proxy-db shards now
depend on this guard.
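
The coverage guard in item 2 can be sketched in Python as follows. `uncovered_test_files` is a hypothetical name, and matching by file name alone is a simplifying assumption; the job's real script may compare full paths:

```python
from pathlib import Path


def uncovered_test_files(test_dir: str, shard_paths: list[str]) -> list[str]:
    """List test_*.py files in test_dir that no matrix entry references.

    Each matrix entry is assumed to be a space-separated string of paths.
    """
    referenced = {
        Path(path).name for entry in shard_paths for path in entry.split()
    }
    return sorted(
        f.name for f in Path(test_dir).glob("test_*.py")
        if f.name not in referenced
    )


if __name__ == "__main__":
    missing = uncovered_test_files("tests/proxy_unit_tests", [])
    if missing:
        raise SystemExit(f"Not assigned to any shard: {', '.join(missing)}")
```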
Split into two related cleanups:
1. Delete CCI jobs that duplicate GHA coverage:
- mcp_testing (tests/mcp_tests) — already run by test-mcp.yml
- litellm_mapped_tests_proxy_part1/part2 (tests/test_litellm/proxy) —
already run across test-unit-proxy-auth.yml, test-unit-proxy-endpoints.yml,
and test-unit-proxy-infra.yml
Add rag_endpoints and realtime_endpoints to test-unit-proxy-endpoints.yml
(they were only covered by the deleted CCI part2 job).
Remove the corresponding workflow wiring, coverage combine entries, and
upload-coverage dependencies in .circleci/config.yml.
2. Re-shard test-unit-proxy-db.yml from 4 alphabetic buckets to 8 semantic
ones (auth-and-jwt, proxy-server, logging-and-callbacks, db-and-spend,
guardrails-budget-hooks, endpoints-and-responses, plus the existing
serial key-generation and test_proxy_utils.py shards). New test files are
placed in whichever group they belong to instead of reshuffling slices.
Add a dist input to _test-unit-services-base.yml so the test_proxy_utils.py
shard can use --dist=worksteal to spread its ~64 (many parametrized)
functions across workers; the default --dist=loadscope pins a single file
to a single worker, which was the root cause of that shard running 10m+.
Extracts release branch creation into a separate reusable workflow
(create-release-branch.yml) that can be triggered independently via
workflow_dispatch or called from other workflows via workflow_call.
create-release.yml now dispatches it as a dependent job after the
release publishes, keeping both workflows decoupled.
Stragglers from the 2026-04-21 Python 3.12 standardization:
- .github/workflows/check_duplicate_issues.yml (was 3.11)
- .github/workflows/llm-translation-testing.yml (was 3.11)
- .github/workflows/scan_duplicate_issues.yml (was 3.13)
- .circleci proxy_build_from_pip_tests (was 3.13)
The only intentional non-3.12 CI job is installing_litellm_on_python_3_13,
which exists as an explicit "latest supported Python" smoke matrix.
Principle: GHA handles work that doesn't need external API keys; CCI
stays for integration tests that hit real API endpoints.
Four CCI jobs moved to new or extended GHA workflows:
1. check_code_and_doc_quality (was 25 runs: ruff + import-safety +
21 code_coverage_tests + 3 documentation_tests + circular-imports).
- The 21 tests/code_coverage_tests/*.py scripts and the 3
tests/documentation_tests/*.py scripts run in the new
.github/workflows/test-code-quality.yml workflow.
- ruff, import-safety, and circular-imports were already run by
.github/workflows/test-linting.yml — no new migration needed.
- The 3 documentation_tests scripts read
docs/my-website/docs/proxy/config_settings.md. Since docs have
moved to BerriAI/litellm-docs, the GHA workflow checks out that
repo and symlinks docs/my-website -> the checkout so the
existing hardcoded paths resolve without touching the scripts.
The stale local docs/my-website/ copy in this repo will be
removed in a separate PR.
2. semgrep (custom-rule SAST against .semgrep/rules).
- New .github/workflows/test-semgrep.yml.
3. installing_litellm_on_python + installing_litellm_on_python_3_13
(pip install compat checks on Python 3.12 and 3.13).
- New .github/workflows/test-install-litellm.yml as a matrix job.
- 3.12 run also verifies litellm_enterprise import; 3.13 run
skips that check (matches previous CCI behavior).
- installing_litellm_on_python_v2_migration_resolver stays in CCI
because it requires a postgres service.
CCI .circleci/config.yml: -112 lines, 4 jobs and their workflow refs
removed.
The "remaining" proxy-db job was consistently timing out at ~98% because
--dist=loadscope pins every test in test_proxy_utils.py (168+ parametrized
tests) to a single xdist worker. 7 workers finished their files in ~15
minutes, then one worker ran alone for another 8+ minutes and hit the
30-minute job cap.
Give test_proxy_utils.py its own matrix entry so its tests spread across
all 8 workers, and add it to the "remaining" ignore list.
- streaming_iterator.py: adopted main's more defensive version of the
tool-arg queueing check (.get() instead of [], isinstance guard) —
same logic, same behavior, lower crash surface
- model_prices_and_context_window.json + backup: combined staging's
search_context_cost_per_query fields (PR #24372) with main's new
supports_service_tier field — both are independent additions to the
same Gemini model entries
- test_streaming_handler.py: kept Azure streaming regression test
(PR #24354) and added main's two new Gemini legacy vertex
finish_reason normalization tests
- test_gemini_batch_embeddings.py: kept staging's unsupported-params
filtering tests (PR #24370) and added main's index/order test
Resolved conflicts:
- streaming_handler.py: combined role check (PR #24354, Azure streaming)
with reasoning_items check (new in main) — both are independent OR
conditions in is_chunk_non_empty()
- CI/CD: accepted main's versions throughout
- Redis tests migrated to CircleCI (PR #25354): removed enable-redis
from GH Actions workflows
- E2E UI tests restructured (PR #25365): simplified CircleCI job
- Coverage via Codecov added to all GH Actions unit test workflows
- Deleted test-litellm-matrix.yml and test-proxy-e2e-azure-batches.yml
(removed in main)
Required test-unit-* and related workflows only triggered on PRs targeting
main, so feature PRs routed through litellm_internal_staging or
litellm_oss_branch never dispatched the full suite. Branch protection
reported BLOCKED even when CircleCI was green.
Expand pull_request and push branch filters to also match
litellm_internal_staging, litellm_oss_branch, and "litellm_**" (using **
so branch names containing "/" also match).
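
The reason for `**` over `*` can be shown with a simplified model of the branch filter rules, in which `*` does not match `/` and `**` does. This sketch handles only those two wildcards; other filter syntax (`?`, `+`, `[]`, `!`) is deliberately left out, and the helper names are hypothetical:

```python
import re


def branch_filter_to_regex(pattern: str) -> re.Pattern:
    """Translate a filter using only '*' and '**' into a regex."""
    out = []
    for part in re.split(r"(\*\*|\*)", pattern):
        if part == "**":
            out.append(".*")        # any characters, including "/"
        elif part == "*":
            out.append("[^/]*")     # any characters except "/"
        else:
            out.append(re.escape(part))
    return re.compile("".join(out))


def matches(pattern: str, branch: str) -> bool:
    return branch_filter_to_regex(pattern).fullmatch(branch) is not None


if __name__ == "__main__":
    for pat in ("litellm_*", "litellm_**"):
        print(pat, matches(pat, "litellm_feat/usage-view"))
```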
Adds a GHA workflow that fails PRs to main unless the head branch is
'litellm_internal_staging' or 'litellm_hotfix_*'. Also fails merge_group
events since merge queue is not in use.
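
The rule can be sketched as a small predicate. The function name, the argument names, and the choice to pass PRs that do not target main are assumptions; the workflow expresses the same rule in its own trigger and step logic:

```python
from fnmatch import fnmatchcase

# Head-branch patterns allowed to open PRs against main (from the text above).
ALLOWED_HEADS = ("litellm_internal_staging", "litellm_hotfix_*")


def pr_allowed(event_name: str, base_ref: str, head_ref: str) -> bool:
    """Hypothetical guard: should this event pass the check?"""
    if event_name == "merge_group":
        return False  # merge queue is not in use
    if base_ref != "main":
        return True   # assumption: the guard only restricts PRs to main
    return any(fnmatchcase(head_ref, pattern) for pattern in ALLOWED_HEADS)


if __name__ == "__main__":
    print(pr_allowed("pull_request", "main", "litellm_hotfix_1234"))
    print(pr_allowed("pull_request", "main", "feature/usage-view"))
```

Note that `fnmatchcase` lets `*` match `/` as well, which is looser than GitHub's own branch filter syntax.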
- workflow proxy-config matrix: drop test_project*.py glob now that the
test lives under tests/enterprise/
- update uv.lock to match bumped litellm version
- fix mypy: loosen FieldInfo annotation on register_extra_ui_setting
(pydantic.Field stubs report the default's type) and silence
create_model overload resolution when passing **tuple_dict
- fix inline imports in moved test_project_endpoints_prisma.py to
target litellm_enterprise.proxy.management_endpoints.project_endpoints
1. exclude-newer: change from absolute "2026-04-10" to relative "3 days".
All pinned deps were published before the 3-day cutoff. Re-locked so
uv lock --check passes in test-mcp.yml and test-linting.yml.
2. test_eager_tiktoken_load: run all 10 env var values in a single
subprocess instead of spawning 10 separate processes. Each cold
`import litellm` takes ~78s on CI, so the old loop took ~13 min on a
single xdist worker. Now takes ~78s total.
3. proxy-db remaining timeout: increase from 20 to 30 minutes. The
remaining group has 51 test files and was consistently timing out at
71% across all branches (pre-existing issue, not migration-related).
* build: migrate packaging metadata to uv
* ci: move automation and local tooling to uv
* docker: migrate image builds and runtime setup to uv
* docs: update install and deployment guidance for uv
* chore: align auxiliary scripts and tests with uv
* test: harden test_litellm isolation
* fix: keep release and health check images self-contained
* build: pin uv tooling and health check deps
* test: isolate bedrock image request formatting from suite state
* test: cover sandbox executor requirements flow
* ci: fix circleci no-op command steps
* ci: fix circleci publish workflow parsing
* fix: stabilize remaining uv migration CI checks
* ci: increase matrix test timeout headroom
* fix: restore published docker and license coverage
* fix: restore proxy runtime build parity
* fix: restore proxy extras parity and venv migrations
* ci: persist uv path across circleci steps
* fix: keep psycopg binary in default test env
* docker: preserve prisma cache across stages
* test: run local proxy checks through uv python
* build: restore runtime deps moved into ci
* build: refresh uv lock after upstream merge
* fix: restore module import in test_check_migration after merge
The conflict resolution imported only the function but the test body
references check_migration as a module throughout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: revert dependency promotions, remove nodejs-wheel-binaries, fix Docker layer caching
- Move google-generativeai, Pillow, tenacity back to ci group (they are
lazily imported and bloat the base SDK install needlessly)
- Remove nodejs-wheel-binaries from extra_proxy and proxy-dev (redundant
in Docker where system Node.js is already installed via apk)
- Remove all nodejs-wheel node replacement and venv npm patching blocks
from Dockerfiles since the wheel is no longer installed
- Add --no-default-groups to CodSpeed benchmark workflow so the benchmark
environment matches the old minimal pip install footprint
- Apply standard uv two-phase Docker pattern: copy metadata first, install
deps (cached layer), then copy source and install project
- Replace CircleCI enterprise no-op with proper uv sync command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: regenerate uv.lock after removing nodejs-wheel-binaries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): use cache/restore instead of cache to prevent cache poisoning
The old workflow used actions/cache/restore (read-only). The uv migration
changed it to actions/cache (read-write), which zizmor flags as a cache
poisoning risk. Restore the safer read-only variant.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv built-in cache to silence cache-poisoning alert
The setup-uv action enables caching by default, which zizmor flags as a
cache poisoning risk. Disable it since we already use a read-only
cache/restore step.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): disable setup-uv cache in publish workflow
Silences zizmor cache-poisoning alert. Publishing workflow runs
infrequently on protected branches so caching adds no real benefit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): remove duplicate verbose_logger mock in test_check_migration
The logger was patched twice — first via mocker.patch() then via
mocker.patch.object(autospec=True). The second call fails because
autospec cannot inspect an already-mocked attribute. Remove the
redundant first patch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): free disk space before Docker build in test-server-root-path
The Dockerfile.non_root build ran out of disk on the CI runner. Remove
Android SDK, .NET, Boost, and GHC toolchains (~12GB) to free space.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Redis caching unit tests (test_dual_cache, test_redis_batch_optimizations,
test_router_utils) required Redis secrets that should live in CircleCI.
- Add redis_caching_unit_tests job to CircleCI config
- Delete test-unit-caching-redis.yml GHA workflow
- Remove all Redis plumbing (inputs, secrets, env vars) from
_test-unit-services-base.yml and its callers
Pin all cosign public key references to the immutable commit hash
(0112e53) that first introduced the key, instead of fetching it from
the release tag. This addresses the concern that an attacker with push
access could replace the key on main/tags and re-sign tampered images.
Docs now show two verification methods: commit hash (recommended) and
release tag (convenience), with explanation of why the hash is stronger.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Remove redundant matrix unit test workflow
All test paths in test-litellm-matrix.yml are fully covered by the
newer semantic unit test workflows (test-unit-*.yml), making the
matrix workflow redundant CI spend.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add Codecov coverage reporting to semantic unit test workflows
Add coverage collection (--cov) and Codecov OIDC upload to both
reusable base workflows and all 12 caller workflows, replacing the
coverage reporting that was previously only in the matrix workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Move id-token/pull-requests permissions to job level for multi-job workflows
For workflows with multiple jobs (llm-providers, proxy-db), move
id-token: write and pull-requests: write from workflow level to job
level so permissions are scoped to only the jobs that need them.
Removes zizmor inline suppressions that were masking the issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The proxy_e2e_azure_batches_tests workflow is consistently flaky and
does not provide reliable signal on whether changes break anything.
Remove the workflow from both CircleCI and GitHub Actions, along with
the test directory it exclusively used.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(blog): add cosign Docker image verification instructions
Add steps for verifying Docker images with cosign to three security blog posts:
CI/CD v2, Security Townhall, and Security Update.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs(proxy): add cosign verification to Docker/Helm/Terraform deploy page
Add image signature verification steps to the main deployment doc so
users pulling Docker images know how to verify them with cosign.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: fixes
* Update index.md
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [Docs] Scope cosign signing docs to GHCR and specify starting version
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [Docs] Add starting version callout to ci_cd_v2 blog post
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Krrish Dholakia <krrish+github@berri.ai>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>