PAT_TOKEN_2 does not have the scope for addPullRequestReview.
github.token cannot approve its own PR either, so drop the approval
step entirely. Auto-merge with github.token is enough: the PR will
merge automatically once required CI checks pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Now that "Allow GitHub Actions to create and approve pull requests" is
enabled in repo settings:
- PR creation uses github.token (no secret needed)
- Approval uses PAT_TOKEN_2 (GitHub requires a different identity from
  the PR creator to approve)
- Auto-merge is enabled with --squash so the PR merges as soon as
  required checks pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github.token cannot open PRs when "Allow GitHub Actions to create and
approve pull requests" is disabled in repo settings. PAT_TOKEN_2
bypasses that restriction.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The workflow fails with:
    The option "--no-update" does not exist
--no-update was removed in Poetry 2.x. Plain `poetry lock` is the
correct equivalent: it re-solves only what pyproject.toml requires
without upgrading already-locked packages.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drop the PAT_TOKEN_2 secret (whose scope is unknown) in favour of the
built-in github.token, which is always available. Grant it exactly the
two permissions it needs:
- contents: write → push the auto/regenerate-* branch
- pull-requests: write → open the PR via gh cli
No external secret needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GH_TOKEN is not configured in this repository. The correct PAT secret
is PAT_TOKEN_2, which has the permissions needed to push branches and
open PRs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When secrets.GH_TOKEN is not configured, the workflow fails immediately with:
"Input required and not supplied: token"
Using || github.token ensures a valid token is always available.
GH_TOKEN (PAT) is preferred when set; github.token is used as fallback.
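
The fallback can be modelled in a few lines of Python. This is only a sketch of
how the `||` expression behaves, assuming an unset secret evaluates to an empty
string; `pick_token` and its parameter names are hypothetical and not part of
the workflow.

```python
from typing import Optional


def pick_token(pat: Optional[str], builtin_token: str) -> str:
    """Model of `secrets.GH_TOKEN || github.token`.

    An unset secret is treated as an empty string, which is falsy,
    so the built-in token is returned in that case.
    """
    return pat or builtin_token
```

With a PAT configured the PAT is returned; with `""` or `None` the built-in
token is returned, so the `token` input is never empty.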
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A heredoc inside $() inside a double-quoted string inside a YAML multiline
run block breaks the YAML parser at line 60. The fix writes the PR body
to /tmp/pr-body.md with a standalone heredoc, then passes it via
gh pr create --body-file.
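
The same idea, sketched in Python rather than the shell the workflow actually
uses: write the body to a file first, then hand `gh` only the path, so no
nested quoting is needed. `build_pr_command` and the title argument are
hypothetical; `--title` and `--body-file` are existing `gh pr create` flags.

```python
from pathlib import Path
from typing import List


def build_pr_command(title: str, body: str, body_path: str) -> List[str]:
    """Write the PR body to body_path and return the gh argv that reads it.

    Because the body travels through a file, it can contain quotes,
    `$(...)`, or newlines without any escaping.
    """
    Path(body_path).write_text(body, encoding="utf-8")
    return ["gh", "pr", "create", "--title", title, "--body-file", body_path]
```

The returned list could be passed to `subprocess.run(cmd, check=True)` on a
machine where `gh` is installed and authenticated.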
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub Advanced Security flagged that the workflow had no permissions block,
leaving GITHUB_TOKEN with its default broad scope. All write operations
(git push, gh pr create) already use GH_TOKEN (PAT), so the implicit
GITHUB_TOKEN only needs read access.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A re-run within the same second (or a leftover branch) would cause
`git push` to fail. Adding -f is safe since this is a bot-owned branch
that is immediately turned into a PR and never used for anything else.
Fixes inline suggestion from Greptile review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without the token in the checkout step the subsequent `git push` uses the
default GITHUB_TOKEN which lacks permission to push new branches, causing
the workflow to fail silently. Fixes issue flagged by Greptile review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a workflow that triggers whenever a pyproject.toml change is merged into
main and opens a PR with the refreshed lock file, fixing the recurring CI failure:
"pyproject.toml changed significantly since poetry.lock was last generated."
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Running `cd enterprise && poetry run pip install -e .` causes poetry to
create a separate venv in `enterprise/.venv` (since enterprise/ has its
own pyproject.toml). The main project's tests run with `.venv/bin/python`,
so the enterprise package installed in `enterprise/.venv` is never seen.
Fix: run `poetry run pip install -e enterprise/` from the repo root so
poetry uses the main project's venv. This ensures litellm_enterprise is
importable when tests run.
This explains why enterprise tests kept failing with:
    AttributeError: '_PROXY_LiteLLMManagedFiles' object has no attribute
    '_check_file_deletion_allowed'
even after --force-reinstall was added: the reinstall was going to the
wrong virtual environment.
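
A quick way to check which environment a package resolves from is to ask the
interpreter directly. The helper below is a minimal diagnostic sketch using
only the standard library; `where_is` is a hypothetical name and is not part of
the repository.

```python
import importlib.util
import sys
from typing import Dict, Optional


def where_is(package: str) -> Dict[str, Optional[str]]:
    """Report which interpreter is running and where `package` resolves from.

    `origin` is None when the top-level package cannot be found at all.
    """
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        "origin": spec.origin if spec is not None else None,
    }
```

Running `where_is("litellm_enterprise")` under `.venv/bin/python` should show an
`origin` inside the repository's `enterprise/` directory if the editable
install landed in the main project's venv.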
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The same PyPI-override issue existed in test-litellm.yml, test-mcp.yml,
and .circleci/config.yml. Also adds --no-deps (enterprise has no runtime
deps) to avoid redundant dependency resolution on every forced reinstall.
Addresses greptile review comments on PR #21481.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
poetry install includes litellm-enterprise from PyPI, then the editable
install step runs. When the same version is already installed, pip may
skip the editable install, leaving the PyPI build in place, which may
lack methods added after the latest PyPI release. Adding
--force-reinstall ensures the local editable version always wins.
Fixes enterprise tests failing with AttributeError on methods that exist
locally but not in the cached PyPI-installed package.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tests/proxy_unit_tests/test_key_generate_prisma.py imports PrismaClient
at module level, which triggers a Prisma binary check. Without running
prisma generate first, all tests in that file ERROR at collection time
with "Unable to find Prisma binaries. Please run 'prisma generate' first."
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements three key improvements to reduce test flakiness from parallel execution:
1. **Split Vertex AI tests into separate group** (workers: 1)
   - Vertex AI tests often have environment variable pollution issues
   - Running serially prevents cross-test interference with GOOGLE_APPLICATION_CREDENTIALS
   - Isolates authentication-related test failures
2. **Reduce workers for other LLM tests** (4 -> 2)
   - Decreases chance of race conditions and state conflicts
   - Still parallel but with less contention
3. **Add --dist=loadscope to pytest-xdist**
   - Keeps tests from the same file together on one worker
   - Reduces interference between unrelated test modules
   - Data shows 70% pass rate WITH loadscope vs 40% WITHOUT
   - Better test isolation while maintaining parallelism
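
As a rough illustration of the loadscope grouping, the sketch below buckets
pytest node IDs by everything before the last `::`, so functions group by
module and methods group by class. It is a simplified model, not
pytest-xdist's scheduler; `group_by_scope` and the node IDs in the usage note
are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_scope(node_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket test node IDs by scope (module for functions, class for methods).

    Each bucket would be handed to a single worker as a unit.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for node_id in node_ids:
        scope = node_id.rsplit("::", 1)[0]
        groups[scope].append(node_id)
    return dict(groups)
```

For example, `tests/a.py::test_one` and `tests/a.py::test_two` land in the same
bucket, while `tests/b.py::TestX::test_m` gets a bucket of its own.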
Note: loadscope exposes one tokenizer cache issue in core-utils which will be
fixed in a separate PR. The tradeoff is worth it (7/10 pass vs 4/10 without).
These changes address the root causes of intermittent test failures in
PRs #21268, #21271, #21272, #21273, #21275, #21276:
- Environment variable pollution (GOOGLE_APPLICATION_CREDENTIALS, VERTEXAI_PROJECT)
- Global state conflicts (litellm.known_tokenizer_config)
- Async mock timing issues with parallel execution
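
For the environment-variable case, one way an individual test could avoid
leaking state is to scope the variable with `unittest.mock.patch.dict`. This is
a sketch of that general technique, not something this change implements;
`run_with_credentials` and the path in the usage note are hypothetical.

```python
import os
from typing import Callable, TypeVar
from unittest import mock

T = TypeVar("T")


def run_with_credentials(path: str, fn: Callable[[], T]) -> T:
    """Call fn with GOOGLE_APPLICATION_CREDENTIALS set only for the call.

    patch.dict restores os.environ to its previous contents on exit,
    even if fn raises.
    """
    with mock.patch.dict(os.environ, {"GOOGLE_APPLICATION_CREDENTIALS": path}):
        return fn()
```

After `run_with_credentials("/tmp/fake.json", some_test_body)` returns, the
variable has whatever value (or absence) it had before the call.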
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove pytest-retry from dev dependencies in pyproject.toml
- Add pytest-xdist as proper dev dependency (was only in pip install)
- Update CI workflow to reflect proper dependency management
- Prevents conflict between pytest-retry and pytest-rerunfailures
Having both pytest-retry and pytest-rerunfailures installed simultaneously
causes unpredictable behavior and excessive retries.
- Add pytest-xdist back to pip install line (required for -n flag)
- Was accidentally removed when removing pytest-retry
- Without pytest-xdist, all CI jobs fail with 'unrecognized option -n'
- Remove pytest-retry to avoid duplicate retry mechanisms (only use pytest-rerunfailures)
- Remove --dist loadgroup flag (no tests use xdist_group marker)
- Remove unused LITELLM_CI environment variable
- Remove sequential test step with error masking
- Simplify workflow for clarity
This fixes the issue where tests could be retried 60+ times due to
duplicate retry plugins (pytest-retry with retries=20 + pytest-rerunfailures
with --reruns 2-3).
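
One reading of the "60+" figure, assuming the two plugins compound so that
every rerun restarts a full retry cycle. That interaction is an assumption, not
verified plugin behaviour, and `max_executions` is a hypothetical helper.

```python
def max_executions(retry_retries: int, reruns: int) -> int:
    """Upper bound on executions of one failing test.

    Assumption: each of the (1 + reruns) attempts made by
    pytest-rerunfailures triggers a full (1 + retry_retries) cycle
    from pytest-retry.
    """
    return (1 + retry_retries) * (1 + reruns)


# retries=20 combined with --reruns 2 and --reruns 3
print(max_executions(20, 2))  # 63
print(max_executions(20, 3))  # 84
```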
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Reduce workers from 4 to 2 to avoid race conditions
- Add --reruns with 2-3 retries per test group
- Increase timeout from 15 to 20 minutes
- Add better test isolation
The test-complete aggregate job adds no value as GitHub Actions
already provides visibility into matrix job results.
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Split tests/test_litellm into 10 parallel CI jobs using GitHub Actions
matrix strategy to reduce PR feedback time from ~25 min to ~8-10 min.
Changes:
- Add new test-litellm-matrix.yml workflow with 10 matrix jobs:
  - llms (~225 files, 4 workers)
  - proxy-guardrails (~51 files, 4 workers)
  - proxy-core (~52 files, 4 workers)
  - proxy-misc (~77 files, 4 workers)
  - integrations (~60 files, 4 workers)
  - core-utils (~32 files, 2 workers)
  - other (~69 files, 4 workers) - includes all previously uncovered dirs
  - root (~34 files, 4 workers)
  - proxy-unit-a (~20 files, 2 workers)
  - proxy-unit-b (~28 files, 2 workers)
- Deprecate test-litellm.yml (moved to workflow_dispatch for manual use)
- Add matching Makefile targets for local testing:
  - make test-unit-llms
  - make test-unit-proxy-guardrails
  - make test-unit-proxy-core
  - make test-unit-proxy-misc
  - make test-unit-integrations
  - make test-unit-core-utils
  - make test-unit-other
  - make test-unit-root
  - make test-proxy-unit-a
  - make test-proxy-unit-b
Benefits:
- ~3x faster wall-clock time through parallelization
- Dependency caching for faster subsequent runs
- Concurrency control to cancel stale runs
- Better failure isolation per test group
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Replace independent auto-incrementing chart versioning with 1-1 sync
to LiteLLM version. This allows users to easily map Helm chart versions
to LiteLLM versions without needing to inspect appVersion.
Changes:
- Remove auto-increment logic that read from OCI registry
- Chart version now equals LiteLLM tag without 'v' prefix (v1.81.0 -> 1.81.0)
- appVersion equals full Docker tag (v1.81.0)
- Update both ghcr_deploy.yml and ghcr_helm_deploy.yml workflows
Before: helm chart 0.1.837 -> user has to guess LiteLLM version
After: helm chart 1.81.0 -> matches LiteLLM v1.81.0
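
The mapping is small enough to sketch in Python. This only illustrates the
rule described above; the workflows implement it in their own steps, and
`chart_versions` is a hypothetical name.

```python
from typing import Dict


def chart_versions(litellm_tag: str) -> Dict[str, str]:
    """Derive Helm chart fields from a LiteLLM release tag.

    version    -> the tag without a leading 'v' (v1.81.0 -> 1.81.0)
    appVersion -> the full Docker tag, unchanged (v1.81.0)
    """
    version = litellm_tag[1:] if litellm_tag.startswith("v") else litellm_tag
    return {"version": version, "appVersion": litellm_tag}
```

`chart_versions("v1.81.0")` gives `{"version": "1.81.0", "appVersion": "v1.81.0"}`.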
References:
- https://codefresh.io/docs/docs/ci-cd-guides/helm-best-practices/
* fix: sync Helm chart versioning with production standards and Docker versions
- Update Chart.yaml version from 0.4.10 to 1.0.0 (SemVer 0.x is for development, 1.0+ for production)
- Update appVersion from v1.50.2 to v1.80.12 to match current Docker image version
- Update workflow defaults from 0.1.0 to 1.0.0 for new chart version scheme
- Maintain independent chart versioning per Helm best practices
This ensures:
- Helm chart follows SemVer production standards (1.x instead of 0.x)
- appVersion stays synchronized with Docker/application version
- Chart version remains independent for flexibility (can update chart without waiting for app releases)
* fix: sync Helm chart appVersion with Docker image tags in release workflow
Updates the GitHub workflow to ensure Helm chart appVersion matches the
Docker image tags that are actually published:
- For stable/rc releases: Uses the workflow input tag (e.g., v1.80.12)
- For latest/dev releases: Uses the release_type to match main-{type} tags
- Makes 'tag' input required to prevent accidental releases with wrong versions
- Simplifies fallback logic by removing git-describe dependency
This ensures the chart's appVersion correctly references Docker images
that exist, preventing deployment failures from missing image tags.
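
The selection rule can be sketched as a small Python function. It assumes the
four release types named above and the `main-{type}` tag pattern; the function
name and the error for unknown types are illustrative choices, not taken from
the workflow.

```python
def select_app_version(release_type: str, tag: str) -> str:
    """Pick the chart appVersion so it matches a published Docker image tag."""
    if release_type in ("stable", "rc"):
        # Stable and rc releases are published under the workflow input tag.
        return tag
    if release_type in ("latest", "dev"):
        # Latest and dev images are published as main-{type}.
        return f"main-{release_type}"
    raise ValueError(f"unknown release_type: {release_type!r}")
```

For instance, `select_app_version("stable", "v1.80.12")` returns `"v1.80.12"`,
and `select_app_version("dev", "v1.80.12")` returns `"main-dev"`.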
* Update ghcr_deploy.yml