Pin all pip install commands to exact versions and SHA-pin all GitHub
Actions to prevent supply chain attacks. Remove snok/install-poetry
in favor of direct pip install. Delete orphaned load test scripts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `if: github.repository == 'BerriAI/litellm'` guard to scheduled
jobs in stale.yml, codeql.yml, and create_daily_staging_branch.yml.
This matches the existing pattern in auto_update_price_and_context_window.yml
and prevents these workflows from running unnecessarily on fork repositories.
The release job was failing with "Resource not accessible by integration"
because other jobs explicitly set permissions, causing GitHub to scope the
default token down for all jobs. The release job needs contents:write to
create GitHub releases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a new job to the existing daily staging branch workflow that creates
a `litellm_internal_dev_MM_DD_YYYY` branch from main twice a day. This
branch serves as a staging area before merging into main to improve
stability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(codeql): switch to security-extended query suite
The security-and-quality suite produces result sets > 2 GiB on this
codebase, causing fatal OOM failures and blocking CI. Switching to
security-extended reduces query scope to security-only checks, which
still complete successfully. Quality/maintainability checks are
already covered by the existing lint pipeline.
* fix(codeql): exclude OOM queries from security-extended
The linting workflow force-installed openai==1.100.1 which conflicts
with litellm's requirement of openai>=2.8.0, causing pip dependency
resolver errors and CI cancellation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR #22785 used pytest.importorskip, which makes pytest exit with code 5
(all skipped) in CI. Instead, add tenacity to the CI workflow's pip
install step and restore the direct imports.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Run test_e2e_managed_batch with -vv -s for terminal output on failure
- Set up PostgreSQL, Poetry, and Prisma
- Upload logs as artifact on failure
Made-with: Cursor
The observatory test workflow failed because the "Verify tunnel
connectivity" step used a single curl with no retries. Cloudflare quick
tunnels need time for DNS propagation, and the first lookup can return
NXDOMAIN (curl exit 6). Replace with a retry loop (10 attempts, 5s
apart) matching the pattern already used in the health check step.
Also add `# noqa: PLR0915` to `_completion_streaming_iterator` in
router.py, matching the suppression already on its async twin.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
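
The retry loop described above for the tunnel connectivity check could look
roughly like the sketch below. It is not the workflow's actual script: the
function name `retry` and its argument order are hypothetical, and TUNNEL_URL
is assumed to be set by an earlier step. It is written in POSIX sh, so its
variables are global.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# Hypothetical helper, not taken from the workflow itself.
retry() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $retry_i/$retry_attempts failed" >&2
    # No point sleeping after the final attempt.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Usage, with the 10 attempts and 5s spacing from the commit message:
#   retry 10 5 curl --silent --fail --output /dev/null "$TUNNEL_URL"
```

Because curl exits non-zero on a DNS failure (exit 6), an early NXDOMAIN
simply consumes one attempt instead of failing the step.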
actions/checkout treats short commit hashes as branch names, causing
fetch failures. The checkout only needs the config file from the
repo, so use the default branch instead of a specific ref.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The secrets context is not available in step-level if: conditions,
causing the workflow file to fail validation. Move the conditional
check into the shell script instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
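
Moving the conditional into the script, as described above, might look like
this minimal sketch. It assumes the secret is exposed to the step as an
environment variable; the variable checked here (OBSERVATORY_API_KEY) and the
function name are illustrative guesses, since the commit does not say which
secret the condition tested.

```shell
# Returns 0 when the secret-backed variable is set and non-empty.
# OBSERVATORY_API_KEY is an assumed name, mapped in via the step's env block.
should_run_observatory() {
  [ -n "${OBSERVATORY_API_KEY:-}" ]
}

if should_run_observatory; then
  echo "OBSERVATORY_API_KEY is set; continuing"
else
  # A real step would typically stop here with a success status.
  echo "OBSERVATORY_API_KEY is not set; skipping"
fi
```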
Pass AZURE_API_KEY, AZURE_API_BASE, OBSERVATORY_URL,
OBSERVATORY_API_KEY, and REQUEST_ID through step-level env
blocks so they are never interpolated directly into shell scripts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Validate inputs.tag matches vX.Y.Z format to prevent script
  injection via workflow_dispatch
- Pass tag via env var instead of direct interpolation in shell
- Add cleanup step to kill cloudflared and remove docker container
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
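
The vX.Y.Z check above can be sketched as follows. This is an illustration,
not the workflow's actual validation: `validate_tag` is a hypothetical name,
and the tag is assumed to reach the script through an environment variable
rather than direct interpolation.

```shell
# Accepts only tags of the form vX.Y.Z, where X, Y, Z are decimal numbers.
validate_tag() {
  # Reject anything outside the allowed character set first. This also
  # rules out newlines, so the grep below sees exactly one line.
  case "$1" in
    *[!v0-9.]*) return 1 ;;
  esac
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage (TAG is an assumed env var name):
#   validate_tag "$TAG" || { echo "invalid tag" >&2; exit 1; }
```

The character filter matters because grep matches line by line: without it, a
value whose first line is a valid tag followed by a newline and arbitrary text
would still pass. This strict pattern also rejects tags carrying any suffix
after the patch number.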
- Add timeout-minutes: 30 to prevent runaway jobs
- Build /run-test payload with jq --arg to safely escape
  TUNNEL_URL and LITELLM_MASTER_KEY values
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
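
Building the payload with jq --arg, as described above, could be sketched
like this, assuming jq is available on the runner. The JSON field names
(`deployment_url`, `api_key`) and the function name are hypothetical; the
commit does not state the actual /run-test schema.

```shell
# $1: tunnel URL, $2: master key.
# --arg binds each value as a JSON string, so quotes and backslashes in the
# inputs are escaped by jq instead of being spliced into the JSON text.
build_run_test_payload() {
  jq -n -c --arg url "$1" --arg key "$2" \
    '{deployment_url: $url, api_key: $key}'
}

# Usage:
#   payload=$(build_run_test_payload "$TUNNEL_URL" "$LITELLM_MASTER_KEY")
```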
Fail early if request_id is missing or null from the /run-test
response instead of polling /run-status/null for 15 minutes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
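
A minimal sketch of the early failure described above, assuming the ID is
extracted with `jq -r` (which prints the literal string "null" when the field
is absent or null). The function name and the `response.json` file in the
usage line are hypothetical.

```shell
# Prints the request ID on stdout, or fails if it is empty or "null".
require_request_id() {
  if [ -z "$1" ] || [ "$1" = "null" ]; then
    echo "error: /run-test response did not include a request_id" >&2
    return 1
  fi
  printf '%s\n' "$1"
}

# Usage:
#   request_id=$(require_request_id "$(jq -r '.request_id' response.json)") || exit 1
```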
Avoids shell quoting issues with single quotes in JSON and
multi-line output truncation when using GITHUB_OUTPUT.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add permissions block (contents: read) per GitHub security scan
- Poll /run-status/{request_id} instead of global /queue-status
  to avoid race conditions with concurrent test runs
- Add result verification step that fails the workflow if tests
  did not pass or the run errored
- Fix auth header to use X-LiteLLM-Observatory-API-Key
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New reusable workflow that spins up a LiteLLM container from the
  release image, exposes it via cloudflared tunnel, and triggers
  test runs on the Railway-hosted observatory
- Integrates into ghcr_deploy.yml for RC and stable releases
- Can also be triggered manually via workflow_dispatch
- Add placeholder litellm_config.yaml for observatory test models
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The greptile suggestion in #22034 was applied without removing the
original env block, leaving a duplicate env key that makes the YAML
invalid. GitHub fails to parse the workflow on every push to main,
creating failed run entries ("No jobs were run").