Commit Graph

429 Commits

Author SHA1 Message Date
Sameer Kankute 0ee4d90d7e Fix enterpise bump yml 2026-03-09 16:43:40 +05:30
Sameer Kankute 4d92c720c7 Fix enterpise bump yml 2026-03-09 16:39:38 +05:30
Sameer Kankute a52a4fd28a fix(enterprise): create PR for version bump instead of pushing to protected main
Made-with: Cursor
2026-03-09 16:31:27 +05:30
Julio Quinteros Pro 512a5fa3c7 Merge pull request #22788 from BerriAI/fix/azure-batches-add-tenacity-ci
Add tenacity to e2e Azure batch CI and revert importorskip
2026-03-04 11:50:44 -03:00
Julio Quinteros Pro 75b2e40cd3 Remove incompatible openai==1.100.1 pin from linting CI
The linting workflow force-installed openai==1.100.1 which conflicts
with litellm's requirement of openai>=2.8.0, causing pip dependency
resolver errors and CI cancellation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 11:46:31 -03:00
Julio Quinteros Pro aa62ddaf0a Add tenacity to e2e Azure batch CI and revert importorskip
PR #22785 used pytest.importorskip which causes exit code 5 (all
skipped) in CI. Instead, add tenacity to the CI workflow pip install
and restore direct imports.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 11:45:14 -03:00
Sameer Kankute 213bf11ede Merge pull request #22763 from BerriAI/litellm_test_e2e_batches_test
feat(tests): add proxy e2e azure batches test
2026-03-04 18:28:52 +05:30
Sameer Kankute 7b6a972fed Add this test in cicd 2026-03-04 17:21:00 +05:30
Sameer Kankute 49738bb3e3 ci: add proxy e2e azure batches workflow
- Run test_e2e_managed_batch with -vv -s for terminal output on failure
- PostgreSQL, Poetry, Prisma setup
- Upload logs as artifact on failure

Made-with: Cursor
2026-03-04 17:15:33 +05:30
Chesars dad7805b42 fix(deps): update python-multipart version to 0.0.22 in all files
Align requirements.txt, CI workflow, liccheck, and license cache
with the >=0.0.22 constraint already set in pyproject.toml.
2026-03-03 15:09:33 -03:00
Julio Quinteros Pro 2f6298d00f Fix observatory tunnel flaky DNS and suppress PLR0915 in router
The observatory test workflow failed because the "Verify tunnel
connectivity" step used a single curl with no retries. Cloudflare quick
tunnels need time for DNS propagation, and the first lookup can return
NXDOMAIN (curl exit 6). Replace with a retry loop (10 attempts, 5s
apart) matching the pattern already used in the health check step.

Also add `# noqa: PLR0915` to `_completion_streaming_iterator` in
router.py, matching the suppression already on its async twin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 17:45:49 -03:00
Julio Quinteros Pro cc0b1323d7 Fix observatory checkout failing on commit hash ref
actions/checkout treats short commit hashes as branch names, causing
fetch failures. The checkout only needs the config file from the
repo, so use the default branch instead of a specific ref.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 17:05:49 -03:00
Julio Quinteros Pro b40b1e6a4b Fix invalid secrets context in test-linting workflow
The secrets context is not available in step-level if: conditions,
causing the workflow file to fail validation. Move the conditional
check into the shell script instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:54:53 -03:00
Julio Quinteros Pro 369edb2afb Move all secrets to env blocks instead of direct interpolation
Pass AZURE_API_KEY, AZURE_API_BASE, OBSERVATORY_URL,
OBSERVATORY_API_KEY, and REQUEST_ID through step-level env
blocks so they are never interpolated directly into shell scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:34:10 -03:00
Julio Quinteros Pro a24ba226ba Validate tag input and add explicit cleanup step
- Validate inputs.tag matches vX.Y.Z format to prevent script
  injection via workflow_dispatch
- Pass tag via env var instead of direct interpolation in shell
- Add cleanup step to kill cloudflared and remove docker container

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:19:30 -03:00
Julio Quinteros Pro a2946e2cc8 Add job timeout and use jq for safe JSON construction
- Add timeout-minutes: 30 to prevent runaway jobs
- Build /run-test payload with jq --arg to safely escape
  TUNNEL_URL and LITELLM_MASTER_KEY values

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro 7a46aaff2b Pin cloudflared to v2025.2.1 for reproducible builds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro 58264aadb7 Validate request_id before polling
Fail early if request_id is missing or null from the /run-test
response instead of polling /run-status/null for 15 minutes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro b4e0c4db07 Use temp file for JSON result passing between steps
Avoids shell quoting issues with single quotes in JSON and
multi-line output truncation when using GITHUB_OUTPUT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro 1fdaa1588d Address PR review comments on observatory workflow
- Add permissions block (contents: read) per GitHub security scan
- Poll /run-status/{request_id} instead of global /queue-status
  to avoid race conditions with concurrent test runs
- Add result verification step that fails the workflow if tests
  did not pass or the run errored
- Fix auth header to use X-LiteLLM-Observatory-API-Key

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro d7dd7ef33b Add observatory test workflow for RC/stable releases
- New reusable workflow that spins up a LiteLLM container from the
  release image, exposes it via cloudflared tunnel, and triggers
  test runs on the Railway-hosted observatory
- Integrates into ghcr_deploy.yml for RC and stable releases
- Can also be triggered manually via workflow_dispatch
- Add placeholder litellm_config.yaml for observatory test models

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:30:09 -03:00
Julio Quinteros Pro bc9c28eb80 Merge pull request #22397 from BerriAI/fix/codeql-custom-workflow
fix(ci): replace default CodeQL with custom workflow to unblock CI
2026-02-28 17:19:42 -03:00
Ishaan Jaff b5f5b42035 bump: litellm-enterprise 0.1.32 → 0.1.33 + manual publish workflow (#22421)
* bump: litellm-enterprise 0.1.32 → 0.1.33

* ci: add manual workflow to publish litellm-enterprise to PyPI

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* ci: add manual workflow to publish litellm-proxy-extras to PyPI

* fix(ci): commit before publish, add poetry.lock update to enterprise + proxy-extras workflows

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 10:56:15 -08:00
Julio Quinteros Pro ce0753243b Merge pull request #22193 from BerriAI/test/secret-scan-ci
test(ci): add secret scan test and CI job
2026-02-28 14:05:55 -03:00
Chesars 10a91c5199 fix(ci): remove duplicate env key in scan_duplicate_issues workflow
The greptile suggestion in #22034 was applied without removing the
original env block, leaving a duplicate env key that makes the YAML
invalid. GitHub fails to parse the workflow on every push to main,
creating failed run entries ("No jobs were run").
2026-02-28 13:27:59 -03:00
Cesar Garcia 7f5c8653f0 Merge pull request #18478 from Chesars/fix/prevent-scheduled-workflow-in-forks
fix: update_price_and_context_window workflow from running in forks
2026-02-28 13:10:15 -03:00
Julio Quinteros Pro d7340b595b Update .github/workflows/codeql.yml
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 12:16:42 -03:00
Julio Quinteros Pro 53f3123030 fix(ci): add custom CodeQL workflow to replace expensive default setup
The default CodeQL setup runs all 45 Python security queries against the
entire codebase. Two queries (CleartextLogging, PolynomialReDoS) produce
result sets > 2 GiB, causing 49+ minute runs that fail and block CI.

- Add custom workflow with 30-minute timeout and concurrency limits
- Exclude py/clear-text-logging-sensitive-data (CWE-312)
- Exclude py/polynomial-redos (CWE-730)
- Skip scanning tests/, docs/, and UI build output

NOTE: The Default Setup must be disabled in repo Settings > Code security
before merging, otherwise both will run simultaneously.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:40:22 -03:00
Julio Quinteros Pro 5a28ca985c Update .github/workflows/scan_duplicate_issues.yml
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 00:17:31 -03:00
Julio Quinteros Pro 94b7342da8 Update .github/workflows/check_duplicate_issues.yml
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 00:17:22 -03:00
Julio Quinteros Pro 1c376afc85 fix(ci): use secrets context in ggshield step condition
Step-level env is not visible to the if condition — reference
secrets directly so ggshield actually runs when the key is configured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:51:28 -03:00
Julio Quinteros Pro 05c3a95da8 fix(ci): add permissions block to secret-scan job
Address github-advanced-security bot review comment by setting explicit
minimal permissions (contents: read) for the GITHUB_TOKEN.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:48:43 -03:00
Julio Quinteros Pro 2fce35a162 test(ci): add secret scan test and CI job to prevent hardcoded credentials
- Add unit test that scans Python source for Base64 Basic Auth patterns
  that would be flagged by secret scanners like GitGuardian/ggshield
- Add secret-scan job to the linting CI workflow that runs the test on
  every PR and optionally runs ggshield if GITGUARDIAN_API_KEY is set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:46:42 -03:00
Julio Quinteros Pro db3d61f433 feat(ci): add duplicate issue detection and auto-close bot
Add a Python script that detects duplicate issues using title similarity
(difflib.SequenceMatcher) and closes them via the gh CLI. Two-tier system:
- 0.6 threshold: informational comment via existing wow-actions step
- 0.85 threshold: auto-close with comment, label, and not_planned reason

Includes a workflow_dispatch workflow for one-time batch scans and
integrates auto-close into the existing check_duplicate_issues workflow
for newly opened issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 16:49:06 -03:00
Ryan Crabbe 079ff24d78 Revert duplicate issue checker to text-based matching, remove duplicate PR workflow
Remove the Claude Code-powered duplicate PR detection workflow and revert
the duplicate issue checker back to wow-actions/potential-duplicates with
text similarity matching.
2026-02-23 15:28:13 -08:00
Krrish Dholakia a26f83fd3c fix: update calendly on repo 2026-02-23 06:13:59 -08:00
Ryan Crabbe c7d3198d9a fix: pass prompt as env var in duplicate detection workflows
Fixes "Input must be provided either through stdin or as a prompt
argument" error by moving the prompt to a PROMPT env variable
instead of inline multiline shell string.
2026-02-21 14:29:08 -08:00
Ryan Crabbe 1d0f91010b feat: switch duplicate detection workflows from opencode to Claude Code
Route through LiteLLM proxy using LITELLM_VIRTUAL_KEY and LITELLM_BASE_URL
secrets. Also adds --repo flag to all gh commands to fix missing repo context.
2026-02-20 17:51:12 -08:00
yuneng-jiang deeaae7e10 Merge pull request #21606 from BerriAI/litellm_ai-duplicate-issue-detection
feat: upgrade duplicate issue detection to be AI-powered instead of title text
2026-02-20 09:48:32 -08:00
Julio Quinteros Pro b551b98b26 ci: further split b2/b3 to isolate single heavy files
Isolate the two dominant files so they no longer block smaller tests:
- proxy-unit-b2: test_proxy_server.py alone (2750 lines)
- proxy-unit-b3: test_proxy_server_*.py + test_proxy_setting_guardrails.py (618 lines)
- proxy-unit-b4: test_proxy_utils.py alone (2339 lines)
- proxy-unit-b5: test_proxy_token_counter.py (1279 lines)
- proxy-unit-b6: test_[r-t]*.py (renamed from b4, 1988 lines)
- proxy-unit-b7: test_[u-z]*.py (renamed from b5, 2394 lines)

Matrix grows from 18 → 20 jobs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 14:08:25 -03:00
Julio Quinteros Pro c8ddbd90d1 ci: rebalance matrix groups based on actual timings
Split the two slowest groups based on measured wall-clock times:
- proxy-unit-b2 (was 7m15s, test_proxy_[s-z]*):
  → proxy-unit-b2: test_proxy_s*.py  (server + setting_guardrails, ~3368 lines)
  → proxy-unit-b3: test_proxy_[t-z]*.py (utils + token_counter, ~3618 lines)
- proxy-unit-b3 (was 4m30s, test_[r-z]*):
  → proxy-unit-b4: test_[r-t]*.py (response_polling + search + skills + realtime, ~1988 lines)
  → proxy-unit-b5: test_[u-z]*.py (user_api_key_auth + zero_cost + update_spend + unit_tests, ~2394 lines)

proxy-unit-a2 (6m15s) will self-resolve once PR #21679 merges
(55 skip markers added to test_key_generate_prisma.py).

Matrix grows from 16 → 18 jobs; all groups expected ≤ 3-4m.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 14:08:25 -03:00
Julio Quinteros Pro 1572162fdc ci: split slow test matrix groups to reduce wall-clock time
Three groups were bottlenecking CI (proxy-unit-b: 15min, other: 20+min,
proxy-unit-a: 6min). Split each into smaller parallel jobs based on
actual line counts of the test files.

proxy-unit-a (6min) → proxy-unit-a1 + proxy-unit-a2
  - a1: test_[a-j]*.py  (jwt 1564, auth_checks 978, google_gemini 478, ...)
  - a2: test_[k-o]*.py  (key_generate_prisma 4346, ...)

proxy-unit-b (15min) → proxy-unit-b1 + proxy-unit-b2 + proxy-unit-b3
  - b1: prisma/project/prompt + test_proxy_[c-r]*.py  (config, custom, routes, ...)
  - b2: test_proxy_[s-z]*.py  (proxy_server 2745, proxy_utils 2339, proxy_token_counter 1276)
  - b3: test_[r-z]*.py  (response_polling 1399, user_api_key_auth 1136, ...)

other (20+min) → other-1 + other-2 + other-3
  - other-1: responses (5942) + caching (1723) + types (819) ≈ 8.5k lines
  - other-2: enterprise (3062) + google_genai (2511) + router_utils (1982) ≈ 7.6k lines
  - other-3: remaining 11 dirs ≈ 8.0k lines

Total matrix jobs: 11 → 16. No test files are added, removed, or skipped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 14:08:25 -03:00
Ryan Crabbe dc13378505 feat: add AI-powered duplicate PR detection via opencode
Same approach as the duplicate issue detector — uses opencode run
with gh pr commands to find potentially duplicate open PRs when
external contributors open new PRs. Skips core team and bots.
2026-02-19 17:51:44 -08:00
Julio Quinteros Pro 11d0fca0de fix(ci): drop PAT_TOKEN_2 approval step, use github.token for auto-merge
PAT_TOKEN_2 does not have the scope for addPullRequestReview.
github.token cannot approve its own PR either, so drop the approval
step entirely. Auto-merge with github.token is enough: the PR will
merge automatically once required CI checks pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 21:17:40 -03:00
Julio Quinteros Pro 41776b0382 feat(ci): auto-approve and auto-merge the regenerated poetry.lock PR
Now that "Allow GitHub Actions to create and approve pull requests" is
enabled in repo settings:
- PR creation uses github.token (no secret needed)
- Approval uses PAT_TOKEN_2 (GitHub requires a different identity from
  the PR creator to approve)
- Auto-merge is enabled with --squash so the PR merges as soon as
  required checks pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:23:07 -03:00
Julio Quinteros Pro 9c70cd615a fix(ci): use PAT_TOKEN_2 for gh pr create
github.token cannot open PRs when "Allow GitHub Actions to create and
approve pull requests" is disabled in repo settings. PAT_TOKEN_2
bypasses that restriction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:13:58 -03:00
Julio Quinteros Pro 08b5907c9c fix(ci): remove --no-update flag removed in Poetry 2.x
The workflow fails with:
  The option "--no-update" does not exist

--no-update was removed in Poetry 2.x. Plain `poetry lock` is the
correct equivalent — it re-solves only what pyproject.toml requires
without upgrading already-locked packages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:07:14 -03:00
Julio Quinteros Pro 54470ec1d9 fix(ci): use github.token with explicit permissions instead of PAT secret
Drop the PAT_TOKEN_2 secret (whose scope is unknown) in favour of the
built-in github.token, which is always available. Grant it exactly the
two permissions it needs:
  - contents: write      → push the auto/regenerate-* branch
  - pull-requests: write → open the PR via gh cli

No external secret needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 19:56:59 -03:00
Julio Quinteros Pro 8cc50d6736 chore: fix stale GH_TOKEN comment 2026-02-19 19:55:44 -03:00
Julio Quinteros Pro e590674083 fix(ci): use PAT_TOKEN_2 instead of non-existent GH_TOKEN secret
GH_TOKEN is not configured in this repository. The correct PAT secret
is PAT_TOKEN_2, which has the permissions needed to push branches and
open PRs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 19:55:33 -03:00