Commit Graph

36319 Commits

Author SHA1 Message Date
Yuneng Jiang 310d61ba70 [Fix] Replace cast with proper typing for reasoning_items
Add reasoning_items field to ChatCompletionAssistantMessage TypedDict
and extract a typed _get_reasoning_items helper instead of using cast.
Also widen _reasoning_item_to_response_input to accept ChatCompletionReasoningItem.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 11:33:26 -07:00
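
The change described in 310d61ba70 can be pictured roughly as below. This is a minimal sketch, not the LiteLLM source: `ReasoningItem`, `AssistantMessage`, and `get_reasoning_items` are hypothetical stand-ins for `ChatCompletionReasoningItem`, `ChatCompletionAssistantMessage`, and `_get_reasoning_items`, and the field layout is assumed.

```python
from typing import Any, Dict, List, Optional, TypedDict


class ReasoningItem(TypedDict, total=False):
    # Hypothetical fields; the real ChatCompletionReasoningItem may differ.
    id: str
    summary: List[Dict[str, Any]]


class AssistantMessage(TypedDict, total=False):
    role: str
    content: Optional[str]
    # Declaring the field on the TypedDict is what removes the need for cast().
    reasoning_items: List[ReasoningItem]


def get_reasoning_items(message: AssistantMessage) -> List[ReasoningItem]:
    """Return reasoning items as a typed list, or [] when absent or malformed."""
    items = message.get("reasoning_items")
    if not isinstance(items, list):
        return []
    return [item for item in items if isinstance(item, dict)]
```

Because the helper narrows the value at runtime, callers can iterate over the result without a `cast` and without a separate `None` check.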
Yuneng Jiang 0f785f988b [Fix] Add missing user_api_key_project_alias to failed-response PagerDuty event
The hanging-response constructor was fixed but the sibling failed-response
constructor at line 104 was still missing this field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 11:07:04 -07:00
Yuneng Jiang f08d281641 [Fix] Resolve mypy type errors across 3 files
Add missing `user_api_key_project_alias` key to SpendLogsMetadata and
PagerDutyInternalEvent constructors, and cast `reasoning_items` to list
for safe iteration in responses transformation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:56:54 -07:00
yuneng-jiang 40d4e79a00 Merge pull request #24742 from BerriAI/litellm_gha_p3
[Infra] Add unit test workflows for Postgres, Redis, and security suites
2026-03-28 14:54:20 -07:00
Yuneng Jiang 3b5b98327e [Fix] Use integration-redis-postgres env for Redis workflows since Postgres always starts
GHA doesn't support conditional service containers, so the Postgres container
always starts even for Redis-only jobs. Use integration-redis-postgres
environment for any workflow with enable-redis so the Postgres container gets
valid credentials.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:25:29 -07:00
Yuneng Jiang 3ae80407dd [Fix] Move Postgres username and password to environment secrets
Move POSTGRES_USER and POSTGRES_PASSWORD from hardcoded values to
environment secrets so no credentials appear in workflow files at all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:31:58 -07:00
Yuneng Jiang d42e2f6429 [Fix] Move Postgres DATABASE_URL to environment secret to avoid credential leak warnings
The hardcoded postgresql://postgres:postgres@localhost connection string was
being flagged by secret scanners. Move DATABASE_URL to a GHA environment
secret (integration-postgres) so the password is never in the workflow file.

Changes:
- _test-unit-services-base.yml: DATABASE_URL now comes from secrets, environment
  is derived from enable-* flags (integration-postgres, integration-redis, or
  integration-redis-postgres)
- test-unit-proxy-db.yml: switched to push-only trigger (uses secrets now)
- test-unit-security.yml: switched to push-only trigger (uses secrets now)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:28:41 -07:00
Yuneng Jiang 6549f3eb1a [Infra] Add unit test workflows for Postgres, Redis, and security test suites
Add three new GHA workflows for tests requiring service containers, plus a
reusable base workflow that provides Postgres and cloud Redis support.

New workflows:
- test-unit-proxy-db.yml: proxy DB tests (key generation, auth checks,
  remaining) using a local Postgres container with a 3-way descriptive matrix
- test-unit-caching-redis.yml: caching tests that need Redis but no provider
  API keys, using cloud Redis via the integration-redis environment
- test-unit-security.yml: proxy security tests using a local Postgres container

Reusable base (_test-unit-services-base.yml):
- Local Postgres pinned by digest (postgres@sha256:705a5d5b...)
- Cloud Redis credentials scoped to the integration-redis GHA environment
- Environment binding is derived from enable-redis flag inside the base
  (not caller-controllable) to prevent secret scope bypass
- Supports workers=0 for tests that cannot run in parallel

Security hardening:
- All actions pinned to commit SHAs
- persist-credentials: false on all checkouts
- permissions: contents: read only
- Postgres-only workflows (proxy-db, security) use zero secrets and trigger on
  both pull_request and push to main/litellm_*
- Redis workflow triggers on push only (not pull_request) to prevent external
  PRs from accessing Redis Cloud credentials
- Added ${TEST_PATH:?} guard to both _test-unit-base.yml and
  _test-unit-services-base.yml to fail fast on empty test paths
- All files pass zizmor --pedantic with zero findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 12:06:45 -07:00
yuneng-jiang 666a31d47a Merge pull request #24741 from BerriAI/litellm_gha_p2
[Fix] Test Isolation and Path Resolution for GHA Unit Tests
2026-03-28 11:32:48 -07:00
Yuneng Jiang 7851567091 [Fix] Scope documentation workflow to match CircleCI and add missing router settings
Revert path fixes for documentation tests that CircleCI never ran
(test_exception_types, test_general_setting_keys, test_readme_providers,
test_standard_logging_payload). Update the GHA workflow to run only the
4 tests CircleCI actually executed: test_env_keys, test_router_settings,
test_api_docs, test_circular_imports.

Add 2 missing router_settings keys (enable_health_check_routing,
health_check_staleness_threshold) and 27 missing general_settings keys
to config_settings.md so test_router_settings passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:23:53 -07:00
Yuneng Jiang 7100ed5d0a [Fix] Test isolation for agent health checks and documentation test path resolution
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:00:22 -07:00
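
The path-resolution half of 7100ed5d0a amounts to anchoring lookups on the test file rather than on the working directory. A minimal sketch, assuming a hypothetical helper named `resolve_relative_to` (the commit does not show its actual code):

```python
from pathlib import Path


def resolve_relative_to(anchor_file: str, *parts: str) -> Path:
    """Build an absolute path relative to the directory containing anchor_file.

    Unlike a bare relative path, the result does not depend on the current
    working directory of the process running the tests.
    """
    return Path(anchor_file).resolve().parent.joinpath(*parts).resolve()
```

A test module would typically call this as `resolve_relative_to(__file__, "..", "docs", "some_file.md")`, where the relative segments are illustrative only.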
yuneng-jiang 428d837704 Merge pull request #24740 from BerriAI/litellm_unit_test_workflow_isolation
[Infra] Isolate unit test workflows with hardened security posture
2026-03-28 10:30:13 -07:00
Yuneng Jiang c717189ed2 [Infra] Remove workflows that require API keys or external services
These test suites are not pure unit tests and don't belong in Phase 1:
- litellm_utils_tests: health check tests need OPENAI_API_KEY
- pass_through_unit_tests: tests hit real Anthropic API
- router_unit_tests: tests call real OpenAI moderation endpoints
- proxy_security_tests: requires DATABASE_URL (Postgres)
- documentation_tests: requires docs directory at specific relative path

These will be re-added in later phases with proper secret scoping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:16:19 -07:00
Yuneng Jiang a34ed20901 [Infra] Fix job naming in reusable workflow callers
Rename job keys from generic 'test' to descriptive names (e.g.,
'core-utils', 'proxy-auth', 'router') so GitHub checks display as
'core-utils / run' instead of 'test / test'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:07:32 -07:00
Yuneng Jiang 3d527b722d [Infra] Add isolated unit test workflows with hardened security posture
Replace monolithic matrix workflow with individual, descriptively-named
workflow files. Each workflow uses a shared reusable base and follows
least-privilege security: zero secrets, read-only permissions, SHA-pinned
actions, persist-credentials: false, and env-var indirection to prevent
template injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:56:58 -07:00
ryan-crabbe-berri 2eb3c20e76 Merge pull request #24718 from BerriAI/litellm_ryan-march-26
litellm ryan march 26
2026-03-28 09:01:11 -07:00
ryan-crabbe-berri 726a34627c Merge pull request #24717 from BerriAI/litellm_fix-user-cache-invalidation
fix(jwt): invalidate user cache after role/team sync updates
2026-03-27 19:50:41 -07:00
ryan-crabbe-berri 7907e5e126 Merge pull request #24711 from BerriAI/litellm_fix-edit-budget
fix(ui): refactor budget page to React Query hooks and fix crashes
2026-03-27 19:49:30 -07:00
Ryan Crabbe dd11e77852 fix: add explicit TTL to cache writes and test coverage for user cache invalidation
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
2026-03-27 19:45:13 -07:00
Ryan Crabbe 2ece79930b fix(jwt): invalidate user cache after role/team sync updates
sync_user_role_and_teams updates the DB when a user's JWT role changes,
but the in-memory cache retained the stale role until TTL expiry. This
caused subsequent requests to see the old role for up to 60 seconds.

Fix: accept user_api_key_cache parameter and re-cache the updated user
object after both role and team membership DB writes.
2026-03-27 19:36:38 -07:00
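
The fix in 2ece79930b follows a write-through pattern: after the DB write, the fresh user object is written back to the cache so later requests do not read the stale role. The sketch below is a simplified illustration under assumptions: `InMemoryCache`, the dict-based `db`, and `sync_user_role` are hypothetical stand-ins, and only the `async_set_cache` method name and the 60-second figure come from the commit messages.

```python
import asyncio
from typing import Any, Dict, Optional


class InMemoryCache:
    """Tiny stand-in for an in-memory cache (TTL bookkeeping omitted)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    async def async_set_cache(self, key: str, value: Any, ttl: Optional[int] = None) -> None:
        self._store[key] = value

    async def async_get_cache(self, key: str) -> Any:
        return self._store.get(key)


async def sync_user_role(
    user_id: str,
    jwt_role: str,
    db: Dict[str, Dict[str, Any]],
    cache: InMemoryCache,
    ttl: int = 60,  # illustrative value
) -> bool:
    """Update the role in the DB and re-cache the user. Returns True if changed."""
    user = db[user_id]
    if user["user_role"] == jwt_role:
        return False  # no-op: nothing to write, cache left untouched
    updated = {**user, "user_role": jwt_role}
    db[user_id] = updated
    # Re-cache immediately so the next request sees the new role.
    await cache.async_set_cache(key=user_id, value=updated, ttl=ttl)
    return True
```

Passing an explicit `ttl` on every write mirrors the follow-up commit dd11e77852, which adds a TTL to both cache writes.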
Ryan Crabbe 98ecf17550 fix(ui): refactor budget page to React Query hooks and fix crashes
- Migrate budget CRUD from manual state to React Query hooks (useBudgets, useCreateBudget, useUpdateBudget, useDeleteBudget)
- Fix crash when budget list contains null entries by filtering in query hook
- Fix max_budget type from string to number to match DB schema (double precision)
- Disable budget_id field in edit modal to prevent accidental changes
- Use budget_id as React key instead of array index
- Update tests to mock hooks instead of networking functions
2026-03-27 19:34:24 -07:00
ryan-crabbe-berri 5b651048f2 Merge pull request #24706 from BerriAI/litellm_fix-jwt-none-guard
fix(auth): guard JWTHandler.is_jwt() against None token
2026-03-27 18:06:24 -07:00
ryan-crabbe-berri a533de0b08 Merge pull request #24701 from BerriAI/litellm_fix-jwt-role-mappings
fix(sso): pass decoded JWT access token to role mapping during SSO login
2026-03-27 18:06:15 -07:00
ryan-crabbe-berri 52e9ca7a73 Merge pull request #24708 from BerriAI/litellm_fix-bulk-update
fix: add /user/bulk_update to management routes
2026-03-27 18:05:16 -07:00
Ryan Crabbe 0c67f274e5 docs: add /user/bulk_update to internal_user_endpoints module docstring
2026-03-27 18:01:08 -07:00
Ryan Crabbe a5ff668f5e fix: add /user/bulk_update to management_routes so proxy admins can access it
/user/bulk_update was missing from the management_routes list in _types.py,
causing it to fall through to a 403 in non_proxy_admin_allowed_routes_check
even for proxy admin users. Also added it to the PROXY_ADMIN_VIEW_ONLY
blocked write operations list in route_checks.py to prevent view-only
admins from using it.
2026-03-27 17:50:42 -07:00
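
The two-list behavior described in a5ff668f5e can be sketched as follows. The list contents, role strings, and `is_route_allowed` are assumptions made for illustration; the real checks live in `_types.py` and `route_checks.py` and are more involved.

```python
from typing import List

# Hypothetical, abbreviated route lists.
MANAGEMENT_ROUTES: List[str] = ["/user/info", "/user/update", "/user/bulk_update"]
VIEW_ONLY_BLOCKED_WRITES: List[str] = ["/user/update", "/user/bulk_update"]


def is_route_allowed(route: str, role: str) -> bool:
    """Return True when the given role may call the given management route."""
    if route not in MANAGEMENT_ROUTES:
        # A route missing from the list falls through to a denial,
        # which is the bug the commit describes for /user/bulk_update.
        return False
    if role == "proxy_admin":
        return True
    if role == "proxy_admin_viewer":
        # View-only admins may read but not perform write operations.
        return route not in VIEW_ONLY_BLOCKED_WRITES
    return False
```

Listing the route in both places matters: the first list grants access to proxy admins, and the second keeps view-only admins from using a write endpoint.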
yuneng-jiang fe080a86b2 Merge pull request #24705 from BerriAI/litellm_auto_schema_sync
[Infra] Automated schema.prisma sync and drift detection
2026-03-27 17:08:23 -07:00
yuneng-jiang 846e4b44b6 Merge pull request #24682 from michelligabriele/fix/budget-spend-counters
fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
2026-03-27 16:59:23 -07:00
Ryan Crabbe 8e3755931d test(auth): add regression tests for JWTHandler.is_jwt(None)
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
2026-03-27 16:51:08 -07:00
Ryan Crabbe e36ab04a18 fix(auth): guard JWTHandler.is_jwt() against None token
When JWT auth is enabled and a request arrives without an Authorization
header (e.g. health checks, monitoring), api_key is None due to
APIKeyHeader(auto_error=False). The is_jwt() call crashes with
AttributeError: 'NoneType' object has no attribute 'split'.

Return False for None tokens since they are not JWTs.
2026-03-27 16:51:08 -07:00
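
A guard of the kind described in e36ab04a18 might look like the sketch below. It is written as a free function for brevity, and the three-segment check is an assumption about how a JWT is recognized, not a copy of `JWTHandler.is_jwt()`.

```python
from typing import Optional


def is_jwt(token: Optional[str]) -> bool:
    """Return True if token looks like a JWT (three dot-separated segments)."""
    if token is None:
        # No Authorization header was sent, so there is nothing to split.
        return False
    return len(token.split(".")) == 3
```

Without the early return, `None.split(".")` raises the `AttributeError` quoted in the commit message.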
Yuneng Jiang a074d1d68b [Infra] Mirror litellm_table_patch source changes (no binaries)
Cherry-pick source-only changes from litellm_table_patch, excluding
build artifacts from the incident response period.

- Remove destructive DROP COLUMN migration (20260311180521_schema_sync)
- Remove now-unnecessary restore migration (20260327232350)
- Bump litellm-proxy-extras 0.4.60 → 0.4.61
- Add regression test to block future DROP COLUMN migrations
- Fix double error handling in getTeamPermissionsCall

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:45:12 -07:00
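
A regression test that blocks `DROP COLUMN` migrations, as mentioned in a074d1d68b, could be built on a scan like the one below. This is a hedged sketch: the function names and the directory layout (one `.sql` file per migration folder) are assumptions, and the check is deliberately naive, so it would also flag the phrase inside an SQL comment.

```python
import re
from pathlib import Path
from typing import List

# Matches "DROP COLUMN" in any case, with any whitespace between the words.
DROP_COLUMN = re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE)


def contains_drop_column(sql: str) -> bool:
    """Return True if the SQL text contains a DROP COLUMN clause."""
    return DROP_COLUMN.search(sql) is not None


def find_drop_column_migrations(migrations_dir: Path) -> List[str]:
    """List migration files under migrations_dir that contain DROP COLUMN."""
    offenders: List[str] = []
    for sql_file in sorted(migrations_dir.rglob("*.sql")):
        if contains_drop_column(sql_file.read_text(encoding="utf-8")):
            offenders.append(sql_file.relative_to(migrations_dir).as_posix())
    return offenders
```

A test would then assert that `find_drop_column_migrations(...)` returns an empty list for the repository's migrations directory.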
Yuneng Jiang 46b92da0bd [Infra] Add migration for restored BYOM lifecycle fields
The schema sync adopted the proxy version which includes source_url,
approval_status, and other BYOM fields. These were previously dropped
in migration 20260311180521 due to schema drift. This migration
restores them to match the now-unified schema.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:24:03 -07:00
Yuneng Jiang e0e0c5e293 [Infra] Fix zizmor artipacked warnings on schema sync workflows
Add persist-credentials: false to check-schema-sync (read-only, no push needed).
Explicitly set persist-credentials: true on sync-schema (required for git push).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:14:06 -07:00
Yuneng Jiang 08e29e0a9a [Infra] Automated schema.prisma sync and drift detection
Sync all 3 schema.prisma copies and add GHA workflows to keep them in sync automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:01:20 -07:00
Ryan Crabbe e24819afef fix(sso): pass decoded JWT access token to role mapping during SSO login
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.

process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
2026-03-27 13:50:30 -07:00
michelligabriele d533b432fd fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.

Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB

Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.

When Redis is not configured, falls back to in-memory-only counters
(same as current single-instance behavior).

Fixes #23714
2026-03-27 20:39:52 +01:00
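
The read order described in d533b432fd (Redis first, then in-memory, then the cached DB value) can be illustrated with the sketch below. Plain dicts stand in for Redis and for the in-memory cache, `SpendCounters` and `is_over_budget` are hypothetical, and the `>=` comparison is an assumption. The real helpers are async and operate on a `DualCache`.

```python
from typing import Dict, Optional


class SpendCounters:
    """Per-pod spend counters with an optional shared (Redis-like) store."""

    def __init__(self, redis: Optional[Dict[str, float]] = None) -> None:
        self.redis = redis  # None models "Redis not configured"
        self.in_memory: Dict[str, float] = {}

    def increment(self, key: str, cost: float) -> None:
        # Update both layers before the next budget check runs.
        self.in_memory[key] = self.in_memory.get(key, 0.0) + cost
        if self.redis is not None:
            self.redis[key] = self.redis.get(key, 0.0) + cost

    def get_current_spend(self, key: str, cached_db_spend: float) -> float:
        # 1) The shared store is authoritative across pods.
        if self.redis is not None and key in self.redis:
            return self.redis[key]
        # 2) Fall back to this pod's own counter.
        if key in self.in_memory:
            return self.in_memory[key]
        # 3) Last resort: the spend value on the cached DB object.
        return cached_db_spend


def is_over_budget(spend: float, max_budget: float) -> bool:
    return spend >= max_budget
```

With two pods sharing one store, each pod sees the combined spend. Without a shared store, each pod sees only its own increments, which is how the effective budget became N × max_budget.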
yuneng-jiang d949085310 Merge pull request #24697 from BerriAI/litellm_codeql_gha
[Infra] Improve CodeQL scanning coverage and schedule
2026-03-27 12:17:39 -07:00
yuneng-jiang 241c45663b Merge pull request #24696 from BerriAI/litellm_pin_prisma_node_ci
[Fix] Pin Prisma Node.js dependency in CI workflows
2026-03-27 12:17:26 -07:00
Yuneng Jiang ec4273ed8b [Infra] Improve CodeQL scanning coverage and schedule
Switch query suite from security-extended to security-and-quality to
match the default GitHub Advanced Security setup. Run scheduled scans
daily instead of weekly. Remove paths-ignore for _experimental/out so
build artifacts are also scanned.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:04:09 -07:00
yuneng-jiang 1b111d23f3 Merge pull request #24688 from Sameerlite/litellm_litellm_team-model-group-name-routing-fix
fix(team-routing): preserve sibling deployment candidates for team public models
2026-03-27 12:00:34 -07:00
Sameer Kankute c4159a2ade Fix codeql
2026-03-28 00:01:33 +05:30
Yuneng Jiang ca3457b091 Pin nodejs-wheel-binaries in CI workflows running prisma generate
prisma generate internally runs `npm install prisma@5.4.2` against the
npm registry at runtime. Without a bundled Node.js, this causes
ECONNRESET failures on flaky GitHub Actions network and leaves the
npm transitive dependency tree unpinned.

Pre-install nodejs-wheel-binaries==24.13.1 (matching the Dockerfiles)
so prisma uses the bundled Node/npm instead of fetching from the
registry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:25:03 -07:00
yuneng-jiang 8c2c6a40a6 Merge pull request #24689 from Sameerlite/litellm_litellm_remove-200k-pricing-opus-sonnet-46
fix(pricing): remove above_200k_tokens price tiers for claude-opus-4-6 and claude-sonnet-4-6
2026-03-27 10:26:17 -07:00
Sameer Kankute b4d0e3213f Fix the Pricing changes for claude models
2026-03-27 22:44:40 +05:30
Sameer Kankute 453fc75ee9 fix(pricing): remove above_200k_tokens price tiers for claude-opus-4-6 and claude-sonnet-4-6
These models include the full 1M token context at standard pricing with no 2x surcharge above 200k tokens.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:44:40 +05:30
yuneng-jiang 2ac1efdc0d Merge pull request #24603 from Sameerlite/litellm_openrouter-wildcard-strip-prefix
fix(openrouter): strip routing prefix for wildcard proxy deployments
2026-03-27 10:11:01 -07:00
yuneng-jiang b6506bf40f Merge pull request #24610 from Sameerlite/litellm_lyria-3-cost-map-doc
feat(gemini): Lyria 3 preview models in cost map and docs
2026-03-27 10:10:03 -07:00
yuneng-jiang 4bf5e66dbf Merge pull request #24624 from Sameerlite/litellm_sanitize-proxy-inputs
fix(proxy): sanitize user_id input and block dangerous env var keys
2026-03-27 10:08:32 -07:00
Krrish Dholakia 412fd469e2 Merge pull request #24692 from BerriAI/litellm_security_townhall_blog
Litellm security townhall blog
2026-03-27 10:00:54 -07:00
yuneng-jiang 53ac4c5459 Merge pull request #24661 from Sameerlite/litellm_filter-metadata-user-id
fix(anthropic): strip undocumented keys from metadata before sending to API
2026-03-27 10:00:36 -07:00