Add reasoning_items field to ChatCompletionAssistantMessage TypedDict
and extract a typed _get_reasoning_items helper instead of using cast.
Also widen _reasoning_item_to_response_input to accept ChatCompletionReasoningItem.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hanging-response constructor was fixed but the sibling failed-response
constructor at line 104 was still missing this field.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add missing `user_api_key_project_alias` key to SpendLogsMetadata and
PagerDutyInternalEvent constructors, and cast `reasoning_items` to list
for safe iteration in responses transformation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GHA doesn't support conditional service containers, so the Postgres container
always starts, even for Redis-only jobs. Use the integration-redis-postgres
environment for any workflow with enable-redis set, so the Postgres container
gets valid credentials.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move POSTGRES_USER and POSTGRES_PASSWORD from hardcoded values to
environment secrets so no credentials appear in workflow files at all.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hardcoded postgresql://postgres:postgres@localhost connection string was
being flagged by secret scanners. Move DATABASE_URL to a GHA environment
secret (integration-postgres) so the password is never in the workflow file.
Changes:
- _test-unit-services-base.yml: DATABASE_URL now comes from secrets, environment
is derived from enable-* flags (integration-postgres, integration-redis, or
integration-redis-postgres)
- test-unit-proxy-db.yml: switched to push-only trigger (uses secrets now)
- test-unit-security.yml: switched to push-only trigger (uses secrets now)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add three new GHA workflows for tests requiring service containers, plus a
reusable base workflow that provides Postgres and cloud Redis support.
New workflows:
- test-unit-proxy-db.yml: proxy DB tests (key generation, auth checks,
remaining) using a local Postgres container with a 3-way descriptive matrix
- test-unit-caching-redis.yml: caching tests that need Redis but no provider
API keys, using cloud Redis via the integration-redis environment
- test-unit-security.yml: proxy security tests using a local Postgres container
Reusable base (_test-unit-services-base.yml):
- Local Postgres pinned by digest (postgres@sha256:705a5d5b...)
- Cloud Redis credentials scoped to the integration-redis GHA environment
- Environment binding is derived from enable-redis flag inside the base
(not caller-controllable) to prevent secret scope bypass
- Supports workers=0 for tests that cannot run in parallel
Security hardening:
- All actions pinned to commit SHAs
- persist-credentials: false on all checkouts
- permissions limited to contents: read
- Postgres-only workflows (proxy-db, security) use zero secrets and trigger on
both pull_request and push to main/litellm_*
- Redis workflow triggers on push only (not pull_request) to prevent external
PRs from accessing Redis Cloud credentials
- Added ${TEST_PATH:?} guard to both _test-unit-base.yml and
_test-unit-services-base.yml to fail fast on empty test paths
- All files pass zizmor --pedantic with zero findings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert path fixes for documentation tests that CircleCI never ran
(test_exception_types, test_general_setting_keys, test_readme_providers,
test_standard_logging_payload). Update the GHA workflow to run only the
4 tests CircleCI actually executed: test_env_keys, test_router_settings,
test_api_docs, test_circular_imports.
Add 2 missing router_settings keys (enable_health_check_routing,
health_check_staleness_threshold) and 27 missing general_settings keys
to config_settings.md so test_router_settings passes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix agent health check tests failing with 500 errors in parallel CI by
mocking prisma_client to None. Fix documentation validation tests using
CWD-relative paths that break depending on the working directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These test suites are not pure unit tests and don't belong in Phase 1:
- litellm_utils_tests: health check tests need OPENAI_API_KEY
- pass_through_unit_tests: tests hit real Anthropic API
- router_unit_tests: tests call real OpenAI moderation endpoints
- proxy_security_tests: requires DATABASE_URL (Postgres)
- documentation_tests: requires docs directory at specific relative path
These will be re-added in later phases with proper secret scoping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename job keys from generic 'test' to descriptive names (e.g.,
'core-utils', 'proxy-auth', 'router') so GitHub checks display as
'core-utils / run' instead of 'test / test'.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace monolithic matrix workflow with individual, descriptively-named
workflow files. Each workflow uses a shared reusable base and follows
least-privilege security: zero secrets, read-only permissions, SHA-pinned
actions, persist-credentials: false, and env-var indirection to prevent
template injection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL to both async_set_cache
calls in sync_user_role_and_teams for consistency with all other user cache
writes. Add 3 tests covering cache invalidation on role change, team change,
and no-op when nothing changes.
sync_user_role_and_teams updates the DB when a user's JWT role changes,
but the in-memory cache retained the stale role until TTL expiry. This
caused subsequent requests to see the old role for up to 60 seconds.
Fix: accept user_api_key_cache parameter and re-cache the updated user
object after both role and team membership DB writes.
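A minimal sketch of the re-cache step, assuming a dict-backed stand-in for the cache and the database. The InMemoryCache class, the sync_user_role function, and the 60-second TTL value are illustrative assumptions; only async_set_cache, user_api_key_cache, and the TTL constant name come from the commit messages above.

```python
import asyncio
import time
from typing import Any, Dict, Optional, Tuple

# Assumed value, chosen to match the "up to 60 seconds" staleness noted above.
DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL = 60


class InMemoryCache:
    """Hypothetical stand-in for the real cache; stores (value, expiry) pairs."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[Any, float]] = {}

    async def async_set_cache(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = (value, time.monotonic() + ttl)

    async def async_get_cache(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None or entry[1] < time.monotonic():
            return None
        return entry[0]


async def sync_user_role(
    user: Dict[str, Any],
    jwt_role: str,
    db: Dict[str, Dict[str, Any]],
    user_api_key_cache: InMemoryCache,
) -> bool:
    """Write the new role to the DB, then overwrite the cached user object."""
    if user.get("user_role") == jwt_role:
        return False  # no-op: nothing changed, so the cache is left alone
    user["user_role"] = jwt_role
    db[user["user_id"]] = dict(user)  # stand-in for the DB write
    await user_api_key_cache.async_set_cache(
        key=user["user_id"],
        value=dict(user),
        ttl=DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL,
    )
    return True
```

Re-caching immediately after the write means the next request reads the new role instead of waiting for the old entry to expire.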
- Migrate budget CRUD from manual state to React Query hooks (useBudgets, useCreateBudget, useUpdateBudget, useDeleteBudget)
- Fix crash when budget list contains null entries by filtering in query hook
- Fix max_budget type from string to number to match DB schema (double precision)
- Disable budget_id field in edit modal to prevent accidental changes
- Use budget_id as React key instead of array index
- Update tests to mock hooks instead of networking functions
/user/bulk_update was missing from the management_routes list in _types.py,
causing it to fall through to a 403 in non_proxy_admin_allowed_routes_check
even for proxy admin users. Also added it to the PROXY_ADMIN_VIEW_ONLY
blocked write operations list in route_checks.py to prevent view-only
admins from using it.
Add None-token test cases to both proxy_unit_tests and test_litellm
to cover the guard added in the previous commit. Also add -> bool
return type annotation to is_jwt().
When JWT auth is enabled and a request arrives without an Authorization
header (e.g. health checks, monitoring), api_key is None due to
APIKeyHeader(auto_error=False). The is_jwt() call crashes with
AttributeError: 'NoneType' object has no attribute 'split'.
Return False for None tokens since they are not JWTs.
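The guard can be sketched as follows. The three-segment check is an assumption about how the function recognizes a JWT; the commit message only confirms that the token is split and that None now returns False.

```python
from typing import Optional


def is_jwt(token: Optional[str]) -> bool:
    """Return True if the token looks like a JWT (assumed: three dot-separated parts)."""
    if token is None:
        # No Authorization header was sent, so there is nothing to split.
        return False
    return len(token.split(".")) == 3
```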
The schema sync adopted the proxy version which includes source_url,
approval_status, and other BYOM fields. These were previously dropped
in migration 20260311180521 due to schema drift. This migration
restores them to match the now-unified schema.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add persist-credentials: false to check-schema-sync (read-only, no push needed).
Explicitly set persist-credentials: true on sync-schema (required for git push).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sync all 3 schema.prisma copies and add GHA workflows to keep them in sync automatically.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
During SSO login, bearer tokens are stripped from the OAuth response
before role mapping runs. Custom role claims encoded inside the JWT
access token are lost, so map_jwt_role_to_litellm_role() returns None
and the user falls back to internal_user_viewer.
process_sso_jwt_access_token() now returns the decoded JWT payload, and
a new _sync_user_role_from_jwt_role_map() receives it so
jwt_litellm_role_map works correctly during SSO login.
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.
Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
falls back to in-memory, then to cached object .spend from DB
Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.
When Redis is not configured, the counters fall back to in-memory-only
tracking (same as the current single-instance behavior).
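The read order described above could be sketched like this. The SpendCounters class is hypothetical and uses plain dicts in place of both the in-memory cache and Redis, so it does not model the atomicity of real Redis increments; only the two method names come from this commit message.

```python
import asyncio
from typing import Dict, Optional


class SpendCounters:
    """Hypothetical stand-in for spend_counter_cache; one instance per pod."""

    def __init__(self, redis: Optional[Dict[str, float]] = None) -> None:
        self._memory: Dict[str, float] = {}
        # A dict shared between instances models Redis; None models "not configured".
        self._redis = redis

    async def increment_spend_counters(self, key: str, amount: float) -> None:
        # Update both layers before returning, so the next auth check sees the spend.
        self._memory[key] = self._memory.get(key, 0.0) + amount
        if self._redis is not None:
            self._redis[key] = self._redis.get(key, 0.0) + amount

    async def get_current_spend(self, key: str, cached_db_spend: float) -> float:
        # 1. Redis is authoritative across pods.
        if self._redis is not None and key in self._redis:
            return self._redis[key]
        # 2. Fall back to this pod's in-memory counter.
        if key in self._memory:
            return self._memory[key]
        # 3. Fall back to the .spend value on the cached DB object.
        return cached_db_spend
```

With two pods sharing the same Redis stand-in, a spend of 6.0 on each pod reads back as 12.0 from either one, so a max_budget of 10.0 is enforced once rather than per pod.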
Fixes #23714
Switch query suite from security-extended to security-and-quality to
match the default GitHub Advanced Security setup. Run scheduled scans
daily instead of weekly. Remove paths-ignore for _experimental/out so
build artifacts are also scanned.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
prisma generate internally runs `npm install prisma@5.4.2` against the
npm registry at runtime. Without a bundled Node.js, this causes
ECONNRESET failures when the GitHub Actions network is flaky and leaves
the npm transitive dependency tree unpinned.
Pre-install nodejs-wheel-binaries==24.13.1 (matching the Dockerfiles)
so prisma uses the bundled Node/npm instead of fetching from the
registry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These models include the full 1M token context at standard pricing with no 2x surcharge above 200k tokens.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>