mirror of
https://github.com/tiennm99/litellm.git
synced 2026-06-18 00:48:01 +00:00
5e16f20962
* test(proxy): phase-4 payload behavior pinning for tier-2/3 key + team management endpoints

  Extends the Phase 1–3 behavior-pin suite at tests/proxy_behavior/management/ with a second axis: payload-shape pinning. Phase 1–3 held payload minimal and pinned (actor, target) → status across 37 routes; Phase 4 holds the caller fixed at an authorized actor, varies the payload shape, and asserts the observable DB effect (on accept) or the named guard / row-unchanged (on reject). Faithfulness contract from Phase 1–3 is unchanged.

  Six families + one gap-closer (59 new scenarios, 620 → 679 total):

  * F1: key budget / rate-limit (test_key_budget_limits.py, 18)
  * F2: key↔team reassignment (test_key_team_change.py, 6)
  * F3: team budget / rate-limit (test_team_budget_limits.py, 15)
  * F4: member-info validation (test_team_member_info_validation.py, 5)
  * F5: permission batching (test_team_permissions_bulk_update.py, 6)
  * F6: org-scoped team access (+2 detail-string pins in existing files)
  * F7: coverage gap-closer (test_f7_coverage_closeout.py, 7)

  Harness extensions in conftest.py (additive only):

  * create_scratch_org() seeder with its own scratch-prefixed budget row
  * budget / limit fields on create_scratch_team()
  * scratch teardown also sweeps litellm_organizationtable

  Coverage telemetry (behavior-suite-only):

  * key_management_endpoints.py 60 % → 65 % (+82 lines)
  * team_endpoints.py 62 % → 72 % (+137 lines, crosses 70 % stretch)

  Key lands under 70 % per plan §7 escape hatch: the gap is dominated by routes outside F1–F6 scope (key list/info v2 internals) and structurally dead org-budget guards (call sites at lines 889 + 2310 + 985 + 1751 load the org without include_budget_table=True, so org.litellm_budget_table is None at guard time and the aggregate guard no-ops). Pinned as observed no-op behavior so a future fix that flips the flag turns these into reds.

  Zero source-code changes; pyproject.toml diff is empty; test_route_coverage.py stays green untouched; G3 grep guards still green; local wall-time 14 s for the full suite (no coverage), 22 s with coverage.

  G4 regression-replay protocol executed against three representative fix-PR parents (410ce761dc, 0bd49ecb8b, 8bbc61e03c): all Phase 4 tests PASS at pre-fix SHAs, confirming the F1–F7 layer is a helper-body pin, not a regression-replay layer for those specific historical bypass shapes. Targeted RED-bait scenarios for each fix are left for a follow-up PR.

* test(proxy): push key_management_endpoints.py past the 70% stretch (F7-extension)

  Adds 24 more payload-pin scenarios in test_f7_key_coverage_push.py following the same accepted-effect / rejected-guard pattern. Each scenario cites the file:line range it pins; same anti-snapshot rules apply.

  Target ranges (all reachable via HTTP-boundary payload variation):

  * 5942-6063 /key/health with metadata.logging → test_key_logging body
  * 4565-4692 /key/reset_spend happy + 404 + non-admin gate + value validation
  * 4421-4533 /key/regenerate ghost-404 + happy + new_key + grace_period
  * 4168-4202 _insert_deprecated_key body via grace_period
  * 6118-6133 _enforce_unique_key_alias duplicate-alias rejection
  * 6148-6169 validate_model_max_budget malformed-payload rejection
  * 4708-4789 validate_key_list_check user/team/org/key_hash branches
  * 2622-2733 /key/bulk_update mixed success/failure + admin gate + size limits
  * 2797-2950 /team/key/bulk_update all-keys path + explicit-keys dedupe + 404
  * 5108-5207 /key/aliases admin + scoped + search-filter branches
  * 3253-3303 /key/info ghost + explicit-key + no-key-uses-auth-header
  * 3427-3436 generate_key_helper_fn budget_limits initialization
  * 1794-1815 prepare_key_update_data duration + budget_duration paths
  * 5280-5388 _build_filter_conditions across include_created_by_keys/team/sort/alias

  Coverage telemetry (full PR4 dataset):

  * key_management_endpoints.py: 60 % → 71 % (+11 pts, +194 lines)
  * team_endpoints.py: 62 % → 72 % (+10 pts, +137 lines)

  Both files now over the plan §7 PR4.M4 70 % stretch as a side effect of pinning real payload behavior. 721 tests pass in 19 s local (full suite, no coverage); 27 s with coverage. Zero source-code changes; pyproject.toml diff still empty; test_route_coverage.py + G3 grep guards still green.

  Honest finding (kept from the prior commit's body): four structurally-dead org-budget guards remain pinned as observed no-op behavior; they fire only when get_org_object is called with include_budget_table=True, which none of the four management-endpoint call sites currently do. Pinned so a future change that flips the flag turns these into reds.

  Two helper guards are honest-ceiling: _validate_reset_spend_value's isinstance check at line 4568 is unreachable from HTTP because Pydantic 422s non-float before the helper runs; same shape for /team/key/bulk_update's missing team_id / no-selector pre-handler guards.

* test(proxy): address PR review: try/finally cleanup + loosen 500 envelope pins + Optional annotations

  Greptile review feedback on PR #28681:

  1. Wrap manual budget-row cleanup in try/finally so an assertion failure doesn't leave non-scratch-prefixed budget rows orphaned across CI re-runs (test_team_new_with_team_member_budget_creates_budget_row and test_team_update_team_member_budget_upserts).
  2. Loosen the two 500-status pins to in (400, 422, 500); the named-guard substring is the real pin, and the outer ValueError-wrap envelope is an implementation detail that a future improvement should be free to fix to a proper 400/422 without flipping these tests red.
  3. Add missing Optional annotations on _seed_token's max_budget / metadata / team_id keyword args (they default to None).

  Greptile's typo flag on 'read-world' in the conftest comment is declined: 'read-world' is the project's established term for the immutable seeded world fixture (see other usages in conftest.py and actors.py).

  721 tests still pass in 17 s.
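The loosened status pin from review item 2 can be illustrated with a minimal sketch. The helper name `assert_rejected_by_guard`, the constant `LOOSE_REJECT_STATUSES`, and the guard text in the usage lines are hypothetical and are not taken from the suite; the sketch only shows the idea of pinning the guard substring while tolerating any of the three status codes.

```python
from typing import Iterable

# Hypothetical constant: the status codes the review agreed to tolerate.
LOOSE_REJECT_STATUSES = (400, 422, 500)


def assert_rejected_by_guard(
    status_code: int,
    body_text: str,
    guard_substring: str,
    allowed_statuses: Iterable[int] = LOOSE_REJECT_STATUSES,
) -> None:
    """Pin the named guard, not the exact error envelope.

    The status check is deliberately loose; the substring check is the
    real pin, so a future change from 500 to 400/422 stays green.
    """
    allowed = tuple(allowed_statuses)
    assert status_code in allowed, f"unexpected status {status_code}"
    assert guard_substring in body_text, (
        f"guard {guard_substring!r} not found in response body"
    )


# Usage with made-up response bodies: both envelopes satisfy the same pin.
assert_rejected_by_guard(500, '{"error": "example-guard: limit exceeded"}', "example-guard")
assert_rejected_by_guard(400, '{"detail": "example-guard: limit exceeded"}', "example-guard")
```

A rejection that comes back with a different status (for example 200) or without the guard text would still fail, which keeps the pin meaningful.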
362 lines
13 KiB
Python
"""Session-scoped async ASGI client for HTTP-boundary behavior tests."""

import os
import tempfile
import uuid
from dataclasses import dataclass
from typing import Any, AsyncIterator, Dict, Optional

import httpx
import pytest_asyncio
import yaml
from prisma import Json

from litellm.proxy.utils import hash_token

MASTER_KEY = "sk-1234"
SCRATCH_PREFIX = "scratch-"


def _write_minimal_proxy_config() -> str:
    config = {
        "general_settings": {"master_key": MASTER_KEY},
        "litellm_settings": {},
    }
    database_url = os.environ.get("DATABASE_URL")
    if database_url:
        config["general_settings"]["database_url"] = database_url
    f = tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False)
    yaml.dump(config, f)
    f.close()
    return f.name


@pytest_asyncio.fixture(scope="session")
async def proxy_app():
    from litellm.proxy import proxy_server
    from litellm.proxy.proxy_server import (
        app,
        cleanup_router_config_variables,
        initialize,
        proxy_startup_event,
    )

    cleanup_router_config_variables()
    config_path = _write_minimal_proxy_config()

    # proxy_startup_event re-reads master_key from LITELLM_MASTER_KEY and
    # unconditionally overwrites the global, even when initialize() already
    # set it from the config YAML. Force (not setdefault) both vars: an
    # ambient LITELLM_MASTER_KEY with a different value would make the proxy
    # authenticate on that key while the tests still send MASTER_KEY.
    os.environ["LITELLM_MASTER_KEY"] = MASTER_KEY
    os.environ["CONFIG_FILE_PATH"] = config_path

    await initialize(config=config_path)

    # /key/regenerate is gated behind premium_user; flipping it lets the matrix
    # pin authz behavior instead of the licensing gate.
    proxy_server.premium_user = True

    async with proxy_startup_event(app):
        proxy_server.premium_user = True  # lifespan re-runs _license_check
        # The lifespan fires check_view_exists() as a background task; on a
        # fresh DB the first auth call races it and resolves user_id=None.
        if proxy_server.prisma_client is not None:
            await proxy_server.prisma_client.check_view_exists()
        yield app


@pytest_asyncio.fixture(scope="session")
async def proxy_client(proxy_app) -> AsyncIterator[httpx.AsyncClient]:
    transport = httpx.ASGITransport(app=proxy_app)
    async with httpx.AsyncClient(
        transport=transport, base_url="http://testserver"
    ) as client:
        yield client


@pytest_asyncio.fixture(scope="session")
async def prisma(proxy_app):
    from litellm.proxy import proxy_server

    assert proxy_server.prisma_client is not None
    return proxy_server.prisma_client


@pytest_asyncio.fixture(scope="session")
async def world(prisma):
    from .actors import seed_world

    return await seed_world(prisma)


@dataclass(frozen=True)
class Scratch:
    prefix: str

    def tag(self, suffix: str = "") -> str:
        return f"{self.prefix}-{suffix}" if suffix else self.prefix


async def create_scratch_key(
    proxy_client,
    seeder_cleartext: str,
    scratch_prefix: str,
    *,
    user_id: str,
    team_id: Optional[str] = None,
    organization_id: Optional[str] = None,
    key_alias: Optional[str] = None,
) -> str:
    """Seed a scratch-tagged key via /key/generate; returns its cleartext.

    Shared by the write-scenario matrices (key update/regenerate/delete).
    key_alias defaults to scratch_prefix; pass a distinct scratch-prefixed
    alias when a single scenario needs more than one key (/key/generate
    enforces unique aliases).
    """
    body: Dict[str, Any] = {
        "key_alias": key_alias or scratch_prefix,
        "user_id": user_id,
    }
    if team_id is not None:
        body["team_id"] = team_id
    if organization_id is not None:
        body["organization_id"] = organization_id
    resp = await proxy_client.post(
        "/key/generate",
        headers={"Authorization": f"Bearer {seeder_cleartext}"},
        json=body,
    )
    assert resp.status_code == 200, f"setup failed: {resp.text}"
    return resp.json()["key"]


async def create_scratch_team(
    prisma,
    team_id: str,
    *,
    organization_id: Optional[str] = None,
    admin_user_ids: Optional[list] = None,
    member_user_ids: Optional[list] = None,
    team_member_permissions: Optional[list] = None,
    models: Optional[list] = None,
    max_budget: Optional[float] = None,
    tpm_limit: Optional[int] = None,
    rpm_limit: Optional[int] = None,
    metadata: Optional[dict] = None,
) -> str:
    """Raw-seed a scratch-tagged team row; returns its team_id.

    The target team for the team write matrices (update / member_*). Raw
    prisma (not POST /team/new) avoids creation side effects (no creator
    auto-add, no membership rows written onto the world's users), so seeding
    never mutates the immutable read-world. The authz gates read the team's
    members_with_roles JSON, so a raw-seeded team exercises them exactly as
    a /team/new-created team would. team_id must start with the scratch
    prefix so the `scratch` fixture reclaims the row.

    team_member_permissions / models seed the matching raw columns, needed
    by the team-key-permission and team-model matrices.

    max_budget / tpm_limit / rpm_limit / metadata seed the team's own limit
    columns (Phase 4 F1+F3); they live directly on LiteLLM_TeamTable, no
    budget-table relation needed.
    """
    admin_user_ids = list(admin_user_ids or [])
    member_user_ids = list(member_user_ids or [])
    members_with_roles = [
        {"user_id": uid, "role": "admin"} for uid in admin_user_ids
    ] + [{"user_id": uid, "role": "user"} for uid in member_user_ids]
    data: Dict[str, Any] = {
        "team_id": team_id,
        "team_alias": team_id,
        "admins": admin_user_ids,
        "members": admin_user_ids + member_user_ids,
        "members_with_roles": Json(members_with_roles),
    }
    if organization_id is not None:
        data["organization_id"] = organization_id
    if team_member_permissions is not None:
        data["team_member_permissions"] = team_member_permissions
    if models is not None:
        data["models"] = models
    if max_budget is not None:
        data["max_budget"] = max_budget
    if tpm_limit is not None:
        data["tpm_limit"] = tpm_limit
    if rpm_limit is not None:
        data["rpm_limit"] = rpm_limit
    if metadata is not None:
        data["metadata"] = Json(metadata)
    await prisma.db.litellm_teamtable.create(data=data)
    return team_id


async def create_scratch_org(
    prisma,
    scratch_prefix: str,
    *,
    max_budget: Optional[float] = None,
    tpm_limit: Optional[int] = None,
    rpm_limit: Optional[int] = None,
    models: Optional[list] = None,
    metadata: Optional[dict] = None,
    suffix: str = "org",
) -> str:
    """Seed a scratch-tagged org + its own budget row; returns organization_id.

    The org's `budget_id` points at a fresh `litellm_budgettable` row that
    carries the per-org limits (`_check_org_key_limits` and the team budget
    helpers read `org_table.litellm_budget_table.<limit>`, not columns on the
    org row itself). Both rows share the scratch prefix so the teardown
    reclaims them: budget by `budget_id` prefix (already swept), org by
    `organization_id` prefix (added in this PR to the `scratch` fixture).

    models / metadata seed the matching org columns; `_check_org_team_limits`
    (F3) reads `org_table.models`, and the org metadata mirror of
    model_rpm_limit / model_tpm_limit is what F1's model-specific org guard
    consults.
    """
    org_id = f"{scratch_prefix}-{suffix}"
    budget_id = f"{scratch_prefix}-{suffix}-budget"
    budget_data: Dict[str, Any] = {
        "budget_id": budget_id,
        "created_by": "phase4-scratch",
        "updated_by": "phase4-scratch",
    }
    if max_budget is not None:
        budget_data["max_budget"] = max_budget
    if tpm_limit is not None:
        budget_data["tpm_limit"] = tpm_limit
    if rpm_limit is not None:
        budget_data["rpm_limit"] = rpm_limit
    await prisma.db.litellm_budgettable.create(data=budget_data)

    org_data: Dict[str, Any] = {
        "organization_id": org_id,
        "organization_alias": org_id,
        "budget_id": budget_id,
        "created_by": "phase4-scratch",
        "updated_by": "phase4-scratch",
    }
    if models is not None:
        org_data["models"] = models
    if metadata is not None:
        org_data["metadata"] = Json(metadata)
    await prisma.db.litellm_organizationtable.create(data=org_data)
    return org_id


@dataclass(frozen=True)
class SeededActor:
    user_id: str
    cleartext: str
    hashed: str


async def create_scratch_actor(
    prisma,
    scratch_prefix: str,
    *,
    user_role: str,
    org_admin_of: tuple = (),
    organization_id: Optional[str] = None,
    suffix: str = "actor",
) -> SeededActor:
    """Mint a scratch-prefixed user + verification token (+ org memberships).

    Reclaimed by the existing `scratch` teardown, which sweeps
    litellm_usertable, litellm_verificationtoken, and
    litellm_organizationmembership by scratch prefix; no bespoke cleanup
    needed. Does NOT write litellm_teammembership against world teams: the
    teardown reclaims that table only by team_id prefix, so a scratch actor
    needing team membership must join a scratch team instead. The cleartext
    is hashed with the real hash_token so the key authenticates end-to-end;
    models=[] satisfies LiteLLM_VerificationTokenView.
    """
    user_id = f"{scratch_prefix}-{suffix}"
    cleartext = "sk-" + uuid.uuid4().hex
    hashed = hash_token(cleartext)
    await prisma.db.litellm_usertable.create(
        data={
            "user_id": user_id,
            "user_role": user_role,
            "organization_id": organization_id,
        }
    )
    token_data: Dict[str, Any] = {
        "token": hashed,
        "key_name": f"{scratch_prefix}-{suffix}-key",
        "key_alias": f"{scratch_prefix}-{suffix}-alias",
        "user_id": user_id,
        "models": [],
    }
    if organization_id is not None:
        token_data["organization_id"] = organization_id
    await prisma.db.litellm_verificationtoken.create(data=token_data)
    for org_id in org_admin_of:
        await prisma.db.litellm_organizationmembership.create(
            data={
                "user_id": user_id,
                "organization_id": org_id,
                "user_role": "org_admin",
            }
        )
    return SeededActor(user_id=user_id, cleartext=cleartext, hashed=hashed)


@pytest_asyncio.fixture
async def scratch(prisma):
    handle = Scratch(prefix=f"{SCRATCH_PREFIX}{uuid.uuid4().hex[:12]}")
    try:
        yield handle
    finally:
        # Children before parents to avoid FK violations.
        await prisma.db.litellm_verificationtoken.delete_many(
            where={
                "OR": [
                    {"key_alias": {"startswith": handle.prefix}},
                    {"key_name": {"startswith": handle.prefix}},
                ]
            }
        )
        await prisma.db.litellm_teammembership.delete_many(
            where={"team_id": {"startswith": handle.prefix}}
        )
        await prisma.db.litellm_organizationmembership.delete_many(
            where={"user_id": {"startswith": handle.prefix}}
        )
        await prisma.db.litellm_teamtable.delete_many(
            where={"team_id": {"startswith": handle.prefix}}
        )
        await prisma.db.litellm_usertable.delete_many(
            where={"user_id": {"startswith": handle.prefix}}
        )
        # F1+F3 seed scratch orgs via create_scratch_org; the world seeder is
        # the only other writer of LiteLLM_OrganizationTable and uses the
        # behavior-pin- prefix, so a scratch-prefixed sweep here cannot
        # collide with the read-world. Org must be reclaimed BEFORE its
        # budget: org.budget_id → budget.budget_id, so deleting the parent
        # first would FK-violate on any still-attached scratch org.
        await prisma.db.litellm_organizationtable.delete_many(
            where={"organization_id": {"startswith": handle.prefix}}
        )
        await prisma.db.litellm_budgettable.delete_many(
            where={"budget_id": {"startswith": handle.prefix}}
        )
        # /team/member_add writes LiteLLM_UserTable.teams; the available-team
        # self-join writes it on a world actor whose row must survive. Strip
        # dangling scratch-team refs so the read-world stays immutable.
        polluted = await prisma.db.litellm_usertable.find_many(
            where={"teams": {"isEmpty": False}}
        )
        for user in polluted:
            cleaned = [t for t in user.teams if not t.startswith(handle.prefix)]
            if cleaned != list(user.teams):
                await prisma.db.litellm_usertable.update(
                    where={"user_id": user.user_id},
                    data={"teams": {"set": cleaned}},
                )