litellm/backend/main.py
Yassin Kortam 014cb8fa9d feat: add componentized proxy deployment with gateway, backend, ui, and migrations (#27557)
Split the monolithic LiteLLM proxy into independently scalable Kubernetes components to allow separate horizontal scaling of the LLM data plane and management API surfaces

- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component — Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist

Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
2026-05-16 09:25:17 -07:00
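
The first bullet of the commit message describes assembling `DATABASE_URL` from discrete `DATABASE_*` variables, preserving an operator-pinned URL and percent-encoding credentials. The sketch below only illustrates that idea and is not the `DatabaseURLSettings` implementation. The function name `build_database_url`, the variable names `DATABASE_USER`, `DATABASE_PASSWORD`, `DATABASE_HOST`, `DATABASE_PORT`, and `DATABASE_NAME`, and the default port are assumptions made for the example.

```python
from urllib.parse import quote


def build_database_url(env: dict) -> str:
    """Assemble a PostgreSQL URL from discrete settings (hypothetical names)."""
    # An operator-pinned URL wins; the discrete parts are only a fallback.
    pinned = env.get("DATABASE_URL")
    if pinned:
        return pinned
    user = quote(env["DATABASE_USER"], safe="")
    # Percent-encode the secret: passwords and short-lived tokens can contain
    # characters such as '@', '/', or ':' that would otherwise break the URL.
    password = quote(env["DATABASE_PASSWORD"], safe="")
    host = env["DATABASE_HOST"]
    port = env.get("DATABASE_PORT", "5432")  # assumed default
    name = env["DATABASE_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


example = build_database_url(
    {
        "DATABASE_USER": "litellm",
        "DATABASE_PASSWORD": "p@ss/word",
        "DATABASE_HOST": "db.internal",
        "DATABASE_NAME": "litellm",
    }
)
print(example)
```

Passing `safe=""` to `quote` makes it encode `/` as well, which matters because the default leaves slashes untouched.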

52 lines
1.5 KiB
Python

"""UI backend entrypoint.

Reuses the existing FastAPI app from `litellm.proxy.proxy_server` and trims its
route table to just the management/admin surface used by the dashboard. Purely
additive; no existing module is modified.

Run with:

    uvicorn backend.main:app --host 0.0.0.0 --port 4001
"""

from contextlib import asynccontextmanager

from fastapi.routing import Mount

# See gateway/main.py for why we assemble DATABASE_URL(s) here before
# importing proxy_server.
from litellm.proxy.db.db_url_settings import DatabaseURLSettings

DatabaseURLSettings.from_env().apply_to_env()

from litellm.proxy.proxy_server import app

from backend.routes.allowlist import BACKEND_EXACT_PATHS, BACKEND_PATH_PREFIXES


def _is_backend_route(route) -> bool:
    """Keep the route on the backend if its path is in the management surface."""
    path = getattr(route, "path", None)
    if path is None:
        return False
    if isinstance(route, Mount):
        # Static UI mounts are served by the dedicated UI container, not here.
        return False
    if path in BACKEND_EXACT_PATHS:
        return True
    return any(path.startswith(prefix) for prefix in BACKEND_PATH_PREFIXES)


# See gateway/main.py for why the trim runs inside the lifespan instead of at
# module scope.
_proxy_lifespan = app.router.lifespan_context


@asynccontextmanager
async def _backend_lifespan(app_):
    async with _proxy_lifespan(app_):
        app_.router.routes = [r for r in app_.router.routes if _is_backend_route(r)]
        yield


app.router.lifespan_context = _backend_lifespan
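
The lifespan-wrapping pattern above can be tried in isolation. The following is a minimal, stdlib-only sketch rather than the proxy's actual behavior. `Route`, `Router`, `proxy_lifespan`, and the allowlist values are hypothetical stand-ins for the FastAPI objects and for the contents of `backend/routes/allowlist.py`. It shows why the trim happens inside the wrapped lifespan: routes registered during the inner startup exist by the time the filter runs.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field

# Hypothetical allowlist values, chosen only for this example.
EXACT_PATHS = {"/health"}
PATH_PREFIXES = ("/key/", "/user/")


@dataclass
class Route:
    path: str


@dataclass
class Router:
    routes: list = field(default_factory=list)


def is_backend_route(route) -> bool:
    """Return True if the route's path is on the (hypothetical) allowlist."""
    path = getattr(route, "path", None)
    if path is None:
        return False
    # str.startswith accepts a tuple of prefixes.
    return path in EXACT_PATHS or path.startswith(PATH_PREFIXES)


@asynccontextmanager
async def proxy_lifespan(router):
    # Stand-in for startup work that registers extra routes late.
    router.routes.append(Route("/key/generate"))
    router.routes.append(Route("/chat/completions"))
    yield


@asynccontextmanager
async def backend_lifespan(router):
    async with proxy_lifespan(router):
        # Trim only after the inner startup has run, so late routes are seen.
        router.routes = [r for r in router.routes if is_backend_route(r)]
        yield


async def main():
    router = Router([Route("/health"), Route("/embeddings")])
    async with backend_lifespan(router):
        return [r.path for r in router.routes]


kept = asyncio.run(main())
print(kept)
```

Had the filter run at module scope in this sketch, `/key/generate` would not yet exist when the list was trimmed, and `/chat/completions` would be appended afterwards without ever being filtered out.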