mirror of
https://github.com/tiennm99/litellm.git
synced 2026-06-17 08:52:34 +00:00
014cb8fa9d
Split the monolithic LiteLLM proxy into independently scalable Kubernetes components to allow separate horizontal scaling of the LLM data plane and management API surfaces

- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component: Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist

Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
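The password-auth path of the DATABASE_URL assembly described in the commit message can be sketched with the standard library alone. This is a minimal illustration, not the DatabaseURLSettings model itself: the function name `build_database_url` and the variable names `DATABASE_HOST`, `DATABASE_PORT`, `DATABASE_USER`, `DATABASE_PASSWORD`, and `DATABASE_NAME` are assumptions (the commit only says `DATABASE_*`), and IAM token minting is left out.

```python
from typing import Mapping, Optional
from urllib.parse import quote


def build_database_url(env: Mapping[str, str]) -> Optional[str]:
    """Assemble a PostgreSQL URL from discrete env vars (hypothetical names)."""
    # An operator-pinned DATABASE_URL is preserved untouched.
    pinned = env.get("DATABASE_URL")
    if pinned:
        return pinned

    host = env.get("DATABASE_HOST")  # assumed variable name
    if not host:
        return None  # nothing to assemble from

    # Percent-encode credentials and database name so characters such as
    # '@', '/', and ':' cannot be mistaken for URL delimiters.
    user = quote(env.get("DATABASE_USER", ""), safe="")
    password = quote(env.get("DATABASE_PASSWORD", ""), safe="")
    name = quote(env.get("DATABASE_NAME", ""), safe="")
    port = env.get("DATABASE_PORT", "5432")

    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


if __name__ == "__main__":
    import os

    # Uses the real process environment; prints None if nothing is set.
    print(build_database_url(os.environ))
```

Passing the environment as a mapping instead of reading `os.environ` inside the function keeps the sketch easy to test with plain dictionaries.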
43 lines
1.1 KiB
Docker
# syntax=docker/dockerfile:1.7

# UI container: Next.js static export served by nginx.

ARG NODE_VERSION=20.18-alpine3.20
ARG NGINX_VERSION=1.27-alpine

# ---------- builder ----------
FROM node:${NODE_VERSION} AS builder

ENV NEXT_TELEMETRY_DISABLED=1 \
    npm_config_fund=false \
    npm_config_audit=false

WORKDIR /app

# Layer the lockfile-only install above the source copy so source-only
# edits don't bust the install cache.
COPY ui/litellm-dashboard/package.json ui/litellm-dashboard/package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --prefer-offline

COPY ui/litellm-dashboard/ ./
RUN npm run build

# ---------- runtime ----------
FROM nginx:${NGINX_VERSION} AS runtime

# Drop the upstream default :80 server; we own the config.
RUN rm -f /etc/nginx/conf.d/default.conf

# Static export → web root.
COPY --from=builder /app/out /usr/share/nginx/html

# Routing rules: see ui/nginx.conf for the full description.
COPY ui/nginx.conf /etc/nginx/nginx.conf

EXPOSE 3000/tcp

# nginx as PID 1 in foreground; respects SIGTERM out of the box, so
# no tini/dumb-init wrapper needed.
CMD ["nginx", "-g", "daemon off;"]
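The reason for copying the lockfile before the rest of the source can be illustrated with a toy model of content-keyed layer caching. This is a simplified sketch, not Docker's actual cache implementation: the helpers `layer_key` and `build_keys` are hypothetical, and the model only assumes that a layer's cache key depends on its parent layer, its instruction text, and the contents of any files it copies.

```python
import hashlib
from typing import Dict, List, Optional

MANIFESTS = ("package.json", "package-lock.json")


def layer_key(parent: str, instruction: str,
              files: Optional[Dict[str, str]] = None) -> str:
    """Toy cache key: hash of parent key, instruction, and copied file contents."""
    h = hashlib.sha256()
    h.update(parent.encode() + b"\0" + instruction.encode() + b"\0")
    for path in sorted(files or {}):
        h.update(path.encode() + b"\0" + files[path].encode() + b"\0")
    return h.hexdigest()


def build_keys(tree: Dict[str, str]) -> List[str]:
    """Return the chain of cache keys for the four builder-stage steps."""
    manifests = {p: c for p, c in tree.items() if p in MANIFESTS}
    k1 = layer_key("base", "COPY package.json package-lock.json ./", manifests)
    k2 = layer_key(k1, "RUN npm ci --prefer-offline")
    k3 = layer_key(k2, "COPY ui/litellm-dashboard/ ./", tree)
    k4 = layer_key(k3, "RUN npm run build")
    return [k1, k2, k3, k4]


if __name__ == "__main__":
    tree = {"package.json": "{}", "package-lock.json": "lock-v1", "src/app.tsx": "v1"}
    before = build_keys(tree)
    after = build_keys({**tree, "src/app.tsx": "v2"})  # source-only edit
    # The first two keys (manifest copy and npm ci) are unchanged, so in this
    # model the install layer is reused; only the source copy and build rerun.
    print([a == b for a, b in zip(before, after)])
```

In this model, editing only a source file leaves the keys for the manifest copy and the `npm ci` step unchanged, while editing the lockfile changes every key in the chain.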