[Infra] Dockerfile.non_root: add BuildKit uv cache mount

Mount /app/.cache/uv as a BuildKit type=cache on both 'uv sync' steps. The cache persists across builds on the same builder (and, when used with type=gha in CI, across CI runs) so repeat builds don't re-download every wheel.

Side-effect: because the cache lives outside the image layer, the ~742MB of downloaded wheel archives that were previously baked into /app/.cache/uv drop out of the final image. Compressed image size goes from ~5.0GB to ~3.7GB, and the 'USER nobody' prisma-generate layer is 1.7GB vs 2.4GB.

Warm-build timing: a uv-sync-invalidating edit now takes ~1m30s vs ~2m39s without the cache mount, on this dev VM.

API parity and UI visual regression continue to match baseline. Trivy HIGH/CRITICAL: 6 at baseline -> 2 now, no new CVEs.

Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
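For readers unfamiliar with the pattern, the cache mount described in this message can be sketched as a standalone Dockerfile. This is a minimal illustration, not the contents of Dockerfile.non_root: the base image, the way uv is installed, and the UV_CACHE_DIR / UV_LINK_MODE settings are assumptions made for the example. Only the mount target, the cache id, and the 'uv sync --frozen' call are taken from the diff.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image and uv install method; the real Dockerfile may differ.
FROM python:3.12-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv

WORKDIR /app

# Assumption: point uv's cache at the directory the mount targets.
# UV_LINK_MODE=copy avoids hardlink warnings, since the cache mount
# usually sits on a different filesystem than the virtualenv.
ENV UV_CACHE_DIR=/app/.cache/uv \
    UV_LINK_MODE=copy

COPY pyproject.toml uv.lock ./

# The cache directory exists only while this RUN step executes.
# Downloaded wheels stay on the builder and are not written into the layer.
RUN --mount=type=cache,target=/app/.cache/uv,id=litellm-uv-cache \
    uv sync --frozen --no-install-project
```

Because the mount is keyed by its id, every RUN step that declares the same id and target shares one cache, which is why both 'uv sync' steps in this change use id=litellm-uv-cache.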
2026-08-02 04:21:34 +00:00 · 2026-04-19 06:35:39 +00:00
parent ca52e346b0
commit e24c02f478
1 changed file with 4 additions and 2 deletions
@@ -41,7 +41,8 @@ COPY enterprise/pyproject.toml enterprise/
 COPY litellm-proxy-extras/pyproject.toml litellm-proxy-extras/

 # Install third-party dependencies (cached unless pyproject.toml/uv.lock change)
-RUN uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
+RUN --mount=type=cache,target=/app/.cache/uv,id=litellm-uv-cache \
+    uv sync --frozen --no-install-project --no-install-workspace --no-default-groups --no-editable \
    --extra proxy \
    --extra proxy-runtime \
    --extra extra_proxy \
@@ -71,7 +72,8 @@ RUN mkdir -p /var/lib/litellm/ui /var/lib/litellm/assets && \
      done && \
      touch .litellm_ui_ready )

-RUN if [ "$PROXY_EXTRAS_SOURCE" = "published" ]; then \
+RUN --mount=type=cache,target=/app/.cache/uv,id=litellm-uv-cache \
+    if [ "$PROXY_EXTRAS_SOURCE" = "published" ]; then \
      uv sync --frozen --no-default-groups --no-editable \
        --extra proxy \
        --extra proxy-runtime \