Commit Graph

77 Commits

Author SHA1 Message Date
yuneng-jiang d3587b1d8e fix: bump PyJWT to 2.12.0 in all Dockerfiles and tar to 7.5.11
All Dockerfiles were pinning PyJWT 2.9.0 (Dockerfile, Dockerfile.database,
Dockerfile.dev) or had a stale wheel build for 2.9.0 (Dockerfile.non_root).
Updated to 2.12.0 to match pyproject.toml. Also bumps tar to 7.5.11 in
Dockerfile.non_root for security.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:54:54 -07:00
yuneng-jiang 1f485007fb fix: update PyJWT pin in Dockerfile.non_root to 2.12.0
The wheels directory contains 2.12.0 after the pyproject.toml bump,
so the hardcoded 2.10.1 pin fails at build time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:53:52 -07:00
yuneng-jiang 6a90596377 updating Dockerfile to tar 7.5.11 2026-03-13 11:16:17 -07:00
Krish Dholakia e7714f0ce6 Fix CVEs: bump tar/minimatch/pypdf + harden Docker SBOM patching (#23082)
* fix(docker): bump tar/minimatch/pypdf for CVE fixes + harden SBOM patching

- Bump tar 7.5.8→7.5.10, minimatch 10.2.1→10.2.4, pypdf 6.6.2→6.7.3
- Add sed-based SBOM metadata patching with properly indented find/sed
- Add npm package manager cleanup (apk del / apt-get purge) to remove
  stale SBOM entries from image scanners
- Scope || true to only apk del via brace grouping { ... || true; }
- Guard npm root -g with non-empty assertion to prevent silent failures
- Scope minimatch sed regex to ^10.x to avoid matching other major versions

Addresses: CVE-2026-27903, CVE-2026-27904, GHSA-qffp-2rhf-9h96, CVE-2026-27888
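
The effect of scoping the minimatch pattern to `^10.x` can be illustrated with a small Python sketch. The change itself uses sed inside the Dockerfiles, so this is not the patching code from the commit. The function name `patch_minimatch_version` and the sample metadata strings are hypothetical and only show why anchoring on the major version keeps other majors untouched.

```python
import re

# Target version taken from the bump listed above (10.2.1 -> 10.2.4).
PATCHED_VERSION = "10.2.4"

# Match only a "version" field whose value starts with major version 10.
# The opening quote directly before "10." prevents matches such as "110.2.1".
_VERSION_10 = re.compile(r'("version":\s*")10\.\d+\.\d+(")')


def patch_minimatch_version(metadata: str) -> str:
    """Rewrite a 10.x version string to the patched release (hypothetical helper)."""
    return _VERSION_10.sub(
        lambda m: f"{m.group(1)}{PATCHED_VERSION}{m.group(2)}", metadata
    )


if __name__ == "__main__":
    # A 10.x entry is rewritten; a 9.x entry is left as it was.
    print(patch_minimatch_version('{"name": "minimatch", "version": "10.2.1"}'))
    print(patch_minimatch_version('{"name": "minimatch", "version": "9.0.5"}'))
```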

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docker): scope find to /usr/local/lib /usr/lib, drop autoremove

- Replace `find /` with `find /usr/local/lib /usr/lib` to avoid
  traversing /proc, /sys, /dev during SBOM metadata patching
- Remove `apt-get autoremove -y` from Debian-based Dockerfiles to
  prevent nodejs from being removed as an auto-installed dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 18:31:27 -08:00
Harshit28j 3e6c10a071 security: fix critical/high CVEs in OS-level libs and NPM transitive 2026-02-24 19:40:09 +05:30
Achilleas Athanasiou Fragkoulis cb95b1cf92 fix: Add LITELLM_UI_PATH and LITELLM_ASSETS_PATH for read-only filesystem support (#20492)
Fixes #19578

---

When deploying the LiteLLM proxy with `readOnlyRootFilesystem: true` in Kubernetes, UI routes returned `404` because:

- Hardcoded paths:
  - `/var/lib/litellm/ui`
  - `/var/lib/litellm/assets`
- Runtime copy/restructure operations failed on read-only filesystems
- No detection mechanism for pre-restructured UI

---

Add configurable environment variables with intelligent detection, graceful fallbacks, and code quality improvements.

---

- **`LITELLM_UI_PATH`** — Custom UI directory location
  - Default: `/var/lib/litellm/ui` (when `LITELLM_NON_ROOT=true`)
  - Default: packaged UI path (otherwise)
  - Example: `/app/var/litellm/ui` for `emptyDir` volumes

- **`LITELLM_ASSETS_PATH`** — Custom assets directory location
  - Default: `/var/lib/litellm/assets` (when `LITELLM_NON_ROOT=true`)
  - Default: current working directory (otherwise)
  - Example: `/app/var/litellm/assets`
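
The default rules above can be read as a small resolution function. The following is a minimal sketch, not the code in `proxy_server.py`: the helper names `resolve_ui_path` and `resolve_assets_path`, the `packaged_ui_path` parameter, and the way `LITELLM_NON_ROOT` is parsed (case-insensitive `"true"`) are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Mapping

# Defaults quoted in the description above.
NON_ROOT_UI_DIR = "/var/lib/litellm/ui"
NON_ROOT_ASSETS_DIR = "/var/lib/litellm/assets"


def _is_non_root(env: Mapping[str, str]) -> bool:
    # Assumption: the flag is considered set when it equals "true" (any case).
    return env.get("LITELLM_NON_ROOT", "").lower() == "true"


def resolve_ui_path(packaged_ui_path: str, env: Mapping[str, str] = os.environ) -> Path:
    """Pick the UI directory: explicit override, non-root default, then packaged UI."""
    override = env.get("LITELLM_UI_PATH")
    if override:
        return Path(override)
    if _is_non_root(env):
        return Path(NON_ROOT_UI_DIR)
    return Path(packaged_ui_path)


def resolve_assets_path(env: Mapping[str, str] = os.environ) -> Path:
    """Pick the assets directory: explicit override, non-root default, then cwd."""
    override = env.get("LITELLM_ASSETS_PATH")
    if override:
        return Path(override)
    if _is_non_root(env):
        return Path(NON_ROOT_ASSETS_DIR)
    return Path.cwd()
```

Passing the environment as a mapping keeps the sketch easy to exercise without mutating `os.environ`, for example `resolve_ui_path("/pkg/ui", {"LITELLM_UI_PATH": "/app/var/litellm/ui"})`.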

---

UI is detected as **pre-restructured and ready** if any of the following apply:

1. **Primary**: `.litellm_ui_ready` marker file exists (created by Dockerfile)
2. **Fallback**: Pattern-based detection — finds *any* subdirectory containing `index.html`
   (resilient to UI structure changes; no hardcoded route names)
3. **Safety**: Filesystem writability check before operations
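
The three checks listed above might look roughly like the sketch below. It is not the body of `_is_ui_pre_restructured()`: the function names here are hypothetical, and the sketch assumes that "any subdirectory" means an immediate child of the UI directory and that writability can be approximated with `os.access`.

```python
import os
import tempfile
from pathlib import Path

# Marker file name taken from the description above.
MARKER_NAME = ".litellm_ui_ready"


def is_ui_pre_restructured(ui_dir: Path) -> bool:
    """Return True if the marker exists or any immediate subdirectory has index.html."""
    if not ui_dir.is_dir():
        return False
    # Primary check: marker file written at image build time.
    if (ui_dir / MARKER_NAME).is_file():
        return True
    # Fallback check: pattern-based, with no hardcoded route names.
    return any(
        child.is_dir() and (child / "index.html").is_file()
        for child in ui_dir.iterdir()
    )


def is_writable(directory: Path) -> bool:
    """Safety check before copy/restructure: the directory exists and allows writes."""
    return directory.is_dir() and os.access(directory, os.W_OK)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ui_dir = Path(tmp)
        print(is_ui_pre_restructured(ui_dir))  # empty directory
        (ui_dir / MARKER_NAME).touch()
        print(is_ui_pre_restructured(ui_dir))  # marker present
```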

---

**`litellm/proxy/proxy_server.py`**

- `_validate_ui_directory()` — Verifies UI has required structure (`index.html`, `_next/`)
- `_is_ui_pre_restructured()` — Pattern-based detection (not hardcoded routes)
- `_try_populate_ui_directory()` — Helper for clean error handling
- Refactored UI path decision tree with numbered cases (1, 2, 3, 4a, 4b)
- Updated UI path logic to use `LITELLM_UI_PATH`
- Added writability checks before copy/restructure operations
- Graceful fallback to packaged UI if operations fail
- Updated `server_root_path` replacement with read-only check
- Simplified assets directory creation (try/except instead of complex parent checks)
- Updated `get_image()` endpoint to use `LITELLM_ASSETS_PATH`
- Added validation for packaged and final UI paths

**`docker/Dockerfile.non_root`**

- Added `touch .litellm_ui_ready` marker after UI restructuring
- Enables automatic detection of pre-built UI in Docker images

**`tests/proxy_unit_tests/test_ui_path_detection.py`**

- Added comprehensive unit tests for new functionality
- Tests env var handling, detection logic, and writability checks

---

**`docs/my-website/docs/proxy/config_settings.md`**

- Added `LITELLM_UI_PATH` and `LITELLM_ASSETS_PATH` to env vars table
- Documented defaults and use cases

**`docs/my-website/docs/proxy/prod.md`**

- Added comprehensive "Read-Only Root Filesystem" section
- Quick fixes for permission errors
- Full Kubernetes setup with `initContainer` + `emptyDir` volumes
- API-only deployment option
- Environment variables reference table
- Notes on migrations, caching, and `server_root_path`

**`docker/README.md`**

- Updated hardened setup notes to mention pre-built UI
- Added details about UI serving from read-only paths

---

- No breaking changes
- Existing deployments continue working without modifications
- New env vars are optional with sensible defaults
- Detection logic supports both old and new builds
- Graceful fallbacks throughout

---

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      initContainers:
        - name: setup-ui
          image: ghcr.io/berriai/litellm:main-stable
          command: ["sh", "-c", "cp -r /var/lib/litellm/ui/* /app/var/litellm/ui/"]
          volumeMounts:
            - name: ui-volume
              mountPath: /app/var/litellm/ui
      containers:
        - name: litellm
          env:
            - name: LITELLM_UI_PATH
              value: "/app/var/litellm/ui"
            - name: LITELLM_ASSETS_PATH
              value: "/app/var/litellm/assets"
          securityContext:
            readOnlyRootFilesystem: true
          volumeMounts:
            - name: ui-volume
              mountPath: /app/var/litellm/ui
      volumes:
        - name: ui-volume
          emptyDir:
            sizeLimit: 100Mi
```
2026-02-12 19:39:04 +05:30
Harshit Jain 3b043ee8bf fix critical CVE vulnerabilities (#20683) 2026-02-07 22:23:01 -08:00
Ishaan Jaffer ef66a6cb62 fix security scans 2026-02-07 11:15:02 -08:00
yuneng-jiang 7831e30666 keep package-lock.json in non-root 2026-02-04 17:58:22 -08:00
Ishaan Jaffer a002907389 fix tar security issue with TAR 2026-01-31 11:46:53 -08:00
milan-berri 8fcdf6105f fix: run prisma generate as nobody user in non-root container (#20000)
Fixes permission error where prisma generate fails with 'Permission denied'
when trying to write schema.prisma in non-root containers.

The issue was that prisma generate was running as root before switching
to nobody user, causing generated files to be owned by root:root.
Moving prisma generate after USER nobody ensures files are owned by
nobody:nobody and can be written to during runtime.

Fixes #19859
2026-01-29 19:04:59 -08:00
yuneng-jiang 1bf32deb6c Adding python3-dev to non root 2026-01-22 10:05:09 -08:00
Alexsander Hamir 5a06868652 Fix in-flight request termination on SIGTERM when health-check runs in a separate process (#19427) 2026-01-20 12:17:06 -08:00
Alexsander Hamir 0cd7763d5f Add health check scripts and parallel execution support (#19295)
- Add health_check_client.py for monitoring model availability
- Add health_check_client_README.md with usage documentation
- Add health_check_requirements.txt for dependencies
- Add run_parallel_health_checks.ps1 (PowerShell version)
- Add run_parallel_health_checks.sh (Bash version)
- Organize all scripts under scripts/health_check/ directory
2026-01-19 08:38:38 -08:00
.mobo 1b3c8fec83 put logfile and pidfile in /tmp to avoid permission denied error in non root environment (#17267) 2026-01-16 20:57:09 +05:30
Ishaan Jaff f98814ba8a fix include proxy/prisma_migration.py in non root (#18971) 2026-01-12 08:12:39 -08:00
Alexsander Hamir 1544e8f971 feat: Add line_profiler support for performance analysis and fix Windows CRLF issues in Docker builds (#18773) 2026-01-07 11:36:57 -08:00
Cesar Garcia 22ae1628e1 Add libsndfile to database Docker image for audio processing (#18612)
The litellm-database Docker image was missing the libsndfile system
library, which is required by the soundfile Python package for audio
file processing. This caused failures when using audio transcription
endpoints that attempt to calculate audio duration.

This adds libsndfile to the runtime dependencies in Dockerfile.database,
consistent with Dockerfile.alpine which already includes this library.
2026-01-06 01:23:30 +05:30
yuneng-jiang 05dd247ff5 Fix UI disappearing for development instances 2025-12-23 15:24:07 -08:00
yuneng-jiang 6bb5254c9b Revert "[Fix] UI - Disappears in Development Environments" 2025-12-23 15:08:07 -08:00
yuneng-jiang fccd2d1e87 Fix UI disappearing for development instances 2025-12-23 11:46:55 -08:00
Alexsander Hamir 4b652e19d8 [Fix] CI/CD - security_tests (#18305) 2025-12-20 17:08:28 -08:00
Cesar Garcia 089d1eb08b fix(docker): add libsndfile to Alpine image for audio processing (#18092)
ARM64 Alpine image was missing libsndfile library causing soundfile
module to fail with "cannot load library" error.
2025-12-17 10:07:57 +05:30
Mateo Di Loreto 107ea9043a [Feature] Download Prisma binaries at build time instead of at runtime for Security Restricted environments (#17695)
* Use config file to enable prometheus metrics

* Revert "Use config file to enable prometheus metrics"

This reverts commit 15ae36e1711791c0ac0a7aa84dcec142951717f5.

* Improve hardened stack and Prisma offline flow

* Document hardened compose usage

* Remove undesired change in fastapi-sso

* Restore dashboard lockfile

* Remove unnecessary tempdirs

* Document hardened/offline Docker validation flow
2025-12-16 21:25:53 +05:30
yuneng-jiang 1d95595522 Merge remote-tracking branch 'origin' into litellm_non_root_docker_logo_fix 2025-12-06 20:00:33 -08:00
Ishaan Jaffer 8af1be31eb fix build from pip 2025-12-06 16:09:27 -08:00
Ishaan Jaffer 3090197861 fix docker 2025-12-06 16:02:02 -08:00
Ishaan Jaffer bfdcfca8b0 fix test 2025-12-06 10:15:00 -08:00
Ishaan Jaffer 6358be3d0b fix build from PIP 2025-12-06 09:44:09 -08:00
Alexsander Hamir db40a38999 Add retry logic to apk package installation in Dockerfile.non_root (#17596)
- Add retry loop (3 attempts with 5s delay) to builder stage apk add command
- Add retry logic to runtime stage apk upgrade and apk add commands
- Improves resilience to transient network errors during package downloads
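
The retry pattern described above (3 attempts with a 5s delay) is implemented as a shell loop around `apk` in the Dockerfile. The sketch below expresses the same idea in Python for illustration only; `run_with_retries` and its parameters are hypothetical names, not part of the commit.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def run_with_retries(
    command: Sequence[str],
    attempts: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a command until it exits with 0; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(list(command))
        if result.returncode == 0:
            return attempt
        # Wait before the next attempt, but not after the final failure.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"command failed after {attempts} attempts: {list(command)}")


if __name__ == "__main__":
    # Harmless stand-in for a package install command.
    print(run_with_retries([sys.executable, "-c", "raise SystemExit(0)"]))
```

The `sleep` parameter is injectable so the delay can be stubbed out when exercising the failure path.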
2025-12-06 08:17:50 -08:00
Alexsander Hamir 655e04f16c Fix: apply_guardrail method and improve test isolation (#17555)
* Fix Bedrock guardrail apply_guardrail method and test mocks

Fixed 4 failing tests in the guardrail test suite:

1. BedrockGuardrail.apply_guardrail now returns original texts when guardrail
   allows content but doesn't provide output/outputs fields. Previously returned
   empty list, causing test_bedrock_apply_guardrail_success to fail.

2. Updated test mocks to use correct Bedrock API response format:
   - Changed from 'content' field to 'output' field
   - Fixed nested structure from {'text': {'text': '...'}} to {'text': '...'}
   - Added missing 'output' field in filter test

3. Fixed endpoint test mocks to return GenericGuardrailAPIInputs format:
   - Changed from tuple (List[str], Optional[List[str]]) to dict {'texts': [...]}
   - Updated method call assertions to use 'inputs' parameter correctly

All 12 guardrail tests now pass successfully.
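
The fallback described in item 1 can be sketched as follows. This is not the `BedrockGuardrail.apply_guardrail` implementation: `extract_guardrail_texts` is a hypothetical helper, and the response shape (an `output` or `outputs` list of `{'text': '...'}` entries) is an assumption based on the description above.

```python
from typing import Any, Dict, List


def extract_guardrail_texts(
    response: Dict[str, Any], original_texts: List[str]
) -> List[str]:
    """Return guardrail-provided texts, or the original texts if none are provided."""
    # Assumed shape: "output" or "outputs" holds a list of {"text": "..."} entries.
    entries = response.get("output") or response.get("outputs") or []
    texts = [
        entry["text"]
        for entry in entries
        if isinstance(entry, dict) and "text" in entry
    ]
    # When the guardrail allows the content without rewriting it, keep the
    # caller's texts instead of returning an empty list.
    return texts if texts else list(original_texts)


if __name__ == "__main__":
    print(extract_guardrail_texts({"action": "NONE"}, ["hello"]))
    print(extract_guardrail_texts({"output": [{"text": "masked"}]}, ["secret"]))
```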

* fix: remove python3-dev from Dockerfile.build_from_pip to avoid Python version conflict

The base image cgr.dev/chainguard/python:latest-dev already includes Python 3.14
and its development tools. Installing python3-dev pulls Python 3.13 packages
which conflict with the existing Python 3.14 installation, causing file
ownership errors during apk install.

* fix: disable callbacks in vertex fine-tuning tests to prevent Datadog logging interference

The test was failing because Datadog logging was making an HTTP POST request
that was being caught by the mock, causing assert_called_once() to fail.
By disabling callbacks during the test, we prevent Datadog from making any
HTTP calls, allowing the mock to only see the Vertex AI API call.

* fix: ensure test isolation in test_logging_non_streaming_request

Add proper cleanup to restore original litellm.callbacks after test execution.
This prevents test interference when running as part of a larger test suite,
where global state pollution was causing async_log_success_event to be
called multiple times instead of once.

Fixes test failure where the test expected async_log_success_event to be
called once but was being called twice due to callbacks from previous tests
not being cleaned up.
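
The cleanup described above amounts to saving the global callback list and restoring it in a `finally` block. The sketch below uses a `SimpleNamespace` stand-in named `fake_litellm` rather than importing `litellm`, and the helper `run_with_isolated_callbacks` is hypothetical. It shows the pattern, not the test code from the commit.

```python
from types import SimpleNamespace
from typing import Any, Callable, List

# Stand-in for the real module's global state (assumption for illustration).
fake_litellm = SimpleNamespace(callbacks=["datadog"])


def run_with_isolated_callbacks(
    module: Any, test_callbacks: List[Any], test_body: Callable[[], Any]
) -> Any:
    """Run test_body with temporary callbacks, then restore the original list."""
    original = module.callbacks
    module.callbacks = list(test_callbacks)
    try:
        return test_body()
    finally:
        # Restore even if the test body raises, so later tests see clean state.
        module.callbacks = original


if __name__ == "__main__":
    seen = run_with_isolated_callbacks(
        fake_litellm, ["custom_logger"], lambda: list(fake_litellm.callbacks)
    )
    print(seen, fake_litellm.callbacks)
```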
2025-12-05 12:59:35 -08:00
Krish Dholakia 74ba18df55 Litellm chainguard fixes 12 02 2025 p1 (#17406)
* build: update dockerfile non root

* build: update build

* build: update non root

* build: dockerfile fixes

* build: ensure dockerfile + dockerfile.database also work
2025-12-02 22:50:13 -08:00
Krrish Dholakia 8ee298f9c9 fix: remove python3 headers 2025-12-02 16:06:06 -08:00
Krrish Dholakia 7fb2f4730b build: remove duplicate packages 2025-12-02 15:53:10 -08:00
yuneng-jiang 031677636a Add user writable file to non root docker for logo 2025-11-26 21:44:02 -08:00
yuneng-jiang e371ff454a Non root docker build fix (#17060) 2025-11-24 20:45:56 -08:00
Ishaan Jaffer be71138af3 fix build bad db url 2025-11-22 10:10:08 -08:00
Ishaan Jaffer c34d8af329 test fix 2025-11-22 10:02:15 -08:00
yuneng-jiang 4b25398afe [Infra] CI/CD Fixes (#16937)
* Attempt CI/CD Fix

* Adding test for coverage

* Adding max depth to copilot and vertex

* Fixing mypy lint and docker database

* Fixing UI build issues

* Update playwright test
2025-11-21 13:58:19 -08:00
Alexsander Hamir 454ffcd9c7 fix: install runtime node for prisma (#16410)
Prisma CLI recently started bootstrapping npm@10 inside the runtime image, which now fails with a sizeCalculation cache error on the slim Python base. Installing Debian's nodejs/npm (along with libatomic1) lets Prisma reuse the system binaries so prisma generate completes again.
2025-11-08 15:48:32 -08:00
Ishaan Jaff 9288c8543c fix docker (#16342) 2025-11-07 14:38:20 -08:00
yuneng-jiang 5d158775b1 [Fix] Litellm non root docker Model Hub Table fix (#16282)
* Fix model hub table 404 on non-root docker

* Adding test
2025-11-05 18:30:20 -08:00
Kowyo 858f557bce docs: use docker compose instead of docker-compose 2025-09-29 11:59:53 +00:00
Arthur 6c97a31c9c bug: add supervisor to non-root image 2025-08-24 15:43:57 +02:00
Jan Kessler 3eecff44c6 fix permission access on prisma migrate in non-root image 2025-08-21 09:00:55 +02:00
Ishaan Jaff a328ad56e3 [Bug Fix] Fixes for using Auto Router with LiteLLM Docker Image (#13788)
* fix install auto router.sh

* fixes for Docker IMG
2025-08-19 18:36:30 -07:00
Ishaan Jaff 76f1064229 [Bug Fix] litellm incompatible with newest release of openAI v1.100.0 (#13728)
* fix imports OpenAI SDK

* ResponseText fixes

* fixes ResponseText

* fix imports

* catch AttributeError

* fix import

* use openai==1.100.1

* fix build from PIP

* fix lint test

* Print OpenAI version

* fix Install dependencies
2025-08-18 18:26:17 -07:00
Parham Alvani 849c262a02 fix: we need to have project files for running migration using this image (#13379) 2025-08-07 13:31:10 -07:00
Jugal D. Bhatt 9aeca96c16 fix openshift (#13239) 2025-08-02 22:37:02 -07:00
Mateo Di Loreto 6e5fe51184 add openssl in apk install in runtime stage in dockerfile.non_root (#13168)
* add openssl in apk install in runtime stage in dockerfile.non_root

* Improve Docker-compose.yaml for local debugging

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-07-31 21:52:11 -07:00