litellm/ci_cd/security_scans.sh
Ishaan Jaff 28c33f53a3 CircleCI test stability (#23055)
* fix: resolve ruff lint errors and mypy type error

- Remove unused import get_user_credential (F401)
- Add noqa: PLR0915 for 3 large functions exceeding 50 statements
- Cast result_data['q'] to str for _append_domain_filters (mypy arg-type)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add /vertex_ai/live to supported endpoints and azure gpt-5.1 reasoning flags

- Add /vertex_ai/live to JSON schema validation enum in test_utils.py
- Add supports_none_reasoning_effort=true to 10 azure/gpt-5.1 model entries
  (matching the OpenAI gpt-5.1 behavior)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: handle non-string team_alias/key_alias in PolicyMatchContext

Prevent Pydantic validation errors when team_alias or key_alias are not
proper strings (e.g. MagicMock in tests). Only pass values that are
actually strings; default to None otherwise.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: initialize jwt_handler.litellm_jwtauth in JWT test

The test_jwt_non_admin_team_route_access test was failing because
user_api_key_auth now accesses jwt_handler.litellm_jwtauth.virtual_key_claim_field
before reaching the mocked JWTAuthManager.auth_builder. Initialize the
jwt_handler with a default LiteLLM_JWTAuth object.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add missing mock attributes to MCP server test

The test_add_update_server_fallback_to_server_id test was failing because
MagicMock auto-creates attributes when accessed. build_mcp_server_from_table
accesses many fields via getattr(), which on a MagicMock returns another
MagicMock instead of None, causing Pydantic validation errors in MCPServer.

Explicitly set all required mock attributes.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update UI tests for leftnav, navbar, and KeyLifecycleSettings

- leftnav: Add mock for useTeams hook, add isUserTeamAdminForAnyTeam to
  roles mock, update topLevelLabels to match current component menu items
- navbar: Add mocks for useDisableBouncingIcon, BlogDropdown, UserDropdown,
  and serverRootPath. Update test to work with the new component structure.
- KeyLifecycleSettings: Fix placeholder and tooltip assertions to match
  actual component behavior

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update health check test assertion from 'connected' to 'healthy'

The /health/readiness endpoint now returns {"status": "healthy"} with the
DB status in a separate field, instead of the previous {"status": "connected"}.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: clear litellm.api_key in OpenRouter validate_environment test

The test_validate_environment_raises_without_key test was failing because
litellm.api_key may be set globally in the test environment. Clear it
along with OPENROUTER_API_KEY and OR_API_KEY env vars using monkeypatch.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: patch HTTPHandler class-level in VLLM embedding test

The test_encoding_format_not_sent_in_actual_request test was patching
client.post on an instance, but the handler uses the class method.
Patch HTTPHandler.post at class level, add caching=False to prevent
cache hits, and remove broad try/except that hid errors.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: make test_redaction_responses_api_stream resilient to async callback timing

Replace fixed 1s sleep with polling wait for async_log_success_event.
Streaming success handler runs via asyncio.create_task; 1s was insufficient
in CI. Add 0.5s initial sleep for event loop to schedule the task, then
poll up to 10s for the callback to fire.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update dompurify and svgo to fix security CVEs

- CVE-2026-0540: dompurify XSS vulnerability - fix by upgrading to 3.3.2+
- CVE-2026-29074: svgo DoS via entity expansion - fix by upgrading to 3.3.3+

Added npm overrides in docs/my-website/package.json and regenerated
package-lock.json.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: remove unused json import in config_override_endpoints.py

Ruff F401: json is imported but unused (safe_json_loads/safe_dumps
are used instead)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add missing MCP mock attributes and provider documentation entries

- Add missing mock attributes to test_add_update_server_with_alias and
  test_add_update_server_without_alias (same fix as fallback test)
- Add bedrock_mantle and searchapi to provider_endpoints_support.json
- Remove unused json import from config_override_endpoints.py

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: override _supports_reasoning_effort_level for Azure gpt5_series prefix

The Azure GPT-5 config uses 'gpt5_series/' as a routing prefix, but
_supports_factory(model='gpt5_series/gpt-5.1') fails to resolve because
'gpt5_series' is not a recognized provider. Override the method to strip
the prefix and prepend 'azure/' for correct model info lookup.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: accept both 'healthy' and 'connected' in health check test

The test_health_and_chat_completion test runs against both source builds
(which return 'healthy') and pip-installed versions (which may return
'connected'). Accept both values.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: mock extract_mcp_auth_context in streamable HTTP MCP handler test

The handle_streamable_http_mcp function now calls extract_mcp_auth_context
before session_manager.handle_request, but the test didn't mock it. The
auth extraction fails with the minimal mock scope, preventing
handle_request from being called. Also relax assertion to not check
exact args since the send wrapper may be modified by debug injection.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add test for _combine_fallback_usage to satisfy router code coverage

The router_code_coverage.py check requires all functions in router.py
to be called in test files. Add a basic test for _combine_fallback_usage.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add @log_guardrail_information decorator to CrowdStrike AIDR guardrail

The check_guardrail_apply_decorator.py CI check requires all guardrail
apply_guardrail methods to have the @log_guardrail_information decorator.
The CrowdStrike AIDR handler was missing it.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document PRISMA_RECONNECT_ESCALATION_THRESHOLD and REDIS_CLUSTER_NODES env keys

Add missing environment variable documentation to config_settings.md
to satisfy the test_env_keys.py CI check.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document enforced_file_expires_after and enforced_batch_output_expires_after in new_team docstring

The test_api_docs.py CI check validates that all Pydantic model fields
are documented in the function docstring. Add missing parameter docs
for enforced_file_expires_after and enforced_batch_output_expires_after.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: regenerate poetry.lock to match pyproject.toml

The poetry.lock file was out of sync with pyproject.toml, causing
proxy_e2e_azure_batches_tests to fail during dependency installation.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: set master_key=None in test_create_file_with_deep_nested_litellm_metadata

The test was missing the master_key monkeypatch that other tests in the
same file set. In CI with parallel execution (-n 4), another test may
set master_key to a non-None value, causing auth failures (500) when
the test sends 'Bearer test-key'.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document enforced_*_expires_after in update_team docstring too

Same missing params as new_team - also needed in update_team docstring
for the test_api_docs.py CI check to pass.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: use get_async_httpx_client in a2a_protocol and add master_key monkeypatch to files tests

- Replace httpx.AsyncClient() with get_async_httpx_client() in a2a_protocol/main.py
  to satisfy the ensure_async_clients_test CI check
- Add httpxSpecialProvider.A2AProvider enum value
- Add master_key=None monkeypatch to test_managed_files_with_loadbalancing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: remove unused httpx import from a2a_protocol/main.py

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: use cache-key-only param for A2A extra_headers to avoid AsyncHTTPHandler init error

The 'extra_headers' key in params was being passed to AsyncHTTPHandler.__init__()
which doesn't accept it. Use 'disable_aiohttp_transport' as the cache-key-only
param since it's explicitly filtered out before reaching the constructor.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add additionalProperties:false and resolve $defs/$ref in Anthropic output_format schemas

Anthropic API now requires additionalProperties=false for all object-type
schemas in output_format. Also resolve $defs/$ref references by inlining
them using unpack_defs before sending to Anthropic, since Anthropic
doesn't support external schema references.

Fixes: llm_translation_testing Anthropic JSON schema failures

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: allowlist CVE-2026-2297 and GHSA-qffp-2rhf-9h96 in security scans

- CVE-2026-2297: Python 3.13 SourcelessFileLoader audit hook bypass,
  no fix available in base image
- GHSA-qffp-2rhf-9h96: tar hardlink path traversal, from nodejs_wheel
  bundled npm, not used in application runtime code

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: isolate files endpoint tests from shared proxy state in CI parallel execution

Override user_api_key_auth dependency to return a fixed UserAPIKeyAuth
with PROXY_ADMIN role, avoiding auth lookups via prisma_client,
user_api_key_cache, or master_key. Set prisma_client=None to prevent
DB state contamination. Use try/finally to clean up dependency overrides.

Fixes persistent test_create_file_with_deep_nested_litellm_metadata and
test_managed_files_with_loadbalancing 500 errors in CI with -n 4.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: apply same auth override to test_managed_files_with_loadbalancing

Same CI parallel execution fix as test_create_file_with_deep_nested -
override user_api_key_auth dependency and set prisma_client=None.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-03-07 15:19:39 -08:00
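
The allowlisting step named in the commit message can be pictured with a small Bash-only sketch. The script itself does this filtering inside `jq`; the helper names `is_allowlisted` and `filter_findings` below are hypothetical and exist only for illustration, and the allowlist is cut down to the two IDs this commit adds.

```bash
#!/bin/bash
# Hypothetical sketch: a Bash-only allowlist filter.
# The real script performs the equivalent check inside jq.

# Reduced allowlist: only the two IDs added by this commit.
ALLOWED_CVES=(
  "CVE-2026-2297"
  "GHSA-qffp-2rhf-9h96"
)

# is_allowlisted ID
# Exit status 0 if ID is in ALLOWED_CVES, 1 otherwise.
is_allowlisted() {
  local id="$1" allowed
  for allowed in "${ALLOWED_CVES[@]}"; do
    [[ "$allowed" == "$id" ]] && return 0
  done
  return 1
}

# filter_findings
# Reads one vulnerability ID per line on stdin and prints only
# the IDs that are NOT allowlisted. Blank lines are skipped.
filter_findings() {
  local id
  while IFS= read -r id; do
    [[ -z "$id" ]] && continue
    is_allowlisted "$id" || printf '%s\n' "$id"
  done
}
```

Piping a list of scanner IDs through `filter_findings` leaves only the findings that would still count toward the failure threshold. Because a vulnerability can be reported under either its CVE or its GHSA ID, both aliases need to be in the array for it to be filtered reliably.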


#!/bin/bash
# Security Scans Script for LiteLLM
# This script runs comprehensive security scans including Trivy and Grype
set -e
echo "Starting security scans for LiteLLM..."
# Function to install Trivy and required tools
install_trivy() {
echo "Installing Trivy and required tools..."
sudo apt-get update
sudo apt-get install -y wget apt-transport-https gnupg lsb-release jq curl
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
echo "deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | sudo tee -a /etc/apt/sources.list.d/trivy.list
sudo apt-get update
# -y keeps the install non-interactive so it cannot stall or abort in CI
sudo apt-get install -y trivy
echo "Trivy and required tools installed successfully"
}
# Function to install Grype
install_grype() {
echo "Installing Grype..."
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sudo sh -s -- -b /usr/local/bin
echo "Grype installed successfully"
}
# Function to install ggshield
install_ggshield() {
echo "Installing ggshield..."
pip3 install --upgrade pip
pip3 install ggshield
echo "ggshield installed successfully"
}
# # Function to run secret detection scans
# run_secret_detection() {
# echo "Running secret detection scans..."
# if ! command -v ggshield &> /dev/null; then
# install_ggshield
# fi
# # Check if GITGUARDIAN_API_KEY is set (required for CI/CD)
# if [ -z "$GITGUARDIAN_API_KEY" ]; then
# echo "Warning: GITGUARDIAN_API_KEY environment variable is not set."
# echo "ggshield requires a GitGuardian API key to scan for secrets."
# echo "Please set GITGUARDIAN_API_KEY in your CI/CD environment variables."
# exit 1
# fi
# echo "Scanning codebase for secrets..."
# echo "Note: Large codebases may take several minutes due to API rate limits (50 requests/minute on free plan)"
# echo "ggshield will automatically handle rate limits and retry as needed."
# echo "Binary files, cache files, and build artifacts are excluded via .gitguardian.yaml"
# # Use --recursive for directory scanning and auto-confirm if prompted
# # .gitguardian.yaml will automatically exclude binary files, wheel files, etc.
# # GITGUARDIAN_API_KEY environment variable will be used for authentication
# echo y | ggshield secret scan path . --recursive || {
# echo ""
# echo "=========================================="
# echo "ERROR: Secret Detection Failed"
# echo "=========================================="
# echo "ggshield has detected secrets in the codebase."
# echo "Please review discovered secrets above, revoke any actively used secrets"
# echo "from underlying systems and make changes to inject secrets dynamically at runtime."
# echo ""
# echo "For more information, see: https://docs.gitguardian.com/secrets-detection/"
# echo "=========================================="
# echo ""
# exit 1
# }
# echo "Secret detection scans completed successfully"
# }
# Function to run Trivy scans
run_trivy_scans() {
echo "Running Trivy scans..."
echo "Scanning LiteLLM Docs..."
trivy fs --ignorefile .trivyignore --scanners vuln --dependency-tree --exit-code 1 --severity HIGH,CRITICAL,MEDIUM ./docs/
echo "Scanning LiteLLM UI..."
trivy fs --ignorefile .trivyignore --scanners vuln --dependency-tree --exit-code 1 --severity HIGH,CRITICAL,MEDIUM ./ui/
echo "Trivy scans completed successfully"
}
# Function to build and scan Docker images with Grype
run_grype_scans() {
echo "Running Grype scans..."
# Temporarily add wheel files to .dockerignore for security scans
echo "Temporarily modifying .dockerignore to exclude problematic wheel files..."
cp .dockerignore .dockerignore.backup 2>/dev/null || touch .dockerignore.backup
echo "/*.whl" >> .dockerignore
# Build and scan Dockerfile.database
echo "Building and scanning Dockerfile.database..."
docker build --no-cache -t litellm-database:latest -f ./docker/Dockerfile.database .
grype litellm-database:latest --config ci_cd/.grype.yaml --fail-on critical
# Build and scan main Dockerfile
echo "Building and scanning main Dockerfile..."
docker build --no-cache -t litellm:latest .
grype litellm:latest --config ci_cd/.grype.yaml --fail-on critical
# Restore original .dockerignore
echo "Restoring original .dockerignore..."
mv .dockerignore.backup .dockerignore
# Scan the locally built LiteLLM image for vulnerabilities with CVSS >= 4.0
echo "Scanning locally built LiteLLM image for high-severity vulnerabilities..."
echo "Using locally built image: litellm:latest"
# Allowlist of CVEs to be ignored in failure threshold/reporting
# - CVE-2025-8869: Not applicable on Python >=3.13 (PEP 706 implemented); pip fallback unused; no OS-level fix
# - GHSA-4xh5-x5gv-qwph: GitHub Security Advisory alias for CVE-2025-8869
# - GHSA-5j98-mcp5-4vw2: glob CLI command injection via -c/--cmd; glob CLI is not used in the litellm runtime image,
# and the vulnerable versions are pulled in only via OS-level/node tooling outside of our application code
ALLOWED_CVES=(
"CVE-2025-8869"
"GHSA-4xh5-x5gv-qwph"
"CVE-2025-8291" # no fix available as of Oct 11, 2025
"GHSA-5j98-mcp5-4vw2"
"CVE-2025-13836" # Python 3.13 HTTP response reading OOM/DoS - no fix available in base image
"CVE-2025-12084" # Python 3.13 xml.dom.minidom quadratic algorithm - no fix available in base image
"CVE-2025-60876" # BusyBox wget HTTP request splitting - no fix available in Chainguard Wolfi base image
"CVE-2026-0861" # Wolfi glibc still flagged even on 2.42-r5; upstream patched build unavailable yet
"CVE-2010-4756" # glibc glob DoS - awaiting patched Wolfi glibc build
"CVE-2019-1010022" # glibc stack guard bypass - awaiting patched Wolfi glibc build
"CVE-2019-1010023" # glibc ldd remap issue - awaiting patched Wolfi glibc build
"CVE-2019-1010024" # glibc ASLR mitigation bypass - awaiting patched Wolfi glibc build
"CVE-2019-1010025" # glibc pthread heap address leak - awaiting patched Wolfi glibc build
"CVE-2026-22184" # zlib untgz buffer overflow - untgz unused + no fixed Wolfi build yet
"GHSA-58pv-8j8x-9vj2" # jaraco.context path traversal - setuptools vendored only (v5.3.0), not used in application code (using v6.1.0+)
"GHSA-34x7-hfp2-rc4v" # node-tar hardlink path traversal - not applicable, tar CLI not exposed in application code
"GHSA-r6q2-hw4h-h46w" # node-tar not used by application runtime; Linux-only container, not affected by the macOS APFS-specific exploit
"GHSA-8rrh-rw8j-w5fx" # wheel comes from Chainguard and will be fixed by them. TODO: Remove this after Chainguard updates the wheel
"CVE-2025-59465" # Node only used for Admin UI build/prisma
"CVE-2025-55131" # Node only used for Admin UI build/prisma
"CVE-2025-59466" # Node only used for Admin UI build/prisma
"CVE-2025-55130" # Node only used for Admin UI build/prisma
"CVE-2025-59467" # Node only used for Admin UI build/prisma
"CVE-2026-21637" # Node only used for Admin UI build/prisma
"CVE-2025-55132" # Node only used for Admin UI build/prisma
"GHSA-hx9q-6w63-j58v" # orjson dumps recursion; allowlisted
"CVE-2025-15281" # No fix available yet
"CVE-2026-0865" # No fix available yet
"CVE-2025-15282" # No fix available yet
"CVE-2026-0672" # No fix available yet
"CVE-2025-15366" # No fix available yet
"CVE-2025-15367" # No fix available yet
"CVE-2025-12781" # No fix available yet
"CVE-2025-11468" # No fix available yet
"CVE-2026-1299" # Python 3.13 email module header injection - not applicable, LiteLLM doesn't use BytesGenerator for email serialization
"CVE-2026-0775" # npm cli incorrect permission assignment - no fix available yet, npm is only used at build/prisma-generate time
"GHSA-3ppc-4f35-3m26" # minimatch ReDoS via repeated wildcards - from nodejs_wheel bundled npm, not used in application runtime code
"GHSA-83g3-92jg-28cx" # tar arbitrary file read/write via hardlink - from nodejs_wheel bundled npm, not used in application runtime code
"CVE-2026-25639" # axios - full fix requires 1.x major version bump; pinned to >=0.30.2 to clear other axios CVEs, upgrade to 1.x in follow-up
"CVE-2026-2297" # Python 3.13 SourcelessFileLoader audit hook bypass - no fix available in base image
"GHSA-qffp-2rhf-9h96" # tar hardlink path traversal - from nodejs_wheel bundled npm, not used in application runtime code
)
# Build JSON array of allowlisted CVE IDs for jq
ALLOWED_IDS_JSON=$(printf '%s\n' "${ALLOWED_CVES[@]}" | jq -R . | jq -s .)
echo "Checking for vulnerabilities with CVSS score >= 4.0..."
echo "Allowlisted CVEs (ignored in threshold): ${ALLOWED_CVES[*]}"
echo ""
# Show all high-severity vulnerabilities for transparency.
# any() counts each match once, even when it carries several CVSS entries >= 4.0.
TOTAL_HIGH_SEVERITY=$(grype litellm:latest -o json | jq -r '
.matches[]
| select(any(.vulnerability.cvss[]?; .metrics.baseScore >= 4.0))
| .vulnerability.id' | wc -l)
if [ "$TOTAL_HIGH_SEVERITY" -gt 0 ]; then
echo "Total vulnerabilities found with CVSS >= 4.0: $TOTAL_HIGH_SEVERITY"
echo ""
echo "All high-severity vulnerabilities (including allowlisted):"
grype litellm:latest -o json | jq --argjson allow "$ALLOWED_IDS_JSON" -r '
["Package", "Version", "Vulnerability ID", "CVSS Score", "Allowlisted"],
(.matches[]
| select(any(.vulnerability.cvss[]?; .metrics.baseScore >= 4.0))
| [.artifact.name, .artifact.version, .vulnerability.id, .vulnerability.cvss[0].metrics.baseScore, (if (.vulnerability.id as $id | $allow | index($id)) then "YES" else "NO" end)])
| @tsv' | column -t -s $'\t'
echo ""
fi
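# Count non-allowlisted findings; any() emits each match once even if it has several qualifying CVSS entries.
HIGH_SEVERITY_COUNT=$(grype litellm:latest -o json | jq --argjson allow "$ALLOWED_IDS_JSON" -r '
.matches[]
| select(any(.vulnerability.cvss[]?; .metrics.baseScore >= 4.0))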
| select((.vulnerability.id as $id | $allow | index($id) | not))
| .vulnerability.id' | wc -l)
if [ "$HIGH_SEVERITY_COUNT" -gt 0 ]; then
echo ""
echo "=========================================="
echo "ERROR: Security Scan Failed"
echo "=========================================="
echo "Found $HIGH_SEVERITY_COUNT non-allowlisted vulnerabilities with CVSS score >= 4.0 in litellm:latest"
echo ""
echo "These vulnerabilities are NOT in the allowlist and must be addressed."
echo "Current allowlisted CVEs: ${ALLOWED_CVES[*]}"
echo ""
echo "Detailed vulnerability report:"
echo ""
grype litellm:latest -o json | jq --argjson allow "$ALLOWED_IDS_JSON" -r '
["Package", "Version", "Vulnerability ID", "CVSS Score", "Severity", "Fix Version", "Description"],
(.matches[]
| select(any(.vulnerability.cvss[]?; .metrics.baseScore >= 4.0))
| select((.vulnerability.id as $id | $allow | index($id) | not))
| [.artifact.name, .artifact.version, .vulnerability.id, .vulnerability.cvss[0].metrics.baseScore, .vulnerability.severity, (.vulnerability.fix.versions[0] // "No fix available"), .vulnerability.description])
| @tsv' | column -t -s $'\t'
echo ""
echo "=========================================="
echo "Action Required:"
echo "=========================================="
echo "1. If a fix is available, update the package to the fixed version"
echo "2. If the vulnerability is not applicable or has no fix:"
echo " - Add the CVE/GHSA ID to ALLOWED_CVES array in ci_cd/security_scans.sh"
echo " - Add a comment explaining why it's safe to ignore"
echo ""
echo "Note: Some vulnerabilities may have multiple IDs (CVE-XXXX and GHSA-XXXX)."
echo "Add all relevant IDs to the allowlist if they refer to the same issue."
echo "=========================================="
echo ""
exit 1
else
echo "No high-severity vulnerabilities (CVSS >= 4.0) found in litellm:latest"
fi
echo "Grype scans completed successfully"
}
# Main execution
main() {
echo "Installing security scanning tools..."
install_trivy
install_grype
# echo "Running secret detection scans..."
# run_secret_detection
echo "Running filesystem vulnerability scans..."
run_trivy_scans
echo "Running Docker image vulnerability scans..."
run_grype_scans
echo "All security scans completed successfully!"
}
# Execute main function
main "$@"
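
Because the script runs under `set -e`, a failing `docker build` or `grype` call in `run_grype_scans` exits before the "Restoring original .dockerignore" step, leaving the temporary `/*.whl` entry in place. One way to guard against that is sketched below. The wrapper name `with_temp_dockerignore` is hypothetical and not part of the script; this is an illustration under that assumption, not the project's implementation.

```bash
#!/bin/bash
# Hypothetical sketch: restore .dockerignore even when the wrapped step fails.

# with_temp_dockerignore CMD [ARGS...]
# Appends "/*.whl" to .dockerignore, runs CMD, then puts the original
# content back (or removes the file if none existed beforehand).
# Returns CMD's exit status.
with_temp_dockerignore() {
  local backup status=0 had_file=0
  backup="$(mktemp)"
  if [ -f .dockerignore ]; then
    had_file=1
    cp .dockerignore "$backup"
  fi
  echo "/*.whl" >> .dockerignore

  # Capture the status instead of letting `set -e` abort before cleanup.
  "$@" || status=$?

  if [ "$had_file" -eq 1 ]; then
    # Copy back rather than move, so the original file mode is kept.
    cp "$backup" .dockerignore
  else
    rm -f .dockerignore
  fi
  rm -f "$backup"
  return "$status"
}
```

A call such as `with_temp_dockerignore docker build --no-cache -t litellm:latest .` would then leave `.dockerignore` untouched whether or not the build succeeds. One caveat: if the wrapped command is itself a shell function, Bash suppresses `set -e` inside it because it runs on the left side of `||`, so its steps should be chained with `&&` or checked explicitly.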