98 Commits

Author SHA1 Message Date
yuneng-jiang 846e4b44b6 Merge pull request #24682 from michelligabriele/fix/budget-spend-counters
fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
2026-03-27 16:59:23 -07:00
michelligabriele d533b432fd fix(proxy): enforce budget limits across multi-pod deployments via Redis-backed spend counters
Budget checks on API keys, teams, and team members were not enforced in
multi-pod deployments because user_api_key_cache is intentionally
in-memory-only. Each pod tracked spend independently, so with N pods
the effective budget was N × max_budget.

Introduces a separate spend_counter_cache (DualCache wired to
redis_usage_cache) with atomic increment/read helpers:
- increment_spend_counters(): awaited in cost callback (not create_task)
  to update both in-memory and Redis before the next auth check
- get_current_spend(): reads Redis first (cross-pod authoritative),
  falls back to in-memory, then to cached object .spend from DB

Budget check functions (_virtual_key_max_budget_check,
_team_max_budget_check, _check_team_member_budget) now read spend via
get_current_spend() instead of cached object .spend fields.

When Redis is not configured, falls back to in-memory-only counters
(same as current single-instance behavior).

Fixes #23714
2026-03-27 20:39:52 +01:00
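The read order described in d533b432fd (Redis first, then in-memory, then the cached object's .spend) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function signature is hypothetical, plain dicts stand in for the Redis and in-memory layers, and `is_over_budget` is an assumed helper whose comparison may differ from the real checks.

```python
from typing import Optional


def get_current_spend(
    counter_key: str,
    redis_counters: Optional[dict],
    memory_counters: dict,
    cached_spend: Optional[float],
) -> float:
    """Return spend using the fallback order named in the commit message."""
    # 1. Redis (stand-in dict here) is treated as authoritative across pods.
    if redis_counters is not None and counter_key in redis_counters:
        return float(redis_counters[counter_key])
    # 2. Otherwise use this pod's in-memory counter.
    if counter_key in memory_counters:
        return float(memory_counters[counter_key])
    # 3. Last resort: the .spend value loaded from the DB, or 0.0 if absent.
    return float(cached_spend or 0.0)


def is_over_budget(spend: float, max_budget: Optional[float]) -> bool:
    # Hypothetical helper: no max_budget means no limit is enforced.
    return max_budget is not None and spend >= max_budget
```

With `redis_counters=None` (Redis not configured) the function degrades to the in-memory counter, which matches the single-instance fallback the commit describes.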
Sameer Kankute 92a07e2d6e fix(proxy): address Greptile review feedback
- Remove HTTP_PROXY/HTTPS_PROXY from blocklist (legitimately used in corporate envs)
- Add NO_PROXY/no_proxy to blocklist (prevents bypassing proxy monitoring)
- Remove dead code in _is_valid_user_id (space exception was unreachable)
- Update tests accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
Sameer Kankute 8112fbf274 fix(proxy): sanitize user_id input and block dangerous env var keys
Add input validation to get_user_id_from_request (length limit, control char rejection) and a blocklist of dangerous environment variable keys in _load_environment_variables to prevent PATH/LD_PRELOAD/PYTHONPATH override via config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 20:38:36 +05:30
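The two checks named in 8112fbf274 might look roughly like the sketch below. The function names, the 256-character limit, and the rejection of empty IDs are assumptions for illustration; only the three blocked keys come from the commit message, and the follow-up commit 92a07e2d6e later adjusted the blocklist.

```python
import unicodedata

MAX_USER_ID_LENGTH = 256  # assumed value; the commit does not state the limit
BLOCKED_ENV_KEYS = {"PATH", "LD_PRELOAD", "PYTHONPATH"}  # keys named in the commit


def is_valid_user_id(user_id: str) -> bool:
    """Reject empty, over-long, or control-character-bearing user IDs."""
    if not user_id or len(user_id) > MAX_USER_ID_LENGTH:
        return False
    # Unicode category "Cc" covers control characters such as \n and \x00.
    return not any(unicodedata.category(ch) == "Cc" for ch in user_id)


def filter_env_vars(env: dict) -> dict:
    """Drop blocklisted keys before they are applied to the environment."""
    return {key: value for key, value in env.items() if key not in BLOCKED_ENV_KEYS}
```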
Ryan Crabbe ad43a35d76 feat: add control plane for multi-proxy worker management
Adds a control plane capability that enables a central admin instance
to manage multiple regional worker proxies from a single UI.

Backend:
- Worker registry loaded from YAML config (worker_id, name, url)
- /.well-known/litellm-ui-config exposes is_control_plane and workers list
- /v3/login + /v3/login/exchange: opaque code exchange for cross-origin
  username/password auth (JWT never in URL/logs, single-use 60s TTL)
- SSO cookie handoff with return_to → opaque code → exchange
- _validate_return_to: full origin validation (scheme+hostname+port)
- Startup warning when control_plane_url set without Redis
- Both /v3 endpoints gated behind control_plane_url config

Frontend:
- Worker selector dropdown on login page (gated behind is_control_plane)
- Cross-origin SSO code exchange handling on callback
- switchToWorkerUrl: localStorage-persisted worker URL for API calls
- useWorker hook: shared worker state management
- WorkerDropdown in navbar for switching workers
- Logout/switch clears worker state from localStorage

Tests:
- 7 tests for /v3/login + /v3/login/exchange
- 10 tests for _validate_return_to
- 2 tests for control plane discovery endpoint
2026-03-19 22:50:19 -07:00
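The "full origin validation (scheme+hostname+port)" mentioned for _validate_return_to in ad43a35d76 could be approximated as below. This is a sketch under assumptions: the function names and the idea of comparing against a list of registered worker URLs are illustrative, not taken from the commit.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Optional[Tuple[str, str, int]]:
    """Return (scheme, hostname, port), or None if the URL is unusable."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return None
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS or not parts.hostname:
        return None
    # Fill in the scheme's default port so :443 and no port compare equal.
    return (scheme, parts.hostname, port if port is not None else DEFAULT_PORTS[scheme])


def validate_return_to(return_to: str, worker_urls: List[str]) -> bool:
    """Accept return_to only if its origin exactly matches a known worker."""
    origin = origin_of(return_to)
    if origin is None:
        return False
    return any(origin == origin_of(worker) for worker in worker_urls)
```

Comparing parsed origins rather than string prefixes is what rejects look-alike hosts such as `worker-eu.example.com.evil.test`.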
Sameer Kankute 7e2f2a8ffa Fix inflight mypy 2026-03-02 19:41:32 +05:30
Darien Kindlund dc96ade956 fix: preserve interval_hours in model cost map reload config (#22200)
The upsert update branches for model_cost_map_reload_config were
overwriting param_value with only the force_reload flag, dropping
interval_hours. This caused scheduled reloads to self-destruct
after their first execution.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-02 19:21:11 +05:30
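The bug fixed in dc96ade956 is an overwrite where a merge was needed. A minimal sketch of the merge, assuming param_value is stored as a JSON string (the storage format and the helper name are assumptions):

```python
import json


def merge_reload_config(existing_param_value: str, force_reload: bool) -> str:
    """Update force_reload while keeping other keys such as interval_hours."""
    # Start from whatever is already stored instead of a fresh dict.
    config = json.loads(existing_param_value) if existing_param_value else {}
    config["force_reload"] = force_reload
    return json.dumps(config)
```

Writing only `{"force_reload": ...}` on update would drop interval_hours, which is the self-destructing schedule the commit describes.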
Ryan Crabbe 5bcaeabfd8 Merge origin/main into litellm_fix_streaming_connection_pool_leak
Resolve conflict in test_proxy_server.py: keep both async_data_generator
cleanup tests and store_model_in_db DB config override tests.
2026-02-21 12:44:50 -08:00
yuneng-jiang 6bfab8acd4 address greptile review feedback (greploop iteration 2)
Reset logo_path to default_logo when custom UI_LOGO_PATH file doesn't
exist, so the else branch at the bottom of get_image serves the default
logo instead of the non-existent custom path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 20:25:00 -08:00
yuneng-jiang 145efe2267 address greptile review feedback (greploop iteration 1)
Add os.path.exists check before serving custom local logo so that a
non-existent UI_LOGO_PATH gracefully falls through to the cache/default
instead of causing a FileResponse error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 20:09:42 -08:00
yuneng-jiang a8026154ab [Fix] /get_image returns stale cached logo instead of custom UI_LOGO_PATH
The /get_image endpoint checked for cached_logo.jpg before reading the
UI_LOGO_PATH env var, so a pre-existing cache (e.g. baked into the base
Docker image) would always be served, ignoring the user's custom logo.

Move the UI_LOGO_PATH read before the cache check and serve local file
paths directly, bypassing the cache. The cache optimization is preserved
for HTTP URLs and the default logo where it is actually needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-19 18:32:48 -08:00
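The decision order after a8026154ab, combined with the existence check from the two review-feedback commits listed earlier in this log, might be summarized as below. The function and its return labels are hypothetical; the real endpoint serves files rather than returning strings.

```python
def choose_logo_source(
    ui_logo_path: str, cache_exists: bool, custom_file_exists: bool
) -> str:
    """Return which source a /get_image-style handler would serve from."""
    is_remote = ui_logo_path.startswith(("http://", "https://"))
    # A custom local file is checked first and bypasses the cache.
    if ui_logo_path and not is_remote and custom_file_exists:
        return "custom_file"
    # The cache still applies to HTTP URLs and the default logo.
    if cache_exists:
        return "cached_logo"
    return "remote_url" if is_remote else "default_logo"
```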
Julio Quinteros Pro 68b83b1376 fix(tests): restore litellm.model_cost after TestPriceDataReloadIntegration tests
test_complete_reload_flow and test_distributed_reload_check_function both
trigger code paths that assign a minimal stub dict to litellm.model_cost
(via the /reload/model_cost_map endpoint and _check_and_reload_model_cost_map).
Without restoring, subsequent tests in the same worker can't find gpt-4o
pricing and calculate spend=0.0 instead of the expected value.

Added try/finally save-and-restore of litellm.model_cost in both tests,
matching the pattern used in test_reload_model_cost_map_admin_access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 01:42:48 -03:00
yuneng-jiang 2c6095bdf2 Merge pull request #21511 from BerriAI/litellm_store_model_in_db_from_database
[Feature] Allow store_model_in_db to be set via database
2026-02-18 17:43:56 -08:00
yuneng-jiang 6356db560d [Feature] Allow store_model_in_db to be set via database
Users had to set store_model_in_db in the config YAML and restart the proxy,
causing service downtime. This change allows the value to be written to the
LiteLLM_Config table and read from the database at runtime, with DB values
overriding config file values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-18 17:25:39 -08:00
Julio Quinteros Pro 1c0f4302f8 fix(tests): restore litellm.model_cost after reload endpoint test
test_reload_model_cost_map_admin_access calls the /reload/model_cost_map
HTTP endpoint with get_model_cost_map mocked to return a single-entry
dict. The endpoint handler does a direct module-level assignment
(litellm.model_cost = new_model_cost_map) which persists after the
patch context manager exits, stripping all models except gpt-3.5-turbo
from the in-memory cost map and causing subsequent tests that rely on
models like gemini-1.5-flash, multimodalembedding@001, and gpt-4o to
fail with "model not mapped" errors or zero-cost spend payloads.

Fix: save litellm.model_cost before the test and restore it (along with
invalidating the case-insensitive lookup cache) in a finally block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 18:20:48 -03:00
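The save-and-restore pattern from 1c0f4302f8 can be shown without the library installed. In this sketch `fake_litellm` is a stand-in namespace, and `reload_endpoint_stub` imitates the handler's module-level assignment; neither name comes from the test suite.

```python
from types import SimpleNamespace

# Stand-in for the litellm module so the sketch runs on its own.
fake_litellm = SimpleNamespace(model_cost={"gpt-4o": {}, "gpt-3.5-turbo": {}})


def reload_endpoint_stub() -> None:
    # Imitates the handler replacing the whole map with a one-entry stub.
    fake_litellm.model_cost = {"gpt-3.5-turbo": {}}


def run_test_with_restore() -> None:
    original = fake_litellm.model_cost  # save before the test mutates it
    try:
        reload_endpoint_stub()
        assert list(fake_litellm.model_cost) == ["gpt-3.5-turbo"]
    finally:
        # Runs even if the assertion fails, so later tests see the full map.
        fake_litellm.model_cost = original
```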
Julio Quinteros Pro 77f315eb11 fix: address Greptile review feedback for test isolation
- test_pillar_guardrails.py: Fix fixture to properly update module-level
  litellm reference using global keyword and assignment from reload
- test_anthropic_experimental_pass_through_messages_handler.py: Add missing
  assert keywords to kwargs comparison statements (lines 36, 60-62)
- test_proxy_server.py: Replace silent pytest.skip with explicit assertion
  to catch router initialization regressions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-17 21:28:23 -03:00
jquinter f555846c83 Merge pull request #21227 from BerriAI/fix/sso-test-premium-check
Fix SSO test flakiness by correctly mocking premium_user
2026-02-15 13:18:08 -03:00
Julio Quinteros Pro c52251ca72 test: Fix test isolation issues caused by module reloading
Fix several tests that fail in CI due to parallel test execution and
module reloading in conftest.py.

1. test_empty_assistant_message_handling:
   - Use patch.object on factory_module.litellm instead of direct assignment
   - Ensures the correct litellm reference is modified after conftest reloads

2. test_embedding_header_forwarding_with_model_group:
   - Use patch.object on pre_call_utils_module.litellm instead of direct assignment
   - Same fix for module reloading issue

3. test_embedding_input_array_of_tokens:
   - Move mock inside test function (after fixture initializes router)
   - Add skip condition if llm_router is None
   - Fixes "AttributeError: None does not have 'aembedding'" in parallel execution

Root cause: conftest.py reloads litellm at module scope, which can cause:
- Different litellm references between test code and library code
- Global state (like llm_router) being None at decorator execution time
- isinstance checks failing due to class identity mismatches
2026-02-15 13:08:41 -03:00
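The root cause in c52251ca72 (two different litellm objects after a reload) can be reproduced in miniature. Everything here is a stand-in: the namespaces imitate the pre-reload and post-reload modules, and `modify_params` is used only as an example flag.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-ins, so the sketch runs without litellm installed.
litellm_before_reload = SimpleNamespace(modify_params=False)
litellm_after_reload = SimpleNamespace(modify_params=False)

# The library module imported litellm before conftest reloaded it.
factory_module = SimpleNamespace(litellm=litellm_before_reload)


def library_sees_flag() -> bool:
    # Library code reads the flag through its own module-level reference.
    return factory_module.litellm.modify_params


def direct_assignment_is_invisible() -> bool:
    # The test sets the flag on the object it imported after the reload.
    litellm_after_reload.modify_params = True
    return library_sees_flag()


def patch_object_is_visible() -> bool:
    # Patching through factory_module targets the object the library uses;
    # the original value is restored when the with-block exits.
    with patch.object(factory_module.litellm, "modify_params", True):
        return library_sees_flag()
```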
Julio Quinteros Pro 12fddb4b8a Fix SSO test flakiness by mocking premium_user correctly
The test_sso_key_generate_shows_deprecation_banner test was failing in CI
with a 403 Forbidden error because the SSO endpoint checks for premium_user
at line 297 in ui_sso.py.

The fix adds a monkeypatch for premium_user at its source location
(litellm.proxy.proxy_server.premium_user) to bypass the enterprise check
during testing.

Fixes the intermittent test failure where the endpoint would return 403
instead of the expected 200 status code.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 13:05:37 -03:00
Ryan Crabbe f5e36066ab fix: add debug logging to stream cleanup, improve tests 2026-02-14 17:31:39 -08:00
Ryan Crabbe 75ca44434a fix: close streaming connections to prevent connection pool exhaustion
- Add aclose() to CustomStreamWrapper to delegate to underlying stream
- Add finally block in async_data_generator to release HTTP connections
- Thread shared_session through async_streaming to reuse connection pool
- Set finite default timeout (600s) in _get_openai_client
2026-02-13 21:53:28 -08:00
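The first two bullets of 75ca44434a (an aclose() that delegates to the underlying stream, and a finally block in the generator) fit together roughly as in this sketch. `FakeUpstream` and `StreamWrapper` are illustrative stand-ins, not the CustomStreamWrapper implementation.

```python
import asyncio


class FakeUpstream:
    """Stand-in for an HTTP streaming response holding a pooled connection."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.closed = False

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._chunks:
            raise StopAsyncIteration
        return self._chunks.pop(0)

    async def aclose(self):
        self.closed = True


class StreamWrapper:
    def __init__(self, stream):
        self._stream = stream

    def __aiter__(self):
        return self._stream.__aiter__()

    async def aclose(self):
        # Delegate so the underlying connection is actually released.
        await self._stream.aclose()


async def data_generator(wrapper):
    try:
        async for chunk in wrapper:
            yield f"data: {chunk}\n\n"
    finally:
        # Runs on normal completion and on early close (client disconnect).
        await wrapper.aclose()


async def read_first_chunk_only():
    upstream = FakeUpstream(["a", "b", "c"])
    gen = data_generator(StreamWrapper(upstream))
    first = await gen.__anext__()
    await gen.aclose()  # simulate the client going away mid-stream
    return first, upstream.closed
```

Without the finally block, closing the generator early would leave the upstream open, which is how a connection pool can fill up over time.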
yuneng-jiang 111593397a fixing core proxy tests 2026-02-12 17:54:32 -08:00
yuneng-jiang 40295595c7 allow team and org admins to call invitation/new 2026-02-11 11:23:27 -08:00
Ishaan Jaffer 35e29c2bcd Revert "Merge pull request #18790 from BerriAI/litellm_key_team_routing_3"
This reverts commit ae26d8e68a, reversing
changes made to 864e8c6543.
2026-01-31 17:58:46 -08:00
Ishaan Jaffer 280e8a9cd7 test_get_image_non_root_uses_var_lib_assets_dir 2026-01-31 12:05:09 -08:00
yuneng-jiang 58dd3bd134 fixing sorting for v2/model/info 2026-01-28 18:07:22 -08:00
yuneng-jiang 28ca991296 Allow dynamic setting of store_prompts_in_spend_logs 2026-01-27 20:52:07 -08:00
yuneng-jiang 1581bcf985 add sortBy and sortOrder params for /v2/model/info 2026-01-27 16:54:52 -08:00
yuneng-jiang 47810f1523 Model and Team filtering 2026-01-24 14:45:14 -08:00
yuneng-jiang b44ac6c682 Fixing ruff check 2026-01-24 09:08:29 -08:00
yuneng-jiang 3ee7aab5f2 All Models Backend Search 2026-01-22 22:00:22 -08:00
yuneng-jiang 6723b30d03 Adding scope to /models 2026-01-21 16:40:31 -08:00
yuneng-jiang d0e35751a1 Fixing tests and linting 2026-01-21 11:02:39 -08:00
yuneng-jiang b5a7d2ab34 Paginating model/info endpoint 2026-01-21 10:44:18 -08:00
YutaSaito eec4ed640b Revert "Stabilise mock tests" 2026-01-17 06:26:18 +09:00
Sameer Kankute 83e33944ef Fix: mock test tests 2026-01-15 22:02:42 +05:30
Sameer Kankute 4bdda9cc28 Fix: tests/test_litellm/proxy/test_proxy_server.py::test_embedding_input_array_of_tokens 2026-01-15 19:46:35 +05:30
yuneng-jiang 1b9c7deec6 Merge remote-tracking branch 'origin' into litellm_key_team_routing_3 2026-01-08 10:39:12 -08:00
yuneng-jiang 51759424a6 Key and Team Routing Setting 2026-01-07 17:17:30 -08:00
yuneng-jiang 1c84af8ae4 normalize proxy config callbacks 2026-01-07 12:22:57 -08:00
yuneng-jiang 564b2b51cc Fix for dev env 2025-12-23 16:09:17 -08:00
yuneng-jiang 05dd247ff5 Fix UI disappearing for development instances 2025-12-23 15:24:07 -08:00
yuneng-jiang 6bb5254c9b Revert "[Fix] UI - Disappears in Development Environments" 2025-12-23 15:08:07 -08:00
yuneng-jiang fccd2d1e87 Fix UI disappearing for development instances 2025-12-23 11:46:55 -08:00
yuneng-jiang ed4a4c13d6 Base commit 2025-12-23 11:46:35 -08:00
Alexsander Hamir 5534038e93 Fix CI: Revert security scan changes and add GitGuardian ignore rules (#18358) 2025-12-22 17:03:53 -08:00
Ishaan Jaffer 6112160a16 Revert "[Fix] Security - Remove example API keys with high entropy (#18255)"
This reverts commit 24edbccf5c.
2025-12-20 20:48:11 +05:30
Alexsander Hamir 24edbccf5c [Fix] Security - Remove example API keys with high entropy (#18255) 2025-12-19 10:09:50 -08:00
yuneng-jiang dd182a2ed0 Adding tests 2025-12-17 16:28:15 -08:00
yuneng-jiang 70f7c8b771 Merge remote-tracking branch 'origin' into litellm_dd_callback_fix 2025-12-17 11:05:46 -08:00