litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-24 09:38:34 +00:00

Author	SHA1	Message	Date
ryan-crabbe-berri	2eb3c20e76	Merge pull request #24718 from BerriAI/litellm_ryan-march-26 litellm ryan march 26	2026-03-28 09:01:11 -07:00
Ryan Crabbe	98ecf17550	fix(ui): refactor budget page to React Query hooks and fix crashes - Migrate budget CRUD from manual state to React Query hooks (useBudgets, useCreateBudget, useUpdateBudget, useDeleteBudget) - Fix crash when budget list contains null entries by filtering in query hook - Fix max_budget type from string to number to match DB schema (double precision) - Disable budget_id field in edit modal to prevent accidental changes - Use budget_id as React key instead of array index - Update tests to mock hooks instead of networking functions	2026-03-27 19:34:24 -07:00
Yuneng Jiang	a074d1d68b	[Infra] Mirror litellm_table_patch source changes (no binaries) Cherry-pick source-only changes from litellm_table_patch, excluding build artifacts from the incident response period. - Remove destructive DROP COLUMN migration (20260311180521_schema_sync) - Remove now-unnecessary restore migration (20260327232350) - Bump litellm-proxy-extras 0.4.60 → 0.4.61 - Add regression test to block future DROP COLUMN migrations - Fix double error handling in getTeamPermissionsCall Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 16:45:12 -07:00
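The regression test mentioned in the commit above (blocking future DROP COLUMN migrations) could look roughly like the following. This is a minimal sketch, not the repository's actual test: the function names, the regex, and the assumption that migrations live as `*.sql` files under one directory are all hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical pattern: matches "DROP COLUMN" regardless of case or spacing.
DROP_COLUMN_RE = re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE)


def strip_sql_comments(sql: str) -> str:
    """Blank out /* ... */ and -- comments so commented-out SQL is ignored.

    Naive on purpose: a "--" inside a string literal is also treated as a comment.
    """
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    return re.sub(r"--[^\n]*", " ", sql)


def has_drop_column(sql: str) -> bool:
    """Return True if the SQL text contains a live DROP COLUMN statement."""
    return DROP_COLUMN_RE.search(strip_sql_comments(sql)) is not None


def find_destructive_migrations(migrations_dir: Path) -> List[Path]:
    """List every *.sql file under migrations_dir that drops a column."""
    return sorted(
        path
        for path in migrations_dir.rglob("*.sql")
        if has_drop_column(path.read_text(encoding="utf-8"))
    )
```

A test would then assert that `find_destructive_migrations(...)` returns an empty list for the migrations directory, failing CI as soon as a destructive migration is added.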
Ryan Crabbe	7e50af9228	Migrate route_preview.tsx from Tremor to Ant Design Replace Tremor Card/Title/Subtitle with antd Card/Typography equivalents.	2026-03-23 22:32:26 -07:00
Ryan Crabbe	e40f68aec4	test(ui): add unit tests for 5 untested frontend components - AntDLoadingSpinner: rendering, prop forwarding, icon styling - MessageManager: static fallback, custom instance delegation - claude_code_plugins/helpers: all pure utility functions (15 describe blocks, 55 tests) - AgentSelector: fetch behavior, loading states, error handling, disabled state - WorkerDropdown: conditional rendering, worker options, selection changes	2026-03-23 22:03:10 -07:00
yuneng-jiang	38d477507d	remove outdated e2e test	2026-03-21 23:14:53 -07:00
yuneng-jiang	9073daeebc	[Fix] UI - TeamDropdown: Match org dropdown styling and fix test mock - Use Select.Option with font-medium alias + Text secondary ID to match OrganizationDropdown - Default page size to 20 - Add useInfiniteTeams mock to AddModelForm tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 22:52:16 -07:00
yuneng-jiang	aea8e32048	[Fix] UI: Team table refresh, infinite team dropdown, leftnav for dashboard routes - OldTeams: refresh table via fetchTeamsV2 after team create instead of appending - TeamDropdown: rewrite with useInfiniteTeams for paginated fetch, scroll-to-load, and debounced search - Update all TeamDropdown consumers to use the new self-fetching API - Dashboard layout: switch from Sidebar2 to SidebarProvider (leftnav) - Leftnav: add MIGRATED_PAGES routing for path-based navigation (api-reference) - Navbar: remove chat button Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 22:05:26 -07:00
yuneng-jiang	e3d4c29d37	Merge pull request #24323 from BerriAI/litellm_ryan_march_20 litellm ryan march 20	2026-03-21 15:57:28 -07:00
Krish Dholakia	f911d8d865	Merge pull request #23818 from BerriAI/litellm_oss_staging_03_17_2026 fix(fireworks): skip #transform=inline for base64 data URLs (#23729)	2026-03-21 14:54:39 -07:00
Ishaan Jaff	2ea9e207bd	Litellm ishaan march 20 (#24303 ) * feat(redis): add circuit breaker to RedisCache to fast-fail when Redis is down (#24181) * feat(redis): add circuit breaker env var constants * feat(redis): add RedisCircuitBreaker and apply guard decorator to all async ops * fix(dual_cache): fall back to L1 instead of re-raising on Redis increment failures * test(caching): add circuit breaker unit tests * fix(redis): fast-fail concurrent HALF_OPEN probes — only one probe at a time * fix(dual_cache): return None fallback when in_memory_cache is absent and Redis fails * test(caching): add regression tests for HALF_OPEN concurrency and None fallback * Fix blocking sync next in __anext__ (#24177) * Fix blocking sync next * Update tests/test_litellm/litellm_core_utils/test_streaming_handler.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix PEP 479 regression in __anext__ sync iterator exhaustion asyncio.to_thread re-raises thread exceptions inside a coroutine, where PEP 479 converts StopIteration to RuntimeError before any except clause can catch it. Add _next_sync_or_exhausted() module-level helper that catches StopIteration in the thread and returns a sentinel instead, then raise StopAsyncIteration in the coroutine. Also rewrites the non-blocking test to use asyncio.gather() instead of asyncio.create_task() (which returned None on Python 3.9 / pytest-asyncio in CI), and adds an exhaustion regression test that drains the wrapper fully and asserts no RuntimeError leaks out. --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * feat: add git-subdir source type to claude-code/plugins API (#24223) Support a third plugin source type `git-subdir` alongside the existing `github` and `url` types, as documented in the official Claude Code plugin marketplaces spec. 
New format: {"source": "git-subdir", "url": "...", "path": "subdir/path"} - Validates url and path fields are present and non-empty - Rejects absolute paths, '..' segments, backslashes, and percent-encoded traversal sequences (including double-encoded variants via regex check) - Extracts path validation into _validate_git_subdir_path() helper - Updates Pydantic field description to document all three source types - Adds isValidUrl() check for url/git-subdir source types in the UI form - Adds "Git Subdir" option to the UI form with a required Path field - Adds unit tests covering success, update, missing/empty fields, path traversal variants, and unknown source type Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] add extract_header and extract_footer to Mistral OCR supported params (#24213) * docs: add git-subdir source type to claude-code plugin marketplace docs (#24289) * fix(ui): swap J/K keyboard navigation in log details drawer (#24279) (#24286) J should navigate down (next) and K should navigate up (previous), matching vim/standard conventions. 
* fix: use async_set_cache in user_api_key_auth hot path (#24302) * fix: use async_set_cache in auth hot path to avoid blocking event loop * test: assert no blocking set_cache call in _user_api_key_auth_builder * test: broaden blocking call check to all sync DualCache methods * test: fix regression test to actually catch blocking cache calls * fix: ruff lint unused variable + UI build MessageManager error - litellm/caching/redis_cache.py: remove unused variable 'e' in circuit breaker exception handler (F841) - add_plugin_form.tsx: use MessageManager.error() instead of undefined message.error() for git URL validation Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs: add REDIS_CIRCUIT_BREAKER env vars to config_settings reference Add REDIS_CIRCUIT_BREAKER_FAILURE_THRESHOLD and REDIS_CIRCUIT_BREAKER_RECOVERY_TIMEOUT to the environment variables reference table so test_env_keys.py passes. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Vincenzo Barrea <manamana88@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Robert Kirscht <rkirscht242@gmail.com> Co-authored-by: Imgyu Kim <kimimgo@gmail.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-21 12:40:11 -07:00
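The `__anext__` fix described in the commit above can be sketched as follows. The helper name `_next_sync_or_exhausted` comes from the commit message; the sentinel name, the wrapper class, and `drain` are hypothetical stand-ins and not litellm's actual streaming handler. The idea is that `StopIteration` is caught inside the worker thread and replaced by a sentinel, so it never has to cross back into the coroutine.

```python
import asyncio
from typing import Any, Iterable, Iterator, List

_EXHAUSTED = object()  # hypothetical sentinel marking iterator exhaustion


def _next_sync_or_exhausted(iterator: Iterator[Any]) -> Any:
    """Run next() in the worker thread; return a sentinel instead of raising StopIteration."""
    try:
        return next(iterator)
    except StopIteration:
        return _EXHAUSTED


class AsyncIteratorWrapper:
    """Hypothetical wrapper exposing a blocking sync iterator as an async iterator."""

    def __init__(self, sync_iterable: Iterable[Any]) -> None:
        self._iterator = iter(sync_iterable)

    def __aiter__(self) -> "AsyncIteratorWrapper":
        return self

    async def __anext__(self) -> Any:
        # The blocking next() runs off the event loop (asyncio.to_thread, Python 3.9+).
        item = await asyncio.to_thread(_next_sync_or_exhausted, self._iterator)
        if item is _EXHAUSTED:
            # Signal exhaustion the async way, from inside the coroutine.
            raise StopAsyncIteration
        return item


async def drain(wrapper: AsyncIteratorWrapper) -> List[Any]:
    """Consume the wrapper fully, as the exhaustion regression test is described as doing."""
    return [item async for item in wrapper]
```

Calling `asyncio.run(drain(AsyncIteratorWrapper(iter([1, 2, 3]))))` drains the wrapper without a `RuntimeError` leaking out at the end of the stream.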
Ryan Crabbe	773871025e	Remove required asterisks from v3 login form fields Hide the red asterisk indicators on Username and Password fields while keeping the validation rules intact.	2026-03-21 12:32:17 -07:00
ryan-crabbe	b21948fdc3	Merge pull request #24315 from BerriAI/litellm_copy_user_id_on_click polish: add click-to-copy icon on User ID in internal users table	2026-03-21 12:06:36 -07:00
Sameer Kankute	49abf98a27	Merge branch 'main' into litellm_oss_staging_03_17_2026	2026-03-21 21:16:49 +05:30
Ryan Crabbe	e1d2ab3a87	test(ui): add unit test for tags dropdown populated via useTags hook Made-with: Cursor	2026-03-20 22:58:02 -07:00
Ryan Crabbe	8766004c36	fix(ui): use useTags hook so tag dropdown populates on key creation The create key form used getPredefinedTags() which only extracted tags from existing keys' metadata. If no keys had tags, the dropdown was empty. Switch to the existing useTags() React Query hook that fetches from /tag/list, matching the edit key form behavior.	2026-03-20 22:58:02 -07:00
yuneng-jiang	2ca4fa6189	Merge branch 'main' into litellm_yj_march_19_2026	2026-03-20 17:28:41 -07:00
yuneng-jiang	3ea69c9539	Merge remote-tracking branch 'origin' into litellm_yj_march_19_2026	2026-03-20 12:37:26 -07:00
yuneng-jiang	8c396e5ca9	Merge pull request #24172 from BerriAI/litellm_extract_useChatHistory_hook [Refactor] Extract useChatHistory hook from ChatUI.tsx	2026-03-20 00:17:07 -07:00
yuneng-jiang	bdf24757c0	Merge pull request #24189 from BerriAI/litellm_teams_table_modernize [Feature] UI - Teams: Modernize Teams Table	2026-03-20 00:16:47 -07:00
yuneng-jiang	9b519c4754	Merge pull request #24192 from BerriAI/litellm_migrate_antd_message_to_context_api [Fix] UI: AntD Messages Not Rendering	2026-03-20 00:16:04 -07:00
yuneng-jiang	34d954f8cb	[Fix] UI: Migrate AntD message API to use context-based MessageManager AntD v5 static message API doesn't render without an App wrapper or useMessage() context holder. Mirrors the existing notification pattern by adding message.useMessage() to AntdGlobalProvider and routing all calls through a new MessageManager module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 23:29:38 -07:00
Ryan Crabbe	ad43a35d76	feat: add control plane for multi-proxy worker management Adds a control plane capability that enables a central admin instance to manage multiple regional worker proxies from a single UI. Backend: - Worker registry loaded from YAML config (worker_id, name, url) - /.well-known/litellm-ui-config exposes is_control_plane and workers list - /v3/login + /v3/login/exchange: opaque code exchange for cross-origin username/password auth (JWT never in URL/logs, single-use 60s TTL) - SSO cookie handoff with return_to → opaque code → exchange - _validate_return_to: full origin validation (scheme+hostname+port) - Startup warning when control_plane_url set without Redis - Both /v3 endpoints gated behind control_plane_url config Frontend: - Worker selector dropdown on login page (gated behind is_control_plane) - Cross-origin SSO code exchange handling on callback - switchToWorkerUrl: localStorage-persisted worker URL for API calls - useWorker hook: shared worker state management - WorkerDropdown in navbar for switching workers - Logout/switch clears worker state from localStorage Tests: - 7 tests for /v3/login + /v3/login/exchange - 10 tests for _validate_return_to - 2 tests for control plane discovery endpoint	2026-03-19 22:50:19 -07:00
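The "full origin validation (scheme+hostname+port)" that the commit above attributes to `_validate_return_to` can be illustrated with a short sketch. It is an assumption-laden illustration, not the proxy's implementation: the function names, the allow-list of worker URLs, and the decision to accept only http/https are all hypothetical.

```python
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit

# Default ports, so "https://host" and "https://host:443" compare as the same origin.
_DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin(url: str) -> Optional[Tuple[str, str, int]]:
    """Return (scheme, hostname, port) for an http(s) URL, or None if it is unusable."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed or out-of-range port
    except ValueError:
        return None
    scheme = parts.scheme.lower()
    if scheme not in _DEFAULT_PORTS or not parts.hostname:
        return None
    effective_port = port if port is not None else _DEFAULT_PORTS[scheme]
    return (scheme, parts.hostname.lower(), effective_port)


def is_allowed_return_to(return_to: str, allowed_urls: Iterable[str]) -> bool:
    """Accept return_to only if its full origin matches one of the allowed worker URLs."""
    target = _origin(return_to)
    if target is None:
        return False
    return any(_origin(allowed) == target for allowed in allowed_urls)
```

Comparing the whole (scheme, hostname, port) tuple, rather than checking a hostname prefix or suffix, is what rejects look-alike hosts such as `worker.example.com.evil.test` and scheme or port downgrades.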
yuneng-jiang	883611aa91	[Feature] UI - Teams: Modernize teams table with AntD, server-side pagination, and v2 API Migrate the OldTeams table from Tremor to Ant Design components, matching the Access Groups page pattern. Switch from /team/list to /v2/team/list for server-side pagination, filtering, and sorting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 22:42:15 -07:00
yuneng-jiang	f34fe4758a	fix: stale closure, simplified session isolation, debounce-race in useChatHistory - clearChatHistory: use functional setChatHistory updater so blob URL revocation operates on the latest snapshot, not a stale closure capture. - Simplified mode: skip sessionStorage hydration and persistence for messageTraceId, responsesSessionId, and useApiSessionManagement so embedded widgets don't cross-contaminate the full playground session. - Debounce race: skip re-writing empty chatHistory to sessionStorage after clearChatHistory already removed the key. - Added 5 new tests covering these fixes (39 total). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 21:19:51 -07:00
yuneng-jiang	954bfcbd42	chore: remove verbose console.log calls from useChatHistory hook Removes 17 console.log statements that fired on every streaming chunk in updateTextUI, updateTimingData, updateUsageData, and other hot-path functions. The console.error for sessionStorage parse failures is kept. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 20:46:51 -07:00
Sameer Kankute	c545c969f7	Merge branch 'main' into litellm_oss_staging_03_17_2026	2026-03-20 08:42:41 +05:30
yuneng-jiang	e88425b881	refactor: wire ChatUI to use useChatHistory hook Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 18:09:20 -07:00
yuneng-jiang	f3c6915d61	feat: add useChatHistory hook with tests (extracted from ChatUI) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 18:09:20 -07:00
yuneng-jiang	2e70c23307	Gate legacy redirect on authLoading to ensure proxyBaseUrl is resolved The redirect useEffect fires before getUiConfig() completes, so proxyBaseUrl is always "" on first render. Gate on !authLoading so the redirect only fires after config is fetched, matching the pattern used by the login redirect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 14:05:56 -07:00
yuneng-jiang	e9e5ed989c	Move legacy redirect from render phase to useEffect router.replace was called directly during render, which is unsafe in React 18 concurrent mode. Move it into a useEffect and use a computed flag (isLegacyRedirect) to show LoadingScreen while redirecting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 13:59:45 -07:00
yuneng-jiang	895951744d	Fix leftnav navigation regression and useProxySettings initial state The leftnav was updated to emit page="api-reference" but only "api_ref" was in LEGACY_REDIRECTS, causing clicks to fall through to the default Usage page. Add "api-reference" entry to the redirect map. Also include LITELLM_UI_API_DOC_BASE_URL in the hook's initial state to avoid a brief flash of incorrect base URL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 13:36:21 -07:00
yuneng-jiang	519afa494c	[Refactor] UI - API Reference: Migrate to path-based routing Move the API Reference page from query-param routing (?page=api_ref) to Next.js path-based routing (/ui/api-reference). Add a LEGACY_REDIRECTS map in the root page.tsx so users with old bookmarks are seamlessly redirected. Future page migrations only need one new map entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 12:57:17 -07:00
yuneng-jiang	0b07f628ff	[Test] UI: Add vitest coverage for 10 previously untested components Add unit tests for: - SimpleToolCallBlock, SimpleMessageBlock, CollapsibleMessage, HistoryTree (log details drawer) - OnboardingForm (onboarding flow) - TeamsHeaderTabs, TeamsTable (teams page) - transform_key_info, filter_helpers (key/team helpers) - queryKeysFactory (query key generation utility) 47 new tests covering conditional rendering, user interactions, data transformation, and error handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 10:30:03 -07:00
Ryan Crabbe	1d7cff22cb	feat(ui): add click-to-copy icon on User ID in internal users table Add a CopyOutlined icon next to the truncated User ID that copies the full UUID to clipboard on click. Follows the existing pattern used in model_hub_table_columns.tsx.	2026-03-19 08:57:42 -07:00
Sameer Kankute	e2e4f9ed33	Merge branch 'main' into litellm_oss_staging_03_17_2026	2026-03-19 15:53:06 +05:30
yuneng-jiang	d984b293de	[Feature] UI - Leftnav: Add external link icon to Learning Resources Add ExportOutlined icon next to nav items that link to external pages, making it clear to users when a link opens in a new tab. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 17:56:25 -07:00
yuneng-jiang	0b63979d45	Fix build: cast endpointType to EndpointType at call site ChatUI stores endpointType as string but the narrowed prop expects EndpointType — add explicit cast at the call site. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 17:04:42 -07:00
yuneng-jiang	ebe329cdce	Fix build: use `as any` for SyntaxHighlighter style prop Matches the cast used in ChatUI.tsx — the react-syntax-highlighter type definitions don't accept CSSProperties directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 17:02:18 -07:00
yuneng-jiang	b55cb249fe	Address Greptile feedback: use EndpointType enum, add CHAT MCP test - Narrow endpointType prop from string to EndpointType enum - Add missing test for MCP events on CHAT endpoint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 16:46:41 -07:00
yuneng-jiang	3ba18d7084	[Refactor] UI - Playground: Extract ChatMessageBubble from ChatUI Extract the chat message bubble rendering (~165 lines) into a dedicated ChatMessageBubble component with 15 Vitest tests covering all display branches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 16:38:58 -07:00
yuneng-jiang	d98440f452	Merge remote-tracking branch 'origin' into litellm_yj_march_18_2026	2026-03-18 16:20:07 -07:00
yuneng-jiang	83d185d867	Merge pull request #24039 from BerriAI/litellm_/vibrant-hypatia [Fix] UI - Default Team Settings: Add Missing Permission Options	2026-03-18 15:57:51 -07:00
yuneng-jiang	1e6abf8142	Merge branch 'main' into litellm_yj_march_17_2026	2026-03-18 15:13:41 -07:00
Ishaan Jaff	8e61b32b8e	[Staging] - Ishaan March 17th (#23903 ) * feat(xai): add grok-4.20 beta 2 models with pricing (#23900) Add three grok-4.20 beta 2 model variants from xAI: - grok-4.20-multi-agent-beta-0309 (reasoning + multi-agent) - grok-4.20-beta-0309-reasoning (reasoning) - grok-4.20-beta-0309-non-reasoning Pricing (from https://docs.x.ai/docs/models): - Input: $2.00/1M tokens ($0.20/1M cached) - Output: $6.00/1M tokens - Context: 2M tokens All variants support vision, function calling, tool choice, and web search. Closes LIT-2171 * docs: add Quick Install section for litellm --setup wizard (#23905) * docs: add Quick Install section for litellm --setup wizard * docs: clarify setup wizard is for local/beginner use * feat(setup): interactive setup wizard + install.sh (#23644) * feat(setup): add interactive setup wizard + install.sh Adds `litellm --setup` — a Claude Code-style TUI onboarding wizard that guides users through provider selection, API key entry, and proxy config generation, then optionally starts the proxy immediately. 
- litellm/setup_wizard.py: wizard with ASCII art, numbered provider menu (OpenAI, Anthropic, Azure, Gemini, Bedrock, Ollama), API key prompts, port/master-key config, and litellm_config.yaml generation - litellm/proxy/proxy_cli.py: adds --setup flag that invokes the wizard - scripts/install.sh: curl-installable script (detect OS/Python, pip install litellm[proxy], launch wizard) Usage: curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh litellm --setup * fix(install.sh): remove orange color, add LITELLM_BRANCH env var for branch installs * fix(install.sh): install from git branch so --setup is available for QA * fix(install.sh): remove stale LITELLM_BRANCH reference that caused unbound variable error * fix(install.sh): force-reinstall from git to bypass cached PyPI version * fix(install.sh): show pip progress bar during install * fix(install.sh): always launch wizard via $PYTHON_BIN -m litellm, not PATH binary * fix(install.sh): use litellm.proxy.proxy_cli module (no __main__.py exists) * fix(install.sh): suppress RuntimeWarning from module invocation * fix(install.sh): use Python bin-dir litellm binary to avoid CWD sys.path shadowing * fix(install.sh): use sysconfig.get_path('scripts') to find pip-installed litellm binary * fix(install.sh): redirect stdin from /dev/tty on exec so wizard gets terminal, not exhausted pipe * fix(install.sh): warn about git clone duration, drop --no-cache-dir so re-runs are faster * feat(setup_wizard): arrow-key selector, updated model names * fix(setup_wizard): use sysconfig binary to start proxy, not python -m litellm * feat(setup_wizard): credential validation after key entry + clear next-steps after proxy start * style(install.sh): show git clone warning in blue * refactor(setup_wizard): class with static methods, use check_valid_key from litellm.utils * address greptile review: fix yaml escaping, port validation, display name collisions, tests - setup_wizard.py: add _yaml_escape() for safe 
YAML embedding of API keys - setup_wizard.py: add _styled_input() with readline ANSI ignore markers - setup_wizard.py: change DIVIDER to _divider() fn to avoid import-time color capture - setup_wizard.py: validate port range 1-65535, initialize before loop - setup_wizard.py: qualify azure display names (azure-gpt-4o) to avoid collision with openai - setup_wizard.py: work on env_copy in _build_config to avoid mutating caller's dict - setup_wizard.py: skip model_list entries for providers with no credentials - setup_wizard.py: prompt for azure deployment name - setup_wizard.py: wrap os.execlp in try/except with friendly fallback - setup_wizard.py: wrap config write in try/except OSError - setup_wizard.py: fix _validate_and_report to use two print lines (no \r overwrite) - setup_wizard.py: add .gitignore tip next to key storage notice - setup_wizard.py: fix run_setup_wizard() return type annotation to None - scripts/install.sh: drop pipefail (not supported by dash on Ubuntu when invoked as sh) - scripts/install.sh: use litellm[proxy] from PyPI (not hardcoded dev branch) - scripts/install.sh: guard /dev/tty read with -r check for Docker/CI compat - scripts/install.sh: remove --force-reinstall to avoid downgrading dependencies - tests/test_litellm/test_setup_wizard.py: 13 unit tests for _build_config and _yaml_escape * style: black format setup_wizard.py * fix: address remaining greptile issues - Windows compat, YAML quoting, credential flow - guard termios/tty imports with try/except ImportError for Windows compat - quote master_key as YAML double-quoted scalar (same as env vars) - remove unused port param from _build_config signature - _validate_and_report now returns the final key so re-entered creds are stored - add test for master_key YAML quoting * fix: add --port to suggested command, guard /dev/tty exec in install.sh * fix: quote api_base in YAML, skip azure if no deployment, only redraw on state change * fix: address greptile review comments - _yaml_escape: add 
control character escaping (\n, \r, \t) - test: fix tautological assertion in test_build_config_azure_no_deployment_skipped - test: add tests for control character escaping in _yaml_escape * feat(ui): remove Chat UI page link and banner from sidebar and playground (#23908) * feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth (#23897) * Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers * Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present. * Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings. * Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers * Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present. * Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings. * feat(guardrails): add MCPJWTSigner built-in guardrail for zero trust MCP auth Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers can trust a single signing authority instead of every upstream IdP. Enable in config.yaml: guardrails: - guardrail_name: mcp-jwt-signer litellm_params: guardrail: mcp_jwt_signer mode: pre_mcp_call default_on: true JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss, aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless MCP_JWT_SIGNING_KEY env var is set. 
Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration so MCP servers can verify LiteLLM-issued tokens via OIDC discovery. * Update MCPServerManager to raise HTTPException with status code 400 for extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message. * fix: address P1 issues in MCPJWTSigner - OpenAPI servers: warn + skip header injection instead of 500 - JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent - sub claim: fallback to apikey:{token_hash} for anonymous callers - ttl_seconds: validate > 0 at init time * docs: add MCP zero trust auth guide with architecture diagram * docs: add FastMCP JWT verification guide to zero trust doc * fix: address remaining Greptile review issues (round 2) - mcp_server_manager: warn when hook Authorization overwrites existing header - __init__: remove _mcp_jwt_signer_instance from __all__ (private internal) - discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation - test docstring: reflect warn-and-continue behavior for OpenAPI servers - test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs) * fix: address Greptile round 3 feedback - initialize_guardrail: validate mode='pre_mcp_call' at init time — misconfigured mode silently bypasses JWT injection, which is a zero-trust bypass - _build_claims: remove duplicate inline 'import re' (module-level import already present) - _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs * feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes Addresses all missing pieces from the scoping doc review: FR-5 (Verify + re-sign): MCPJWTSigner now accepts access_token_discovery_uri and token_introspection_endpoint. 
When set, the incoming Bearer token is extracted from raw_headers (threaded through pre_call_tool_check), verified against the IdP's JWKS (JWT) or introspected (opaque), and only re-signed if valid. Falls back to user_api_key_dict.jwt_claims for LiteLLM JWT-auth mode. FR-12 (Configurable end-user identity mapping): end_user_claim_sources ordered list drives sub resolution — sources: token:<claim>, litellm:user_id, litellm:email, litellm:end_user_id, litellm:team_id. FR-13 (Claim operations): add_claims (insert-if-absent), set_claims (always override), remove_claims (delete) applied in that order. FR-14 (Two-token model): channel_token_audience + channel_token_ttl issue a second JWT injected as x-mcp-channel-token: Bearer <token>. FR-15 (Incoming claim validation): required_claims raises HTTP 403 when any listed claim is absent; optional_claims passes listed claims from verified token into the outbound JWT. FR-9 (Debug headers): debug_headers: true emits x-litellm-mcp-debug with kid, sub, iss, exp, scope. FR-10 (Configurable scopes): allowed_scopes replaces auto-generation. Also fixed: tool-call JWTs no longer grant mcp:tools/list (overpermission). P1 fixes: - proxy/utils.py: _convert_mcp_hook_response_to_kwargs merges rather than replaces extra_headers, preserving headers from prior guardrails. - mcp_server_manager.py: warns when hook injects Authorization alongside a server-configured authentication_token (previously silent). - mcp_server_manager.py: pre_call_tool_check now accepts raw_headers and extracts incoming_bearer_token so FR-5 verification has the raw token. - proxy/utils.py: remove stray inline import inspect inside loop (pre-existing lint error, now cleaned up). Tests: 43 passing (28 new tests covering all FR flags + P1 fixes). 
* feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes (core)
  Remaining files from the FR implementation:
  mcp_jwt_signer.py: full rewrite with all new params:
  - FR-5: access_token_discovery_uri, token_introspection_endpoint, verify_issuer, verify_audience + _verify_incoming_jwt(), _introspect_opaque_token()
  - FR-12: end_user_claim_sources ordered resolution chain
  - FR-13: add_claims, set_claims, remove_claims
  - FR-14: channel_token_audience, channel_token_ttl → x-mcp-channel-token
  - FR-15: required_claims (raises 403), optional_claims (passthrough)
  - FR-9: debug_headers → x-litellm-mcp-debug
  - FR-10: allowed_scopes; tool-call JWTs no longer over-grant tools/list
  mcp_server_manager.py:
  - pre_call_tool_check gains raw_headers param to extract incoming_bearer_token
  - Silent Authorization override warning fixed: now fires when server has authentication_token AND hook injects Authorization
  tests/test_mcp_jwt_signer.py: 28 new tests covering all FR flags + P1 fixes (43 total, all passing)
* fix(mcp_jwt_signer): address pre-landing review issues
  - Remove stale TODO comment on UserAPIKeyAuth.jwt_claims; the field is already populated and consumed by MCPJWTSigner in the same PR
  - Fix _get_oidc_discovery to only cache the OIDC discovery doc when jwks_uri is present; a malformed/empty doc now retries on the next request instead of being permanently cached until proxy restart
  - Add FR-5 test coverage for _fetch_jwks (cache hit/miss), _get_oidc_discovery (cache/no-cache on bad doc), _verify_incoming_jwt (valid token, expired token), _introspect_opaque_token (active, inactive, no endpoint), and the end-to-end 401 hook path (53 tests total, all passing)
* docs(mcp_zero_trust): rewrite as use-case guide covering all new JWT signer features
  Add scenario-driven sections for each new config area:
  - Verify+re-sign with Okta/Azure AD (access_token_discovery_uri, end_user_claim_sources, token_introspection_endpoint)
  - Enforcing caller attributes with required_claims / optional_claims
  - Adding metadata via add_claims / set_claims / remove_claims
  - Two-token model for AWS Bedrock AgentCore Gateway (channel_token_audience / channel_token_ttl)
  - Controlling scopes with allowed_scopes
  - Debugging JWT rejections with debug_headers
  Update JWT claims table to reflect configurable sub (end_user_claim_sources)
* fix(mcp_jwt_signer): wire all config.yaml params through initialize_guardrail
  The factory was only passing issuer/audience/ttl_seconds to MCPJWTSigner. All FR-5/9/10/12/13/14/15 params (access_token_discovery_uri, end_user_claim_sources, add/set/remove_claims, channel_token_audience, required/optional_claims, debug_headers, allowed_scopes, etc.) were silently dropped, making every advertised advanced feature non-functional when loaded from config.yaml. Add regression test that asserts every param is wired through correctly.
* docs(mcp_zero_trust): add hero image
* docs(mcp_zero_trust): apply Linear-style edits
  - Lead with the problem (unsigned direct calls bypass access controls)
  - Shorter statement section headers instead of question-form headers
  - Move diagram/OIDC discovery block after the reader is bought in
  - Add 'read further only if you need to' callout after basic setup
  - Two-token section now opens from the user problem not product jargon
  - Add concrete 403 error response example in required_claims section
  - Debug section opens from the symptom (MCP server returning 401)
  - Lowercase claims reference header for consistency
* fix(mcp_jwt_signer): fix algorithm confusion attack + add OIDC discovery 24h TTL
  - Remove alg from unverified JWT header; use signing_jwk.algorithm_name from JWKS key instead. Reading alg from attacker-controlled headers enables alg:none / HS256 confusion attacks.
  - Add _oidc_discovery_fetched_at timestamp and _OIDC_DISCOVERY_TTL = 86400 (24h). Without a TTL the cached discovery doc never refreshes, so IdP key rotation is invisible.
---------
Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
* fix(ci): stabilize CI - formatting, type errors, test polling, security CVEs, router bug, batch resolution
  Fix 1: Run Black formatter on 35 files
  Fix 2: Fix MyPy type errors:
  - setup_wizard.py: add type annotation for 'selected' set variable
  - user_api_key_auth.py: remove redundant type annotation on jwt_claims reassignment
  Fix 3: Fix spend accuracy test burst 2 polling to wait for expected total spend instead of just 'any increase' from burst 2
  Fix 4: Bump Next.js 16.1.6 -> 16.1.7 to fix CVE-2026-27978, CVE-2026-27979, CVE-2026-27980, CVE-2026-29057
  Fix 5: Fix router _pre_call_checks model variable being overwritten inside loop, causing wrong model lookups on subsequent deployments. Use local _deployment_model variable instead.
  Fix 6: Add missing resolve_output_file_ids_to_unified call in batch retrieve non-terminal-to-terminal path (matching the terminal path behavior)
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* chore: regenerate poetry.lock to sync with pyproject.toml
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: format merged files from main and regenerate poetry.lock
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompatibility
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): update router region test to use gpt-4.1-mini (fix flaky model lookup)
  Replace deprecated gpt-3.5-turbo-1106 with gpt-4.1-mini + mock_response in test_router_region_pre_call_check, following the same pattern used in commit `717d37cc5b` for test_router_context_window_check_pre_call_check_out_group.
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* ci: retry flaky logging_testing (async event loop race condition)
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): aggregate all mock calls in langfuse e2e test to fix race condition
  The _verify_langfuse_call helper only inspected the last mock call (mock_post.call_args), but the Langfuse SDK may split trace-create and generation-create events across separate HTTP flush cycles. This caused an IndexError when the last call's batch contained only one event type.
  Fix: iterate over mock_post.call_args_list to collect batch items from ALL calls. Also add a safety assertion after filtering by trace_id and mark all langfuse e2e tests with @pytest.mark.flaky(retries=3) as an extra safety net for any residual timing issues.
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): black formatting + update OpenAPI compliance tests for spec changes
  - Apply Black 26.x formatting to litellm_logging.py (parenthesized style)
  - Update test_input_types_match_spec to follow $ref to InteractionsInput schema (Google updated their OpenAPI spec to use $ref instead of inline oneOf)
  - Update test_content_schema_uses_discriminator to handle discriminator without explicit mapping (Google removed the mapping key from Content discriminator)
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* revert: undo incorrect Black 26.x formatting on litellm_logging.py
  The file was correctly formatted for Black 23.12.1 (the version pinned in pyproject.toml). The previous commit applied Black 26.x formatting which was incompatible with the CI's Black version.
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(ci): deduplicate and sort langfuse batch events after aggregation
  The Langfuse SDK may send the same event (e.g., trace-create) in multiple flush cycles, causing duplicates when we aggregate from all mock calls. After filtering by trace_id, deduplicate by keeping only the first event of each type, then sort to ensure trace-create is at index 0 and generation-create at index 1.
  Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-18 15:09:01 -07:00
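Note on the langfuse test fixes described in the commit above: the aggregate, filter, deduplicate, and sort steps can be sketched roughly as follows. This is a minimal illustration, not the repository's `_verify_langfuse_call` helper. The function name `collect_batch_events`, the `EVENT_ORDER` table, the URL, and the payload shape (a `json` keyword argument holding a `batch` list of events with `type` and `trace_id` keys) are all assumptions made for the example.

```python
from unittest.mock import MagicMock

# Assumed ordering: the commit only states that trace-create must land at
# index 0 and generation-create at index 1.
EVENT_ORDER = {"trace-create": 0, "generation-create": 1}


def collect_batch_events(mock_post, trace_id):
    """Gather events from ALL recorded calls, not just the last one."""
    all_events = []
    for call in mock_post.call_args_list:
        # Hypothetical payload shape: post(url, json={"batch": [...]})
        payload = call.kwargs.get("json") or {}
        all_events.extend(payload.get("batch", []))

    # Keep only the events that belong to the trace under test.
    matching = [e for e in all_events if e.get("trace_id") == trace_id]
    assert matching, f"no events found for trace_id={trace_id}"

    # Deduplicate: keep the first event seen for each event type.
    first_by_type = {}
    for event in matching:
        first_by_type.setdefault(event["type"], event)

    # Sort so the result order is deterministic regardless of flush order.
    return sorted(
        first_by_type.values(),
        key=lambda e: EVENT_ORDER.get(e["type"], len(EVENT_ORDER)),
    )


# Simulate three flush cycles that arrive out of order and with a duplicate.
mock_post = MagicMock()
url = "https://example.invalid/ingest"  # placeholder URL
mock_post(url, json={"batch": [{"type": "generation-create", "trace_id": "t1"}]})
mock_post(url, json={"batch": [{"type": "trace-create", "trace_id": "t1"},
                               {"type": "trace-create", "trace_id": "t2"}]})
mock_post(url, json={"batch": [{"type": "trace-create", "trace_id": "t1"}]})

events = collect_batch_events(mock_post, "t1")
print([e["type"] for e in events])  # ['trace-create', 'generation-create']
```

Reading only `mock_post.call_args` in this scenario would return the third call's single-event batch, which is the situation the commit message reports as raising an IndexError.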
yuneng-jiang	2c0c08722a	Merge pull request #24035 from BerriAI/litellm_fix_guardrail_mode_type_crash [Fix] UI - Logs: Guardrail Mode Type Crash on Non-String Values	2026-03-18 15:05:52 -07:00
yuneng-jiang	51f78b7d72	address greptile review feedback (greploop iteration 4) - Add guard assertion before non-null click on custom code switch - Use await act(async ...) for timer advancement to avoid act warnings - Pin locale in date range assertion for CI determinism Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 13:09:17 -07:00
yuneng-jiang	eb7efa36da	Add missing permission options to PERMISSION_OPTIONS list Adds /key/info, /key/list, /key/aliases, and /team/daily/activity to the hardcoded PERMISSION_OPTIONS in TeamSSOSettings.tsx. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 13:05:53 -07:00
yuneng-jiang	b75266d254	merge origin/main, resolve test file conflicts Resolved conflicts in ScoreChart.test.tsx and HelpLink.test.tsx by preferring origin/main's renderWithProviders pattern and merging unique tests from both branches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 12:57:31 -07:00
yuneng-jiang	bbc120095e	address greptile review feedback (greploop iteration 3) - Remove Ant Design CSS class selector coupling in ExportFormatSelector test - Lift mock fns out of TestTable component body to enable callback assertions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 12:48:01 -07:00

3237 Commits