litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-27 13:05:45 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	92db2df2f6	Merge pull request #23794 from ndgigliotti/feat/bedrock-structured-output-cost-json Bedrock: move native structured output model list to cost JSON, add Sonnet 4.6	2026-03-28 20:04:47 -07:00
Sameer Kankute	cdc1dd5c37	Fix the tests	2026-03-27 20:36:01 +05:30
Sameer Kankute	cc73ae776a	feat(gemini): add Lyria 3 preview models to cost map and docs Made-with: Cursor	2026-03-27 20:36:00 +05:30
Nicholas Gigliotti	92654bad37	Refactor _supports_native_structured_outputs to use standard supports_* utility pattern Addresses Greptile review feedback: replace direct litellm.model_cost lookup with the standard _supports_factory infrastructure used by supports_reasoning, supports_native_streaming, etc. - Add supports_native_structured_output() utility in litellm/utils.py - Add supports_native_structured_output field to ModelInfoBase type - Wire field into _get_model_info_helper return dict - Delegate from Bedrock _supports_native_structured_outputs to utility - Add field to JSON schema validator in test_utils.py	2026-03-26 21:49:03 -04:00
Sameer Kankute	92e98a2fd5	Fix test_aaamodel_prices_and_context_window_json_is_valid	2026-03-20 23:35:00 +05:30
Cesar Garcia	6bd7cd7573	Merge branch 'main' into litellm_oss_staging_03_11_2026	2026-03-12 10:43:08 -03:00
Chesars	1be6b31e2f	merge: resolve conflicts between main and litellm_oss_staging_03_11_2026	2026-03-12 09:38:31 -03:00
yuneng-jiang	626d120873	Merge pull request #23425 from BerriAI/cursor/litellm-ci-stability-4513 [Infra] CI/CD Fixes	2026-03-11 21:08:16 -07:00
Sameer Kankute	49d653c3aa	Revert "chore: cleanup deprecated models from pricing JSON"	2026-03-12 09:27:40 +05:30
Cursor Agent	d5fc63f63f	fix(ci): fix deprecated model refs and schema validation in unit tests - Replace gemini-pro with gemini-3-pro-preview in test_cost_discount_vertex_ai (gemini-pro removed from cost map) - Replace github/claude-3-5-sonnet-latest with github/claude-3-7-sonnet-20250219 in test_supports_function_calling_github_anthropic_alias (model removed) - Add supports_multimodal, uses_embed_content, input/output_cost_per_token_above_256k_tokens to JSON schema in test_utils.py (new properties added to model cost map) Co-authored-by: yuneng-jiang <yuneng-jiang@users.noreply.github.com>	2026-03-12 03:28:24 +00:00
yuneng-jiang	c9f7075690	Replace additional deprecated models across test files - tests/local_testing/test_completion_cost.py: - claude-3-5-sonnet-20240620 -> claude-sonnet-4-6 - gemini/gemini-1.5-flash-001 -> gemini/gemini-2.5-flash - tests/test_litellm/test_utils.py: - claude-3-5-sonnet-20240620 -> claude-sonnet-4-6 (VertexAI config test, proxy tests) - gemini-1.5-pro -> gemini-2.5-pro (pre_process_non_default_params) - gemini/gemini-1.5-pro -> gemini/gemini-2.5-pro (proxy tests) - tests/litellm_utils_tests/test_utils.py: - claude-3-opus-20240229 -> claude-sonnet-4-6 (trimming, vision tests) - gemini-pro -> gemini-2.5-pro (function calling test) - gemini-pro-vision -> gemini-2.5-flash (vision test) - gemini-1.5-pro -> gemini-2.5-pro (response schema test) - gemini/gemini-1.5-flash -> gemini/gemini-2.5-flash (function calling test) - gemini-1.5-pro -> gemini-2.5-pro (vision gemini test) - gpt-4-vision-preview -> gpt-4o (vision test) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 17:03:54 -07:00
yuneng-jiang	9379cb1038	[Fix] Replace deprecated models in function calling tests Replace deprecated model references in test_proxy_function_calling_support_consistency: - claude-3-5-sonnet-20240620 -> claude-sonnet-4-6 - gemini-pro -> gemini-2.5-pro - gemini/gemini-1.5-pro -> gemini/gemini-2.5-pro - gemini/gemini-1.5-flash -> gemini/gemini-2.5-flash Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 16:46:56 -07:00
Chesars	1a3fdc7ae3	fix: align Vertex AI Claude deprecations with Google's schedule - Restore vertex_ai/claude-3-7-sonnet@20250219 (Vertex AI shutdown is May 11, 2026, still active — was incorrectly removed based on Anthropic API retirement date) - Remove vertex_ai/claude-3-5-sonnet-v2 and vertex_ai/claude-3-5-sonnet-v2@20241022 (Vertex AI shutdown was Feb 19, 2026, already past) - Remove deprecated claude-3-7-sonnet-20250219 from web search test, use only non-deprecated models Source: https://docs.google.com/vertex-ai/generative-ai/docs/deprecations/partner-models	2026-03-11 15:11:07 -03:00
Chesars	d81d751af0	fix(tests): update tests to use models still present in pricing JSON Replace removed deprecated models (claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-5-haiku-latest) with current models in web_search and cost calculation tests.	2026-03-11 14:50:47 -03:00
Chesars	a6cb510703	merge: resolve conflicts between main and litellm_oss_staging_03_04_2026 Resolved 14 file conflicts: - image_edits.md: combined OpenRouter + Black Forest Labs providers - utils.py: kept staging's message-level cache_control check - networking.tsx: kept export on 4 tool interfaces - tool_management_endpoints.py: kept ToolOutputPolicy import - Accepted main's version for: schema.prisma, a2a_protocol, mcp_server, _types.py, auth_checks.py, db_spend_update_writer, endpoints.py, spend_tracking_utils, a2a_endpoints, model_prices backup	2026-03-10 10:45:04 -03:00
yuneng-jiang	be9d1798b2	Merge pull request #23182 from BerriAI/litellm_/exciting-swanson [Fix] Model pricing schema test missing output_cost_per_image_token_batches	2026-03-09 14:26:21 -07:00
yuneng-jiang	379ce1aae5	[Fix] Add output_cost_per_image_token_batches to model pricing schema test The gemini-3.1-flash-image-preview model introduced a new pricing field that was missing from the test's validation schema and cost_fields list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 14:17:52 -07:00
Ishaan Jaff	28c33f53a3	CircleCI test stability (#23055 ) * fix: resolve ruff lint errors and mypy type error - Remove unused import get_user_credential (F401) - Add noqa: PLR0915 for 3 large functions exceeding 50 statements - Cast result_data['q'] to str for _append_domain_filters (mypy arg-type) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add /vertex_ai/live to supported endpoints and azure gpt-5.1 reasoning flags - Add /vertex_ai/live to JSON schema validation enum in test_utils.py - Add supports_none_reasoning_effort=true to 10 azure/gpt-5.1 model entries (matching the OpenAI gpt-5.1 behavior) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: handle non-string team_alias/key_alias in PolicyMatchContext Prevent Pydantic validation errors when team_alias or key_alias are not proper strings (e.g. MagicMock in tests). Only pass values that are actually strings; default to None otherwise. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: initialize jwt_handler.litellm_jwtauth in JWT test The test_jwt_non_admin_team_route_access test was failing because user_api_key_auth now accesses jwt_handler.litellm_jwtauth.virtual_key_claim_field before reaching the mocked JWTAuthManager.auth_builder. Initialize the jwt_handler with a default LiteLLM_JWTAuth object. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add missing mock attributes to MCP server test The test_add_update_server_fallback_to_server_id test was failing because MagicMock auto-creates attributes when accessed. build_mcp_server_from_table accesses many fields via getattr(), which on a MagicMock returns another MagicMock instead of None, causing Pydantic validation errors in MCPServer. Explicitly set all required mock attributes. 
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: update UI tests for leftnav, navbar, and KeyLifecycleSettings - leftnav: Add mock for useTeams hook, add isUserTeamAdminForAnyTeam to roles mock, update topLevelLabels to match current component menu items - navbar: Add mocks for useDisableBouncingIcon, BlogDropdown, UserDropdown, and serverRootPath. Update test to work with the new component structure. - KeyLifecycleSettings: Fix placeholder and tooltip assertions to match actual component behavior Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: update health check test assertion from 'connected' to 'healthy' The /health/readiness endpoint now returns {"status": "healthy"} with the DB status in a separate field, instead of the previous {"status": "connected"}. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: clear litellm.api_key in OpenRouter validate_environment test The test_validate_environment_raises_without_key test was failing because litellm.api_key may be set globally in the test environment. Clear it along with OPENROUTER_API_KEY and OR_API_KEY env vars using monkeypatch. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: patch HTTPHandler class-level in VLLM embedding test The test_encoding_format_not_sent_in_actual_request test was patching client.post on an instance, but the handler uses the class method. Patch HTTPHandler.post at class level, add caching=False to prevent cache hits, and remove broad try/except that hid errors. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: make test_redaction_responses_api_stream resilient to async callback timing Replace fixed 1s sleep with polling wait for async_log_success_event. Streaming success handler runs via asyncio.create_task; 1s was insufficient in CI. Add 0.5s initial sleep for event loop to schedule the task, then poll up to 10s for the callback to fire. 
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: update dompurify and svgo to fix security CVEs - CVE-2026-0540: dompurify XSS vulnerability - fix by upgrading to 3.3.2+ - CVE-2026-29074: svgo DoS via entity expansion - fix by upgrading to 3.3.3+ Added npm overrides in docs/my-website/package.json and regenerated package-lock.json. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: remove unused json import in config_override_endpoints.py Ruff F401: json is imported but unused (safe_json_loads/safe_dumps are used instead) Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add missing MCP mock attributes and provider documentation entries - Add missing mock attributes to test_add_update_server_with_alias and test_add_update_server_without_alias (same fix as fallback test) - Add bedrock_mantle and searchapi to provider_endpoints_support.json - Remove unused json import from config_override_endpoints.py Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: override _supports_reasoning_effort_level for Azure gpt5_series prefix The Azure GPT-5 config uses 'gpt5_series/' as a routing prefix, but _supports_factory(model='gpt5_series/gpt-5.1') fails to resolve because 'gpt5_series' is not a recognized provider. Override the method to strip the prefix and prepend 'azure/' for correct model info lookup. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: accept both 'healthy' and 'connected' in health check test The test_health_and_chat_completion test runs against both source builds (which return 'healthy') and pip-installed versions (which may return 'connected'). Accept both values. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: mock extract_mcp_auth_context in streamable HTTP MCP handler test The handle_streamable_http_mcp function now calls extract_mcp_auth_context before session_manager.handle_request, but the test didn't mock it. 
The auth extraction fails with the minimal mock scope, preventing handle_request from being called. Also relax assertion to not check exact args since the send wrapper may be modified by debug injection. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add test for _combine_fallback_usage to satisfy router code coverage The router_code_coverage.py check requires all functions in router.py to be called in test files. Add a basic test for _combine_fallback_usage. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add @log_guardrail_information decorator to CrowdStrike AIDR guardrail The check_guardrail_apply_decorator.py CI check requires all guardrail apply_guardrail methods to have the @log_guardrail_information decorator. The CrowdStrike AIDR handler was missing it. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: document PRISMA_RECONNECT_ESCALATION_THRESHOLD and REDIS_CLUSTER_NODES env keys Add missing environment variable documentation to config_settings.md to satisfy the test_env_keys.py CI check. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: document enforced_file_expires_after and enforced_batch_output_expires_after in new_team docstring The test_api_docs.py CI check validates that all Pydantic model fields are documented in the function docstring. Add missing parameter docs for enforced_file_expires_after and enforced_batch_output_expires_after. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: regenerate poetry.lock to match pyproject.toml The poetry.lock file was out of sync with pyproject.toml, causing proxy_e2e_azure_batches_tests to fail during dependency installation. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: set master_key=None in test_create_file_with_deep_nested_litellm_metadata The test was missing the master_key monkeypatch that other tests in the same file set. 
In CI with parallel execution (-n 4), another test may set master_key to a non-None value, causing auth failures (500) when the test sends 'Bearer test-key'. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: document enforced__expires_after in update_team docstring too Same missing params as new_team - also needed in update_team docstring for the test_api_docs.py CI check to pass. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> fix: use get_async_httpx_client in a2a_protocol and add master_key monkeypatch to files tests - Replace httpx.AsyncClient() with get_async_httpx_client() in a2a_protocol/main.py to satisfy the ensure_async_clients_test CI check - Add httpxSpecialProvider.A2AProvider enum value - Add master_key=None monkeypatch to test_managed_files_with_loadbalancing Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: remove unused httpx import from a2a_protocol/main.py Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: use cache-key-only param for A2A extra_headers to avoid AsyncHTTPHandler init error The 'extra_headers' key in params was being passed to AsyncHTTPHandler.__init__() which doesn't accept it. Use 'disable_aiohttp_transport' as the cache-key-only param since it's explicitly filtered out before reaching the constructor. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: add additionalProperties:false and resolve $defs/$ref in Anthropic output_format schemas Anthropic API now requires additionalProperties=false for all object-type schemas in output_format. Also resolve $defs/$ref references by inlining them using unpack_defs before sending to Anthropic, since Anthropic doesn't support external schema references. 
Fixes: llm_translation_testing Anthropic JSON schema failures Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: allowlist CVE-2026-2297 and GHSA-qffp-2rhf-9h96 in security scans - CVE-2026-2297: Python 3.13 SourcelessFileLoader audit hook bypass, no fix available in base image - GHSA-qffp-2rhf-9h96: tar hardlink path traversal, from nodejs_wheel bundled npm, not used in application runtime code Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: isolate files endpoint tests from shared proxy state in CI parallel execution Override user_api_key_auth dependency to return a fixed UserAPIKeyAuth with PROXY_ADMIN role, avoiding auth lookups via prisma_client, user_api_key_cache, or master_key. Set prisma_client=None to prevent DB state contamination. Use try/finally to clean up dependency overrides. Fixes persistent test_create_file_with_deep_nested_litellm_metadata and test_managed_files_with_loadbalancing 500 errors in CI with -n 4. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * fix: apply same auth override to test_managed_files_with_loadbalancing Same CI parallel execution fix as test_create_file_with_deep_nested - override user_api_key_auth dependency and set prisma_client=None. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>	2026-03-07 15:19:39 -08:00
Yangqian Yan	7f5d5c5c6e	fix: use DeepSeekChatConfig instead of OpenAIConfig for deepseek provider (#22971) * fix: use DeepSeekChatConfig instead of OpenAIConfig for deepseek provider The deepseek provider was incorrectly using OpenAIConfig().map_openai_params() instead of DeepSeekChatConfig().map_openai_params(), which meant DeepSeek-specific parameter mappings were not being applied. * test: add unit tests for deepseek DeepSeekChatConfig param mapping Verify that get_optional_params uses DeepSeekChatConfig (not OpenAIConfig) for the deepseek provider by testing thinking, reasoning_effort, and budget_tokens stripping behavior.	2026-03-06 18:16:27 -08:00
Sameer Kankute	d8f139fe4d	feat(openai): add 272K tier pricing for GPT-5.4/5.4-pro Prompts >272K input tokens priced at 2x input, 1.5x output for full session (standard, batch, flex). Applies to models with 1.05M context window (gpt-5.4, gpt-5.4-pro). - Add input/output_cost_per_token_above_272k_tokens to model_prices - Add above_272k fields to ModelInfoBase and get_model_info extraction - Add test_generic_cost_per_token_gpt54_above_272k_tokens Made-with: Cursor	2026-03-06 22:26:14 +05:30
Julio Quinteros	db8e909ef2	fix(test): add 'realtime' to model mode enum in schema validation gemini/gemini-live-2.5-flash-preview-native-audio-09-2025 uses mode='realtime' but the schema in test_aaamodel_prices_and_context_window_json_is_valid did not include 'realtime' as a valid enum value, causing a ValidationError. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-05 06:41:51 -03:00
Julio Quinteros Pro	4ec92ba924	fix: add new model_prices properties to validation schema Add cache_read_input_token_cost_per_audio_token, supports_code_execution, and supports_file_search to the JSON schema used by the model prices validation test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 11:37:02 -03:00
Aarish Alam	ce54c39051	Bug Fix: auto-inject prompt caching support for Gemini models (#21881) * add explicit caching to litellm proxy for gemini models via injection * fix: add missing `supports_function_calling` for deepinfra models All 55 deepinfra models that had `supports_tool_choice: true` were missing the `supports_function_calling` flag, causing `litellm.supports_function_calling()` to incorrectly return False. Fixes #22619 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Managed batches - Address PR bot comments from #22464 * feat(togetherai): add support for TogetherAI Qwen3.5-397B-A17B model * Agent Tracing - support context_id based trace id propagation + nested llm calls (#22626) * style(ui/): distinguish agent calls from llm calls on ui * feat: initial grouping working * feat: set stable contextid for a2a calls - allows for easily passing to downstream llm/mcp calls * feat(a2a_endpoints.py): fix tracing to avoid recreating logging objects for the same call allows stable trace id usage * fix(guardrail_endpoints): handle string ui_type values in _build_field_dict _build_field_dict unconditionally called .value on ui_type, which crashes for guardrail configs that use plain strings (e.g. BlockCodeExecutionGuardrailConfigModel uses "multiselect" and "percentage"). Now checks with hasattr before calling .value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: propagate trace/session id from headers in MCP server calls Cherry-picked mcp_server/server.py fixes from `6feb9bab`: adds get_chain_id_from_headers to extract x-litellm-trace-id / x-litellm-session-id from raw headers, and uses it in call_tool and list_tools to keep spend logs and tracing consistent with A2A. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * [Feat] UI - Add Open in New Tab on leftnav Bar (#22731) * Add minimal dev_config.yaml for proxy development Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * feat(ui): wrap left nav items in <a> tags for open-in-new-tab support Nav items are now rendered as <a> elements with proper href attributes, enabling right-click → 'Open in new tab', Ctrl/Cmd+click, and middle-click to open any sidebar page in a new browser tab. Normal clicks continue to use SPA navigation (no full page reload). Applied to both leftnav.tsx (query-param routing) and Sidebar2.tsx (Next.js file-based routing). Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * [Feat] Add Tool Policies for AI Gateway (#22732) * fix: fix ui render * fix: fix minor bugs * refactor: use prisma functions instead of raw sql (safer) * fix(add-new-tiles-to-tool-policies): allow developer to see what's available * feat: ensure tool allowlist runs correctly for tool names + mcp's * refactor: more ui improvements * feat: working key tool blocking * feat(tools): show tool logs * refactor: backend code improvements * refactor: improve log viewer for tools * fix: address PR review feedback for tool access control - Add missing blocked_tools column to root schema.prisma (schema drift) - Invalidate ToolPolicyRegistry after policy mutations so changes take effect immediately - Remove dead code: unused get_effective_policies, get_tool_policies_cached, and helpers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: race condition in permission resolution and remove duplicate allowlist check - Use atomic update_many with object_permission_id=None to prevent concurrent requests from creating orphaned permission rows and losing tool 
blocks - Remove duplicate allowed_tools enforcement from guardrail (already enforced in auth layer via check_tools_allowlist) - Move inline uuid import to module level Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * update to account for userAgent * UI - Add ToolDetails * input/output policy * LiteLLM_PolicyAttachmentTable * LiteLLM_PolicyAttachmentTable * fix: add _enqueue_tool_registry_upsert * fix: tool mgmt endpoints * tool mgmt endpoints * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update tests/test_litellm/proxy/db/test_tool_registry_writer.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: sync root schema.prisma and fix test_tool_registry_writer for input/output policy - Migrate root schema.prisma LiteLLM_ToolTable from call_policy to input_policy/output_policy, add missing user_agent and last_used_at columns (now consistent with litellm/proxy/schema.prisma and litellm-proxy-extras) - Fix SpendLogToolIndex comment across all three schema files - Fix all call_policy references in test_tool_registry_writer.py: swapped update_tool_policy arguments, wrong get_tools_by_names return type assertions, _mock_tool_row setting call_policy instead of input_policy Addresses Greptile review feedback on PR #22732. 
Made-with: Cursor --------- Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * feat(proxy): add key_alias, key_hash, requested_model DD APM span tags (#22710) * feat(proxy): add key_alias, key_hash, requested_model tags to DD APM spans * refactor(proxy): consolidate DD APM tag helpers into DDSpanTagger class * refactor(proxy): move DDSpanTagger to its own file litellm/proxy/dd_span_tagger.py --------- Co-authored-by: liweiguang <codingpunk@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Ephrim Stanley <ephrim.stanley@point72.com> Co-authored-by: Varad Khonde <varadkhonde@gmail.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-03 20:25:35 -08:00
Sameer Kankute	7a83acf086	Merge pull request #22620 from OiPunk/codex/litellm-22619-deepinfra-function-calling fix: add missing supports_function_calling for deepinfra models	2026-03-04 08:51:21 +05:30
Julio Quinteros Pro	9b92ea16ab	fix: update response_format test for vertex_ai's intentional schema diff Vertex AI / Gemini uses Pydantic's model_json_schema() which omits additionalProperties: False (Gemini rejects it). The test expected the same schema for all providers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 19:55:18 -03:00
liweiguang	81ddf08494	fix: add missing `supports_function_calling` for deepinfra models All 55 deepinfra models that had `supports_tool_choice: true` were missing the `supports_function_calling` flag, causing `litellm.supports_function_calling()` to incorrectly return False. Fixes #22619 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 12:12:43 +08:00
Cesar Garcia	587977e19a	Merge pull request #19792 from Chesars/fix/openrouter-register-model-index-error fix(register_model): handle openrouter models without '/' in name	2026-02-27 18:52:14 -03:00
Julio Quinteros Pro	bf8c219860	fix(tests): use os.path instead of Path to avoid NameError Path is not imported at module level. Use os.path.join which is already available. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 13:56:11 -03:00
Julio Quinteros Pro	a74b6eee23	Update tests/test_litellm/test_utils.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-02-23 13:55:49 -03:00
Julio Quinteros Pro	11a774e110	fix(tests): use absolute path for model_prices JSON in validation test The test used a relative path 'litellm/model_prices_and_context_window.json' which only works when pytest runs from a specific working directory. Use os.path based on __file__ to resolve the path reliably. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 13:55:49 -03:00
Sameer Kankute	f97ee62fb0	Merge pull request #21909 from BerriAI/litellm_cost_tracking_gemini Add Priority PayGo cost tracking gemini/vertex ai	2026-02-23 18:58:57 +05:30
Sameer Kankute	61e63b6553	Merge pull request #21904 from BerriAI/litellm_fix_model_cost_map fix model cost map for anthropic fast and inference_geo	2026-02-23 18:57:15 +05:30
Sameer Kankute	2f8d36be1b	Fix test_aaamodel_prices_and_context_window_json_is_valid	2026-02-23 18:56:12 +05:30
Sameer Kankute	c7aafdf794	Merge pull request #21926 from BerriAI/main merge main in oss 21 02	2026-02-23 18:17:30 +05:30
Sameer Kankute	22bccc4f61	Fix entries with fast and us/	2026-02-23 11:23:24 +05:30
Ryan Crabbe	ea32ad72c6	Merge origin/main into perf/callback-registration-routing Resolve conflicts: - logging_callback_manager.py: keep PR's MAX_CALLBACKS, _is_async_callable, Callable type - test_utils.py: keep both TestCallbackAsyncSyncSeparation and TestMetadataNoneHandling	2026-02-21 12:40:23 -08:00
Cesar Garcia	cc6ef0e3f7	fix(utils): normalize camelCase thinking param keys to snake_case (#21762) Clients like OpenCode's @ai-sdk/openai-compatible send budgetTokens (camelCase) instead of budget_tokens in the thinking parameter, causing validation errors. Add early normalization in completion().	2026-02-21 11:14:39 -08:00
Sameer Kankute	36fd14357c	Fix: replace deprecated claude-3-7-sonnet-20250219 with claude-4-sonnet-20250514	2026-02-20 17:27:59 -08:00
michelligabriele	d001fe9a16	fix(model-pricing): add missing fireworks_ai model pricing for glm-4p7, minimax-m2p1, kimi-k2p5 (#21642) * fix(model-pricing): add missing fireworks_ai model pricing for glm-4p7, minimax-m2p1, kimi-k2p5 Fireworks AI models called via short-form (fireworks_ai/<model>) were reporting $0.00 cost because the pricing JSON lacked short-form entries. The lookup fell through to the fireworks-ai-default bucket which has zero cost. Added 5 new entries to model_prices_and_context_window.json: - fireworks_ai/accounts/fireworks/models/glm-4p7 (new long-form) - fireworks_ai/accounts/fireworks/models/minimax-m2p1 (new long-form) - fireworks_ai/glm-4p7 (new short-form) - fireworks_ai/minimax-m2p1 (new short-form) - fireworks_ai/kimi-k2p5 (new short-form; long-form already existed) Pricing sourced from fireworks.ai model pages and pricing page. * add cache_read_input_token_cost to kimi-k2p5 long-form entry for consistency	2026-02-20 08:31:52 -08:00
SolitudePy	7fc29dc9f1	fix: allow github aliases to reuse upstream model metadata Update provider matching so github/<model> aliases can resolve capabilities from existing upstream model metadata, including OpenAI and Anthropic entries. Add regression tests for known github aliases and unknown-model fallback behavior.	2026-02-18 22:29:15 +02:00
Julio Quinteros Pro	d4755c8284	fix(tests): add inference_geo to model prices JSON schema The model_prices_and_context_window_backup.json file has 'inference_geo' fields (e.g. on 'us/claude-sonnet-4-6') for geo-prefixed Anthropic models used in cost calculation, but the JSON schema validator in test_utils.py did not include 'inference_geo' as an allowed property. This caused test_aaamodel_prices_and_context_window_json_is_valid to fail with: Additional properties are not allowed ('inference_geo' was unexpected) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-18 11:29:31 -03:00
BlueT - Matthew Lien - 練喆明	c0de6c5c6c	[Fix] handle metadata=None in SDK path retry/error logic (utils.py) (#20873) * [Fix] handle metadata=None in SDK path retry/error logic (utils.py) Fixes #20871 Same class of bug as #9717 (fixed by #9764 for the proxy path). The SDK path in utils.py has the same fragile pattern at 7 locations. Replace `kwargs.get("metadata", {})` with `(kwargs.get("metadata") or {})` to handle the case where metadata key exists with value None (e.g. from Azure OpenAI streaming responses). This is consistent with the existing correct pattern at line 602: `metadata = kwargs.get("metadata") or {}` Adds TestMetadataNoneHandling with 6 unit tests in test_utils.py. * fix: remove duplicate PerplexityResponsesConfig key in lazy imports registry Removes duplicate dictionary key added in commit `be0ebb15` (PR #20860). The entry at line 1042 is identical to the existing entry at line 906. This causes ruff F601 lint failure on all PRs targeting main.	2026-02-10 22:03:33 -08:00
Alexsander Hamir	ebce0e5f8c	[Release - 02/10/2026] v1.81.10-nightly	2026-02-10 16:26:30 -08:00
Ryan Crabbe	aaaf7f3b6c	perf: move async/sync callback separation from per-request to registration time The three loops in function_setup that called is_async_callable() on every callback each request were redundant after the first request. Move the async/sync routing into LoggingCallbackManager.add_litellm_*_callback() so it happens once at registration time instead of on every request.	2026-02-07 12:10:38 -08:00
ryan-crabbe	14c2b5da91	perf: replace enum construction with frozenset lookup in _is_streaming_request (#20302) CallTypes(call_type) was constructing an enum from string on every call, taking ~4.6µs/call (69.6% of function time). Replace with a frozenset membership test for ~0.8µs/call (8.3x faster).	2026-02-07 10:50:57 -08:00
shin-bot-litellm	df299d3193	fix(tests): Fix flaky container and scientific notation tests (#20650) * fix(tests): Mock async_container_create_handler for async router test The test was mocking container_create_handler (sync), but router.acreate_container uses _is_async=True which calls async_container_create_handler. This caused the test to hit the real OpenAI API. Fixed by using AsyncMock on async_container_create_handler. * fix(tests): Use uuid for unique model name in scientific notation test The test was using a static "unique" model name which could cause conflicts when running tests in parallel (-n 16 in CI). Using uuid ensures truly unique names to prevent test pollution. --------- Co-authored-by: Shin <shin@openclaw.ai>	2026-02-07 09:57:08 -08:00
yuneng-jiang	3504f05a5c	Adding tests + update pyproject	2026-02-05 21:00:05 -08:00
Sameer Kankute	bb363f0307	Fix: test_bedrock_optional_params_embeddings_dimension	2026-02-02 17:49:18 +05:30
Sameer Kankute	be0bb975c0	Fix test_aaamodel_prices_and_context_window_json_is_valid	2026-02-02 17:46:37 +05:30
shin-bot-litellm	0c006794f1	litellm_fix_mapped_tests_core: fix test isolation and mock injection issues (#20209) * litellm_fix_mapped_tests_core: fix test isolation and mock injection issues ## Problem Four tests in litellm_mapped_tests_core were failing: 1. test_register_model_with_scientific_notation - KeyError due to test isolation issues 2. test_search_uses_registry_credentials - Mock not being called due to incorrect patch path 3. test_send_email_missing_api_key - Real API calls despite mocking 4. test_stream_transformation_error_sync - Mock not effective, real API called ## Solution ### test_register_model_with_scientific_notation - Use unique model name to avoid conflicts with other tests - Clear LRU caches before test to prevent stale data - Clean up model_cost entry after test ### test_search_uses_registry_credentials - Use patch.object() on the actual base_llm_http_handler instance - String-based patching for instance methods can fail; direct object patching is more reliable ### test_send_email_missing_api_key - Directly inject mock HTTP client into logger instance - This bypasses any caching issues that could cause the fixture mock to be ineffective ### test_stream_transformation_error_sync - Patch litellm.completion directly instead of the handler module's litellm reference - This ensures the mock is effective regardless of import order ## Regression These tests were affected by LRU caching added in #19606 and HTTP client caching. * fix(test): use patch.object for container API tests to fix mock injection ## Problem test_retrieve_container_basic tests were failing because mocks weren't being applied correctly. The tests used string-based patching: patch('litellm.containers.main.base_llm_http_handler') But base_llm_http_handler is imported at module level, so the mock wasn't intercepting the actual handler calls, resulting in real HTTP requests to OpenAI API. ## Solution Use patch.object() to directly mock methods on the imported handler instance. Import base_llm_http_handler in the test file and patch like: patch.object(base_llm_http_handler, 'container_retrieve_handler', ...) This ensures the mock is applied to the actual object being used, regardless of import order or caching. * fix(test): add missing Prometheus metric labels to test_proxy_failure_metrics Add client_ip, user_agent, model_id labels to expected metric patterns. These labels were added in PRs #19717 and #19678 but test wasn't updated. * fix(test_resend_email): use direct mock injection for all email tests Extend the mock injection pattern used in test_send_email_missing_api_key to all other tests in the file: - test_send_email_success - test_send_email_multiple_recipients Instead of relying on fixture-based patching and respx mocks which can fail due to import order and caching issues, directly inject the mock HTTP client into the logger instance. This ensures mocks are always used regardless of test execution order. * fix(test): use patch.object for image_edit and vector_store tests - test_image_edit_merges_headers_and_extra_headers: import base_llm_http_handler and use patch.object instead of string path patching - test_search_uses_registry_credentials: import module and patch via module.base_llm_http_handler to ensure we patch the right instance --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2026-01-31 17:53:54 -08:00
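
Note on commit 0c006794f1: the fix patches the shared handler object directly instead of a dotted string path. A minimal, self-contained sketch of that idea follows. `Handler`, `handler`, `retrieve_container`, and `run_with_mock` are hypothetical stand-ins for illustration, not litellm's actual classes or test code.

```python
from unittest.mock import patch


class Handler:
    """Hypothetical stand-in for a shared HTTP handler class."""

    def container_retrieve_handler(self, container_id):
        # Simulates the real network call that the tests must never reach.
        raise RuntimeError("real network call attempted")


# Module-level instance, analogous to an object created once at import time.
handler = Handler()


def retrieve_container(container_id):
    """Hypothetical caller that goes through the shared handler instance."""
    return handler.container_retrieve_handler(container_id)


def run_with_mock():
    # patch.object swaps the attribute on the instance itself, so every
    # caller holding a reference to that instance sees the mock.
    with patch.object(
        handler, "container_retrieve_handler", return_value={"id": "c_123"}
    ) as mock_retrieve:
        result = retrieve_container("c_123")
    mock_retrieve.assert_called_once_with("c_123")
    return result
```

Because the patch targets the object rather than a name in some module's namespace, it does not depend on which module imported the handler first. The original method is restored when the `with` block exits.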

124 Commits
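
Note on commit c0de6c5c6c: the message describes replacing `kwargs.get("metadata", {})` with `kwargs.get("metadata") or {}`. The difference can be sketched in isolation as below. The function names and the `request_kwargs` dictionary are hypothetical and are not taken from `utils.py`.

```python
def read_metadata_fragile(kwargs):
    # The default only applies when the key is missing, so an explicit
    # metadata=None is returned as None.
    return kwargs.get("metadata", {})


def read_metadata_safe(kwargs):
    # `or {}` also covers the case where the key exists with value None.
    return kwargs.get("metadata") or {}


# Hypothetical call arguments where the key is present but set to None.
request_kwargs = {"metadata": None}

fragile = read_metadata_fragile(request_kwargs)  # None
safe = read_metadata_safe(request_kwargs)        # {}
```

Any later `fragile.get(...)` call would raise `AttributeError`, whereas `safe` can always be treated as a dictionary.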