Addresses Greptile review feedback: replace direct litellm.model_cost
lookup with the standard _supports_factory infrastructure used by
supports_reasoning, supports_native_streaming, etc.
- Add supports_native_structured_output() utility in litellm/utils.py
- Add supports_native_structured_output field to ModelInfoBase type
- Wire field into _get_model_info_helper return dict
- Delegate from Bedrock _supports_native_structured_outputs to utility
- Add field to JSON schema validator in test_utils.py
- Replace gemini-pro with gemini-3-pro-preview in test_cost_discount_vertex_ai
(gemini-pro removed from cost map)
- Replace github/claude-3-5-sonnet-latest with github/claude-3-7-sonnet-20250219
in test_supports_function_calling_github_anthropic_alias (model removed)
- Add supports_multimodal, uses_embed_content, input/output_cost_per_token_above_256k_tokens
to JSON schema in test_utils.py (new properties added to model cost map)
Co-authored-by: yuneng-jiang <yuneng-jiang@users.noreply.github.com>
Replace removed deprecated models (claude-3-5-sonnet-20241022,
claude-3-5-haiku-20241022, claude-3-5-haiku-latest) with current
models in web_search and cost calculation tests.
The gemini-3.1-flash-image-preview model introduced a new pricing field
that was missing from the test's validation schema and cost_fields list.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve ruff lint errors and mypy type error
- Remove unused import get_user_credential (F401)
- Add noqa: PLR0915 for 3 large functions exceeding 50 statements
- Cast result_data['q'] to str for _append_domain_filters (mypy arg-type)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add /vertex_ai/live to supported endpoints and azure gpt-5.1 reasoning flags
- Add /vertex_ai/live to JSON schema validation enum in test_utils.py
- Add supports_none_reasoning_effort=true to 10 azure/gpt-5.1 model entries
(matching the OpenAI gpt-5.1 behavior)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: handle non-string team_alias/key_alias in PolicyMatchContext
Prevent Pydantic validation errors when team_alias or key_alias are not
proper strings (e.g. MagicMock in tests). Only pass values that are
actually strings; default to None otherwise.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: initialize jwt_handler.litellm_jwtauth in JWT test
The test_jwt_non_admin_team_route_access test was failing because
user_api_key_auth now accesses jwt_handler.litellm_jwtauth.virtual_key_claim_field
before reaching the mocked JWTAuthManager.auth_builder. Initialize the
jwt_handler with a default LiteLLM_JWTAuth object.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add missing mock attributes to MCP server test
The test_add_update_server_fallback_to_server_id test was failing because
MagicMock auto-creates attributes when accessed. build_mcp_server_from_table
accesses many fields via getattr(), which on a MagicMock returns another
MagicMock instead of None, causing Pydantic validation errors in MCPServer.
Explicitly set all required mock attributes.
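
The MagicMock behavior described above can be reproduced with the standard library alone. The attribute name `description` is only an illustrative assumption, not necessarily one of the fields the test sets:

```python
from unittest.mock import MagicMock

server = MagicMock()

# Unset attributes are auto-created, so the getattr() default is never used.
auto_created = getattr(server, "description", None)

# Setting the attribute explicitly restores the value the code expects.
server.description = None
explicit = getattr(server, "description", None)
```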
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: update UI tests for leftnav, navbar, and KeyLifecycleSettings
- leftnav: Add mock for useTeams hook, add isUserTeamAdminForAnyTeam to
roles mock, update topLevelLabels to match current component menu items
- navbar: Add mocks for useDisableBouncingIcon, BlogDropdown, UserDropdown,
and serverRootPath. Update test to work with the new component structure.
- KeyLifecycleSettings: Fix placeholder and tooltip assertions to match
actual component behavior
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: update health check test assertion from 'connected' to 'healthy'
The /health/readiness endpoint now returns {"status": "healthy"} with the
DB status in a separate field, instead of the previous {"status": "connected"}.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: clear litellm.api_key in OpenRouter validate_environment test
The test_validate_environment_raises_without_key test was failing because
litellm.api_key may be set globally in the test environment. Clear it
along with OPENROUTER_API_KEY and OR_API_KEY env vars using monkeypatch.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: patch HTTPHandler class-level in VLLM embedding test
The test_encoding_format_not_sent_in_actual_request test was patching
client.post on an instance, but the handler uses the class method.
Patch HTTPHandler.post at class level, add caching=False to prevent
cache hits, and remove broad try/except that hid errors.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: make test_redaction_responses_api_stream resilient to async callback timing
Replace fixed 1s sleep with polling wait for async_log_success_event.
Streaming success handler runs via asyncio.create_task; 1s was insufficient
in CI. Add 0.5s initial sleep for event loop to schedule the task, then
poll up to 10s for the callback to fire.
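
A rough sketch of the polling pattern, assuming a simple predicate-based helper. `wait_for` and `_demo` are hypothetical names, and the real test polls a mocked async_log_success_event rather than a list:

```python
import asyncio
import time


async def wait_for(predicate, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Poll predicate() until it is truthy or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        await asyncio.sleep(interval)  # yield so scheduled tasks can run
    return bool(predicate())


async def _demo() -> bool:
    fired = []

    async def callback() -> None:
        await asyncio.sleep(0.05)
        fired.append(True)

    # Mimics a success handler scheduled with asyncio.create_task.
    task = asyncio.create_task(callback())
    ok = await wait_for(lambda: bool(fired), timeout=2.0, interval=0.01)
    await task
    return ok


result = asyncio.run(_demo())
```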
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: update dompurify and svgo to fix security CVEs
- CVE-2026-0540: dompurify XSS vulnerability - fix by upgrading to 3.3.2+
- CVE-2026-29074: svgo DoS via entity expansion - fix by upgrading to 3.3.3+
Added npm overrides in docs/my-website/package.json and regenerated
package-lock.json.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: remove unused json import in config_override_endpoints.py
Ruff F401: json is imported but unused (safe_json_loads/safe_dumps
are used instead)
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add missing MCP mock attributes and provider documentation entries
- Add missing mock attributes to test_add_update_server_with_alias and
test_add_update_server_without_alias (same fix as fallback test)
- Add bedrock_mantle and searchapi to provider_endpoints_support.json
- Remove unused json import from config_override_endpoints.py
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: override _supports_reasoning_effort_level for Azure gpt5_series prefix
The Azure GPT-5 config uses 'gpt5_series/' as a routing prefix, but
_supports_factory(model='gpt5_series/gpt-5.1') fails to resolve because
'gpt5_series' is not a recognized provider. Override the method to strip
the prefix and prepend 'azure/' for correct model info lookup.
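
The prefix rewrite could look roughly like this. `to_azure_lookup_name` is a hypothetical helper for illustration; the actual override lives on the Azure GPT-5 config and may differ in detail:

```python
ROUTING_PREFIX = "gpt5_series/"  # routing prefix named in this commit message


def to_azure_lookup_name(model: str) -> str:
    """Swap the routing prefix for 'azure/' so provider resolution can succeed."""
    if model.startswith(ROUTING_PREFIX):
        model = model[len(ROUTING_PREFIX):]
    if not model.startswith("azure/"):
        model = "azure/" + model
    return model
```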
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: accept both 'healthy' and 'connected' in health check test
The test_health_and_chat_completion test runs against both source builds
(which return 'healthy') and pip-installed versions (which may return
'connected'). Accept both values.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: mock extract_mcp_auth_context in streamable HTTP MCP handler test
The handle_streamable_http_mcp function now calls extract_mcp_auth_context
before session_manager.handle_request, but the test didn't mock it. The
auth extraction fails with the minimal mock scope, preventing
handle_request from being called. Also relax assertion to not check
exact args since the send wrapper may be modified by debug injection.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add test for _combine_fallback_usage to satisfy router code coverage
The router_code_coverage.py check requires all functions in router.py
to be called in test files. Add a basic test for _combine_fallback_usage.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add @log_guardrail_information decorator to CrowdStrike AIDR guardrail
The check_guardrail_apply_decorator.py CI check requires all guardrail
apply_guardrail methods to have the @log_guardrail_information decorator.
The CrowdStrike AIDR handler was missing it.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: document PRISMA_RECONNECT_ESCALATION_THRESHOLD and REDIS_CLUSTER_NODES env keys
Add missing environment variable documentation to config_settings.md
to satisfy the test_env_keys.py CI check.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: document enforced_file_expires_after and enforced_batch_output_expires_after in new_team docstring
The test_api_docs.py CI check validates that all Pydantic model fields
are documented in the function docstring. Add missing parameter docs
for enforced_file_expires_after and enforced_batch_output_expires_after.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: regenerate poetry.lock to match pyproject.toml
The poetry.lock file was out of sync with pyproject.toml, causing
proxy_e2e_azure_batches_tests to fail during dependency installation.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: set master_key=None in test_create_file_with_deep_nested_litellm_metadata
The test was missing the master_key monkeypatch that other tests in the
same file set. In CI with parallel execution (-n 4), another test may
set master_key to a non-None value, causing auth failures (500) when
the test sends 'Bearer test-key'.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: document enforced_*_expires_after in update_team docstring too
Same missing params as new_team - also needed in update_team docstring
for the test_api_docs.py CI check to pass.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: use get_async_httpx_client in a2a_protocol and add master_key monkeypatch to files tests
- Replace httpx.AsyncClient() with get_async_httpx_client() in a2a_protocol/main.py
to satisfy the ensure_async_clients_test CI check
- Add httpxSpecialProvider.A2AProvider enum value
- Add master_key=None monkeypatch to test_managed_files_with_loadbalancing
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: remove unused httpx import from a2a_protocol/main.py
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: use cache-key-only param for A2A extra_headers to avoid AsyncHTTPHandler init error
The 'extra_headers' key in params was being passed to AsyncHTTPHandler.__init__()
which doesn't accept it. Use 'disable_aiohttp_transport' as the cache-key-only
param since it's explicitly filtered out before reaching the constructor.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: add additionalProperties:false and resolve $defs/$ref in Anthropic output_format schemas
Anthropic API now requires additionalProperties=false for all object-type
schemas in output_format. Also resolve $defs/$ref references by inlining
them using unpack_defs before sending to Anthropic, since Anthropic
doesn't support external schema references.
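
For illustration, the two schema transformations can be sketched in plain Python. This is not the unpack_defs implementation; `inline_refs` and `close_objects` are hypothetical helpers, and the sketch assumes only local, non-recursive `#/$defs/...` references:

```python
import copy
from typing import Any, Dict


def inline_refs(node: Any, defs: Dict[str, Any]) -> Any:
    """Replace local '#/$defs/<name>' references with the referenced schema."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            target = copy.deepcopy(defs[ref[len("#/$defs/"):]])
            return inline_refs(target, defs)
        # Drop the $defs block itself once references are inlined.
        return {k: inline_refs(v, defs) for k, v in node.items() if k != "$defs"}
    if isinstance(node, list):
        return [inline_refs(item, defs) for item in node]
    return node


def close_objects(node: Any) -> Any:
    """Set additionalProperties=False on every object-type schema."""
    if isinstance(node, dict):
        out = {k: close_objects(v) for k, v in node.items()}
        if out.get("type") == "object":
            out["additionalProperties"] = False
        return out
    if isinstance(node, list):
        return [close_objects(item) for item in node]
    return node


schema = {
    "$defs": {"Item": {"type": "object", "properties": {"name": {"type": "string"}}}},
    "type": "object",
    "properties": {"item": {"$ref": "#/$defs/Item"}},
}
prepared = close_objects(inline_refs(schema, schema["$defs"]))
```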
Fixes: llm_translation_testing Anthropic JSON schema failures
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: allowlist CVE-2026-2297 and GHSA-qffp-2rhf-9h96 in security scans
- CVE-2026-2297: Python 3.13 SourcelessFileLoader audit hook bypass,
no fix available in base image
- GHSA-qffp-2rhf-9h96: tar hardlink path traversal, from nodejs_wheel
bundled npm, not used in application runtime code
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: isolate files endpoint tests from shared proxy state in CI parallel execution
Override user_api_key_auth dependency to return a fixed UserAPIKeyAuth
with PROXY_ADMIN role, avoiding auth lookups via prisma_client,
user_api_key_cache, or master_key. Set prisma_client=None to prevent
DB state contamination. Use try/finally to clean up dependency overrides.
Fixes persistent test_create_file_with_deep_nested_litellm_metadata and
test_managed_files_with_loadbalancing 500 errors in CI with -n 4.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: apply same auth override to test_managed_files_with_loadbalancing
Same CI parallel execution fix as test_create_file_with_deep_nested -
override user_api_key_auth dependency and set prisma_client=None.
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix: use DeepSeekChatConfig instead of OpenAIConfig for deepseek provider
The deepseek provider was incorrectly using OpenAIConfig().map_openai_params()
instead of DeepSeekChatConfig().map_openai_params(), which meant DeepSeek-specific
parameter mappings were not being applied.
* test: add unit tests for deepseek DeepSeekChatConfig param mapping
Verify that get_optional_params uses DeepSeekChatConfig (not OpenAIConfig)
for the deepseek provider by testing thinking, reasoning_effort, and
budget_tokens stripping behavior.
gemini/gemini-live-2.5-flash-preview-native-audio-09-2025 uses mode='realtime'
but the schema in test_aaamodel_prices_and_context_window_json_is_valid did
not include 'realtime' as a valid enum value, causing a ValidationError.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add cache_read_input_token_cost_per_audio_token, supports_code_execution,
and supports_file_search to the JSON schema used by the model prices
validation test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* add explicit caching to litellm proxy for gemini models via injection
* fix: add missing `supports_function_calling` for deepinfra models
All 55 deepinfra models that had `supports_tool_choice: true` were
missing the `supports_function_calling` flag, causing
`litellm.supports_function_calling()` to incorrectly return False.
Fixes #22619
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Managed batches - Address PR bot comments from #22464
* feat(togetherai): add support for TogetherAI Qwen3.5-397B-A17B model
* Agent Tracing - support context_id based trace id propagation + nested llm calls (#22626)
* style(ui/): distinguish agent calls from llm calls on ui
* feat: initial grouping working
* feat: set stable contextid for a2a calls - allows for easily passing to downstream llm/mcp calls
* feat(a2a_endpoints.py): fix tracing to avoid recreating logging objects for the same call
This allows stable trace id usage.
* fix(guardrail_endpoints): handle string ui_type values in _build_field_dict
_build_field_dict unconditionally called .value on ui_type, which crashes
for guardrail configs that use plain strings (e.g. BlockCodeExecutionGuardrailConfigModel
uses "multiselect" and "percentage"). Now checks with hasattr before calling .value.
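
A small sketch of the hasattr check, assuming ui_type is either an Enum member or a plain string. `UiType` and `ui_type_to_str` are hypothetical stand-ins, not the real names:

```python
from enum import Enum
from typing import Any


class UiType(Enum):  # hypothetical stand-in for the real ui_type enum
    TEXT = "text"


def ui_type_to_str(ui_type: Any) -> Any:
    """Return the enum's value, or the input unchanged if it is a plain string."""
    return ui_type.value if hasattr(ui_type, "value") else ui_type
```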
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: propagate trace/session id from headers in MCP server calls
Cherry-picked mcp_server/server.py fixes from 6feb9bab: adds
get_chain_id_from_headers to extract x-litellm-trace-id /
x-litellm-session-id from raw headers, and uses it in call_tool
and list_tools to keep spend logs and tracing consistent with A2A.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* [Feat] UI - Add Open in New Tab on leftnav Bar (#22731)
* Add minimal dev_config.yaml for proxy development
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* feat(ui): wrap left nav items in <a> tags for open-in-new-tab support
Nav items are now rendered as <a> elements with proper href attributes,
enabling right-click → 'Open in new tab', Ctrl/Cmd+click, and
middle-click to open any sidebar page in a new browser tab.
Normal clicks continue to use SPA navigation (no full page reload).
Applied to both leftnav.tsx (query-param routing) and Sidebar2.tsx
(Next.js file-based routing).
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* [Feat] Add Tool Policies for AI Gateway (#22732)
* fix: fix ui render
* fix: fix minor bugs
* refactor: use prisma functions instead of raw sql (safer)
* fix(add-new-tiles-to-tool-policies): allow developer to see what's available
* feat: ensure tool allowlist runs correctly for tool names + mcp's
* refactor: more ui improvements
* feat: working key tool blocking
* feat(tools): show tool logs
* refactor: backend code improvements
* refactor: improve log viewer for tools
* fix: address PR review feedback for tool access control
- Add missing blocked_tools column to root schema.prisma (schema drift)
- Invalidate ToolPolicyRegistry after policy mutations so changes take effect immediately
- Remove dead code: unused get_effective_policies, get_tool_policies_cached, and helpers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: race condition in permission resolution and remove duplicate allowlist check
- Use atomic update_many with object_permission_id=None to prevent concurrent
requests from creating orphaned permission rows and losing tool blocks
- Remove duplicate allowed_tools enforcement from guardrail (already enforced
in auth layer via check_tools_allowlist)
- Move inline uuid import to module level
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* update to account for userAgent
* UI - Add ToolDetails
* input/output policy
* LiteLLM_PolicyAttachmentTable
* fix: add _enqueue_tool_registry_upsert
* fix: tool mgmt endpoints
* tool mgmt endpoints
* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update tests/test_litellm/proxy/db/test_tool_registry_writer.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: sync root schema.prisma and fix test_tool_registry_writer for input/output policy
- Migrate root schema.prisma LiteLLM_ToolTable from call_policy to
input_policy/output_policy, add missing user_agent and last_used_at columns
(now consistent with litellm/proxy/schema.prisma and litellm-proxy-extras)
- Fix SpendLogToolIndex comment across all three schema files
- Fix all call_policy references in test_tool_registry_writer.py:
swapped update_tool_policy arguments, wrong get_tools_by_names return type
assertions, _mock_tool_row setting call_policy instead of input_policy
Addresses Greptile review feedback on PR #22732.
Made-with: Cursor
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* feat(proxy): add key_alias, key_hash, requested_model DD APM span tags (#22710)
* feat(proxy): add key_alias, key_hash, requested_model tags to DD APM spans
* refactor(proxy): consolidate DD APM tag helpers into DDSpanTagger class
* refactor(proxy): move DDSpanTagger to its own file litellm/proxy/dd_span_tagger.py
---------
Co-authored-by: liweiguang <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Ephrim Stanley <ephrim.stanley@point72.com>
Co-authored-by: Varad Khonde <varadkhonde@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Vertex AI / Gemini uses Pydantic's model_json_schema() which omits
additionalProperties: False (Gemini rejects it). The test expected
the same schema for all providers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test used a relative path 'litellm/model_prices_and_context_window.json'
which only works when pytest runs from a specific working directory.
Use os.path based on __file__ to resolve the path reliably.
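
One way to express the path fix, as a sketch. `resolve_from` is a hypothetical helper; in a test module it would be called with `__file__` as the first argument:

```python
import os


def resolve_from(test_file: str, *relative_parts: str) -> str:
    """Build an absolute path relative to the directory containing test_file."""
    base_dir = os.path.dirname(os.path.abspath(test_file))
    return os.path.normpath(os.path.join(base_dir, *relative_parts))


# The result no longer depends on the directory pytest was started from.
example = resolve_from(os.path.join(os.sep, "repo", "tests", "test_x.py"),
                       "..", "litellm", "data.json")
```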
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clients like OpenCode's @ai-sdk/openai-compatible send budgetTokens
(camelCase) instead of budget_tokens in the thinking parameter, causing
validation errors. Add early normalization in completion().
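
A hedged sketch of what such a normalization could look like. `normalize_thinking` is a hypothetical name, and letting an explicit budget_tokens win over budgetTokens is an assumption of this sketch:

```python
from typing import Any, Dict, Optional


def normalize_thinking(thinking: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Rename camelCase budgetTokens to budget_tokens without mutating the input."""
    if not isinstance(thinking, dict) or "budgetTokens" not in thinking:
        return thinking
    normalized = dict(thinking)
    budget = normalized.pop("budgetTokens")
    # Assumption: an explicit snake_case value takes precedence if both are sent.
    normalized.setdefault("budget_tokens", budget)
    return normalized
```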
* fix(model-pricing): add missing fireworks_ai model pricing for glm-4p7, minimax-m2p1, kimi-k2p5
Fireworks AI models called via short-form (fireworks_ai/<model>) were
reporting $0.00 cost because the pricing JSON lacked short-form entries.
The lookup fell through to the fireworks-ai-default bucket which has
zero cost.
Added 5 new entries to model_prices_and_context_window.json:
- fireworks_ai/accounts/fireworks/models/glm-4p7 (new long-form)
- fireworks_ai/accounts/fireworks/models/minimax-m2p1 (new long-form)
- fireworks_ai/glm-4p7 (new short-form)
- fireworks_ai/minimax-m2p1 (new short-form)
- fireworks_ai/kimi-k2p5 (new short-form; long-form already existed)
Pricing sourced from fireworks.ai model pages and pricing page.
* add cache_read_input_token_cost to kimi-k2p5 long-form entry for consistency
Update provider matching so github/<model> aliases can resolve capabilities from existing upstream model metadata, including OpenAI and Anthropic entries. Add regression tests for known github aliases and unknown-model fallback behavior.
The model_prices_and_context_window_backup.json file has 'inference_geo'
fields (e.g. on 'us/claude-sonnet-4-6') for geo-prefixed Anthropic models
used in cost calculation, but the JSON schema validator in test_utils.py
did not include 'inference_geo' as an allowed property.
This caused test_aaamodel_prices_and_context_window_json_is_valid to fail
with: Additional properties are not allowed ('inference_geo' was unexpected)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [Fix] handle metadata=None in SDK path retry/error logic (utils.py)
Fixes #20871
Same class of bug as #9717 (fixed by #9764 for the proxy path).
The SDK path in utils.py has the same fragile pattern at 7 locations.
Replace `kwargs.get("metadata", {})` with `(kwargs.get("metadata") or {})`
to handle the case where metadata key exists with value None (e.g. from
Azure OpenAI streaming responses).
This is consistent with the existing correct pattern at line 602:
`metadata = kwargs.get("metadata") or {}`
Adds TestMetadataNoneHandling with 6 unit tests in test_utils.py.
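
The difference between the two patterns is plain dict behavior and can be shown in isolation:

```python
kwargs = {"metadata": None}  # key is present, value is None

# The default only applies when the key is absent, so this is still None.
fragile = kwargs.get("metadata", {})

# `or {}` also covers a present-but-None value.
safe = kwargs.get("metadata") or {}
```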
* fix: remove duplicate PerplexityResponsesConfig key in lazy imports registry
Removes duplicate dictionary key added in commit be0ebb15 (PR #20860).
The entry at line 1042 is identical to the existing entry at line 906.
This causes ruff F601 lint failure on all PRs targeting main.
The three loops in function_setup that called is_async_callable() on every
callback for each request were redundant after the first request. Move the
async/sync routing into LoggingCallbackManager.add_litellm_*_callback()
so it happens once at registration time instead of on every request.
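
As an illustration of registration-time routing only: the names below are hypothetical, and `inspect.iscoroutinefunction` stands in for is_async_callable, which may handle more cases:

```python
import inspect
from typing import Callable, List

sync_callbacks: List[Callable] = []
async_callbacks: List[Callable] = []


def register_callback(cb: Callable) -> None:
    """Classify the callback once at registration instead of on every request."""
    target = async_callbacks if inspect.iscoroutinefunction(cb) else sync_callbacks
    if cb not in target:
        target.append(cb)


def on_success(payload):  # sync callback
    return payload


async def on_success_async(payload):  # async callback
    return payload


register_callback(on_success)
register_callback(on_success_async)
register_callback(on_success)  # duplicate registration is ignored
```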
CallTypes(call_type) was constructing an enum from string on every call,
taking ~4.6µs/call (69.6% of function time). Replace with a frozenset
membership test for ~0.8µs/call (8.3x faster).
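
A sketch of the frozenset approach, assuming the check only needs to know whether a string is a valid call type. The enum below is a trimmed stand-in for the real CallTypes, and `is_known_call_type` is a hypothetical name:

```python
from enum import Enum


class CallTypes(Enum):  # trimmed, illustrative stand-in for the real enum
    completion = "completion"
    acompletion = "acompletion"


# Built once at import time; each membership test is a single hash lookup.
_CALL_TYPE_VALUES = frozenset(member.value for member in CallTypes)


def is_known_call_type(call_type: str) -> bool:
    """Check validity without constructing an enum member."""
    return call_type in _CALL_TYPE_VALUES
```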
* fix(tests): Mock async_container_create_handler for async router test
The test was mocking container_create_handler (sync), but
router.acreate_container uses _is_async=True which calls
async_container_create_handler. This caused the test to hit
the real OpenAI API.
Fixed by using AsyncMock on async_container_create_handler.
* fix(tests): Use uuid for unique model name in scientific notation test
The test was using a static "unique" model name which could cause
conflicts when running tests in parallel (-n 16 in CI). Using uuid
ensures truly unique names to prevent test pollution.
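
A minimal sketch of the naming scheme. The `model_cost` dict and `register_temp_model` helper are illustrative stand-ins, not the real registry or test code:

```python
import uuid

model_cost = {}  # stand-in for a shared, process-wide registry


def register_temp_model() -> str:
    """Register a model under a name that does not collide across parallel tests."""
    name = f"test-model-{uuid.uuid4().hex}"
    model_cost[name] = {"input_cost_per_token": 3e-07}
    return name


first = register_temp_model()
second = register_temp_model()
```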
---------
Co-authored-by: Shin <shin@openclaw.ai>
* litellm_fix_mapped_tests_core: fix test isolation and mock injection issues
## Problem
Four tests in litellm_mapped_tests_core were failing:
1. test_register_model_with_scientific_notation - KeyError due to test isolation issues
2. test_search_uses_registry_credentials - Mock not being called due to incorrect patch path
3. test_send_email_missing_api_key - Real API calls despite mocking
4. test_stream_transformation_error_sync - Mock not effective, real API called
## Solution
### test_register_model_with_scientific_notation
- Use unique model name to avoid conflicts with other tests
- Clear LRU caches before test to prevent stale data
- Clean up model_cost entry after test
### test_search_uses_registry_credentials
- Use patch.object() on the actual base_llm_http_handler instance
- String-based patching for instance methods can fail; direct object patching is more reliable
### test_send_email_missing_api_key
- Directly inject mock HTTP client into logger instance
- This bypasses any caching issues that could cause the fixture mock to be ineffective
### test_stream_transformation_error_sync
- Patch litellm.completion directly instead of the handler module's litellm reference
- This ensures the mock is effective regardless of import order
## Regression
These tests were affected by LRU caching added in #19606 and HTTP client caching.
* fix(test): use patch.object for container API tests to fix mock injection
## Problem
test_retrieve_container_basic tests were failing because mocks weren't
being applied correctly. The tests used string-based patching:
patch('litellm.containers.main.base_llm_http_handler')
But base_llm_http_handler is imported at module level, so the mock wasn't
intercepting the actual handler calls, resulting in real HTTP requests
to OpenAI API.
## Solution
Use patch.object() to directly mock methods on the imported handler
instance. Import base_llm_http_handler in the test file and patch like:
patch.object(base_llm_http_handler, 'container_retrieve_handler', ...)
This ensures the mock is applied to the actual object being used,
regardless of import order or caching.
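
The difference between the two patching styles can be reproduced with throwaway modules. `mod_a`, `mod_b`, and `Handler` are hypothetical names for this sketch, which only assumes that two modules hold references to the same handler instance:

```python
import sys
import types
from unittest.mock import patch


class Handler:
    def retrieve(self) -> str:
        return "real"


# Two modules sharing one handler instance, as happens after
# `from mod_a import handler` at module level.
mod_a = types.ModuleType("mod_a")
mod_a.handler = Handler()
mod_b = types.ModuleType("mod_b")
mod_b.handler = mod_a.handler  # second reference to the same object
sys.modules["mod_a"], sys.modules["mod_b"] = mod_a, mod_b

# String-based patching rebinds only the name in mod_a;
# mod_b still holds the real handler.
with patch("mod_a.handler"):
    via_string = mod_b.handler.retrieve()

# patch.object changes the shared instance, so every reference sees the mock.
with patch.object(mod_a.handler, "retrieve", return_value="mocked"):
    via_object = mod_b.handler.retrieve()

after_patch = mod_b.handler.retrieve()  # the original method is restored on exit
```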
* fix(test): add missing Prometheus metric labels to test_proxy_failure_metrics
Add client_ip, user_agent, model_id labels to expected metric patterns.
These labels were added in PRs #19717 and #19678 but test wasn't updated.
* fix(test_resend_email): use direct mock injection for all email tests
Extend the mock injection pattern used in test_send_email_missing_api_key
to all other tests in the file:
- test_send_email_success
- test_send_email_multiple_recipients
Instead of relying on fixture-based patching and respx mocks which can
fail due to import order and caching issues, directly inject the mock
HTTP client into the logger instance. This ensures mocks are always used
regardless of test execution order.
* fix(test): use patch.object for image_edit and vector_store tests
- test_image_edit_merges_headers_and_extra_headers: import base_llm_http_handler
and use patch.object instead of string path patching
- test_search_uses_registry_credentials: import module and patch via
module.base_llm_http_handler to ensure we patch the right instance
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>