Commit Graph

4981 Commits

Author SHA1 Message Date
Alexsander Hamir 5de9bfde53 [Fix] CI/CD - mypy & check_code_and_doc_quality & mcp_testing (#17920)
* Fix duplicate imports in SAP embedding transformation

* fix: add missing prompt_spec parameter to HumanloopLogger.get_chat_completion_prompt

- Add prompt_spec: Optional[PromptSpec] = None parameter to match base class signature
- Import PromptSpec from litellm.types.prompts.init_prompts
- Pass prompt_spec to super().get_chat_completion_prompt() call
- Fixes mypy type error: Signature incompatible with supertype CustomLogger

* fix: add missing parameters to AnthropicCacheControlHook.async_get_chat_completion_prompt

- Add ignore_prompt_manager_model and ignore_prompt_manager_optional_params parameters
- Change litellm_logging_obj type from Any to LiteLLMLoggingObj using TYPE_CHECKING pattern
- Pass all parameters including prompt_spec to get_chat_completion_prompt call
- Fixes mypy type errors: Signature incompatible with supertype CustomLogger and PromptManagementBase

* fix: add missing parameters to DotpromptManager.async_get_chat_completion_prompt

- Add ignore_prompt_manager_model and ignore_prompt_manager_optional_params parameters
- Change litellm_logging_obj type from Any to LiteLLMLoggingObj using TYPE_CHECKING pattern
- Pass all parameters including ignore flags to PromptManagementBase.async_get_chat_completion_prompt
- Fixes mypy type errors: Signature incompatible with supertype CustomLogger and PromptManagementBase

* fix: document envs

* fix: add missing parameters to LangfusePromptManagement.async_get_chat_completion_prompt

- Add ignore_prompt_manager_model and ignore_prompt_manager_optional_params parameters
- Pass all parameters including prompt_spec and ignore flags to get_chat_completion_prompt
- Fixes mypy type errors: Signature incompatible with supertype CustomLogger and PromptManagementBase

* fix: add missing parameters to prompt management async methods (Category 1)

- vector_store_pre_call_hook: add ignore_prompt_manager_model, ignore_prompt_manager_optional_params, prompt_spec
- gitlab_prompt_manager: add ignore parameters, fix litellm_logging_obj type
- bitbucket_prompt_manager: add ignore parameters, fix litellm_logging_obj type
- proxy/custom_prompt_management: add prompt_spec parameter
- Fixes mypy type errors: Signature incompatible with supertype

* fix: fix arize_phoenix_prompt_manager and custom_prompt_management (Category 2)

- arize_phoenix_prompt_manager: add prompt_spec to all methods, fix prompt_id types, implement async_compile_prompt_helper
- custom_prompt_management: implement async_compile_prompt_helper abstract method
- Fixes mypy type errors: Signature incompatible with supertype and abstract method errors

* fix: fix obvious type errors (Category 3 - Quick Wins)

- langfuse: change 'callable' to 'Callable' type annotation
- presidio: add type narrowing check for Choices vs StreamingChoices
  - StreamingChoices doesn't have .message attribute, only Choices does
  - Add hasattr check before accessing choice.message
- Fixes mypy type errors: callable? not callable and union-attr errors

* fix: handle expires_after None in Azure files handler (Todo 14)

- Extract logic to _prepare_create_file_data helper method
- Remove expires_after from dict if None to match SDK's Omit pattern
- Add type ignore for FileExpiresAfter -> file_create_params.ExpiresAfter mismatch
- Fixes mypy error: Argument expires_after has incompatible type
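
A minimal sketch of the None-stripping step described above, assuming a plain dict payload. `prepare_create_file_data` is a hypothetical stand-in for illustration, not the actual `_prepare_create_file_data` helper:

```python
from typing import Any, Dict


def prepare_create_file_data(data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: drop `expires_after` when it is None."""
    prepared = dict(data)  # shallow copy so the caller's dict is not mutated
    if prepared.get("expires_after") is None:
        # Leave the key out entirely instead of passing an explicit None
        prepared.pop("expires_after", None)
    return prepared
```

Omitting the key, rather than forwarding `None`, lets the downstream call treat the argument as "not provided".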

* fix: change purpose parameter type to OpenAIFilesPurpose (Todo 18)

- Import OpenAIFilesPurpose in storage_backend_service.py
- Change upload_file_to_storage_backend purpose parameter from str to OpenAIFilesPurpose
- Change _create_file_object_with_storage_metadata purpose parameter from str to OpenAIFilesPurpose
- Fixes mypy error: Argument purpose has incompatible type str; expected Literal type
- Purpose is already validated in files_endpoints.py before reaching these functions

* fix: handle UploadFile | str type for expires_after form fields (Todo 19)

- Validate expires_after[anchor] and expires_after[seconds] are strings, not UploadFiles
- Validate anchor equals 'created_at' before using literal in TypedDict
- Use literal 'created_at' (not variable) in FileExpiresAfter to satisfy Literal type
- Add proper error handling for invalid anchor values and int conversion
- Fixes mypy errors: Incompatible types for anchor and seconds in FileExpiresAfter
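
The validation steps above could look roughly like this. This is a sketch under assumptions: the `FileExpiresAfter` TypedDict here is a local stand-in with the two fields the commit mentions, and `build_expires_after` is a hypothetical name:

```python
from typing import Literal, TypedDict


class FileExpiresAfter(TypedDict):
    """Local stand-in for the real type (assumed shape)."""
    anchor: Literal["created_at"]
    seconds: int


def build_expires_after(anchor: object, seconds: object) -> FileExpiresAfter:
    """Hypothetical helper: validate raw form fields before building the dict."""
    # Form fields may arrive as file uploads, so require plain strings first
    if not isinstance(anchor, str) or not isinstance(seconds, str):
        raise ValueError("expires_after fields must be strings")
    if anchor != "created_at":
        raise ValueError(f"invalid anchor: {anchor!r}")
    try:
        seconds_int = int(seconds)
    except ValueError as exc:
        raise ValueError(f"invalid seconds: {seconds!r}") from exc
    # Use the literal itself, not the variable, so the Literal type is satisfied
    return FileExpiresAfter(anchor="created_at", seconds=seconds_int)
```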

* fix: add type narrowing for expires_after_seconds_str to fix mypy error

- Add assert statement after UploadFile validation to help mypy narrow type
- Use validated variable with explicit str type annotation
- Fixes: Argument of type 'UploadFile | str' cannot be assigned to int()

* fix: trigger async_success_handler for MCP tool calls to enable cost tracking and logging

- Set call_type to CallTypes.call_mcp_tool.value before calling async_success_handler
- Update mcp_tool_call_metadata with cost info when server is found
- Call async_success_handler to build standard_logging_object and trigger callbacks
- Fixes test_mcp_cost_tracking by ensuring standard_logging_payload is populated

* refactor: use positive isinstance check for safer type narrowing

- Replace assert with positive isinstance(..., str) check
- Matches codebase pattern (see pass_through_endpoints.py)
- Safer than assert: assertions can be disabled with -O flag
- Mypy properly narrows type after positive isinstance check
- More explicit and readable than assert statement
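
To illustrate why the positive check is safer, here is a small sketch. `UploadFile` is a local stand-in class and `parse_seconds` is a hypothetical function, neither taken from the codebase:

```python
from typing import Union


class UploadFile:
    """Stand-in for the web framework's upload type (illustration only)."""


def parse_seconds(value: Union[UploadFile, str]) -> int:
    # A positive isinstance check narrows `value` to `str` for type checkers
    # and still runs under `python -O`, where assert statements are stripped.
    if isinstance(value, str):
        return int(value)
    raise ValueError("expected a string value, not a file upload")
```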

* fix: add missing REDIS_DAILY_AGENT_SPEND_UPDATE_QUEUE to ServiceTypes enum (Todo 17)

- Add REDIS_DAILY_AGENT_SPEND_UPDATE_QUEUE enum value following the pattern of other daily spend queues
- Add corresponding entry to DEFAULT_SERVICE_CONFIGS with GAUGE metrics
- Fixes mypy error: 'type[ServiceTypes]' has no attribute 'REDIS_DAILY_AGENT_SPEND_UPDATE_QUEUE'
- This enum value is already used in redis_update_buffer.py for agent spend tracking
2025-12-13 08:18:43 -08:00
Ishaan Jaff 2f82c223d3 Litellm docs a2a cost tracking (#17913)
* docs - a2a cost tracking

* docs fix

* docs a2a cost tracking

* docs langgraph agents
2025-12-12 18:23:25 -08:00
YutaSaito 8899b63fa4 Merge pull request #17747 from BerriAI/litellm_feat_mcp-chat-completions
feat: add support for using MCPs on /chat/completions
2025-12-13 05:05:21 +09:00
Cesar Garcia 1531b58493 feat(openai): add reasoning_effort='xhigh' support for gpt-5.2 models (#17875)
Add support for the 'xhigh' reasoning effort level on all gpt-5.2 model
variants, not just gpt-5.2-pro. This enables deeper reasoning capabilities
for the base gpt-5.2 model.

Changes:
- Add is_model_gpt_5_2_model() method to detect gpt-5.2 variants
- Update xhigh validation to allow gpt-5.2 models
- Update documentation with gpt-5.2 reasoning_effort support
- Update tests to reflect new behavior
2025-12-12 11:40:35 -08:00
Ishaan Jaff d38f241032 [Feat] JWT Auth - auth allow selecting team_id from request header (#17884)
* feat: add get_team_id_from_header for JWT Auth

* fix Auth builder JWT Auth

* test_get_team_id_from_header

* test_auth_builder_uses_team_from_header_e2e

* Select Team via Request Header
2025-12-12 10:18:20 -08:00
Sameer Kankute d5fcb6fce6 Merge pull request #17882 from BerriAI/litellm_target_storage_documentation
Add documentation for target storage
2025-12-12 22:46:04 +05:30
Sameer Kankute 6f0efff28b Add documentation for target storage 2025-12-12 22:44:53 +05:30
AlexsanderHamir 1ad6763500 fix: add PROMETHEUS_MULTIPROC_DIR to docs 2025-12-12 08:33:38 -08:00
Krish Dholakia eab5bca583 Add Milvus REST client and update examples (#17736)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-12 04:38:28 -08:00
Ariel 5df701d15c [feat]: Add opt-in evidence results for Pillar Security guardrail during monitoring (#17812)
* add evidence headers to litellm

* ensure that evidence is surface-able, even in opt-in mode

* update the docs
2025-12-12 04:09:13 -08:00
Cesar Garcia a037414985 feat(deepseek): add native support for thinking and reasoning_effort params (#17712)
* feat(deepseek): add native support for thinking and reasoning_effort params

Add proper parameter mapping for DeepSeek thinking mode, allowing users
to use the unified LiteLLM interface instead of extra_body workarounds.

Supported formats:
- thinking={"type": "enabled"}
- thinking={"type": "enabled", "budget_tokens": X} (budget_tokens ignored)
- reasoning_effort="low|medium|high" (maps to thinking enabled)

DeepSeek only supports {"type": "enabled"} without budget_tokens,
so any budget_tokens are stripped and all reasoning_effort values
(except "none") map to enabled.

Reference: https://api-docs.deepseek.com/guides/thinking_mode
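
The mapping described above can be sketched as follows. `map_deepseek_thinking` is a hypothetical function name, and returning an empty dict for `"none"` or missing inputs is an assumption, since the commit message only states which values map to enabled:

```python
from typing import Any, Dict, Optional


def map_deepseek_thinking(
    thinking: Optional[Dict[str, Any]] = None,
    reasoning_effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical sketch of the parameter mapping, not the library's code."""
    enabled = {"thinking": {"type": "enabled"}}
    if thinking is not None and thinking.get("type") == "enabled":
        # budget_tokens (and any other extra keys) are dropped
        return enabled
    if reasoning_effort is not None and reasoning_effort != "none":
        # low / medium / high all collapse to the same enabled flag
        return enabled
    return {}  # assumption: nothing is sent when thinking is not requested
```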

* docs(deepseek): add thinking and reasoning_effort parameter documentation
2025-12-11 15:28:43 -08:00
Jason Roberts 6fc39d31b4 feat(guardrails): add configurable fail-open, timeout, and app_user to PANW Prisma AIRS guardrail (#17785)
Add configurable fail-open/fail-closed behavior, timeout settings, and app_user
metadata tracking. Includes security hardening, enhanced
observability (:unscanned header), and comprehensive test coverage (44/44 passing).

No breaking changes.
2025-12-11 15:23:59 -08:00
Ishaan Jaff cca21c0926 [Feat] New API Provider - Add Azure AI Foundry Agents on /chat/completions, /responses, /messages + Agent Gateway (#17845)
* init get_azure_ai_route

* init AzureAIAgentsConfig

* init AzureAIAgentsConfig

* AzureAIAgentsHandler

* test_azure_ai_agents_acompletion_non_streaming

* test_azure_ai_agents_acompletion_streaming

* fix stream

* _process_sse_stream

* Azure AI Foundry Agents

* init Azure AI Foundry Agent

* fix code QA checks

* fix api key

* docs fix
2025-12-11 15:21:28 -08:00
Dominic Fallows 756c60540e feat: add support for configurable confidence score thresholds and scope in Presidio PII masking (#17817)
* feat: add support for configurable confidence score thresholds in Presidio PII masking

* feat: enhance Presidio PII masking with configurable score thresholds and behavior documentation

* feat: add configurable output masking and filter scope for Presidio PII guardrail
2025-12-11 15:19:11 -08:00
Alexsander Hamir 15404db3d0 [Fix] CI/CD – Docs & Spend logs (#17843)
* fix: resolve mypy type errors in hiddenlayer guardrail and transformation

- Fix return type of apply_guardrail from str to GenericGuardrailAPIInputs
- Add None checks for logging_obj before accessing attributes
- Convert AllMessageValues to dict format for HiddenLayer API compatibility
- Fix payload type annotation in _call_hiddenlayer
- Ensure transformed_output always returns list[dict[str, Any]] in transformation.py

* fix: use litellm_call_id as trace_id fallback in langfuse logging

- Only use standard_logging_object.trace_id if explicitly set via litellm_session_id or litellm_trace_id params
- Fallback to litellm_call_id when no explicit trace_id is provided (matches test expectation)
- Return the trace_id we set instead of generation_client.trace_id for consistency
- Add warning if langfuse modifies the trace_id to help debug potential issues

Fixes test_logging_trace_id test failure where auto-generated UUID was used instead of litellm_call_id

* fix: document envs

* fix: handle None response in /spend/logs endpoint when no records found

- Return empty list [] instead of [None] when spend_log is None
- Prevents 500 errors when querying by request_id, api_key, or user_id with no matching records
- Fixes test_chat_completion_bad_model_with_spend_logs test failure
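
A minimal sketch of the empty-result handling, assuming the lookup returns either one record or `None`. `wrap_spend_log` is a hypothetical name used only for illustration:

```python
from typing import Any, List, Optional


def wrap_spend_log(spend_log: Optional[Any]) -> List[Any]:
    """Hypothetical helper: never return [None] for a missing record."""
    if spend_log is None:
        return []  # no matching record: empty list, not a list holding None
    return [spend_log]
```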

* fix: use standard_logging_object trace_id when available in langfuse logger

- Fix trace_id selection logic to use standard_logging_object.trace_id when available
- Previously only used standard_logging_object.trace_id if explicitly set via params
- Now uses standard_logging_object.trace_id whenever it's present, matching test expectations
- Falls back to litellm_call_id if no trace_id is found
- Fixes test_log_langfuse_v2_uses_standard_trace_id_when_available test failure
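
The selection order described above might be sketched like this. It assumes `standard_logging_object` behaves like a dict, and `select_trace_id` is a hypothetical function name:

```python
from typing import Any, Dict, Optional


def select_trace_id(
    standard_logging_object: Optional[Dict[str, Any]],
    litellm_call_id: Optional[str],
) -> Optional[str]:
    """Hypothetical sketch of the fallback order, not the logger's actual code."""
    if standard_logging_object is not None:
        trace_id = standard_logging_object.get("trace_id")
        if trace_id:
            return trace_id  # prefer the standard trace_id whenever present
    return litellm_call_id  # otherwise fall back to the call id
```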
2025-12-11 14:00:33 -08:00
Peter Dave Hello 70643a8b9c Add support for OpenAI GPT-5.2 models (#17836)
References:
- https://openai.com/index/introducing-gpt-5-2/
- https://platform.openai.com/docs/models/gpt-5.2
2025-12-11 12:49:30 -08:00
yuneng-jiang f9dc034c73 Merge pull request #17775 from BerriAI/litellm_sendgrid
[Feature] Sendgrid integration
2025-12-11 09:19:41 -08:00
YutaSaito 13df50830d chore: prefer standard trace id for Langfuse logging (#17791) 2025-12-11 08:18:45 -08:00
CyrusTC 5d326386fb feat(bedrock): add serviceTier support for Converse API (#17810)
Add support for the Bedrock Converse API serviceTier parameter to allow
specifying processing tier (priority, default, or flex).

Changes:
- Add ServiceTierBlock type in litellm/types/llms/bedrock.py
- Add serviceTier to CommonRequestObject
- Add serviceTier to get_config_blocks() in AmazonConverseConfig
- Add comprehensive tests for serviceTier functionality
- Add documentation for serviceTier usage

This allows users to configure service tier via:
- litellm_params in proxy config
- optional_params in SDK calls
2025-12-11 08:16:32 -08:00
Ashton Sidhu a514313540 Add Hiddenlayer Guardrail Hooks (#17728)
* Core logic working, need to add tests

* Re add removed files

* Remove mistaken files

* one more file

* Add deployment params

* Add tests

* Remove unused imports

* Update docs from feedback

* Update guardrails
2025-12-11 07:43:26 -08:00
Sameer Kankute 8942053c8b Merge pull request #17700 from BerriAI/litellm_batches_passthrough_cost_tracking
Add anthropic retrieve batches and retrieve file content support
2025-12-11 10:31:54 +05:30
yuneng-jiang 1c6de2b80d Merge remote-tracking branch 'origin' into litellm_sendgrid 2025-12-10 20:39:25 -08:00
yuneng-jiang cffd0ac350 sendgrid docs 2025-12-10 20:39:10 -08:00
Shivam Rawat 9d7a255d55 made litellm proxy and sdk difference cleaner in overview (#17790) 2025-12-10 19:14:49 -08:00
Yuta Saito 4efa21ee7d docs: clarify MCP tool support across providers 2025-12-11 10:39:24 +09:00
Ishaan Jaff 5d456bcdc3 [Feat] UI SSO - allow fetching role from generic SSO provider (Keycloak) (#17787)
* fix ui SSO

* TestGenericResponseConvertorUserRole

* Assigning User Roles via SSO
2025-12-10 13:09:28 -08:00
Alexsander Hamir 439bb5bfe3 fix: suggest Gunicorn instead of uvicorn when using max_requests_before_restart (#17788) 2025-12-10 13:09:00 -08:00
Ishaan Jaff 49b91c4a35 [Feat] A2a gateway - Add cost per token pricing (#17780)
* fix calculate_a2a_cost

* add cost_per_query

* add test_asend_message_uses_cost_per_query

* fix: _initialize_slack_alerting_jobs

* feat: add token tracking for agents invoke

* add A2ARequestUtils

* add _set_usage_on_logging_obj

* test_asend_message_token_tracking

* add _handle_a2a_response_logging

* test_asend_message_streaming_token_tracking

* add A2AStreamingIterator

* add cost calculator for agents

* test_asend_message_uses_input_output_cost_per_token

* docs fix
2025-12-10 13:08:15 -08:00
Ishaan Jaff 5ee32167c0 [Feat] New Provider - add langgraph (#17783)
* init LANGGRAPH

* init LangGraphConfig

* init LangGraphConfig types

* init langgraph

* init getting api base and key

* init transform langgraph

* fix SSE issues

* test_langgraph_acompletion_non_streaming

* add LangGraph to docs

* docs: Setting Up a Local LangGraph Server

* fix langgraph SSE

* fix import uuid
2025-12-10 12:30:35 -08:00
Krish Dholakia 8bc5e2ca7f Add /v1/messages/count_tokens endpoint documentation (#17772)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-10 11:34:11 -08:00
yuneng-jiang ba554a86b9 Merge pull request #16843 from BerriAI/litellm_allow_custom_mount_paths
[Feature] Allow Root Path to Redirect when Docs not on Root Path
2025-12-10 09:52:30 -08:00
Sameer Kankute 4c78c1afc8 Merge pull request #17756 from BerriAI/litellm_add_gemini_computer_use
Add support for computer use for gemini
2025-12-10 22:28:37 +05:30
Sameer Kankute 9e3a04a725 Add batch passthrough endpoint cost tracking for anthropic 2025-12-10 18:24:31 +05:30
Sameer Kankute 0d2f8ce931 Merge pull request #17711 from BerriAI/litellm_add_additional_drop_params_support
feat: Add nested field removal support to additional_drop_params
2025-12-10 15:37:39 +05:30
Krish Dholakia b0a5a4b81d Arize Phoenix OSS - Prompt Management Integration (#17750)
* docs(prompt_management.md): document how to onboard prompts to litellm

* feat(arize_phoenix_prompt_manager.py): support new prompt management integration

allows users to connect arize phoenix prompt manager to litellm

* fix(proxy/utils.py): remove prompt variables to avoid re-processing prompt

* docs(arize_phoenix_prompts.md): document new prompt management integration
2025-12-09 22:53:42 -08:00
Sameer Kankute bcac9e41f6 Add support for computer use for gemini 2025-12-10 10:34:08 +05:30
Cesar Garcia b4e0dabb37 fix: use absolute URL for Supported Endpoints link to avoid Docusaurus slug conflict (#17710)
The relative link was causing Docusaurus to incorrectly associate the
/supported_endpoints page with SDK Functions category instead of the
actual Supported Endpoints generated-index.
2025-12-09 18:49:49 -08:00
Krish Dholakia 254c1155a2 Remove streaming_logging.md documentation (#17739)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 18:26:08 -08:00
Krrish Dholakia d3531be9a0 docs(community.md): add new integration partner doc 2025-12-09 18:17:14 -08:00
Ishaan Jaff 42f5770cfa [docs] add docs for containers files api + code interpreter on LiteLLM (#17749)
* add new container api on OpenAI

* add related

* docs fix

* docs code interpreter

* code interp

* docs code interpreter

* docs code int

* docs code interp

* docs code interp
2025-12-09 18:11:28 -08:00
Krish Dholakia 8d5e6cc62d Add community doc link (#17734)
* Add community contribution guide for integration partners

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Update community docs to direct users to #integration-partners

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 18:10:00 -08:00
Yuta Saito ed5cbdac2f feat: add support for using MCPs on /chat/completions 2025-12-10 10:55:33 +09:00
Cesar Garcia 63a97db663 feat(voyage): add rerank API support (#17744)
* feat(voyage): add rerank API support

Add support for Voyage AI rerank models (rerank-2.5, rerank-2.5-lite,
rerank-2, rerank-2-lite) to the LiteLLM rerank API.

Changes:
- Add VoyageRerankConfig transformation class
- Register voyage provider in rerank_api/main.py
- Add voyage case in utils.py get_provider_rerank_config
- Add rerank-2.5 and rerank-2.5-lite models to pricing JSON
- Add unit tests for transformation logic
- Update documentation for voyage.md and rerank.md

Usage:
```python
from litellm import rerank

response = rerank(
    model="voyage/rerank-2.5",
    query="What is the capital of France?",
    documents=["Paris is...", "London is..."],
    top_n=3,
)
```

* refactor(voyage): simplify rerank transformation code

Remove verbose docstrings to align with other providers (jina_ai pattern).
No functional changes - 168 lines vs 169 for jina_ai.

* fix(voyage): remove incorrect input_cost_per_query from rerank models

Voyage AI charges per token, not per query. The input_cost_per_query
field was incorrectly set to the same value as input_cost_per_token
in the existing rerank-2 and rerank-2-lite models.

Removes input_cost_per_query from all Voyage rerank models:
- voyage/rerank-2
- voyage/rerank-2-lite
- voyage/rerank-2.5
- voyage/rerank-2.5-lite

Pricing source: https://docs.voyageai.com/docs/pricing
2025-12-09 17:34:09 -08:00
YutaSaito 80a18f989a feat: propagate Langfuse trace_id (#17669) 2025-12-09 12:25:52 -08:00
yuneng-jiang 39bf7a9f7c Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths 2025-12-09 11:58:05 -08:00
Shivam Rawat 43a7bbeeaf added note for using Azure Active Directory Tokens with all the other endpoints (#17733) 2025-12-09 11:51:28 -08:00
yuneng-jiang aa450e7ebe Merge pull request #17738 from BerriAI/litellm_doc_update_1805
[Docs] Adding known issues to 1.80.5-stable docs
2025-12-09 11:46:08 -08:00
yuneng-jiang 431884f591 Adding known issues to 1.80.5-stable docs 2025-12-09 11:45:16 -08:00
Derek Duenas 3322523e07 Passthrough in response (#17102)
* attempt to implement the passthrough feature

* Formatting and small change

* Fix formatting

* feat: grayswan guardrail overwrite ModelResponse in passthrough mode

* fix missing exception error catching on certain endpoints

* fix wrong call site

* fix: patch anthropic endpoint internal error on streaming obj

* fix grayswan testcase

* feat: update the violation response to be more natural

* Formatting

* move passthrough exception definition to custom_guardrail.

* Enhancement: show whether the request was blocked at input or output

* update exception name

* fix a typo in testing unit.

---------

Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
2025-12-09 10:45:45 -08:00
Krish Dholakia 81f0bbad73 Add Azure AI Search to supported vector stores (#17726)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 09:04:04 -08:00