Commit Graph

4969 Commits

Author SHA1 Message Date
Cesar Garcia a037414985 feat(deepseek): add native support for thinking and reasoning_effort params (#17712)
* feat(deepseek): add native support for thinking and reasoning_effort params

Add proper parameter mapping for DeepSeek thinking mode, allowing users
to use the unified LiteLLM interface instead of extra_body workarounds.

Supported formats:
- thinking={"type": "enabled"}
- thinking={"type": "enabled", "budget_tokens": X} (budget_tokens ignored)
- reasoning_effort="low|medium|high" (maps to thinking enabled)

DeepSeek only supports {"type": "enabled"} without budget_tokens,
so any budget_tokens are stripped and all reasoning_effort values
(except "none") map to enabled.
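
The mapping described above can be sketched as follows. This is a minimal illustration, not the code from this commit: `map_deepseek_thinking` is a hypothetical helper name, and returning an empty dict when thinking is not requested is an assumption.

```python
from typing import Any, Dict, Optional


def map_deepseek_thinking(
    thinking: Optional[Dict[str, Any]] = None,
    reasoning_effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: normalize thinking params to DeepSeek's format."""
    # budget_tokens (and any other extra keys) are dropped, because DeepSeek
    # only accepts {"type": "enabled"}.
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        return {"thinking": {"type": "enabled"}}
    # Every reasoning_effort value except "none" maps to thinking enabled.
    if reasoning_effort is not None and reasoning_effort != "none":
        return {"thinking": {"type": "enabled"}}
    # Assumption: send nothing when thinking was not requested.
    return {}
```

With this sketch, `thinking={"type": "enabled", "budget_tokens": 1024}` and `reasoning_effort="low"` both produce the same request fragment.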

Reference: https://api-docs.deepseek.com/guides/thinking_mode

* docs(deepseek): add thinking and reasoning_effort parameter documentation
2025-12-11 15:28:43 -08:00
Jason Roberts 6fc39d31b4 feat(guardrails): add configurable fail-open, timeout, and app_user to PANW Prisma AIRS guardrail (#17785)
Add configurable fail-open/fail-closed behavior, timeout settings, and app_user
metadata tracking. Includes security hardening, enhanced
observability (:unscanned header), and comprehensive test coverage (44/44 passing).

No breaking changes.
2025-12-11 15:23:59 -08:00
Ishaan Jaff cca21c0926 [Feat] New API Provider - Add Azure AI Foundry Agents on /chat/completions, /responses, /messages + Agent Gateway (#17845)
* init get_azure_ai_route

* init AzureAIAgentsConfig

* init AzureAIAgentsConfig

* AzureAIAgentsHandler

* test_azure_ai_agents_acompletion_non_streaming

* test_azure_ai_agents_acompletion_streaming

* fix stream

* _process_sse_stream

* Azure AI Foundry Agents

* init Azure AI Foundry Agent

* fix code QA checks

* fix api key

* docs fix
2025-12-11 15:21:28 -08:00
Dominic Fallows 756c60540e feat: add support for configurable confidence score thresholds and scope in Presidio PII masking (#17817)
* feat: add support for configurable confidence score thresholds in Presidio PII masking

* feat: enhance Presidio PII masking with configurable score thresholds and behavior documentation

* feat: add configurable output masking and filter scope for Presidio PII guardrail
2025-12-11 15:19:11 -08:00
Alexsander Hamir 15404db3d0 [Fix] CI/CD – Docs & Spend logs (#17843)
* fix: resolve mypy type errors in hiddenlayer guardrail and transformation

- Fix return type of apply_guardrail from str to GenericGuardrailAPIInputs
- Add None checks for logging_obj before accessing attributes
- Convert AllMessageValues to dict format for HiddenLayer API compatibility
- Fix payload type annotation in _call_hiddenlayer
- Ensure transformed_output always returns list[dict[str, Any]] in transformation.py

* fix: use litellm_call_id as trace_id fallback in langfuse logging

- Only use standard_logging_object.trace_id if explicitly set via litellm_session_id or litellm_trace_id params
- Fallback to litellm_call_id when no explicit trace_id is provided (matches test expectation)
- Return the trace_id we set instead of generation_client.trace_id for consistency
- Add warning if langfuse modifies the trace_id to help debug potential issues

Fixes test_logging_trace_id test failure where auto-generated UUID was used instead of litellm_call_id

* fix: document envs

* fix: handle None response in /spend/logs endpoint when no records found

- Return empty list [] instead of [None] when spend_log is None
- Prevents 500 errors when querying by request_id, api_key, or user_id with no matching records
- Fixes test_chat_completion_bad_model_with_spend_logs test failure
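
A minimal sketch of the None handling described above, assuming a single looked-up record is wrapped in a list before being returned. `spend_logs_response` is a hypothetical name and does not appear in this commit.

```python
from typing import Any, List, Optional


def spend_logs_response(spend_log: Optional[Any]) -> List[Any]:
    """Hypothetical helper: build the list returned for a single-record lookup."""
    # No matching record: return an empty list rather than [None], which
    # would fail response serialization and surface as a 500 error.
    if spend_log is None:
        return []
    return [spend_log]
```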

* fix: use standard_logging_object trace_id when available in langfuse logger

- Fix trace_id selection logic to use standard_logging_object.trace_id when available
- Previously only used standard_logging_object.trace_id if explicitly set via params
- Now uses standard_logging_object.trace_id whenever it's present, matching test expectations
- Falls back to litellm_call_id if no trace_id is found
- Fixes test_log_langfuse_v2_uses_standard_trace_id_when_available test failure
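
The selection order above could look like the sketch below. It assumes the standard logging object behaves like a dict with an optional `trace_id` key; `select_trace_id` is a hypothetical name, not the function changed in this commit.

```python
from typing import Any, Mapping, Optional


def select_trace_id(
    standard_logging_object: Optional[Mapping[str, Any]],
    litellm_call_id: str,
) -> str:
    """Hypothetical helper: choose the trace_id sent to Langfuse."""
    # Prefer the trace_id on the standard logging object whenever present.
    trace_id = (standard_logging_object or {}).get("trace_id")
    # Fall back to litellm_call_id when no trace_id is found.
    return trace_id if trace_id else litellm_call_id
```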
2025-12-11 14:00:33 -08:00
Peter Dave Hello 70643a8b9c Add support for OpenAI GPT-5.2 models (#17836)
References:
- https://openai.com/index/introducing-gpt-5-2/
- https://platform.openai.com/docs/models/gpt-5.2
2025-12-11 12:49:30 -08:00
yuneng-jiang f9dc034c73 Merge pull request #17775 from BerriAI/litellm_sendgrid
[Feature] Sendgrid integration
2025-12-11 09:19:41 -08:00
YutaSaito 13df50830d chore: prefer standard trace id for Langfuse logging (#17791) 2025-12-11 08:18:45 -08:00
CyrusTC 5d326386fb feat(bedrock): add serviceTier support for Converse API (#17810)
Add support for the Bedrock Converse API serviceTier parameter to allow
specifying processing tier (priority, default, or flex).

Changes:
- Add ServiceTierBlock type in litellm/types/llms/bedrock.py
- Add serviceTier to CommonRequestObject
- Add serviceTier to get_config_blocks() in AmazonConverseConfig
- Add comprehensive tests for serviceTier functionality
- Add documentation for serviceTier usage

This allows users to configure service tier via:
- litellm_params in proxy config
- optional_params in SDK calls
2025-12-11 08:16:32 -08:00
Ashton Sidhu a514313540 Add Hiddenlayer Guardrail Hooks (#17728)
* Core logic working, need to add tests

* Re add removed files

* Remove mistaken files

* one more file

* Add deployment params

* Add tests

* Remove unused imports

* Update docs from feedback

* Update guardrails
2025-12-11 07:43:26 -08:00
Sameer Kankute 8942053c8b Merge pull request #17700 from BerriAI/litellm_batches_passthrough_cost_tracking
Add anthropic retrieve batches and retrieve file content support
2025-12-11 10:31:54 +05:30
yuneng-jiang 1c6de2b80d Merge remote-tracking branch 'origin' into litellm_sendgrid 2025-12-10 20:39:25 -08:00
yuneng-jiang cffd0ac350 sendgrid docs 2025-12-10 20:39:10 -08:00
Shivam Rawat 9d7a255d55 made litellm proxy and sdk difference cleaner in overview (#17790) 2025-12-10 19:14:49 -08:00
Ishaan Jaff 5d456bcdc3 [Feat] UI SSO - allow fetching role from generic SSO provider (Keycloak) (#17787)
* fix ui SSO

* TestGenericResponseConvertorUserRole

* Assigning User Roles via SSO
2025-12-10 13:09:28 -08:00
Alexsander Hamir 439bb5bfe3 fix: suggest Gunicorn instead of uvicorn when using max_requests_before_restart (#17788) 2025-12-10 13:09:00 -08:00
Ishaan Jaff 49b91c4a35 [Feat] A2a gateway - Add cost per token pricing (#17780)
* fix calculate_a2a_cost

* add cost_per_query

* add test_asend_message_uses_cost_per_query

* fix: _initialize_slack_alerting_jobs

* feat: add token tracking for agents invoke

* add A2ARequestUtils

* add _set_usage_on_logging_obj

* test_asend_message_token_tracking

* add _handle_a2a_response_logging

* test_asend_message_streaming_token_tracking

* add A2AStreamingIterator

* add cost calculator for agents

* test_asend_message_uses_input_output_cost_per_token

* docs fix
2025-12-10 13:08:15 -08:00
Ishaan Jaff 5ee32167c0 [Feat] New Provider - add langgraph (#17783)
* init LANGGRAPH

* init LangGraphConfig

* init LangGraphConfig types

* init langgraph

* init getting api base and key

* init transform langgraph

* fix SSE issues

* test_langgraph_acompletion_non_streaming

* add LangGraph to docs

* docs: Setting Up a Local LangGraph Server

* fix langgraph SSE

* fix import uuid
2025-12-10 12:30:35 -08:00
Krish Dholakia 8bc5e2ca7f Add /v1/messages/count_tokens endpoint documentation (#17772)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-10 11:34:11 -08:00
yuneng-jiang ba554a86b9 Merge pull request #16843 from BerriAI/litellm_allow_custom_mount_paths
[Feature] Allow Root Path to Redirect when Docs not on Root Path
2025-12-10 09:52:30 -08:00
Sameer Kankute 4c78c1afc8 Merge pull request #17756 from BerriAI/litellm_add_gemini_computer_use
Add support for computer use for gemini
2025-12-10 22:28:37 +05:30
Sameer Kankute 9e3a04a725 Add batch passthrough endpoint cost tracking for anthropic 2025-12-10 18:24:31 +05:30
Sameer Kankute 0d2f8ce931 Merge pull request #17711 from BerriAI/litellm_add_additional_drop_params_support
feat: Add nested field removal support to additional_drop_params
2025-12-10 15:37:39 +05:30
Krish Dholakia b0a5a4b81d Arize Phoenix OSS - Prompt Management Integration (#17750)
* docs(prompt_management.md): document how to onboard prompts to litellm

* feat(arize_phoenix_prompt_manager.py): support new prompt management integration

allows users to connect arize phoenix prompt manager to litellm

* fix(proxy/utils.py): remove prompt variables to avoid re-processing prompt

* docs(arize_phoenix_prompts.md): document new prompt management integration
2025-12-09 22:53:42 -08:00
Sameer Kankute bcac9e41f6 Add support for computer use for gemini 2025-12-10 10:34:08 +05:30
Cesar Garcia b4e0dabb37 fix: use absolute URL for Supported Endpoints link to avoid Docusaurus slug conflict (#17710)
The relative link was causing Docusaurus to incorrectly associate the
/supported_endpoints page with SDK Functions category instead of the
actual Supported Endpoints generated-index.
2025-12-09 18:49:49 -08:00
Krish Dholakia 254c1155a2 Remove streaming_logging.md documentation (#17739)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 18:26:08 -08:00
Krrish Dholakia d3531be9a0 docs(community.md): add new integration partner doc 2025-12-09 18:17:14 -08:00
Ishaan Jaff 42f5770cfa [docs] add docs for containers files api + code interpreter on LiteLLM (#17749)
* add new container api on OpenAI

* add related

* docs fix

* docs code interpreter

* code interp

* docs code interpreter

* docs code int

* docs code interp

* docs code interp
2025-12-09 18:11:28 -08:00
Krish Dholakia 8d5e6cc62d Add community doc link (#17734)
* Add community contribution guide for integration partners

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Update community docs to direct users to #integration-partners

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 18:10:00 -08:00
Cesar Garcia 63a97db663 feat(voyage): add rerank API support (#17744)
* feat(voyage): add rerank API support

Add support for Voyage AI rerank models (rerank-2.5, rerank-2.5-lite,
rerank-2, rerank-2-lite) to the LiteLLM rerank API.

Changes:
- Add VoyageRerankConfig transformation class
- Register voyage provider in rerank_api/main.py
- Add voyage case in utils.py get_provider_rerank_config
- Add rerank-2.5 and rerank-2.5-lite models to pricing JSON
- Add unit tests for transformation logic
- Update documentation for voyage.md and rerank.md

Usage:
```python
from litellm import rerank

response = rerank(
    model="voyage/rerank-2.5",
    query="What is the capital of France?",
    documents=["Paris is...", "London is..."],
    top_n=3,
)
```

* refactor(voyage): simplify rerank transformation code

Remove verbose docstrings to align with other providers (jina_ai pattern).
No functional changes - 168 lines vs 169 for jina_ai.

* fix(voyage): remove incorrect input_cost_per_query from rerank models

Voyage AI charges per token, not per query. The input_cost_per_query
field was incorrectly set to the same value as input_cost_per_token
in the existing rerank-2 and rerank-2-lite models.

Removes input_cost_per_query from all Voyage rerank models:
- voyage/rerank-2
- voyage/rerank-2-lite
- voyage/rerank-2.5
- voyage/rerank-2.5-lite

Pricing source: https://docs.voyageai.com/docs/pricing
2025-12-09 17:34:09 -08:00
YutaSaito 80a18f989a feat: propagate Langfuse trace_id (#17669) 2025-12-09 12:25:52 -08:00
yuneng-jiang 39bf7a9f7c Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths 2025-12-09 11:58:05 -08:00
Shivam Rawat 43a7bbeeaf added note for using Azure Active Directory Tokens with all the other endpoints (#17733) 2025-12-09 11:51:28 -08:00
yuneng-jiang aa450e7ebe Merge pull request #17738 from BerriAI/litellm_doc_update_1805
[Docs] Adding known issues to 1.80.5-stable docs
2025-12-09 11:46:08 -08:00
yuneng-jiang 431884f591 Adding known issues to 1.80.5-stable docs 2025-12-09 11:45:16 -08:00
Derek Duenas 3322523e07 Passthrough in response (#17102)
* attempt to implement the passthrough feature

* Formatting and small change

* Fix formatting

* feat: grayswan guardrail overwrite ModelResponse in passthrough mode

* fix missing exception error catching on certain endpoints

* fix wrong call site

* fix: patch anthropic endpoint internal error on streaming obj

* fix grayswan testcase

* feat: update the violation response to be more natural

* Formatting

* move passthrough exception definition to custom_guardrail.

* Enhancement: show whether the request was blocked at input or output

* update exception name

* fix a typo in testing unit.

---------

Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
2025-12-09 10:45:45 -08:00
Krish Dholakia 81f0bbad73 Add Azure AI Search to supported vector stores (#17726)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 09:04:04 -08:00
mzagar 9bb0e7dd75 feat: Replace jsonpath-ng with custom minimal parser for additional_drop_params 2025-12-09 17:30:30 +05:30
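
A rough sketch of what a minimal path parser for nested field removal might look like. The path syntax used here (dotted keys plus `[*]` for every list item) and the names `drop_nested`, `_tokenize`, and `_drop` are assumptions made for illustration; the commit itself does not show the supported syntax.

```python
import copy
from typing import Any, Dict, List


def _tokenize(path: str) -> List[str]:
    # "tools[*].input_examples" -> ["tools", "[*]", "input_examples"]
    tokens: List[str] = []
    for part in path.split("."):
        if part.endswith("[*]"):
            tokens.extend([part[:-3], "[*]"])
        else:
            tokens.append(part)
    return tokens


def _drop(node: Any, tokens: List[str]) -> None:
    if not tokens:
        return
    head, rest = tokens[0], tokens[1:]
    if head == "[*]":
        # Wildcard: apply the remaining path to every list element.
        if isinstance(node, list):
            for item in node:
                _drop(item, rest)
        return
    if not isinstance(node, dict) or head not in node:
        return  # Missing paths are ignored silently.
    if rest:
        _drop(node[head], rest)
    else:
        del node[head]


def drop_nested(params: Dict[str, Any], paths: List[str]) -> Dict[str, Any]:
    """Return a copy of params with each nested path removed."""
    result = copy.deepcopy(params)  # Leave the caller's dict untouched.
    for path in paths:
        _drop(result, _tokenize(path))
    return result
```

Working on a deep copy keeps the original request parameters intact, which matters if the same dict is reused across retries.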
Chetan Choudhary 38eda3409a docs: Add SumoLogic integration documentation (#17647)
* docs: Add SumoLogic integration documentation

* minor update
2025-12-08 18:54:07 -08:00
Yi Ding e0a8f7435d docs(json): make it clearer how to get Pydantic model output (#17671) 2025-12-08 18:38:57 -08:00
Ishaan Jaff a904067d38 [Feat] New model - add bedrock writer models (#17685)
* add new bedrock models

* test bedrock writer models

* docs bedrock writer palmyra

* add palmyra models

* add bedrock writer models

* docs fix
2025-12-08 17:49:06 -08:00
Ishaan Jaff 074445edb1 [Fix] AI Gateway Auth - allow using wildcard patterns for public routes (#17686)
* edit auth utils to allow wildcard patterns

* docs fix private / public routes

* test_route_in_additional_public_routes_wildcard_match
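
Wildcard matching for public routes can be sketched with shell-style patterns from the standard library. This is an illustration only: `route_is_public` is a hypothetical name, and using `fnmatch` rather than whatever matcher the commit actually uses is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def route_is_public(route: str, public_routes: Iterable[str]) -> bool:
    """Hypothetical helper: True if route matches any configured pattern."""
    # fnmatchcase treats "*" as "any run of characters", including "/",
    # so "/spend/*" also matches deeper paths such as "/spend/logs/v2".
    return any(fnmatchcase(route, pattern) for pattern in public_routes)
```

Exact routes still work as patterns, so a list can mix entries such as `/health` and `/spend/*`.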
2025-12-08 17:39:53 -08:00
Ishaan Jaff 2f335ac5a6 [Feat] Dynamic Rate Limiter - allow specifying ttl for in memory cache (#17679)
* fix _get_saturation_value_from_cache

* fix _get_saturation_check_cache_ttl

* fix test_saturation_check_cache_ttl_configuration

* docs saturation_check_cache_ttl
2025-12-08 17:20:52 -08:00
Krish Dholakia fbe18a21c9 Docs: Add integration documentation instructions (#17644)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-08 16:29:15 -08:00
Ishaan Jaff 601da4a3d1 [Feat] New model - add nvidia nim llama-3.2-nv-rerankqa-1b-v2 (#17670)
* fix get_nvidia_nim_rerank_config

* add NvidiaNimRankingConfig

* add get_nvidia_nim_rerank_config

* add test_nvidia_nim_rerank_ranking_endpoint

* add /ranking model provider support

* feat: add nvidia/llama-3.2-nv-rerankqa-1b-v2
2025-12-08 15:25:23 -08:00
Cesar Garcia dcf5217d17 docs: improve Getting Started page and SDK documentation structure (#17614)
* docs: update Getting Started page with accurate endpoints and fix exception handling

- Update endpoints list to include /responses, /audio, /batches
- Change "Consistent output" to be endpoint-agnostic
- Clarify Response Format title as "OpenAI Chat Completions Format"
- Fix exception handling example: use litellm exceptions instead of deprecated openai.error
- Add model prefix (anthropic/) to example

* docs: reorganize sidebar and improve SDK documentation structure

Sidebar changes:
- Reorder: Python SDK first, then AI Gateway (Proxy)
- Rename "LiteLLM - Getting Started" to "Getting Started"
- Restructure SDK section with Core Functions, Configuration subsections
- Move budget_manager to Guides
- Move sdk_custom_pricing and migration to Extras
- Remove duplicate embedding/async_embedding and embedding/moderation

Content changes:
- Add Response Format section to response_api.md
- Add async aembedding() section to supported_embedding.md

* docs: add deprecation notice for OpenAI Assistants API

OpenAI has deprecated the Assistants API, shutting down on August 26, 2026.
Added warning banner directing users to the Responses API.

* docs: expand Core Functions in SDK sidebar

Add more SDK functions to Core Functions category:
- text_completion()
- image_generation()
- transcription()
- speech()
- Link to "All Supported Endpoints" for complete list

* Rename Sidebar Item

* docs: revert Getting Started label to original

* Rename sidebar label from 'LiteLLM - Getting Started' to 'Getting Started'
2025-12-08 13:05:50 -08:00
Ishaan Jaff 7b47c0f583 docs: Explain default behavior of drop_params (#17658)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: ishaan <ishaan@berri.ai>
2025-12-08 12:58:21 -08:00
Ishaan Jaff 3a43042fad docs - add sap gen ai provider on LiteLLM (#17667) 2025-12-08 12:43:42 -08:00
_juliettech ee0812a297 Add Helicone as a provider and update observability documentation (#17663)
* Add Helicone as a provider to LiteLLM

* Add Helicone provider integration
2025-12-08 12:34:11 -08:00