* feat(deepseek): add native support for thinking and reasoning_effort params
Add proper parameter mapping for DeepSeek thinking mode, allowing users
to use the unified LiteLLM interface instead of extra_body workarounds.
Supported formats:
- thinking={"type": "enabled"}
- thinking={"type": "enabled", "budget_tokens": X} (budget_tokens ignored)
- reasoning_effort="low|medium|high" (maps to thinking enabled)
DeepSeek only supports {"type": "enabled"} without budget_tokens,
so any budget_tokens are stripped and all reasoning_effort values
(except "none") map to enabled.
Reference: https://api-docs.deepseek.com/guides/thinking_mode
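The mapping described above can be sketched roughly as follows. This is an illustration only, not the code from this change: the helper name `map_deepseek_thinking` is hypothetical, and giving an explicit `thinking` dict precedence over `reasoning_effort` is an assumption.

```python
from typing import Any, Dict, Optional


def map_deepseek_thinking(
    thinking: Optional[Dict[str, Any]] = None,
    reasoning_effort: Optional[str] = None,
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: return the `thinking` value to send, or None."""
    # Only the "type" key is kept, so budget_tokens is dropped.
    if thinking is not None and thinking.get("type") == "enabled":
        return {"type": "enabled"}
    # Every reasoning_effort value except "none" enables thinking.
    if reasoning_effort is not None and reasoning_effort != "none":
        return {"type": "enabled"}
    # Nothing requested: leave thinking unset.
    return None
```

For example, `map_deepseek_thinking(thinking={"type": "enabled", "budget_tokens": 1024})` and `map_deepseek_thinking(reasoning_effort="low")` both yield `{"type": "enabled"}` in this sketch.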
* docs(deepseek): add thinking and reasoning_effort parameter documentation
* fix: resolve mypy type errors in hiddenlayer guardrail and transformation
- Fix return type of apply_guardrail from str to GenericGuardrailAPIInputs
- Add None checks for logging_obj before accessing attributes
- Convert AllMessageValues to dict format for HiddenLayer API compatibility
- Fix payload type annotation in _call_hiddenlayer
- Ensure transformed_output always returns list[dict[str, Any]] in transformation.py
* fix: use litellm_call_id as trace_id fallback in langfuse logging
- Only use standard_logging_object.trace_id if explicitly set via litellm_session_id or litellm_trace_id params
- Fallback to litellm_call_id when no explicit trace_id is provided (matches test expectation)
- Return the trace_id we set instead of generation_client.trace_id for consistency
- Add warning if langfuse modifies the trace_id to help debug potential issues
Fixes test_logging_trace_id test failure where auto-generated UUID was used instead of litellm_call_id
* fix: document envs
* fix: handle None response in /spend/logs endpoint when no records found
- Return empty list [] instead of [None] when spend_log is None
- Prevents 500 errors when querying by request_id, api_key, or user_id with no matching records
- Fixes test_chat_completion_bad_model_with_spend_logs test failure
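A minimal sketch of the None handling described above, assuming the endpoint wraps a single looked-up record in a list. The function name `spend_log_response` is hypothetical and stands in for the relevant part of the endpoint handler.

```python
from typing import Any, List, Optional


def spend_log_response(spend_log: Optional[Any]) -> List[Any]:
    """Hypothetical helper: build the list returned for a single-record lookup."""
    # No matching record: return an empty list rather than [None],
    # which would fail response serialization downstream.
    if spend_log is None:
        return []
    return [spend_log]
```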
* fix: use standard_logging_object trace_id when available in langfuse logger
- Fix trace_id selection logic to use standard_logging_object.trace_id when available
- Previously only used standard_logging_object.trace_id if explicitly set via params
- Now uses standard_logging_object.trace_id whenever it's present, matching test expectations
- Falls back to litellm_call_id if no trace_id is found
- Fixes test_log_langfuse_v2_uses_standard_trace_id_when_available test failure
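The selection order above can be illustrated as below. This is a sketch under assumptions, not the logger's actual code: `select_trace_id` is a hypothetical name, and the standard logging object is modeled as a plain mapping with an optional `trace_id` key.

```python
from typing import Any, Mapping, Optional


def select_trace_id(
    standard_logging_object: Optional[Mapping[str, Any]],
    litellm_call_id: Optional[str],
) -> Optional[str]:
    """Hypothetical helper: pick the trace_id to send to Langfuse."""
    # Prefer the trace_id on the standard logging object whenever present.
    if standard_logging_object is not None:
        trace_id = standard_logging_object.get("trace_id")
        if trace_id:
            return trace_id
    # Otherwise fall back to the call id.
    return litellm_call_id
```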
Add support for the Bedrock Converse API serviceTier parameter to allow
specifying processing tier (priority, default, or flex).
Changes:
- Add ServiceTierBlock type in litellm/types/llms/bedrock.py
- Add serviceTier to CommonRequestObject
- Add serviceTier to get_config_blocks() in AmazonConverseConfig
- Add comprehensive tests for serviceTier functionality
- Add documentation for serviceTier usage
This allows users to configure service tier via:
- litellm_params in proxy config
- optional_params in SDK calls
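As a rough illustration of how a serviceTier value might flow from optional params into the request config: the `{"type": ...}` shape of the block and the helper name `build_config_blocks` are assumptions made for this sketch, not the actual `get_config_blocks()` implementation.

```python
from typing import Any, Dict, Literal, TypedDict


class ServiceTierBlock(TypedDict):
    # Assumed shape: a single "type" field naming the processing tier.
    type: Literal["priority", "default", "flex"]


def build_config_blocks(optional_params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy serviceTier into the request config if set."""
    blocks: Dict[str, Any] = {}
    service_tier = optional_params.get("serviceTier")
    if service_tier is not None:
        blocks["serviceTier"] = ServiceTierBlock(type=service_tier["type"])
    return blocks
```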
* docs(prompt_management.md): document how to onboard prompts to litellm
* feat(arize_phoenix_prompt_manager.py): support new prompt management integration
Allows users to connect the Arize Phoenix prompt manager to LiteLLM.
* fix(proxy/utils.py): remove prompt variables to avoid re-processing prompt
* docs(arize_phoenix_prompts.md): document new prompt management integration
The relative link was causing Docusaurus to incorrectly associate the
/supported_endpoints page with SDK Functions category instead of the
actual Supported Endpoints generated-index.
* Add community contribution guide for integration partners
Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>
* Update community docs to direct users to #integration-partners
Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* feat(voyage): add rerank API support
Add support for Voyage AI rerank models (rerank-2.5, rerank-2.5-lite,
rerank-2, rerank-2-lite) to the LiteLLM rerank API.
Changes:
- Add VoyageRerankConfig transformation class
- Register voyage provider in rerank_api/main.py
- Add voyage case in utils.py get_provider_rerank_config
- Add rerank-2.5 and rerank-2.5-lite models to pricing JSON
- Add unit tests for transformation logic
- Update documentation for voyage.md and rerank.md
Usage:
```python
from litellm import rerank

response = rerank(
    model="voyage/rerank-2.5",
    query="What is the capital of France?",
    documents=["Paris is...", "London is..."],
    top_n=3,
)
```
* refactor(voyage): simplify rerank transformation code
Remove verbose docstrings to align with other providers (jina_ai pattern).
No functional changes - 168 lines vs 169 for jina_ai.
* fix(voyage): remove incorrect input_cost_per_query from rerank models
Voyage AI charges per token, not per query. The input_cost_per_query
field was incorrectly set to the same value as input_cost_per_token
in the existing rerank-2 and rerank-2-lite models.
Removes input_cost_per_query from all Voyage rerank models:
- voyage/rerank-2
- voyage/rerank-2-lite
- voyage/rerank-2.5
- voyage/rerank-2.5-lite
Pricing source: https://docs.voyageai.com/docs/pricing
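With per-token pricing, the cost of a rerank call reduces to a single multiplication. A minimal sketch, where the function name `rerank_cost` and the price used in the comment are placeholders rather than Voyage's actual rates:

```python
def rerank_cost(total_tokens: int, input_cost_per_token: float) -> float:
    """Hypothetical helper: per-token cost with no per-query component."""
    return total_tokens * input_cost_per_token


# Placeholder price, for illustration only.
cost = rerank_cost(total_tokens=1000, input_cost_per_token=5e-08)
```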
* attempt to implement the passthrough feature
* Formatting and small change
* Fix formatting
* feat: grayswan guardrail overwrite ModelResponse in passthrough mode
* fix missing exception handling on certain endpoints
* fix wrong call site
* fix: patch anthropic endpoint internal error on streaming obj
* fix grayswan test case
* feat: make the violation response more natural
* Formatting
* move passthrough exception definition to custom_guardrail.
* Enhancement: show whether the request was blocked at input or output
* update exception name
* fix a typo in a unit test
---------
Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
* docs: update Getting Started page with accurate endpoints and fix exception handling
- Update endpoints list to include /responses, /audio, /batches
- Change "Consistent output" to be endpoint-agnostic
- Clarify Response Format title as "OpenAI Chat Completions Format"
- Fix exception handling example: use litellm exceptions instead of deprecated openai.error
- Add model prefix (anthropic/) to example
* docs: reorganize sidebar and improve SDK documentation structure
Sidebar changes:
- Reorder: Python SDK first, then AI Gateway (Proxy)
- Rename "LiteLLM - Getting Started" to "Getting Started"
- Restructure SDK section with Core Functions, Configuration subsections
- Move budget_manager to Guides
- Move sdk_custom_pricing and migration to Extras
- Remove duplicate embedding/async_embedding and embedding/moderation
Content changes:
- Add Response Format section to response_api.md
- Add async aembedding() section to supported_embedding.md
* docs: add deprecation notice for OpenAI Assistants API
OpenAI has deprecated the Assistants API, shutting down on August 26, 2026.
Added warning banner directing users to the Responses API.
* docs: expand Core Functions in SDK sidebar
Add more SDK functions to Core Functions category:
- text_completion()
- image_generation()
- transcription()
- speech()
- Link to "All Supported Endpoints" for complete list
* Rename Sidebar Item
* docs: revert Getting Started label to original
* Rename sidebar label from 'LiteLLM - Getting Started' to 'Getting Started'