Commit Graph

4936 Commits

Author SHA1 Message Date
yuneng-jiang 39bf7a9f7c Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths 2025-12-09 11:58:05 -08:00
Shivam Rawat 43a7bbeeaf added note for using Azure Active Directory Tokens with all the other endpoints (#17733) 2025-12-09 11:51:28 -08:00
yuneng-jiang aa450e7ebe Merge pull request #17738 from BerriAI/litellm_doc_update_1805
[Docs] Adding known issues to 1.80.5-stable docs
2025-12-09 11:46:08 -08:00
yuneng-jiang 431884f591 Adding known issues to 1.80.5-stable docs 2025-12-09 11:45:16 -08:00
Derek Duenas 3322523e07 Passthrough in response (#17102)
* attempt to implement the passthrough feature

* Formatting and small change

* Fix formatting

* feat: grayswan guardrail overwrite ModelResponse in passthrough mode

* fix missing exception error catching on certain endpoints

* fix wrong call site

* fix: patch anthropic endpoint internal error on streaming obj

* fix grayswan testcase

* feat: update the violation response to read more naturally

* Formatting

* move passthrough exception definition to custom_guardrail.

* Enhancement: show whether the request was blocked at input or output

* update exception name

* fix a typo in testing unit.

---------

Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
2025-12-09 10:45:45 -08:00
Krish Dholakia 81f0bbad73 Add Azure AI Search to supported vector stores (#17726)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-09 09:04:04 -08:00
Chetan Choudhary 38eda3409a docs: Add SumoLogic integration documentation (#17647)
* docs: Add SumoLogic integration documentation

* minor update
2025-12-08 18:54:07 -08:00
Yi Ding e0a8f7435d docs(json): make it clearer how to get Pydantic model output (#17671) 2025-12-08 18:38:57 -08:00
Ishaan Jaff a904067d38 [Feat] New model - add bedrock writer models (#17685)
* add new bedrock models

* test bedrock writer models

* docs bedrock writer palmyra

* add palmyra models

* add bedrock writer models

* docs fix
2025-12-08 17:49:06 -08:00
Ishaan Jaff 074445edb1 [Fix] AI Gateway Auth - allow using wildcard patterns for public routes (#17686)
* edit auth utils to allow wildcard patterns

* docs fix private / public routes

* test_route_in_additional_public_routes_wildcard_match
2025-12-08 17:39:53 -08:00
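
The wildcard matching for public routes described in commit 074445edb1 can be illustrated with a minimal sketch. This is not the LiteLLM implementation: the function name `route_is_public` and the `PUBLIC_PATTERNS` list are hypothetical, and shell-style matching via the standard library `fnmatch` module is an assumption about how such patterns might be evaluated.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration: one exact route and one wildcard prefix.
PUBLIC_PATTERNS = ["/health", "/public/*"]


def route_is_public(route: str, public_patterns: list[str]) -> bool:
    """Return True if `route` matches any exact or wildcard pattern.

    Assumption: patterns use shell-style wildcards, where `*` matches any
    run of characters (including `/`).
    """
    return any(fnmatchcase(route, pattern) for pattern in public_patterns)


if __name__ == "__main__":
    for candidate in ("/health", "/public/docs/index", "/key/generate"):
        print(candidate, route_is_public(candidate, PUBLIC_PATTERNS))
```

An exact entry such as `/health` only matches itself, while `/public/*` covers every route under that prefix but not `/public` alone.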
Ishaan Jaff 2f335ac5a6 [Feat] Dynamic Rate Limiter - allow specifying ttl for in memory cache (#17679)
* fix _get_saturation_value_from_cache

* fix _get_saturation_check_cache_ttl

* fix test_saturation_check_cache_ttl_configuration

* docs saturation_check_cache_ttl
2025-12-08 17:20:52 -08:00
Krish Dholakia fbe18a21c9 Docs: Add integration documentation instructions (#17644)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-08 16:29:15 -08:00
Ishaan Jaff 601da4a3d1 [Feat] New model - add nvidia nim llama-3.2-nv-rerankqa-1b-v2 (#17670)
* fix get_nvidia_nim_rerank_config

* add NvidiaNimRankingConfig

* add get_nvidia_nim_rerank_config

* add test_nvidia_nim_rerank_ranking_endpoint

* add /ranking model provider support

* feat: add nvidia/llama-3.2-nv-rerankqa-1b-v2
2025-12-08 15:25:23 -08:00
Cesar Garcia dcf5217d17 docs: improve Getting Started page and SDK documentation structure (#17614)
* docs: update Getting Started page with accurate endpoints and fix exception handling

- Update endpoints list to include /responses, /audio, /batches
- Change "Consistent output" to be endpoint-agnostic
- Clarify Response Format title as "OpenAI Chat Completions Format"
- Fix exception handling example: use litellm exceptions instead of deprecated openai.error
- Add model prefix (anthropic/) to example

* docs: reorganize sidebar and improve SDK documentation structure

Sidebar changes:
- Reorder: Python SDK first, then AI Gateway (Proxy)
- Rename "LiteLLM - Getting Started" to "Getting Started"
- Restructure SDK section with Core Functions, Configuration subsections
- Move budget_manager to Guides
- Move sdk_custom_pricing and migration to Extras
- Remove duplicate embedding/async_embedding and embedding/moderation

Content changes:
- Add Response Format section to response_api.md
- Add async aembedding() section to supported_embedding.md

* docs: add deprecation notice for OpenAI Assistants API

OpenAI has deprecated the Assistants API, shutting down on August 26, 2026.
Added warning banner directing users to the Responses API.

* docs: expand Core Functions in SDK sidebar

Add more SDK functions to Core Functions category:
- text_completion()
- image_generation()
- transcription()
- speech()
- Link to "All Supported Endpoints" for complete list

* Rename Sidebar Item

* docs: revert Getting Started label to original

* Rename sidebar label from 'LiteLLM - Getting Started' to 'Getting Started'
2025-12-08 13:05:50 -08:00
Ishaan Jaff 7b47c0f583 docs: Explain default behavior of drop_params (#17658)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: ishaan <ishaan@berri.ai>
2025-12-08 12:58:21 -08:00
Ishaan Jaff 3a43042fad docs - add sap gen ai provider on LiteLLM (#17667) 2025-12-08 12:43:42 -08:00
_juliettech ee0812a297 Add Helicone as a provider and update observability documentation (#17663)
* Add Helicone as a provider to liteLLM

* Add Helicone provider integration
2025-12-08 12:34:11 -08:00
Sameer Kankute 05f800fe7d Merge pull request #17653 from BerriAI/litellm_fireworks_rerank_model
(Feat) Add fireworks rerank support
2025-12-08 21:33:08 +05:30
Sameer Kankute 87cf6f3ffe Add fireworks rerank support 2025-12-08 20:29:50 +05:30
Alexsander Hamir 60a325e403 Document missing environment variables and fix incorrect types (#17649)
* fix: correct type annotations for anthropic streaming handlers

- Fix return type of _handle_accumulated_json_chunk from Optional[GenericStreamingChunk] to Optional[ModelResponseStream]
- Fix return type of _parse_sse_data from Optional[GenericStreamingChunk] to Optional[ModelResponseStream]
- Add type annotation for output_items in background_streaming.py

These changes align type annotations with actual return values from chunk_parser() which returns ModelResponseStream.

* docs: add missing ONYX_API_KEY and ONYX_API_BASE to environment variables reference

- Add ONYX_API_BASE documentation entry
- Add ONYX_API_KEY documentation entry
- Fixes test_env_keys.py test failure
2025-12-08 05:38:21 -08:00
Tamir Kiviti 0f5694c8eb add onyx guardrail hooks integration (#16591)
* add onyx guardrail hooks integration

* fix lint issue

* fix lint issue

* update PR to use the new custom guardrail interface

* lint fix
2025-12-07 23:33:28 -08:00
yuneng-jiang 6777a23a53 Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths 2025-12-06 22:22:59 -08:00
Ishaan Jaffer 74b48c9716 docs fix 2025-12-06 16:09:27 -08:00
Ishaan Jaffer b4970f6033 amazon nova api fix 2025-12-06 16:09:27 -08:00
yuneng-jiang 69f65e20e0 Merge pull request #17618 from BerriAI/litellm_customer_usage_docs_path
[Docs] Fixing path to image
2025-12-06 14:47:18 -08:00
yuneng-jiang 7385801fba Fixing path to image 2025-12-06 14:45:50 -08:00
Anil Kodali 1a50a89cd3 [New Model] Add Amazon Nova as first party provider for chat completions (#17351)
* Add Amazon Nova as a first party provider

* Added new provider folder under llms/ to outline the openai supported params

* Updated supported endpoints on the documentation
2025-12-06 14:43:55 -08:00
yuneng-jiang bff3590dd0 Update sidebar for customer usage 2025-12-06 14:21:18 -08:00
Cesar Garcia 8ccfaa21de docs: add Microsoft GraphRAG to projects using LiteLLM (#17616)
* docs: add Microsoft GraphRAG to projects using LiteLLM

* docs: add arXiv paper link for GraphRAG

* docs: add GraphRAG to sidebar

* Update projects in sidebars.js

Reordered items in the projects list to include 'GraphRAG'.
2025-12-06 13:47:46 -08:00
yuneng-jiang 3f7d51d53e Merge remote-tracking branch 'origin' into litellm_customer_usage_docs 2025-12-06 13:43:28 -08:00
yuneng-jiang da4b36fe9b Changed image 2025-12-06 13:23:49 -08:00
yuneng-jiang 40ad0e2e96 Customer Usage Docs 2025-12-06 13:17:54 -08:00
Ishaan Jaffer df6cb4244d docs a2a gateway 2025-12-06 12:32:30 -08:00
Ishaan Jaffer fdf28331a5 docs fix 2025-12-06 11:48:00 -08:00
Ishaan Jaffer b4c6b29149 docs fix 2025-12-06 11:41:51 -08:00
Ishaan Jaffer 2bf0b951f7 docs fix 2025-12-06 11:37:08 -08:00
Ishaan Jaffer 86a0c14aca docs guardrails 2025-12-06 11:34:55 -08:00
Ishaan Jaffer 6bb4087b22 docs fix 2025-12-06 11:09:46 -08:00
Ishaan Jaff a9b654224e 1.80.8 RC docs (#17605)
* stash docs

* docs fix

* doc fix

* docs fix
2025-12-06 10:40:00 -08:00
Sungjun.Kim ca7241188a feat: Add xhigh reasoning effort for gpt-5.1-codex-max (#17585)
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-12-06 09:48:18 -08:00
Cesar Garcia 0f1d6c37d2 docs: add gpt-5.1-codex-max to OpenAI provider documentation (#17602)
Add gpt-5.1-codex-max model to:
- Model list table
- Reasoning effort table
- Verbosity note
2025-12-06 09:46:48 -08:00
Krrish Dholakia 497856e1e3 docs: document multi tenant architecture 2025-12-06 09:27:30 -08:00
Alexsander Hamir 8172f6cdd6 Fix security vulnerability: update mdast-util-to-hast to 13.2.1 (CVE-2025-66400) (#17601) 2025-12-06 09:26:26 -08:00
Cesar Garcia 87f94172a9 fix(responses): Add image generation support for Responses API (#16586)
* fix(responses): Add image generation support for Responses API

Fixes #16227

## Problem
When using Gemini 2.5 Flash Image with /responses endpoint, image generation
outputs were not being returned correctly. The response contained only text
with empty content instead of the generated images.

## Solution
1. Created new `OutputImageGenerationCall` type for image generation outputs
2. Modified `_extract_message_output_items()` to detect images in completion responses
3. Added `_extract_image_generation_output_items()` to transform images from
   completion format (data URL) to responses format (pure base64)
4. Added `_extract_base64_from_data_url()` helper to extract base64 from data URLs
5. Updated `ResponsesAPIResponse.output` type to include `OutputImageGenerationCall`

## Changes
- litellm/types/responses/main.py: Added OutputImageGenerationCall type
- litellm/types/llms/openai.py: Updated ResponsesAPIResponse.output type
- litellm/responses/litellm_completion_transformation/transformation.py:
  Added image detection and extraction logic
- tests/test_litellm/responses/litellm_completion_transformation/test_image_generation_output.py:
  Added comprehensive unit tests (16 tests, all passing)

## Result
/responses endpoint now correctly returns:
```json
{
  "output": [{
    "type": "image_generation_call",
    "id": "..._img_0",
    "status": "completed",
    "result": "iVBORw0KGgo..."
  }]
}
```

The `result` field holds pure base64, with no `data:` prefix.

This matches OpenAI Responses API specification where image generation
outputs have type "image_generation_call" with base64 data in "result" field.

* docs(responses): Add image generation documentation and tests

- Add comprehensive image generation documentation to response_api.md
  - Include examples for Gemini (no tools param) and OpenAI (with tools param)
  - Document response format and base64 handling
  - Add supported models table with provider-specific requirements

- Add unit tests for image generation output transformation
  - Test base64 extraction from data URLs
  - Test image generation output item creation
  - Test status mapping and integration scenarios
  - Verify proper transformation from completions to responses format

Related to #16227

* fix(responses): Correct status type for image generation output

- Add _map_finish_reason_to_image_generation_status() helper function
- Fix MyPy type error: OutputImageGenerationCall.status only accepts
  ['in_progress', 'completed', 'incomplete', 'failed'], not the full
  ResponsesAPIStatus union which includes 'cancelled' and 'queued'

Fixes MyPy error in transformation.py:838
2025-12-05 15:56:26 -08:00
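
The data-URL-to-base64 transformation described in commit 87f94172a9 can be sketched as follows. This is an illustration under stated assumptions, not the code from `transformation.py`: the function names `extract_base64_from_data_url` and `to_image_generation_item` are hypothetical stand-ins, and the output shape is taken only from the JSON example in the commit message.

```python
import base64


def extract_base64_from_data_url(data_url: str) -> str:
    """Strip a `data:<mime>;base64,` prefix if present and return the payload.

    Strings without a `data:` prefix are assumed to already be pure base64
    and are returned unchanged.
    """
    if data_url.startswith("data:") and "," in data_url:
        return data_url.split(",", 1)[1]
    return data_url


def to_image_generation_item(data_url: str, response_id: str, index: int) -> dict:
    """Build an output item shaped like the commit's JSON example (hypothetical helper)."""
    return {
        "type": "image_generation_call",
        "id": f"{response_id}_img_{index}",
        "status": "completed",
        "result": extract_base64_from_data_url(data_url),
    }


if __name__ == "__main__":
    # Encode a few placeholder bytes so the example is self-contained.
    payload = base64.b64encode(b"\x89PNG").decode("ascii")
    item = to_image_generation_item(f"data:image/png;base64,{payload}", "resp_1", 0)
    print(item)
```

Splitting on the first comma only keeps the payload intact, because standard base64 text never contains a comma.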
Cesar Garcia 829b06f53f Fix: Gemini image_tokens incorrectly treated as text tokens in cost calculation (#17554)
When Gemini image generation models return `text_tokens=0` with `image_tokens > 0`,
the cost calculator was assuming no token breakdown existed and treating all
completion tokens as text tokens, resulting in ~10x underestimation of costs.

Changes:
- Fix cost calculation logic to respect token breakdown when image/audio/reasoning
  tokens are present, even if text_tokens=0
- Add `output_cost_per_image_token` pricing for gemini-3-pro-image-preview models
- Add test case reproducing the issue
- Add documentation explaining image token pricing

Fixes #17410
2025-12-05 15:55:38 -08:00
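
The cost-calculation fix described in commit 829b06f53f can be sketched roughly as below. This is a simplified illustration, not LiteLLM's cost calculator: the function `completion_cost`, the `TOKEN_TYPES` tuple, and the prices are all hypothetical, and falling back to the text price for token types without their own price is an assumption made for brevity.

```python
TOKEN_TYPES = ("text", "image", "audio", "reasoning")


def completion_cost(
    completion_tokens: int,
    breakdown: dict[str, int],
    prices: dict[str, float],
) -> float:
    """Price completion tokens, respecting a per-type breakdown when one exists.

    A breakdown counts as present if ANY token type is non-zero, so
    text=0 with image>0 is still priced per type rather than as all text.
    """
    has_breakdown = any(breakdown.get(kind, 0) for kind in TOKEN_TYPES)
    if not has_breakdown:
        # No breakdown reported: treat every completion token as text.
        return completion_tokens * prices["text"]
    # Assumption: types without a dedicated price fall back to the text price.
    return sum(
        breakdown.get(kind, 0) * prices.get(kind, prices["text"])
        for kind in TOKEN_TYPES
    )


if __name__ == "__main__":
    # Hypothetical per-token prices, chosen only to make the difference visible.
    prices = {"text": 1.0, "image": 10.0}
    print(completion_cost(100, {"text": 0, "image": 100}, prices))
    print(completion_cost(100, {}, prices))
```

With these made-up prices, pricing 100 image tokens as text would understate the cost by the ratio of the two prices, which is the kind of underestimation the commit describes.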
Yuichiro Utsumi d18e489872 fix(docs): remove source .env (#17466)
Remove `source .env` since `docker compose` automatically loads
the `.env` file.

Signed-off-by: utsumi.yuichiro <utsumi.yuichiro@fujitsu.com>
2025-12-05 15:53:05 -08:00
Ishaan Jaff f02df3035a [Feat] Allow using dynamic rate limit/priority reservation on teams (#17061)
* use helper to get key/team priority

* test_team_metadata_priority

* docs team priority
2025-12-05 15:42:27 -08:00
Sameer Kankute 43914796d6 fix failing vertex tests 2025-12-06 00:04:04 +05:30
Krrish Dholakia c272741d7f docs: fix strings 2025-12-05 09:37:22 -08:00
Krrish Dholakia c1cbe6ed56 docs: document tool calls spec 2025-12-05 09:37:22 -08:00