litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-24 05:36:04 +00:00

Author	SHA1	Message	Date
yuneng-jiang	39bf7a9f7c	Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths	2025-12-09 11:58:05 -08:00
Shivam Rawat	43a7bbeeaf	added note for using Azure Active Directory Tokens with all the other endpoints (#17733 )	2025-12-09 11:51:28 -08:00
yuneng-jiang	aa450e7ebe	Merge pull request #17738 from BerriAI/litellm_doc_update_1805 [Docs] Adding known issues to 1.80.5-stable docs	2025-12-09 11:46:08 -08:00
yuneng-jiang	431884f591	Adding known issues to 1.80.5-stable docs	2025-12-09 11:45:16 -08:00
Derek Duenas	3322523e07	Passthrough in response (#17102 ) * attempt to implement the passthrough feature * Formatting and small change * Fix formatting * feat: grayswan guardrail overwrite ModelResponse in passthrough mode * fix missing exception error catching on certain endpoints * fix wrong call site * fix: patch anthropic endpoint internal error on streaming obj * fix grayswan testcase * feat: update the violation response to more natural * Formatting * move passthrough exception definition to custom_guardrail. * Enhancement: show whether the blocked at input or output * update exception name * fix a typo in testing unit. --------- Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>	2025-12-09 10:45:45 -08:00
Krish Dholakia	81f0bbad73	Add Azure AI Search to supported vector stores (#17726 ) Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-12-09 09:04:04 -08:00
Chetan Choudhary	38eda3409a	docs: Add SumoLogic integration documentation (#17647 ) * docs: Add SumoLogic integration documentation * minor update	2025-12-08 18:54:07 -08:00
Yi Ding	e0a8f7435d	docs(json): make it clearer how to get Pydantic model output (#17671 )	2025-12-08 18:38:57 -08:00
Ishaan Jaff	a904067d38	[Feat] New model - add bedrock writer models (#17685 ) * add new bedrock models * test bedrock writer models * docs bedrock writer palmyra * add palmyra models * add bedrock writer models * docs fix	2025-12-08 17:49:06 -08:00
Ishaan Jaff	074445edb1	[Fix] AI Gateway Auth - allow using wildcard patterns for public routes (#17686 ) * edit auth utils to allow wildcard patterns * docs fix private / public routes * test_route_in_additional_public_routes_wildcard_match	2025-12-08 17:39:53 -08:00
Ishaan Jaff	2f335ac5a6	[Feat] Dynamic Rate Limiter - allow specifying ttl for in memory cache (#17679 ) * fix _get_saturation_value_from_cache * fix _get_saturation_check_cache_ttl * fix test_saturation_check_cache_ttl_configuration * docs saturation_check_cache_ttl	2025-12-08 17:20:52 -08:00
Krish Dholakia	fbe18a21c9	Docs: Add integration documentation instructions (#17644 ) Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-12-08 16:29:15 -08:00
Ishaan Jaff	601da4a3d1	[Feat] New model - add nvidia nim `llama-3.2-nv-rerankqa-1b-v2` (#17670 ) * fix get_nvidia_nim_rerank_config * add NvidiaNimRankingConfig * add get_nvidia_nim_rerank_config * add test_nvidia_nim_rerank_ranking_endpoint * add /ranking model provider support * feat: add nvidia/llama-3.2-nv-rerankqa-1b-v2	2025-12-08 15:25:23 -08:00
Cesar Garcia	dcf5217d17	docs: improve Getting Started page and SDK documentation structure (#17614 ) * docs: update Getting Started page with accurate endpoints and fix exception handling - Update endpoints list to include /responses, /audio, /batches - Change "Consistent output" to be endpoint-agnostic - Clarify Response Format title as "OpenAI Chat Completions Format" - Fix exception handling example: use litellm exceptions instead of deprecated openai.error - Add model prefix (anthropic/) to example * docs: reorganize sidebar and improve SDK documentation structure Sidebar changes: - Reorder: Python SDK first, then AI Gateway (Proxy) - Rename "LiteLLM - Getting Started" to "Getting Started" - Restructure SDK section with Core Functions, Configuration subsections - Move budget_manager to Guides - Move sdk_custom_pricing and migration to Extras - Remove duplicate embedding/async_embedding and embedding/moderation Content changes: - Add Response Format section to response_api.md - Add async aembedding() section to supported_embedding.md * docs: add deprecation notice for OpenAI Assistants API OpenAI has deprecated the Assistants API, shutting down on August 26, 2026. Added warning banner directing users to the Responses API. * docs: expand Core Functions in SDK sidebar Add more SDK functions to Core Functions category: - text_completion() - image_generation() - transcription() - speech() - Link to "All Supported Endpoints" for complete list * Rename Sidebar Item * docs: revert Getting Started label to original * Rename sidebar label from 'LiteLLM - Getting Started' to 'Getting Started'	2025-12-08 13:05:50 -08:00
Ishaan Jaff	7b47c0f583	docs: Explain default behavior of drop_params (#17658 ) Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: ishaan <ishaan@berri.ai>	2025-12-08 12:58:21 -08:00
Ishaan Jaff	3a43042fad	docs - add sap gen ai provider on LiteLLM (#17667 )	2025-12-08 12:43:42 -08:00
_juliettech	ee0812a297	Add Helicone as a provider and update observability documentation (#17663 ) * Add Helicone as a provider to liteLLM * Add Helicone provider integration	2025-12-08 12:34:11 -08:00
Sameer Kankute	05f800fe7d	Merge pull request #17653 from BerriAI/litellm_fireworks_rerank_model (Feat) Add fireworks rerank support	2025-12-08 21:33:08 +05:30
Sameer Kankute	87cf6f3ffe	Add fireworks rerank support	2025-12-08 20:29:50 +05:30
Alexsander Hamir	60a325e403	Document missing environment variables and fix incorrect types (#17649 ) * fix: correct type annotations for anthropic streaming handlers - Fix return type of _handle_accumulated_json_chunk from Optional[GenericStreamingChunk] to Optional[ModelResponseStream] - Fix return type of _parse_sse_data from Optional[GenericStreamingChunk] to Optional[ModelResponseStream] - Add type annotation for output_items in background_streaming.py These changes align type annotations with actual return values from chunk_parser() which returns ModelResponseStream. * docs: add missing ONYX_API_KEY and ONYX_API_BASE to environment variables reference - Add ONYX_API_BASE documentation entry - Add ONYX_API_KEY documentation entry - Fixes test_env_keys.py test failure	2025-12-08 05:38:21 -08:00
Tamir Kiviti	0f5694c8eb	add onyx guardrail hooks integration (#16591 ) * add onyx guardrail hooks integration * fix lint issue * fix lint issue * update PR to use the new custom guardrail interface * lint fix	2025-12-07 23:33:28 -08:00
yuneng-jiang	6777a23a53	Merge remote-tracking branch 'origin' into litellm_allow_custom_mount_paths	2025-12-06 22:22:59 -08:00
Ishaan Jaffer	74b48c9716	docs fix	2025-12-06 16:09:27 -08:00
Ishaan Jaffer	b4970f6033	amazon nova api fix	2025-12-06 16:09:27 -08:00
yuneng-jiang	69f65e20e0	Merge pull request #17618 from BerriAI/litellm_customer_usage_docs_path [Docs] Fixing path to image	2025-12-06 14:47:18 -08:00
yuneng-jiang	7385801fba	Fixing path to image	2025-12-06 14:45:50 -08:00
Anil Kodali	1a50a89cd3	[New Model] Add Amazon Nova as first party provider for chat completions (#17351 ) * Add Amazon Nova as a first party provider * Added new provider folder under llms/ to outline the openai supported params * Updated supported endpoints on the documentation	2025-12-06 14:43:55 -08:00
yuneng-jiang	bff3590dd0	Update sidebar for customer usage	2025-12-06 14:21:18 -08:00
Cesar Garcia	8ccfaa21de	docs: add Microsoft GraphRAG to projects using LiteLLM (#17616 ) * docs: add Microsoft GraphRAG to projects using LiteLLM * docs: add arXiv paper link for GraphRAG * docs: add GraphRAG to sidebar * Update projects in sidebars.js Reordered items in the projects list to include 'GraphRAG'.	2025-12-06 13:47:46 -08:00
yuneng-jiang	3f7d51d53e	Merge remote-tracking branch 'origin' into litellm_customer_usage_docs	2025-12-06 13:43:28 -08:00
yuneng-jiang	da4b36fe9b	Changed image	2025-12-06 13:23:49 -08:00
yuneng-jiang	40ad0e2e96	Customer Usage Docs	2025-12-06 13:17:54 -08:00
Ishaan Jaffer	df6cb4244d	docs a2a gateway	2025-12-06 12:32:30 -08:00
Ishaan Jaffer	fdf28331a5	docs fix	2025-12-06 11:48:00 -08:00
Ishaan Jaffer	b4c6b29149	docs fix	2025-12-06 11:41:51 -08:00
Ishaan Jaffer	2bf0b951f7	docs fix	2025-12-06 11:37:08 -08:00
Ishaan Jaffer	86a0c14aca	docs guardrails	2025-12-06 11:34:55 -08:00
Ishaan Jaffer	6bb4087b22	docs fix	2025-12-06 11:09:46 -08:00
Ishaan Jaff	a9b654224e	1.80.8 RC docs (#17605 ) * stash docs * docs fix * doc fix * docs fix	2025-12-06 10:40:00 -08:00
Sungjun.Kim	ca7241188a	feat: Add xhigh reasoning effort for gpt-5.1-codex-max (#17585 ) Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2025-12-06 09:48:18 -08:00
Cesar Garcia	0f1d6c37d2	docs: add gpt-5.1-codex-max to OpenAI provider documentation (#17602 ) Add gpt-5.1-codex-max model to: - Model list table - Reasoning effort table - Verbosity note	2025-12-06 09:46:48 -08:00
Krrish Dholakia	497856e1e3	docs: document multi tenant architecture	2025-12-06 09:27:30 -08:00
Alexsander Hamir	8172f6cdd6	Fix security vulnerability: update mdast-util-to-hast to 13.2.1 (CVE-2025-66400) (#17601 )	2025-12-06 09:26:26 -08:00
Cesar Garcia	87f94172a9	fix(responses): Add image generation support for Responses API (#16586 ) * fix(responses): Add image generation support for Responses API Fixes #16227 ## Problem When using Gemini 2.5 Flash Image with /responses endpoint, image generation outputs were not being returned correctly. The response contained only text with empty content instead of the generated images. ## Solution 1. Created new `OutputImageGenerationCall` type for image generation outputs 2. Modified `_extract_message_output_items()` to detect images in completion responses 3. Added `_extract_image_generation_output_items()` to transform images from completion format (data URL) to responses format (pure base64) 4. Added `_extract_base64_from_data_url()` helper to extract base64 from data URLs 5. Updated `ResponsesAPIResponse.output` type to include `OutputImageGenerationCall` ## Changes - litellm/types/responses/main.py: Added OutputImageGenerationCall type - litellm/types/llms/openai.py: Updated ResponsesAPIResponse.output type - litellm/responses/litellm_completion_transformation/transformation.py: Added image detection and extraction logic - tests/test_litellm/responses/litellm_completion_transformation/test_image_generation_output.py: Added comprehensive unit tests (16 tests, all passing) ## Result /responses endpoint now correctly returns: ```json { "output": [{ "type": "image_generation_call", "id": "..._img_0", "status": "completed", "result": "iVBORw0KGgo..." // Pure base64, no data: prefix }] } ``` This matches OpenAI Responses API specification where image generation outputs have type "image_generation_call" with base64 data in "result" field. 
* docs(responses): Add image generation documentation and tests - Add comprehensive image generation documentation to response_api.md - Include examples for Gemini (no tools param) and OpenAI (with tools param) - Document response format and base64 handling - Add supported models table with provider-specific requirements - Add unit tests for image generation output transformation - Test base64 extraction from data URLs - Test image generation output item creation - Test status mapping and integration scenarios - Verify proper transformation from completions to responses format Related to #16227 * fix(responses): Correct status type for image generation output - Add _map_finish_reason_to_image_generation_status() helper function - Fix MyPy type error: OutputImageGenerationCall.status only accepts ['in_progress', 'completed', 'incomplete', 'failed'], not the full ResponsesAPIStatus union which includes 'cancelled' and 'queued' Fixes MyPy error in transformation.py:838	2025-12-05 15:56:26 -08:00
Cesar Garcia	829b06f53f	Fix: Gemini image_tokens incorrectly treated as text tokens in cost calculation (#17554 ) When Gemini image generation models return `text_tokens=0` with `image_tokens > 0`, the cost calculator was assuming no token breakdown existed and treating all completion tokens as text tokens, resulting in ~10x underestimation of costs. Changes: - Fix cost calculation logic to respect token breakdown when image/audio/reasoning tokens are present, even if text_tokens=0 - Add `output_cost_per_image_token` pricing for gemini-3-pro-image-preview models - Add test case reproducing the issue - Add documentation explaining image token pricing Fixes #17410	2025-12-05 15:55:38 -08:00
Yuichiro Utsumi	d18e489872	fix(docs): remove `source .env` (#17466 ) Remove `source .env` since `docker compose` automatically loads the `.env` file. Signed-off-by: utsumi.yuichiro <utsumi.yuichiro@fujitsu.com>	2025-12-05 15:53:05 -08:00
Ishaan Jaff	f02df3035a	[Feat] Allow using dynamic rate limit/priority reservation on teams (#17061 ) * use helper to get key/team priority * test_team_metadata_priority * docs team priority	2025-12-05 15:42:27 -08:00
Sameer Kankute	43914796d6	fix failing vertex tests	2025-12-06 00:04:04 +05:30
Krrish Dholakia	c272741d7f	docs: fix strings	2025-12-05 09:37:22 -08:00
Krrish Dholakia	c1cbe6ed56	docs: document tool calls spec	2025-12-05 09:37:22 -08:00
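
Commit 87f94172a9 above describes converting generated images from the completion format (a data URL) to the Responses API format (pure base64 in a "result" field, with type "image_generation_call"). The following is a minimal sketch of that idea, not LiteLLM's actual implementation: the function names, the `resp_123` ID, and the fixed "completed" status are assumptions made for illustration; only the output field names and the `_img_<index>` ID suffix come from the commit message.

```python
from typing import Dict


def extract_base64_from_data_url(data_url: str) -> str:
    """Return the base64 payload of a data URL.

    Input that is not a base64 data URL is returned unchanged
    (an assumption of this sketch, not confirmed by the commit).
    """
    if data_url.startswith("data:") and "," in data_url:
        header, payload = data_url.split(",", 1)
        if header.endswith(";base64"):
            return payload
    return data_url


def build_image_generation_item(data_url: str, response_id: str, index: int) -> Dict[str, str]:
    """Build an output item shaped like the JSON shown in the commit message."""
    return {
        "type": "image_generation_call",
        "id": f"{response_id}_img_{index}",  # hypothetical ID scheme
        "status": "completed",  # real code would map this from the finish reason
        "result": extract_base64_from_data_url(data_url),
    }


item = build_image_generation_item("data:image/png;base64,iVBORw0KGgo=", "resp_123", 0)
print(item["result"])
```

The "result" value carries no "data:" prefix, which is the behavior the commit says the /responses endpoint now returns.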

4936 Commits