Commit Graph

4896 Commits

Author SHA1 Message Date
Ishaan Jaffer 6bb4087b22 docs fix 2025-12-06 11:09:46 -08:00
Ishaan Jaff a9b654224e 1.80.8 RC docs (#17605)
* stash docs

* docs fix

* doc fix

* docs fix
2025-12-06 10:40:00 -08:00
Sungjun.Kim ca7241188a feat: Add xhigh reasoning effort for gpt-5.1-codex-max (#17585)
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-12-06 09:48:18 -08:00
Cesar Garcia 0f1d6c37d2 docs: add gpt-5.1-codex-max to OpenAI provider documentation (#17602)
Add gpt-5.1-codex-max model to:
- Model list table
- Reasoning effort table
- Verbosity note
2025-12-06 09:46:48 -08:00
Krrish Dholakia 497856e1e3 docs: document multi tenant architecture 2025-12-06 09:27:30 -08:00
Alexsander Hamir 8172f6cdd6 Fix security vulnerability: update mdast-util-to-hast to 13.2.1 (CVE-2025-66400) (#17601) 2025-12-06 09:26:26 -08:00
Cesar Garcia 87f94172a9 fix(responses): Add image generation support for Responses API (#16586)
* fix(responses): Add image generation support for Responses API

Fixes #16227

## Problem
When using Gemini 2.5 Flash Image with /responses endpoint, image generation
outputs were not being returned correctly. The response contained only text
with empty content instead of the generated images.

## Solution
1. Created new `OutputImageGenerationCall` type for image generation outputs
2. Modified `_extract_message_output_items()` to detect images in completion responses
3. Added `_extract_image_generation_output_items()` to transform images from
   completion format (data URL) to responses format (pure base64)
4. Added `_extract_base64_from_data_url()` helper to extract base64 from data URLs
5. Updated `ResponsesAPIResponse.output` type to include `OutputImageGenerationCall`

## Changes
- litellm/types/responses/main.py: Added OutputImageGenerationCall type
- litellm/types/llms/openai.py: Updated ResponsesAPIResponse.output type
- litellm/responses/litellm_completion_transformation/transformation.py:
  Added image detection and extraction logic
- tests/test_litellm/responses/litellm_completion_transformation/test_image_generation_output.py:
  Added comprehensive unit tests (16 tests, all passing)

## Result
/responses endpoint now correctly returns:
```json
{
  "output": [{
    "type": "image_generation_call",
    "id": "..._img_0",
    "status": "completed",
    "result": "iVBORw0KGgo..."  // Pure base64, no data: prefix
  }]
}
```

This matches OpenAI Responses API specification where image generation
outputs have type "image_generation_call" with base64 data in "result" field.

* docs(responses): Add image generation documentation and tests

- Add comprehensive image generation documentation to response_api.md
  - Include examples for Gemini (no tools param) and OpenAI (with tools param)
  - Document response format and base64 handling
  - Add supported models table with provider-specific requirements

- Add unit tests for image generation output transformation
  - Test base64 extraction from data URLs
  - Test image generation output item creation
  - Test status mapping and integration scenarios
  - Verify proper transformation from completions to responses format

Related to #16227

* fix(responses): Correct status type for image generation output

- Add _map_finish_reason_to_image_generation_status() helper function
- Fix MyPy type error: OutputImageGenerationCall.status only accepts
  ['in_progress', 'completed', 'incomplete', 'failed'], not the full
  ResponsesAPIStatus union which includes 'cancelled' and 'queued'

Fixes MyPy error in transformation.py:838
2025-12-05 15:56:26 -08:00
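The data-URL handling described in commit 87f94172a9 can be sketched roughly as follows. This is an illustrative sketch, not the code from the PR: the function names `extract_base64_from_data_url` and `to_image_generation_call` and the `response_id` parameter are hypothetical stand-ins, and only the output shape and the four allowed status values are taken from the commit message.

```python
from typing import Dict

# Status values the commit message lists as valid for image generation output.
ALLOWED_STATUSES = {"in_progress", "completed", "incomplete", "failed"}


def extract_base64_from_data_url(data_url: str) -> str:
    """Return the base64 payload of a data URL.

    Input without a "data:" prefix is assumed to be bare base64 already
    and is returned unchanged (an assumption made for this sketch).
    """
    if data_url.startswith("data:") and "," in data_url:
        # "data:image/png;base64,<payload>" -> "<payload>"
        return data_url.split(",", 1)[1]
    return data_url


def to_image_generation_call(
    response_id: str, index: int, data_url: str, status: str = "completed"
) -> Dict[str, str]:
    """Build an output item shaped like the JSON shown in the commit message."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported image generation status: {status}")
    return {
        "type": "image_generation_call",
        "id": f"{response_id}_img_{index}",
        "status": status,
        "result": extract_base64_from_data_url(data_url),
    }


item = to_image_generation_call("resp_1", 0, "data:image/png;base64,iVBORw0KGgo")
print(item["result"])
```

Rejecting statuses such as 'cancelled' or 'queued' up front mirrors the narrower status type that the MyPy fix in the same commit enforces.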
Cesar Garcia 829b06f53f Fix: Gemini image_tokens incorrectly treated as text tokens in cost calculation (#17554)
When Gemini image generation models return `text_tokens=0` with `image_tokens > 0`,
the cost calculator was assuming no token breakdown existed and treating all
completion tokens as text tokens, resulting in ~10x underestimation of costs.

Changes:
- Fix cost calculation logic to respect token breakdown when image/audio/reasoning
  tokens are present, even if text_tokens=0
- Add `output_cost_per_image_token` pricing for gemini-3-pro-image-preview models
- Add test case reproducing the issue
- Add documentation explaining image token pricing

Fixes #17410
2025-12-05 15:55:38 -08:00
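The cost-calculation fix in commit 829b06f53f can be illustrated with a small sketch. The function `completion_cost`, its parameters, and the prices used below are hypothetical and chosen only for illustration; the rule taken from the commit message is that a token breakdown is respected whenever image tokens are present, even when `text_tokens` is 0.

```python
from typing import Optional


def completion_cost(
    completion_tokens: int,
    text_tokens: Optional[int],
    image_tokens: Optional[int],
    cost_per_text_token: float,
    cost_per_image_token: float,
) -> float:
    """Price completion tokens, honoring the breakdown when one exists."""
    text = text_tokens or 0
    image = image_tokens or 0
    if text == 0 and image == 0:
        # No breakdown reported at all: fall back to text pricing.
        return completion_tokens * cost_per_text_token
    # A breakdown exists (possibly with text == 0): price each category.
    return text * cost_per_text_token + image * cost_per_image_token


# Made-up prices with image tokens 10x the text price.
print(completion_cost(1000, 0, 1000, 0.5, 5.0))
```

With the pre-fix behavior described in the commit, the same call would have priced all 1000 tokens at the text rate, which is the roughly tenfold underestimate the commit reports.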
Yuichiro Utsumi d18e489872 fix(docs): remove source .env (#17466)
Remove `source .env` since `docker compose` automatically loads
the `.env` file.

Signed-off-by: utsumi.yuichiro <utsumi.yuichiro@fujitsu.com>
2025-12-05 15:53:05 -08:00
Ishaan Jaff f02df3035a [Feat] Allow using dynamic rate limit/priority reservation on teams (#17061)
* use helper to get key/team priority

* test_team_metadata_priority

* docs team priority
2025-12-05 15:42:27 -08:00
Sameer Kankute 43914796d6 fix failing vertex tests 2025-12-06 00:04:04 +05:30
Krrish Dholakia c272741d7f docs: fix strings 2025-12-05 09:37:22 -08:00
Krrish Dholakia c1cbe6ed56 docs: document tool calls spec 2025-12-05 09:37:22 -08:00
Sameer Kankute 558c8f92d1 Merge pull request #17519 from BerriAI/litellm_cursor_integration
Add support for cursor BYOK with its own configuration
2025-12-05 22:23:45 +05:30
Alexsander Hamir 0c017f376c fix: code quality issues from ruff linter (#17536)
* fix: resolve code quality issues from ruff linter

- Fix duplicate imports in anthropic guardrail handler
  - Remove duplicate AllAnthropicToolsValues import
  - Remove duplicate ChatCompletionToolParam import

- Remove unused variable 'tools' in guardrail handler

- Replace print statement with proper logging in json_loader
  - Use verbose_logger.warning() instead of print()

- Remove unused imports
  - Remove _update_metadata_field from team_endpoints
  - Remove unused ChatCompletionToolCallChunk imports from transformation

- Refactor update_team function to reduce complexity (PLR0915)
  - Extract budget_duration handling into _set_budget_reset_at() helper
  - Minimal refactoring to reduce function from 51 to 50 statements

All ruff linter errors resolved. Fixes F811, F841, T201, F401, and PLR0915 errors.

* docs: add missing environment variables to documentation

Add 8 missing environment variables to the environment variables reference section:
- AIOHTTP_CONNECTOR_LIMIT_PER_HOST: Connection limit per host for aiohttp connector
- AUDIO_SPEECH_CHUNK_SIZE: Chunk size for audio speech processing
- CYBERARK_SSL_VERIFY: Flag to enable/disable SSL certificate verification for CyberArk
- LITELLM_DD_AGENT_HOST: Hostname or IP of DataDog agent for LiteLLM-specific logging
- LITELLM_DD_AGENT_PORT: Port of DataDog agent for LiteLLM-specific log intake
- WANDB_API_KEY: API key for Weights & Biases (W&B) logging integration
- WANDB_HOST: Host URL for Weights & Biases (W&B) service
- WANDB_PROJECT_ID: Project ID for Weights & Biases (W&B) logging integration

Fixes test_env_keys.py test that was failing due to undocumented environment variables.
2025-12-05 08:40:49 -08:00
Sameer Kankute c8fbcc7f1c add tutorial as well 2025-12-05 12:32:23 +05:30
Sameer Kankute acc0b5fe27 Merge pull request #17362 from BerriAI/litellm_vertex-bge-cherrypick
[Feat] VertexAI - Add BGE Embeddings support
2025-12-05 11:53:42 +05:30
Krish Dholakia b3a3081e8e Guardrails API - new structured_messages param (#17518)
* fix(generic_guardrail_api.py): add 'structured_messages' support

allows guardrail provider to know if text is from system or user

* fix(generic_guardrail_api.md): document 'structured_messages' parameter

give api provider a way to distinguish between user and system messages

* feat(anthropic/): return openai chat completion format structured messages when calls made via `/v1/messages` on Anthropic

* feat(responses/guardrail_translation): support 'structured_messages' param for guardrails

structured openai chat completion spec messages, for guardrail checks when using /v1/responses api

allows guardrail checks to work consistently across APIs
2025-12-04 22:08:00 -08:00
Krish Dholakia 8776336c3c Enable detailed debugging for reference (#17508)
* Deprecate set_verbose in favor of LITELLM_LOG

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Update debugging documentation links

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-04 21:51:56 -08:00
Sameer Kankute 392e5059b0 Add steps to add litellm proxy in cursor 2025-12-05 10:02:42 +05:30
Sameer Kankute 01ee46b493 Add steps to add litellm proxy in cursor 2025-12-05 10:01:48 +05:30
Sameer Kankute 4d83a48b59 Add steps to add litellm proxy in cursor 2025-12-05 09:39:58 +05:30
Sameer Kankute a6006e698c Add support for cursor BYOK with its own configuration 2025-12-05 09:34:49 +05:30
Ishaan Jaffer 4f3b843efe docs openai 2025-12-04 18:32:23 -08:00
Ishaan Jaff b2e8d3fd42 [Feat] Allow adding OpenAI compatible chat providers using .json + add public ai provider (#17448)
* feat: Add JSON config for OpenAI-compatible providers

Co-authored-by: ishaan <ishaan@berri.ai>

* feat: Add simple JSON config for OpenAI-compatible providers

Co-authored-by: ishaan <ishaan@berri.ai>

* feat: Implement JSON-based provider config and migrate PublicAI

Co-authored-by: ishaan <ishaan@berri.ai>

* Checkpoint before follow-up message

Co-authored-by: ishaan <ishaan@berri.ai>

* docs fix

* undo change

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: ishaan <ishaan@berri.ai>
2025-12-04 17:59:25 -08:00
Ishaan Jaff fadfbb13d3 [Docs] A2a - Permission management (#17515)
* docs add a2a gateway + mcp gateway

* docs a2a permissions

* docs a2a permission

* docs

* docs a2a

* docs a2a

* add new img

* docs agent permissions
2025-12-04 17:29:47 -08:00
Ishaan Jaff 575e769bff [Feat] UI - Agent Gateway - set allowed agents by key, team (#17511)
* init schema.prisma

* init LiteLLM_ObjectPermissionTable with agents and agent_access_groups

* TestAgentRequestHandler

* refactor agent list

* add AgentRequestHandler

* fix agent access controls by key/team

* feat - new migration for LiteLLM_AgentsTable

* fix add LiteLLM_ObjectPermissionBase with agent and agent groups

* add agent routes to llm api routes

* add agent routes as llm route

* add AgentPermissionsProps

* add agents on team/key create

* add agent selector on team/key

* add agent selector on key edit /info

* add AgentPermissions

* docs list + invoke agents
2025-12-04 16:31:17 -08:00
Raghav Jhavar 72eb4c3a1c 🆕 feat: support routing to only websearch supported deployments (#17500)
* support routing to only websearch supported deployments

* add docs
2025-12-04 14:18:20 -08:00
Krrish Dholakia 5aeba81538 docs(multi_tenant_architecture.md): add new architecture doc 2025-12-04 11:13:50 -08:00
Sameer Kankute f2c0029939 Merge pull request #17470 from BerriAI/litellm_batches_bedrock_content
Add support for file content download for bedrock batches
2025-12-04 21:57:04 +05:30
Sameer Kankute 5b4542304d Merge pull request #17461 from BerriAI/litellm_qwen2_imported_model_support
Add support for bedrock qwen 2 imported model
2025-12-04 21:56:22 +05:30
Sameer Kankute edd392b50d Add support for file content download for bedrock batches 2025-12-04 13:27:53 +05:30
Krish Dholakia dc7c2b9b05 Update docs to link agent hub (#17462)
* Docs: Add AI Hub agent registry documentation

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Fix: Update AI Hub link in A2A documentation

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-03 21:59:45 -08:00
Sameer Kankute 4710e772be Add support for bedrock qwen 2 imported model 2025-12-04 11:08:57 +05:30
codgician adfbb1c308 docs: document responses and embedding api for github copilot (#17456) 2025-12-03 21:22:08 -08:00
Krish Dholakia 32013f63a0 Guardrail API - support tool call checks on OpenAI /chat/completions, OpenAI /responses, Anthropic /v1/messages (#17459)
* fix(unified_guardrail.py): correctly map a v1/messages call to the anthropic unified guardrail

* fix: add more rigorous call type checks

* fix(anthropic_endpoints/endpoints.py): initialize logging object at the beginning of endpoint

ensures call id + trace id are emitted to guardrail api

* feat(anthropic/chat/guardrail_translation): support streaming guardrails

sample on every 5 chunks

* fix(openai/chat/guardrail_translation): support openai streaming guardrails

* fix: initial commit fixing output guardrails for responses api

* feat(openai/responses/guardrail_translation): handler.py - fix output checks on responses api

* fix(openai/responses/guardrail_translation/handler.py): ensure responses api guardrails work on streaming

* test: update tests

* test: update tests

* fix: support multiple kinds of input to the guardrail api

* feat(guardrail_translation/handler.py): support extracting tool calls from openai chat completions for guardrail api's

* feat(generic_guardrail_api.py): support extracting + returning modified tool calls on generic_guardrails_api

allows guardrail api to analyze tool call being sent to provider - to run any analysis on it

* fix(guardrails.py): support anthropic /v1/messages tool calls

* feat(responses_api/): extract tool calls for guardrail processing

* docs(generic_guardrail_api.md): document tools param support

* docs: generic_guardrail_api.md

improve documentation
2025-12-03 21:20:39 -08:00
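The "sample on every 5 chunks" approach to streaming guardrails mentioned in commit 32013f63a0 might look roughly like the sketch below. The name `guarded_stream` and the `check` callback are hypothetical, and running the check on the accumulated text (rather than only the latest chunks) plus once more at the end of the stream is an assumption of this sketch, not something the commit message states.

```python
from typing import Callable, Iterable, Iterator, List


def guarded_stream(
    chunks: Iterable[str],
    check: Callable[[str], None],
    every: int = 5,
) -> Iterator[str]:
    """Yield chunks unchanged, calling `check` on the text so far every `every` chunks.

    `check` is expected to raise if the text violates a guardrail,
    which stops the stream.
    """
    buffer: List[str] = []
    for i, chunk in enumerate(chunks, start=1):
        buffer.append(chunk)
        if i % every == 0:
            check("".join(buffer))
        yield chunk
    # Cover the trailing chunks that did not land on a sampling boundary.
    if buffer and len(buffer) % every != 0:
        check("".join(buffer))


seen: List[str] = []
for piece in guarded_stream(list("abcdefghijkl"), seen.append):
    pass
print(seen)
```

Sampling trades latency for coverage: fewer guardrail calls per stream, at the price of up to four unchecked chunks reaching the client before the next check runs.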
Ishaan Jaff e4f954b354 [Docs] Agent Gateway (#17454)
* init litellm A2a client

* simpler a2a client interface

* test a2a

* move a2a invoking tests

* test fix

* ensure a2a send message is tracked in logs

* rename tags

* add streaming handling

* add a2a invocation

* add a2a invocation in cost calc

* test_a2a_logging_payload

* update invoke_agent_a2a

* test_invoke_agent_a2a_adds_litellm_data

* add A2a agent

* fix endpoints on A2a

* UI allow testing a2a endpoints

* add agent imgs

* add a2a as an endpoint

* add a2a

* docs a2a invoke

* docs a2a

* docs A2a invoke
2025-12-03 18:57:41 -08:00
Ishaan Jaff f035984dd7 fix: cyberark allow setting ssl verify to false (#17433) 2025-12-03 18:54:31 -08:00
yuneng-jiang 37c598441f Change is_sso_configured to auto_redirect_to_sso 2025-12-03 15:48:50 -08:00
Ishaan Jaffer 9b3d8302cf docs fix stable 2025-12-03 14:12:50 -08:00
Cesar Garcia 5e791464af docs: add Microsoft Agent Lightning to projects (#17422)
Add Agent Lightning, Microsoft's open-source framework for training
AI agents with RL, APO, and SFT. Uses LiteLLM Proxy for LLM routing
and trace collection.
2025-12-03 09:07:02 -08:00
Krrish Dholakia be5dd234bf docs: fix list 2025-12-03 08:01:26 -08:00
Sameer Kankute 8eaabb4ad7 Add vector store support for ragflow 2025-12-03 15:29:47 +05:30
Sameer Kankute 52090c3f3e Merge pull request #17350 from BerriAI/litellm_rag_chat_completion_api
Add ragflow support for chat completions API
2025-12-03 13:29:32 +05:30
Cesar Garcia 86350fe6d7 docs: add Google ADK and Harbor to projects (#17352)
Both frameworks integrate with LiteLLM:
- Google ADK uses LiteLLM for model-agnostic agent building
- Harbor uses LiteLLM for agent evaluation across providers
2025-12-02 22:27:04 -08:00
Cesar Garcia 4c6604b0da Cleanup: Remove orphan docs pages and Docusaurus template files (#17356)
* docs: update getting started page

- Add Core Functions table with link to full list
- Add Responses API section
- Add Async section with acompletion() example
- Add "Switch Providers with One Line" example
- Clarify Basic Usage supports multiple endpoints
- Update models to current versions (openai/gpt-4o, anthropic/claude-sonnet-4)
- Use provider/model format throughout
- Fix deprecated import: from openai.error -> from openai
- Keep original structure: community key, More details links, observability env vars

* Cleanup: Remove orphan docs pages and Docusaurus template files

- Remove orphan getting_started.md (not linked in sidebar)
- Remove Docusaurus template intro.md
- Remove tutorial-basics/ directory (Docusaurus template)
- Remove tutorial-extras/ directory (Docusaurus template)
2025-12-02 22:25:26 -08:00
Ali Saleh 6b5ad5d5a6 docs: Update Instructions For Phoenix Integration (#17373) 2025-12-02 22:03:54 -08:00
Sameer Kankute a0819d6df0 Merge branch 'main' into litellm_vertex-bge-cherrypick 2025-12-03 08:37:04 +05:30
Ishaan Jaff 427074ac6e Fix: Datadog callback regression when ddtrace is installed (#17393)
* fix DD agent host logging

* docs fix

* test_datadog_agent_configuration

* test_datadog_ignores_ddtrace_agent_host
2025-12-02 17:27:50 -08:00
Ishaan Jaff 6c188c5ae2 [Feat] New model/provider - Adds support for Google Cloud Chirp3 HD on /speech (#17391)
* docs vertex tts

* place vertex ai types in file

* use VertexAITextToSpeechConfig

* use vertex_voice_dict

* refactor docs

* docs vertex ai chirp

* TestVertexAITextToSpeechConfig

* new provider vertex ai chirp3

* test_litellm_speech_vertex_ai_chirp

* add vertex_ai/chirp cost tracking
2025-12-02 15:36:23 -08:00