Commit Graph

4770 Commits

Author SHA1 Message Date
Ishaan Jaffer 73d3d151ab fix 1.80.5 2025-11-22 19:24:46 -08:00
YutaSaito 06f2ecef42 feat: tool permission argument check (#16982) 2025-11-22 19:21:25 -08:00
Ishaan Jaff f3cd0b0bc4 docs - ai gateway prompt management (#16990)
* docs prompt management

* docs

* add prompt mgmt imgs

* docs fix

* docs prompt management

* docs fix

* docs prompt versions

* docs fix
2025-11-22 18:49:29 -08:00
Ishaan Jaffer 1c864dc0de docs 1.80.5 2025-11-22 17:54:50 -08:00
Ishaan Jaffer 7e338f1beb docs sso roles 2025-11-22 17:36:44 -08:00
Ishaan Jaffer 2ebfc92dca docs fix 2025-11-22 17:34:45 -08:00
Ishaan Jaffer 036b2848e3 docs 2025-11-22 17:33:57 -08:00
Ishaan Jaffer f1e4242cbf docs fix 2025-11-22 17:33:57 -08:00
Alexsander Hamir 815136fbef perf release notes (#16978) 2025-11-22 17:23:31 -08:00
Ishaan Jaffer 023eefb6d5 docs fix 2025-11-22 16:54:30 -08:00
Ishaan Jaffer 93c2103097 fix docs 2025-11-22 16:16:56 -08:00
Ishaan Jaffer 7cbb159997 v1.80.0-stable 2025-11-22 16:09:52 -08:00
Krrish Dholakia d0cb2db0c6 docs(ai_hub.md): document mcp servers on ai hub 2025-11-22 16:07:38 -08:00
yuneng-jiang c6e0a0209f Docs for Model Compare UI (#16979) 2025-11-22 15:48:42 -08:00
Krish Dholakia c966c122ad feat: Add Presidio PII masking tutorial (#16969)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-22 15:45:43 -08:00
Krish Dholakia b9f2cc1c98 Model Armor - Logging guardrail response on llm responses (#16977)
* Litellm dev 11 22 2025 p1 (#16975)

* fix(model_armor.py): return response after applying changes

* fix: initial commit adding guardrail span logging to otel on post-call runs

sends it as a separate span right now, need to include in the same llm request/response span

* fix(opentelemetry.py): include guardrail in received request log + set input/output fields on parent otel span instead of nesting it

allows request/response to be seen easily on observability tools

* fix(model_armor.py): working model armor logging on post call events

* fix: fix exception message

* fix(opentelemetry.py): add backwards compatibility for litellm_request

allow users building on the spec change to use previous spec
2025-11-22 15:44:28 -08:00
Ishaan Jaffer 4fb9e33a95 fixes 2025-11-22 14:11:13 -08:00
Ishaan Jaffer b43b68a072 docs fix 2025-11-22 14:02:14 -08:00
Alexsander Hamir b02baf53a9 Fix: prevent memory blowout in LoggingWorker (#16559)
* fix: prevent memory blowout in LoggingWorker

Tasks were being executed sequentially with each task awaited before
processing the next one. When the queue had 10k+ tasks, only one could
execute at a time. Since the request rate exceeded execution speed,
objects accumulated in memory (50k+), holding references to heavy
objects and causing memory blowout.

The new implementation uses a semaphore to allow up to 1000 concurrent
tasks while properly tracking and cleaning up each task, significantly
improving throughput and preventing queue buildup.

* fix: require semaphore before removing task from queue

* fix: make worker concurrency configurable

* fix: clean comments

* fix: clarify new env purpose

* fix: add missing lib

* make constants configurable instead of hardcoded

* add more aggressive cleaning when queue is full

* add helper functions for the aggressive cleaning functionality

* use envs instead of static constants

* import and document constants

* add unit test for new functionality

* fix default value on config_settings

* fix: remove unused variables and imports to resolve linter errors

- Remove unused time_since_last_clear variable in logging_worker.py
  The variable was calculated but never used in _handle_queue_full()
  method, causing F841 linter error.

- Remove unused TYPE_CHECKING import in mcp_server/server.py
  The import was not used anywhere in the file, causing F401 linter error.

These changes improve code cleanliness and ensure the codebase passes
all linter checks without affecting functionality.

* add missing log expected by test_queue_full_handling

* fix: clean config_setting.md file

* fix: handle logging errors gracefully during shutdown in _flush_on_exit

During process shutdown, logging handlers may be closed while _flush_on_exit
tries to flush queued logging coroutines. This causes 'ValueError: I/O
operation on closed file' errors when coroutines attempt to log.

Changes:
- Add _safe_log helper method that wraps logging calls and suppresses
  errors when logging handlers are closed (ValueError, OSError, AttributeError)
- Replace all verbose_logger calls in _flush_on_exit with _safe_log
- Remove logging from exception handler in coroutine execution loop
  to prevent cascading errors during shutdown

This ensures graceful shutdown even when logging handlers are closed,
which is common during process termination.
2025-11-22 13:58:29 -08:00
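
The LoggingWorker commit above describes replacing one-at-a-time task execution with a semaphore that caps concurrency, and acquiring that semaphore before a task is removed from the queue. The following is only a minimal sketch of that general pattern, not the code from the commit: `BoundedWorker`, `submit`, `join`, and `demo` are hypothetical names chosen for illustration, and the limit of 1000 is taken from the commit message.

```python
import asyncio
import contextlib


class BoundedWorker:
    """Hypothetical illustration of a semaphore-bounded queue consumer."""

    def __init__(self, max_concurrency: int = 1000, max_queue_size: int = 50_000):
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
        self._sem = asyncio.Semaphore(max_concurrency)
        self._running: set = set()  # strong references to in-flight tasks

    async def submit(self, coro) -> None:
        # Blocks when the queue is full, which applies backpressure to producers.
        await self._queue.put(coro)

    async def _run_one(self, coro) -> None:
        try:
            await coro
        except Exception:
            pass  # a failing task must not stop the worker
        finally:
            self._sem.release()      # free the slot for the next task
            self._queue.task_done()  # lets join() observe completion

    async def run(self) -> None:
        while True:
            # Acquire a slot BEFORE dequeuing, so items stay in the queue
            # until there is capacity to execute them.
            await self._sem.acquire()
            coro = await self._queue.get()
            task = asyncio.create_task(self._run_one(coro))
            self._running.add(task)
            task.add_done_callback(self._running.discard)  # cleanup per task

    async def join(self) -> None:
        await self._queue.join()


async def demo(n_jobs: int = 20, limit: int = 5):
    """Run n_jobs short jobs and report (peak concurrency, jobs finished)."""
    active = peak = done = 0

    async def job():
        nonlocal active, peak, done
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        done += 1

    worker = BoundedWorker(max_concurrency=limit)
    for _ in range(n_jobs):
        await worker.submit(job())
    runner = asyncio.create_task(worker.run())
    await worker.join()
    runner.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await runner
    return peak, done


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the number of jobs running at once never exceeds `limit`, while the sequential design described in the commit message would run exactly one at a time.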
Ishaan Jaffer a06e7edd85 docs 1.80.5 2025-11-22 13:52:26 -08:00
Krish Dholakia ac3aa74c22 (feat) Anthropic - support Structured Outputs output_format for Claude 4.5 sonnet and Opus 4.1 + Arize Phoenix - root span logging (#16949)
* feat(anthropic/chat/transformations): for claude-4-5-sonnet and opus-4-1 support passing structured output to anthropic api

* docs: document new feature

* fix: fix output format

* fix: cleanup

* fix(transformation.py): conditionally pass in json tool call

* fix: support ARIZE_SPACE_ID instead of ARIZE_SPACE_KEY

* docs(arize_integration.md): cleanup arize docs

* feat(callback_info_helpers.tsx): allow setting arize space id via ui

* fix: fix linting error

* fix(opentelemetry.py): working arize phoenix root span tracing
2025-11-22 12:08:26 -08:00
Ishaan Jaffer 3ba3faefb8 fix sec scan 2025-11-22 11:50:32 -08:00
Ishaan Jaffer ff99f93dfc fix req.txt 2025-11-22 11:46:17 -08:00
Cesar Garcia 6810e0699b docs: Add mini-swe-agent to Projects built on LiteLLM (#16971)
* docs: Add mini-swe-agent to projects page

Add mini-swe-agent to the documentation projects page.
mini-swe-agent is a minimal AI coding agent that resolves >70% of
GitHub issues in SWE-bench, built on LiteLLM for model flexibility.

- Added projects/mini-swe-agent.md documentation
- Updated sidebars.js to include mini-swe-agent in projects list

* docs: Update Singularity to Apptainer in mini-swe-agent.md
2025-11-22 10:45:29 -08:00
Ishaan Jaffer badbadba0d fix img URL for tests 2025-11-22 09:41:15 -08:00
Sameer Kankute 82dc0354ce Litellm sameer nov 3 stable branch (#16963)
* Add openai metadata field in the request

* Add docs related to openai metadata

* Add utils

* test_completion_openai_metadata[True]

* Added support for thought signature for gemini 3 in responses api (#16872)

* Added support for thought signature for gemini 3

* Update docs with all supported endpoints and cost tracking

* Added config based routing support for batches and files

* fix lint errors

* Litellm anthropic image url support (#16868)

* Add image as url support to anthropic

* fix mypy errors

* fix tests

* Fix: Populate spend_logs_metadata in batch and files endpoints (#16921)

* Add spend-logs-metadata to the metadata

* Add tests for spend logs metadata in batches

* use better names

* Remove support for penalty param for gemini 3 (#16907)

* Remove support for penalty param

* remove hallucinated model names

* fix mypy/test errors

* fix tests

* fix too many lines error

* fix too many lines error

* Add config for cicd test case

* Fix final tests

* fix batch tests

* fix batch tests
2025-11-22 09:35:05 -08:00
Ishaan Jaff 661117678c Revert "remove deprecated embedding model (#16724)" (#16970)
This reverts commit b9bc903536.
2025-11-22 09:34:53 -08:00
Derek Duenas bbaf0af907 Grayswan guardrail passthrough on flagged (#16891)
* attempt to implement the passthrough feature

* Formatting and small change

* Fix formatting

* Format test file

---------

Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
2025-11-21 20:01:35 -08:00
Mubashir Osmani db58f6aeb1 fix: arize phoenix logging (#16301)
* arize phx

* fix arize integration

* traces to specific project name

* fix

* look for http endpoint
2025-11-21 18:46:18 -08:00
yuneng-jiang 1ebe1fea37 Docs for Model Compare UI and Org Usage (#16928)
* Docs for Model Compare UI and Org Usage

* Fix typo in img path and add Model Compare to sidebars.js

* Updated to remove from 1.80 writeup
2025-11-21 16:45:55 -08:00
Ishaan Jaff 8e318dd06c [Feat] New LLM Provider - Docker Model Runner (#16948)
* add DOCKER_MODEL_RUNNER

* add DockerModelRunnerChatConfig Transform

* add docker_model_runner

* add docker_model_runner

* docs docker model runner

* add DockerModelRunnerChatConfig

* add docker_model_runner to providers

* test_completion_hits_correct_url_and_body

* fix sidebar

* TestDockerModelRunnerIntegration

* test_completion_with_custom_engine_and_host

* docs docker model runner

* docs fix
2025-11-21 16:09:32 -08:00
Suresh Kumar 5b4a848391 fix anthropic pass-through endpoint (#16883) 2025-11-21 16:00:05 -08:00
Cesar Garcia 1c65800f4a Feat: add support for Grok 4.1 Fast models (#16936)
* feat: Add support for Grok 4.1 Fast models

Add new xAI Grok 4.1 Fast models optimized for high-performance agentic tool calling:

- xai/grok-4-1-fast (alias for grok-4-1-fast-reasoning)
- xai/grok-4-1-fast-reasoning (with reasoning capabilities)
- xai/grok-4-1-fast-reasoning-latest
- xai/grok-4-1-fast-non-reasoning (without reasoning for faster responses)
- xai/grok-4-1-fast-non-reasoning-latest

Features:
- Context window: 2,000,000 tokens
- Pricing: $0.20/1M input, $0.50/1M output tokens
- Cached tokens: $0.05/1M tokens
- Supports: Function calling, Structured outputs, Vision, Audio input, Web search, Reasoning

Fixes #16927

* docs: Add comprehensive Grok models documentation

- Add 'Supported Models' section highlighting new Grok 4.1 Fast models
- Include comparison guide for reasoning vs non-reasoning models
- Add complete model family table (Grok 4.1, 4, 3, Code, 2)
- Add features legend explaining capabilities
- Remove pricing details (link to xAI docs instead for current rates)
- Improve documentation clarity and consistency

Related to #16927

* docs: Minor corrections to xai.md
2025-11-21 15:57:55 -08:00
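
As a worked example of the per-token rates quoted in the Grok 4.1 Fast commit above ($0.20/1M input, $0.50/1M output, $0.05/1M cached), here is a rough cost estimate. It is a sketch under stated assumptions: `estimate_cost` is a hypothetical helper, not a LiteLLM function, and it assumes cached tokens are a subset of the prompt tokens and are billed at the cached rate instead of the input rate. The same commit notes that the docs link to xAI for current rates, so these figures may be out of date.

```python
from decimal import Decimal

# Rates as quoted in the commit message, in USD per 1,000,000 tokens.
INPUT_PER_M = Decimal("0.20")
OUTPUT_PER_M = Decimal("0.50")
CACHED_PER_M = Decimal("0.05")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Hypothetical helper: estimate request cost in USD.

    Assumption (not confirmed by the source): cached tokens are counted
    inside prompt_tokens and billed at the cached rate only.
    """
    if min(prompt_tokens, completion_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHED_PER_M
        + completion_tokens * OUTPUT_PER_M
    )
    return total / ONE_MILLION


if __name__ == "__main__":
    # 800k uncached input + 200k cached input + 100k output tokens:
    # 0.16 + 0.01 + 0.05 = 0.22 USD
    print(estimate_cost(1_000_000, 100_000, cached_tokens=200_000))
```

`Decimal` is used rather than floats so that small per-token amounts add up without binary rounding drift.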
YutaSaito 7c4ef090c1 docs: fix mcp url format (#16940)
* docs: fix mcp url format

* fix: update Cursor MCP example to use url instead of server_url
2025-11-21 15:43:26 -08:00
colinlin-stripe f9d8eeaf8e [stripe] gemini 3 thought signatures in tool call id (#16895)
* thought signature tool call id

* [stripe] refactor and tests

* [stripe] remove md and move to factory

* [stripe] remove redundant test

* [stripe] ran black formatting

* [stripe] add thought signature docs

* [stripe] remove unused import
2025-11-21 13:44:53 -08:00
YutaSaito 041ac054b6 feat: allow custom violation message for tool-permission guardrail (#16916) 2025-11-21 08:52:01 -08:00
Krrish Dholakia e7751f0c12 docs: fix docs bug 2025-11-20 16:05:39 -08:00
Ishaan Jaff 57544f1662 [Feat] Adds IAM role assumption support for AWS Secret Manager (#16887)
* add AWS fields for KeyManagementSettings

* docs IAM roles

* use aws iam auth on secret manager v2

* fix: load_aws_secret_manager

* test_secret_manager_with_iam_role_settings
2025-11-20 12:38:48 -08:00
yuneng-jiang 9120a02474 Change favicon (#16837) 2025-11-19 20:38:22 -08:00
Krrish Dholakia 28cadaa123 docs: fix tags 2025-11-19 20:26:48 -08:00
Krrish Dholakia 0389f2d064 docs: cleanup 2025-11-19 20:26:48 -08:00
Krrish Dholakia 87be419559 docs(index.md): cleanup 2025-11-19 20:23:22 -08:00
Krrish Dholakia 778425f02f docs: add initial blog post for Gemini 3 on LiteLLM 2025-11-19 20:22:24 -08:00
Krrish Dholakia 208027dc71 docs(ui.md): reorder ui page 2025-11-19 19:17:54 -08:00
Sameer Kankute 6fc7397dde Add Vertex AI Image Edit Support (#16828)
* Add vertex ai image edit support

* Fix lint errors
2025-11-19 18:39:28 -08:00
Krrish Dholakia 08246bf908 fix: fix broken doc link 2025-11-19 08:35:38 -08:00
Rob Geada afc9a763cb Fix IBM Guardrails optional params, add extra_headers field (#16771)
Signed-off-by: Rob Geada <rob@geada.net>
2025-11-18 19:55:40 -08:00
Ishaan Jaff 35ab0e109c [Docs] SSO - Manage User Roles via Azure App Roles (#16796)
* add img 2

* add app roles

* docs
2025-11-18 16:00:36 -08:00
yuneng-jiang f9ec353b80 [Feature] UI - Allow setting base_url in API reference docs (#16674)
* Allow setting base_url in API reference docs

* Add logic to change base url for test key page
2025-11-18 11:27:28 -08:00
Sameer Kankute acf206bec6 Add Day 0 gemini-3-pro-preview support (#16719)
* Add thinking signature support for gemini

* Add docs related to thinking signature

* remove double base64 import

* fix mypy errors

* fix litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py mypy

* Add new gemini 3 model and features

* Add docs related to gemini 3

* Update gemini 3 pricing

* fix llm translation tests

* fix mapped tests
2025-11-18 09:44:45 -08:00