* Litellm dev 11 22 2025 p1 (#16975)
* fix(model_armor.py): return response after applying changes
* fix: initial commit adding guardrail span logging to otel on post-call runs
sends it as a separate span right now, need to include in the same llm request/response span
* fix(opentelemetry.py): include guardrail in received request log + set input/output fields on parent otel span instead of nesting it
allows request/response to be seen easily on observability tools
* fix(model_armor.py): working model armor logging on post call events
* fix: fix exception message
* fix(opentelemetry.py): add backwards compatibility for litellm_request
allow users building on the spec change to use previous spec
* fix: prevent memory blowout in LoggingWorker
Tasks were being executed sequentially with each task awaited before
processing the next one. When the queue had 10k+ tasks, only one could
execute at a time. Since the request rate exceeded execution speed,
objects accumulated in memory (50k+), holding references to heavy
objects and causing memory blowout.
The new implementation uses a semaphore to allow up to 1000 concurrent
tasks while properly tracking and cleaning up each task, significantly
improving throughput and preventing queue buildup.
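
The bounded-concurrency pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual LoggingWorker code: `run_worker`, the `None` shutdown sentinel, and the `max_concurrency` parameter are hypothetical names chosen for the example. It only assumes what the message states: a semaphore caps in-flight tasks, and each task is tracked and cleaned up when it finishes.

```python
import asyncio
from typing import Optional, Set


async def run_worker(queue: asyncio.Queue, max_concurrency: int = 1000) -> None:
    """Drain `queue` of coroutines, running at most `max_concurrency` at once.

    Hypothetical sketch: a `None` item is used here as a shutdown sentinel.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    running: Set[asyncio.Task] = set()  # strong refs so tasks are not garbage collected

    async def _run(coro) -> None:
        try:
            await coro
        except Exception:
            # A real worker would log the failure; the sketch just swallows it.
            pass
        finally:
            # Always free the slot, even if the coroutine failed.
            semaphore.release()

    while True:
        # Acquire a slot BEFORE taking an item off the queue, so an item is
        # never removed unless there is capacity to run it.
        await semaphore.acquire()
        coro: Optional[object] = await queue.get()
        if coro is None:
            semaphore.release()
            break
        task = asyncio.create_task(_run(coro))
        running.add(task)
        # Drop the reference as soon as the task completes.
        task.add_done_callback(running.discard)

    # Wait for whatever is still in flight before returning.
    if running:
        await asyncio.gather(*running, return_exceptions=True)
```

Acquiring the semaphore before `queue.get()` mirrors the later "require semaphore before removing task from queue" fix in this log: items stay in the queue, where a size limit can apply, rather than piling up as pending tasks.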
* fix: require semaphore before removing task from queue
* fix: make worker concurrency configurable
* fix: clean comments
* fix: clarify new env purpose
* fix: add missing lib
* make constants configurable instead of hardcoded
* add more aggressive cleaning when queue is full
* add helper functions for the aggressive cleaning functionality
* use envs instead of static constants
* import and document constants
* add unit test for new functionality
* fix default value on config_settings
* fix: remove unused variables and imports to resolve linter errors
- Remove unused time_since_last_clear variable in logging_worker.py
The variable was calculated but never used in _handle_queue_full()
method, causing F841 linter error.
- Remove unused TYPE_CHECKING import in mcp_server/server.py
The import was not used anywhere in the file, causing F401 linter error.
These changes improve code cleanliness and ensure the codebase passes
all linter checks without affecting functionality.
* add missing log expected by test_queue_full_handling
* fix: clean config_setting.md file
* fix: handle logging errors gracefully during shutdown in _flush_on_exit
During process shutdown, logging handlers may be closed while _flush_on_exit
tries to flush queued logging coroutines. This causes 'ValueError: I/O
operation on closed file' errors when coroutines attempt to log.
Changes:
- Add _safe_log helper method that wraps logging calls and suppresses
errors when logging handlers are closed (ValueError, OSError, AttributeError)
- Replace all verbose_logger calls in _flush_on_exit with _safe_log
- Remove logging from exception handler in coroutine execution loop
to prevent cascading errors during shutdown
This ensures graceful shutdown even when logging handlers are closed,
which is common during process termination.
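
A rough sketch of the safe-logging idea described above, for illustration only. The function name `safe_log`, its signature, and the boolean return value are assumptions for this example and are not taken from the actual `_safe_log` implementation; only the suppressed exception types (ValueError, OSError, AttributeError) come from the message.

```python
import logging


def safe_log(logger: logging.Logger, level: str, message: str) -> bool:
    """Log `message` at `level`; return False instead of raising if logging fails.

    Hypothetical helper: suppresses the errors that can surface when
    handlers or their streams are already closed during interpreter shutdown.
    """
    try:
        # Look up logger.debug / logger.info / ... by name and call it.
        getattr(logger, level)(message)
        return True
    except (ValueError, OSError, AttributeError):
        # Nothing useful can be done at shutdown; drop the message.
        return False
```

A flush-on-exit routine could then call `safe_log(logger, "debug", "...")` in place of direct logger calls, so a closed handler cannot interrupt the flush loop.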
* feat(anthropic/chat/transformations): for claude-4-5-sonnet and opus-4-1 support passing structured output to anthropic api
* docs: document new feature
* fix: fix output format
* fix: cleanup
* fix(transformation.py): conditionally pass in json tool call
* fix: support ARIZE_SPACE_ID instead of ARIZE_SPACE_KEY
* docs(arize_integration.md): cleanup arize docs
* feat(callback_info_helpers.tsx): allow setting arize space id via ui
* fix: fix linting error
* fix(opentelemetry.py): working arize phoenix root span tracing
* docs: Add mini-swe-agent to projects page
Add mini-swe-agent to the documentation projects page.
mini-swe-agent is a minimal AI coding agent that resolves >70% of
GitHub issues in SWE-bench, built on LiteLLM for model flexibility.
- Added projects/mini-swe-agent.md documentation
- Updated sidebars.js to include mini-swe-agent in projects list
* docs: Update Singularity to Apptainer in mini-swe-agent.md
* Add openai metadata field in the request
* Add docs related to openai metadata
* Add utils
* test_completion_openai_metadata[True]
* Added support for thought signature for gemini 3 in responses api (#16872)
* Added support for thought signature for gemini 3
* Update docs with all supported endpoints and cost tracking
* Added config based routing support for batches and files
* fix lint errors
* Litellm anthropic image url support (#16868)
* Add image as url support to anthropic
* fix mypy errors
* fix tests
* Fix: Populate spend_logs_metadata in batch and files endpoints (#16921)
* Add spend-logs-metadata to the metadata
* Add tests for spend logs metadata in batches
* use better names
* Remove support for penalty param for gemini 3 (#16907)
* Remove support for penalty param
* remove hallucinated model names
* fix mypy/test errors
* fix tests
* fix too many lines error
* fix too many lines error
* Add config for cicd test case
* Fix final tests
* fix batch tests
* fix batch tests
* attempt to implement the passthrough feature
* Formatting and small change
* Fix formatting
* Format test file
---------
Co-authored-by: Xiaohan Fu <xiaohan@grayswan.ai>
* thought signature tool call id
* [stripe] refactor and tests
* [stripe] remove md and move to factory
* [stripe] remove redundant test
* [stripe] ran black formatting
* [stripe] add thought signature docs
* [stripe] remove unused import
* add AWS fields for KeyManagementSettings
* docs IAM roles
* use aws iam auth on secret manager v2
* fix: load_aws_secret_manager
* test_secret_manager_with_iam_role_settings