4297 Commits

Author SHA1 Message Date
ripepersimmon be970735de feat: Add gemini-3-pro-image-preview model support for imageSize parameters (#17019)
- Add model identifier to FLASH_IMAGE_PREVIEW_MODEL_IDENTIFIERS
- Add imageSize parameter support (1K, 2K, 4K) with GeminiImageSize type
- Add tests for imageSize parameter transformation
- Update documentation with new model
2025-11-25 19:38:29 -08:00
Igal Boxerman e6e1e8fca4 feat(pillar): add automatic LiteLLM context headers (#17076)
- Automatically pass LiteLLM virtual key context as X-LiteLLM-* headers
- Includes key_alias, user_id, team_id, org_id, and user_email
- No configuration required - always enabled for application/user tracking
- Excludes sensitive data (metadata, API tokens) for security
- Add comprehensive tests (30 tests, all passing)
- Update documentation with header details
2025-11-25 19:35:39 -08:00
Carlo Alberto Ferraris b50fcc4b56 vertex ai: use the correct domain for the global location when counting tokens (#17116) 2025-11-25 19:22:20 -08:00
Sameer Kankute cd65a84abd Merge pull request #16844 from Chesars/fix/response-format-to-text-format-bridge-conversion
fix: Support response_format parameter in completion -> responses bridge
2025-11-26 08:51:09 +05:30
Ishaan Jaff 5c192a23c3 [Feat] Add new RAG API on LiteLLM AI Gateway (#17109)
* init RAG api types

* add RAG endpoints

* init main.py for RAG ingest API

* init RecursiveCharacterTextSplitter

* add BaseRAGIngestion

* fix OpenAIRAGIngestion

* fix img handler

* init OpenAIRAGIngestion

* init BedrockRAGIngestion

* init BedrockRAGIngestion

* init rag tests

* init BedrockVectorStoreOptions

* implement BedrockRAGIngestion

* add BaseRAGAPI

* add endpoint for RAG ingest

* add ingest RAG endpoints

* add test doc

* add parse_rag_ingest_request

* update endpoints

* docs add docs for new RAG API

* fix qa check

* fix linting

* docs fix

* docs

* add max depth checks

* docs anthropic
2025-11-25 17:54:29 -08:00
Kerem Turgutlu 8637d74e17 include server_tool_use in streaming usage (#16826)
* include server_tool_use in streaming usage

* add test
2025-11-25 14:50:17 -08:00
Ishaan Jaff be712908a3 [Feat] Add OpenAI compatible bedrock imported models. - qwen etc (#17097)
* test_bedrock_openai_imported_model

* AmazonBedrockOpenAIConfig

* add openai route for bedrock

* docs fix

* fix code qa check
2025-11-25 12:20:39 -08:00
Sameer Kankute 67622fb040 Add day 0 support for anthropic new feat (#17091)
* Added tool search support for anthropic

* Add programmatic tool calling support

* Add tool use input examples support

* Add anthropic effort param support

* Add anthropic effort param support

* Add blog for new features

* fix mypy and lint errors

* fix mypy and lint errors

* fix mypy and lint errors

* fix mypy and lint errors

* Add better handling

* Add better handling
2025-11-25 11:28:47 -08:00
Sameer Kankute 3249f6dd2d Merge pull request #17070 from BerriAI/litellm_add_vertex_ai_image_support
Add vertex ai image gen support for both gemini and imagen models
2025-11-26 00:04:03 +05:30
Sameer Kankute 83a9dcd2d2 Merge pull request #16886 from BerriAI/litellm_anthopic_azure_support
Added support for Azure Anthropic models via chat completion
2025-11-26 00:03:52 +05:30
Sameer Kankute 59bcf079fb Merge pull request #17078 from BerriAI/litellm_add_search_logging
Add search API logging and cost tracking in LiteLLM Proxy
2025-11-25 23:59:41 +05:30
Sameer Kankute 2e50db81a5 Merge pull request #17071 from BerriAI/litellm_azure_gpt_5_reasoning
Fix `reasoning_effort="none"` not working on Azure for GPT-5.1
2025-11-25 23:59:25 +05:30
Krish Dholakia 00e17c81a1 Add enforce user param functionality (#17088)
* feat: Add reject_metadata_tags to proxy config

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

* Refactor: Rename reject_metadata_tags to reject_clientside_metadata_tags

Co-authored-by: krrishdholakia <krrishdholakia@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-11-25 09:36:24 -08:00
Sameer Kankute e0396e5fa7 Merge pull request #17082 from BerriAI/main
merge main
2025-11-25 18:49:52 +05:30
Sameer Kankute e2f2ccd913 Add tests related messages api 2025-11-25 18:45:51 +05:30
Sameer Kankute 67d69d12b0 Add cost tracking and logging support 2025-11-25 17:14:59 +05:30
Sameer Kankute c149ade6a8 Add tests related to reasoning param none 2025-11-25 13:57:15 +05:30
Sameer Kankute a50083a87b Remove none support from reasoning param 2025-11-25 13:56:30 +05:30
Sameer Kankute 883cfaeeaf Add tests 2025-11-25 13:32:13 +05:30
wcyat 6dcb5425a5 fix(vertex): fix CreateCachedContentRequest enum error (#16965)
* feat: add _fix_enum_types function to remove enums from non-string fields in schema

* test: add test for _fix_enum_types function to validate enum removal from non-string fields
2025-11-24 21:24:29 -08:00
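
Note on 6dcb5425a5: the message describes removing enums from non-string fields in a schema. A minimal sketch of that idea follows. The function name `strip_non_string_enums`, the case-insensitive type comparison, and the choice to return a copy are assumptions for illustration; this is not the `_fix_enum_types` code from the commit.

```python
import copy
from typing import Any, Dict


def strip_non_string_enums(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `schema` with `enum` removed from non-string fields.

    Hypothetical helper; only illustrates the behaviour named in the commit.
    """
    result = copy.deepcopy(schema)  # leave the caller's schema untouched

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            # Only treat `enum` as a keyword when its value is a list, so a
            # property that happens to be *named* "enum" is left alone.
            is_enum_keyword = isinstance(node.get("enum"), list)
            is_string = str(node.get("type", "")).lower() == "string"
            if is_enum_keyword and not is_string:
                del node["enum"]
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


example = {
    "type": "object",
    "properties": {
        "color": {"type": "string", "enum": ["red", "green"]},
        "count": {"type": "integer", "enum": [1, 2]},
    },
}
cleaned = strip_non_string_enums(example)
```

In this sketch `color` keeps its enum because it is a string field, while `count` loses it.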
Dmitrii Komarov 046b7efbbe Make Bedrock image generation more consistent (#17021) 2025-11-24 20:58:01 -08:00
Saar wintrov cfd35d3b14 Metadata: fix 401 when audio/transcriptions (#17023)
* Metadata: fix 401 when audio/transcriptions

* check if str, CR fixes
2025-11-24 20:56:27 -08:00
Cesar Garcia 650b18974f fix(gemini): skip thinking config for image models (#17027)
* fix(gemini): exclude image models from automatic thinking_level parameter (#17013)

- gemini-3-pro-image-preview does not support thinking_level parameter
- Added check to skip adding thinkingConfig for models containing "image"
- Fixes BadRequestError: "Thinking level is not supported for this model"
- Only affects automatic default behavior, user can still pass reasoning_effort explicitly

Fixes #17013

* test: add tests for gemini-3 image models thinking_level exclusion

* update docs
2025-11-24 20:54:12 -08:00
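
Note on 650b18974f: the message says the automatic thinking config is skipped for models containing "image", and that an explicit `reasoning_effort` from the user is unaffected. The check can be sketched as below. The function name and the case-insensitive match are assumptions, and the sketch covers only the default-path decision, not how LiteLLM builds the request.

```python
from typing import Optional


def wants_default_thinking_config(model: str, reasoning_effort: Optional[str]) -> bool:
    """Decide whether the *automatic* thinking config should be added.

    Hypothetical helper mirroring the rule stated in the commit message.
    """
    if reasoning_effort is not None:
        # The user chose a value explicitly, so the default path does not apply.
        return False
    # Image models reject the thinking level parameter, so skip the default.
    return "image" not in model.lower()
```

Under these assumptions, `wants_default_thinking_config("gemini-3-pro-image-preview", None)` is false and `wants_default_thinking_config("gemini-3-pro-preview", None)` is true.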
yuneng-jiang d2b3ef0667 Add aws_bedrock_runtime_endpoint into Credential Types (#17053) 2025-11-24 20:48:51 -08:00
yuneng-jiang 3f5a34d72c Deleting a user from team deletes key user created for team (#17057) 2025-11-24 20:47:43 -08:00
yuya_matsuba 262fb742d2 Fix: Distinguish permission errors from idempotent errors in Prisma migrations (#17064)
* fix: distinguish permission errors from idempotent errors in Prisma migrations

* style: apply Black formatting and fix line length issues
2025-11-24 20:41:44 -08:00
Raghav Jhavar bd8196f982 (fix) propagate x-litellm-model-id in responses (#16986)
* propagate model id on errors too

* make it work for messages and streaming

* fix

* cleanup

* cleanup

* final

* cleanup

* clean up method name and fix responses api streaming

* remove comment
2025-11-24 20:40:43 -08:00
Sameer Kankute 282ac87617 Add temperature support for 5.1 models (#17011) 2025-11-24 18:54:22 -08:00
Sameer Kankute fc219c7db8 Integrate eleven labs text-to-speech (#16573)
* Add elevenlabs tts support

* fix mypy error

* add simple usage in docs
2025-11-24 18:49:30 -08:00
Sameer Kankute 35bfcac3bc Add header forwarding in embedding (#16869) 2025-11-24 18:48:10 -08:00
Sameer Kankute c6fbdc7dc5 fix bedrock passthrough auth issue (#16879) 2025-11-24 18:44:59 -08:00
Sameer Kankute 3b6c170739 Fix the azure auth format for videos (#17009)
* fix the azure auth in correct format

* Add litellm param in validate_environment method

* fix lint errors
2025-11-24 17:40:55 -08:00
Sameer Kankute 629404a100 Add cost tracking for cohere embed passthrough endpoint (#17029)
* Add cost tracking for cohere embed passthrough endpoint

* update passthrough code

* update passthrough code

* fixed lint and mypy errors
2025-11-24 17:39:26 -08:00
Ishaan Jaff 4e195d639e [Feat] New API - Claude Skills API (Anthropic) (#17042)
* init readme

* init BaseSkillsAPIConfig

* init types for Skills APIs

* add feat: add create, list, retrieve skills

* add base skills config

* add BaseSkillsAPIConfig

* add get_provider_skills_api_config

* init skills

* add ANTHROPIC_SKILLS_API_BETA_VERSION

* init skills APIs

* working list, get skills

* working e2e skills API anthropic API

* add _prepare_skill_multipart_request

* add skills routes to llm api routes

* router _initialize_skills_endpoints

* add fix skills endpoints

* add convert_upload_files_to_file_data

* fix routing skills endpoints

* fix route llm request

* Potential fix for code scanning alert no. 3806: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 3809: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix ruff checks

* test_initialize_skills_endpoints

* fix claude skills mypy linting errors

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-11-24 15:01:40 -08:00
Lior Drihem 62b84d6aad Prompt security litellm (#16365)
* add prompt security guardrails provider

* cosmetic

* small

* add file sanitization and update context window

* add pdf and OOXML files support

* add system prompt support

* add tests and documentation

* remove print

* fix PLR0915 Too many statements (96 > 50)

* cosmetic

* fix mypy error

* Fix failed tests due to naming conflict of responses directory with same-named pip package

* Fix mypy error: use 'aembedding' instead of 'embeddings' for async embedding call type

* Fix: Install enterprise package into Poetry virtualenv for tests

The GitHub Actions workflow was installing litellm-enterprise to system Python
using 'python -m pip install -e .', but tests run in Poetry's virtualenv using
'poetry run pytest'. This caused ImportError for enterprise package types.

Changed to 'poetry run pip install -e .' so the package is available in the
same virtualenv where pytest executes.

Fixes enterprise test collection errors in GitHub Actions CI.

* Move Prompt Security guardrail tests to tests/test_litellm/

Per reviewer feedback, move test_prompt_security_guardrails.py from
tests/guardrails_tests/ to tests/test_litellm/proxy/guardrails/ so
it will be executed by GitHub Actions workflow test-litellm.yml.

This ensures the Prompt Security integration tests run in CI.

---------

Co-authored-by: Ori Tabac <oritabac@prompt.security>
Co-authored-by: Vitaly Neyman <vitaly@prompt.security>
2025-11-24 11:44:20 -08:00
John Lathouwers 61fed95f8c OCI Provider: Fix pydantic validation errors during tool call with streaming. (#16899)
* logic to handle missing required fields in OCI streaming tool calls

* Fix test mocks
2025-11-23 22:03:44 -08:00
yuneng-jiang adfdcf1d61 [Fix] UI - Hide Default Team Settings From Proxy Admin Viewers (#16900)
* Add fallback in sort to prevent NoneType and str comparison

* Hide Default Team Settings from Proxy Admin Viewers

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-11-23 22:01:38 -08:00
yuneng-jiang 013dcd837f Change provider create fields to JSON (#16985) 2025-11-23 21:57:22 -08:00
soo-jin.kim a2a45ce8c9 fix: prevent duplicate spend logs in Responses API for non-OpenAI providers (#16992)
* fix: prevent duplicate spend logs in Responses API for non-OpenAI providers

Fixes #15740

This fixes a logging duplication bug where using kwargs.pop() removed
the litellm_logging_obj before passing kwargs to internal acompletion()
calls, causing duplicate spend log entries for providers without native
Responses API support (Anthropic, Gemini, etc).

By changing from pop() to get(), the logging object is preserved and
reused across the internal completion call, preventing duplicate entries
and maintaining correct cost tracking.

* test: add test for logging object preservation in responses API

Verify that litellm_logging_obj is preserved in kwargs when calling
responses(), ensuring no duplicate spend log entries are created.
2025-11-23 21:57:01 -08:00
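
Note on a2a45ce8c9: the pop() versus get() difference described in the message can be reproduced with a toy model. Everything here except the `litellm_logging_obj` key is hypothetical (the `LoggingObj` class, the two bridge functions, and the one-entry-per-object rule). It only shows why removing the object from kwargs makes the inner call create and log with a second one.

```python
from typing import Any, List


class LoggingObj:
    """Hypothetical per-request logging object that logs spend at most once."""

    def __init__(self, sink: List[str]) -> None:
        self.sink = sink
        self.logged = False

    def log_spend(self) -> None:
        if not self.logged:
            self.sink.append("spend")
            self.logged = True


def inner_completion(sink: List[str], **kwargs: Any) -> None:
    # Reuse the caller's logging object if it was passed, else create a new one.
    obj = kwargs.get("litellm_logging_obj") or LoggingObj(sink)
    obj.log_spend()


def bridge_with_pop(sink: List[str], **kwargs: Any) -> None:
    obj = kwargs.pop("litellm_logging_obj")  # object is removed from kwargs
    inner_completion(sink, **kwargs)         # inner call creates its own and logs
    obj.log_spend()                          # outer object logs again: duplicate


def bridge_with_get(sink: List[str], **kwargs: Any) -> None:
    obj = kwargs.get("litellm_logging_obj")  # object stays in kwargs
    inner_completion(sink, **kwargs)         # inner call reuses it and logs once
    obj.log_spend()                          # already logged: no duplicate


pop_sink: List[str] = []
bridge_with_pop(pop_sink, litellm_logging_obj=LoggingObj(pop_sink))

get_sink: List[str] = []
bridge_with_get(get_sink, litellm_logging_obj=LoggingObj(get_sink))
```

After running, `pop_sink` holds two entries and `get_sink` holds one.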
prawaan 7cc92d1ced fix(vertex_ai): handle global location in context caching (#16997)
- Add conditional check for 'global' vertex_location
- Use aiplatform.googleapis.com (no prefix) for global endpoint
- Apply fix to both v1 and v1beta1 APIs
- Matches existing behavior in regular completion calls

Fixes context caching 404 errors when using global location.
Regular completion already handles global correctly, this brings
context caching in line with that behavior.

Related: #11190, #9234

Co-authored-by: prawaan-singh <prawaan.singh@thoughtspot.com>
2025-11-23 21:55:08 -08:00
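
Note on 7cc92d1ced: the host selection described in the message can be sketched as follows. The function name `vertex_api_host` is hypothetical. The hostnames follow the commit message (no prefix for `global`) and the usual regional pattern of `<location>-aiplatform.googleapis.com`.

```python
def vertex_api_host(vertex_location: str) -> str:
    """Pick the Vertex AI host for a location (illustrative helper only)."""
    if vertex_location == "global":
        # The global endpoint has no location prefix.
        return "aiplatform.googleapis.com"
    # Regional endpoints are prefixed with the location.
    return f"{vertex_location}-aiplatform.googleapis.com"
```

For example, `vertex_api_host("us-central1")` gives `us-central1-aiplatform.googleapis.com`, while `vertex_api_host("global")` gives `aiplatform.googleapis.com`.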
YutaSaito b72b49757e feat: add backend support for OAuth2 auth_type registration via UI (#17006) 2025-11-23 21:52:18 -08:00
YutaSaito f0b10b854b chore: remove unused MCP_PROTOCOL_VERSION_HEADER_NAME constant (#17008) 2025-11-23 21:51:11 -08:00
YutaSaito 06f2ecef42 feat: tool permission argument check (#16982) 2025-11-22 19:21:25 -08:00
Krrish Dholakia cc5ecfd479 test: fix tests 2025-11-22 16:50:50 -08:00
Krish Dholakia 270d23939e (fix) litellm_logging.py: fix mcp tool call response logging + (fix) responses_bridge: remove unmapped param error mid-stream - allows gpt-5 web search to work via responses api in .completion() (#16946)
* fix: fix getting mcp servers

* fix(litellm_logging.py): handle list objects for final response in standard logging payload

Fixes issue where mcp tool call response wouldn't show up

* fix(litellm_responses_transformation/): remove invalid item error for unmapped objects - breaks stream and there's no real value to this as outside of a few of them, not all can be mapped to chat completions

resolves error for web search calls via chat completions to responses api
2025-11-22 15:48:32 -08:00
Krish Dholakia b9f2cc1c98 Model Armor - Logging guardrail response on llm responses (#16977)
* Litellm dev 11 22 2025 p1 (#16975)

* fix(model_armor.py): return response after applying changes

* fix: initial commit adding guardrail span logging to otel on post-call runs

sends it as a separate span right now, need to include in the same llm request/response span

* fix(opentelemetry.py): include guardrail in received request log + set input/output fields on parent otel span instead of nesting it

allows request/response to be seen easily on observability tools

* fix(model_armor.py): working model armor logging on post call events

* fix: fix exception message

* fix(opentelemetry.py): add backwards compatibility for litellm_request

allow users building on the spec change to use previous spec
2025-11-22 15:44:28 -08:00
Krish Dholakia e11d34eb69 Permission Management - disable global guardrails by key/team (#16983)
* feat(teams.py): param for disabling guardrails by team

allows use-case where you don't run global guardrails for team - only run team-specific guardrails

* feat(custom_guardrail.py): add support for disabling global guardrails

only run guardrails requested for in the request/key/team

* feat: support adding disable_global_guardrails to metadata if present in key/team metadata

* feat(create_key_button.tsx): new disable global guardrails field

* feat(key_edit_view.tsx): support disabling global guardrails on key edit

* feat(teams.tsx): add disable global guardrails on create team on UI

* feat(team_info.tsx): allow disabling global guardrails on team update
2025-11-22 15:43:50 -08:00
Ishaan Jaffer 1fc3baf864 e2e ui testing fixes 2025-11-22 14:30:00 -08:00
yuneng-jiang 825f61b452 Remove expired proxy admin keys from cache (#16894) 2025-11-22 14:23:28 -08:00
yuneng-jiang 22fd323d6b Calling team/permissions_list and team/permissions_update now returns 404 with non-existent team (#16835) 2025-11-22 14:21:58 -08:00