* feat: add GMI Cloud provider support
Add GMI Cloud as an OpenAI-compatible provider with:
- Provider configuration in providers.json
- Documentation page with usage examples
- Model pricing for 16 models (Claude, GPT, DeepSeek, Gemini, etc.)
- Sidebar entry for docs navigation
* Add gmi_cloud to provider_endpoints_support.json
Add provider entry to pass the CI validation check that ensures all
providers in openai_like/providers.json are documented.
* Fix provider key: gmi_cloud -> gmi
Match the provider key with providers.json
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* add search provider for brave search api
Introduces a minimal implementation of the Brave Search API as a search provider. Additionally, this PR introduces a test file to ensure the provider works properly, and numerous other smaller changes (e.g., changes to docs to mention the new option).
* Update transformation.py
Add documentation explaining the difference between model formats:
- `gemini/model` → Gemini API (simple API key)
- `vertex_ai/model` → Vertex AI (GCP credentials)
- `model` (no prefix) → defaults to Vertex AI
This addresses user confusion when a model without a prefix requires
GCP authentication instead of simple API key auth.
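The prefix rules above can be sketched roughly as follows. This is an
illustration only, not LiteLLM's routing code: `resolve_route` and the
`AUTH` table are hypothetical names, and the model names are placeholders.

```python
# Hypothetical sketch of the prefix rules described above; not LiteLLM's routing code.
AUTH = {"gemini": "API key", "vertex_ai": "GCP credentials"}


def resolve_route(model: str) -> tuple[str, str]:
    """Return (provider, bare model name) for a model string."""
    prefix, sep, rest = model.partition("/")
    if sep and prefix in AUTH:
        return prefix, rest
    # No recognised prefix: fall back to Vertex AI, which needs GCP credentials.
    return "vertex_ai", model


for name in ("gemini/gemini-2.0-flash", "vertex_ai/gemini-2.0-flash", "gemini-2.0-flash"):
    provider, bare = resolve_route(name)
    print(f"{name} -> {provider} ({AUTH[provider]})")
```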
Ref #8424
Update Pillar Security integration to use the generic_guardrail_api
instead of the dedicated pillar guardrail type. This aligns with
the Generic Guardrail API specification introduced in previous PRs.
Changes:
- Rewrite pillar_security.md with new generic_guardrail_api config
- Add Pillar Security example to generic_guardrail_api.md
- Add Pillar Security to quick_start.md guardrails examples
Related PRs: #17175, #18647, #18932, #19023
The documentation incorrectly used `vertex_region` as the parameter name,
but the actual parameter expected by LiteLLM is `vertex_location` as defined
in VertexPassThroughCredentials and other type definitions.
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Avoid attaching tool calls when a call_id already exists
* fix: Prevent MCP responses from reviving past tool calls via previous_response_id
* test: Parametrize MCP streaming test to cover OpenAI and Anthropic models
* test: Fail MCP streaming test when LiteLLM logs errors during follow-up calls
* test: Let MCP tool-execution mock accept new kwargs for streaming tests
* chore: fix lint error
* docs: Add Google Workload Identity Federation (WIF) documentation to Vertex AI (#19320)
- Added new section documenting WIF support for Vertex AI authentication
- Included SDK and Proxy configuration examples
- Added sample WIF credentials file format for AWS federation
- Mentioned LLM Credentials UI as an alternative for credential management
- Added link to Google Cloud WIF documentation
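For orientation, a WIF credentials file for AWS federation generally
follows Google's `external_account` format. The sketch below builds one
in Python; every upper-case value (PROJECT_NUMBER, POOL_ID, PROVIDER_ID,
SA_EMAIL) is a placeholder, and the sample added to the docs may differ
in detail.

```python
import json

# Sketch of an external_account credentials file for AWS federation.
# All upper-case segments are placeholders, not real identifiers.
wif_credentials = {
    "type": "external_account",
    "audience": (
        "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/"
        "workloadIdentityPools/POOL_ID/providers/PROVIDER_ID"
    ),
    "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
    "token_url": "https://sts.googleapis.com/v1/token",
    # Optional: impersonate a service account after the token exchange.
    "service_account_impersonation_url": (
        "https://iamcredentials.googleapis.com/v1/projects/-/"
        "serviceAccounts/SA_EMAIL:generateAccessToken"
    ),
    "credential_source": {
        "environment_id": "aws1",
        "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
        "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
        "regional_cred_verification_url": (
            "https://sts.{region}.amazonaws.com"
            "?Action=GetCallerIdentity&Version=2011-06-15"
        ),
    },
}

# Serialise to the JSON text that would be stored in the credentials file.
print(json.dumps(wif_credentials, indent=2))
```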
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* fix(bedrock): deduplicate tool calls in assistant history (#15178)
* fix(types): add missing Set import to factory.py
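A minimal sketch of the deduplication idea, assuming tool calls are
dicts keyed by an `id` field and that the first occurrence wins. The
helper name `dedupe_tool_calls` is hypothetical, and the actual fix in
factory.py may work differently.

```python
from typing import Any


def dedupe_tool_calls(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop tool calls whose id was already seen, keeping the first occurrence."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for call in tool_calls:
        call_id = call.get("id")
        if call_id is not None and call_id in seen:
            continue  # duplicate of an earlier call in the history
        if call_id is not None:
            seen.add(call_id)
        unique.append(call)  # calls without an id are kept as-is
    return unique


history = [
    {"id": "call_1", "name": "get_weather"},
    {"id": "call_1", "name": "get_weather"},
    {"id": "call_2", "name": "get_time"},
]
print(dedupe_tool_calls(history))
```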
---------
Co-authored-by: Yuta Saito <uc4w6c@bma.biglobe.ne.jp>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: YutaSaito <36355491+uc4w6c@users.noreply.github.com>
* feat(gemini): add opt-in support for responseJsonSchema
Add support for Gemini's native responseJsonSchema parameter which uses
standard JSON Schema format instead of OpenAPI-style responseSchema.
Benefits of responseJsonSchema (Gemini 2.0+ only):
- Standard JSON Schema format (lowercase types)
- Supports additionalProperties for stricter validation
- Better compatibility with Pydantic's model_json_schema()
- No propertyOrdering required
Usage:
```python
response_format={
    "type": "json_schema",
    "json_schema": {"schema": {...}},
    "use_json_schema": True,  # opt-in
}
```
This is backwards compatible - existing code continues to use
responseSchema by default.
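To illustrate the difference between the two formats, the sketch below
converts a standard JSON Schema into an OpenAPI-style shape. It assumes
the legacy format uses upper-case type names, needs `propertyOrdering`,
and does not accept `additionalProperties`. The helper
`to_openapi_style` is hypothetical and is not the transformation
LiteLLM performs.

```python
from typing import Any


def to_openapi_style(schema: dict[str, Any]) -> dict[str, Any]:
    """Convert a JSON Schema dict to an assumed OpenAPI-style equivalent."""
    converted: dict[str, Any] = {}
    for key, value in schema.items():
        if key == "additionalProperties":
            continue  # assumed unsupported in the legacy format
        if key == "type" and isinstance(value, str):
            converted[key] = value.upper()  # "object" -> "OBJECT"
        elif key == "properties":
            converted[key] = {name: to_openapi_style(sub) for name, sub in value.items()}
            converted["propertyOrdering"] = list(value)  # keep declaration order
        elif key == "items" and isinstance(value, dict):
            converted[key] = to_openapi_style(value)
        else:
            converted[key] = value
    return converted


json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name"],
    "additionalProperties": False,
}
print(to_openapi_style(json_schema))
```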
Closes #16340
* docs: add documentation for use_json_schema parameter
Document the new use_json_schema option for Gemini 2.0+ models
in the JSON Mode documentation.
* refactor(gemini): use responseJsonSchema by default for Gemini 2.0+
Remove opt-in flag `use_json_schema` and automatically detect model version:
- Gemini 2.0+: uses responseJsonSchema (standard JSON Schema, supports additionalProperties)
- Gemini 1.5: uses responseSchema (OpenAPI format, legacy)
This follows LiteLLM's philosophy of abstracting provider differences -
users write the same code regardless of model version.
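The version check can be sketched as below. This assumes the major
version can be read from the model name and that unrecognised names
fall back to the legacy parameter. `schema_param_for` is a hypothetical
name, not the function added in this commit.

```python
import re


def schema_param_for(model: str) -> str:
    """Pick the Gemini schema parameter based on the model's major version."""
    match = re.search(r"gemini-(\d+)", model)
    if match is None:
        return "responseSchema"  # assumption: unknown models use the legacy format
    major = int(match.group(1))
    return "responseJsonSchema" if major >= 2 else "responseSchema"


for name in ("gemini-1.5-pro", "gemini-2.0-flash", "vertex_ai/gemini-2.5-pro"):
    print(name, "->", schema_param_for(name))
```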
* test(vertex): update json_schema tests to accept both responseSchema formats
Gemini 2.x+ uses responseJsonSchema while Gemini 1.x uses responseSchema.
Update tests to accept both formats since litellm now auto-selects based
on model version.
* docs: update UI contributing guide with correct commands
- Replace outdated proxy_cli.py command with poetry run litellm
- Add config.yaml example with required settings
- Clarify that UI comes pre-built in the repo
- Add two development options: Build Mode and Dev Mode (hot reload)
- Note about redirect issues in Dev Mode
* docs: add hot reload login flow and PR submission section
- Document the 3000 -> 4000 -> 3000 login flow for hot reload
- Reorder: Hot Reload as Option A, Build Mode as Option B
- Add section 4 on submitting PRs
- Add note that UI changes don't require tests
* Update login flow navigation URL in contributing.md