litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 07:33:58 +00:00

Author	SHA1	Message	Date
Shivam Rawat	74a9dbd8ac	Merge pull request #23413 from BerriAI/docs_policyy_builder v1.82.0 promote to stable	2026-03-11 19:19:15 -07:00
shivam	0c7c0a93ed	v1.82.0 promote to stable	2026-03-11 19:15:21 -07:00
Shivam Rawat	50e68dc387	Merge pull request #23410 from BerriAI/docs_policyy_builder Docs policyy builder	2026-03-11 19:03:55 -07:00
Mr. Ånand	05fba27b0c	Add Retool Assist tutorial with LiteLLM Proxy to docs (#21952) * docs: add Retool Assist integration guide - Add tutorials/retool_assist.md with setup instructions - Add screenshots: Resources screen, Custom Provider config, resource query test - Add retool_assist to AI Tools sidebar Co-authored-by: Cursor <cursoragent@cursor.com> * docs: refine Retool Assist guide layout Co-authored-by: Cursor <cursoragent@cursor.com> * docs: update Retool Assist guide with new screenshots - Add Resources screen after step 1 - Update AI category and LiteLLM config modal images Co-authored-by: Cursor <cursoragent@cursor.com> * Apply suggestion from @greptile-apps[bot] Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * updated blank line * Added video & gifs Made-with: Cursor --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-11 13:17:08 -07:00
Chesars	f9a538b583	fix(docs): close unclosed code block before Examples heading	2026-03-11 14:15:54 -03:00
Cesar Garcia	274bf42493	Update docs/my-website/docs/providers/openai.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-11 14:13:56 -03:00
Chesars	9e7a6a73ed	docs: remove duplicate gpt-5.4 tip block	2026-03-11 13:48:47 -03:00
Chesars	01a6c707a3	docs: restore gpt-5.4 reasoning_effort tip lost during rebase	2026-03-11 13:48:07 -03:00
Chesars	d0d09e037e	docs: clarify when to use openai/responses/ prefix for built-in tools The existing documentation for the Responses API bridge only showed examples with models that have `mode: responses` (like o3-deep-research), which work automatically. This update clarifies that models with `mode: chat` (like gpt-4o, gpt-5) require the `openai/responses/` prefix to use built-in tools like web_search_preview. Changes: - Explain the `mode` property from model_prices_and_context_window.json - List models with mode: responses vs mode: chat - Add example showing the common error and how to fix it - Add SDK example using the prefix with gpt-4o - Update proxy example with both automatic and prefix-based configs - Fix invalid trailing comma in original JSON example	2026-03-11 13:47:57 -03:00
michelligabriele	24ad510617	feat(mcp): add AWS SigV4 auth support in UI and fix credential merge on edit (#23282)	2026-03-11 09:43:28 -07:00
Sameer Kankute	20980f6c26	Merge pull request #23322 from BerriAI/litellm_gemini_embedding_2_support [Feat]: Add support for gemini embedding 2 preview	2026-03-11 19:30:09 +05:30
michelligabriele	db4cd87979	docs(web_fetch): add newer Claude models to supported models list (#23251) Add Claude Opus 4.6, Sonnet 4.6, Opus 4.5, Sonnet 4.5, and Haiku 4.5 to the web fetch supported models documentation. These models were missing from the list despite supporting the web_fetch tool.	2026-03-11 19:09:28 +05:30
Sameer Kankute	f243e5615f	Merge branch 'main' into litellm_oss_staging_03_10_2026	2026-03-11 18:50:03 +05:30
Sameer Kankute	43217c8a4b	Merge branch 'main' into litellm_oss_staging_03_10_2026	2026-03-11 18:32:17 +05:30
Sameer Kankute	3dab62023c	Merge branch 'main' into litellm_oss_staging_03_04_2026	2026-03-11 18:31:20 +05:30
Sameer Kankute	1c144fc896	Add embedding model documentation	2026-03-11 11:02:49 +05:30
Cesar Garcia	260c37d670	Merge pull request #21398 from Chesars/feat/openai-like-responses-api feat(openai_like): add Responses API support to JSON providers	2026-03-11 00:15:06 -03:00
Chesars	a5f0e1a741	docs: expand aliases section in add_model_pricing guide Add usage example with concrete model entry, explanation of load-time expansion, and cross-reference to model_alias_map to clarify the difference between the two features.	2026-03-10 22:42:18 -03:00
Chesars	c0dbff21a6	feat: add model cost aliases expansion support	2026-03-10 22:42:18 -03:00
Cesar Garcia	3d2df7e8b5	Revert "feat: add model_cost aliases expansion support"	2026-03-10 22:39:19 -03:00
shivam	864bcd7c57	policy builder docs	2026-03-10 18:16:25 -07:00
shivam	0bf9945969	docs: fix REDIS_CLUSTER_NODES example formatting Made-with: Cursor	2026-03-10 18:11:19 -07:00
shivam	86d02d107a	docs update	2026-03-10 18:08:30 -07:00
Krish Dholakia	8bcc8fe1e2	Rename 'Team-Based Guardrails' to 'Team Bring-Your-Own Guardrails' (#23307) Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2026-03-10 17:49:09 -07:00
Shivam Rawat	a71ba39b78	Revert "policy builder"	2026-03-10 15:38:59 -07:00
Cesar Garcia	5f5e47fc24	Merge pull request #22138 from Chesars/fix/unify-finish-reason-mapping fix(completion): unify finish_reason mapping to OpenAI-compatible values	2026-03-10 19:29:04 -03:00
Cesar Garcia	3bf91ed9fe	Merge pull request #23258 from Chesars/docs/openai-tool-search docs(responses): add tool_search & namespaces docs for gpt-5.4	2026-03-10 18:51:16 -03:00
Chesars	d501c33a9d	feat(types): expose native_finish_reason in provider_specific_fields When a provider's finish_reason is mapped to a different OpenAI-compatible value (e.g. "MALFORMED_FUNCTION_CALL" → "stop"), the original value is now preserved in choices[].provider_specific_fields["native_finish_reason"]. This allows agent loops to distinguish between different stop conditions without breaking the unified OpenAI-compatible finish_reason mapping. Also returns a defensive copy from get_finish_reason_mapping() to prevent accidental mutation of the global _FINISH_REASON_MAP.	2026-03-10 18:43:51 -03:00
Chesars	e7a9c1e156	docs(responses): remove unused json import from tool search example	2026-03-10 18:41:54 -03:00
Cesar Garcia	6bca746d23	Merge pull request #21601 from Chesars/feat/model-cost-aliases feat: add model_cost aliases expansion support	2026-03-10 18:07:23 -03:00
Cesar Garcia	6a3b029066	Merge pull request #23271 from Chesars/docs/gpt54-reasoning-tools-limitation docs(openai): document gpt-5.4 reasoning_effort + tools limitation	2026-03-10 17:57:31 -03:00
milan-berri	9100e16776	docs: pip venv upgrade workflow (#23290 ) * docs: add pip/venv upgrade workflow guide - Add comprehensive guide for upgrading LiteLLM proxy via pip - Covers Prisma client regeneration and DB migration steps - Includes verification commands and troubleshooting tips - Links to existing Prisma migration troubleshooting doc * docs: clarify Python version in prisma generate command - Update example to show multiple Python versions (3.11, 3.12, 3.13) - Make it clear LiteLLM supports multiple Python versions, not just 3.11 * docs: emphasize venv activation before running commands - Add info box at top reminding users to activate venv - Include venv activation step before starting proxy (both options) - Add Windows activation command for cross-platform clarity - Make it clear all commands assume activated venv * docs: add pip_venv_upgrade to sidebar navigation - Add new page to Troubleshooting section in sidebars.js - Positioned after Performance/Latency category and before rollback - Makes the upgrade guide discoverable through docs navigation * docs: show explicit --schema flag in prisma migrate deploy - Add explicit --schema path to Option B migration command - Remove ambiguous instruction about running from litellm_proxy_extras - Include path variable guidance for clarity - Makes the command immediately runnable without directory navigation * Update docs/my-website/docs/troubleshoot/pip_venv_upgrade.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update docs/my-website/docs/troubleshoot/pip_venv_upgrade.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix: close code block and add missing section in pip_venv_upgrade.md * docs: define schema-path placeholder in verification section --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-10 13:53:54 -07:00
Shivam Rawat	97c92cc84e	Merge pull request #23287 from BerriAI/docs_flow_builder policy builder	2026-03-10 13:44:18 -07:00
Shivam Rawat	592232e835	Update docs/my-website/docs/proxy/guardrails/guardrail_pipeline_flow_builder.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-10 13:42:46 -07:00
Shivam Rawat	f3844d8356	Update docs/my-website/docs/proxy/guardrails/guardrail_pipeline_flow_builder.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-10 13:42:36 -07:00
Chesars	926a0df9b0	Merge main into feat/openai-like-responses-api Resolve conflict in perplexity/responses/transformation.py by keeping the simplified ~50 line version (PR's goal) instead of main's ~410 line version. Added supports_native_websocket() -> False from main.	2026-03-10 17:36:32 -03:00
Chesars	95ef97bd34	docs: expand aliases section in add_model_pricing guide Add usage example with concrete model entry, explanation of load-time expansion, and cross-reference to model_alias_map to clarify the difference between the two features.	2026-03-10 16:54:55 -03:00
Jason Roberts	70fca22f68	feat(panw-prisma-airs): PANW Prisma AIRS guardrail with apply_guardrail support (#22999) * feat(panw-prisma-airs): PANW Prisma AIRS guardrail with apply_guardrail support * fix(panw): honor masking and fallback behavior * fix(panw): clean up apply_guardrail MCP metadata handling * fix(panw): clean up apply_guardrail MCP metadata handling * fix(panw): harden apply_guardrail edge cases * fix(panw): apply MCP masked data on allow responses * fix(panw): scan latest developer message in anthropic mode * fix(panw): restore legacy user-only pre-call scanning * fix(panw): record apply_guardrail in applied guardrails header * fix(panw): scan developer role in legacy pre-call path * fix(panw): harden SSE parsing and narrow MCP name fallback * fix(panw): harden streaming attr lookup and document dual scans * fix(panw): fail closed on permanent 4xx and cover streaming observability	2026-03-10 12:31:31 -07:00
shivam	fa330ed96b	policy builder	2026-03-10 12:09:00 -07:00
Chesars	7ccb14cab4	feat: add model cost aliases expansion support	2026-03-10 15:57:23 -03:00
michelligabriele	ffc89e4ef6	fix(mcp): add AWS SigV4 auth for Bedrock AgentCore MCP servers (#22782 ) * fix(mcp): add AWS SigV4 auth for Bedrock AgentCore MCP servers Add aws_sigv4 auth type to MCP client via httpx.Auth subclass that signs each request with SigV4 using botocore. Enables mcp_servers config to connect to AgentCore-hosted MCP servers. * docs(mcp): add AWS SigV4 auth documentation for Bedrock AgentCore Add dedicated docs page for configuring MCP servers with AWS SigV4 authentication, update MCP overview with aws_sigv4 auth type and config example, and link from Bedrock AgentCore provider docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(mcp): address Greptile review — requires_request_body, full header signing, health check - Add requires_request_body = True to MCPSigV4Auth so httpx buffers the request body before calling auth_flow (prevents empty body hash for streaming requests) - Pass all request headers to AWSRequest for canonical SigV4 signing instead of only Content-Type - Exclude aws_sigv4 from health check skip logic since it has its own credential fields (not authentication_token) - Fix docs: mark aws_access_key_id/aws_secret_access_key as optional (falls back to boto3 credential chain) - Add test for requires_request_body flag Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>	2026-03-10 11:11:20 -07:00
Chesars	d232d0de6c	docs(openai): document gpt-5.4 reasoning_effort + tools limitation Add tip boxes explaining that gpt-5.4 does not support reasoning_effort with function tools in /v1/chat/completions, and that the responses bridge (openai/responses/gpt-5.4) should be used instead.	2026-03-10 12:04:55 -03:00
Chesars	a6cb510703	merge: resolve conflicts between main and litellm_oss_staging_03_04_2026 Resolved 14 file conflicts: - image_edits.md: combined OpenRouter + Black Forest Labs providers - utils.py: kept staging's message-level cache_control check - networking.tsx: kept export on 4 tool interfaces - tool_management_endpoints.py: kept ToolOutputPolicy import - Accepted main's version for: schema.prisma, a2a_protocol, mcp_server, _types.py, auth_checks.py, db_spend_update_writer, endpoints.py, spend_tracking_utils, a2a_endpoints, model_prices backup	2026-03-10 10:45:04 -03:00
Chesars	8fac04208d	docs(responses): add tool_search bridge examples for chat completions Add examples showing tool_search with namespaces via the chat completions bridge (openai/responses/ prefix) for both SDK and proxy.	2026-03-10 10:27:40 -03:00
Chesars	bec12db635	docs(responses): add tool_search & namespaces section for gpt-5.4 Add documentation for OpenAI's tool_search feature (Responses API) with SDK and Proxy examples showing namespace-based deferred tool loading. Closes #23206.	2026-03-10 09:50:08 -03:00
shivam	5534f77314	doc improvement	2026-03-09 15:39:27 -07:00
yuneng-jiang	8ecac84789	Revert "feat(proxy): add Prisma DB pool and engine health metrics to Promethe…" This reverts commit `0bb26c3f1b`.	2026-03-09 14:55:11 -07:00
yuneng-jiang	b4e78ac7b4	Merge branch 'main' into litellm_doc_max_budget_per_session_ttl	2026-03-09 14:41:41 -07:00
yuneng-jiang	ea4e2bda8f	Document LITELLM_MAX_BUDGET_PER_SESSION_TTL env var Add missing env var to config_settings.md to fix test_env_keys CI check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 14:40:05 -07:00
ohadgur	0bb26c3f1b	feat(proxy): add Prisma DB pool and engine health metrics to Prometheus (#22655) * feat(proxy): add Prisma DB pool and engine health metrics to Prometheus Add a PrismaMetricsCollector that periodically queries pg_stat_activity and the Prisma engine process to expose connection pool and engine health as Prometheus gauges/counters. Auto-enabled when prometheus_system is in service_callback. New metrics: - litellm_db_pool_active_connections (Gauge) - litellm_db_pool_idle_connections (Gauge) - litellm_db_pool_total_connections (Gauge) - litellm_db_pool_waiting_connections (Gauge) - litellm_db_engine_up (Gauge) - litellm_db_engine_restarts_total (Counter) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Greptile review feedback - Only increment engine_restarts counter on heavy reconnects (engine actually dead), not lightweight network-blip reconnects - Fix potential KeyError in _get_or_create_gauge/counter fallback path when REGISTRY._names_to_collectors is absent - Rename litellm_db_pool_waiting_connections to litellm_db_pool_lock_waiting_connections to clarify it measures lock contention, not pool slot queuing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: warn when prometheus_system enabled but watchdog disabled Log a warning when users have prometheus_system in service_callback but PRISMA_HEALTH_WATCHDOG_ENABLED=false, since DB pool and engine metrics won't be collected in that configuration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: retrigger CI checks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: use labeled gauge for DB pool connection metrics Replace 3 separate pool gauges (active, idle, total) with a single `litellm_db_pool_connections` gauge using a `state` label. This is more Prometheus-idiomatic and exposes all pg_stat_activity states (active, idle, idle in transaction, etc.) without ambiguity about what "total" includes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Greptile review — stale labels and fallback re-registration - Zero out known pg_stat_activity states that are absent from the current query result, preventing stale gauge values from persisting. - Simplify _get_or_create_gauge/counter by removing the fallback loop that could re-register an already-registered metric (ValueError). - Add test for stale label clearing across collection cycles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: include "unknown" in _PG_STATES for stale label clearing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: collect immediately on start and consolidate into single query - Move sleep to end of loop so metrics appear on /metrics immediately after startup instead of after a 30s delay. - Combine pool state and lock waiting queries into a single SQL query using conditional aggregation, halving per-cycle DB overhead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent tight spin loop on collection error Move asyncio.sleep outside the try/except so it always executes even when _collect_engine_health() or _collect_pool_metrics() raises. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add multiprocess_mode to _get_or_create_gauge initialization - Include `multiprocess_mode` parameter to properly support multiprocessing in Gauge creation. - Ensure consistent behavior for labeled and unlabeled Gauges. * fix: handle invalid env var and document watchdog prerequisite - Add try/except ValueError for PRISMA_METRICS_COLLECTION_INTERVAL_SECONDS to prevent proxy startup crash on non-numeric values (e.g. "30s") - Document that DB metrics require both prometheus_system callback and PRISMA_HEALTH_WATCHDOG_ENABLED=true Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use defensive null coalescing for query_raw row values Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add invalid env var fallback test and fix mock signature - Add test for non-numeric PRISMA_METRICS_COLLECTION_INTERVAL_SECONDS - Add **kwargs to mock _patched_get_or_create_gauge for forward compat Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 08:49:46 -07:00

5844 Commits