litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-17 18:48:36 +00:00

Author	SHA1	Message	Date
Sameer Kankute	fd7ff0f269	fix(hosted_vllm): normalize custom tools for chat completions (#25763) * fix(hosted_vllm): normalize custom tools for chat completions Convert custom tool definitions into OpenAI function tools before forwarding hosted_vllm chat requests to avoid provider-side validation failures. Add a regression test and include a local curl verification screenshot. Made-with: Cursor * Fix black issue * Fix hosted vllm custom tool schema fallback * fix black --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2026-05-05 17:27:02 -07:00
Emmanuel Acheampong	f8ba2d750b	fix(crusoe): fix streaming doc model typo and add supports_vision for Gemma 3 - Streaming example referenced Llama-3.1 instead of Llama-3.3 - Add supports_vision: true for gemma-3-12b-it in both JSON files, matching other providers (bedrock, novita)	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	e08b8ef7b6	fix(crusoe): split Custom API Base docs into two independent examples The previous example set CRUSOE_API_BASE via env var and also passed api_base= in the same call, making it look like both were required. They are independent alternatives.	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	6e1e6244cf	fix(crusoe): remove trailing slashes from API base URLs and fix list indentation Trailing slashes on custom API base examples cause double-slash in get_complete_url. Also fixes inconsistent list indentation in test_crusoe_models_configuration.	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	9039eb1898	fix(crusoe): fix docs trailing slash, test state pollution, missing __init__.py - Remove trailing slash from docs Base URL to match providers.json - Wrap model_cost mutations in try/finally to prevent test state leakage - Add missing __init__.py to crusoe test package	2026-05-01 17:27:52 +05:30
Emmanuel Acheampong	caa0db3843	adding crusoe to litellm	2026-05-01 17:27:34 +05:30
clyang	3f5e28fcdc	Adding Cycraft XecGuard integration (#26011)	2026-04-27 08:58:38 +05:30
Yuneng Jiang	c35f3a50ae	docs: remove docs/my-website, point contributors to litellm-docs The documentation source has moved to a separate repository, BerriAI/litellm-docs, served at docs.litellm.ai. This PR removes docs/my-website/ from this repo and updates README.md, AGENTS.md, and CLAUDE.md to direct doc contributions to the new repo. Also fixes a broken relative link in litellm/integrations/levo/README.md. The existing CI symlink in .github/workflows/test-code-quality.yml (which clones litellm-docs and symlinks docs/my-website to it for tests/documentation_tests/*) continues to work without change.	2026-04-24 14:17:46 -07:00
shin-berri	ca443a957c	Merge pull request #24374 from BerriAI/litellm_staging_03_22_2026 Litellm staging 03 22 2026	2026-04-24 12:38:47 -07:00
yuneng-jiang	9dd7e37530	Merge pull request #25359 from BerriAI/litellm_Sameerlite/openai-chat-to-responses feat(openai): add route_all_chat_openai_to_responses global flag	2026-04-24 12:06:19 -07:00
Sameer Kankute	a0c52cda6e	docs(proxy): clarify x-litellm-model-group vs provider model id (#25497) Made-with: Cursor	2026-04-24 16:59:03 +00:00
yuneng-jiang	8dda834cf9	Merge pull request #25842 from BerriAI/litellm_docs-gemini3-thinking-defaults docs(gemini): Gemini 3 thinking_level defaults and release note	2026-04-24 09:45:24 -07:00
Yuneng Jiang	4d5c3476a4	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_docs-gemini3-thinking-defaults	2026-04-24 09:40:04 -07:00
Yuneng Jiang	b2afc70080	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_docs-code-block-padding-parity	2026-04-24 09:39:06 -07:00
Sameer Kankute	e1466be825	feat(pricing): gemini-embedding-2 GA cost map, blog, and test (#26391) * feat(pricing): gemini-embedding-2 GA cost map, blog, and test - Add model_prices entries for gemini-embedding-2 (Gemini + Vertex paths) - Add docs blog gemini_embedding_2_ga with LiteLLM proxy curl examples - Add test_gemini_embedding_2_ga_in_cost_map in test_utils Made-with: Cursor * Fix greptile reviews	2026-04-24 09:28:18 -07:00
Cesar Garcia	8bd58fb82d	Merge branch 'litellm_internal_staging' into litellm_staging_03_22_2026	2026-04-24 13:12:19 -03:00
Sameer Kankute	1720903bda	Merge pull request #25346 from BerriAI/litellm_Sameerlite/responses-bridge-optin feat(responses): add use_chat_completions_api flag for openai/ models with custom api_base	2026-04-24 20:55:22 +05:30
Sameer Kankute	d5449f5b1a	Merge pull request #26300 from BerriAI/litellm_oss_staging_04_22_2026 Litellm oss staging 04 22 2026	2026-04-23 18:53:58 +05:30
Sameer Kankute	e3440baa0c	Merge pull request #25767 from vinhphamhuu-ct/main feat: Expand VideoMetadata support to all Gemini Models.	2026-04-23 17:20:01 +05:30
Sameer Kankute	94288d76a9	Merge pull request #26303 from BerriAI/litellm_internal_staging merge main	2026-04-23 08:30:54 +05:30
Sameer Kankute	f3b80726a7	Merge pull request #26301 from BerriAI/litellm_internal_staging merge main	2026-04-23 08:30:10 +05:30
Cesar Garcia	25c0aa8bfd	Merge pull request #26283 from BerriAI/litellm_internal_staging Sync litellm_staging_03_22_2026 with litellm_internal_staging	2026-04-22 19:55:27 -03:00
Krrish Dholakia	ecd9a83e61	fix(adaptive_router): P2 review items — @updatedAt + snapshot samples - Mark last_updated_at (AdaptiveRouterState) and last_activity_at (AdaptiveRouterSession) with @updatedAt so Prisma refreshes the timestamps on every write. Without this the fields stayed frozen at INSERT time and the last_activity_at index was misleading for any future TTL/eviction logic. Applied to all three schema.prisma copies; no migration SQL change needed (Prisma @updatedAt is a client-side annotation that doesn't touch DDL). - get_state_snapshot: report cell.total_samples instead of alpha+beta for the 'samples' field. The previous value inflated every cell by the COLD_START_MASS prior (e.g. showed 10.0 before any real traffic arrived), which confused operators reading /adaptive_router/.../state. Updated docs + the snapshot test to match. Also fixes two pre-existing merge-break syntax errors in router.py (missing ')' on the AdaptiveRouter TYPE_CHECKING import; truncated async_pre_routing_hook dispatch call for the adaptive router branch) that were masking the rest of the file from the interpreter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 16:27:01 -07:00
Krrish Dholakia	b6fc75b3ce	Merge branch 'litellm_internal_staging' into litellm_adaptive_routing	2026-04-20 15:28:08 -07:00
Michael-RZ-Berri	4f823cedac	Add supported providers to prompt caching doc (#26124) * Add supported providers to prompt caching doc * Move Z.ai / GLM to cache_control marker list * Mark xAI models as supporting prompt caching * Narrow xAI prompt caching flag to models with documented cache pricing * Add prompt caching flag to grok-4, grok-4-0709, grok-4-latest --------- Co-authored-by: Michael Riad Zaky <michaelr@Michaels-MacBook-Air.local>	2026-04-20 15:25:21 -07:00
Krrish Dholakia	fba736ca3c	fix(adaptive_router): 3 P1 review defects - Use 'auto_router/adaptive_router' prefix in example yaml, docs, and README — the old 'adaptive_router/...' and 'openai/gpt-4o-mini' values silently skipped adaptive-router init because detection requires the 'auto_router/adaptive_router' prefix. - Read x-litellm-min-quality-tier from request headers (and the 'min_quality_tier' metadata key as fallback) in async_pre_routing_hook. Previously the documented header was defined but never extracted, so the quality-floor feature was inert. - Evict expired entries from _session_states. The cache grew without bound — added a parallel expiry map (same TTL as _owner_cache) and an opportunistic bulk sweep when the cache crosses a size threshold. - Align adaptive-router migration SQL with Prisma schema: all count columns and the 'clean_credit_awarded' / 'last_processed_turn' fields are NOT NULL in the data model, so the migration now declares them NOT NULL. Fixes test_aaaasschema_migration_check. Tests: 8 new covering header/metadata/precedence/invalid-value paths for min_quality_tier and TTL-based eviction of _session_states. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 15:22:18 -07:00
Krrish Dholakia	386f334fee	Prompt Compression - add it to the proxy (#25729 ) * refactor: new agentic loop event hook simplifies how to create logic for tool based multi llm calls * fix: compress - make it work on anthropic input as well * fix(compress.py): working prompt compression for claude code ensures claude code messages can run through proxy easily * docs: add agentic loop hook guide * docs: add agentic_loop_hook to sidebar * fix: fix multiple arguments error * fix: fix tool call loop for compression on streaming /v1/messages * fix: fix linting errors * fix: fix ci/cd errors * feat(litellm_pre_call_utils.py): use claude code session for litellm session id allows claude code logs to be stitched together, making it easy to know they were all part of the same conversation * fix: suppress incorrect mypy warning rE: module * revert: drop PR's changes to litellm/proxy/_experimental/out/ Restores the 34 HTML files under _experimental/out/ to their pre-PR paths (X/index.html -> X.html). All renames are R100 (content unchanged); no other files are touched. * fix: address greptile review comments on PR #25729 - Skip ``kwargs["tools"] = []`` injection when compression is a no-op — Anthropic Messages rejects empty tool arrays on requests that did not originally declare tools. - Move agentic-loop safety guards (fingerprint cycle / max depth) out of the per-callback try/except so they propagate instead of being swallowed by the generic exception handler. Extracted _check_agentic_loop_safety. - Gate generic ``x-<vendor>-session-id`` capture behind the LITELLM_CAPTURE_VENDOR_SESSION_HEADERS env var (off by default) to preserve backwards compatibility; explicit x-litellm-* headers are unaffected. - Fix monkeypatch target in pre-call-hook test to patch the actual module-level binding (litellm.integrations.compression_interception.handler.compress). - Add regression tests for empty-tools skip and opt-in session capture. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * revert: drop LITELLM_CAPTURE_VENDOR_SESSION_HEADERS flag Generic x-<vendor>-session-id header capture is a new feature and only runs after the explicit x-litellm-trace-id / x-litellm-session-id checks, so it does not change behavior for any existing caller that was already using the LiteLLM headers — no backwards-incompatibility to gate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(compress): replace input_type with CallTypes call_type Drop the bespoke ``CompressionInputType`` literal and use the existing ``litellm.types.utils.CallTypes`` enum instead. ``litellm.compress()`` now takes ``call_type: Union[CallTypes, str]`` (default ``CallTypes.completion``) — no new concept to learn, and the enum is already the way the rest of the codebase talks about request shapes. Supported values: ``completion`` / ``acompletion`` (OpenAI chat-completions shape) and ``anthropic_messages`` (Anthropic structured content blocks). Updated: compress(), the compression_interception handler, tests, docs, and the two eval scripts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-20 15:08:00 -07:00
nhyy244	a19bff4ca6	Feature/add audio support for scaleway (#26110) * feat(scaleway): add SCALEWAY to LlmProviders enum * feat(scaleway): add audio transcription config and dispatch wiring Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(scaleway): add behavior tests for audio transcription config Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(scaleway): advertise audio_transcriptions in endpoint-support JSON * docs(scaleway): document audio transcription support * fix(scaleway): address PR review — plain-text response_format + missing-key fail-fast Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(scaleway): cover new response paths, drop gettysburg.wav coupling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 14:49:41 -07:00
Sameer Kankute	57eae8d01c	Merge branch 'litellm_internal_staging' into litellm_staging_03_22_2026	2026-04-20 19:56:00 +05:30
Krrish Dholakia	70caf5aec0	docs: update docs	2026-04-18 21:31:53 -07:00
Krrish Dholakia	924fa6a3bc	feat: commit new adaptive routing	2026-04-18 21:29:39 -07:00
ishaan-berri	d03c301c79	Merge pull request #25936 from BerriAI/litellm_health-check-reasoning-tokens fix(proxy): prioritize reasoning health-check max token precedence	2026-04-18 11:35:04 -07:00
Yuneng Jiang	e004876950	Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_/wonderful-bouman # Conflicts: # tests/test_litellm/proxy/ui_crud_endpoints/test_proxy_setting_endpoints.py	2026-04-17 21:32:09 -07:00
ishaan-berri	1c128a86b8	Merge pull request #25256 from BerriAI/litellm_ishaan_april6 Litellm ishaan april6	2026-04-17 16:26:45 -07:00
Yuneng Jiang	1e25a00e5d	[Docs] BYOK tutorial: document the UI-only configuration path	2026-04-17 13:32:17 -07:00
Krrish Dholakia	dd76cc5d9d	docs: add "Copy Page as Markdown" + llms.txt to docs site (#25975 ) * docs: add copy-page-as-markdown button + llms.txt generation Adds the signalwire llms-txt Docusaurus plugin + theme so every docs page gets: - A "Copy Page" dropdown in the breadcrumbs (Copy, View Markdown, Ask ChatGPT, Ask Claude) — defaults from the theme hook, no extra config required. - A raw `.md` companion at `<page>.md` for LLM consumption. - Site-wide `/llms.txt` index and `/llms-full.txt` corpus. The signalwire plugin README documents a `copyPageButton` option that the v1.2.2 Joi schema actually rejects; the theme's defaults cover the same feature set, so only `content.enableMarkdownFiles` and `enableLlmsFullTxt` are set. Theme is pinned to `1.0.0-alpha.9` because the floating version resolves to a broken canary whose `main` points at a missing file. Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> * docs: pin exact versions for signalwire llms-txt deps Drop the caret ranges on the two packages added in the prior commit so the docs site pulls byte-identical npm tarballs on every install. Matches the existing convention in this package.json (everything else is already exact) and protects against supply-chain substitution if a malicious patch version is published under the same minor. Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> * docs: upgrade signalwire llms-txt plugin to v2 alpha + enable copy button The stable v1.2.2 plugin we first pinned does not call setGlobalData during contentLoaded, so the theme's CopyPageContent component always returned null (its `!siteConfig` bailout). The theme v1.0.0-alpha.9 is built against the v2-alpha plugin API, which is the version that actually wires the copy-content JSON and plugin config into the theme via setGlobalData. 
Pins plugin to 2.0.0-alpha.7 (exact, no caret) and switches the config to the v2 schema: - top-level `markdown` + `llmsTxt` replace the v1 `content` block - new `ui.copyPageContent` (off by default in v2) enables the button with view-markdown + ChatGPT + Claude actions. Verified end-to-end: production build serves the dropdown with "Copy Raw Markdown", "View Markdown", "Reference in ChatGPT", and "Reference in Claude" on /docs/routing (button mounts at ~x=960 in the breadcrumbs row). Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> --------- Co-authored-by: yuneng-jiang <yuneng@berri.ai> Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>	2026-04-17 13:03:12 -07:00
Ishaan Jaffer	f31d4faa87	Merge origin/main into litellm_ishaan_april6	2026-04-17 12:36:51 -07:00
Sameer Kankute	27877b4b06	Merge pull request #25945 from BerriAI/litellm_internal_staging merge litellm_internal_staging	2026-04-17 18:48:03 +05:30
Sameer Kankute	96882e04e7	Merge pull request #25942 from BerriAI/litellm_internal_staging merge litellm_internal_staging	2026-04-17 18:18:12 +05:30
Sameer Kankute	d86c6a5b2f	fix(proxy): prioritize reasoning health check token defaults Apply reasoning-first precedence for background health-check max tokens, parse reasoning env as optional, and raise non-wildcard fallback max_tokens from 1 to 5 for better reliability. Made-with: Cursor	2026-04-17 12:36:58 +05:30
Sameer Kankute	52fde57df7	feat(docs): align fenced code padding on blog and doc pages - Set --ifm-pre-padding to 1.25rem for consistent code block inset - Restore horizontal padding for line-numbered Docusaurus blocks - Scope pre/code resets via article .markdown so blog chip styles no longer strip CodeBlock inner padding on Prism fences Made-with: Cursor	2026-04-17 10:04:03 +05:30
Stefano Romanò	f69b9d6564	Add capability to override default GitHub Copilot authentication endp… (#25915) * Add capability to override default GitHub Copilot authentication endpoints This feature adds support for GitHub Enterprise subscriptions with custom domain/data ownership (which use a different URL compared to standard accounts) * Update documentation with new parameters * Move access token URL and Client ID retrieval outside for loop Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Fix spurious comment from Greptile review * Align api_base retrieval behavior across chat and embedding transformations * Add missing GitHub Copilot client ID parameter in docs * Update website documentation with newer options for GitHub Enterprise Copilot * Fix default value for Copilot client ID in docs Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-04-16 21:04:38 -07:00
Krrish Dholakia	13108f39cb	Add docs announcement bar for Trivy compromise resolution (#25870) * Add announcement bar for Trivy compromise resolution notice Add a Docusaurus announcement bar to the top of the docs site informing users that the Trivy supply-chain compromise has been mitigated and resolved. The banner: - States all affected packages have been deleted and releases are safe - Links to the Security Townhall blog post for details - Links to the CI/CD v2 blog post for improvements made - Uses a green background with closeable dismiss button Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com> * Use :::note admonition instead of announcement bar Replace the Docusaurus announcementBar with a :::note admonition on the docs index page. The note appears below the hero image with the title 'Security Update' and links to the Security Townhall and CI/CD v2 blog posts. Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com> * Update security notice wording to 'contained' Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com> * Move note above hero image and add to root page - Move the security notice above the product screenshot on /docs - Add the same notice to the root page (src/pages/index.md) Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com> * Update security notice wording Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Krrish Dholakia <krrish-berri-2@users.noreply.github.com>	2026-04-16 15:15:52 -07:00
Sameer Kankute	13522ff33a	Fix version in docs	2026-04-16 22:41:32 +05:30
ishaan-berri	44c992416c	Merge pull request #25867 from BerriAI/litellm_day_0_opus_4.7_support Litellm day 0 opus 4.7 support	2026-04-16 09:42:11 -07:00
Sameer Kankute	07d863b8e7	Remove max support for opus 4.7	2026-04-16 21:58:03 +05:30
Sameer Kankute	f94c8dda82	Fix model names	2026-04-16 21:47:58 +05:30
Sameer Kankute	b3d5ff5774	Fix tests + add docs	2026-04-16 21:45:31 +05:30
Sameer Kankute	4b5c86b8a1	Fix code qa	2026-04-16 19:29:08 +05:30
Sameer Kankute	c98002ce74	docs(gemini): document Gemini 3 thinking_level API defaults - Release v1.82.3: note removal of injected default when reasoning_effort omitted - Blog gemini_3: correct defaults and reasoning_effort mapping guidance - Provider gemini.md: align tip and mapping table with implementation Made-with: Cursor	2026-04-16 11:24:35 +05:30

6193 Commits