litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-27 13:05:45 +00:00

Author	SHA1	Message	Date
Steve G	9806e21871	Add Lakera v2 post-call hook and tests (fixed PII masking) (#21783) * Add post-call hook for Lakera guardrail and mask PII in responses * Add post-call hook for Lakera and mask PII in responses * Fix post-call hook: pass event_type to call_v2_guard * Address Greptile review: return ModelResponse, fix mutation, add header, test location, mask order - PII masking path: return ModelResponse instead of dict so deployment hook accepts it - Avoid mutating request data: deep copy original_messages and messages in _mask_pii_in_messages - Add guardrail header in PII-only return path - Add test in tests/test_litellm/ (test_lakera_ai_v2.py) per PR checklist - Sort PII payload spans by (start,end) descending so multiple spans in one message mask correctly Co-authored-by: Cursor <cursoragent@cursor.com> * Updated potential for index mismatch when choices have null content and inconsistent on_flagged access pattern * Update litellm/proxy/guardrails/guardrail_hooks/lakera_ai_v2.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update to explicitly state supported endpoints - chat completions * Fix minor lint error on masked_entity_count --------- Co-authored-by: Steve <steve.giguere@lakera.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-02-25 17:20:38 -08:00
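The commit above mentions sorting PII spans by (start, end) descending so that several spans in one message mask correctly. A minimal sketch of why that order matters, assuming spans arrive as (start, end, label) tuples; the function name `mask_spans` and the `[LABEL]` placeholder format are hypothetical and are not taken from lakera_ai_v2.py:

```python
from typing import List, Tuple

# Hypothetical span shape: (start offset, end offset, entity label).
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace each span in `text` with a [LABEL] placeholder."""
    # Work from the end of the string backwards. A replacement changes the
    # length of the text, which would shift the offsets of any span located
    # after it; spans located before it are unaffected.
    for start, end, label in sorted(spans, key=lambda s: (s[0], s[1]), reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


message = "Email jane@example.com or call 555-0100."
spans = [(6, 22, "EMAIL"), (31, 39, "PHONE")]
masked = mask_spans(message, spans)
print(masked)
```

Masking in ascending order instead would replace the e-mail first, shorten the string, and leave the phone offsets pointing at the wrong characters.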
Ishaan Jaff	cc85fe5921	Proxy request tags docs (#22129) * docs: document x-litellm-tags header and request body tags parameter - Add documentation for x-litellm-tags header (comma-separated or array) - Add documentation for tags in request body - Clarify that dynamic tags override config tags Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs: consolidate tag documentation and improve cross-references - Make request_tags.md the single source of truth for all tag options - Add cross-reference from cost_tracking.md to request_tags.md - Document both direct tags and metadata.tags formats - Add key/team tag setup and custom header tracking to request_tags.md - Reduce duplication and make navigation clearer Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs: use generic examples instead of specific company names Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * docs: clarify x-litellm-tags header format is comma-separated string HTTP headers are always strings, not arrays. Remove misleading array format documentation for the header parameter. Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> * Update docs/my-website/docs/proxy/request_tags.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-02-25 14:29:00 -08:00
yuneng-jiang	7daeaf8106	[Docs] Add Credential Usage Tracking documentation Add new document explaining automatic credential usage tracking and tagging. When models use reusable credentials, LiteLLM automatically injects a Credential: <name> tag on requests, enabling credential-level spend tracking on the Usage page with no additional configuration. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-02-25 10:52:24 -08:00
Sameer Kankute	ec3ae25a3a	Merge pull request #22070 from BerriAI/litellm_forward_auth_headers [Feat]Add forward auth headers of provider	2026-02-25 18:45:38 +05:30
Harshit Jain	d2aeb3e513	Merge pull request #22084 from Harshit28j/litellm_presidio-non-json-response-handling fix(guardrails): prevent presidio crash on non-json responses	2026-02-25 16:37:39 +05:30
Harshit28j	6a8052295b	fix(guardrails): prevent presidio crash on non-json responses	2026-02-25 16:11:24 +05:30
Harshit Jain	23e84eb789	fix: Metadata / Trace ID Missing in S3 Streaming Callbacks	2026-02-25 14:16:42 +05:30
Sameer Kankute	0e806c83c1	Fix docs	2026-02-25 12:13:56 +05:30
Sameer Kankute	a43d6139c7	Fix docs	2026-02-25 12:13:07 +05:30
Sameer Kankute	1f8f66de69	add docs for Authentication Headers forwarding	2026-02-25 12:10:12 +05:30
Ryan Crabbe	adafac1117	fix: add prompt_cache_key and prompt_cache_retention support for OpenAI These params were silently dropped for Chat Completions because they were missing from the supported params whitelist. Also adds prompt_cache_retention to the Responses API TypedDict and fixes misleading cache_control comments in OpenAI prompt caching docs.	2026-02-24 16:42:54 -08:00
ryan-crabbe	75113440ab	Merge pull request #20509 from ryan-crabbe/docs/mcp-trailing-slash docs: add trailing slash to /mcp endpoint URLs	2026-02-24 16:38:29 -08:00
Ishaan Jaff	33719e6b38	docs: update v1.81.12-stable release notes to point to v1.81.12-stable.1 (#22036)	2026-02-24 12:30:18 -08:00
Sameer Kankute	5219b1d0c3	Merge pull request #22035 from BerriAI/litellm_openai_codex_day_0_codex_5.3 [Feat] OpenAI codex 5.3 day 0 support	2026-02-25 01:29:27 +05:30
Ishaan Jaff	e44b9b6b35	feat(prometheus): add opt-in stream label to litellm_proxy_total_requests_metric (#22023) Set prometheus_emit_stream_label: true in litellm_settings to emit a stream label (True/False/None) on litellm_proxy_total_requests_metric. Opt-in to avoid breaking cardinality on existing deployments.	2026-02-24 11:51:42 -08:00
Sameer Kankute	5d291c739f	Fix phase docs link	2026-02-25 01:21:38 +05:30
Sameer Kankute	74abf0c8e6	Fix phase docs link	2026-02-25 01:19:10 +05:30
Sameer Kankute	aded14a55a	Fix release version for gpt-5.3-codex	2026-02-25 01:04:12 +05:30
Harshit28j	132e2ed671	Merge branch 'main' of https://github.com/BerriAI/litellm into litellm_fix_CVE	2026-02-24 21:09:16 +05:30
Harshit28j	3e6c10a071	security: fix critical/high CVEs in OS-level libs and NPM transitive	2026-02-24 19:40:09 +05:30
Sameer Kankute	b38059b014	Merge branch 'main' into litellm_oss_staging_02_23_2026	2026-02-24 19:32:48 +05:30
Sameer Kankute	ac720defc3	Add documentation related to phase	2026-02-24 17:50:38 +05:30
Shivam Rawat	7622f26918	Merge pull request #21997 from BerriAI/doc_fix_remove_harcoded_api_key [Doc] replaced azure openai key with mock key	2026-02-24 03:32:37 -08:00
shivam	c86b174642	replaced with mock key	2026-02-24 03:28:28 -08:00
Ishaan Jaff	c79d94fd16	feat(realtime): guardrail hook for voice transcription (#21976 ) * feat(realtime): add guardrail hook for voice transcription in Realtime API Adds a new `realtime_input_transcription` guardrail event hook that fires after Whisper transcription completes, before the LLM generates a response. When a guardrail blocks, a synthetic warning is sent to the client and `response.create` is never forwarded — the LLM never responds. Also rewrites `create_response: true` → `false` in client `session.update` so the proxy controls when responses are triggered. * feat(realtime): speak guardrail block message as audio via TTS Instead of sending synthetic text events when a guardrail blocks, send response.create with forced instructions so OpenAI's TTS speaks the warning message — user hears the block instead of just seeing text. * fix(realtime): speak exact content filter error message via TTS Extract the human-readable error string from HTTPException.detail so the spoken warning says e.g. "Content blocked: keyword 'system update' detected" instead of the raw str(e) repr. * fix(realtime): reliably enforce create_response=false for guardrails - Proxy now injects session.update with create_response=false immediately on session.created (when guardrails are active), instead of rewriting the client's session.update — works regardless of what the client sends - Add response.cancel before the warning response.create to kill any in-flight LLM response that snuck through before the guardrail fired * refactor(realtime): call apply_guardrail directly, remove dedicated hook method The async_realtime_input_transcription_hook in CustomGuardrail and ContentFilterGuardrail was just a thin wrapper that called apply_guardrail — the same interface used by /chat and /messages. Remove the wrapper and call apply_guardrail directly from run_realtime_guardrails, keeping the pattern consistent across all endpoints. 
* docs: add Realtime API guardrails tutorial and flow diagram * fix: address Greptile review comments - Forward user_api_key_dict through realtime_api/main.py (_arealtime) so it actually reaches RealTimeStreaming instead of always being None - Run guardrail interception in provider_config path too (e.g. Gemini), not only the OpenAI direct path - Narrow exception catch to HTTPException/ValueError only; re-raise unexpected errors so programming bugs surface in logs rather than silently appearing as guardrail blocks - Update tests: mock apply_guardrail directly (hook method was removed), replace session.update client-rewrite test with session.created injection test matching the new server-side approach * fix: address latest Greptile review comments - Remove fastapi import from SDK-layer file; check for status_code/detail attrs instead to identify guardrail-block exceptions vs programming errors - Add store_message() before continue in transcription interception so transcription events are logged in the non-provider_config path - Inject create_response=false on session.created in provider_config path (Gemini etc.) to match the OpenAI path — prevents LLM auto-responding before guardrail runs on VAD-detected turns	2026-02-23 21:04:40 -08:00
Nicolò Pignatelli	b8dddab311	feat: add groq/openai/gpt-oss-safeguard-20b model pricing (#21951) * feat: add groq/openai/gpt-oss-safeguard-20b model pricing Add pricing and context window data for OpenAI's GPT-OSS-Safeguard-20B model on Groq, a reasoning model trained for safety classification tasks. - Input: $0.075/1M tokens - Cached input: $0.037/1M tokens - Output: $0.30/1M tokens - Context window: 131,072 tokens - Max output: 65,536 tokens Reference: https://console.groq.com/docs/model/openai/gpt-oss-safeguard-20b * docs: add gpt-oss-safeguard-20b to Groq provider docs	2026-02-23 21:03:18 -08:00
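The per-million-token prices listed in the commit above can be turned into a per-request estimate. A rough sketch, assuming cached tokens are a subset of the prompt tokens and are billed at the cached rate instead of the input rate; that billing rule, the constant names, and `estimate_cost` are illustrative assumptions, not LiteLLM's cost calculator:

```python
# Prices from the commit message, converted to USD per single token.
INPUT_COST_PER_TOKEN = 0.075 / 1_000_000
CACHED_INPUT_COST_PER_TOKEN = 0.037 / 1_000_000
OUTPUT_COST_PER_TOKEN = 0.30 / 1_000_000


def estimate_cost(prompt_tokens: int, cached_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD (hypothetical helper, see assumptions above)."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * INPUT_COST_PER_TOKEN
        + cached_tokens * CACHED_INPUT_COST_PER_TOKEN
        + completion_tokens * OUTPUT_COST_PER_TOKEN
    )


# 6,000 uncached + 4,000 cached prompt tokens, 2,000 completion tokens.
cost = estimate_cost(prompt_tokens=10_000, cached_tokens=4_000, completion_tokens=2_000)
print(f"${cost:.6f}")
```

For this request the three terms are $0.000450, $0.000148 and $0.000600, roughly $0.0012 in total.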
Cesar Garcia	9495f4e941	fix(ollama): thread api_base to get_model_info + graceful fallback (#21970 ) * auth_with_role_name add region_name arg for cross-account sts * update tests to include case with aws_region_name for _auth_with_aws_role * Only pass region_name to STS client when aws_region_name is set * Add optional aws_sts_endpoint to _auth_with_aws_role * Parametrize ambient-credentials test for no opts, region_name, and aws_sts_endpoint * consistently passing region and endpoint args into explicit credentials irsa * fix env var leakage * fix: bedrock openai-compatible imported-model should also have model arn encoded * feat: show proxy url in ModelHub (#21660) * fix(bedrock): correct modelInput format for Converse API batch models (#21656) * fix(proxy): add model_ids param to access group endpoints for precise deployment tagging (#21655) POST /access_group/new and PUT /access_group/{name}/update now accept an optional model_ids list that targets specific deployments by their unique model_id, instead of tagging every deployment that shares a model_name. When model_ids is provided it takes priority over model_names, giving API callers the same single-deployment precision that the UI already has via PATCH /model/{model_id}/update. Backward compatible: model_names continues to work as before. 
Closes #21544 * feat(proxy): add custom favicon support Add ability to configure a custom favicon for the litellm proxy UI. - Add favicon_url field to UIThemeConfig model - Add LITELLM_FAVICON_URL env var support - Add /get_favicon endpoint to serve custom favicons - Update ThemeContext to dynamically set favicon - Add favicon URL input to UI theme settings page - Add comprehensive tests Closes #8323 (#21653) * fix(bedrock): prevent double UUID in create_file S3 key (#21650) In create_file for Bedrock, get_complete_file_url is called twice: once in the sync handler (generating UUID-1 for api_base) and once inside transform_create_file_request (generating UUID-2 for the actual S3 upload). The Bedrock provider correctly writes UUID-2 into litellm_params["upload_url"], but the sync handler unconditionally overwrites it with api_base (UUID-1). This causes the returned file_id to point to a non-existent S3 key. Fix: only set upload_url to api_base when transform_create_file_request has not already set it, preserving the Bedrock provider's value. Closes #21546 * feat(semantic-cache): support configurable vector dimensions for Qdrant (#21649) Add vector_size parameter to QdrantSemanticCache and expose it through the Cache facade as qdrant_semantic_cache_vector_size. This allows users to use embedding models with dimensions other than the default 1536, enabling cheaper/stronger models like Stella (1024d), bge-en-icl (4096d), voyage, cohere, etc. The parameter defaults to QDRANT_VECTOR_SIZE (env var or 1536) for backward compatibility. When creating new collections, the configured vector_size is used instead of the hardcoded constant. Closes #9377 * fix(utils): normalize camelCase thinking param keys to snake_case (#21762) Clients like OpenCode's @ai-sdk/openai-compatible send budgetTokens (camelCase) instead of budget_tokens in the thinking parameter, causing validation errors. Add early normalization in completion(). 
* feat: add optional digest mode for Slack alert types (#21683) Adds per-alert-type digest mode that aggregates duplicate alerts within a configurable time window and emits a single summary message with count, start/end timestamps. Configuration via general_settings.alert_type_config: alert_type_config: llm_requests_hanging: digest: true digest_interval: 86400 Digest key: (alert_type, request_model, api_base) Default interval: 24 hours Window type: fixed interval Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add blog_posts.json and local backup * feat: add GetBlogPosts utility with GitHub fetch and local fallback Adds GetBlogPosts class that fetches blog posts from GitHub with a 1-hour in-process TTL cache, validates the response, and falls back to the bundled blog_posts_backup.json on any network or validation failure. * test: add cache reset fixture and LITELLM_LOCAL_BLOG_POSTS test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add GET /public/litellm_blog_posts endpoint Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: log fallback warning in blog posts endpoint and tighten test * feat: add disable_show_blog to UISettings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add useUISettings and useDisableShowBlog hooks * fix: rename useUISettings to useUISettingsFlags to avoid naming collision * fix: use existing useUISettings hook in useDisableShowBlog to avoid cache duplication Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add BlogDropdown component with react-query and error/retry state Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: enforce 5-post limit in BlogDropdown and add cap test * fix: add retry, stable post key, enabled guard in BlogDropdown Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add BlogDropdown to navbar after Docs link * feat: add network_mock transport for benchmarking proxy overhead without real API 
calls Intercepts at httpx transport layer so the full proxy path (auth, routing, OpenAI SDK, response transformation) is exercised with zero-latency responses. Activated via `litellm_settings: { network_mock: true }` in proxy config. * Litellm dev 02 19 2026 p2 (#21871) * feat(ui/): new guardrails monitor 'demo mock representation of what guardrails monitor looks like * fix: ui updates * style(ui/): fix styling * feat: enable running ai monitor on individual guardrails * feat: add backend logic for guardrail monitoring * fix(guardrails/usage_endpoints.py): fix usage dashboard * fix(budget): fix timezone config lookup and replace hardcoded timezone map with ZoneInfo (#21754) * fix(budget): fix timezone config lookup and replace hardcoded timezone map with ZoneInfo * fix(budget): update stale docstring on get_budget_reset_time * fix: add missing return type annotations to iterator protocol methods in streaming_handler (#21750) * fix: add return type annotations to iterator protocol methods in streaming_handler Add missing return type annotations to __iter__, __aiter__, __next__, and __anext__ methods in CustomStreamWrapper and related classes. - __iter__(self) -> Iterator["ModelResponseStream"] - __aiter__(self) -> AsyncIterator["ModelResponseStream"] - __next__(self) -> "ModelResponseStream" - __anext__(self) -> "ModelResponseStream" Also adds AsyncIterator and Iterator to typing imports. Fixes issue with PLR0915 noqa comments and ensures proper type checking support. Related to: BerriAI/litellm#8304 * fix: add ruff PLR0915 noqa for files with too many statements * Add gollem Go agent framework cookbook example (#21747) Show how to use gollem, a production Go agent framework, with LiteLLM proxy for multi-provider LLM access including tool use and streaming. 
* fix: avoid mutating caller-owned dicts in SpendUpdateQueue aggregation (#21742) * fix(vertex_ai): enable context-1m-2025-08-07 beta header (#21870) * server root path regression doc * fixing syntax * fix: replace Zapier webhook with Google Form for survey submission (#21621) * Replace Zapier webhook with Google Form for survey submission * Add back error logging for survey submission debugging --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * Revert "Merge pull request #21140 from BerriAI/litellm_perf_user_api_key_auth" This reverts commit `0e1db3f7e4`, reversing changes made to `7e2d6f2355`. * test_vertex_ai_gemini_2_5_pro_streaming * UI new build * fix rendering * ui new build * docs fix * docs fix * docs fix * docs fix * docs fix * docs fix * docs fix * docs fix * release note docs * docs * adding image * fix(vertex_ai): enable context-1m-2025-08-07 beta header The `context-1m-2025-08-07` Anthropic beta header was set to `null` for vertex_ai, causing it to be filtered out when users set `extra_headers: {anthropic-beta: context-1m-2025-08-07}`. This prevented using Claude's 1M context window feature via Vertex AI, resulting in `prompt is too long: 460500 tokens > 200000 maximum` errors. Fixes #21861 --------- Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * Revert "fix(vertex_ai): enable context-1m-2025-08-07 beta header (#21870)" (#21876) This reverts commit `bce078a796`. * docs(ui): add pre-PR checklist to UI contributing guide Add testing and build verification steps per maintainer feedback from @yjiang-litellm. Contributors should run their related tests per-file and ensure npm run build passes before opening PRs. 
* Fix entries with fast and us/ * Add tests for fast and us * Add support for Priority PayGo for vertex ai and gemini * Add model pricing * fix: ensure arrival_time is set before calculating queue time * Fix: Anthropic model wildcard access issue * Add incident report * Add ability to see which model cost map is getting used * Fix name of title * Readd tpm limit * State management fixes for CheckBatchCost * Fix PR review comments * State management fixes for CheckBatchCost - Address greptile comments * fix mypy issues: * Add Noma guardrails v2 based on custom guardrails (#21400) * Fix code qa issues * Fix mypy issues * Fix mypy issues * Fix test_aaamodel_prices_and_context_window_json_is_valid * fix: update calendly on repo * fix(tests): use counter-based mock for time.time in prisma self-heal test The test used a fixed side_effect list for time.time(), but the number of calls varies by Python version, causing StopIteration on 3.12 and AssertionError on 3.14. Replace with an infinite counter-based callable and assert the timestamp was updated rather than checking for an exact value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tests): use absolute path for model_prices JSON in validation test The test used a relative path 'litellm/model_prices_and_context_window.json' which only works when pytest runs from a specific working directory. Use os.path based on __file__ to resolve the path reliably. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Update tests/test_litellm/test_utils.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * fix(tests): use os.path instead of Path to avoid NameError Path is not imported at module level. Use os.path.join which is already available. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * clean up mock transport: remove streaming, add defensive parsing * docs: add Google GenAI SDK tutorial (JS & Python) (#21885) * docs: add Google GenAI SDK tutorial for JS and Python Add tutorial for using Google's official GenAI SDK (@google/genai for JS, google-genai for Python) with LiteLLM proxy. Covers pass-through and native router endpoints, streaming, multi-turn chat, and multi-provider routing via model_group_alias. Also updates pass-through docs to use the new SDK replacing the deprecated @google/generative-ai. * fix(docs): correct Python SDK env var name in GenAI tutorial GOOGLE_GENAI_API_KEY does not exist in the google-genai SDK. The correct env var is GEMINI_API_KEY (or GOOGLE_API_KEY). Also note that the Python SDK has no base URL env var. * fix(docs): replace non-existent GOOGLE_GENAI_BASE_URL env var in interactions.md The Python google-genai SDK does not read GOOGLE_GENAI_BASE_URL. Use http_options={"base_url": "..."} in code instead. * docs: add network mock benchmarking section * docs: tweak benchmarks wording * fix: add auth headers and empty latencies guard to benchmark script * refactor: use method-level import for MockOpenAITransport * fix: guard print_aggregate against empty latencies * fix: add INCOMPLETE status to Interactions API enum and test Google added INCOMPLETE to the Interactions API OpenAPI spec status enum. Update both the Status3 enum in the SDK types and the test's expected values to match. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Guardrail Monitor - measure guardrail reliability in prod (#21944) * fix: fix log viewer for guardrail monitoring * feat(ui/): fix rendering logs per guardrail * fix: fix viewing logs on overview tab of guardrail * fix: log viewer * fix: fix naming to align with metric * docs: add performance & reliability section to v1.81.14 release notes * fix(tests): make RPM limit test sequential to avoid race condition Concurrent requests via run_in_executor + asyncio.gather caused a race condition where more requests slipped through the rate limiter than expected, leading to flaky test failures (e.g. 3 successes instead of 2 with rpm_limit=2). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Singapore guardrail policies (PDPA + MAS AI Risk Management) (#21948) * feat: Singapore PDPA PII protection guardrail policy template Add Singapore Personal Data Protection Act (PDPA) guardrail support: Regex patterns (patterns.json): - sg_nric: NRIC/FIN detection ([STFGM] + 7 digits + checksum letter) - sg_phone: Singapore phone numbers (+65/0065/65 prefix) - sg_postal_code: 6-digit postal codes (contextual) - passport_singapore: Passport numbers (E/K + 7 digits, contextual) - sg_uen: Unique Entity Numbers (3 formats) - sg_bank_account: Bank account numbers (dash format, contextual) YAML policy templates (5 sub-guardrails): - sg_pdpa_personal_identifiers: s.13 Consent - sg_pdpa_sensitive_data: Advisory Guidelines - sg_pdpa_do_not_call: Part IX DNC Registry - sg_pdpa_data_transfer: s.26 overseas transfers - sg_pdpa_profiling_automated_decisions: Model AI Governance Framework Policy template entry in policy_templates.json with 9 guardrail definitions (4 regex-based + 5 YAML conditional keyword matching). 
Tests: - test_sg_patterns.py: regex pattern unit tests - test_sg_pdpa_guardrails.py: conditional keyword matching tests (100+ cases) * feat: MAS AI Risk Management Guidelines guardrail policy template Add Monetary Authority of Singapore (MAS) AI Risk Management Guidelines guardrail support for financial institutions: YAML policy templates (5 sub-guardrails): - sg_mas_fairness_bias: Blocks discriminatory financial AI (credit/loans/insurance by protected attributes) - sg_mas_transparency_explainability: Blocks opaque/unexplainable AI for consequential financial decisions - sg_mas_human_oversight: Blocks fully automated financial decisions without human-in-the-loop - sg_mas_data_governance: Blocks unauthorized sharing/mishandling of financial customer data - sg_mas_model_security: Blocks adversarial attacks, model poisoning, inversion on financial AI Policy template entry in policy_templates.json with 5 guardrail definitions. Aligned with MAS FEAT Principles, Project MindForge, and NIST AI RMF. Tests: - test_sg_mas_ai_guardrails.py: conditional keyword matching tests (100+ cases) * fix: address SG pattern review feedback - Update NRIC lowercase test for IGNORECASE runtime behavior - Add keyword context guard to sg_uen pattern to reduce false positives * docs: clarify MAS AIRM timeline references - Explicitly mark MAS AIRM as Nov 2025 consultation draft - Add 2018 qualifier for FEAT principles in MAS policy descriptions - Update MAS guardrail wording to avoid release-year ambiguity * chore: commit resolved MAS policy conflicts * test: * chore: * Add OpenAI Agents SDK tutorial with LiteLLM Proxy to docs (#21221) * Add OpenAI Agents SDK tutorial to docs * Update OpenAI Agents SDK tutorial to use LiteLLM environment variables * Enhance OpenAI Agents SDK tutorial with built-in LiteLLM extension details and updated configuration steps. Adjust section headings for clarity and improve the flow of information regarding model setup and usage. 
* adjust blog posts to fetch from github first * feat(videos): add variant parameter to video content download (#21955) openai videos models support the features to download variants. See more details here: https://developers.openai.com/api/docs/guides/video-generation#use-image-references. Plumb variant (e.g. "thumbnail", "spritesheet") through the full video content download chain: avideo_content → video_content → video_content_handler → transform_video_content_request. OpenAI appends ?variant=<value> to the GET URL; other providers accept the parameter in their signature but ignore it. * fixing path * adjust blog post path * Revert duplicate issue checker to text-based matching, remove duplicate PR workflow Remove the Claude Code-powered duplicate PR detection workflow and revert the duplicate issue checker back to wow-actions/potential-duplicates with text similarity matching. * ui changes * adding tests * adjust default aggregation threshold * fix(videos): pass api_key from litellm_params to video remix handlers (#21965) video_remix_handler and async_video_remix_handler were not falling back to litellm_params.api_key when the api_key parameter was None, causing Authorization: Bearer None to be sent to the provider. This matches the pattern already used by async_video_generation_handler. * adding testing coverage + fixing flaky tests * fix(ollama): thread api_base through get_model_info and add graceful fallback When users pass api_base to litellm.completion() for Ollama, the model info fetch (context window, function_calling support) was ignoring the user's api_base and only reading OLLAMA_API_BASE env var or defaulting to localhost:11434. This caused confusing errors in logs when Ollama runs on a remote server. Thread api_base from litellm_params through the get_model_info call chain so OllamaConfig.get_model_info() uses the correct server. Also return safe defaults instead of raising when the server is unreachable. 
Fixes #21967 --------- Co-authored-by: An Tang <ta@stripe.com> Co-authored-by: janfrederickk <75388864+janfrederickk@users.noreply.github.com> Co-authored-by: Zhenting Huang <3061613175@qq.com> Co-authored-by: Darien Kindlund <darien@kindlund.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com> Co-authored-by: Ryan Crabbe <rcrabbe@berkeley.edu> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: LeeJuOh <56071126+LeeJuOh@users.noreply.github.com> Co-authored-by: Monesh Ram <31161039+WhoisMonesh@users.noreply.github.com> Co-authored-by: Trevor Prater <trevor.prater@gmail.com> Co-authored-by: The Mavik <179817126+themavik@users.noreply.github.com> Co-authored-by: Edwin Isac <33712823+edwiniac@users.noreply.github.com> Co-authored-by: milan-berri <milan@berri.ai> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Harshit Jain <harshitjain0562@gmail.com> Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com> Co-authored-by: Ephrim Stanley <ephrim.stanley@point72.com> Co-authored-by: TomAlon <tom@noma.security> Co-authored-by: Julio Quinteros Pro <jquinter@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com> Co-authored-by: Ron Zhong <ron-zhong@hotmail.com> Co-authored-by: Arindam Majumder <109217591+Arindam200@users.noreply.github.com> Co-authored-by: Lei Nie <lenie@quora.com>	2026-02-23 21:00:37 -08:00
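One of the fixes bundled into the commit above (#21762) normalizes camelCase keys such as budgetTokens in the thinking parameter to snake_case. A small sketch of that kind of normalization, assuming a flat dict of string keys; `camel_to_snake` and `normalize_thinking_param` are hypothetical names and do not mirror the actual code in completion():

```python
import re
from typing import Any, Dict

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(key: str) -> str:
    """Convert a camelCase key to snake_case; snake_case keys pass through unchanged."""
    return _CAMEL_BOUNDARY.sub("_", key).lower()


def normalize_thinking_param(thinking: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with normalized keys, leaving the caller's dict untouched."""
    return {camel_to_snake(key): value for key, value in thinking.items()}


client_payload = {"type": "enabled", "budgetTokens": 1024}
normalized = normalize_thinking_param(client_payload)
print(normalized)
```

Building a new dict rather than renaming keys in place keeps the caller-owned payload unmodified, in line with the mutation fixes mentioned elsewhere in this log.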
Harshit Jain	a15c4db499	Merge pull request #21949 from BerriAI/fix/presidio-streaming-false-positives fix: presidio streaming, false positives	2026-02-24 10:09:47 +05:30
Sameer Kankute	3b2ff5b06a	Fix cicd code quality	2026-02-24 09:22:40 +05:30
ryan-crabbe	0ca9869b99	Merge pull request #21950 from ryan-crabbe/docs/v1-81-14-perf-section docs: add performance & reliability section to v1.81.14 release notes	2026-02-23 13:13:21 -08:00
Arindam Majumder	71b4bd12a7	Add OpenAI Agents SDK tutorial with LiteLLM Proxy to docs (#21221) * Add OpenAI Agents SDK tutorial to docs * Update OpenAI Agents SDK tutorial to use LiteLLM environment variables * Enhance OpenAI Agents SDK tutorial with built-in LiteLLM extension details and updated configuration steps. Adjust section headings for clarity and improve the flow of information regarding model setup and usage.	2026-02-23 12:10:01 -08:00
Ryan Crabbe	67ceade162	docs: add performance & reliability section to v1.81.14 release notes	2026-02-23 11:23:29 -08:00
Harshit28j	af9ad68a43	fix: presidio streaming, false positives	2026-02-24 00:42:29 +05:30
ryan-crabbe	c4c48fe977	Merge pull request #21942 from BerriAI/litellm_network_mock feat: Litellm network mock	2026-02-23 10:07:11 -08:00
Kesku	5899e909fd	feat(perplexity): update Responses API integration to match Agent API - Rename "Agentic Research API" to "Agent API" - Expand supported Responses API parameters - Fix function tool handling to pass custom function tools through unchanged instead of heuristically mapping them - Update model registry with current Perplexity models and presets - Add Function Calling and Structured Outputs documentation sections - Unit tests for transformation logic	2026-02-23 18:05:03 +00:00
yuneng-jiang	bba98c2f15	Merge pull request #21886 from Chesars/docs/ui-contributing-pre-pr-checklist docs(ui): add pre-PR checklist to UI contributing guide	2026-02-23 09:50:26 -08:00
Ryan Crabbe	8244ad1f0e	docs: tweak benchmarks wording	2026-02-23 09:50:02 -08:00
Ryan Crabbe	5b41b009f6	docs: add network mock benchmarking section	2026-02-23 09:44:02 -08:00
Cesar Garcia	64d1de0552	docs: add Google GenAI SDK tutorial (JS & Python) (#21885) * docs: add Google GenAI SDK tutorial for JS and Python Add tutorial for using Google's official GenAI SDK (@google/genai for JS, google-genai for Python) with LiteLLM proxy. Covers pass-through and native router endpoints, streaming, multi-turn chat, and multi-provider routing via model_group_alias. Also updates pass-through docs to use the new SDK replacing the deprecated @google/generative-ai. * fix(docs): correct Python SDK env var name in GenAI tutorial GOOGLE_GENAI_API_KEY does not exist in the google-genai SDK. The correct env var is GEMINI_API_KEY (or GOOGLE_API_KEY). Also note that the Python SDK has no base URL env var. * fix(docs): replace non-existent GOOGLE_GENAI_BASE_URL env var in interactions.md The Python google-genai SDK does not read GOOGLE_GENAI_BASE_URL. Use http_options={"base_url": "..."} in code instead.	2026-02-23 09:20:46 -08:00
Krrish Dholakia	a26f83fd3c	fix: update calendly on repo	2026-02-23 06:13:59 -08:00
Sameer Kankute	9b5bbee906	Merge pull request #21786 from BerriAI/litellm_oss_staging_02_21_2026 Litellm oss staging 02 21 2026	2026-02-23 18:51:55 +05:30
Sameer Kankute	8decf04d8a	Merge pull request #21877 from BerriAI/litellm_oss_staging_02_22_2026 Litellm oss staging 02 22 2026	2026-02-23 18:50:47 +05:30
Sameer Kankute	37d45139f2	Merge pull request #21917 from BerriAI/litellm_fix_model_cost_map_wildcard Fix: Anthropic model wildcard access issue	2026-02-23 18:45:49 +05:30
TomAlon	99184c48d9	Add Noma guardrails v2 based on custom guardrails (#21400)	2026-02-23 05:05:27 -08:00
Sameer Kankute	c7aafdf794	Merge pull request #21926 from BerriAI/main merge main in oss 21 02	2026-02-23 18:17:30 +05:30
Sameer Kankute	57af8e6a93	Merge pull request #21924 from BerriAI/main merge main in oss 22 02	2026-02-23 18:11:36 +05:30
Sameer Kankute	eaf3900200	Fix name of title	2026-02-23 17:18:31 +05:30
Sameer Kankute	9b27cd8c0e	Add incident report	2026-02-23 17:13:44 +05:30
Chesars	0e0abeb123	docs(ui): add pre-PR checklist to UI contributing guide Add testing and build verification steps per maintainer feedback from @yjiang-litellm. Contributors should run their related tests per-file and ensure npm run build passes before opening PRs.	2026-02-22 09:50:26 -03:00
Cesar Garcia	b8cef1a4e5	docs: add OpenClaw integration tutorial (#21605) * docs: add OpenClaw integration tutorial * docs: simplify OpenClaw proxy start command * docs: rewrite OpenClaw integration guide for clarity - Use gpt-5 as default model - Replace poetry run with standard litellm CLI - Add prerequisites section and verification step - Simplify onboarding instructions (table format) - Move manual config and troubleshooting to bottom - Add multi-model config (claude-sonnet, gemini-flash) * docs: fix model name in OpenClaw manual config example * docs: rewrite OpenClaw integration guide from scratch Rewrote the guide based on hands-on testing of every command. Key changes: - Replace non-existent `openclaw chat` with verified commands (dashboard, tui, agent --agent main) - Add 3 onboarding options: QuickStart, Manual, and non-interactive - Fix health check (requires Bearer token) - Remove misleading "Starting from scratch" section - Use gpt-4o instead of gpt-5 as the example model - Clarify that API keys can come from export, .env, or any method - Add config reference section showing openclaw.json structure - Add real troubleshooting based on issues found during testing	2026-02-21 20:16:27 -08:00

5672 Commits