mirror of
https://github.com/tiennm99/litellm.git
synced 2026-06-27 05:07:36 +00:00
dd5b85697a5eddad00a84b221a15a0608bbd5c31
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9495f4e941 |
fix(ollama): thread api_base to get_model_info + graceful fallback (#21970)
* auth_with_role_name add region_name arg for cross-account sts
* update tests to include case with aws_region_name for _auth_with_aws_role
* Only pass region_name to STS client when aws_region_name is set
* Add optional aws_sts_endpoint to _auth_with_aws_role
* Parametrize ambient-credentials test for no opts, region_name, and aws_sts_endpoint
* consistently passing region and endpoint args into explicit credentials irsa
* fix env var leakage
* fix: bedrock openai-compatible imported-model should also have model arn encoded
* feat: show proxy url in ModelHub (#21660)
* fix(bedrock): correct modelInput format for Converse API batch models (#21656)
* fix(proxy): add model_ids param to access group endpoints for precise deployment tagging (#21655)
  POST /access_group/new and PUT /access_group/{name}/update now accept an optional model_ids list that targets specific deployments by their unique model_id, instead of tagging every deployment that shares a model_name. When model_ids is provided it takes priority over model_names, giving API callers the same single-deployment precision that the UI already has via PATCH /model/{model_id}/update. Backward compatible: model_names continues to work as before. Closes #21544
* feat(proxy): add custom favicon support
  Add ability to configure a custom favicon for the litellm proxy UI.
  - Add favicon_url field to UIThemeConfig model
  - Add LITELLM_FAVICON_URL env var support
  - Add /get_favicon endpoint to serve custom favicons
  - Update ThemeContext to dynamically set favicon
  - Add favicon URL input to UI theme settings page
  - Add comprehensive tests
  Closes #8323 (#21653)
* fix(bedrock): prevent double UUID in create_file S3 key (#21650)
  In create_file for Bedrock, get_complete_file_url is called twice: once in the sync handler (generating UUID-1 for api_base) and once inside transform_create_file_request (generating UUID-2 for the actual S3 upload).
  The Bedrock provider correctly writes UUID-2 into litellm_params["upload_url"], but the sync handler unconditionally overwrites it with api_base (UUID-1). This causes the returned file_id to point to a non-existent S3 key. Fix: only set upload_url to api_base when transform_create_file_request has not already set it, preserving the Bedrock provider's value. Closes #21546
* feat(semantic-cache): support configurable vector dimensions for Qdrant (#21649)
  Add vector_size parameter to QdrantSemanticCache and expose it through the Cache facade as qdrant_semantic_cache_vector_size. This allows users to use embedding models with dimensions other than the default 1536, enabling cheaper/stronger models like Stella (1024d), bge-en-icl (4096d), voyage, cohere, etc. The parameter defaults to QDRANT_VECTOR_SIZE (env var or 1536) for backward compatibility. When creating new collections, the configured vector_size is used instead of the hardcoded constant. Closes #9377
* fix(utils): normalize camelCase thinking param keys to snake_case (#21762)
  Clients like OpenCode's @ai-sdk/openai-compatible send budgetTokens (camelCase) instead of budget_tokens in the thinking parameter, causing validation errors. Add early normalization in completion().
* feat: add optional digest mode for Slack alert types (#21683)
  Adds per-alert-type digest mode that aggregates duplicate alerts within a configurable time window and emits a single summary message with count, start/end timestamps.
  Configuration via general_settings.alert_type_config:
    alert_type_config:
      llm_requests_hanging:
        digest: true
        digest_interval: 86400
  Digest key: (alert_type, request_model, api_base)
  Default interval: 24 hours
  Window type: fixed interval
  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add blog_posts.json and local backup
* feat: add GetBlogPosts utility with GitHub fetch and local fallback
  Adds GetBlogPosts class that fetches blog posts from GitHub with a 1-hour in-process TTL cache, validates the response, and falls back to the bundled blog_posts_backup.json on any network or validation failure.
* test: add cache reset fixture and LITELLM_LOCAL_BLOG_POSTS test
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add GET /public/litellm_blog_posts endpoint
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: log fallback warning in blog posts endpoint and tighten test
* feat: add disable_show_blog to UISettings
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add useUISettings and useDisableShowBlog hooks
* fix: rename useUISettings to useUISettingsFlags to avoid naming collision
* fix: use existing useUISettings hook in useDisableShowBlog to avoid cache duplication
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add BlogDropdown component with react-query and error/retry state
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: enforce 5-post limit in BlogDropdown and add cap test
* fix: add retry, stable post key, enabled guard in BlogDropdown
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add BlogDropdown to navbar after Docs link
* feat: add network_mock transport for benchmarking proxy overhead without real API calls
  Intercepts at httpx transport layer so the full proxy path (auth, routing, OpenAI SDK, response transformation) is exercised with zero-latency responses. Activated via `litellm_settings: { network_mock: true }` in proxy config.
* Litellm dev 02 19 2026 p2 (#21871)
* feat(ui/): new guardrails monitor 'demo mock representation of what guardrails monitor looks like
* fix: ui updates
* style(ui/): fix styling
* feat: enable running ai monitor on individual guardrails
* feat: add backend logic for guardrail monitoring
* fix(guardrails/usage_endpoints.py): fix usage dashboard
* fix(budget): fix timezone config lookup and replace hardcoded timezone map with ZoneInfo (#21754)
* fix(budget): fix timezone config lookup and replace hardcoded timezone map with ZoneInfo
* fix(budget): update stale docstring on get_budget_reset_time
* fix: add missing return type annotations to iterator protocol methods in streaming_handler (#21750)
* fix: add return type annotations to iterator protocol methods in streaming_handler
  Add missing return type annotations to __iter__, __aiter__, __next__, and __anext__ methods in CustomStreamWrapper and related classes.
  - __iter__(self) -> Iterator["ModelResponseStream"]
  - __aiter__(self) -> AsyncIterator["ModelResponseStream"]
  - __next__(self) -> "ModelResponseStream"
  - __anext__(self) -> "ModelResponseStream"
  Also adds AsyncIterator and Iterator to typing imports. Fixes issue with PLR0915 noqa comments and ensures proper type checking support. Related to: BerriAI/litellm#8304
* fix: add ruff PLR0915 noqa for files with too many statements
* Add gollem Go agent framework cookbook example (#21747)
  Show how to use gollem, a production Go agent framework, with LiteLLM proxy for multi-provider LLM access including tool use and streaming.
* fix: avoid mutating caller-owned dicts in SpendUpdateQueue aggregation (#21742)
* fix(vertex_ai): enable context-1m-2025-08-07 beta header (#21870)
* server root path regression doc
* fixing syntax
* fix: replace Zapier webhook with Google Form for survey submission (#21621)
* Replace Zapier webhook with Google Form for survey submission
* Add back error logging for survey submission debugging
  ---------
  Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* Revert "Merge pull request #21140 from BerriAI/litellm_perf_user_api_key_auth"
  This reverts commit |
||
|
|
7f81dea8b3 | Add custom auth header support and increase default prompt size to 100k chars (#19436) | ||
|
|
270b41b0f4 | Simplify file comments (#19382) | ||
|
|
0cd7763d5f |
Add health check scripts and parallel execution support (#19295)
- Add health_check_client.py for monitoring model availability
- Add health_check_client_README.md with usage documentation
- Add health_check_requirements.txt for dependencies
- Add run_parallel_health_checks.ps1 (PowerShell version)
- Add run_parallel_health_checks.sh (Bash version)
- Organize all scripts under scripts/health_check/ directory |
||
|
|
07fe9e8604 |
implement failopen option default to True on grayswan guardrail (#18266)
* implement failopen option default to True
* introduce a config to set the timeout limit (default to 30) |
||
|
|
b635f92d90 | Add benchmark_proxy_vs_provider.py script to scripts directory with usage examples (#17889) | ||
|
|
762b429d6c | enhance: create_litellm_branch tool to be more robust (#17874) | ||
|
|
a7ad8a36a4 |
chore: cleanup unused scripts and fix misplaced test file (#17611)
Remove scripts/ directory containing unused development/debug scripts:
- mock_ibm_guardrails_server.py
- test_groq_streaming_issue.py (debug for #12660)
- test_mock_ibm_guardrails.py
- update_readme_providers_table.py
Move misplaced test file to correct location:
- test_litellm/ -> tests/test_litellm/ (from PR #17221) |
||
|
|
c44e075b2d |
feat: add script to create branches with litellm_ prefix (#17606)
Add utility scripts to create branches with litellm_ prefix from contributor branches. This helps maintain consistent branch naming conventions for CI/CD.
- scripts/create_litellm_branch.sh (Bash for macOS/Linux)
- scripts/create_litellm_branch.ps1 (PowerShell for Windows)
Usage:
  ./scripts/create_litellm_branch.sh [source_branch] [new_branch_name]
  ./scripts/create_litellm_branch.ps1 [source_branch] [new_branch_name]
Features:
- Auto-prefixes branch names with litellm_
- Handles existing branches gracefully
- Validates branch names
- Supports local and remote source branches |
||
|
|
d35d9008c9 | Ensure detector-id is passed as header to IBM detector server (#16649) | ||
|
|
0428229032 |
[Docs] readme fixes add supported providers (#16109)
* add provider test
* docs readme.md
* docs providers
* order providers
* test_providers_alphabetically_ordered
* docs endpoint
* fix config
* add ENDPOINT_COLUMNS
* add provider endpoints
* docs fix |
||
|
|
ddacaf6c32 |
(feat) Organizations: allow org admins to create teams on UI + (feat) IBM Guardrails (#15924)
* fix(oldteams.tsx): allow org admin to create team on ui
* fix(oldteams.tsx): show org admin a dropdown of allowed orgs for team creation
* docs(access_control.md): cleanup doc
* feat(ibm_guardrails/): initial commit adding support for ibm guardrails on litellm
  allows user to use self-hosted ibm guardrails
* feat(ibm_detector.py): working detector
* docs(ibm_guardrails.md): document new ibm guardrails
* fix: fix linting errors |
||
|
|
000ecad4e2 |
Fix Groq streaming ASCII encoding issue
Replace iter_lines()/aiter_lines() with iter_text()/aiter_text() using explicit UTF-8 encoding to handle non-ASCII characters like µ in streaming responses.
- Added utf8_iter_lines() and utf8_aiter_lines() helper functions
- Ensures proper UTF-8 decoding of streaming response content
- Added comprehensive tests for Unicode character handling
Fixes #12660 |
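
The idea behind the Groq streaming fix above (decode the stream explicitly as UTF-8, then split it into lines) can be illustrated with a minimal, standard-library-only sketch. The function name is borrowed from the commit message, but the body is an assumption written for illustration and is not LiteLLM's actual implementation. It takes plain byte chunks rather than an httpx response so that it stays self-contained.

```python
import codecs
from typing import Iterable, Iterator


def utf8_iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield text lines from byte chunks, always decoding as UTF-8.

    Hypothetical sketch: the name mirrors the helper mentioned in the commit
    message, but this body is an assumption, not LiteLLM's code.
    """
    # An incremental decoder buffers incomplete multi-byte sequences, so a
    # character such as "µ" (0xC2 0xB5) split across two chunks still decodes.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Emit every complete line; keep the trailing partial line buffered.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line.rstrip("\r")
    # Flush whatever the decoder and the buffer still hold at end of stream.
    buffer += decoder.decode(b"", final=True)
    if buffer:
        yield buffer.rstrip("\r")


if __name__ == "__main__":
    # "µ" is deliberately split across the chunk boundary.
    stream = [b'data: {"unit": "\xc2', b'\xb5s"}\n', b"data: [DONE]\n"]
    for line in utf8_iter_lines(stream):
        print(line)
```

An async variant would follow the same shape with `async for` over the chunks. Using an incremental decoder rather than calling `bytes.decode` on each chunk is the design choice that keeps multi-byte characters intact when they straddle a chunk boundary.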