* Pyroscope: require PYROSCOPE_APP_NAME and PYROSCOPE_SERVER_ADDRESS, add UTF-8 locale hint
- No defaults for PYROSCOPE_APP_NAME or PYROSCOPE_SERVER_ADDRESS; fail at startup if unset when Pyroscope is enabled
- Set LANG/LC_ALL to C.UTF-8 when unset to reduce malformed_profile (invalid UTF-8) rejections
- Startup message suggests PYTHONUTF8=1 if server rejects profiles
- Simplify LITELLM_ENABLE_PYROSCOPE in config_settings; document Pyroscope env vars as required with no default
- Add pyroscope_profiling to sidebar (Alerting & Monitoring)
- pyproject.toml: pyroscope-io as required dep on non-Windows (marker), in proxy extra
* proxy: add PYROSCOPE_SAMPLE_RATE env, use verbose logging, fix int type
- Add optional PYROSCOPE_SAMPLE_RATE env (integer, no default)
- Pass sample_rate to pyroscope.configure() as int for pyroscope-io
- Replace print with verbose_proxy_logger (info/warning)
- Document PYROSCOPE_SAMPLE_RATE in config_settings.md
* Address Greptile PR feedback: Pyroscope optional, docs, tests, docstring
- pyproject.toml: mark pyroscope-io as optional=true (proxy extra only)
- Add docs/my-website/docs/proxy/pyroscope_profiling.md (fix broken sidebar link)
- Add tests/test_litellm/proxy/test_pyroscope.py for _init_pyroscope()
- proxy_server: fix _init_pyroscope docstring (required server/app name, sample rate as int)
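The Pyroscope environment handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in proxy_server.py: the helper names `build_pyroscope_settings` and `apply_utf8_locale_hint` are hypothetical, and treating the string "true" as the enabled value of LITELLM_ENABLE_PYROSCOPE is an assumption. The returned keys mirror the keyword arguments passed to pyroscope.configure().

```python
from typing import Dict, Mapping, MutableMapping, Optional

REQUIRED_VARS = ("PYROSCOPE_APP_NAME", "PYROSCOPE_SERVER_ADDRESS")


def apply_utf8_locale_hint(env: MutableMapping[str, str]) -> None:
    """Set LANG/LC_ALL to C.UTF-8 only when they are unset."""
    for name in ("LANG", "LC_ALL"):
        env.setdefault(name, "C.UTF-8")


def build_pyroscope_settings(env: Mapping[str, str]) -> Optional[Dict[str, object]]:
    """Return kwargs for pyroscope.configure(), or None when profiling is off.

    Assumption: "true" (case-insensitive) enables Pyroscope.
    """
    if env.get("LITELLM_ENABLE_PYROSCOPE", "").lower() != "true":
        return None

    # No defaults: fail at startup if a required variable is unset.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Pyroscope is enabled but unset: " + ", ".join(missing))

    settings: Dict[str, object] = {
        "application_name": env["PYROSCOPE_APP_NAME"],
        "server_address": env["PYROSCOPE_SERVER_ADDRESS"],
    }
    # Optional, no default; passed through as an int.
    raw_rate = env.get("PYROSCOPE_SAMPLE_RATE")
    if raw_rate:
        settings["sample_rate"] = int(raw_rate)
    return settings
```

Taking the environment as a mapping rather than reading os.environ directly keeps the validation testable without touching the process environment.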
* Update litellm/proxy/proxy_server.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
---------
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Fix Atlassian config: use http transport and correct URL (/v1/mcp not /v1/sse)
- Fix URL pattern: use standard /mcp/<server_name> not legacy /<server_name>/mcp
- Add parameter breakdown table explaining each claude mcp add argument
- Add warning that server name in proxy config must match URL path
- Add ngrok step for OAuth callback accessibility
- Add ~/.claude.json config option alongside claude mcp add
- Fix auth header guidance: use x-litellm-api-key for OAuth servers
* docs: add native thinking param examples for Claude Opus 4.6
Add documentation for using the native `thinking` parameter directly
with adaptive thinking and explicit budgets for Claude Opus 4.6.
* docs: add note about reasoning_effort mapping to adaptive for Opus 4.6
- Document two OpenAI web search approaches: search models (/chat/completions) vs web_search_preview tool (/responses)
- Add gpt-5-search-api examples across all sections in web_search.md
- Update /responses examples to use gpt-5 with web_search_preview tool
- Add OpenAI Web Search Models section to providers/openai.md
- Add web search example to providers/openai/responses_api.md
- Updated benchmarks.md with a section on setting up fake OpenAI endpoints
- Updated load_test.md to mention the self-hosted option
- Updated load_test_advanced.md with a tip box about the example repo
Reference: https://github.com/BerriAI/example_openai_endpoint
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* fix: _should_use_api_key_header
* test_azure_ai_validate_environment_with_api_key
* fix: remove unused top-level RouteChecks import
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add missing env keys to config_settings reference
Add MODEL_COST_MAP_MIN_MODEL_COUNT, MODEL_COST_MAP_MAX_SHRINK_RATIO,
and MAX_POLICY_ESTIMATE_IMPACT_ROWS to the environment variables
reference table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add opus 4.5 and 4.6 to use output_format param
* generate poetry lock with 2.3.2 poetry
* restore poetry lock
* e2e tests, key delete, update tpm rpm, and regenerate
* Split e2e ui testing for browser
* new login with sso button in login page
* option to hide usage indicator
* fix(cloudzero): update CBF field mappings per LIT-1907 (#20906)
* fix(cloudzero): update CBF field mappings per LIT-1907
Phase 1 field updates for CloudZero integration:
ADD/UPDATE:
- resource/account: Send concat(api_key_alias, '|', api_key_prefix)
- resource/service: Send model_group instead of service_type
- resource/usage_family: Send provider instead of hardcoded 'llm-usage'
- action/operation: NEW - Send team_id
- resource/id: Send model name instead of CZRN
- resource/tag:organization_alias: Add if exists
- resource/tag:project_alias: Add if exists
- resource/tag:user_alias: Add if exists
REMOVE:
- resource/tag:total_tokens: Removed
- resource/tag:team_id: Removed (team_id now in action/operation)
Fixes LIT-1907
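The field mapping listed above could look roughly like the sketch below. The function name `to_cbf_fields` and the input keys of `row` are assumptions made for illustration; the real logic lives in litellm/integrations/cloudzero/transform.py and may differ.

```python
from typing import Any, Dict


def to_cbf_fields(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map one spend row (hypothetical shape) to the CBF fields in LIT-1907."""
    alias = row.get("api_key_alias") or ""
    prefix = row.get("api_key_prefix") or ""

    record: Dict[str, Any] = {
        "resource/account": f"{alias}|{prefix}",       # concat(alias, '|', prefix)
        "resource/service": row.get("model_group"),    # was service_type
        "resource/usage_family": row.get("provider"),  # was hardcoded 'llm-usage'
        "action/operation": row.get("team_id"),        # new field
        "resource/id": row.get("model"),               # model name, not CZRN
    }

    # Alias tags are added only when a value exists.
    for tag in ("organization_alias", "project_alias", "user_alias"):
        if row.get(tag):
            record[f"resource/tag:{tag}"] = row[tag]
    return record
```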
* Update litellm/integrations/cloudzero/transform.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: define api_key_alias variable, update CBFRecord docstring
- Fix F821 lint error: api_key_alias was used but not defined
- Update CBFRecord docstring to reflect LIT-1907 field mappings
- Remove unused Optional import
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add banner notifying of breaking change
* Add semgrep & Fix OOMs (#20912)
* [Feat] Policies - Allow connecting policies to tags, simulating policies, and viewing how many keys and teams each policy applies to (#20904)
* init schema with TAGS
* ui: add policy test
* resolvePoliciesCall
* add_policy_sources_to_metadata + headers
* types Policy
* preview Impact
* def _describe_match_reason(
* match based on TAGs
* TestTagBasedAttachments
* test fixes
* add policy_resolve_router
* add_guardrails_from_policy_engine
* TestMatchAttribution
* refactor
* fix
* fix: address Greptile review feedback on policy resolve endpoints
- Track unnamed keys/teams as separate counts instead of inflating
affected_keys_count with duplicate "(unnamed key)" placeholders.
Added unnamed_keys_count and unnamed_teams_count to response.
- Push alias pattern matching to DB via _build_alias_where() which
converts exact patterns to Prisma "in" and suffix wildcards to
"startsWith" filters.
- Gate sync_policies_from_db/sync_attachments_from_db behind
force_sync query param (default false) to avoid 2 DB round-trips
on every /policies/resolve request.
- Remove worktree-only conftest.py that cleared sys.modules at import
time — no longer needed since code moved to main repo.
- Rename MAX_ESTIMATE_IMPACT_ROWS → MAX_POLICY_ESTIMATE_IMPACT_ROWS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
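A rough sketch of how alias patterns might be turned into a database filter, as described for _build_alias_where() above. The function below is a standalone illustration with a hypothetical signature. The filter keys "in" and "startsWith" follow the wording of this commit message; the exact spelling expected by the Prisma client in use is not verified here.

```python
from typing import Any, Dict, List


def build_alias_where(field: str, patterns: List[str]) -> Dict[str, Any]:
    """Convert alias patterns into a Prisma-style where clause.

    Exact patterns are grouped into one "in" filter; patterns ending in
    "*" (suffix wildcards) each become a "startsWith" filter.
    """
    exact = [p for p in patterns if not p.endswith("*")]
    prefixes = [p[:-1] for p in patterns if p.endswith("*")]

    conditions: List[Dict[str, Any]] = []
    if exact:
        conditions.append({field: {"in": exact}})
    for prefix in prefixes:
        conditions.append({field: {"startsWith": prefix}})

    if not conditions:
        return {}
    if len(conditions) == 1:
        return conditions[0]
    return {"OR": conditions}
```

Pushing the match into the where clause means only matching rows are fetched, instead of loading every key or team and filtering in Python.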
* fix: eliminate duplicate DB queries and fix header delimiter ambiguity
- Fetch teams table once in estimate_attachment_impact and reuse for
both tag-based and alias-based lookups (was querying teams twice when
both tag_patterns and team_patterns were provided).
- Convert tag/team filter functions from async DB queries to sync
filters that operate on pre-fetched data (_filter_keys_by_tags,
_filter_teams_by_tags).
- Fix comma ambiguity in x-litellm-policy-sources header: use '; '
as entry delimiter since matched_via values can contain commas.
- Use '+' as the within-value separator in matched_via reason strings
(e.g. "tag:healthcare+team:health-team") to avoid conflict with
header delimiters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
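The delimiter choice above can be illustrated with a small sketch. The helper names and the `policy=reason` entry layout are assumptions; only the '; ' entry delimiter and the '+' separator inside matched_via come from this commit.

```python
from typing import Dict, List


def describe_match_reason(parts: List[str]) -> str:
    """Join match reasons with '+', e.g. 'tag:healthcare+team:health-team'."""
    return "+".join(parts)


def format_policy_sources(sources: Dict[str, str]) -> str:
    """Build the header value using '; ' between entries.

    The 'policy=reason' layout is a hypothetical format for illustration.
    """
    return "; ".join(f"{policy}={reason}" for policy, reason in sources.items())


def parse_policy_sources(header: str) -> Dict[str, str]:
    """Split on '; ' so commas inside a value are left intact."""
    entries = [e for e in header.split("; ") if e]
    return dict(e.split("=", 1) for e in entries)
```

Because entries are split on '; ' rather than ',', a matched_via value that contains a comma still round-trips unchanged.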
* Update litellm/proxy/policy_engine/policy_resolve_endpoints.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: type error & better error handling (#20689)
* [Docs] Add docs guide for using policies (#20914)
* docs v1 guide with UI imgs
* docs fix
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add dashscope/qwen3-max model with tiered pricing (#20919)
Add support for Alibaba Cloud's Qwen3-Max model with:
- 258K input tokens, 65K output tokens
- Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
- Function calling and tool choice support
- Reasoning capabilities enabled
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
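Tiered pricing by context window usage can be sketched as below. Every number in `TIERS` is a placeholder, not the real Qwen3-Max price, and two further assumptions are made: "K" is read as 1,000 tokens, and the whole request is billed at the rate of the tier its prompt size falls into.

```python
from typing import List, Tuple

# (upper bound in prompt tokens, HYPOTHETICAL price per 1M input tokens)
TIERS: List[Tuple[int, float]] = [
    (32_000, 1.0),    # 0-32K
    (128_000, 2.0),   # 32K-128K
    (252_000, 3.0),   # 128K-252K
]


def input_cost(prompt_tokens: int) -> float:
    """Price the prompt at the rate of the first tier that contains it."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    for upper_bound, price_per_million in TIERS:
        if prompt_tokens <= upper_bound:
            return prompt_tokens * price_per_million / 1_000_000
    raise ValueError("prompt exceeds the largest priced tier")
```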
* fix linting
* docs: add Greptile review requirement to PR template (#20762)
* fix(azure): preserve content_policy_violation error details from Azure OpenAI
Closes #20811
Azure OpenAI returns rich error payloads for content policy violations
(inner_error with ResponsibleAIPolicyViolation, content_filter_results,
revised_prompt). Previously these details were lost when:
1. The top-level error code was not "content_policy_violation" but the
inner_error.code was "ResponsibleAIPolicyViolation" -- the structured
check only examined the top-level code.
2. The DALL-E image generation polling path stringified the error JSON
into the message field instead of setting the structured body, making
it impossible for exception_type() to extract error details.
3. The string-based fallback detector used "invalid_request_error" as a
content-policy indicator, which is too broad and could misclassify
regular bad-request errors.
Changes:
- exception_mapping_utils.py: Check inner_error.code for
ResponsibleAIPolicyViolation when top-level code is not
content_policy_violation. Replace overly broad "invalid_request_error"
string match with specific Azure safety-system messages.
- azure.py: Set structured body on AzureOpenAIError in both async and
sync DALL-E polling paths so exception_type() can inspect error details.
- test_azure_exception_mapping.py: Add regression tests covering the
exact error payloads from issue #20811.
- Fix pre-existing lint: duplicate PerplexityResponsesConfig dict key,
unused RouteChecks top-level import.
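The structured check described for exception_mapping_utils.py can be sketched as follows. This is an illustration under assumptions, not the shipped code: the function name is hypothetical, and the sketch accepts both an `inner_error` key (as written in this message) and an `innererror` key, since the exact payload key is not confirmed here.

```python
from typing import Any, Dict


def is_content_policy_violation(body: Dict[str, Any]) -> bool:
    """Detect a content policy violation from a structured error body."""
    # Accept either {"error": {...}} or the bare error object.
    error = body.get("error") or body
    if not isinstance(error, dict):
        return False

    # Case 1: the top-level code already says so.
    if error.get("code") == "content_policy_violation":
        return True

    # Case 2: the top-level code differs, but the inner error carries
    # ResponsibleAIPolicyViolation.
    inner = error.get("inner_error") or error.get("innererror") or {}
    return isinstance(inner, dict) and inner.get("code") == "ResponsibleAIPolicyViolation"
```

Checking structured codes instead of matching the string "invalid_request_error" avoids classifying ordinary bad-request errors as policy violations.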
---------
Co-authored-by: Kelvin Tran <kelvin-tran@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com>
Co-authored-by: ken <122603020@qq.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
* Generic Guardrails: Forward request headers + litellm_version to generic guardrail API
* Generic Guardrail: Forward request headers using an allowlist instead of a denylist
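An allowlist filter for forwarded request headers might look like the sketch below. The header names in `ALLOWED_HEADERS` and the function name are hypothetical; the actual allowlist used by the generic guardrail integration is not stated in this message.

```python
from typing import Dict, FrozenSet

# HYPOTHETICAL allowlist, lower-cased for case-insensitive matching.
ALLOWED_HEADERS: FrozenSet[str] = frozenset({"user-agent", "x-request-id"})


def filter_forwarded_headers(
    headers: Dict[str, str],
    allowlist: FrozenSet[str] = ALLOWED_HEADERS,
) -> Dict[str, str]:
    """Keep only headers whose lower-cased name is in the allowlist."""
    return {name: value for name, value in headers.items() if name.lower() in allowlist}
```

With an allowlist, a header that nobody anticipated (for example a new credential header) is dropped by default, whereas a denylist would forward it until someone adds it to the list.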