* docs: add Claude Code skills page for litellm-skills
* docs: move skills page to new 'Manage with AI Agents' section
* docs: simplify install to one-liner, rename to LiteLLM Skills
The documentation test checks that all env vars used in code are
documented. The Vantage integration added 5 new env vars without
updating the reference table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(bedrock): respect s3_region_name for batch file uploads (#23569)
* fix(bedrock): respect s3_region_name for batch file uploads (GovCloud fix)
* fix: s3_region_name always wins over aws_region_name for S3 signing (Greptile feedback)
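A minimal sketch of the precedence rule described above, not the actual Bedrock code path. The helper name `resolve_s3_signing_region` and the `us-east-1` fallback are assumptions made for illustration only.

```python
from typing import Optional


def resolve_s3_signing_region(
    s3_region_name: Optional[str],
    aws_region_name: Optional[str],
    fallback: str = "us-east-1",  # assumed default, not taken from the source
) -> str:
    """Pick the region used to sign S3 requests (hypothetical helper)."""
    # s3_region_name always wins when it is set; aws_region_name is only
    # consulted when no S3-specific region was configured.
    return s3_region_name or aws_region_name or fallback
```

With this ordering, a GovCloud bucket configured via `s3_region_name="us-gov-west-1"` is signed for that region even if `aws_region_name` points elsewhere.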
* fix: _filter_headers_for_aws_signature - Bedrock KB (#23571)
* fix: _filter_headers_for_aws_signature
* fix: filter None header values in all post-signing re-merge paths
Addresses Greptile feedback: None-valued headers were being filtered
during SigV4 signing but re-merged back into the final headers dict
afterward, which would cause downstream HTTP client failures.
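The re-merge fix can be sketched roughly as follows. `merge_signed_headers` is a hypothetical name, and the real code has several re-merge paths; this only illustrates filtering None values before merging.

```python
from typing import Dict, Optional


def merge_signed_headers(
    original: Dict[str, Optional[str]],
    signed: Dict[str, str],
) -> Dict[str, str]:
    """Re-merge caller headers with SigV4-signed headers (hypothetical helper)."""
    # Drop None-valued entries first so they cannot reappear in the final dict.
    merged = {k: v for k, v in original.items() if v is not None}
    # Signed headers take priority over any caller-supplied duplicates.
    merged.update(signed)
    return merged
```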
Made-with: Cursor
* feat(router): tag_regex routing — route by User-Agent regex without per-developer tag config (#23594)
* feat(router): add tag_regex support for header-based routing
Adds a new `tag_regex` field to litellm_params that lets operators route
requests based on regex patterns matched against request headers — primarily
User-Agent — without requiring per-developer tag configuration.
Use case: route all Claude Code traffic (User-Agent: claude-code/x.y.z) to
a dedicated deployment by setting:
    tag_regex:
      - "^User-Agent: claude-code\\/"
in the deployment's litellm_params. Works alongside existing `tags` routing;
exact tag match takes precedence over regex match. Unmatched requests fall
through to deployments tagged `default`.
The matched deployment, pattern, and user_agent are recorded in
`metadata["tag_routing"]` so they flow through to SpendLogs automatically.
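The matching order described above can be sketched in plain Python. This is a simplified illustration, not the router implementation: deployments are modelled as plain dicts, and `match_reason` and `select_deployments` are hypothetical names.

```python
import re
from typing import Any, Dict, List, Optional

Deployment = Dict[str, Any]


def match_reason(
    dep: Deployment, request_tags: List[str], header_lines: List[str]
) -> Optional[str]:
    """Return "tag", "regex", or None for one deployment (hypothetical helper)."""
    # Exact tag match is checked first, so it takes precedence over regex.
    if any(tag in dep.get("tags", []) for tag in request_tags):
        return "tag"
    for pattern in dep.get("tag_regex", []):
        compiled = re.compile(pattern)
        if any(compiled.search(line) for line in header_lines):
            return "regex"
    return None


def select_deployments(
    deployments: List[Deployment],
    request_tags: List[str],
    headers: Dict[str, str],
) -> List[Deployment]:
    # Patterns are matched against "Name: value" strings, e.g. "User-Agent: ...".
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    matched = [
        d for d in deployments if match_reason(d, request_tags, header_lines)
    ]
    if matched:
        return matched
    # Unmatched requests fall through to deployments tagged `default`.
    return [d for d in deployments if "default" in d.get("tags", [])]
```

A request with `User-Agent: claude-code/1.2.3` would select the deployment carrying the `claude-code` pattern, while any other client lands on the `default` deployment.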
* fix(tag_regex): address backwards-compat, metadata overwrite, and warning noise
Three issues from code review:
1. Backwards-compat: `has_tag_filter` was widened to activate on any non-empty
User-Agent, which would raise ValueError for existing deployments using plain
tags without a `default` fallback. Fix: only activate header-based regex
filtering when at least one candidate deployment has `tag_regex` configured.
2. Metadata overwrite: `metadata["tag_routing"]` was overwritten for every
matching deployment in the loop, leaving inaccurate provenance when multiple
deployments match. Fix: write only for the first match.
3. Warning noise: an invalid regex pattern logged one warning per header string
rather than once per pattern. Fix: compile first (catching re.error once),
then iterate over header strings.
Also adds two new tests covering these cases, and adds docs page for
tag_regex routing with a Claude Code walk-through.
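The warning-noise fix (item 3) amounts to compiling before iterating. A minimal sketch, assuming a stand-in logger and a hypothetical helper name `any_header_matches`:

```python
import logging
import re
from typing import List

logger = logging.getLogger("tag_regex_sketch")  # stand-in for the real logger


def any_header_matches(pattern: str, header_lines: List[str]) -> bool:
    """Check one pattern against all header strings (hypothetical helper)."""
    try:
        # Compile once, so an invalid pattern is reported a single time.
        compiled = re.compile(pattern)
    except re.error as exc:
        logger.warning("invalid tag_regex pattern %r: %s", pattern, exc)
        return False
    # Only a successfully compiled pattern is tried against each header string.
    return any(compiled.search(line) for line in header_lines)
```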
* refactor(tag_regex): remove unnecessary _healthy_list copy
* docs: merge tag_regex section into tag_routing.md, remove standalone page
- Add ## Regex-based tag routing (tag_regex) section to existing
tag_routing.md instead of a separate page
- Remove tag_regex_routing.md standalone doc (odd UX to have a separate
page for a sub-feature)
- Remove proxy/tag_regex_routing from sidebars.js
- Add match_any=False debug warning in tag_based_routing.py when regex
routing fires under strict mode (regex always uses OR semantics)
* fix(tag_regex): address greptile review - security docs, strict-mode enforcement, validation order
- Strengthen security note in tag_routing.md: explicitly state User-Agent
is client-supplied and can be set to any value; frame tag_regex as a
traffic classification hint, not an access-control mechanism
- Move tag_regex startup validation before _add_deployment() so an invalid
pattern never leaves partial router state
- Enforce match_any=False strict-tag policy: when a deployment has both
tags and tag_regex and the strict tag check fails, skip the regex fallback
rather than silently bypassing the operator's intent
- Extract per-deployment match logic into _match_deployment() helper to
keep get_deployments_for_tag() readable
- Add two new tests: strict-mode blocks regex fallback, regex-only
deployment still matches under match_any=False
* fix(ci): apply Black formatting to 14 files and stabilize flaky caplog tests
- Run Black formatter on 14 files that were failing the lint check
- Replace caplog-based assertions in TestAliasConflicts with
unittest.mock.patch on verbose_logger.warning for xdist compatibility
- The caplog fixture can produce empty text in pytest-xdist workers
in certain CI environments, causing flaky test failures
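A rough sketch of the caplog replacement, using `mock.patch.object` on a stand-in logger. `register_alias` and the logger name are hypothetical; the real tests patch `verbose_logger.warning` in the code under test.

```python
import logging
from typing import Dict
from unittest import mock

verbose_logger = logging.getLogger("sketch.verbose")  # stand-in logger


def register_alias(aliases: Dict[str, str], alias: str, target: str) -> None:
    """Record an alias and warn on conflict (hypothetical code under test)."""
    if alias in aliases and aliases[alias] != target:
        verbose_logger.warning("alias conflict: %s", alias)
    aliases[alias] = target


def count_conflict_warnings() -> int:
    aliases = {"gpt": "openai/gpt-4o"}
    # Patch the warning method directly instead of reading captured log text,
    # so the check does not depend on log capture inside xdist workers.
    with mock.patch.object(verbose_logger, "warning") as mock_warning:
        register_alias(aliases, "gpt", "azure/gpt-4o")
    return mock_warning.call_count
```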
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
* fix(proxy): cap managed-object poll size + expire stale rows + kill-switch flag to prevent OOM/Prisma connection loss
* fix(constants): simplify PROXY_BATCH_POLLING_ENABLED readability
* docs+test: document new polling env vars, add pagination+stale-cleanup tests
* fix: exclude stale_expired from batch poll queries; fix update_many assertions in tests
* fix: scope stale cleanup to file_purpose, fix file_object mocks, add CheckBatchCost tests
* fix: avoid duplicate cost logging in fallback path; guard integer constants against zero/negative values
* fix: cache _has_batch_processed_column; guard cleanup from aborting poll; narrow fallback except
* fix: add complete/completed to primary query not_in; fix vacuous test assertion
- Primary find_many was missing "complete" and "completed" in its not_in
filter, creating asymmetry with the fallback query. A job whose status
was set to "complete" but whose batch_processed flag update failed would
be silently re-fetched and re-processed every cycle, emitting duplicate
cost logs.
- test_fallback_completion_update_omits_batch_processed patched
_is_base64_encoded_unified_file_id to return None, causing an immediate
continue — so update() was never called and the assertion looped over an
empty list (vacuously true). Rewrote the test to mock the full
completion pipeline, verify update() is called exactly once, and assert
batch_processed is absent from the update data.
- Added symmetric test (primary path) proving batch_processed IS included
when the column exists.
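The vacuous-assertion problem can be illustrated with a small sketch. `assert_no_batch_processed` is a hypothetical helper, and the `data=` keyword argument is an assumption about how update() is called.

```python
from unittest import mock


def assert_no_batch_processed(update_mock: mock.Mock) -> None:
    """Assert update() ran once without batch_processed (hypothetical helper)."""
    # Guard first: without this check, the loop below passes trivially
    # when update() was never called and call_args_list is empty.
    assert update_mock.call_count == 1, "update() should be called exactly once"
    for call in update_mock.call_args_list:
        assert "batch_processed" not in call.kwargs["data"]
```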
Made-with: Cursor
Document the PKCE_STRICT_CACHE_MISS environment variable in config_settings.md
to fix the CI env key documentation check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OpenAI rejects any reasoning_effort (even 'none') with tools in
/v1/chat/completions for gpt-5.4. Update the guard to drop reasoning_effort
regardless of value. Add docs explaining the auto-drop behavior.
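A minimal sketch of the guard, assuming a simple model-name prefix check. `drop_reasoning_effort_with_tools` is a hypothetical name and the real guard may identify affected models differently.

```python
from typing import Any, Dict


def drop_reasoning_effort_with_tools(
    model: str, params: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of params that is safe to send (hypothetical helper)."""
    cleaned = dict(params)
    # Assumption: affected models are identified by a "gpt-5.4" name prefix.
    if model.startswith("gpt-5.4") and cleaned.get("tools"):
        # Drop the parameter regardless of its value, including "none".
        cleaned.pop("reasoning_effort", None)
    return cleaned
```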
The existing documentation for the Responses API bridge only showed
examples with models that have `mode: responses` (like o3-deep-research),
which work automatically. This update clarifies that models with
`mode: chat` (like gpt-4o, gpt-5) require the `openai/responses/` prefix
to use built-in tools like web_search_preview.
Changes:
- Explain the `mode` property from model_prices_and_context_window.json
- List models with mode: responses vs mode: chat
- Add example showing the common error and how to fix it
- Add SDK example using the prefix with gpt-4o
- Update proxy example with both automatic and prefix-based configs
- Fix invalid trailing comma in original JSON example
Implement Anthropic Files API (upload, retrieve, list, delete, content)
using the BaseFilesConfig provider pattern. Adds multipart form-data
support to BaseLLMHTTPHandler for file uploads.
Add Claude Opus 4.6, Sonnet 4.6, Opus 4.5, Sonnet 4.5, and Haiku 4.5
to the web fetch supported models documentation. These models were
missing from the list despite supporting the web_fetch tool.
Add usage example with concrete model entry, explanation of load-time
expansion, and cross-reference to model_alias_map to clarify the
difference between the two features.