Commit Graph

10 Commits

Author SHA1 Message Date
Shivam Rawat fbff60e9d9 fix(proxy): strip LiteLLM policy tracking from OpenAI batch metadata (#28425)
* fix(proxy): strip LiteLLM policy tracking from OpenAI batch metadata

Batch create was failing with `Invalid type for 'metadata.applied_policies':
expected a string, but got an array instead` whenever a policy attachment
matched the request. The policy engine helpers wrote `applied_policies`,
`applied_guardrails`, and `policy_sources` into `data["metadata"]`
unconditionally, and `/v1/batches` forwarded that dict straight to OpenAI,
which only accepts string values.

- Route proxy-internal tracking into `litellm_metadata` for batch/file
  routes via a shared `_get_or_create_proxy_metadata_bucket` helper.
- Sanitize `data["metadata"]` in `create_batch` to drop known internal
  keys and non-string values before building the OpenAI request.
- Cover both behaviors with unit + endpoint tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(proxy): merge metadata buckets for batch policy response headers

Ensure get_logging_caching_headers reads both metadata and litellm_metadata so policy/guardrail headers are emitted on batch routes with user metadata, and log dropped non-string OpenAI metadata at debug level.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 11:35:42 -07:00
Michael-RZ-Berri 3b2ce201d8 encrypt callback_vars in key/team metadata at rest (#27141)
Co-authored-by: Michael Riad Zaky <michaelr@Michaels-MacBook-Air.local>
Co-authored-by: Yuneng Jiang <yuneng@berri.ai>
2026-05-23 12:15:44 -07:00
Krrish Dholakia 386f334fee Prompt Compression - add it to the proxy (#25729)
* refactor: new agentic loop event hook

simplifies how to create logic for tool based multi llm calls

* fix: compress - make it work on anthropic input as well

* fix(compress.py): working prompt compression for claude code

ensures claude code messages can run through proxy easily

* docs: add agentic loop hook guide

* docs: add agentic_loop_hook to sidebar

* fix: fix multiple arguments error

* fix: fix tool call loop for compression on streaming /v1/messages

* fix: fix linting errors

* fix: fix ci/cd errors

* feat(litellm_pre_call_utils.py): use claude code session for litellm session id

allows claude code logs to be stitched together, making it easy to know they were all part of the same conversation

* fix: suppress incorrect mypy warning rE: module

* revert: drop PR's changes to litellm/proxy/_experimental/out/

Restores the 34 HTML files under _experimental/out/ to their pre-PR
paths (X/index.html -> X.html). All renames are R100 (content
unchanged); no other files are touched.

* fix: address greptile review comments on PR #25729

- Skip ``kwargs["tools"] = []`` injection when compression is a no-op —
  Anthropic Messages rejects empty tool arrays on requests that did not
  originally declare tools.
- Move agentic-loop safety guards (fingerprint cycle / max depth) out of
  the per-callback try/except so they propagate instead of being swallowed
  by the generic exception handler. Extracted _check_agentic_loop_safety.
- Gate generic ``x-<vendor>-session-id`` capture behind the
  LITELLM_CAPTURE_VENDOR_SESSION_HEADERS env var (off by default) to
  preserve backwards compatibility; explicit x-litellm-* headers are
  unaffected.
- Fix monkeypatch target in pre-call-hook test to patch the actual
  module-level binding
  (litellm.integrations.compression_interception.handler.compress).
- Add regression tests for empty-tools skip and opt-in session capture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* revert: drop LITELLM_CAPTURE_VENDOR_SESSION_HEADERS flag

Generic x-<vendor>-session-id header capture is a new feature and only
runs *after* the explicit x-litellm-trace-id / x-litellm-session-id
checks, so it does not change behavior for any existing caller that was
already using the LiteLLM headers — no backwards-incompatibility to gate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(compress): replace input_type with CallTypes call_type

Drop the bespoke ``CompressionInputType`` literal and use the existing
``litellm.types.utils.CallTypes`` enum instead.  ``litellm.compress()``
now takes ``call_type: Union[CallTypes, str]`` (default
``CallTypes.completion``) — no new concept to learn, and the enum is
already the way the rest of the codebase talks about request shapes.

Supported values: ``completion`` / ``acompletion`` (OpenAI chat-completions
shape) and ``anthropic_messages`` (Anthropic structured content blocks).

Updated: compress(), the compression_interception handler, tests, docs,
and the two eval scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 15:08:00 -07:00
Ishaan Jaffer e8461b5b97 style: run black formatter on files from main merge 2026-04-17 13:02:59 -07:00
yuneng-jiang 0dd4db34bd Working setting generic callbacks on UI 2025-12-05 14:37:48 -08:00
yuneng-jiang cff8a3115a Merge with main 2025-11-14 20:02:35 -08:00
yuneng-jiang cb27d6c456 [Fix] UI - Delete Callbacks Failing (#16473)
* Temp commit for branch switching

* Created normalize callback name util function and tests
2025-11-12 18:43:37 -08:00
yuneng-jiang 7833b3fdb4 Addressing comments 2025-11-10 17:28:13 -08:00
kayoch1n 76555cad81 Format code 2025-09-03 11:20:52 +08:00
kayoch1n ffbe5cd899 Add a testcase 2025-09-03 11:08:23 +08:00