352 Commits

Author SHA1 Message Date
user bb6d7c9715 fix(callbacks): preserve langfuse secret alias 2026-04-30 14:36:51 -07:00
user 258edac727 test(callbacks): cover upstream langfuse debug env 2026-04-30 14:34:39 -07:00
user 15d4d51453 chore(callbacks): guard dynamic integration hosts 2026-04-30 14:27:19 -07:00
user 7497674661 fix(proxy): sanitize redaction controls at ingress 2026-04-29 22:52:31 -07:00
user 842eea0131 chore(proxy): harden request control fields 2026-04-29 22:35:17 -07:00
Sameer Kankute b516120036 Merge pull request #26737 from BerriAI/litellm_internal_staging
merge internal staging
2026-04-29 08:50:12 +05:30
Sameer Kankute cf74f55b79 Fix extra body error 2026-04-29 08:34:31 +05:30
milan-berri 10aed9e981 feat(logging): add retry settings for generic API logger (#26645)
* Add retry settings for generic API logger

Made-with: Cursor

* Refine generic API retry behavior

Made-with: Cursor
2026-04-28 08:38:17 -07:00
ishaan-berri 8a9faa81b2 feat(guardrails): LLM-as-a-Judge guardrail (#26360)
* feat(guardrails): add LLM_AS_A_JUDGE to SupportedGuardrailIntegrations

* feat(types): add EvalVerdict, StandardLoggingEvalInformation; wire eval_information into SpendLogsMetadata

* feat(guardrails): add self-contained llm_as_a_judge guardrail hook

* fix(a2a): filter agent-only litellm_params from acompletion kwargs; pass agent_id into body

* feat(ui): add LLMJudgeFields criteria builder component

* feat(ui): wire LLM-as-a-Judge into add guardrail form

* feat(ui): update EvalViewer — title 'LLM Judge Results', weighted score column, summary row

* fix(ui): wire EvalViewer into LogDetailContent to show LLM judge results on logs page

* fix(guardrails-ui): route llm_as_a_judge to criteria builder step; rename to LiteLLM LLM as a Judge; add litellm logo

* fix(guardrail-viewer): stack lifecycle + eval details vertically to avoid badge overflow in narrow drawer

* fix(guardrail-create): surface config validation errors on create instead of silently orphaning guardrail in DB

* fix(guardrail-registry): hardcode llm_as_a_judge in initializer registry so it loads regardless of package install path

* fix(llm-as-a-judge): fix P1 code quality issues - validate weights/on_failure, guard pre_call, handle multimodal, move imports to module level, fix spurious finally logging

* fix(guardrail_endpoints): use correct PK field in rollback delete and log rollback failure

* fix(llm_as_a_judge): support Pydantic object in _get_litellm_param fallback chain

* fix(LLMJudgeFields): replace @tremor/react Button with antd Button

* fix(llm_as_a_judge): remove dead registry dicts, fix KeyError in prompt builder, set correct status on judge failure

* test(llm_as_a_judge): add unit tests for guardrail hook

* fix(llm_as_a_judge): remove @log_guardrail_information decorator to fix duplicate guardrail_information entries

The decorator and the manual finally block both called add_standard_logging_guardrail_information_to_request_data, producing two entries per request. The decorator also misclassified HTTPException(422) blocks as guardrail_failed_to_respond (it checks for 400). The finally block correctly tracks status throughout, so removing the decorator is sufficient.
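
The double-logging described above can be reproduced in a few lines. This is a minimal sketch, not the LiteLLM code: `record_after_call`, `hook_with_finally`, and the `guardrail_information` list key are hypothetical stand-ins for the decorator, the hook body, and the request-data field.

```python
import functools


def record_after_call(func):
    """Hypothetical logging decorator: appends one entry after every call."""

    @functools.wraps(func)
    def wrapper(request_data: dict):
        try:
            return func(request_data)
        finally:
            request_data.setdefault("guardrail_information", []).append("decorator")

    return wrapper


def hook_with_finally(request_data: dict) -> str:
    """Hypothetical hook body that already records its own status in finally."""
    status = "success"
    try:
        return "allowed"
    finally:
        request_data.setdefault("guardrail_information", []).append(f"finally:{status}")


# Stacking the decorator on top of the finally block yields two entries.
decorated_data: dict = {}
record_after_call(hook_with_finally)(decorated_data)

# Without the decorator, the finally block alone yields exactly one.
plain_data: dict = {}
hook_with_finally(plain_data)
```

Dropping the decorator leaves the `finally` block as the single writer, which is the shape the commit message settles on.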

* fix(test_gcs_pub_sub): ignore metadata.eval_information in comparison

* fix(test_spend_management): ignore metadata.eval_information in payload comparison

* fix(types/guardrails): add input_type and messages to ApplyGuardrailRequest

* fix(guardrail_endpoints): pass input_type and messages through apply_guardrail endpoint

* fix(guardrail_endpoints): auto-detect post_call guardrails and use input_type=response

* fix(a2a_endpoints): merge agent litellm_params guardrails into data before post_call hooks

* fix(llm_as_a_judge): use float sum with tolerance for weight validation
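
A tolerance-based check of this kind might look like the sketch below. The target sum of 1.0, the default tolerance, and the name `validate_weights` are assumptions for illustration; the commit message only says that a float sum is compared with a tolerance.

```python
import math
from typing import Mapping


def validate_weights(weights: Mapping[str, float], tolerance: float = 1e-6) -> None:
    """Raise ValueError unless the weights sum to 1.0 within `tolerance`.

    Assumption: both the 1.0 target and the tolerance value are illustrative.
    """
    total = sum(weights.values())
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=tolerance):
        raise ValueError(f"criteria weights sum to {total}, expected 1.0")
```

An exact `total == 1.0` comparison would reject ten weights of 0.1 each, because their floating-point sum is slightly below 1.0; the tolerance accepts them.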

* fix(guardrail_registry): split long import line for black formatting

* fix(llm_as_a_judge): guard guardrail_name Optional for mypy

* fix(llm_as_a_judge): set guardrail_status=guardrail_intervened when score fails, regardless of on_failure mode

* fix(a2a_endpoints): use try/finally so deferred spend log fires even when guardrail blocks with 422

* fix(litellm_logging): declare _defer_async_logging and _enqueue_deferred_logging on Logging class for mypy

* fix(logging_worker): restore queue.join() in flush() to wait for in-flight callbacks
2026-04-24 17:15:32 -07:00
ishaan-berri 8a4a775b1b fix(logging): add litellm_call_id to StandardLoggingPayload and OTel span (#26133)
* add litellm_call_id field to StandardLoggingPayload

* populate litellm_call_id in get_standard_logging_object_payload

* emit litellm.call_id span attribute in OTel integration

* test: litellm_call_id is present in StandardLoggingPayload

* test: litellm.call_id emitted as OTel span attribute

* test: allow litellm. prefix attributes in redacted span validator
2026-04-21 15:24:32 -07:00
Yuneng Jiang 11c3270cdc Merge remote-tracking branch 'origin/litellm_internal_staging' into litellm_yj_apr17
# Conflicts:
#	litellm/__init__.py
2026-04-17 17:36:40 -07:00
Yuneng Jiang ee2cf0e6e8 fix: address three CI failures from recent security PR merges
- url_utils.py: narrow sockaddr[0] from str|int to str via a helper with a
  fail-closed isinstance check. Fixes the two mypy errors introduced by
  the SSRF hardening without masking unexpected stdlib behavior.

- key_management_endpoints.py: restore the documented team member_permissions
  path for /key/update. The cross-key admin check added to close the
  cross-org rewrite attack was over-broad: it rejected non-admin team
  members even when can_team_member_execute_key_management_endpoint had
  already validated their team membership and /key/update grant. Now skip
  the admin check when the key has a team_id and the change is non-budget
  (membership + permission already enforced above). Budget/spend changes
  still require team/org admin. The cross-org attack remains blocked:
  an outside org admin fails the earlier team membership check.

- test_logging_redaction_e2e_test.py: rename and rewrite two parametrized
  tests to assert that request-body turn_off_message_logging has no effect.
  Reflects the intentional removal of turn_off_message_logging from
  _supported_callback_params so the caller cannot override admin logging
  policy via the request body.

- test_key_management_endpoints.py: add two tests covering the restored
  team member permission path — one positive (non-budget update succeeds
  for a team member with /key/update grant), one negative (max_budget
  change still rejected without admin role).
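
The first item above (narrowing `sockaddr[0]` with a fail-closed `isinstance` check) could be sketched roughly as follows. The helper name `sockaddr_host` and the choice of `ValueError` are hypothetical; the actual helper in url_utils.py is not shown in the commit message.

```python
from typing import Tuple, Union


def sockaddr_host(sockaddr: Tuple[Union[str, int], ...]) -> str:
    """Return sockaddr[0] as a str, refusing any other type (fail closed)."""
    host = sockaddr[0]
    if not isinstance(host, str):
        # Surface unexpected stdlib behavior instead of coercing with str().
        raise ValueError(f"unexpected sockaddr host type: {type(host).__name__}")
    return host
```

After the check, a type checker treats the return value as `str`, so no cast or ignore comment is needed.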
2026-04-17 15:11:45 -07:00
Ishaan Jaffer e8461b5b97 style: run black formatter on files from main merge 2026-04-17 13:02:59 -07:00
Ishaan Jaffer 98c2d90f5c fix(logging): update test_get_additional_headers to reflect provider header passthrough 2026-04-15 12:23:33 -07:00
David Chen b7ccc5b691 [Test Fix] fix gov pricing tests (#25022)
* fix pricing tests

* fix mypy

* fix cost expectation since a US-based model is now used

* fix test get model info
2026-04-02 15:55:55 -07:00
David Chen d1df4e838b Litellm fix update bedrock models (#24947)
* update bedrock models in tests

* updated more tests and model_prices_and_context_window

* fix model id and pricing

* replace more sonnet models

* update tests

* git push

* update pricing

* flaky total cost

* monkey patch

* relax the cost change

* fix and revert some changes

* revert the pricing

* chore: move cost/pricing changes to bedrock-cost-fixes branch

* chore: split Bedrock file-api beta stripping to separate branch

Removes strip_unsupported_file_api_betas_for_bedrock_invoke from this branch;
see litellm_bedrock_invoke_strip_file_api_betas for that fix.

Made-with: Cursor
2026-04-01 19:22:54 -07:00
ishaan-berri e4442a4d98 test fix us.anthropic.claude-haiku-4-5-20251001-v1:0 (#24931)
* test fix us.anthropic.claude-haiku-4-5-20251001-v1:0

* ignore mypy cache files

---------

Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
Co-authored-by: David Chen <clfhhc@gmail.com>
2026-04-01 11:01:03 -07:00
Ishaan Jaffer 0298c1f58d test_basic_s3_v2_logging 2026-03-30 18:17:52 -07:00
Ishaan Jaffer 443566d4f5 test fixes 2026-03-30 16:59:27 -07:00
Ishaan Jaffer 28afbc152f test_async_gcs_pub_sub_v1 2026-03-30 16:52:56 -07:00
Ishaan Jaffer 431782c3fe test azure blob storage 2026-03-30 15:54:07 -07:00
Krrish Dholakia 25f2baad71 test: cleanup dead tests 2026-03-28 20:49:02 -07:00
Krrish Dholakia 0fef88d67c test: remove dead tests 2026-03-28 20:23:44 -07:00
Krrish Dholakia bc829d51f2 test: test 2026-03-28 19:17:38 -07:00
Ishaan Jaff 81dadb698a Ishaan - March 18th changes (#24056)
* add DD Tracing (#24033)

* feat(models): add Azure GPT-5.4 mini and nano variants (#24045)

Add `azure/gpt-5.4-mini` and `azure/gpt-5.4-nano` to the model
database with official pricing from Azure OpenAI:

- GPT-5.4 mini: $0.75/M input, $0.075/M cached, $4.5/M output
- GPT-5.4 nano: $0.20/M input, $0.02/M cached, $1.25/M output

Both models support:
- 1.05M input / 128K output context window
- Chat, batch, and responses endpoints
- Function calling, tools, vision, reasoning
- Prompt caching with automatic tiered pricing

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Add new model pricing details for volcengine Doubao-Seed-2.0 series (#23871)

Add entries for volcengine Doubao-Seed-2.0 series

* fix(mcp): support refresh_token grant type in OAuth token endpoint (#23701)

* fix(mcp): support refresh_token grant type in OAuth token endpoint (#23700)

The .well-known/oauth-authorization-server metadata advertises
refresh_token as a supported grant type, but the token endpoint
rejected it with HTTP 400. This adds refresh_token grant support
so MCP clients can refresh expired tokens without re-authenticating.

* test(mcp): add tests for refresh_token grant type in OAuth token endpoint

* fix(mcp): move code_verifier guard into authorization_code branch

code_verifier is only relevant for authorization_code grants (PKCE).
Move it inside the else branch so it doesn't apply to refresh_token.

* fix(mcp): guard None client_secret and forward scope in token exchange

- Conditionally include client_secret in form data to prevent httpx
  from sending the literal string "None" (applies to both
  authorization_code and refresh_token branches)
- Forward optional scope parameter per RFC 6749 §6, allowing clients
  to request a subset of originally-granted scopes on refresh

* fix(mcp): validate code param in authorization_code grant

Guard against None code being form-encoded as literal string "None"
by httpx, symmetric with the existing refresh_token guard.
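
The guards described in the last two items amount to leaving optional fields out of the form instead of passing `None` through. A minimal sketch for the refresh_token branch, assuming a hypothetical helper `build_refresh_form` (the field names follow RFC 6749; the function itself is not from the codebase):

```python
from typing import Dict, Optional


def build_refresh_form(
    refresh_token: str,
    client_id: str,
    client_secret: Optional[str] = None,
    scope: Optional[str] = None,
) -> Dict[str, str]:
    """Build form fields for a refresh_token grant, omitting unset optionals."""
    form = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    }
    if client_secret is not None:
        form["client_secret"] = client_secret  # only sent when configured
    if scope is not None:
        form["scope"] = scope  # optional subset of the granted scopes
    return form
```

Because the returned dict contains only string values, nothing downstream has to decide how to encode `None`.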

* docs: add incident report for guardrail logging secret exposure (#24059)

Add blog post documenting the guardrail logging path exposing internal
request data (e.g. Authorization headers) in spend logs and OTEL traces.
Fix available in LiteLLM 1.82.3+.

Made-with: Cursor

* [Fix] Datadog LLM Observability tags format (env, service, version missing) (#23673)

* tag fix

* greptile comment

* fix(ci): stabilize 6 failing CI jobs

1. mypy: remove duplicate type annotation for token_data in discoverable_endpoints.py
2. integrations tests: add parameterized to CI test deps
3. doc quality: document OTEL_IGNORE_CONTEXT_PROPAGATION env key
4. security: allowlist CVE-2026-2673, CVE-2026-3644, CVE-2026-4224 (no fix available)
5. proxy_store_model_in_db: fix missing x-litellm-call-id header on error responses
6. google tests: add --retries 3 for transient Vertex AI rate limits

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(streaming): handle RuntimeError during model_copy in streaming handler

The race condition occurs when model_copy(deep=True) tries to deepcopy
_hidden_params dict while it's being concurrently modified by logging
callbacks. Fall back to shallow copy if the deep copy fails.
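
The fallback can be illustrated with the standard library's `copy` module standing in for Pydantic's `model_copy`. This is a sketch under that assumption; `copy_with_fallback` is a hypothetical name, and the real handler operates on a Pydantic model.

```python
import copy
from typing import TypeVar

T = TypeVar("T")


def copy_with_fallback(obj: T) -> T:
    """Deep-copy `obj`, falling back to a shallow copy on RuntimeError."""
    try:
        return copy.deepcopy(obj)
    except RuntimeError:
        # Raised, for example, when a dict is mutated by another task
        # while deepcopy is iterating over it.
        return copy.copy(obj)
```

The trade-off is that the shallow copy still shares nested objects with the original, so it avoids the crash but not the underlying concurrent mutation.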

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(cost): handle non-string traffic_type in cost calculator + add retries

1. Fix AttributeError in _map_traffic_type_to_service_tier when traffic_type
   is an integer (cast to str before calling .upper()). This was causing
   pass-through vertex spend logging to fail silently.
2. Add --retries to llm_translation_testing for flaky external API calls.
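
The cast in item 1 can be sketched as below. `normalize_traffic_type` is a hypothetical helper showing only the normalization step; the mapping to a service tier that follows it in the real function is not reproduced here.

```python
from typing import Optional


def normalize_traffic_type(traffic_type: object) -> Optional[str]:
    """Return an upper-cased string form of `traffic_type`, or None if unset."""
    if traffic_type is None:
        return None
    # str() first: an int has no .upper(), which was the AttributeError.
    return str(traffic_type).upper()
```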

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: ExMatics HydrogenC <33123710+HydrogenC@users.noreply.github.com>
Co-authored-by: Jack Venberg <jack.venberg@rover.com>
Co-authored-by: milan-berri <milan@berri.ai>
Co-authored-by: Shivam Rawat <161387515+shivamrawat1@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-03-19 10:20:35 -07:00
Ishaan Jaff 8e61b32b8e [Staging] - Ishaan March 17th (#23903)
* feat(xai): add grok-4.20 beta 2 models with pricing (#23900)

Add three grok-4.20 beta 2 model variants from xAI:
- grok-4.20-multi-agent-beta-0309 (reasoning + multi-agent)
- grok-4.20-beta-0309-reasoning (reasoning)
- grok-4.20-beta-0309-non-reasoning

Pricing (from https://docs.x.ai/docs/models):
- Input: $2.00/1M tokens ($0.20/1M cached)
- Output: $6.00/1M tokens
- Context: 2M tokens

All variants support vision, function calling, tool choice, and web search.
Closes LIT-2171

* docs: add Quick Install section for litellm --setup wizard (#23905)

* docs: add Quick Install section for litellm --setup wizard

* docs: clarify setup wizard is for local/beginner use

* feat(setup): interactive setup wizard + install.sh (#23644)

* feat(setup): add interactive setup wizard + install.sh

Adds `litellm --setup` — a Claude Code-style TUI onboarding wizard that
guides users through provider selection, API key entry, and proxy config
generation, then optionally starts the proxy immediately.

- litellm/setup_wizard.py: wizard with ASCII art, numbered provider menu
  (OpenAI, Anthropic, Azure, Gemini, Bedrock, Ollama), API key prompts,
  port/master-key config, and litellm_config.yaml generation
- litellm/proxy/proxy_cli.py: adds --setup flag that invokes the wizard
- scripts/install.sh: curl-installable script (detect OS/Python, pip
  install litellm[proxy], launch wizard)

Usage:
  curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh
  litellm --setup

* fix(install.sh): remove orange color, add LITELLM_BRANCH env var for branch installs

* fix(install.sh): install from git branch so --setup is available for QA

* fix(install.sh): remove stale LITELLM_BRANCH reference that caused unbound variable error

* fix(install.sh): force-reinstall from git to bypass cached PyPI version

* fix(install.sh): show pip progress bar during install

* fix(install.sh): always launch wizard via $PYTHON_BIN -m litellm, not PATH binary

* fix(install.sh): use litellm.proxy.proxy_cli module (no __main__.py exists)

* fix(install.sh): suppress RuntimeWarning from module invocation

* fix(install.sh): use Python bin-dir litellm binary to avoid CWD sys.path shadowing

* fix(install.sh): use sysconfig.get_path('scripts') to find pip-installed litellm binary

* fix(install.sh): redirect stdin from /dev/tty on exec so wizard gets terminal, not exhausted pipe

* fix(install.sh): warn about git clone duration, drop --no-cache-dir so re-runs are faster

* feat(setup_wizard): arrow-key selector, updated model names

* fix(setup_wizard): use sysconfig binary to start proxy, not python -m litellm

* feat(setup_wizard): credential validation after key entry + clear next-steps after proxy start

* style(install.sh): show git clone warning in blue

* refactor(setup_wizard): class with static methods, use check_valid_key from litellm.utils

* address greptile review: fix yaml escaping, port validation, display name collisions, tests

- setup_wizard.py: add _yaml_escape() for safe YAML embedding of API keys
- setup_wizard.py: add _styled_input() with readline ANSI ignore markers
- setup_wizard.py: change DIVIDER to _divider() fn to avoid import-time color capture
- setup_wizard.py: validate port range 1-65535, initialize before loop
- setup_wizard.py: qualify azure display names (azure-gpt-4o) to avoid collision with openai
- setup_wizard.py: work on env_copy in _build_config to avoid mutating caller's dict
- setup_wizard.py: skip model_list entries for providers with no credentials
- setup_wizard.py: prompt for azure deployment name
- setup_wizard.py: wrap os.execlp in try/except with friendly fallback
- setup_wizard.py: wrap config write in try/except OSError
- setup_wizard.py: fix _validate_and_report to use two print lines (no \r overwrite)
- setup_wizard.py: add .gitignore tip next to key storage notice
- setup_wizard.py: fix run_setup_wizard() return type annotation to None
- scripts/install.sh: drop pipefail (not supported by dash on Ubuntu when invoked as sh)
- scripts/install.sh: use litellm[proxy] from PyPI (not hardcoded dev branch)
- scripts/install.sh: guard /dev/tty read with -r check for Docker/CI compat
- scripts/install.sh: remove --force-reinstall to avoid downgrading dependencies
- tests/test_litellm/test_setup_wizard.py: 13 unit tests for _build_config and _yaml_escape

* style: black format setup_wizard.py

* fix: address remaining greptile issues - Windows compat, YAML quoting, credential flow

- guard termios/tty imports with try/except ImportError for Windows compat
- quote master_key as YAML double-quoted scalar (same as env vars)
- remove unused port param from _build_config signature
- _validate_and_report now returns the final key so re-entered creds are stored
- add test for master_key YAML quoting

* fix: add --port to suggested command, guard /dev/tty exec in install.sh

* fix: quote api_base in YAML, skip azure if no deployment, only redraw on state change

* fix: address greptile review comments

- _yaml_escape: add control character escaping (\n, \r, \t)
- test: fix tautological assertion in test_build_config_azure_no_deployment_skipped
- test: add tests for control character escaping in _yaml_escape
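
For reference, escaping for a YAML double-quoted scalar, including the control characters named above, might look like this. It is a sketch, not the wizard's `_yaml_escape`; the names `yaml_escape` and `yaml_quoted` are hypothetical.

```python
def yaml_escape(value: str) -> str:
    """Escape a string for use inside a YAML double-quoted scalar."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )


def yaml_quoted(key: str, value: str) -> str:
    """Render `key: "value"` with the value safely escaped."""
    return f'{key}: "{yaml_escape(value)}"'
```

Escaping the backslash before the other characters matters: doing it last would double the backslashes that the earlier replacements had just introduced.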

* feat(ui): remove Chat UI page link and banner from sidebar and playground (#23908)

* feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MCP auth (#23897)

* Allow pre_mcp_call guardrail hooks to mutate outbound MCP headers

* Enhance MCPServerManager to support hook-modified arguments and extra headers. Update tests to validate argument mutation and header injection behavior, including warnings for OpenAPI-backed servers when headers are present.

* Refactor MCPServerManager to raise HTTPException for extra headers in OpenAPI-backed servers. Update tests to reflect this change, ensuring proper exception handling instead of logging warnings.

* feat(guardrails): add MCPJWTSigner built-in guardrail for zero trust MCP auth

Signs outbound MCP tool calls with a LiteLLM-issued RS256 JWT so MCP servers
can trust a single signing authority instead of every upstream IdP.

Enable in config.yaml:
  guardrails:
    - guardrail_name: mcp-jwt-signer
      litellm_params:
        guardrail: mcp_jwt_signer
        mode: pre_mcp_call
        default_on: true

JWT carries sub (user_id), act.sub (team_id, RFC 8693), tool-level scope, iss,
aud, iat/exp/nbf. RSA-2048 keypair auto-generated at startup unless
MCP_JWT_SIGNING_KEY env var is set.

Adds /.well-known/jwks.json endpoint and jwks_uri to /.well-known/openid-configuration
so MCP servers can verify LiteLLM-issued tokens via OIDC discovery.

* Update MCPServerManager to raise HTTPException with status code 400 for extra headers in OpenAPI-backed servers. Adjust tests to verify the correct status code and exception message.

* fix: address P1 issues in MCPJWTSigner

- OpenAPI servers: warn + skip header injection instead of 500
- JWKS Cache-Control: 5min for auto-generated keys, 1h for persistent
- sub claim: fallback to apikey:{token_hash} for anonymous callers
- ttl_seconds: validate > 0 at init time

* docs: add MCP zero trust auth guide with architecture diagram

* docs: add FastMCP JWT verification guide to zero trust doc

* fix: address remaining Greptile review issues (round 2)

- mcp_server_manager: warn when hook Authorization overwrites existing header
- __init__: remove _mcp_jwt_signer_instance from __all__ (private internal)
- discoverable_endpoints: copy dict instead of mutating in-place on OIDC augmentation
- test docstring: reflect warn-and-continue behavior for OpenAPI servers
- test: update scope assertions for least-privilege (no mcp:tools/list on tool-call JWTs)

* fix: address Greptile round 3 feedback

- initialize_guardrail: validate mode='pre_mcp_call' at init time — misconfigured
  mode silently bypasses JWT injection, which is a zero-trust bypass
- _build_claims: remove duplicate inline 'import re' (module-level import already present)
- _types.py: add TODO comment explaining jwt_claims is forward-compat plumbing
  for a follow-up PR that will forward upstream IdP claims into outbound MCP JWTs

* feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes

Addresses all missing pieces from the scoping doc review:

FR-5 (Verify + re-sign): MCPJWTSigner now accepts access_token_discovery_uri
and token_introspection_endpoint.  When set, the incoming Bearer token is
extracted from raw_headers (threaded through pre_call_tool_check), verified
against the IdP's JWKS (JWT) or introspected (opaque), and only re-signed if
valid.  Falls back to user_api_key_dict.jwt_claims for LiteLLM JWT-auth mode.

FR-12 (Configurable end-user identity mapping): end_user_claim_sources
ordered list drives sub resolution — sources: token:<claim>, litellm:user_id,
litellm:email, litellm:end_user_id, litellm:team_id.

FR-13 (Claim operations): add_claims (insert-if-absent), set_claims (always
override), remove_claims (delete) applied in that order.

FR-14 (Two-token model): channel_token_audience + channel_token_ttl issue a
second JWT injected as x-mcp-channel-token: Bearer <token>.

FR-15 (Incoming claim validation): required_claims raises HTTP 403 when any
listed claim is absent; optional_claims passes listed claims from verified
token into the outbound JWT.

FR-9 (Debug headers): debug_headers: true emits x-litellm-mcp-debug with kid,
sub, iss, exp, scope.

FR-10 (Configurable scopes): allowed_scopes replaces auto-generation.  Also
fixed: tool-call JWTs no longer grant mcp:tools/list (overpermission).
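
The FR-13 ordering above (add_claims, then set_claims, then remove_claims) can be sketched as a pure function. `apply_claim_operations` is a hypothetical name; only the three operations and their order come from the commit message.

```python
from typing import Any, Dict, Iterable, Mapping, Optional


def apply_claim_operations(
    claims: Mapping[str, Any],
    add_claims: Optional[Mapping[str, Any]] = None,
    set_claims: Optional[Mapping[str, Any]] = None,
    remove_claims: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Apply add (insert-if-absent), set (override), remove (delete), in order."""
    result = dict(claims)  # copy so the caller's claims are not mutated
    for key, value in (add_claims or {}).items():
        result.setdefault(key, value)
    result.update(set_claims or {})
    for key in remove_claims or ():
        result.pop(key, None)
    return result
```

Because removal runs last, a claim listed in `remove_claims` is absent from the result even if `add_claims` or `set_claims` also mentions it.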

P1 fixes:
- proxy/utils.py: _convert_mcp_hook_response_to_kwargs merges rather than
  replaces extra_headers, preserving headers from prior guardrails.
- mcp_server_manager.py: warns when hook injects Authorization alongside a
  server-configured authentication_token (previously silent).
- mcp_server_manager.py: pre_call_tool_check now accepts raw_headers and
  extracts incoming_bearer_token so FR-5 verification has the raw token.
- proxy/utils.py: remove stray inline import inspect inside loop (pre-existing
  lint error, now cleaned up).

Tests: 43 passing (28 new tests covering all FR flags + P1 fixes).

* feat(mcp_jwt_signer): add verify+re-sign, claim ops, two-token model, configurable scopes (core)

Remaining files from the FR implementation:

mcp_jwt_signer.py — full rewrite with all new params:
  FR-5:  access_token_discovery_uri, token_introspection_endpoint,
         verify_issuer, verify_audience + _verify_incoming_jwt(),
         _introspect_opaque_token()
  FR-12: end_user_claim_sources ordered resolution chain
  FR-13: add_claims, set_claims, remove_claims
  FR-14: channel_token_audience, channel_token_ttl → x-mcp-channel-token
  FR-15: required_claims (raises 403), optional_claims (passthrough)
  FR-9:  debug_headers → x-litellm-mcp-debug
  FR-10: allowed_scopes; tool-call JWTs no longer over-grant tools/list

mcp_server_manager.py:
  - pre_call_tool_check gains raw_headers param to extract incoming_bearer_token
  - Silent Authorization override warning fixed: now fires when server has
    authentication_token AND hook injects Authorization

tests/test_mcp_jwt_signer.py:
  28 new tests covering all FR flags + P1 fixes (43 total, all passing)

* fix(mcp_jwt_signer): address pre-landing review issues

- Remove stale TODO comment on UserAPIKeyAuth.jwt_claims — the field is
  already populated and consumed by MCPJWTSigner in the same PR
- Fix _get_oidc_discovery to only cache the OIDC discovery doc when
  jwks_uri is present; a malformed/empty doc now retries on the next
  request instead of being permanently cached until proxy restart
- Add FR-5 test coverage for _fetch_jwks (cache hit/miss),
  _get_oidc_discovery (cache/no-cache on bad doc), _verify_incoming_jwt
  (valid token, expired token), _introspect_opaque_token (active,
  inactive, no endpoint), and the end-to-end 401 hook path — 53 tests
  total, all passing

* docs(mcp_zero_trust): rewrite as use-case guide covering all new JWT signer features

Add scenario-driven sections for each new config area:
- Verify+re-sign with Okta/Azure AD (access_token_discovery_uri,
  end_user_claim_sources, token_introspection_endpoint)
- Enforcing caller attributes with required_claims / optional_claims
- Adding metadata via add_claims / set_claims / remove_claims
- Two-token model for AWS Bedrock AgentCore Gateway
  (channel_token_audience / channel_token_ttl)
- Controlling scopes with allowed_scopes
- Debugging JWT rejections with debug_headers

Update JWT claims table to reflect configurable sub (end_user_claim_sources)

* fix(mcp_jwt_signer): wire all config.yaml params through initialize_guardrail

The factory was only passing issuer/audience/ttl_seconds to MCPJWTSigner.
All FR-5/9/10/12/13/14/15 params (access_token_discovery_uri,
end_user_claim_sources, add/set/remove_claims, channel_token_audience,
required/optional_claims, debug_headers, allowed_scopes, etc.) were
silently dropped, making every advertised advanced feature non-functional
when loaded from config.yaml.

Add regression test that asserts every param is wired through correctly.

* docs(mcp_zero_trust): add hero image

* docs(mcp_zero_trust): apply Linear-style edits

- Lead with the problem (unsigned direct calls bypass access controls)
- Shorter statement section headers instead of question-form headers
- Move diagram/OIDC discovery block after the reader is bought in
- Add 'read further only if you need to' callout after basic setup
- Two-token section now opens from the user problem not product jargon
- Add concrete 403 error response example in required_claims section
- Debug section opens from the symptom (MCP server returning 401)
- Lowercase claims reference header for consistency

* fix(mcp_jwt_signer): fix algorithm confusion attack + add OIDC discovery 24h TTL

- Remove alg from unverified JWT header; use signing_jwk.algorithm_name from JWKS key instead.
  Reading alg from attacker-controlled headers enables alg:none / HS256 confusion attacks.
- Add _oidc_discovery_fetched_at timestamp and _OIDC_DISCOVERY_TTL = 86400 (24h).
  Without a TTL the cached discovery doc never refreshes, so IdP key rotation is invisible.
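
The TTL behavior in the second item, combined with the earlier rule of caching only documents that contain `jwks_uri`, could look roughly like this. `DiscoveryCache`, its `fetch` callback, and the injectable `clock` are hypothetical; only the 86400-second TTL is taken from the commit message.

```python
import time
from typing import Any, Callable, Dict, Optional

OIDC_DISCOVERY_TTL = 86400  # seconds (24h), the value named in the commit message


class DiscoveryCache:
    """Hypothetical TTL cache for an OIDC discovery document."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        ttl: float = OIDC_DISCOVERY_TTL,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._doc: Optional[Dict[str, Any]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, Any]:
        now = self._clock()
        if self._doc is not None and now - self._fetched_at < self._ttl:
            return self._doc  # still fresh
        doc = self._fetch()
        if "jwks_uri" in doc:
            # Cache only usable documents so a malformed one is retried.
            self._doc = doc
            self._fetched_at = now
        return doc
```

Passing the clock in as a parameter is a design choice for testability: expiry can be exercised without sleeping.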

---------

Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>

* fix(ci): stabilize CI - formatting, type errors, test polling, security CVEs, router bug, batch resolution

Fix 1: Run Black formatter on 35 files
Fix 2: Fix MyPy type errors:
  - setup_wizard.py: add type annotation for 'selected' set variable
  - user_api_key_auth.py: remove redundant type annotation on jwt_claims reassignment
Fix 3: Fix spend accuracy test burst 2 polling to wait for expected total
  spend instead of just 'any increase' from burst 2
Fix 4: Bump Next.js 16.1.6 -> 16.1.7 to fix CVE-2026-27978, CVE-2026-27979,
  CVE-2026-27980, CVE-2026-29057
Fix 5: Fix router _pre_call_checks model variable being overwritten inside
  loop, causing wrong model lookups on subsequent deployments. Use local
  _deployment_model variable instead.
Fix 6: Add missing resolve_output_file_ids_to_unified call in batch retrieve
  non-terminal-to-terminal path (matching the terminal path behavior)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* chore: regenerate poetry.lock to sync with pyproject.toml

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: format merged files from main and regenerate poetry.lock

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompatibility

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): update router region test to use gpt-4.1-mini (fix flaky model lookup)

Replace deprecated gpt-3.5-turbo-1106 with gpt-4.1-mini + mock_response in
test_router_region_pre_call_check, following the same pattern used in commit
717d37cc5b for test_router_context_window_check_pre_call_check_out_group.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* ci: retry flaky logging_testing (async event loop race condition)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): aggregate all mock calls in langfuse e2e test to fix race condition

The _verify_langfuse_call helper only inspected the last mock call
(mock_post.call_args), but the Langfuse SDK may split trace-create and
generation-create events across separate HTTP flush cycles. This caused
an IndexError when the last call's batch contained only one event type.

Fix: iterate over mock_post.call_args_list to collect batch items from
ALL calls. Also add a safety assertion after filtering by trace_id and
mark all langfuse e2e tests with @pytest.mark.flaky(retries=3) as an
extra safety net for any residual timing issues.
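
The difference between inspecting `call_args` and iterating `call_args_list` can be shown with a bare `MagicMock`. The payload shape (a `json=` keyword argument holding a `batch` list) and the URL are assumptions for illustration, and `collect_batch_items` is a hypothetical helper, not the test's `_verify_langfuse_call`.

```python
from typing import Any, Dict, List
from unittest.mock import MagicMock


def collect_batch_items(mock_post: MagicMock) -> List[Dict[str, Any]]:
    """Gather batch items from every recorded call, not just the last one."""
    items: List[Dict[str, Any]] = []
    for call in mock_post.call_args_list:
        payload = call.kwargs.get("json") or {}
        items.extend(payload.get("batch", []))
    return items


# Simulate two flush cycles, each carrying one event type.
mock_post = MagicMock()
mock_post("https://langfuse.example/ingest", json={"batch": [{"type": "trace-create"}]})
mock_post("https://langfuse.example/ingest", json={"batch": [{"type": "generation-create"}]})

last_only = mock_post.call_args.kwargs["json"]["batch"]
all_items = collect_batch_items(mock_post)
```

Here `last_only` holds a single event, so indexing a second element would fail, while `all_items` holds both.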

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): black formatting + update OpenAPI compliance tests for spec changes

- Apply Black 26.x formatting to litellm_logging.py (parenthesized style)
- Update test_input_types_match_spec to follow $ref to InteractionsInput schema
  (Google updated their OpenAPI spec to use $ref instead of inline oneOf)
- Update test_content_schema_uses_discriminator to handle discriminator without
  explicit mapping (Google removed the mapping key from Content discriminator)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* revert: undo incorrect Black 26.x formatting on litellm_logging.py

The file was correctly formatted for Black 23.12.1 (the version pinned
in pyproject.toml). The previous commit applied Black 26.x formatting
which was incompatible with the CI's Black version.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): deduplicate and sort langfuse batch events after aggregation

The Langfuse SDK may send the same event (e.g., trace-create) in
multiple flush cycles, causing duplicates when we aggregate from all
mock calls. After filtering by trace_id, deduplicate by keeping only
the first event of each type, then sort to ensure trace-create is at
index 0 and generation-create at index 1.
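
The deduplicate-then-sort step can be sketched as follows. The event dictionaries and the helper name `dedupe_and_sort` are hypothetical; the rule (keep the first event of each type, trace-create before generation-create) is the one described above.

```python
from typing import Any, Dict, List

# Desired position of each event type; unknown types sort after these.
EVENT_ORDER = {"trace-create": 0, "generation-create": 1}


def dedupe_and_sort(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first event of each type, then order by EVENT_ORDER."""
    first_by_type: Dict[str, Dict[str, Any]] = {}
    for event in events:
        first_by_type.setdefault(event["type"], event)
    return sorted(
        first_by_type.values(),
        key=lambda event: EVENT_ORDER.get(event["type"], len(EVENT_ORDER)),
    )
```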

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Noah Nistler <60981020+noahnistler@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-03-18 15:09:01 -07:00
yuneng-jiang 8f56ddb9c6 Merge remote main into litellm_ci_optimize
Resolved conflict in test_claude_agent_sdk.py by keeping main's additions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 00:50:22 -07:00
yuneng-jiang cc027a2b90 Fix flaky test_langsmith_queue_logging: poll instead of fixed sleep
The test waited a fixed 3s for async callbacks to populate log_queue.
Under xdist -n 4, CPU contention can delay the GLOBAL_LOGGING_WORKER
background task beyond 3s. Replace fixed sleeps with polling loops
(up to 10s) that break as soon as the expected condition is met.
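
A sketch of the polling pattern, assuming an asyncio test. The helper name `wait_until` and the stand-in `log_queue` list are hypothetical; the real test polls the logger's own queue.

```python
import asyncio
import time


async def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll `condition` until it is true or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True  # stop as soon as the expected state is reached
        await asyncio.sleep(interval)
    return condition()  # one final check at the deadline


async def demo():
    log_queue = []

    async def background_logger():
        # Stand-in for a background logging task that finishes "eventually".
        await asyncio.sleep(0.05)
        log_queue.append({"event": "logged"})

    task = asyncio.create_task(background_logger())
    populated = await wait_until(lambda: len(log_queue) > 0, timeout=2.0, interval=0.01)
    await task
    return populated, len(log_queue)


result = asyncio.run(demo())
```

Compared with a fixed sleep, the loop returns early on a fast machine and tolerates a slow one up to the timeout.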

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 20:25:11 -07:00
yuneng-jiang 9b77524354 Fix logging_testing: capture true defaults at conftest import time
Module-level mutations (litellm.num_retries=3 in test_langfuse_e2e_test.py
and test_amazing_s3_logs.py, litellm.success_callback=['langfuse']) run
at import time, BEFORE any function fixture. The save/restore pattern
captured these polluted values as 'originals' and kept restoring them.

Fix: capture litellm defaults when conftest.py is first imported (before
test modules), then reset to those true defaults before each test instead
of saving/restoring the current (potentially polluted) state.
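
An illustrative sketch of the difference, using a `SimpleNamespace` as a stand-in for the `litellm` module so it runs without the library. The default values and the names `TRUE_DEFAULTS` and `reset_to_true_defaults` are assumptions; in a real conftest.py the reset would typically run inside an autouse fixture.

```python
import copy
from types import SimpleNamespace

# Stand-in for the real module; these default values are assumed.
fake_litellm = SimpleNamespace(num_retries=None, success_callback=[])

# Captured once, when "conftest" is imported, before any test module runs.
TRUE_DEFAULTS = {
    name: copy.deepcopy(getattr(fake_litellm, name))
    for name in ("num_retries", "success_callback")
}


def reset_to_true_defaults():
    """Restore the pristine values instead of whatever state is current."""
    for name, value in TRUE_DEFAULTS.items():
        # Deep-copy so a test mutating a list cannot corrupt the snapshot.
        setattr(fake_litellm, name, copy.deepcopy(value))


# A test module mutates globals at import time, before any fixture runs.
fake_litellm.num_retries = 3
fake_litellm.success_callback = ["langfuse"]

# A save/restore fixture would now record 3 and ['langfuse'] as "originals".
reset_to_true_defaults()
```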
2026-03-15 19:52:46 -07:00
yuneng-jiang 13a46598e7 Fix logging_testing: clear _in_memory_loggers and add missing globals
- Clear _in_memory_loggers before/after each test to prevent cached logger
  instances (LangsmithLogger, SlackAlerting, etc.) from leaking stale state
- Add pre_call_rules, post_call_rules to list attrs save/restore
- Add vector_store_registry to scalar attrs save/restore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 18:48:32 -07:00
yuneng-jiang 92ad90de2a Fix logging_testing: expand save/restore to cover redaction and other globals
The logging tests mutate many more litellm globals than guardrails tests
(turn_off_message_logging, s3_callback_params, datadog_params, service_callback,
etc.). The initial save/restore list only covered callbacks and a few basics,
causing state leaks like redaction settings bleeding across tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 18:37:07 -07:00
yuneng-jiang 19e8a16cce Optimize logging_testing CI: suppress DEBUG logs, fix xdist isolation
- Add LITELLM_LOG=WARNING to suppress verbose DEBUG log output
- Remove -s flag so pytest captures stdout instead of printing all of it
- Bump xdist workers from -n 2 to -n 4
- Add --timeout=120 for safety
- Rewrite conftest.py to use save/restore pattern (matching guardrails_tests)
  instead of per-function importlib.reload + event loop creation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 18:24:57 -07:00
Harshit28j d7c9ec6276 add tests for fix 2026-03-15 00:58:08 +05:30
yuneng-jiang 89d8401d72 Merge pull request #23483 from BerriAI/litellm_update_deprecated_test_models
[Fix] Update Deprecated Model Names in CI Tests
2026-03-12 14:16:52 -07:00
yuneng-jiang cc81e3c226 Replace deprecated model names in tests that were removed from remote model cost map
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:12:07 -07:00
Cesar Garcia e01d722803 Merge branch 'main' into litellm_oss_staging_03_11_2026 2026-03-12 13:53:14 -03:00
Chesars feed274aa3 Reapply "feat: add model_cost aliases expansion support"
This reverts commit 3d2df7e8b5.
2026-03-12 13:36:57 -03:00
Chesars 1be6b31e2f merge: resolve conflicts between main and litellm_oss_staging_03_11_2026 2026-03-12 09:38:31 -03:00
Sameer Kankute 36ec80d90c Fix azure model router 2026-03-12 12:40:37 +05:30
Cesar Garcia 3d2df7e8b5 Revert "feat: add model_cost aliases expansion support" 2026-03-10 22:39:19 -03:00
Sameer Kankute b08445837b fix(logging): preserve ModelResponse choices format in redacted standard_logging_object + add Charity Engine provider endpoint
- Fix perform_redaction to handle dict representation of ModelResponse (from model_dump())
- Preserve full choices structure when redacting, redact content/audio in place
- Add _redact_standard_logging_object helper for standard_logging_object field
- Update test_logging_redaction_e2e_test assertions to expect choices format
- Add charity_engine to provider_endpoints_support.json

Fixes: test_standard_logging_payload, test_standard_logging_payload_audio
Made-with: Cursor
2026-03-10 10:22:57 +05:30
Ishaan Jaff 28c33f53a3 CircleCI test stability (#23055)
* fix: resolve ruff lint errors and mypy type error

- Remove unused import get_user_credential (F401)
- Add noqa: PLR0915 for 3 large functions exceeding 50 statements
- Cast result_data['q'] to str for _append_domain_filters (mypy arg-type)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add /vertex_ai/live to supported endpoints and azure gpt-5.1 reasoning flags

- Add /vertex_ai/live to JSON schema validation enum in test_utils.py
- Add supports_none_reasoning_effort=true to 10 azure/gpt-5.1 model entries
  (matching the OpenAI gpt-5.1 behavior)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: handle non-string team_alias/key_alias in PolicyMatchContext

Prevent Pydantic validation errors when team_alias or key_alias are not
proper strings (e.g. MagicMock in tests). Only pass values that are
actually strings; default to None otherwise.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: initialize jwt_handler.litellm_jwtauth in JWT test

The test_jwt_non_admin_team_route_access test was failing because
user_api_key_auth now accesses jwt_handler.litellm_jwtauth.virtual_key_claim_field
before reaching the mocked JWTAuthManager.auth_builder. Initialize the
jwt_handler with a default LiteLLM_JWTAuth object.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add missing mock attributes to MCP server test

The test_add_update_server_fallback_to_server_id test was failing because
MagicMock auto-creates attributes when accessed. build_mcp_server_from_table
accesses many fields via getattr(), which on a MagicMock returns another
MagicMock instead of None, causing Pydantic validation errors in MCPServer.

Explicitly set all required mock attributes.
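
The `MagicMock` behavior behind this failure can be reproduced in a few lines. The attribute name `alias` is a hypothetical example field.

```python
from unittest.mock import MagicMock

row = MagicMock()

# The default passed to getattr() is never used: MagicMock creates the
# attribute on first access and returns a child MagicMock.
auto_created = getattr(row, "alias", None)

# Setting the attribute explicitly makes getattr() return the real value.
row.alias = None
explicit = getattr(row, "alias", None)
```

A validator that expects `str` or `None` rejects the auto-created child mock, which is why each accessed field needs an explicit value.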

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update UI tests for leftnav, navbar, and KeyLifecycleSettings

- leftnav: Add mock for useTeams hook, add isUserTeamAdminForAnyTeam to
  roles mock, update topLevelLabels to match current component menu items
- navbar: Add mocks for useDisableBouncingIcon, BlogDropdown, UserDropdown,
  and serverRootPath. Update test to work with the new component structure.
- KeyLifecycleSettings: Fix placeholder and tooltip assertions to match
  actual component behavior

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update health check test assertion from 'connected' to 'healthy'

The /health/readiness endpoint now returns {"status": "healthy"} with the
DB status in a separate field, instead of the previous {"status": "connected"}.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: clear litellm.api_key in OpenRouter validate_environment test

The test_validate_environment_raises_without_key test was failing because
litellm.api_key may be set globally in the test environment. Clear it
along with OPENROUTER_API_KEY and OR_API_KEY env vars using monkeypatch.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: patch HTTPHandler class-level in VLLM embedding test

The test_encoding_format_not_sent_in_actual_request test was patching
client.post on an instance, but the handler uses the class method.
Patch HTTPHandler.post at class level, add caching=False to prevent
cache hits, and remove broad try/except that hid errors.
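
A small sketch of why an instance-level patch can miss. It assumes the code under test builds its own handler object; `Handler` and `send` are hypothetical stand-ins, not the library's classes.

```python
from unittest.mock import patch


class Handler:
    def post(self, url):
        return "real"


def send(url):
    # The code under test creates its own handler, so a test cannot
    # reach this particular instance to patch it.
    return Handler().post(url)


test_client = Handler()

# Patching one instance leaves every other instance untouched.
with patch.object(test_client, "post", return_value="mocked"):
    instance_patched_result = send("http://example.invalid")

# Patching the class attribute affects all instances, including new ones.
with patch.object(Handler, "post", return_value="mocked") as mock_post:
    class_patched_result = send("http://example.invalid")
    class_patch_calls = mock_post.call_count
```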

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: make test_redaction_responses_api_stream resilient to async callback timing

Replace fixed 1s sleep with polling wait for async_log_success_event.
Streaming success handler runs via asyncio.create_task; 1s was insufficient
in CI. Add 0.5s initial sleep for event loop to schedule the task, then
poll up to 10s for the callback to fire.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: update dompurify and svgo to fix security CVEs

- CVE-2026-0540: dompurify XSS vulnerability - fix by upgrading to 3.3.2+
- CVE-2026-29074: svgo DoS via entity expansion - fix by upgrading to 3.3.3+

Added npm overrides in docs/my-website/package.json and regenerated
package-lock.json.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: remove unused json import in config_override_endpoints.py

Ruff F401: json is imported but unused (safe_json_loads/safe_dumps
are used instead)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add missing MCP mock attributes and provider documentation entries

- Add missing mock attributes to test_add_update_server_with_alias and
  test_add_update_server_without_alias (same fix as fallback test)
- Add bedrock_mantle and searchapi to provider_endpoints_support.json
- Remove unused json import from config_override_endpoints.py

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: override _supports_reasoning_effort_level for Azure gpt5_series prefix

The Azure GPT-5 config uses 'gpt5_series/' as a routing prefix, but
_supports_factory(model='gpt5_series/gpt-5.1') fails to resolve because
'gpt5_series' is not a recognized provider. Override the method to strip
the prefix and prepend 'azure/' for correct model info lookup.
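
The prefix rewrite amounts to something like the following. The helper name `to_azure_model_name` is hypothetical; the actual override lives on the Azure GPT-5 config class.

```python
ROUTING_PREFIX = "gpt5_series/"


def to_azure_model_name(model: str) -> str:
    """Map 'gpt5_series/<name>' to 'azure/<name>' for model info lookup."""
    if model.startswith(ROUTING_PREFIX):
        return "azure/" + model[len(ROUTING_PREFIX):]
    return model  # already in a form the lookup understands


lookup_name = to_azure_model_name("gpt5_series/gpt-5.1")
```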

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: accept both 'healthy' and 'connected' in health check test

The test_health_and_chat_completion test runs against both source builds
(which return 'healthy') and pip-installed versions (which may return
'connected'). Accept both values.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: mock extract_mcp_auth_context in streamable HTTP MCP handler test

The handle_streamable_http_mcp function now calls extract_mcp_auth_context
before session_manager.handle_request, but the test didn't mock it. The
auth extraction fails with the minimal mock scope, preventing
handle_request from being called. Also relax assertion to not check
exact args since the send wrapper may be modified by debug injection.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add test for _combine_fallback_usage to satisfy router code coverage

The router_code_coverage.py check requires all functions in router.py
to be called in test files. Add a basic test for _combine_fallback_usage.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add @log_guardrail_information decorator to CrowdStrike AIDR guardrail

The check_guardrail_apply_decorator.py CI check requires all guardrail
apply_guardrail methods to have the @log_guardrail_information decorator.
The CrowdStrike AIDR handler was missing it.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document PRISMA_RECONNECT_ESCALATION_THRESHOLD and REDIS_CLUSTER_NODES env keys

Add missing environment variable documentation to config_settings.md
to satisfy the test_env_keys.py CI check.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document enforced_file_expires_after and enforced_batch_output_expires_after in new_team docstring

The test_api_docs.py CI check validates that all Pydantic model fields
are documented in the function docstring. Add missing parameter docs
for enforced_file_expires_after and enforced_batch_output_expires_after.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: regenerate poetry.lock to match pyproject.toml

The poetry.lock file was out of sync with pyproject.toml, causing
proxy_e2e_azure_batches_tests to fail during dependency installation.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: set master_key=None in test_create_file_with_deep_nested_litellm_metadata

The test was missing the master_key monkeypatch that other tests in the
same file set. In CI with parallel execution (-n 4), another test may
set master_key to a non-None value, causing auth failures (500) when
the test sends 'Bearer test-key'.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: document enforced_*_expires_after in update_team docstring too

Same missing params as new_team - also needed in update_team docstring
for the test_api_docs.py CI check to pass.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: use get_async_httpx_client in a2a_protocol and add master_key monkeypatch to files tests

- Replace httpx.AsyncClient() with get_async_httpx_client() in a2a_protocol/main.py
  to satisfy the ensure_async_clients_test CI check
- Add httpxSpecialProvider.A2AProvider enum value
- Add master_key=None monkeypatch to test_managed_files_with_loadbalancing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: remove unused httpx import from a2a_protocol/main.py

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: use cache-key-only param for A2A extra_headers to avoid AsyncHTTPHandler init error

The 'extra_headers' key in params was being passed to AsyncHTTPHandler.__init__()
which doesn't accept it. Use 'disable_aiohttp_transport' as the cache-key-only
param since it's explicitly filtered out before reaching the constructor.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: add additionalProperties:false and resolve $defs/$ref in Anthropic output_format schemas

Anthropic API now requires additionalProperties=false for all object-type
schemas in output_format. Also resolve $defs/$ref references by inlining
them using unpack_defs before sending to Anthropic, since Anthropic
doesn't support external schema references.
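
A self-contained sketch of the two transformations. It does not use the library's `unpack_defs`; `inline_refs`, `close_objects`, and `prepare_schema` are hypothetical helpers written for illustration. The sketch handles only local `#/$defs/...` references and would recurse forever on a self-referential schema.

```python
import copy


def inline_refs(node, defs):
    """Return a copy of `node` with local $ref entries replaced by their target."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            target = copy.deepcopy(defs[ref.split("/")[-1]])
            return inline_refs(target, defs)
        # Drop the $defs table itself once its entries are inlined.
        return {k: inline_refs(v, defs) for k, v in node.items() if k != "$defs"}
    if isinstance(node, list):
        return [inline_refs(item, defs) for item in node]
    return node


def close_objects(node):
    """Set additionalProperties=False on every object-type schema, in place."""
    if isinstance(node, dict):
        for value in node.values():
            close_objects(value)
        if node.get("type") == "object":
            node["additionalProperties"] = False
    elif isinstance(node, list):
        for item in node:
            close_objects(item)
    return node


def prepare_schema(schema):
    return close_objects(inline_refs(schema, schema.get("$defs", {})))


schema = {
    "type": "object",
    "properties": {"address": {"$ref": "#/$defs/Address"}},
    "$defs": {
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}}
    },
}
prepared = prepare_schema(schema)
```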

Fixes: llm_translation_testing Anthropic JSON schema failures

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: allowlist CVE-2026-2297 and GHSA-qffp-2rhf-9h96 in security scans

- CVE-2026-2297: Python 3.13 SourcelessFileLoader audit hook bypass,
  no fix available in base image
- GHSA-qffp-2rhf-9h96: tar hardlink path traversal, from nodejs_wheel
  bundled npm, not used in application runtime code

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: isolate files endpoint tests from shared proxy state in CI parallel execution

Override user_api_key_auth dependency to return a fixed UserAPIKeyAuth
with PROXY_ADMIN role, avoiding auth lookups via prisma_client,
user_api_key_cache, or master_key. Set prisma_client=None to prevent
DB state contamination. Use try/finally to clean up dependency overrides.

Fixes persistent test_create_file_with_deep_nested_litellm_metadata and
test_managed_files_with_loadbalancing 500 errors in CI with -n 4.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: apply same auth override to test_managed_files_with_loadbalancing

Same CI parallel execution fix as test_create_file_with_deep_nested -
override user_api_key_auth dependency and set prisma_client=None.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-03-07 15:19:39 -08:00
Sameer Kankute 16d6c279da Merge pull request #22180 from BerriAI/litellm_fix_vllm_test
Add JSON exact match test for vLLM embeddings
2026-02-26 18:43:52 +05:30
Sameer Kankute d9cd3ea185 Merge pull request #22181 from mubashir1osmani/fix/arize-phoenix-nested-traces-test-update
fix(test): update Phoenix OTEL test
2026-02-26 17:12:17 +05:30
mubashir1osmani 3c595f6fd2 fix(test): update Phoenix OTEL test for dedicated TracerProvider architecture
The old test assumed ArizePhoenixLogger reused the global TracerProvider.
With the nested traces fix, Phoenix now creates its own dedicated provider
and produces litellm_proxy_request + litellm_request + raw_gen_ai_request
spans independently.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-26 06:40:24 -05:00
Sameer Kankute 1a3d4c80db Add JSON exact match test for vLLM embeddings 2026-02-26 16:49:34 +05:30
Sameer Kankute 828ce40eac Fix test_async_gcs_pub_sub_v1 2026-02-26 10:46:01 +05:30
Ryan Crabbe 75bc8329e2 Merge origin/main into litellm_perf_skip_usage_roundtrip
Resolve conflict in litellm_logging.py: take main's version and
re-apply get_usage_as_dict optimization on top.
2026-02-21 12:55:55 -08:00
Ishaan Jaff a5e886de79 fix(tests): read CI_CD_DEFAULT_ANTHROPIC_MODEL env var instead of hardcoding model (#21781)
* fix(tests): read CI_CD_DEFAULT_ANTHROPIC_MODEL env var in bedrock KB tests

* fix(tests): read CI_CD_DEFAULT_ANTHROPIC_MODEL env var in test_router

* fix(tests): read CI_CD_DEFAULT_ANTHROPIC_MODEL env var in test_router_retries

* fix(tests): read CI_CD_DEFAULT_ANTHROPIC_MODEL env var in test_router_timeout
2026-02-21 10:46:49 -08:00
Ishaan Jaff 0726bdb67c fix(tests): update gcs pubsub v1 fixture with new SpendLogsMetadata fields (#21779)
SpendLogsMetadata added new fields (user_api_key, status, error_information,
etc.) that weren't in the expected spend_logs_payload.json fixture, causing
test_async_gcs_pub_sub_v1 to fail.
2026-02-21 10:40:26 -08:00