chore(tests): migrate Bedrock CI to AWS account 941277531214 (#28728)

* chore(tests): migrate Bedrock CI from AWS account 888602223428 to 941277531214

The original account (888602223428) was put under a security restriction by
AWS after a root access key leaked in a PR comment. While that account works
its way through the AWS Support unlock process, Bedrock-touching CI tests have
been migrated to a fresh account (941277531214).

Changes:
  - Replace 26 hardcoded references to 888602223428 with 941277531214 across
    8 files (provisioned-model ARNs, imported-model ARNs, AgentCore runtime
    ARNs, batch execution role ARN, and example proxy config).
  - The provisioned-model and imported-model ARNs are referenced only from
    mocked unit tests — no AWS resources to recreate.
  - The batch execution IAM role has been recreated in the new account with
    the same name and equivalent permissions.
  - The two AgentCore runtimes (hosted_agent_r9jvp-3ySZuRHjLC,
    hosted_agent_13sf6-cALnp38iZD) are being recreated in the new account
    under the same names — see tools/agentcore-deploy/ in a follow-up.
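
For illustration, the flat account-ID swap can be sketched as below. This
is not the script used for the migration; `swap_account_id` and the regex
are hypothetical, and URL-encoded ARNs (`arn%3Aaws%3A...`) are
deliberately not handled. The sketch only touches strings that embed the
account number in ARN form, so identifiers without an account ID pass
through unchanged.

```python
import re

OLD_ACCOUNT = "888602223428"
NEW_ACCOUNT = "941277531214"

# An ARN has the shape arn:partition:service:region:account-id:resource.
# The region field may be empty (IAM ARNs), hence the `*` on that group.
_ARN_WITH_OLD_ACCOUNT = re.compile(
    r"(arn:aws:[a-z0-9-]+:[a-z0-9-]*:)" + re.escape(OLD_ACCOUNT) + r"(?=:)"
)


def swap_account_id(text: str) -> str:
    """Replace the old account ID with the new one inside every ARN in text."""
    # A function replacement avoids "\1" running into the digits that follow.
    return _ARN_WITH_OLD_ACCOUNT.sub(lambda m: m.group(1) + NEW_ACCOUNT, text)
```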

CircleCI env vars AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION_NAME
were updated separately via the CircleCI API to point at the new account.

Smoke-tested locally against the new account:
  aws bedrock-runtime converse --region us-west-2 \
    --model-id us.anthropic.claude-sonnet-4-5-20250929-v1:0 \
    --messages '[{"role":"user","content":[{"text":"ping"}]}]'
  → 200, model returned 'pong'

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(tests): refresh AgentCore ARN suffixes to match newly-deployed runtimes

The first migration commit replaced just the account ID, but AgentCore
auto-assigns a random 10-char suffix to every runtime on creation — we
can't reuse the original suffixes (`3ySZuRHjLC`, `cALnp38iZD`) in the
new account. Updated the AgentCore-runtime ARNs in the three files that
reference real runtime IDs (not the mock-based unit-test ARNs).

Deployed runtimes:
  arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp
  arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy
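
A minimal sketch of how these ARNs decompose, to show why swapping the
account ID alone was not enough. `parse_agentcore_runtime_arn` is a
hypothetical helper, not part of litellm, and it assumes the random
suffix never contains a hyphen.

```python
from typing import NamedTuple


class AgentCoreRuntime(NamedTuple):
    region: str
    account_id: str
    name: str
    suffix: str


def parse_agentcore_runtime_arn(arn: str) -> AgentCoreRuntime:
    """Split an AgentCore runtime ARN into region, account, name, and suffix."""
    # arn:aws:bedrock-agentcore:<region>:<account>:runtime/<name>-<suffix>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "bedrock-agentcore":
        raise ValueError(f"not an AgentCore ARN: {arn!r}")
    resource_type, _, runtime_id = parts[5].partition("/")
    if resource_type != "runtime" or "-" not in runtime_id:
        raise ValueError(f"not a runtime resource: {arn!r}")
    # Assumption: the suffix is everything after the last hyphen.
    name, suffix = runtime_id.rsplit("-", 1)
    return AgentCoreRuntime(parts[3], parts[4], name, suffix)
```

Parsing the old and new ARN for `hosted_agent_r9jvp` yields the same name
but different `account_id` and `suffix` fields, which is the pair of
values the tests had to pick up.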

Both runtimes are status=READY and pass a smoke invoke:
  $ aws bedrock-agentcore invoke-agent-runtime --agent-runtime-arn ... --payload '{"prompt":"ping"}'
  → 200, {"result": "echo: ping"}

The agent is a minimal echo (see /tmp/agentcore_deploy/agent.py for the
deploy artifacts). Tests that only verify the SDK wiring will pass; if any
test asserts on agent output content, swap the echo for the real agent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(tests): point Bedrock batch tests at new-account S3 bucket

The account migration (888602223428 -> 941277531214) was a flat
account-ID swap, which only rewrites ARNs that embed the account
number. S3 bucket names carry no account ID, so the live Bedrock
batch tests still uploaded to `litellm-proxy` — a bucket that lives
in the old account. S3 names are globally unique, and the old account
still holds that name, so it can't be recreated in the new account.

Rename to `litellm-proxy-941277531214` (account-ID suffix guarantees
global uniqueness). The bucket must be created in 941277531214 and the
batch execution role granted s3:GetObject/PutObject/ListBucket on it
before this job is run in CI.
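
The naming convention can be sketched as follows. `account_scoped_bucket`
is a hypothetical helper, not code from this change; the regex encodes
the basic S3 naming rules (3 to 63 characters; lowercase letters, digits,
dots, and hyphens; first and last character a letter or digit).

```python
import re

# Basic S3 bucket-name shape: 3-63 chars, lowercase alphanumerics, dots and
# hyphens, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def account_scoped_bucket(base_name: str, account_id: str) -> str:
    """Return base_name suffixed with the account ID, validated as a bucket name."""
    name = f"{base_name}-{account_id}"
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"invalid S3 bucket name: {name!r}")
    return name
```

Strictly speaking the suffix makes a name collision very unlikely rather
than impossible, since nothing stops another account from claiming the
same string first.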

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(tests): point live S3 logging test at new-account bucket

Same blind spot as the batch bucket: the name `load-testing-oct` carries
no account ID, so the account-ID swap missed it. The bucket lives in the
old account, and S3 bucket names are globally unique, so the name can't
be reused. The
`logging_testing` CI job is wired into the workflow and runs
test_basic_s3_logging, which uploads to this bucket with the CI env
creds, then lists and deletes objects — a live dependency.

Rename to `load-testing-oct-941277531214`. The bucket must exist in the
new account with the CI IAM principal granted
s3:PutObject/GetObject/ListBucket/DeleteObject before this job runs.
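
A sketch of an IAM policy document covering those four actions, assuming
a plain identity-based policy; the policy actually attached in the new
account is not shown in this change, and `ci_bucket_policy` is a
hypothetical helper. Note that `s3:ListBucket` applies to the bucket ARN
while the object actions apply to `<bucket>/*`.

```python
import json
from typing import Any, Dict


def ci_bucket_policy(bucket: str) -> Dict[str, Any]:
    """Build a least-privilege policy for the logging test's bucket access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Put/Get/Delete are object-level actions.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


print(json.dumps(ci_bucket_policy("load-testing-oct-941277531214"), indent=2))
```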

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(tests): repoint Bedrock guardrail IDs to new-account guardrails

The migration left guardrail IDs untouched (no account ID in them), so
all live guardrail tests failed with "guardrail identifier or version
does not exist" against 941277531214. Recreated both guardrails in the
new account and updated the hardcoded IDs:
  - wf0hkdb5x07f -> zgkmukebruil (PII mask: PHONE + CREDIT_DEBIT_CARD,
    with explicit inputAction=ANONYMIZE so masking applies to INPUT,
    which is the source litellm's moderation hook sends)
  - ff6ujrregl1q -> 4w3d1di3snt5 (blocks "coffee"; blocked message set
    to the exact string the tests assert on)

Updated test_bedrock_guardrails.py, otel_test_config.yaml, and the
guardrailConfig in test_bedrock_completion.py. Verified locally: the 5
previously-failing guardrail tests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(bedrock): migrate legacy models to current inference profiles

The new CI account (941277531214) cannot invoke legacy Bedrock models
(AWS gates them: "marked by provider as Legacy... not actively using in
the last 30 days"). Migrated the live-call tests:
  - anthropic.claude-3-sonnet-20240229    -> us.anthropic.claude-sonnet-4-5-20250929-v1:0
  - anthropic.claude-3-haiku-20240307     -> us.anthropic.claude-haiku-4-5-20251001-v1:0
Current Claude models on Bedrock require the `us.` inference-profile prefix
(bare on-demand IDs are rejected).
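
The mapping applied to the live-call tests can be written down as a small
lookup. This is a sketch for reference only; `migrate_model_id` is a
hypothetical helper and the tests themselves were edited by hand. The
model IDs are the full strings that appear in the diff.

```python
# Legacy on-demand model ID -> current inference-profile ID (from the diff).
LEGACY_TO_CURRENT = {
    "anthropic.claude-3-sonnet-20240229-v1:0": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    "anthropic.claude-3-haiku-20240307-v1:0": "us.anthropic.claude-haiku-4-5-20251001-v1:0",
}


def migrate_model_id(model: str) -> str:
    """Swap a legacy model ID for its replacement, keeping any route prefix."""
    # "bedrock/anthropic..." -> ("bedrock", "/", "anthropic...");
    # a bare ID has no slash, so prefix and sep are both empty.
    prefix, sep, bare = model.rpartition("/")
    return prefix + sep + LEGACY_TO_CURRENT.get(bare, bare)
```

Unknown IDs pass through untouched, which matches the decision to leave
mocked, transformation, and cost tests on the legacy strings.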

cohere.command-r-plus has no working replacement (all Cohere is legacy-
gated in the new account): swapped to claude-haiku-4-5 in provider-
agnostic param lists. amazon.titan-image-generator skipped (no working
replacement). Mocked/transformation/cost tests that reference the legacy
strings are intentionally left unchanged. Verified live against the new
account.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(bedrock): repoint SageMaker + Knowledge Base to new-account resources

These tests referenced account-scoped resources by hardcoded IDs that
existed only in the old account, so the migration's account-ID swap missed
them. The resources were recreated in 941277531214 and the tests repointed:
  - SageMaker endpoint jumpstart-dft-hf-textgeneration1-mp-20240815-185614
    -> litellm-ci-textgen (gpt2 on a TGI container, ml.g5.xlarge)
  - Bedrock Knowledge Base T37J8R4WTM -> LCYXFBR2TU (OpenSearch Serverless
    vector store + titan-embed-text-v2, seeded with a LiteLLM doc)
Verified live: test_sagemaker.py (12 passed) and
test_bedrock_knowledgebase_hook.py (12 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(reasoning_effort_grid): skip bedrock claude-opus-4-7 cells (not entitled on 941277531214)

claude-opus-4-7 is listed in the new Bedrock CI account's foundation
models but invoke is denied (AccessDeniedException: "not available for
this account"). Bedrock access to the flagship Opus requires an AWS
Sales request, not the self-serve model-access toggle, so it can't be
enabled inline with the rest of the account migration.

Add an optional `skip_reason` to ModelEntry and set it on the
bedrock-claude-opus-4-7 entry; the grid test honors it via pytest.skip.
Cell count (231) and route coverage are unchanged, so the structural
asserts still pass. Restore coverage by deleting the one skip_reason
line once access is granted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(bedrock): swap/skip legacy-gated models unavailable on new CI account

The migrated AWS account (941277531214) cannot access several models that
the old account could, so the remaining red CI jobs were hitting real
Bedrock "Access denied / Legacy" and "account not authorized" errors:

- image_gen: skip both Nova Canvas test classes (amazon.nova-canvas-v1:0 is
  legacy-gated), matching the existing titan skip.
- batches: skip test_async_file_and_batch (Bedrock batch inference is not
  authorized on the new account; requires an AWS support case).
- litellm_overhead: swap legacy claude-3-5-haiku for the active
  us.anthropic.claude-haiku-4-5 inference profile.
- test_completion_claude_3_function_call: swap legacy claude-3-sonnet for the
  active us.anthropic.claude-sonnet-4-5 inference profile.

https://claude.ai/code/session_01Y7zgHYu9GX29YRwV4yiWAa

* test(bedrock): fix remaining e2e legacy-model + batch failures on new CI account

- e2e_openai_endpoints: skip test_bedrock_batches_api (Bedrock batch inference
  is not authorized on account 941277531214) and migrate the missed
  s3_bucket_name in oai_misc_config.yaml to litellm-proxy-941277531214.
- build_and_test: swap legacy bedrock claude-3-sonnet for the active
  us.anthropic.claude-sonnet-4-5 inference profile in the proxy structured
  output e2e test.

https://claude.ai/code/session_01Y7zgHYu9GX29YRwV4yiWAa

* test(bedrock): make opus-4-7 + batch cells fail loudly and mock image-gen (#28791)

Replace the silent skips added for the new CI account with noisier behavior:
- reasoning-effort grid: opus-4-7 cells now fail (when AWS creds are present)
  instead of skipping, so the missing entitlement stays visible in CI; they
  still skip when AWS creds are absent (local dev)
- Bedrock batch inference tests: drop the skip so they run and fail until
  batch access is granted
- Titan + Nova Canvas image-gen tests: mock the Bedrock HTTP call so the
  transform + cost-tracking path stays under test without live model access
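
The mocking pattern, reduced to a standalone sketch. The canned response
mirrors the shape used in the diff (`{"images": [<base64>]}`);
`FakeHTTPClient` is a hypothetical stand-in for the real HTTP handler so
the example runs without litellm or network access.

```python
import json
from unittest.mock import MagicMock, patch


def canned_bedrock_image_response(image_b64: str) -> MagicMock:
    """Build a stand-in HTTP response carrying one base64 image."""
    payload = {"images": [image_b64]}
    response = MagicMock()
    response.status_code = 200
    response.json.return_value = payload
    response.text = json.dumps(payload)
    response.headers = {}
    return response


class FakeHTTPClient:
    """Hypothetical client; a real call here would hit the network."""

    def post(self, url: str, **kwargs):
        raise RuntimeError("live network call attempted")


client = FakeHTTPClient()
canned = canned_bedrock_image_response("AAAA")
# Patching `post` on the instance means no request ever leaves the process.
with patch.object(client, "post", return_value=canned) as mock_post:
    result = client.post("https://bedrock.invalid/invoke", json={"prompt": "otter"})
```

Because only the transport is replaced, everything upstream of `post`
(request building) and downstream of it (response parsing, cost
calculation) still executes in the real tests.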

https://claude.ai/code/session_01MT7SWDnXUjv6e6EPG7BDjT

Co-authored-by: Claude <noreply@anthropic.com>

* test(bedrock): use pytest.xfail for known-failing opus-4-7 cells

Replace pytest.fail with pytest.xfail when a model has a fail_reason,
so known-broken cells stay visible as XFAIL without keeping CI red.
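
The resulting decision order for a grid cell can be sketched without
pytest. `planned_outcome` is a hypothetical helper, and deriving the skip
from missing `required_env` variables is an assumption about how the
existing `skip_reason` is computed; in the real test the "skip" and
"xfail" branches correspond to `pytest.skip(...)` and
`pytest.xfail(...)`.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Mapping, Optional


@dataclass(frozen=True)
class ModelEntry:
    # Trimmed-down stand-in for the grid's ModelEntry.
    alias: str
    required_env: FrozenSet[str] = field(default_factory=frozenset)
    fail_reason: Optional[str] = None


def planned_outcome(entry: ModelEntry, env: Mapping[str, str]) -> str:
    """Return "skip", "xfail", or "run" for one grid cell."""
    # Assumption: missing credentials take priority, as in local dev.
    if entry.required_env - set(env):
        return "skip"
    # Known-broken cells are reported as XFAIL instead of turning CI red.
    if entry.fail_reason:
        return "xfail"
    return "run"
```

Calling `pytest.xfail()` inside a test body ends the test at that point,
so an xfailed cell never issues the Bedrock request.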

Co-authored-by: Yassin Kortam <yassin@berri.ai>

---------

Co-authored-by: Mateo <mateo@Mateos-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Yassin Kortam <yassin@berri.ai>
Mateo Wang
2026-05-25 12:03:17 -07:00
committed by GitHub
parent f45909cb81
commit f9407bc036
23 changed files with 203 additions and 118 deletions
@@ -157,8 +157,8 @@ class AmazonAgentCoreConfig(BaseConfig, BaseAWSLLM):
def _get_agent_runtime_arn(self, model: str) -> str:
"""
Extract ARN from model string
model = "agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC"
returns: "arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC"
model = "agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp"
returns: "arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp"
"""
parts = model.split("/", 1)
if len(parts) != 2 or parts[0] != "agentcore":
@@ -170,7 +170,7 @@ class AmazonAgentCoreConfig(BaseConfig, BaseAWSLLM):
def _extract_region_from_arn(self, arn: str) -> str:
"""
Extract region from ARN
arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC
arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp
returns: us-west-2
"""
parts = arn.split(":")
@@ -23,11 +23,11 @@ model_list:
model: bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0
#########################################################
########## batch specific params ########################
s3_bucket_name: litellm-proxy
s3_bucket_name: litellm-proxy-941277531214
s3_region_name: us-west-2
s3_access_key_id: os.environ/AWS_ACCESS_KEY_ID
s3_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
aws_batch_role_arn: arn:aws:iam::888602223428:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV
aws_batch_role_arn: arn:aws:iam::941277531214:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV
model_info:
mode: batch
@@ -55,7 +55,7 @@ guardrails:
litellm_params:
guardrail: bedrock # supported values: "bedrock", "lakera"
mode: "during_call"
guardrailIdentifier: ff6ujrregl1q
guardrailIdentifier: 4w3d1di3snt5
guardrailVersion: "DRAFT"
- guardrail_name: "custom-pre-guard"
litellm_params:
@@ -168,7 +168,7 @@ async def test_a2a_completion_bridge_bedrock_agentcore():
litellm._turn_on_debug()
# Bedrock AgentCore ARN (streaming-capable runtime)
agentcore_arn = "arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC"
agentcore_arn = "arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp"
send_message_payload = {
"message": {
@@ -38,7 +38,7 @@ async def test_async_create_file():
file=open(file_path, "rb"),
purpose="batch",
custom_llm_provider="bedrock",
s3_bucket_name="litellm-proxy",
s3_bucket_name="litellm-proxy-941277531214",
)
@@ -55,7 +55,7 @@ async def test_async_file_and_batch():
file=open(file_path, "rb"),
purpose="batch",
custom_llm_provider="bedrock",
s3_bucket_name="litellm-proxy",
s3_bucket_name="litellm-proxy-941277531214",
)
print("CREATED FILE RESPONSE=", file_obj)
@@ -70,7 +70,7 @@ async def test_async_file_and_batch():
# bedrock specific params
#########################################################
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
aws_batch_role_arn="arn:aws:iam::888602223428:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV",
aws_batch_role_arn="arn:aws:iam::941277531214:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV",
)
print("CREATED BATCH RESPONSE=", create_batch_response)
@@ -129,7 +129,7 @@ async def test_mock_bedrock_file_url_mapping():
),
purpose="batch",
custom_llm_provider="bedrock",
s3_bucket_name="litellm-proxy",
s3_bucket_name="litellm-proxy-941277531214",
)
print(f"PUT URL: {captured_put_url}")
@@ -20,7 +20,7 @@ async def test_bedrock_guardrails_pii_masking():
mock_user_api_key_dict = UserAPIKeyAuth()
guardrail = BedrockGuardrail(
guardrailIdentifier="wf0hkdb5x07f",
guardrailIdentifier="zgkmukebruil",
guardrailVersion="DRAFT",
)
@@ -60,7 +60,7 @@ async def test_bedrock_guardrails_pii_masking_content_list():
mock_user_api_key_dict = UserAPIKeyAuth()
guardrail = BedrockGuardrail(
guardrailIdentifier="wf0hkdb5x07f",
guardrailIdentifier="zgkmukebruil",
guardrailVersion="DRAFT",
)
@@ -115,7 +115,7 @@ async def test_bedrock_guardrails_block_messages_api():
mock_user_api_key_dict = UserAPIKeyAuth()
guardrail = BedrockGuardrail(
guardrailIdentifier="ff6ujrregl1q",
guardrailIdentifier="4w3d1di3snt5",
guardrailVersion="DRAFT",
)
@@ -166,7 +166,7 @@ async def test_bedrock_guardrails_block_responses_api():
mock_user_api_key_dict = UserAPIKeyAuth()
guardrail = BedrockGuardrail(
guardrailIdentifier="ff6ujrregl1q",
guardrailIdentifier="4w3d1di3snt5",
guardrailVersion="DRAFT",
)
@@ -211,7 +211,7 @@ async def test_bedrock_guardrails_with_streaming():
)
guardrail = BedrockGuardrail(
guardrailIdentifier="ff6ujrregl1q",
guardrailIdentifier="4w3d1di3snt5",
guardrailVersion="DRAFT",
supported_event_hooks=[GuardrailEventHooks.post_call],
guardrail_name="bedrock-post-guard",
@@ -255,7 +255,7 @@ async def test_bedrock_guardrails_with_streaming_no_violation():
)
guardrail = BedrockGuardrail(
guardrailIdentifier="ff6ujrregl1q",
guardrailIdentifier="4w3d1di3snt5",
guardrailVersion="DRAFT",
supported_event_hooks=[GuardrailEventHooks.post_call],
guardrail_name="bedrock-post-guard",
@@ -299,7 +299,7 @@ async def test_bedrock_guardrails_streaming_request_body_mock():
# Create the guardrail
guardrail = BedrockGuardrail(
guardrailIdentifier="wf0hkdb5x07f",
guardrailIdentifier="zgkmukebruil",
guardrailVersion="DRAFT",
supported_event_hooks=[GuardrailEventHooks.post_call],
guardrail_name="bedrock-post-guard",
@@ -382,7 +382,7 @@ async def test_bedrock_guardrail_aws_param_persistence():
from litellm.types.guardrails import GuardrailEventHooks
guardrail = BedrockGuardrail(
guardrailIdentifier="wf0hkdb5x07f",
guardrailIdentifier="zgkmukebruil",
guardrailVersion="DRAFT",
aws_access_key_id="test-access-key",
aws_secret_access_key="test-secret-key",
@@ -1,3 +1,4 @@
import json
import logging
import os
import sys
@@ -44,6 +45,9 @@ from litellm.llms.bedrock.image_generation.image_handler import (
)
from litellm.llms.bedrock.common_utils import BedrockError
# Base64 placeholder used for mocked Bedrock image responses (a 1x1 PNG).
_MOCK_BEDROCK_IMAGE_B64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
@pytest.mark.parametrize(
"model,expected",
@@ -528,17 +532,34 @@ def test_backward_compatibility_regular_nova_model():
def test_amazon_titan_image_gen():
"""Test Amazon Titan image generation with cost tracking."""
from litellm import image_generation
"""Test Amazon Titan image generation with cost tracking.
The Bedrock CI account is not entitled to amazon.titan-image-generator, so
the network call is mocked and only the transform + cost-tracking path is
exercised.
"""
from litellm.llms.custom_httpx.http_handler import HTTPHandler
# Use v2 as v1 has reached end of life
model_id = "bedrock/amazon.titan-image-generator-v2:0"
response = litellm.image_generation(
model=model_id,
prompt="A serene mountain landscape at sunset with a lake reflection",
aws_region_name="us-east-1",
)
mock_payload = {"images": [_MOCK_BEDROCK_IMAGE_B64]}
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = mock_payload
mock_response.text = json.dumps(mock_payload)
mock_response.headers = {}
client = HTTPHandler()
with patch.object(client, "post", return_value=mock_response):
response = litellm.image_generation(
model=model_id,
prompt="A serene mountain landscape at sunset with a lake reflection",
aws_region_name="us-east-1",
aws_access_key_id="fake-access-key-id",
aws_secret_access_key="fake-secret-access-key",
client=client,
)
print(f"response cost: {response._hidden_params['response_cost']}")
@@ -7,7 +7,6 @@ import sys
import traceback
from unittest.mock import AsyncMock, MagicMock, patch
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
@@ -136,6 +135,51 @@ class TestVertexAIGeminiImageGeneration(BaseImageGenTest):
}
# Base64 placeholder used for mocked Bedrock image responses (a 1x1 PNG).
_MOCK_BEDROCK_IMAGE_B64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
async def _assert_mocked_bedrock_image_generation(call_args: dict) -> None:
"""Run ``aimage_generation`` with the Bedrock HTTP call mocked.
The CI account is not entitled to Nova Canvas, so the network call is
replaced with a canned Bedrock response. This keeps the request transform,
response transform, and cost-tracking path under test without live access.
"""
mock_payload = {"images": [_MOCK_BEDROCK_IMAGE_B64]}
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = mock_payload
mock_response.text = json.dumps(mock_payload)
mock_response.headers = {}
custom_logger = TestCustomLogger()
litellm.logging_callback_manager._reset_all_callbacks()
litellm.callbacks = [custom_logger]
with patch(
"litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler.post",
new_callable=AsyncMock,
return_value=mock_response,
):
response = await litellm.aimage_generation(
**call_args,
prompt="A image of a otter",
aws_access_key_id="fake-access-key-id",
aws_secret_access_key="fake-secret-access-key",
)
await asyncio.sleep(1)
assert custom_logger.standard_logging_payload is not None
assert custom_logger.standard_logging_payload["response_cost"] is not None
assert custom_logger.standard_logging_payload["response_cost"] > 0
assert response.data is not None
for d in response.data:
assert isinstance(d, Image)
assert d.b64_json is not None or d.url is not None
class TestBedrockNovaCanvasTextToImage(BaseImageGenTest):
def get_base_image_generation_call_args(self) -> dict:
litellm.in_memory_llm_clients_cache = InMemoryCache()
@@ -148,6 +192,12 @@ class TestBedrockNovaCanvasTextToImage(BaseImageGenTest):
"aws_region_name": "us-east-1",
}
@pytest.mark.asyncio(scope="module")
async def test_basic_image_generation(self):
await _assert_mocked_bedrock_image_generation(
self.get_base_image_generation_call_args()
)
class TestBedrockNovaCanvasColorGuidedGeneration(BaseImageGenTest):
def get_base_image_generation_call_args(self) -> dict:
@@ -162,6 +212,12 @@ class TestBedrockNovaCanvasColorGuidedGeneration(BaseImageGenTest):
"aws_region_name": "us-east-1",
}
@pytest.mark.asyncio(scope="module")
async def test_basic_image_generation(self):
await _assert_mocked_bedrock_image_generation(
self.get_base_image_generation_call_args()
)
class TestOpenAIGPTImage1(BaseImageGenTest):
def get_base_image_generation_call_args(self) -> dict:
@@ -82,7 +82,7 @@ async def _vertex_ai_mocks():
"bedrock/mistral.mistral-7b-instruct-v0:2",
"openai/gpt-4o",
"openai/self_hosted",
"bedrock/anthropic.claude-3-5-haiku-20241022-v1:0",
"bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0",
"vertex_ai/gemini-1.5-flash",
],
)
@@ -147,7 +147,7 @@ async def test_litellm_overhead_non_streaming(model):
[
"bedrock/mistral.mistral-7b-instruct-v0:2",
"openai/gpt-4o",
"bedrock/anthropic.claude-3-5-haiku-20241022-v1:0",
"bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0",
"openai/self_hosted",
],
)
@@ -1,7 +1,6 @@
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple
OMIT = object()
@@ -22,6 +21,7 @@ class ModelEntry:
extra_params: Tuple[Tuple[str, str], ...] = field(default_factory=tuple)
required_env: FrozenSet[str] = field(default_factory=frozenset)
caps: FrozenSet[str] = field(default_factory=frozenset)
fail_reason: Optional[str] = None
def params(self) -> Dict[str, str]:
return dict(self.extra_params)
@@ -205,6 +205,12 @@ BEDROCK_CONVERSE_MODELS: Tuple[ModelEntry, ...] = (
extra_params=(("aws_region_name", "us-east-1"),),
required_env=_BEDROCK_REQ,
caps=_CAPS_OPUS_4_7,
fail_reason=(
"claude-opus-4-7 is not entitled on the Bedrock CI account "
"941277531214 (model access requires an AWS Sales request, not "
"self-serve); this cell fails on purpose so it stays loud in CI — "
"remove this fail_reason once access is granted"
),
),
ModelEntry(
alias="bedrock-claude-opus-4-6",
@@ -15,7 +15,6 @@ from .grid_spec import (
all_cells,
)
_PROMPT_MESSAGES: List[Dict[str, str]] = [
{"role": "user", "content": "Step by step, calculate 47 * 53. Show your work."}
]
@@ -168,6 +167,9 @@ async def test_reasoning_effort_grid(
if skip_reason:
pytest.skip(skip_reason)
if model.fail_reason:
pytest.xfail(model.fail_reason)
if route_name == "bedrock_invoke_messages":
status, exc = await _call_messages(model, effort)
else:
@@ -19,8 +19,8 @@ import httpx
@pytest.mark.parametrize(
"model",
[
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_13sf6-cALnp38iZD", # non-streaming invocation
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", # streaming invocation
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy", # non-streaming invocation
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", # streaming invocation
],
)
def test_bedrock_agentcore_basic(model):
@@ -44,7 +44,7 @@ def test_bedrock_agentcore_basic(model):
@pytest.mark.parametrize(
"model",
[
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_13sf6-cALnp38iZD", # streaming invocation
"bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy", # streaming invocation
],
)
async def test_bedrock_agentcore_with_streaming(model):
@@ -54,7 +54,7 @@ async def test_bedrock_agentcore_with_streaming(model):
print("running streming test for model=", model)
# litellm._turn_on_debug()
response = await litellm.acompletion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -82,7 +82,7 @@ def test_bedrock_agentcore_with_custom_params():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -105,7 +105,7 @@ def test_bedrock_agentcore_with_custom_params():
url = call_kwargs["url"]
print(f"URL: {url}")
assert (
"/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-west-2%3A888602223428%3Aruntime%2Fhosted_agent_r9jvp-3ySZuRHjLC/invocations"
"/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-west-2%3A941277531214%3Aruntime%2Fhosted_agent_r9jvp-Rq79QFC2fp/invocations"
in url
)
assert "qualifier=DEFAULT" in url
@@ -150,7 +150,7 @@ def test_bedrock_agentcore_with_runtime_user_id():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -189,7 +189,7 @@ def test_bedrock_agentcore_with_session_and_user():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -234,7 +234,7 @@ def test_bedrock_agentcore_with_api_key_bearer_token():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -282,7 +282,7 @@ def test_bedrock_agentcore_with_all_parameters():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -350,7 +350,7 @@ def test_bedrock_agentcore_without_api_key_uses_sigv4():
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -625,7 +625,7 @@ def test_agentcore_synchronous_non_streaming_response():
with patch.object(client, "post", return_value=mock_response) as mock_post:
# Make a synchronous (non-streaming) completion call
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp",
messages=[
{
"role": "user",
@@ -115,7 +115,7 @@ def test_completion_bedrock_guardrails(streaming):
],
max_tokens=10,
guardrailConfig={
"guardrailIdentifier": "ff6ujrregl1q",
"guardrailIdentifier": "4w3d1di3snt5",
"guardrailVersion": "DRAFT",
"trace": "enabled",
},
@@ -144,7 +144,7 @@ def test_completion_bedrock_guardrails(streaming):
stream=True,
max_tokens=10,
guardrailConfig={
"guardrailIdentifier": "ff6ujrregl1q",
"guardrailIdentifier": "4w3d1di3snt5",
"guardrailVersion": "DRAFT",
"trace": "enabled",
},
@@ -475,7 +475,7 @@ def test_bedrock_claude_3(image_url):
],
}
response: ModelResponse = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
num_retries=3,
**data,
) # type: ignore
@@ -498,7 +498,7 @@ def test_bedrock_claude_3(image_url):
@pytest.mark.parametrize(
"model",
[
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
# "meta.llama3-70b-instruct-v1:0",
# "anthropic.claude-v2",
# "mistral.mixtral-8x7b-instruct-v0:1",
@@ -537,7 +537,7 @@ def test_bedrock_stop_value(stop, model):
@pytest.mark.parametrize(
"model",
[
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
"mistral.mixtral-8x7b-instruct-v0:1",
],
)
@@ -602,7 +602,7 @@ def test_bedrock_claude_3_tool_calling():
}
]
response: ModelResponse = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=messages,
tools=tools,
tool_choice="auto",
@@ -630,7 +630,7 @@ def test_bedrock_claude_3_tool_calling():
)
# In the second response, Claude should deduce answer from tool results
second_response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=messages,
tools=tools,
tool_choice="auto",
@@ -737,7 +737,7 @@ def test_bedrock_ptu():
from openai.types.chat import ChatCompletion
model_id = (
"arn:aws:bedrock:us-west-2:888602223428:provisioned-model/8fxff74qyhs3"
"arn:aws:bedrock:us-west-2:941277531214:provisioned-model/8fxff74qyhs3"
)
try:
response = litellm.completion(
@@ -752,7 +752,7 @@ def test_bedrock_ptu():
assert "url" in mock_client_post.call_args.kwargs
assert (
mock_client_post.call_args.kwargs["url"]
== "https://bedrock-runtime.us-west-2.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-west-2%3A888602223428%3Aprovisioned-model%2F8fxff74qyhs3/converse"
== "https://bedrock-runtime.us-west-2.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-west-2%3A941277531214%3Aprovisioned-model%2F8fxff74qyhs3/converse"
)
mock_client_post.assert_called_once()
@@ -2327,7 +2327,7 @@ def test_bedrock_cross_region_inference(monkeypatch):
def test_bedrock_empty_content_real_call():
completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=[
{
"role": "user",
@@ -299,7 +299,10 @@ def test_completion_claude_3():
@pytest.mark.parametrize(
"model",
["anthropic/claude-sonnet-4-5-20250929", "anthropic.claude-3-sonnet-20240229-v1:0"],
[
"anthropic/claude-sonnet-4-5-20250929",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
],
)
def test_completion_claude_3_function_call(model):
litellm.set_verbose = True
@@ -385,7 +388,7 @@ def test_completion_claude_3_function_call(model):
[
("gpt-3.5-turbo", None, None),
("claude-sonnet-4-5-20250929", None, None),
("anthropic.claude-3-sonnet-20240229-v1:0", None, None),
("us.anthropic.claude-sonnet-4-5-20250929-v1:0", None, None),
# (
# "azure_ai/command-r-plus",
# os.getenv("AZURE_COHERE_API_KEY"),
@@ -1578,7 +1581,7 @@ def test_completion_openai():
[
# ("gpt-4o-2024-08-06", None),
# ("azure/gpt-4.1-mini", None),
("bedrock/anthropic.claude-3-sonnet-20240229-v1:0", None),
("bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", None),
# ("azure/gpt-4o-new-test", "2024-08-01-preview"),
],
)
@@ -1666,15 +1669,13 @@ def custom_callback(
#################################################
print(
f"""
print(f"""
Model: {model},
Messages: {messages},
User: {user},
Seed: {kwargs["seed"]},
temperature: {kwargs["temperature"]},
"""
)
""")
assert kwargs["user"] == "ishaans app"
assert kwargs["model"] == "gpt-3.5-turbo-1106"
@@ -2699,7 +2700,7 @@ def test_bedrock_deepseek_custom_prompt_dict():
def test_bedrock_deepseek_known_tokenizer_config(monkeypatch):
model = (
"deepseek_r1/arn:aws:bedrock:us-west-2:888602223428:imported-model/bnnr6463ejgf"
"deepseek_r1/arn:aws:bedrock:us-west-2:941277531214:imported-model/bnnr6463ejgf"
)
from litellm.llms.custom_httpx.http_handler import HTTPHandler
from unittest.mock import Mock
@@ -2914,8 +2915,8 @@ def response_format_tests(response: litellm.ModelResponse):
"model",
[
"bedrock/mistral.mistral-large-2407-v1:0",
"bedrock/cohere.command-r-plus-v1:0",
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-haiku-4-5-20251001-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
"mistral.mistral-7b-instruct-v0:2",
"meta.llama3-8b-instruct-v1:0",
],
@@ -142,7 +142,8 @@ def trade(model_name: str) -> List[Trade]: # type: ignore
@pytest.mark.parametrize(
"model", ["claude-haiku-4-5-20251001", "anthropic.claude-3-haiku-20240307-v1:0"]
"model",
["claude-haiku-4-5-20251001", "us.anthropic.claude-haiku-4-5-20251001-v1:0"],
)
@pytest.mark.flaky(retries=6, delay=10)
def test_function_call_parsing(model):
+4 -5
@@ -49,7 +49,7 @@ def get_current_weather(location, unit="fahrenheit"):
"mistral/mistral-large-latest",
"claude-haiku-4-5-20251001",
"gemini/gemini-2.5-flash-lite",
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
],
)
@pytest.mark.flaky(retries=3, delay=1)
@@ -267,7 +267,6 @@ def test_aaparallel_function_call_with_anthropic_thinking(model):
from litellm.types.utils import ChatCompletionMessageToolCall, Function, Message
_PARALLEL_TOOL_HISTORY_MESSAGES = [
{
"role": "user",
@@ -303,7 +302,7 @@ _PARALLEL_TOOL_HISTORY_MESSAGES = [
[
# Bedrock Converse still requires modify_params to inject the dummy tool.
(
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
_PARALLEL_TOOL_HISTORY_MESSAGES,
True,
),
@@ -314,7 +313,7 @@ _PARALLEL_TOOL_HISTORY_MESSAGES = [
False,
),
(
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
[
{
"role": "user",
@@ -579,7 +578,7 @@ def test_groq_parallel_function_call():
@pytest.mark.parametrize(
"model",
[
"bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
"bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
],
)
def test_passing_tool_result_as_list(model):
+14 -14
@@ -57,7 +57,7 @@ async def test_completion_sagemaker(sync_mode):
print("testing sagemaker")
if sync_mode is True:
response = litellm.completion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
model="sagemaker/litellm-ci-textgen",
messages=[
{"role": "user", "content": "hi"},
],
@@ -67,7 +67,7 @@ async def test_completion_sagemaker(sync_mode):
)
else:
response = await litellm.acompletion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
model="sagemaker/litellm-ci-textgen",
messages=[
{"role": "user", "content": "hi"},
],
@@ -158,7 +158,7 @@ async def test_completion_sagemaker_messages_api(sync_mode):
"model",
[
# "sagemaker_chat/huggingface-pytorch-tgi-inference-2024-08-23-15-48-59-245",
"sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"sagemaker/litellm-ci-textgen",
],
)
# @pytest.mark.flaky(retries=3, delay=1)
@@ -218,7 +218,7 @@ async def test_completion_sagemaker_stream(sync_mode, model):
"model",
[
# "sagemaker_chat/huggingface-pytorch-tgi-inference-2024-08-23-15-48-59-245",
"sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"sagemaker/litellm-ci-textgen",
],
)
async def test_completion_sagemaker_streaming_bad_request(sync_mode, model):
@@ -256,7 +256,7 @@ async def test_acompletion_sagemaker_non_stream():
"id": "cmpl-mockid",
"object": "text_completion",
"created": 1629800000,
"model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"model": "sagemaker/litellm-ci-textgen",
"choices": [
{
"text": "This is a mock response from SageMaker.",
@@ -282,7 +282,7 @@ async def test_acompletion_sagemaker_non_stream():
) as mock_post:
# Act: Call the litellm.acompletion function
response = await litellm.acompletion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
model="sagemaker/litellm-ci-textgen",
messages=[
{"role": "user", "content": "hi"},
],
@@ -302,7 +302,7 @@ async def test_acompletion_sagemaker_non_stream():
assert args_to_sagemaker == expected_payload
assert (
kwargs["url"]
== "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations"
== "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/litellm-ci-textgen/invocations"
)
@@ -316,7 +316,7 @@ async def test_completion_sagemaker_non_stream():
"id": "cmpl-mockid",
"object": "text_completion",
"created": 1629800000,
"model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"model": "sagemaker/litellm-ci-textgen",
"choices": [
{
"text": "This is a mock response from SageMaker.",
@@ -342,7 +342,7 @@ async def test_completion_sagemaker_non_stream():
) as mock_post:
# Act: Call the litellm.completion function
response = litellm.completion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
model="sagemaker/litellm-ci-textgen",
messages=[
{"role": "user", "content": "hi"},
],
@@ -362,7 +362,7 @@ async def test_completion_sagemaker_non_stream():
assert args_to_sagemaker == expected_payload
assert (
kwargs["url"]
== "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations"
== "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/litellm-ci-textgen/invocations"
)
@@ -377,7 +377,7 @@ async def test_completion_sagemaker_prompt_template_non_stream():
"id": "cmpl-mockid",
"object": "text_completion",
"created": 1629800000,
"model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"model": "sagemaker/litellm-ci-textgen",
"choices": [
{
"text": "This is a mock response from SageMaker.",
@@ -433,7 +433,7 @@ async def test_completion_sagemaker_non_stream_with_aws_params():
"id": "cmpl-mockid",
"object": "text_completion",
"created": 1629800000,
"model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
"model": "sagemaker/litellm-ci-textgen",
"choices": [
{
"text": "This is a mock response from SageMaker.",
@@ -459,7 +459,7 @@ async def test_completion_sagemaker_non_stream_with_aws_params():
) as mock_post:
# Act: Call the litellm.completion function
response = litellm.completion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
model="sagemaker/litellm-ci-textgen",
messages=[
{"role": "user", "content": "hi"},
],
@@ -482,5 +482,5 @@ async def test_completion_sagemaker_non_stream_with_aws_params():
assert args_to_sagemaker == expected_payload
assert (
kwargs["url"]
== "https://runtime.sagemaker.us-west-5.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations"
== "https://runtime.sagemaker.us-west-5.amazonaws.com/endpoints/litellm-ci-textgen/invocations"
)
+4 -4
@@ -1174,7 +1174,7 @@ async def test_completion_replicate_llama3_streaming(sync_mode):
[
# ["bedrock/ai21.jamba-instruct-v1:0", "us-east-1"],
# ["bedrock/cohere.command-r-plus-v1:0", None],
["anthropic.claude-3-sonnet-20240229-v1:0", None],
["us.anthropic.claude-sonnet-4-5-20250929-v1:0", None],
# ["mistral.mistral-7b-instruct-v0:2", None],
# ["meta.llama3-8b-instruct-v1:0", None],
],
@@ -1246,7 +1246,7 @@ def test_bedrock_claude_3_streaming():
try:
litellm.set_verbose = True
response: ModelResponse = completion( # type: ignore
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=messages,
max_tokens=10, # type: ignore
stream=True,
@@ -1276,7 +1276,7 @@ def test_bedrock_claude_3_streaming():
"model",
[
"claude-haiku-4-5-20251001",
"cohere.command-r-plus-v1:0", # bedrock
"us.anthropic.claude-haiku-4-5-20251001-v1:0", # bedrock
"gpt-3.5-turbo",
],
)
@@ -3500,7 +3500,7 @@ def test_unit_test_perplexity_citations_chunk():
[
"gpt-3.5-turbo",
"claude-sonnet-4-5-20250929",
"anthropic.claude-3-sonnet-20240229-v1:0",
"us.anthropic.claude-sonnet-4-5-20250929-v1:0",
# "vertex_ai/claude-3-5-sonnet@20240620",
],
)
@@ -27,7 +27,7 @@ async def test_basic_s3_logging(sync_mode, streaming):
verbose_logger.setLevel(level=logging.DEBUG)
litellm.success_callback = ["s3"]
litellm.s3_callback_params = {
"s3_bucket_name": "load-testing-oct",
"s3_bucket_name": "load-testing-oct-941277531214",
"s3_aws_secret_access_key": "os.environ/AWS_SECRET_ACCESS_KEY",
"s3_aws_access_key_id": "os.environ/AWS_ACCESS_KEY_ID",
"s3_region_name": "us-west-2",
@@ -64,14 +64,14 @@ async def test_basic_s3_logging(sync_mode, streaming):
await asyncio.sleep(2)
print(f"response: {response}")
total_objects, all_s3_keys = list_all_s3_objects("load-testing-oct")
total_objects, all_s3_keys = list_all_s3_objects("load-testing-oct-941277531214")
# assert that at least one key has response.id in it
assert any(response_id in key for key in all_s3_keys)
s3 = boto3.client("s3")
# delete all objects
for key in all_s3_keys:
s3.delete_object(Bucket="load-testing-oct", Key=key)
s3.delete_object(Bucket="load-testing-oct-941277531214", Key=key)
@pytest.mark.asyncio
@@ -82,7 +82,7 @@ async def test_basic_s3_v2_logging(streaming):
from litellm.integrations.s3_v2 import S3Logger
litellm.s3_callback_params = {
"s3_bucket_name": "load-testing-oct",
"s3_bucket_name": "load-testing-oct-941277531214",
"s3_aws_secret_access_key": "test-secret",
"s3_aws_access_key_id": "test-key",
"s3_region_name": "us-west-2",
@@ -2,7 +2,6 @@ import io
import os
import sys
sys.path.insert(0, os.path.abspath("../.."))
import asyncio
@@ -67,7 +66,7 @@ def setup_vector_store_registry():
litellm.vector_store_registry = VectorStoreRegistry(
vector_stores=[
LiteLLM_ManagedVectorStore(
vector_store_id="T37J8R4WTM", custom_llm_provider="bedrock"
vector_store_id="LCYXFBR2TU", custom_llm_provider="bedrock"
)
]
)
@@ -111,7 +110,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_completion(
response = await litellm.acompletion(
model="anthropic/claude-3.5-sonnet",
messages=[{"role": "user", "content": "what is litellm?"}],
vector_store_ids=["T37J8R4WTM"],
vector_store_ids=["LCYXFBR2TU"],
client=client,
)
except Exception as e:
@@ -152,7 +151,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call(
response = await litellm.acompletion(
model="bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0",
messages=[{"role": "user", "content": "what is litellm?"}],
vector_store_ids=["T37J8R4WTM"],
vector_store_ids=["LCYXFBR2TU"],
client=async_client,
)
print("OPENAI RESPONSE:", json.dumps(dict(response), indent=4, default=str))
@@ -196,7 +195,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_streaming(
response = await litellm.acompletion(
model=f"anthropic/{os.environ.get('CI_CD_DEFAULT_ANTHROPIC_MODEL', 'claude-haiku-4-5-20251001')}",
messages=[{"role": "user", "content": "what is litellm?"}],
vector_store_ids=["T37J8R4WTM"],
vector_store_ids=["LCYXFBR2TU"],
stream=True,
client=async_client,
)
@@ -255,7 +254,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_with_tools(
model=f"anthropic/{os.environ.get('CI_CD_DEFAULT_ANTHROPIC_MODEL', 'claude-haiku-4-5-20251001')}",
messages=[{"role": "user", "content": "what is litellm?"}],
max_tokens=10,
tools=[{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}],
tools=[{"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]}],
)
assert response is not None
@@ -279,7 +278,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_with_tools_
tools=[
{
"type": "file_search",
"vector_store_ids": ["T37J8R4WTM"],
"vector_store_ids": ["LCYXFBR2TU"],
"filters": {
"key": "user_id",
"value": "fake-user-id",
@@ -387,7 +386,7 @@ async def test_bedrock_kb_request_body_has_transformed_filters(
tools=[
{
"type": "file_search",
"vector_store_ids": ["T37J8R4WTM"],
"vector_store_ids": ["LCYXFBR2TU"],
"filters": {
"key": "user_id",
"value": "fake-user-id",
@@ -461,7 +460,7 @@ async def test_openai_with_knowledge_base_mock_openai(setup_vector_store_registr
await litellm.acompletion(
model="gpt-5.5",
messages=[{"role": "user", "content": "what is litellm?"}],
vector_store_ids=["T37J8R4WTM"],
vector_store_ids=["LCYXFBR2TU"],
client=client,
)
except Exception as e:
@@ -537,7 +536,7 @@ async def test_openai_with_vector_store_ids_in_tool_call_mock_openai(
await litellm.acompletion(
model="gpt-5.5",
messages=[{"role": "user", "content": "what is litellm?"}],
tools=[{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}],
tools=[{"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]}],
client=client,
)
except Exception as e:
@@ -611,7 +610,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist
model="gpt-5.5",
messages=[{"role": "user", "content": "what is litellm?"}],
tools=[
{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]},
{"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]},
{"type": "file_search", "vector_store_ids": ["unknownVS"]},
],
client=client,
@@ -645,7 +644,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist
# model="gpt-5.5",
# messages=[{"role": "user", "content": "what is litellm?"}],
# vector_store_ids = [
# "T37J8R4WTM"
# "LCYXFBR2TU"
# ],
# )
@@ -667,7 +666,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist
# # expect the vector store request metadata object to have the correct values
# vector_store_request_metadata = standard_logging_vector_store_request_metadata[0]
# assert vector_store_request_metadata.get("vector_store_id") == "T37J8R4WTM"
# assert vector_store_request_metadata.get("vector_store_id") == "LCYXFBR2TU"
# assert vector_store_request_metadata.get("query") == "what is litellm?"
# assert vector_store_request_metadata.get("custom_llm_provider") == "bedrock"
@@ -723,7 +722,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_without_vector_store_registry
response = await litellm.acompletion(
model="anthropic/claude-3.5-sonnet",
messages=[{"role": "user", "content": "what is litellm?"}],
vector_store_ids=["T37J8R4WTM"],
vector_store_ids=["LCYXFBR2TU"],
client=client,
)
except Exception as e:
@@ -76,7 +76,7 @@ class TestAgentCoreAcceptHeader:
with patch.object(client, "post", return_value=MagicMock()) as mock_post:
try:
litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_runtime",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_runtime",
messages=[{"role": "user", "content": "test"}],
api_key="test-jwt-token",
client=client,
@@ -281,7 +281,7 @@ class TestAgentCoreStreamingJsonFallback:
with patch.object(client, "post", return_value=mock_response):
response = litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent",
messages=[{"role": "user", "content": "test"}],
stream=True,
client=client,
@@ -318,7 +318,7 @@ class TestAgentCoreStreamingJsonFallback:
client, "post", new_callable=AsyncMock, return_value=mock_response
):
response = await litellm.acompletion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent",
messages=[{"role": "user", "content": "test"}],
stream=True,
client=client,
@@ -353,7 +353,7 @@ class TestAgentCoreStreamingJsonFallback:
Exception, match="Failed to read/parse JSON response body"
):
litellm.completion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent",
messages=[{"role": "user", "content": "test"}],
stream=True,
client=client,
@@ -383,7 +383,7 @@ class TestAgentCoreStreamingJsonFallback:
Exception, match="Failed to read/parse JSON response body"
):
await litellm.acompletion(
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent",
model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent",
messages=[{"role": "user", "content": "test"}],
stream=True,
client=client,
+1 -1
@@ -446,7 +446,7 @@ async def test_chat_completion_anthropic_structured_output():
client = AsyncOpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
res = await client.beta.chat.completions.parse(
model="bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0",
model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
messages=messages,
response_format=EventsList,
timeout=60,
@@ -22,7 +22,7 @@ class TestBedrockVectorStore(BaseVectorStoreTest):
def get_base_request_args(self):
return {
"vector_store_id": "T37J8R4WTM",
"vector_store_id": "LCYXFBR2TU",
"custom_llm_provider": "bedrock",
"query": "what happens after we add a model",
}
@@ -106,7 +106,7 @@ async def test_bedrock_search_with_router():
_router = Router(model_list=[])
search_response = await _router.avector_store_search(
query="what happens after we add a model",
vector_store_id="T37J8R4WTM",
vector_store_id="LCYXFBR2TU",
custom_llm_provider="bedrock",
)
print(search_response)
@@ -150,7 +150,7 @@ async def test_bedrock_search_with_credentials_managed_registry():
# Create vector store with credential reference
vector_store = LiteLLM_ManagedVectorStore(
vector_store_id="T37J8R4WTM",
vector_store_id="LCYXFBR2TU",
custom_llm_provider="bedrock",
created_at=datetime.now(timezone.utc),
updated_at=datetime.now(timezone.utc),
@@ -162,7 +162,7 @@ async def test_bedrock_search_with_credentials_managed_registry():
litellm.vector_store_registry = registry
# Verify credentials can be retrieved from registry
retrieved_credentials = registry.get_credentials_for_vector_store("T37J8R4WTM")
retrieved_credentials = registry.get_credentials_for_vector_store("LCYXFBR2TU")
assert retrieved_credentials, "Should retrieve credentials from registry"
assert retrieved_credentials.get("aws_access_key_id") == "test_access_key"
assert retrieved_credentials.get("aws_secret_access_key") == "test_secret_key"
@@ -194,7 +194,7 @@ async def test_bedrock_search_with_credentials_managed_registry():
search_response = await _router.avector_store_search(
query="what happens after we add a model",
vector_store_id="T37J8R4WTM",
vector_store_id="LCYXFBR2TU",
custom_llm_provider="bedrock",
)
@@ -203,7 +203,7 @@ async def test_bedrock_search_with_credentials_managed_registry():
call_kwargs = mock_handler.call_args[1]
# Verify that the credential accessor was called with the correct vector store ID
mock_get_creds.assert_called_with("T37J8R4WTM")
mock_get_creds.assert_called_with("LCYXFBR2TU")
# Verify the credentials were injected into the search call
litellm_params = call_kwargs.get("litellm_params", {})
@@ -224,7 +224,7 @@ async def test_bedrock_search_with_credentials_managed_registry():
assert search_response["data"][0]["id"] == "test_result"
print(
f"✅ Test passed: Credential accessor was called with vector store ID: T37J8R4WTM"
f"✅ Test passed: Credential accessor was called with vector store ID: LCYXFBR2TU"
)
print(f"✅ Retrieved credentials: {retrieved_credentials}")
print(f"✅ Credentials were injected into search call")