From f9407bc0366ef9075e1cd298f2d344d3e11267f2 Mon Sep 17 00:00:00 2001 From: Mateo Wang <277851410+mateo-berri@users.noreply.github.com> Date: Mon, 25 May 2026 12:03:17 -0700 Subject: [PATCH] chore(tests): migrate Bedrock CI to AWS account 941277531214 (#28728) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * chore(tests): migrate Bedrock CI from AWS account 888602223428 to 941277531214 The original account (888602223428) was put under a security restriction by AWS after a root access key leaked in a PR comment. While that account works its way through the AWS Support unlock process, Bedrock-touching CI tests have been migrated to a fresh account (941277531214). Changes: - Replace 26 hardcoded references to 888602223428 with 941277531214 across 8 files (provisioned-model ARNs, imported-model ARNs, AgentCore runtime ARNs, batch execution role ARN, and example proxy config). - The provisioned-model and imported-model ARNs are referenced only from mocked unit tests — no AWS resources to recreate. - The batch execution IAM role has been recreated in the new account with the same name and equivalent permissions. - The two AgentCore runtimes (hosted_agent_r9jvp-3ySZuRHjLC, hosted_agent_13sf6-cALnp38iZD) are being recreated in the new account under the same names — see tools/agentcore-deploy/ in a follow-up. CircleCI env vars AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION_NAME were updated separately via the CircleCI API to point at the new account. Smoke-tested locally against the new account: aws bedrock-runtime converse --region us-west-2 \ --model-id us.anthropic.claude-sonnet-4-5-20250929-v1:0 \ --messages '[{"role":"user","content":[{"text":"ping"}]}]' → 200, model returned 'pong' Co-Authored-By: Claude Opus 4.7 * chore(tests): refresh AgentCore ARN suffixes to match newly-deployed runtimes The first migration commit replaced just the account ID, but AgentCore auto-assigns a random 10-char suffix to every runtime on creation — we can't reuse the original suffixes (`3ySZuRHjLC`, `cALnp38iZD`) in the new account. Updated the AgentCore-runtime ARNs in the three files that reference real runtime IDs (not the mock-based unit-test ARNs). Deployed runtimes: arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy Both runtimes are status=READY and pass a smoke invoke: $ aws bedrock-agentcore invoke-agent-runtime --agent-runtime-arn ... --payload '{"prompt":"ping"}' → 200, {"result": "echo: ping"} The agent is a minimal echo (see /tmp/agentcore_deploy/agent.py for the deploy artifacts). Tests that only verify the SDK wiring will pass; if any test asserts on agent output content, swap the echo for the real agent. Co-Authored-By: Claude Opus 4.7 * chore(tests): point Bedrock batch tests at new-account S3 bucket The account migration (888602223428 -> 941277531214) was a flat account-ID swap, which only rewrites ARNs that embed the account number. S3 bucket names carry no account ID, so the live Bedrock batch tests still uploaded to `litellm-proxy` — a bucket that lives in the old account. S3 names are globally unique, and the old account still holds that name, so it can't be recreated in the new account. Rename to `litellm-proxy-941277531214` (account-ID suffix guarantees global uniqueness). The bucket must be created in 941277531214 and the batch execution role granted s3:GetObject/PutObject/ListBucket on it before this job is run in CI. Co-Authored-By: Claude Opus 4.7 (1M context) * chore(tests): point live S3 logging test at new-account bucket Same account-ID-free blind spot as the batch bucket: `load-testing-oct` lives in the old account and its name can't be reused globally. The `logging_testing` CI job is wired into the workflow and runs test_basic_s3_logging, which uploads to this bucket with the CI env creds, then lists and deletes objects — a live dependency. Rename to `load-testing-oct-941277531214`. The bucket must exist in the new account with the CI IAM principal granted s3:PutObject/GetObject/ListBucket/DeleteObject before this job runs. Co-Authored-By: Claude Opus 4.7 (1M context) * chore(tests): repoint Bedrock guardrail IDs to new-account guardrails The migration left guardrail IDs untouched (no account ID in them), so all live guardrail tests failed with "guardrail identifier or version does not exist" against 941277531214. Recreated both guardrails in the new account and updated the hardcoded IDs: - wf0hkdb5x07f -> zgkmukebruil (PII mask: PHONE + CREDIT_DEBIT_CARD, with explicit inputAction=ANONYMIZE so masking applies to INPUT, which is the source litellm's moderation hook sends) - ff6ujrregl1q -> 4w3d1di3snt5 (blocks "coffee"; blocked message set to the exact string the tests assert on) Updated test_bedrock_guardrails.py, otel_test_config.yaml, and the guardrailConfig in test_bedrock_completion.py. Verified locally: the 5 previously-failing guardrail tests now pass. Co-Authored-By: Claude Opus 4.7 (1M context) * test(bedrock): migrate legacy models to current inference profiles The new CI account (941277531214) cannot invoke legacy Bedrock models (AWS gates them: "marked by provider as Legacy... not actively using in the last 30 days"). Migrated the live-call tests: - anthropic.claude-3-sonnet-20240229 -> us.anthropic.claude-sonnet-4-5-20250929-v1:0 - anthropic.claude-3-haiku-20240307 -> us.anthropic.claude-haiku-4-5-20251001-v1:0 Current Claude models on Bedrock require the us. inference-profile prefix (bare on-demand ids are rejected). cohere.command-r-plus has no working replacement (all Cohere is legacy- gated in the new account): swapped to claude-haiku-4-5 in provider- agnostic param lists. amazon.titan-image-generator skipped (no working replacement). Mocked/transformation/cost tests that reference the legacy strings are intentionally left unchanged. Verified live against the new account. Co-Authored-By: Claude Opus 4.7 (1M context) * test(bedrock): repoint SageMaker + Knowledge Base to new-account resources These referenced account-scoped resources by hardcoded id that only existed in the old account, so the migration's account-ID swap missed them. Recreated in 941277531214 and repointed: - SageMaker endpoint jumpstart-dft-hf-textgeneration1-mp-20240815-185614 -> litellm-ci-textgen (gpt2 on a TGI container, ml.g5.xlarge) - Bedrock Knowledge Base T37J8R4WTM -> LCYXFBR2TU (OpenSearch Serverless vector store + titan-embed-text-v2, seeded with a LiteLLM doc) Verified live: test_sagemaker.py (12 passed) and test_bedrock_knowledgebase_hook.py (12 passed). Co-Authored-By: Claude Opus 4.7 (1M context) * test(reasoning_effort_grid): skip bedrock claude-opus-4-7 cells (not entitled on 941277531214) claude-opus-4-7 is listed in the new Bedrock CI account's foundation models but invoke is denied (AccessDeniedException: "not available for this account"). Bedrock access to the flagship Opus requires an AWS Sales request, not the self-serve model-access toggle, so it can't be enabled inline with the rest of the account migration. Add an optional `skip_reason` to ModelEntry and set it on the bedrock-claude-opus-4-7 entry; the grid test honors it via pytest.skip. Cell count (231) and route coverage are unchanged, so the structural asserts still pass. Restore coverage by deleting the one skip_reason line once access is granted. Co-Authored-By: Claude Opus 4.7 (1M context) * test(bedrock): swap/skip legacy-gated models unavailable on new CI account The migrated AWS account (941277531214) cannot access several models that the old account could, so the remaining red CI jobs were hitting real Bedrock "Access denied / Legacy" and "account not authorized" errors: - image_gen: skip both Nova Canvas test classes (amazon.nova-canvas-v1:0 is legacy-gated), matching the existing titan skip. - batches: skip test_async_file_and_batch (Bedrock batch inference is not authorized on the new account; requires an AWS support case). - litellm_overhead: swap legacy claude-3-5-haiku for the active us.anthropic.claude-haiku-4-5 inference profile. - test_completion_claude_3_function_call: swap legacy claude-3-sonnet for the active us.anthropic.claude-sonnet-4-5 inference profile. https://claude.ai/code/session_01Y7zgHYu9GX29YRwV4yiWAa * test(bedrock): fix remaining e2e legacy-model + batch failures on new CI account - e2e_openai_endpoints: skip test_bedrock_batches_api (Bedrock batch inference is not authorized on account 941277531214) and migrate the missed s3_bucket_name in oai_misc_config.yaml to litellm-proxy-941277531214. - build_and_test: swap legacy bedrock claude-3-sonnet for the active us.anthropic.claude-sonnet-4-5 inference profile in the proxy structured output e2e test. https://claude.ai/code/session_01Y7zgHYu9GX29YRwV4yiWAa * test(bedrock): make opus-4-7 + batch cells fail loudly and mock image-gen (#28791) Replace the silent skips added for the new CI account with noisier behavior: - reasoning-effort grid: opus-4-7 cells now fail (when AWS creds are present) instead of skipping, so the missing entitlement stays visible in CI; they still skip when AWS creds are absent (local dev) - Bedrock batch inference tests: drop the skip so they run and fail until batch access is granted - Titan + Nova Canvas image-gen tests: mock the Bedrock HTTP call so the transform + cost-tracking path stays under test without live model access https://claude.ai/code/session_01MT7SWDnXUjv6e6EPG7BDjT Co-authored-by: Claude * test(bedrock): use pytest.xfail for known-failing opus-4-7 cells Replace pytest.fail with pytest.xfail when a model has a fail_reason, so known-broken cells stay visible as XFAIL without keeping CI red. Co-authored-by: Yassin Kortam --------- Co-authored-by: Mateo Co-authored-by: Claude Opus 4.7 Co-authored-by: Cursor Agent Co-authored-by: Yassin Kortam --- .../bedrock/chat/agentcore/transformation.py | 6 +- .../example_config_yaml/oai_misc_config.yaml | 4 +- .../example_config_yaml/otel_test_config.yaml | 2 +- .../test_a2a_completion_bridge.py | 2 +- .../test_bedrock_files_and_batches.py | 8 +-- .../test_bedrock_guardrails.py | 16 ++--- .../test_bedrock_image_gen_unit_tests.py | 35 ++++++++--- .../image_gen_tests/test_image_generation.py | 58 ++++++++++++++++++- .../test_litellm_overhead.py | 4 +- .../reasoning_effort_grid/grid_spec.py | 8 ++- .../test_reasoning_effort_grid.py | 4 +- .../llm_translation/test_bedrock_agentcore.py | 24 ++++---- .../test_bedrock_completion.py | 20 +++---- tests/local_testing/test_completion.py | 21 +++---- .../test_function_call_parsing.py | 3 +- tests/local_testing/test_function_calling.py | 9 ++- tests/local_testing/test_sagemaker.py | 28 ++++----- tests/local_testing/test_streaming.py | 8 +-- .../test_amazing_s3_logs.py | 8 +-- .../test_bedrock_knowledgebase_hook.py | 27 +++++---- .../test_agentcore_transformation.py | 10 ++-- tests/test_openai_endpoints.py | 2 +- .../test_bedrock_vector_store.py | 14 ++--- 23 files changed, 203 insertions(+), 118 deletions(-) diff --git a/litellm/llms/bedrock/chat/agentcore/transformation.py b/litellm/llms/bedrock/chat/agentcore/transformation.py index 44ba1ce3c8..9b9b96aae0 100644 --- a/litellm/llms/bedrock/chat/agentcore/transformation.py +++ b/litellm/llms/bedrock/chat/agentcore/transformation.py @@ -157,8 +157,8 @@ class AmazonAgentCoreConfig(BaseConfig, BaseAWSLLM): def _get_agent_runtime_arn(self, model: str) -> str: """ Extract ARN from model string - model = "agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC" - returns: "arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC" + model = "agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp" + returns: "arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp" """ parts = model.split("/", 1) if len(parts) != 2 or parts[0] != "agentcore": @@ -170,7 +170,7 @@ class AmazonAgentCoreConfig(BaseConfig, BaseAWSLLM): def _extract_region_from_arn(self, arn: str) -> str: """ Extract region from ARN - arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC + arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp returns: us-west-2 """ parts = arn.split(":") diff --git a/litellm/proxy/example_config_yaml/oai_misc_config.yaml b/litellm/proxy/example_config_yaml/oai_misc_config.yaml index 16cc69c19a..0b647de8a0 100644 --- a/litellm/proxy/example_config_yaml/oai_misc_config.yaml +++ b/litellm/proxy/example_config_yaml/oai_misc_config.yaml @@ -23,11 +23,11 @@ model_list: model: bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0 ######################################################### ########## batch specific params ######################## - s3_bucket_name: litellm-proxy + s3_bucket_name: litellm-proxy-941277531214 s3_region_name: us-west-2 s3_access_key_id: os.environ/AWS_ACCESS_KEY_ID s3_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY - aws_batch_role_arn: arn:aws:iam::888602223428:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV + aws_batch_role_arn: arn:aws:iam::941277531214:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV model_info: mode: batch diff --git a/litellm/proxy/example_config_yaml/otel_test_config.yaml b/litellm/proxy/example_config_yaml/otel_test_config.yaml index c05e2b1b5d..9c7937efba 100644 --- a/litellm/proxy/example_config_yaml/otel_test_config.yaml +++ b/litellm/proxy/example_config_yaml/otel_test_config.yaml @@ -55,7 +55,7 @@ guardrails: litellm_params: guardrail: bedrock # supported values: "bedrock", "lakera" mode: "during_call" - guardrailIdentifier: ff6ujrregl1q + guardrailIdentifier: 4w3d1di3snt5 guardrailVersion: "DRAFT" - guardrail_name: "custom-pre-guard" litellm_params: diff --git a/tests/agent_tests/local_only_agent_tests/test_a2a_completion_bridge.py b/tests/agent_tests/local_only_agent_tests/test_a2a_completion_bridge.py index a9268da4c3..95d76ba580 100644 --- a/tests/agent_tests/local_only_agent_tests/test_a2a_completion_bridge.py +++ b/tests/agent_tests/local_only_agent_tests/test_a2a_completion_bridge.py @@ -168,7 +168,7 @@ async def test_a2a_completion_bridge_bedrock_agentcore(): litellm._turn_on_debug() # Bedrock AgentCore ARN (streaming-capable runtime) - agentcore_arn = "arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC" + agentcore_arn = "arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp" send_message_payload = { "message": { diff --git a/tests/batches_tests/test_bedrock_files_and_batches.py b/tests/batches_tests/test_bedrock_files_and_batches.py index 5148ea4db9..97c0802ec9 100644 --- a/tests/batches_tests/test_bedrock_files_and_batches.py +++ b/tests/batches_tests/test_bedrock_files_and_batches.py @@ -38,7 +38,7 @@ async def test_async_create_file(): file=open(file_path, "rb"), purpose="batch", custom_llm_provider="bedrock", - s3_bucket_name="litellm-proxy", + s3_bucket_name="litellm-proxy-941277531214", ) @@ -55,7 +55,7 @@ async def test_async_file_and_batch(): file=open(file_path, "rb"), purpose="batch", custom_llm_provider="bedrock", - s3_bucket_name="litellm-proxy", + s3_bucket_name="litellm-proxy-941277531214", ) print("CREATED FILE RESPONSE=", file_obj) @@ -70,7 +70,7 @@ async def test_async_file_and_batch(): # bedrock specific params ######################################################### model="us.anthropic.claude-haiku-4-5-20251001-v1:0", - aws_batch_role_arn="arn:aws:iam::888602223428:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV", + aws_batch_role_arn="arn:aws:iam::941277531214:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV", ) print("CREATED BATCH RESPONSE=", create_batch_response) @@ -129,7 +129,7 @@ async def test_mock_bedrock_file_url_mapping(): ), purpose="batch", custom_llm_provider="bedrock", - s3_bucket_name="litellm-proxy", + s3_bucket_name="litellm-proxy-941277531214", ) print(f"PUT URL: {captured_put_url}") diff --git a/tests/guardrails_tests/test_bedrock_guardrails.py b/tests/guardrails_tests/test_bedrock_guardrails.py index 6e78a8c428..ea50fe08ae 100644 --- a/tests/guardrails_tests/test_bedrock_guardrails.py +++ b/tests/guardrails_tests/test_bedrock_guardrails.py @@ -20,7 +20,7 @@ async def test_bedrock_guardrails_pii_masking(): mock_user_api_key_dict = UserAPIKeyAuth() guardrail = BedrockGuardrail( - guardrailIdentifier="wf0hkdb5x07f", + guardrailIdentifier="zgkmukebruil", guardrailVersion="DRAFT", ) @@ -60,7 +60,7 @@ async def test_bedrock_guardrails_pii_masking_content_list(): mock_user_api_key_dict = UserAPIKeyAuth() guardrail = BedrockGuardrail( - guardrailIdentifier="wf0hkdb5x07f", + guardrailIdentifier="zgkmukebruil", guardrailVersion="DRAFT", ) @@ -115,7 +115,7 @@ async def test_bedrock_guardrails_block_messages_api(): mock_user_api_key_dict = UserAPIKeyAuth() guardrail = BedrockGuardrail( - guardrailIdentifier="ff6ujrregl1q", + guardrailIdentifier="4w3d1di3snt5", guardrailVersion="DRAFT", ) @@ -166,7 +166,7 @@ async def test_bedrock_guardrails_block_responses_api(): mock_user_api_key_dict = UserAPIKeyAuth() guardrail = BedrockGuardrail( - guardrailIdentifier="ff6ujrregl1q", + guardrailIdentifier="4w3d1di3snt5", guardrailVersion="DRAFT", ) @@ -211,7 +211,7 @@ async def test_bedrock_guardrails_with_streaming(): ) guardrail = BedrockGuardrail( - guardrailIdentifier="ff6ujrregl1q", + guardrailIdentifier="4w3d1di3snt5", guardrailVersion="DRAFT", supported_event_hooks=[GuardrailEventHooks.post_call], guardrail_name="bedrock-post-guard", @@ -255,7 +255,7 @@ async def test_bedrock_guardrails_with_streaming_no_violation(): ) guardrail = BedrockGuardrail( - guardrailIdentifier="ff6ujrregl1q", + guardrailIdentifier="4w3d1di3snt5", guardrailVersion="DRAFT", supported_event_hooks=[GuardrailEventHooks.post_call], guardrail_name="bedrock-post-guard", @@ -299,7 +299,7 @@ async def test_bedrock_guardrails_streaming_request_body_mock(): # Create the guardrail guardrail = BedrockGuardrail( - guardrailIdentifier="wf0hkdb5x07f", + guardrailIdentifier="zgkmukebruil", guardrailVersion="DRAFT", supported_event_hooks=[GuardrailEventHooks.post_call], guardrail_name="bedrock-post-guard", @@ -382,7 +382,7 @@ async def test_bedrock_guardrail_aws_param_persistence(): from litellm.types.guardrails import GuardrailEventHooks guardrail = BedrockGuardrail( - guardrailIdentifier="wf0hkdb5x07f", + guardrailIdentifier="zgkmukebruil", guardrailVersion="DRAFT", aws_access_key_id="test-access-key", aws_secret_access_key="test-secret-key", diff --git a/tests/image_gen_tests/test_bedrock_image_gen_unit_tests.py b/tests/image_gen_tests/test_bedrock_image_gen_unit_tests.py index 36ae9e1df6..181691b730 100644 --- a/tests/image_gen_tests/test_bedrock_image_gen_unit_tests.py +++ b/tests/image_gen_tests/test_bedrock_image_gen_unit_tests.py @@ -1,3 +1,4 @@ +import json import logging import os import sys @@ -44,6 +45,9 @@ from litellm.llms.bedrock.image_generation.image_handler import ( ) from litellm.llms.bedrock.common_utils import BedrockError +# Base64 placeholder used for mocked Bedrock image responses (a 1x1 PNG). +_MOCK_BEDROCK_IMAGE_B64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==" + @pytest.mark.parametrize( "model,expected", @@ -528,17 +532,34 @@ def test_backward_compatibility_regular_nova_model(): def test_amazon_titan_image_gen(): - """Test Amazon Titan image generation with cost tracking.""" - from litellm import image_generation + """Test Amazon Titan image generation with cost tracking. + + The Bedrock CI account is not entitled to amazon.titan-image-generator, so + the network call is mocked and only the transform + cost-tracking path is + exercised. + """ + from litellm.llms.custom_httpx.http_handler import HTTPHandler # Use v2 as v1 has reached end of life model_id = "bedrock/amazon.titan-image-generator-v2:0" - response = litellm.image_generation( - model=model_id, - prompt="A serene mountain landscape at sunset with a lake reflection", - aws_region_name="us-east-1", - ) + mock_payload = {"images": [_MOCK_BEDROCK_IMAGE_B64]} + mock_response = MagicMock() + mock_response.status_code = 200 + mock_response.json.return_value = mock_payload + mock_response.text = json.dumps(mock_payload) + mock_response.headers = {} + + client = HTTPHandler() + with patch.object(client, "post", return_value=mock_response): + response = litellm.image_generation( + model=model_id, + prompt="A serene mountain landscape at sunset with a lake reflection", + aws_region_name="us-east-1", + aws_access_key_id="fake-access-key-id", + aws_secret_access_key="fake-secret-access-key", + client=client, + ) print(f"response cost: {response._hidden_params['response_cost']}") diff --git a/tests/image_gen_tests/test_image_generation.py b/tests/image_gen_tests/test_image_generation.py index 873777189c..23a94ef389 100644 --- a/tests/image_gen_tests/test_image_generation.py +++ b/tests/image_gen_tests/test_image_generation.py @@ -7,7 +7,6 @@ import sys import traceback from unittest.mock import AsyncMock, MagicMock, patch - sys.path.insert( 0, os.path.abspath("../..") ) # Adds the parent directory to the system path @@ -136,6 +135,51 @@ class TestVertexAIGeminiImageGeneration(BaseImageGenTest): } +# Base64 placeholder used for mocked Bedrock image responses (a 1x1 PNG). +_MOCK_BEDROCK_IMAGE_B64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==" + + +async def _assert_mocked_bedrock_image_generation(call_args: dict) -> None: + """Run ``aimage_generation`` with the Bedrock HTTP call mocked. + + The CI account is not entitled to Nova Canvas, so the network call is + replaced with a canned Bedrock response. This keeps the request transform, + response transform, and cost-tracking path under test without live access. + """ + mock_payload = {"images": [_MOCK_BEDROCK_IMAGE_B64]} + mock_response = MagicMock() + mock_response.status_code = 200 + mock_response.json.return_value = mock_payload + mock_response.text = json.dumps(mock_payload) + mock_response.headers = {} + + custom_logger = TestCustomLogger() + litellm.logging_callback_manager._reset_all_callbacks() + litellm.callbacks = [custom_logger] + + with patch( + "litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler.post", + new_callable=AsyncMock, + return_value=mock_response, + ): + response = await litellm.aimage_generation( + **call_args, + prompt="A image of a otter", + aws_access_key_id="fake-access-key-id", + aws_secret_access_key="fake-secret-access-key", + ) + + await asyncio.sleep(1) + + assert custom_logger.standard_logging_payload is not None + assert custom_logger.standard_logging_payload["response_cost"] is not None + assert custom_logger.standard_logging_payload["response_cost"] > 0 + assert response.data is not None + for d in response.data: + assert isinstance(d, Image) + assert d.b64_json is not None or d.url is not None + + class TestBedrockNovaCanvasTextToImage(BaseImageGenTest): def get_base_image_generation_call_args(self) -> dict: litellm.in_memory_llm_clients_cache = InMemoryCache() @@ -148,6 +192,12 @@ class TestBedrockNovaCanvasTextToImage(BaseImageGenTest): "aws_region_name": "us-east-1", } + @pytest.mark.asyncio(scope="module") + async def test_basic_image_generation(self): + await _assert_mocked_bedrock_image_generation( + self.get_base_image_generation_call_args() + ) + class TestBedrockNovaCanvasColorGuidedGeneration(BaseImageGenTest): def get_base_image_generation_call_args(self) -> dict: @@ -162,6 +212,12 @@ class TestBedrockNovaCanvasColorGuidedGeneration(BaseImageGenTest): "aws_region_name": "us-east-1", } + @pytest.mark.asyncio(scope="module") + async def test_basic_image_generation(self): + await _assert_mocked_bedrock_image_generation( + self.get_base_image_generation_call_args() + ) + class TestOpenAIGPTImage1(BaseImageGenTest): def get_base_image_generation_call_args(self) -> dict: diff --git a/tests/litellm_utils_tests/test_litellm_overhead.py b/tests/litellm_utils_tests/test_litellm_overhead.py index 3a428e9d58..60ee849f8e 100644 --- a/tests/litellm_utils_tests/test_litellm_overhead.py +++ b/tests/litellm_utils_tests/test_litellm_overhead.py @@ -82,7 +82,7 @@ async def _vertex_ai_mocks(): "bedrock/mistral.mistral-7b-instruct-v0:2", "openai/gpt-4o", "openai/self_hosted", - "bedrock/anthropic.claude-3-5-haiku-20241022-v1:0", + "bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0", "vertex_ai/gemini-1.5-flash", ], ) @@ -147,7 +147,7 @@ async def test_litellm_overhead_non_streaming(model): [ "bedrock/mistral.mistral-7b-instruct-v0:2", "openai/gpt-4o", - "bedrock/anthropic.claude-3-5-haiku-20241022-v1:0", + "bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0", "openai/self_hosted", ], ) diff --git a/tests/llm_translation/reasoning_effort_grid/grid_spec.py b/tests/llm_translation/reasoning_effort_grid/grid_spec.py index ed5346dad7..993643e0fc 100644 --- a/tests/llm_translation/reasoning_effort_grid/grid_spec.py +++ b/tests/llm_translation/reasoning_effort_grid/grid_spec.py @@ -1,7 +1,6 @@ from dataclasses import dataclass, field from typing import Dict, FrozenSet, List, Optional, Tuple - OMIT = object() @@ -22,6 +21,7 @@ class ModelEntry: extra_params: Tuple[Tuple[str, str], ...] = field(default_factory=tuple) required_env: FrozenSet[str] = field(default_factory=frozenset) caps: FrozenSet[str] = field(default_factory=frozenset) + fail_reason: Optional[str] = None def params(self) -> Dict[str, str]: return dict(self.extra_params) @@ -205,6 +205,12 @@ BEDROCK_CONVERSE_MODELS: Tuple[ModelEntry, ...] = ( extra_params=(("aws_region_name", "us-east-1"),), required_env=_BEDROCK_REQ, caps=_CAPS_OPUS_4_7, + fail_reason=( + "claude-opus-4-7 is not entitled on the Bedrock CI account " + "941277531214 (model access requires an AWS Sales request, not " + "self-serve); this cell fails on purpose so it stays loud in CI — " + "remove this fail_reason once access is granted" + ), ), ModelEntry( alias="bedrock-claude-opus-4-6", diff --git a/tests/llm_translation/reasoning_effort_grid/test_reasoning_effort_grid.py b/tests/llm_translation/reasoning_effort_grid/test_reasoning_effort_grid.py index 28e2e402d6..e0b6290ad7 100644 --- a/tests/llm_translation/reasoning_effort_grid/test_reasoning_effort_grid.py +++ b/tests/llm_translation/reasoning_effort_grid/test_reasoning_effort_grid.py @@ -15,7 +15,6 @@ from .grid_spec import ( all_cells, ) - _PROMPT_MESSAGES: List[Dict[str, str]] = [ {"role": "user", "content": "Step by step, calculate 47 * 53. Show your work."} ] @@ -168,6 +167,9 @@ async def test_reasoning_effort_grid( if skip_reason: pytest.skip(skip_reason) + if model.fail_reason: + pytest.xfail(model.fail_reason) + if route_name == "bedrock_invoke_messages": status, exc = await _call_messages(model, effort) else: diff --git a/tests/llm_translation/test_bedrock_agentcore.py b/tests/llm_translation/test_bedrock_agentcore.py index 40774cf3d6..95a814e97e 100644 --- a/tests/llm_translation/test_bedrock_agentcore.py +++ b/tests/llm_translation/test_bedrock_agentcore.py @@ -19,8 +19,8 @@ import httpx @pytest.mark.parametrize( "model", [ - "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_13sf6-cALnp38iZD", # non-streaming invocation - "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", # streaming invocation + "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy", # non-streaming invocation + "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", # streaming invocation ], ) def test_bedrock_agentcore_basic(model): @@ -44,7 +44,7 @@ def test_bedrock_agentcore_basic(model): @pytest.mark.parametrize( "model", [ - "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_13sf6-cALnp38iZD", # streaming invocation + "bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_13sf6-4046UzHSwy", # streaming invocation ], ) async def test_bedrock_agentcore_with_streaming(model): @@ -54,7 +54,7 @@ async def test_bedrock_agentcore_with_streaming(model): print("running streming test for model=", model) # litellm._turn_on_debug() response = await litellm.acompletion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -82,7 +82,7 @@ def test_bedrock_agentcore_with_custom_params(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -105,7 +105,7 @@ def test_bedrock_agentcore_with_custom_params(): url = call_kwargs["url"] print(f"URL: {url}") assert ( - "/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-west-2%3A888602223428%3Aruntime%2Fhosted_agent_r9jvp-3ySZuRHjLC/invocations" + "/runtimes/arn%3Aaws%3Abedrock-agentcore%3Aus-west-2%3A941277531214%3Aruntime%2Fhosted_agent_r9jvp-Rq79QFC2fp/invocations" in url ) assert "qualifier=DEFAULT" in url @@ -150,7 +150,7 @@ def test_bedrock_agentcore_with_runtime_user_id(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -189,7 +189,7 @@ def test_bedrock_agentcore_with_session_and_user(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -234,7 +234,7 @@ def test_bedrock_agentcore_with_api_key_bearer_token(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -282,7 +282,7 @@ def test_bedrock_agentcore_with_all_parameters(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -350,7 +350,7 @@ def test_bedrock_agentcore_without_api_key_uses_sigv4(): with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", @@ -625,7 +625,7 @@ def test_agentcore_synchronous_non_streaming_response(): with patch.object(client, "post", return_value=mock_response) as mock_post: # Make a synchronous (non-streaming) completion call response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/hosted_agent_r9jvp-3ySZuRHjLC", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/hosted_agent_r9jvp-Rq79QFC2fp", messages=[ { "role": "user", diff --git a/tests/llm_translation/test_bedrock_completion.py b/tests/llm_translation/test_bedrock_completion.py index 15f950224d..69c87d1d23 100644 --- a/tests/llm_translation/test_bedrock_completion.py +++ b/tests/llm_translation/test_bedrock_completion.py @@ -115,7 +115,7 @@ def test_completion_bedrock_guardrails(streaming): ], max_tokens=10, guardrailConfig={ - "guardrailIdentifier": "ff6ujrregl1q", + "guardrailIdentifier": "4w3d1di3snt5", "guardrailVersion": "DRAFT", "trace": "enabled", }, @@ -144,7 +144,7 @@ def test_completion_bedrock_guardrails(streaming): stream=True, max_tokens=10, guardrailConfig={ - "guardrailIdentifier": "ff6ujrregl1q", + "guardrailIdentifier": "4w3d1di3snt5", "guardrailVersion": "DRAFT", "trace": "enabled", }, @@ -475,7 +475,7 @@ def test_bedrock_claude_3(image_url): ], } response: ModelResponse = completion( - model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", num_retries=3, **data, ) # type: ignore @@ -498,7 +498,7 @@ def test_bedrock_claude_3(image_url): @pytest.mark.parametrize( "model", [ - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", # "meta.llama3-70b-instruct-v1:0", # "anthropic.claude-v2", # "mistral.mixtral-8x7b-instruct-v0:1", @@ -537,7 +537,7 @@ def test_bedrock_stop_value(stop, model): @pytest.mark.parametrize( "model", [ - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", "mistral.mixtral-8x7b-instruct-v0:1", ], ) @@ -602,7 +602,7 @@ def test_bedrock_claude_3_tool_calling(): } ] response: ModelResponse = completion( - model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=messages, tools=tools, tool_choice="auto", @@ -630,7 +630,7 @@ def test_bedrock_claude_3_tool_calling(): ) # In the second response, Claude should deduce answer from tool results second_response = completion( - model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=messages, tools=tools, tool_choice="auto", @@ -737,7 +737,7 @@ def test_bedrock_ptu(): from openai.types.chat import ChatCompletion model_id = ( - "arn:aws:bedrock:us-west-2:888602223428:provisioned-model/8fxff74qyhs3" + "arn:aws:bedrock:us-west-2:941277531214:provisioned-model/8fxff74qyhs3" ) try: response = litellm.completion( @@ -752,7 +752,7 @@ def test_bedrock_ptu(): assert "url" in mock_client_post.call_args.kwargs assert ( mock_client_post.call_args.kwargs["url"] - == "https://bedrock-runtime.us-west-2.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-west-2%3A888602223428%3Aprovisioned-model%2F8fxff74qyhs3/converse" + == "https://bedrock-runtime.us-west-2.amazonaws.com/model/arn%3Aaws%3Abedrock%3Aus-west-2%3A941277531214%3Aprovisioned-model%2F8fxff74qyhs3/converse" ) mock_client_post.assert_called_once() @@ -2327,7 +2327,7 @@ def test_bedrock_cross_region_inference(monkeypatch): def test_bedrock_empty_content_real_call(): completion( - model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=[ { "role": "user", diff --git a/tests/local_testing/test_completion.py b/tests/local_testing/test_completion.py index cce6d33e79..c7abdb5f49 100644 --- a/tests/local_testing/test_completion.py +++ b/tests/local_testing/test_completion.py @@ -299,7 +299,10 @@ def test_completion_claude_3(): @pytest.mark.parametrize( "model", - ["anthropic/claude-sonnet-4-5-20250929", "anthropic.claude-3-sonnet-20240229-v1:0"], + [ + "anthropic/claude-sonnet-4-5-20250929", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", + ], ) def test_completion_claude_3_function_call(model): litellm.set_verbose = True @@ -385,7 +388,7 @@ def test_completion_claude_3_function_call(model): [ ("gpt-3.5-turbo", None, None), ("claude-sonnet-4-5-20250929", None, None), - ("anthropic.claude-3-sonnet-20240229-v1:0", None, None), + ("us.anthropic.claude-sonnet-4-5-20250929-v1:0", None, None), # ( # "azure_ai/command-r-plus", # os.getenv("AZURE_COHERE_API_KEY"), @@ -1578,7 +1581,7 @@ def test_completion_openai(): [ # ("gpt-4o-2024-08-06", None), # ("azure/gpt-4.1-mini", None), - ("bedrock/anthropic.claude-3-sonnet-20240229-v1:0", None), + ("bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", None), # ("azure/gpt-4o-new-test", "2024-08-01-preview"), ], ) @@ -1666,15 +1669,13 @@ def custom_callback( ################################################# - print( - f""" + print(f""" Model: {model}, Messages: {messages}, User: {user}, Seed: {kwargs["seed"]}, temperature: {kwargs["temperature"]}, - """ - ) + """) assert kwargs["user"] == "ishaans app" assert kwargs["model"] == "gpt-3.5-turbo-1106" @@ -2699,7 +2700,7 @@ def test_bedrock_deepseek_custom_prompt_dict(): def test_bedrock_deepseek_known_tokenizer_config(monkeypatch): model = ( - "deepseek_r1/arn:aws:bedrock:us-west-2:888602223428:imported-model/bnnr6463ejgf" + "deepseek_r1/arn:aws:bedrock:us-west-2:941277531214:imported-model/bnnr6463ejgf" ) from litellm.llms.custom_httpx.http_handler import HTTPHandler from unittest.mock import Mock @@ -2914,8 +2915,8 @@ def response_format_tests(response: litellm.ModelResponse): "model", [ "bedrock/mistral.mistral-large-2407-v1:0", - "bedrock/cohere.command-r-plus-v1:0", - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-haiku-4-5-20251001-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", "mistral.mistral-7b-instruct-v0:2", "meta.llama3-8b-instruct-v1:0", ], diff --git a/tests/local_testing/test_function_call_parsing.py b/tests/local_testing/test_function_call_parsing.py index f9582fcc57..2453571f1c 100644 --- a/tests/local_testing/test_function_call_parsing.py +++ b/tests/local_testing/test_function_call_parsing.py @@ -142,7 +142,8 @@ def trade(model_name: str) -> List[Trade]: # type: ignore @pytest.mark.parametrize( - "model", ["claude-haiku-4-5-20251001", "anthropic.claude-3-haiku-20240307-v1:0"] + "model", + ["claude-haiku-4-5-20251001", "us.anthropic.claude-haiku-4-5-20251001-v1:0"], ) @pytest.mark.flaky(retries=6, delay=10) def test_function_call_parsing(model): diff --git a/tests/local_testing/test_function_calling.py b/tests/local_testing/test_function_calling.py index 3c7e004b62..1cad7d1421 100644 --- a/tests/local_testing/test_function_calling.py +++ b/tests/local_testing/test_function_calling.py @@ -49,7 +49,7 @@ def get_current_weather(location, unit="fahrenheit"): "mistral/mistral-large-latest", "claude-haiku-4-5-20251001", "gemini/gemini-2.5-flash-lite", - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", ], ) @pytest.mark.flaky(retries=3, delay=1) @@ -267,7 +267,6 @@ def test_aaparallel_function_call_with_anthropic_thinking(model): from litellm.types.utils import ChatCompletionMessageToolCall, Function, Message - _PARALLEL_TOOL_HISTORY_MESSAGES = [ { "role": "user", @@ -303,7 +302,7 @@ _PARALLEL_TOOL_HISTORY_MESSAGES = [ [ # Bedrock Converse still requires modify_params to inject the dummy tool. ( - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", _PARALLEL_TOOL_HISTORY_MESSAGES, True, ), @@ -314,7 +313,7 @@ _PARALLEL_TOOL_HISTORY_MESSAGES = [ False, ), ( - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", [ { "role": "user", @@ -579,7 +578,7 @@ def test_groq_parallel_function_call(): @pytest.mark.parametrize( "model", [ - "bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + "bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", ], ) def test_passing_tool_result_as_list(model): diff --git a/tests/local_testing/test_sagemaker.py b/tests/local_testing/test_sagemaker.py index d4c5a5a857..fdc8347c36 100644 --- a/tests/local_testing/test_sagemaker.py +++ b/tests/local_testing/test_sagemaker.py @@ -57,7 +57,7 @@ async def test_completion_sagemaker(sync_mode): print("testing sagemaker") if sync_mode is True: response = litellm.completion( - model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + model="sagemaker/litellm-ci-textgen", messages=[ {"role": "user", "content": "hi"}, ], @@ -67,7 +67,7 @@ async def test_completion_sagemaker(sync_mode): ) else: response = await litellm.acompletion( - model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + model="sagemaker/litellm-ci-textgen", messages=[ {"role": "user", "content": "hi"}, ], @@ -158,7 +158,7 @@ async def test_completion_sagemaker_messages_api(sync_mode): "model", [ # "sagemaker_chat/huggingface-pytorch-tgi-inference-2024-08-23-15-48-59-245", - "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "sagemaker/litellm-ci-textgen", ], ) # @pytest.mark.flaky(retries=3, delay=1) @@ -218,7 +218,7 @@ async def test_completion_sagemaker_stream(sync_mode, model): "model", [ # "sagemaker_chat/huggingface-pytorch-tgi-inference-2024-08-23-15-48-59-245", - "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "sagemaker/litellm-ci-textgen", ], ) async def test_completion_sagemaker_streaming_bad_request(sync_mode, model): @@ -256,7 +256,7 @@ async def test_acompletion_sagemaker_non_stream(): "id": "cmpl-mockid", "object": "text_completion", "created": 1629800000, - "model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "model": "sagemaker/litellm-ci-textgen", "choices": [ { "text": "This is a mock response from SageMaker.", @@ -282,7 +282,7 @@ async def test_acompletion_sagemaker_non_stream(): ) as mock_post: # Act: Call the litellm.acompletion function response = await litellm.acompletion( - model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + model="sagemaker/litellm-ci-textgen", messages=[ {"role": "user", "content": "hi"}, ], @@ -302,7 +302,7 @@ async def test_acompletion_sagemaker_non_stream(): assert args_to_sagemaker == expected_payload assert ( kwargs["url"] - == "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations" + == "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/litellm-ci-textgen/invocations" ) @@ -316,7 +316,7 @@ async def test_completion_sagemaker_non_stream(): "id": "cmpl-mockid", "object": "text_completion", "created": 1629800000, - "model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "model": "sagemaker/litellm-ci-textgen", "choices": [ { "text": "This is a mock response from SageMaker.", @@ -342,7 +342,7 @@ async def test_completion_sagemaker_non_stream(): ) as mock_post: # Act: Call the litellm.acompletion function response = litellm.completion( - model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + model="sagemaker/litellm-ci-textgen", messages=[ {"role": "user", "content": "hi"}, ], @@ -362,7 +362,7 @@ async def test_completion_sagemaker_non_stream(): assert args_to_sagemaker == expected_payload assert ( kwargs["url"] - == "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations" + == "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/litellm-ci-textgen/invocations" ) @@ -377,7 +377,7 @@ async def test_completion_sagemaker_prompt_template_non_stream(): "id": "cmpl-mockid", "object": "text_completion", "created": 1629800000, - "model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "model": "sagemaker/litellm-ci-textgen", "choices": [ { "text": "This is a mock response from SageMaker.", @@ -433,7 +433,7 @@ async def test_completion_sagemaker_non_stream_with_aws_params(): "id": "cmpl-mockid", "object": "text_completion", "created": 1629800000, - "model": "sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + "model": "sagemaker/litellm-ci-textgen", "choices": [ { "text": "This is a mock response from SageMaker.", @@ -459,7 +459,7 @@ async def test_completion_sagemaker_non_stream_with_aws_params(): ) as mock_post: # Act: Call the litellm.acompletion function response = litellm.completion( - model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", + model="sagemaker/litellm-ci-textgen", messages=[ {"role": "user", "content": "hi"}, ], @@ -482,5 +482,5 @@ async def test_completion_sagemaker_non_stream_with_aws_params(): assert args_to_sagemaker == expected_payload assert ( kwargs["url"] - == "https://runtime.sagemaker.us-west-5.amazonaws.com/endpoints/jumpstart-dft-hf-textgeneration1-mp-20240815-185614/invocations" + == "https://runtime.sagemaker.us-west-5.amazonaws.com/endpoints/litellm-ci-textgen/invocations" ) diff --git a/tests/local_testing/test_streaming.py b/tests/local_testing/test_streaming.py index 10f351714e..eb153404a4 100644 --- a/tests/local_testing/test_streaming.py +++ b/tests/local_testing/test_streaming.py @@ -1174,7 +1174,7 @@ async def test_completion_replicate_llama3_streaming(sync_mode): [ # ["bedrock/ai21.jamba-instruct-v1:0", "us-east-1"], # ["bedrock/cohere.command-r-plus-v1:0", None], - ["anthropic.claude-3-sonnet-20240229-v1:0", None], + ["us.anthropic.claude-sonnet-4-5-20250929-v1:0", None], # ["mistral.mistral-7b-instruct-v0:2", None], # ["meta.llama3-8b-instruct-v1:0", None], ], @@ -1246,7 +1246,7 @@ def test_bedrock_claude_3_streaming(): try: litellm.set_verbose = True response: ModelResponse = completion( # type: ignore - model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=messages, max_tokens=10, # type: ignore stream=True, @@ -1276,7 +1276,7 @@ def test_bedrock_claude_3_streaming(): "model", [ "claude-haiku-4-5-20251001", - "cohere.command-r-plus-v1:0", # bedrock + "us.anthropic.claude-haiku-4-5-20251001-v1:0", # bedrock "gpt-3.5-turbo", ], ) @@ -3500,7 +3500,7 @@ def test_unit_test_perplexity_citations_chunk(): [ "gpt-3.5-turbo", "claude-sonnet-4-5-20250929", - "anthropic.claude-3-sonnet-20240229-v1:0", + "us.anthropic.claude-sonnet-4-5-20250929-v1:0", # "vertex_ai/claude-3-5-sonnet@20240620", ], ) diff --git a/tests/logging_callback_tests/test_amazing_s3_logs.py b/tests/logging_callback_tests/test_amazing_s3_logs.py index dab2a0cc0b..e6291a9404 100644 --- a/tests/logging_callback_tests/test_amazing_s3_logs.py +++ b/tests/logging_callback_tests/test_amazing_s3_logs.py @@ -27,7 +27,7 @@ async def test_basic_s3_logging(sync_mode, streaming): verbose_logger.setLevel(level=logging.DEBUG) litellm.success_callback = ["s3"] litellm.s3_callback_params = { - "s3_bucket_name": "load-testing-oct", + "s3_bucket_name": "load-testing-oct-941277531214", "s3_aws_secret_access_key": "os.environ/AWS_SECRET_ACCESS_KEY", "s3_aws_access_key_id": "os.environ/AWS_ACCESS_KEY_ID", "s3_region_name": "us-west-2", @@ -64,14 +64,14 @@ async def test_basic_s3_logging(sync_mode, streaming): await asyncio.sleep(2) print(f"response: {response}") - total_objects, all_s3_keys = list_all_s3_objects("load-testing-oct") + total_objects, all_s3_keys = list_all_s3_objects("load-testing-oct-941277531214") # assert that atlest one key has response.id in it assert any(response_id in key for key in all_s3_keys) s3 = boto3.client("s3") # delete all objects for key in all_s3_keys: - s3.delete_object(Bucket="load-testing-oct", Key=key) + s3.delete_object(Bucket="load-testing-oct-941277531214", Key=key) @pytest.mark.asyncio @@ -82,7 +82,7 @@ async def test_basic_s3_v2_logging(streaming): from litellm.integrations.s3_v2 import S3Logger litellm.s3_callback_params = { - "s3_bucket_name": "load-testing-oct", + "s3_bucket_name": "load-testing-oct-941277531214", "s3_aws_secret_access_key": "test-secret", "s3_aws_access_key_id": "test-key", "s3_region_name": "us-west-2", diff --git a/tests/logging_callback_tests/test_bedrock_knowledgebase_hook.py b/tests/logging_callback_tests/test_bedrock_knowledgebase_hook.py index d6d0652ed7..0d4405094b 100644 --- a/tests/logging_callback_tests/test_bedrock_knowledgebase_hook.py +++ b/tests/logging_callback_tests/test_bedrock_knowledgebase_hook.py @@ -2,7 +2,6 @@ import io import os import sys - sys.path.insert(0, os.path.abspath("../..")) import asyncio @@ -67,7 +66,7 @@ def setup_vector_store_registry(): litellm.vector_store_registry = VectorStoreRegistry( vector_stores=[ LiteLLM_ManagedVectorStore( - vector_store_id="T37J8R4WTM", custom_llm_provider="bedrock" + vector_store_id="LCYXFBR2TU", custom_llm_provider="bedrock" ) ] ) @@ -111,7 +110,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_completion( response = await litellm.acompletion( model="anthropic/claude-3.5-sonnet", messages=[{"role": "user", "content": "what is litellm?"}], - vector_store_ids=["T37J8R4WTM"], + vector_store_ids=["LCYXFBR2TU"], client=client, ) except Exception as e: @@ -152,7 +151,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call( response = await litellm.acompletion( model="bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0", messages=[{"role": "user", "content": "what is litellm?"}], - vector_store_ids=["T37J8R4WTM"], + vector_store_ids=["LCYXFBR2TU"], client=async_client, ) print("OPENAI RESPONSE:", json.dumps(dict(response), indent=4, default=str)) @@ -196,7 +195,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_streaming( response = await litellm.acompletion( model=f"anthropic/{os.environ.get('CI_CD_DEFAULT_ANTHROPIC_MODEL', 'claude-haiku-4-5-20251001')}", messages=[{"role": "user", "content": "what is litellm?"}], - vector_store_ids=["T37J8R4WTM"], + vector_store_ids=["LCYXFBR2TU"], stream=True, client=async_client, ) @@ -255,7 +254,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_with_tools( model=f"anthropic/{os.environ.get('CI_CD_DEFAULT_ANTHROPIC_MODEL', 'claude-haiku-4-5-20251001')}", messages=[{"role": "user", "content": "what is litellm?"}], max_tokens=10, - tools=[{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}], + tools=[{"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]}], ) assert response is not None @@ -279,7 +278,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_with_tools_ tools=[ { "type": "file_search", - "vector_store_ids": ["T37J8R4WTM"], + "vector_store_ids": ["LCYXFBR2TU"], "filters": { "key": "user_id", "value": "fake-user-id", @@ -387,7 +386,7 @@ async def test_bedrock_kb_request_body_has_transformed_filters( tools=[ { "type": "file_search", - "vector_store_ids": ["T37J8R4WTM"], + "vector_store_ids": ["LCYXFBR2TU"], "filters": { "key": "user_id", "value": "fake-user-id", @@ -461,7 +460,7 @@ async def test_openai_with_knowledge_base_mock_openai(setup_vector_store_registr await litellm.acompletion( model="gpt-5.5", messages=[{"role": "user", "content": "what is litellm?"}], - vector_store_ids=["T37J8R4WTM"], + vector_store_ids=["LCYXFBR2TU"], client=client, ) except Exception as e: @@ -537,7 +536,7 @@ async def test_openai_with_vector_store_ids_in_tool_call_mock_openai( await litellm.acompletion( model="gpt-5.5", messages=[{"role": "user", "content": "what is litellm?"}], - tools=[{"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}], + tools=[{"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]}], client=client, ) except Exception as e: @@ -611,7 +610,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist model="gpt-5.5", messages=[{"role": "user", "content": "what is litellm?"}], tools=[ - {"type": "file_search", "vector_store_ids": ["T37J8R4WTM"]}, + {"type": "file_search", "vector_store_ids": ["LCYXFBR2TU"]}, {"type": "file_search", "vector_store_ids": ["unknownVS"]}, ], client=client, @@ -645,7 +644,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist # model="gpt-5.5", # messages=[{"role": "user", "content": "what is litellm?"}], # vector_store_ids = [ -# "T37J8R4WTM" +# "LCYXFBR2TU" # ], # ) @@ -667,7 +666,7 @@ async def test_openai_with_mixed_tool_call_mock_openai(setup_vector_store_regist # # expect the vector store request metadata object to have the correct values # vector_store_request_metadata = standard_logging_vector_store_request_metadata[0] -# assert vector_store_request_metadata.get("vector_store_id") == "T37J8R4WTM" +# assert vector_store_request_metadata.get("vector_store_id") == "LCYXFBR2TU" # assert vector_store_request_metadata.get("query") == "what is litellm?" # assert vector_store_request_metadata.get("custom_llm_provider") == "bedrock" @@ -723,7 +722,7 @@ async def test_e2e_bedrock_knowledgebase_retrieval_without_vector_store_registry response = await litellm.acompletion( model="anthropic/claude-3.5-sonnet", messages=[{"role": "user", "content": "what is litellm?"}], - vector_store_ids=["T37J8R4WTM"], + vector_store_ids=["LCYXFBR2TU"], client=client, ) except Exception as e: diff --git a/tests/test_litellm/llms/bedrock/chat/agentcore/test_agentcore_transformation.py b/tests/test_litellm/llms/bedrock/chat/agentcore/test_agentcore_transformation.py index 64b43b15dc..3287061d37 100644 --- a/tests/test_litellm/llms/bedrock/chat/agentcore/test_agentcore_transformation.py +++ b/tests/test_litellm/llms/bedrock/chat/agentcore/test_agentcore_transformation.py @@ -76,7 +76,7 @@ class TestAgentCoreAcceptHeader: with patch.object(client, "post", return_value=MagicMock()) as mock_post: try: litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_runtime", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_runtime", messages=[{"role": "user", "content": "test"}], api_key="test-jwt-token", client=client, @@ -281,7 +281,7 @@ class TestAgentCoreStreamingJsonFallback: with patch.object(client, "post", return_value=mock_response): response = litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent", messages=[{"role": "user", "content": "test"}], stream=True, client=client, @@ -318,7 +318,7 @@ class TestAgentCoreStreamingJsonFallback: client, "post", new_callable=AsyncMock, return_value=mock_response ): response = await litellm.acompletion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent", messages=[{"role": "user", "content": "test"}], stream=True, client=client, @@ -353,7 +353,7 @@ class TestAgentCoreStreamingJsonFallback: Exception, match="Failed to read/parse JSON response body" ): litellm.completion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent", messages=[{"role": "user", "content": "test"}], stream=True, client=client, @@ -383,7 +383,7 @@ class TestAgentCoreStreamingJsonFallback: Exception, match="Failed to read/parse JSON response body" ): await litellm.acompletion( - model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:888602223428:runtime/test_agent", + model="bedrock/agentcore/arn:aws:bedrock-agentcore:us-west-2:941277531214:runtime/test_agent", messages=[{"role": "user", "content": "test"}], stream=True, client=client, diff --git a/tests/test_openai_endpoints.py b/tests/test_openai_endpoints.py index e898b88a55..29875a0441 100644 --- a/tests/test_openai_endpoints.py +++ b/tests/test_openai_endpoints.py @@ -446,7 +446,7 @@ async def test_chat_completion_anthropic_structured_output(): client = AsyncOpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000") res = await client.beta.chat.completions.parse( - model="bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0", + model="bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0", messages=messages, response_format=EventsList, timeout=60, diff --git a/tests/vector_store_tests/test_bedrock_vector_store.py b/tests/vector_store_tests/test_bedrock_vector_store.py index d8af1c7188..47e73e61c5 100644 --- a/tests/vector_store_tests/test_bedrock_vector_store.py +++ b/tests/vector_store_tests/test_bedrock_vector_store.py @@ -22,7 +22,7 @@ class TestBedrockVectorStore(BaseVectorStoreTest): def get_base_request_args(self): return { - "vector_store_id": "T37J8R4WTM", + "vector_store_id": "LCYXFBR2TU", "custom_llm_provider": "bedrock", "query": "what happens after we add a model", } @@ -106,7 +106,7 @@ async def test_bedrock_search_with_router(): _router = Router(model_list=[]) search_response = await _router.avector_store_search( query="what happens after we add a model", - vector_store_id="T37J8R4WTM", + vector_store_id="LCYXFBR2TU", custom_llm_provider="bedrock", ) print(search_response) @@ -150,7 +150,7 @@ async def test_bedrock_search_with_credentials_managed_registry(): # Create vector store with credential reference vector_store = LiteLLM_ManagedVectorStore( - vector_store_id="T37J8R4WTM", + vector_store_id="LCYXFBR2TU", custom_llm_provider="bedrock", created_at=datetime.now(timezone.utc), updated_at=datetime.now(timezone.utc), @@ -162,7 +162,7 @@ async def test_bedrock_search_with_credentials_managed_registry(): litellm.vector_store_registry = registry # Verify credentials can be retrieved from registry - retrieved_credentials = registry.get_credentials_for_vector_store("T37J8R4WTM") + retrieved_credentials = registry.get_credentials_for_vector_store("LCYXFBR2TU") assert retrieved_credentials, "Should retrieve credentials from registry" assert retrieved_credentials.get("aws_access_key_id") == "test_access_key" assert retrieved_credentials.get("aws_secret_access_key") == "test_secret_key" @@ -194,7 +194,7 @@ async def test_bedrock_search_with_credentials_managed_registry(): search_response = await _router.avector_store_search( query="what happens after we add a model", - vector_store_id="T37J8R4WTM", + vector_store_id="LCYXFBR2TU", custom_llm_provider="bedrock", ) @@ -203,7 +203,7 @@ async def test_bedrock_search_with_credentials_managed_registry(): call_kwargs = mock_handler.call_args[1] # Verify that the credential accessor was called with the correct vector store ID - mock_get_creds.assert_called_with("T37J8R4WTM") + mock_get_creds.assert_called_with("LCYXFBR2TU") # Verify the credentials were injected into the search call litellm_params = call_kwargs.get("litellm_params", {}) @@ -224,7 +224,7 @@ async def test_bedrock_search_with_credentials_managed_registry(): assert search_response["data"][0]["id"] == "test_result" print( - f"✅ Test passed: Credential accessor was called with vector store ID: T37J8R4WTM" + f"✅ Test passed: Credential accessor was called with vector store ID: LCYXFBR2TU" ) print(f"✅ Retrieved credentials: {retrieved_credentials}") print(f"✅ Credentials were injected into search call")