litellm/tests/logging_callback_tests/test_spend_logs.py
Mateo Wang 2c733c00f5 chore(ci): modernize model references in tests and configs (#27856)
* test: modernize models used in CircleCI e2e test suites

Replaces obsolete models (gpt-4o, gpt-4o-mini, gpt-3.5-turbo,
claude-3-5-sonnet-20240620, claude-sonnet-4-20250514) with current
equivalents across the e2e_openai_endpoints and
proxy_e2e_anthropic_messages_tests CircleCI jobs.

- gpt-4o -> gpt-5.5 (responses API e2e tests)
- gpt-4o-mini -> gpt-5-mini (websocket responses, oai_misc_config)
- gpt-4o-mini-2024-07-18 -> gpt-4.1-mini-2025-04-14 (fine-tuning,
  still actively fine-tunable)
- gpt-4 / gpt-3.5-turbo target_model_names example -> gpt-5.5 /
  gpt-5-mini
- bedrock claude-3-5-sonnet-20240620 batch entry -> haiku-4-5-20251001
  (also aligning oai_misc_config model_name with what
  test_bedrock_batches_api.py actually requests)
- bedrock claude-sonnet-4-20250514 (deprecated, retires 2026-06-15)
  -> claude-sonnet-4-5-20250929

* test: point bedrock-claude-sonnet-4 alias at Sonnet 4.6, not 4.5

Greptile/Cursor flagged that after the previous commit, the
bedrock-claude-sonnet-4 alias collided with bedrock-claude-sonnet-4.5
(both pointed to claude-sonnet-4-5-20250929). Rename to
bedrock-claude-sonnet-4.6 and point it at the Sonnet 4.6 Bedrock ID
(us.anthropic.claude-sonnet-4-6, already in the litellm model
registry) so the alias name matches the underlying model version.

* test: modernize models across remaining CI-mounted configs & tests

Expands the modernization sweep to all CircleCI-mounted proxy configs
and to test directories where the model literal is a fixture/route key
(not the test's subject).

Config changes:
- proxy_server_config.yaml: bump gpt-3.5-turbo / gpt-3.5-turbo-1106 /
  gpt-4o / gemini-1.5-flash / dall-e-3 underlying models; rename
  gpt-3.5-turbo-end-user-test alias to gpt-5-mini-end-user-test; bump
  text-embedding-ada-002 underlying to text-embedding-3-small. User-
  facing aliases (gpt-3.5-turbo, gpt-4, text-embedding-ada-002, etc.)
  preserved for backward compatibility with tests.
- simple_config.yaml, otel_test_config.yaml, spend_tracking_config.yaml:
  bump gpt-3.5-turbo underlying to gpt-5-mini.
- pass_through_config.yaml: claude-3-5-sonnet / claude-3-7-sonnet /
  claude-3-haiku entries replaced with claude-sonnet-4-5 / claude-
  haiku-4-5 / claude-opus-4-7.
- oai_misc_config.yaml: align alias name with the gpt-5-mini rename.

Test changes (proactive: claude-sonnet-4-20250514 / claude-opus-4-
20250514 retire 2026-06-15):
- tests/llm_translation/test_anthropic_completion.py: bump 3 references
  + paired Vertex AI ID to claude-sonnet-4-5.
- tests/llm_translation/test_optional_params.py: bump 2 references.
- tests/pass_through_unit_tests/test_anthropic_messages_passthrough.py
  and test_bedrock_anthropic_messages_test.py: bump router fixtures
  using the deprecated model IDs.
- tests/pass_through_unit_tests/base_anthropic_messages_tool_search_test.py:
  modernize docstring examples.
- tests/test_end_users.py: update references to renamed alias.

* test: modernize placeholder model literals in router_unit_tests

Mass replace_all on fixture/placeholder model literals across the
router_unit_tests/ suite (model name is a routing key / label, not the
test subject). Sub-agent sweep so far — additional commits will follow
for logging_callback_tests/, enterprise/, top-level tests/test_*.py,
and other CI-mounted dirs.

Mappings applied:
- gpt-3.5-turbo -> gpt-5-mini
- gpt-4 (bare) -> gpt-5.5
- gpt-4o (bare) -> gpt-5
- text-embedding-ada-002 -> text-embedding-3-small
- claude-3-sonnet-20240229 / claude-3-opus-20240229 /
  claude-3-haiku-20240307 / claude-3-5-sonnet-20240620 ->
  claude-sonnet-4-5-20250929 / claude-opus-4-7 /
  claude-haiku-4-5-20251001 as appropriate

Explicitly preserved:
- gpt-4o-mini-* variants (transcribe, tts, etc.) where they're current
- gpt-4-turbo / gpt-4-vision-preview / gpt-4-0613 (subject literals)
- JSONL batch body literals
- Mock LLM response model fields (must match upstream)
- Fake/mock identifiers

* test: modernize placeholder model literals across remaining CI suites

Sub-agent sweep across logging_callback_tests/, guardrails_tests/,
enterprise/, pass_through_unit_tests/, otel_tests/,
llm_responses_api_testing/, batches_tests/, spend_tracking_tests/,
litellm_utils_tests/, unified_google_tests/, and a few top-level
tests/test_*.py files where the model literal is a fixture or
placeholder (router model_list, mock standard logging payload, mock
callback data) rather than the test's subject.

Mappings applied (see scope notes below):
- gpt-3.5-turbo -> gpt-5-mini
- gpt-4 (bare) -> gpt-5.5
- gpt-4o (bare) -> gpt-5.5 (corrected from initial gpt-5 — bare gpt-5
  is not a valid OpenAI alias; only gpt-5.5 / gpt-5.4 / gpt-5.2-codex
  / gpt-5-mini exist)
- gpt-4o-mini (bare) -> gpt-5-mini
- text-embedding-ada-002 -> text-embedding-3-small
- claude-3-sonnet-20240229 -> claude-sonnet-4-5-20250929
- claude-3-opus-20240229 -> claude-opus-4-7
- claude-3-haiku-20240307 -> claude-haiku-4-5-20251001
- claude-3-5-sonnet-20240620/20241022 -> claude-sonnet-4-5-20250929
- claude-3-7-sonnet-20250219 -> claude-sonnet-4-6
- gemini-1.5-flash -> gemini-2.5-flash
- gemini-1.5-pro -> gemini-2.5-pro

Explicitly preserved (not modernized):
- llm_translation/ tests where model is the SUBJECT (provider-specific
  translation/transformation logic). Only the deprecated 20250514
  references were already bumped in a prior commit.
- Cost-calc / tokenizer subject tests in test_utils.py (skip-ranges
  documented by the sub-agent).
- Bedrock model IDs in test_health_check.py path-stripping tests.
- JSONL batch request bodies and mock LLM response bodies (must match
  upstream literal).
- Langfuse expected-request-body JSON fixtures (cost values are exact-
  match-asserted; changing the model would shift response_cost).
- gpt-3.5-turbo-instruct (text-completion endpoint; no modern OpenAI
  equivalent).
- Top-level tests calling the proxy through user-facing aliases
  (gpt-3.5-turbo, gpt-4, text-embedding-ada-002, dall-e-3) — aliases
  in proxy_server_config.yaml stay; only the underlying model was
  bumped.
- tests/test_gpt5_azure_temperature_support.py (the test's whole point
  is model-name handling).
- Fake / mock / openai/fake identifiers.
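
A minimal sketch of how the bare-literal mappings and the preserved variants above could be expressed in code. MODEL_MAP and modernize_model are hypothetical names used only for illustration; they are not part of litellm or of whatever tooling performed the sweep. Only the model names are taken from the lists above.

```python
# Hypothetical sketch: MODEL_MAP and modernize_model are illustrative names,
# not part of litellm or of the tooling used for the sweep.
MODEL_MAP = {
    "gpt-3.5-turbo": "gpt-5-mini",
    "gpt-4": "gpt-5.5",
    "gpt-4o": "gpt-5.5",
    "gpt-4o-mini": "gpt-5-mini",
    "text-embedding-ada-002": "text-embedding-3-small",
    "claude-3-sonnet-20240229": "claude-sonnet-4-5-20250929",
    "claude-3-opus-20240229": "claude-opus-4-7",
    "claude-3-haiku-20240307": "claude-haiku-4-5-20251001",
    "claude-3-5-sonnet-20240620": "claude-sonnet-4-5-20250929",
    "claude-3-5-sonnet-20241022": "claude-sonnet-4-5-20250929",
    "claude-3-7-sonnet-20250219": "claude-sonnet-4-6",
    "gemini-1.5-flash": "gemini-2.5-flash",
    "gemini-1.5-pro": "gemini-2.5-pro",
}


def modernize_model(name: str) -> str:
    """Map a bare model literal to its replacement; leave everything else alone.

    The lookup is an exact match, so variants such as gpt-3.5-turbo-instruct,
    gpt-4-turbo, or gpt-4o-mini-tts fall through unchanged.
    """
    return MODEL_MAP.get(name, name)
```

Exact-match lookup (rather than substring replacement) is what keeps the preserved variants intact in this sketch.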

Notable side fixes:
- test_spend_accuracy_tests.py: UPSTREAM_MODEL now matches what
  spend_tracking_config.yaml's proxy actually routes to (gpt-5-mini),
  resolving a latent inconsistency.
- proxy_server_config.yaml: bare `gpt-5` alias renamed to `gpt-5.5`
  (bare gpt-5 is not a valid OpenAI alias).
- test_batches_logging_unit_tests.py: explicit_models list entries
  kept distinct (gpt-5-mini + gpt-5.5) after bulk rename.

* test: fix CI failures from model modernization sweep

CI surfaced 5 categories of regression from the bulk modernization:

1. Azure deployment names are customer-specific. Reverted:
   - tests/litellm_utils_tests/test_health_check.py: azure/text-
     embedding-3-small -> azure/text-embedding-ada-002 (the CI Azure
     account does not have a text-embedding-3-small deployment).
   - tests/logging_callback_tests/test_custom_callback_router.py:
     same revert for two router fixtures driving aembedding.

2. gpt-5 family does not accept temperature != 1. Tests that pass a
   custom temperature swapped from gpt-5-mini to gpt-4.1-mini (modern
   non-reasoning OpenAI mini that still accepts temperature/logprobs):
   - tests/logging_callback_tests/test_datadog.py
   - tests/logging_callback_tests/test_langsmith_unit_test.py
   - tests/logging_callback_tests/test_otel_logging.py

3. proxy_server_config.yaml's gpt-3.5-turbo-large alias was routing to
   gpt-5.5 (a reasoning model that rejects logprobs). The proxy test
   tests/test_openai_endpoints.py::test_chat_completion_streaming
   exercises logprobs/top_logprobs through that alias. Bumped the
   underlying model to gpt-4.1 (non-reasoning, still modern).

4. tests/logging_callback_tests/test_gcs_pub_sub.py asserts against a
   pinned JSON fixture (gcs_pub_sub_body/spend_logs_payload.json) with
   hardcoded model="gpt-4o" and a model-specific spend value. Reverted
   the litellm.acompletion calls in the test to model="gpt-4o" so the
   fixture's exact-match assertions still hold.

5. tests/pass_through_unit_tests/test_anthropic_messages_passthrough.py:
   anthropic.messages.create routing to openai/gpt-5-mini returned an
   empty content[0] with max_tokens=100 (reasoning-token consumption).
   Swapped to openai/gpt-4.1-mini.
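
The parameter-related fixes above (items 2 and 3) follow one rule: a fixture that sets a custom temperature or requests logprobs needs a non-reasoning model. A rough sketch of that rule follows. pick_test_model and NON_REASONING_FALLBACK are hypothetical names, and the parameter checks are assumptions drawn only from the failures described above.

```python
# Hypothetical helper: the names and the parameter checks are assumptions for
# illustration, based only on the CI failures described in this commit.
from typing import Any, Dict

# Per the notes above, the gpt-5 family rejects temperature != 1 and logprobs,
# while gpt-4.1 / gpt-4.1-mini still accept them.
NON_REASONING_FALLBACK = {"gpt-5-mini": "gpt-4.1-mini", "gpt-5.5": "gpt-4.1"}


def pick_test_model(model: str, params: Dict[str, Any]) -> str:
    """Return a non-reasoning model when the request uses unsupported params."""
    needs_fallback = bool(
        params.get("temperature", 1) != 1
        or params.get("logprobs")
        or params.get("top_logprobs") is not None
    )
    if needs_fallback:
        # Models without a known fallback are returned unchanged.
        return NON_REASONING_FALLBACK.get(model, model)
    return model
```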

* test: fix Assistants API model + 2 cursor[bot] review nits

1. pass_through_unit_tests/test_custom_logger_passthrough.py: gpt-5.5
   isn't accepted by the /v1/assistants endpoint
   ("unsupported_model"). Switch to gpt-4.1-mini (modern, Assistants-
   API-supported, non-reasoning).

2. example_config_yaml/pass_through_config.yaml: the previous sweep
   bumped the claude-3-7-sonnet alias to claude-opus-4-7, which is a
   tier change (Sonnet -> Opus). Map to claude-sonnet-4-6 to keep the
   Sonnet tier intact. (Cursor bugbot review.)

3. example_config_yaml/simple_config.yaml: model_name was left as
   gpt-3.5-turbo while the underlying was bumped to gpt-5-mini, which
   muddles the "simple" example. Make both sides gpt-5-mini so the
   most basic example is a straight 1:1 mapping again. (Cursor bugbot
   review.)

* fix: revert gpt-4/gpt-3.5-turbo alias underlying to non-reasoning models

tests/test_openai_endpoints.py::test_completion calls the proxy alias
"gpt-4" with temperature=0, and other tests call gpt-3.5-turbo with
custom temperature / logprobs / the legacy /v1/completions endpoint.
The earlier modernization mapped both aliases to gpt-5.5 / gpt-5-mini,
which are reasoning models that reject temperature != 1 and don't
expose /v1/completions. Map the aliases to gpt-4.1 / gpt-4.1-mini
(modern non-reasoning OpenAI models) instead — keeps user-facing
aliases preserved while picking a current underlying that still
supports the parameters/endpoints the tests exercise.
2026-05-15 15:44:28 -07:00


import os
import sys
import traceback

from litellm._uuid import uuid
from dotenv import load_dotenv
from fastapi import Request
from fastapi.routing import APIRoute

load_dotenv()
import io
import time

# this file is to test litellm/proxy

sys.path.insert(
    0, os.path.abspath("../..")
)  # Adds the parent directory to the system path
import asyncio
import datetime
import json
import logging
from typing import Optional

import pytest

import litellm
from litellm.proxy.spend_tracking.spend_tracking_utils import (
    get_logging_payload,
    _sanitize_request_body_for_spend_logs_payload,
)
from litellm.proxy._types import SpendLogsMetadata, SpendLogsPayload


@pytest.mark.parametrize(
    "model_id",
    ["chatcmpl-9XZmkzS1uPhRCoVdGQvBqqIbSgECt", "", None],
)
def test_spend_logs_payload(model_id: Optional[str]):
    """
    Ensure only expected values are logged in spend logs payload.
    """

    input_args: dict = {
        "kwargs": {
            "model": "chatgpt-v-3",
            "messages": [
                {"role": "system", "content": "you are a helpful assistant.\n"},
                {"role": "user", "content": "bom dia"},
            ],
            "custom_llm_provider": "azure",
            "optional_params": {
                "stream": False,
                "max_tokens": 10,
                "user": "116544810872468347480",
                "extra_body": {},
            },
            "litellm_params": {
                "acompletion": True,
                "api_key": "sk-test-mock-key-707",
                "force_timeout": 600,
                "logger_fn": None,
                "verbose": False,
                "custom_llm_provider": "azure",
                "api_base": "https://openai-gpt-4-test-v-1.openai.azure.com//openai/",
                "litellm_call_id": "b9929bf6-7b80-4c8c-b486-034e6ac0c8b7",
                "model_alias_map": {},
                "completion_call_id": None,
                "metadata": {
                    "tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"],
                    "user_api_key": "sk-test-mock-api-key-123",
                    "user_api_key_alias": "custom-key-alias",
                    "user_api_end_user_max_budget": None,
                    "litellm_api_version": "0.0.0",
                    "global_max_parallel_requests": None,
                    "user_api_key_user_id": "116544810872468347480",
                    "user_api_key_org_id": "custom-org-id",
                    "user_api_key_team_id": "custom-team-id",
                    "user_api_key_team_alias": "custom-team-alias",
                    "user_api_key_metadata": {},
                    "requester_ip_address": "127.0.0.1",
                    "spend_logs_metadata": {"hello": "world"},
                    "headers": {
                        "content-type": "application/json",
                        "user-agent": "PostmanRuntime/7.32.3",
                        "accept": "*/*",
                        "postman-token": "92300061-eeaa-423b-a420-0b44896ecdc4",
                        "host": "localhost:4000",
                        "accept-encoding": "gzip, deflate, br",
                        "connection": "keep-alive",
                        "content-length": "163",
                    },
                    "endpoint": "http://localhost:4000/chat/completions",
                    "model_group": "gpt-5-mini",
                    "deployment": "azure/gpt-4.1-mini",
                    "model_info": {
                        "id": "4bad40a1eb6bebd1682800f16f44b9f06c52a6703444c99c7f9f32e9de3693b4",
                        "db_model": False,
                    },
                    "api_base": "https://openai-gpt-4-test-v-1.openai.azure.com/",
                    "caching_groups": None,
                    "error_information": None,
                    "status": "success",
                    "proxy_server_request": "{}",
                    "raw_request": "\n\nPOST Request Sent from LiteLLM:\ncurl -X POST \\\nhttps://openai-gpt-4-test-v-1.openai.azure.com//openai/ \\\n-H 'Authorization: *****' \\\n-d '{'model': 'chatgpt-v-3', 'messages': [{'role': 'system', 'content': 'you are a helpful assistant.\\n'}, {'role': 'user', 'content': 'bom dia'}], 'stream': False, 'max_tokens': 10, 'user': '116544810872468347480', 'extra_body': {}}'\n",
                },
                "model_info": {
                    "id": "4bad40a1eb6bebd1682800f16f44b9f06c52a6703444c99c7f9f32e9de3693b4",
                    "db_model": False,
                },
                "proxy_server_request": {
                    "url": "http://localhost:4000/chat/completions",
                    "method": "POST",
                    "headers": {
                        "content-type": "application/json",
                        "user-agent": "PostmanRuntime/7.32.3",
                        "accept": "*/*",
                        "postman-token": "92300061-eeaa-423b-a420-0b44896ecdc4",
                        "host": "localhost:4000",
                        "accept-encoding": "gzip, deflate, br",
                        "connection": "keep-alive",
                        "content-length": "163",
                    },
                    "body": {
                        "messages": [
                            {
                                "role": "system",
                                "content": "you are a helpful assistant.\n",
                            },
                            {"role": "user", "content": "bom dia"},
                        ],
                        "model": "gpt-5-mini",
                        "max_tokens": 10,
                    },
                },
                "preset_cache_key": None,
                "no-log": False,
                "stream_response": {},
                "input_cost_per_token": None,
                "input_cost_per_second": None,
                "output_cost_per_token": None,
                "output_cost_per_second": None,
            },
            "start_time": datetime.datetime(2024, 6, 7, 12, 43, 30, 307665),
            "stream": False,
            "user": "116544810872468347480",
            "call_type": "acompletion",
            "litellm_call_id": "b9929bf6-7b80-4c8c-b486-034e6ac0c8b7",
            "completion_start_time": datetime.datetime(2024, 6, 7, 12, 43, 30, 954146),
            "max_tokens": 10,
            "extra_body": {},
            "custom_llm_provider": "azure",
            "input": [
                {"role": "system", "content": "you are a helpful assistant.\n"},
                {"role": "user", "content": "bom dia"},
            ],
            "api_key": "1234",
            "original_response": "",
            "additional_args": {
                "headers": {"Authorization": "Bearer 1234"},
                "api_base": "openai-gpt-4-test-v-1.openai.azure.com",
                "acompletion": True,
                "complete_input_dict": {
                    "model": "chatgpt-v-3",
                    "messages": [
                        {"role": "system", "content": "you are a helpful assistant.\n"},
                        {"role": "user", "content": "bom dia"},
                    ],
                    "stream": False,
                    "max_tokens": 10,
                    "user": "116544810872468347480",
                    "extra_body": {},
                },
            },
            "log_event_type": "post_api_call",
            "end_time": datetime.datetime(2024, 6, 7, 12, 43, 30, 954146),
            "cache_hit": None,
            "response_cost": 2.4999999999999998e-05,
            "standard_logging_object": {
                "request_tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"],
                "metadata": {
                    "user_api_key_end_user_id": "test-user",
                },
                "model_map_information": {
                    "tpm": 1000,
                    "rpm": 1000,
                },
            },
        },
        "response_obj": litellm.ModelResponse(
            id=model_id,
            choices=[
                litellm.Choices(
                    finish_reason="length",
                    index=0,
                    message=litellm.Message(
                        content="Bom dia! Como posso ajudar você", role="assistant"
                    ),
                )
            ],
            created=1717789410,
            model="gpt-35-turbo",
            object="chat.completion",
            system_fingerprint=None,
            usage=litellm.Usage(
                completion_tokens=10, prompt_tokens=20, total_tokens=30
            ),
        ),
        "start_time": datetime.datetime(2024, 6, 7, 12, 43, 30, 308604),
        "end_time": datetime.datetime(2024, 6, 7, 12, 43, 30, 954146),
    }

    payload: SpendLogsPayload = get_logging_payload(**input_args)

    assert len(payload["request_id"]) > 0

    # Define the expected metadata keys
    expected_metadata_keys = SpendLogsMetadata.__annotations__.keys()

    # Validate only specified metadata keys are logged
    assert "metadata" in payload
    assert isinstance(payload["metadata"], str)
    payload["metadata"] = json.loads(payload["metadata"])
    assert set(payload["metadata"].keys()) == set(expected_metadata_keys)

    # This is crucial - used in PROD, it should pass, related issue: https://github.com/BerriAI/litellm/issues/4334
    assert (
        payload["request_tags"] == '["model-anthropic-claude-v2.1", "app-ishaan-prod"]'
    )
    assert payload["metadata"]["user_api_key_org_id"] == "custom-org-id"
    assert payload["metadata"]["user_api_key_team_id"] == "custom-team-id"
    assert payload["metadata"]["user_api_key_team_alias"] == "custom-team-alias"
    assert payload["metadata"]["user_api_key_alias"] == "custom-key-alias"
    assert payload["custom_llm_provider"] == "azure"


def test_spend_logs_payload_whisper():
    """
    Ensure we can write /transcription request/responses to spend logs
    """

    kwargs: dict = {
        "model": "whisper-1",
        "messages": [{"role": "user", "content": "audio_file"}],
        "optional_params": {},
        "litellm_params": {
            "api_base": "",
            "metadata": {
                "user_api_key": "sk-test-mock-api-key-123",
                "user_api_key_alias": None,
                "user_api_key_end_user_id": "test-user",
                "user_api_end_user_max_budget": None,
                "litellm_api_version": "1.40.19",
                "global_max_parallel_requests": None,
                "user_api_key_user_id": "default_user_id",
                "user_api_key_org_id": None,
                "user_api_key_team_id": None,
                "user_api_key_team_alias": None,
                "user_api_key_team_max_budget": None,
                "user_api_key_team_spend": None,
                "user_api_key_spend": 0.0,
                "user_api_key_max_budget": None,
                "user_api_key_metadata": {},
                "headers": {
                    "host": "localhost:4000",
                    "user-agent": "curl/7.88.1",
                    "accept": "*/*",
                    "content-length": "775501",
                    "content-type": "multipart/form-data; boundary=------------------------21d518e191326d20",
                },
                "endpoint": "http://localhost:4000/v1/audio/transcriptions",
                "litellm_parent_otel_span": None,
                "model_group": "whisper-1",
                "deployment": "whisper-1",
                "model_info": {
                    "id": "d7761582311451c34d83d65bc8520ce5c1537ea9ef2bec13383cf77596d49eeb",
                    "db_model": False,
                },
                "caching_groups": None,
            },
        },
        "start_time": datetime.datetime(2024, 6, 26, 14, 20, 11, 313291),
        "stream": False,
        "user": "",
        "call_type": "atranscription",
        "litellm_call_id": "05921cf7-33f9-421c-aad9-33310c1e2702",
        "completion_start_time": datetime.datetime(2024, 6, 26, 14, 20, 13, 653149),
        "stream_options": None,
        "input": "tmp-requestc8640aee-7d85-49c3-b3ef-bdc9255d8e37.wav",
        "original_response": '{"text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure."}',
        "additional_args": {
            "complete_input_dict": {
                "model": "whisper-1",
                "file": "<_io.BufferedReader name='tmp-requestc8640aee-7d85-49c3-b3ef-bdc9255d8e37.wav'>",
                "language": None,
                "prompt": None,
                "response_format": None,
                "temperature": None,
            }
        },
        "log_event_type": "post_api_call",
        "end_time": datetime.datetime(2024, 6, 26, 14, 20, 13, 653149),
        "cache_hit": None,
        "response_cost": 0.00023398580000000003,
    }

    response = litellm.utils.TranscriptionResponse(
        text="Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure."
    )

    payload: SpendLogsPayload = get_logging_payload(
        kwargs=kwargs,
        response_obj=response,
        start_time=datetime.datetime.now(),
        end_time=datetime.datetime.now(),
    )

    print("payload: ", payload)

    assert payload["call_type"] == "atranscription"
    assert payload["spend"] == 0.00023398580000000003


def test_spend_logs_payload_with_prompts_enabled(monkeypatch):
    """
    Test that messages and responses are logged in spend logs when store_prompts_in_spend_logs is enabled
    """
    # Mock general_settings via monkeypatch so the original value is restored
    # on teardown, even if an assertion below fails
    from litellm.proxy.proxy_server import general_settings

    monkeypatch.setitem(general_settings, "store_prompts_in_spend_logs", True)

    input_args: dict = {
        "kwargs": {
            "model": "gpt-5-mini",
            "messages": [{"role": "user", "content": "Hello!"}],
            "litellm_params": {
                "metadata": {
                    "user_api_key": "fake_key",
                }
            },
        },
        "response_obj": litellm.ModelResponse(
            id="chatcmpl-123",
            choices=[
                litellm.Choices(
                    finish_reason="stop",
                    index=0,
                    message=litellm.Message(content="Hi there!", role="assistant"),
                )
            ],
            model="gpt-5-mini",
            usage=litellm.Usage(completion_tokens=2, prompt_tokens=1, total_tokens=3),
        ),
        "start_time": datetime.datetime.now(),
        "end_time": datetime.datetime.now(),
    }

    # Create a standard logging payload
    standard_logging_payload = {
        "messages": [{"role": "user", "content": "Hello!"}],
        "response": {"role": "assistant", "content": "Hi there!"},
        "metadata": {
            "user_api_key_end_user_id": "test-user",
        },
        "request_tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"],
        "model_map_information": {
            "tpm": 1000,
            "rpm": 1000,
        },
    }
    litellm_params = {
        "proxy_server_request": {
            "body": {
                "model": "gpt-5.5",
                "messages": [{"role": "user", "content": "Hello!"}],
            }
        }
    }
    input_args["kwargs"]["standard_logging_object"] = standard_logging_payload
    input_args["kwargs"]["litellm_params"] = litellm_params

    payload: SpendLogsPayload = get_logging_payload(**input_args)

    print("json payload: ", json.dumps(payload, indent=4, default=str))

    # Verify messages and response are included in payload
    assert payload["response"] == json.dumps(
        {"role": "assistant", "content": "Hi there!"}
    )
    proxy_server_request = json.loads(payload["proxy_server_request"] or "{}")
    assert proxy_server_request["model"] == "gpt-5.5"
    assert proxy_server_request["messages"] == [{"role": "user", "content": "Hello!"}]

    # Disable prompt storage for the second half of the test
    monkeypatch.setitem(general_settings, "store_prompts_in_spend_logs", False)

    # Verify messages and response are not included when disabled
    payload_disabled: SpendLogsPayload = get_logging_payload(**input_args)
    assert payload_disabled["messages"] == "{}"
    assert payload_disabled["response"] == "{}"


def test_large_request_no_truncation_threshold():
    """
    Test that MAX_STRING_LENGTH_PROMPT_IN_DB constant is used for request body sanitization
    and that the new truncation logic keeps beginning (35%) and end (65%) of the string
    """
    from litellm.constants import (
        MAX_STRING_LENGTH_PROMPT_IN_DB,
        LITELLM_TRUNCATED_PAYLOAD_FIELD,
    )

    # Create a large string that exceeds the threshold
    # Use a pattern that allows us to verify beginning and end are preserved
    start_pattern = "START" * 250  # 1250 chars
    middle_pattern = "MIDDLE" * 200  # 1200 chars
    end_pattern = "END" * 250  # 750 chars
    large_content = start_pattern + middle_pattern + end_pattern

    request_body = {
        "messages": [{"role": "user", "content": large_content}],
        "model": "gpt-5.5",
    }

    sanitized = _sanitize_request_body_for_spend_logs_payload(request_body)

    # Verify the content was truncated
    truncated_content = sanitized["messages"][0]["content"]

    # Calculate expected character counts (35% start, 65% end)
    expected_start_chars = int(MAX_STRING_LENGTH_PROMPT_IN_DB * 0.35)
    expected_end_chars = int(MAX_STRING_LENGTH_PROMPT_IN_DB * 0.65)

    # Should keep first 35% of MAX_STRING_LENGTH_PROMPT_IN_DB chars
    assert truncated_content.startswith(large_content[:expected_start_chars])

    # Should keep last 65% of MAX_STRING_LENGTH_PROMPT_IN_DB chars
    assert truncated_content.endswith(large_content[-expected_end_chars:])

    # Should have truncation marker
    assert LITELLM_TRUNCATED_PAYLOAD_FIELD in truncated_content
    assert "skipped" in truncated_content


def test_small_request_no_truncation():
    """
    Test that small strings are not truncated by MAX_STRING_LENGTH_PROMPT_IN_DB
    """
    from litellm.constants import MAX_STRING_LENGTH_PROMPT_IN_DB

    # Create a small string that's under the threshold
    small_content = "x" * (MAX_STRING_LENGTH_PROMPT_IN_DB - 100)

    request_body = {
        "messages": [{"role": "user", "content": small_content}],
        "model": "gpt-5.5",
    }

    sanitized = _sanitize_request_body_for_spend_logs_payload(request_body)

    # Verify the content was NOT truncated
    assert sanitized["messages"][0]["content"] == small_content
    assert (
        len(sanitized["messages"][0]["content"]) == MAX_STRING_LENGTH_PROMPT_IN_DB - 100
    )


def test_configurable_string_length_env_var(monkeypatch):
    """
    Test that MAX_STRING_LENGTH_PROMPT_IN_DB can be configured via environment variable
    """
    # Set environment variable to a custom value
    monkeypatch.setenv("MAX_STRING_LENGTH_PROMPT_IN_DB", "1000")

    # Import after setting env var to ensure it picks up the new value
    import importlib
    import litellm.constants
    import litellm.proxy.spend_tracking.spend_tracking_utils

    importlib.reload(litellm.constants)
    importlib.reload(litellm.proxy.spend_tracking.spend_tracking_utils)

    from litellm.constants import (
        MAX_STRING_LENGTH_PROMPT_IN_DB,
        LITELLM_TRUNCATED_PAYLOAD_FIELD,
    )
    from litellm.proxy.spend_tracking.spend_tracking_utils import (
        _sanitize_request_body_for_spend_logs_payload,
    )

    # Verify the constant was set to the env var value
    assert MAX_STRING_LENGTH_PROMPT_IN_DB == 1000

    # Test truncation with the custom value
    large_content = "A" * 500 + "B" * 800 + "C" * 500  # 1800 chars total

    request_body = {
        "messages": [{"role": "user", "content": large_content}],
        "model": "gpt-5.5",
    }

    sanitized = _sanitize_request_body_for_spend_logs_payload(request_body)

    # Verify truncation occurred with 35% beginning and 65% end preserved
    truncated_content = sanitized["messages"][0]["content"]
    expected_start = int(1000 * 0.35)  # 350 chars from beginning
    expected_end = int(1000 * 0.65)  # 650 chars from end

    assert truncated_content.startswith(large_content[:expected_start])
    assert truncated_content.endswith(large_content[-expected_end:])
    assert LITELLM_TRUNCATED_PAYLOAD_FIELD in truncated_content
    assert "skipped" in truncated_content
    assert "800" in truncated_content  # Should mention skipped 800 chars


def test_truncation_preserves_beginning_and_end():
    """
    Test that truncation preserves the beginning (35%) and end (65%) of content for better debugging
    """
    from litellm.constants import (
        MAX_STRING_LENGTH_PROMPT_IN_DB,
        LITELLM_TRUNCATED_PAYLOAD_FIELD,
    )

    # Create content with distinct beginning, middle, and end
    beginning = "BEGIN_" * 200  # 1200 chars
    middle = "MIDDLE_" * 300  # 2100 chars
    end = "_END" * 300  # 1200 chars
    large_content = beginning + middle + end

    request_body = {
        "messages": [{"role": "user", "content": large_content}],
        "model": "gpt-5.5",
    }

    sanitized = _sanitize_request_body_for_spend_logs_payload(request_body)
    truncated_content = sanitized["messages"][0]["content"]

    # Calculate expected splits (35% beginning, 65% end)
    expected_start_chars = int(MAX_STRING_LENGTH_PROMPT_IN_DB * 0.35)
    expected_end_chars = int(MAX_STRING_LENGTH_PROMPT_IN_DB * 0.65)

    # Check that beginning is preserved
    expected_beginning = large_content[:expected_start_chars]
    assert truncated_content.startswith(expected_beginning)

    # Check that end is preserved
    expected_end = large_content[-expected_end_chars:]
    assert truncated_content.endswith(expected_end)

    # Check truncation marker is present
    assert LITELLM_TRUNCATED_PAYLOAD_FIELD in truncated_content
    assert "skipped" in truncated_content

    # Calculate expected skipped chars
    total_chars = len(large_content)
    kept_chars = expected_start_chars + expected_end_chars
    expected_skipped = total_chars - kept_chars
    assert str(expected_skipped) in truncated_content
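
The truncation tests above all check the same shape of result: a prefix of about 35% of the limit, a marker that mentions how many characters were skipped, and a suffix of about 65% of the limit. The following standalone sketch shows one way such a function could behave. It is not litellm's implementation; truncate_middle, MAX_LEN, and MARKER are hypothetical stand-ins for the real function and constants, and the marker wording is an assumption.

```python
# Hypothetical sketch: truncate_middle, MAX_LEN, and MARKER are illustrative
# stand-ins, not litellm's implementation or its constants.
MAX_LEN = 1000
MARKER = "truncated"


def truncate_middle(value: str, max_len: int = MAX_LEN) -> str:
    """Keep the first 35% and last 65% of max_len chars, noting what was skipped."""
    if len(value) <= max_len:
        # Short strings are returned untouched.
        return value
    start_chars = int(max_len * 0.35)
    end_chars = int(max_len * 0.65)
    skipped = len(value) - (start_chars + end_chars)
    # Slice from an explicit index so end_chars == 0 yields an empty suffix.
    suffix = value[len(value) - end_chars:]
    return f"{value[:start_chars]}... ({MARKER}: skipped {skipped} chars) ...{suffix}"


if __name__ == "__main__":
    content = "A" * 500 + "B" * 800 + "C" * 500
    print(len(truncate_middle(content)))
```

Weighting the kept portion toward the end of the string matches the split the tests assert (35% start, 65% end); the sketch makes no claim about why that ratio was chosen.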