litellm/tests/proxy_unit_tests
Darien Kindlund 17e145a083 fix(proxy): use model_group for model_max_budget spend tracking cache key (#25549)
The model_max_budget limiter tracks spend in one code path
(async_log_success_event) and enforces budget limits in another
(is_key_within_model_budget via user_api_key_auth). These two paths
used different model name formats to build cache keys:

- Tracking used standard_logging_payload["model"], which is the
  deployment-level model name (e.g. "vertex_ai/claude-opus-4-6@default")
- Enforcement used request_data["model"], which is the model group
  alias (e.g. "claude-opus-4-6")

Because the cache keys never matched, the enforcement path always read
None for current spend, silently allowing all requests through even
after the budget was exceeded. This affected any provider that decorates
model names with provider prefixes or version suffixes (Vertex AI,
Bedrock, etc.).
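
The mismatch can be reproduced with a plain dict standing in for the
spend cache. This is a minimal sketch, not the limiter's actual code:
the key layout, the budget_cache_key and is_within_budget helpers, and
the "hashed-key" value are all hypothetical. Only the two model name
formats come from the description above.

```python
from typing import Dict, Optional

# Hypothetical key layout; the real limiter's cache key format may differ.
def budget_cache_key(api_key_hash: str, model: str) -> str:
    return f"virtual_key_spend:{api_key_hash}:{model}"

# Plain dict standing in for the spend cache.
spend_cache: Dict[str, float] = {}

# Tracking path: the key is built from the deployment-level model name.
tracking_key = budget_cache_key("hashed-key", "vertex_ai/claude-opus-4-6@default")
spend_cache[tracking_key] = spend_cache.get(tracking_key, 0.0) + 12.5

# Enforcement path: the key is built from the model group alias.
enforcement_key = budget_cache_key("hashed-key", "claude-opus-4-6")
current_spend: Optional[float] = spend_cache.get(enforcement_key)

def is_within_budget(spend: Optional[float], max_budget: float) -> bool:
    # A missing cache entry is treated as "nothing spent yet".
    return spend is None or spend < max_budget

# 12.5 was recorded against a 10.0 budget, but the lookup misses,
# so the request is still allowed.
within = is_within_budget(current_spend, 10.0)
```

Because the two keys differ only in the model segment, the recorded
spend is never found by the enforcement lookup.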

Fix: use model_group (the user-facing alias) from StandardLoggingPayload
for spend tracking, falling back to model when model_group is None.
This aligns the tracking cache key with the enforcement cache key.
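
The fallback can be sketched as follows. This is an illustration under
stated assumptions, not the patched code: PayloadSketch is a
hypothetical stand-in for the two relevant StandardLoggingPayload
fields, and model_name_for_budget_tracking is a hypothetical helper
name.

```python
from typing import Optional, TypedDict

# Hypothetical stand-in for the two payload fields used here.
class PayloadSketch(TypedDict):
    model: str
    model_group: Optional[str]

def model_name_for_budget_tracking(payload: PayloadSketch) -> str:
    # Prefer the user-facing alias so the tracking key matches the
    # name the enforcement path sees in request_data["model"].
    model_group = payload["model_group"]
    if model_group is not None:
        return model_group
    # Fall back to the deployment-level name when no group is set.
    return payload["model"]

with_group = model_name_for_budget_tracking(
    {"model": "vertex_ai/claude-opus-4-6@default", "model_group": "claude-opus-4-6"}
)
without_group = model_name_for_budget_tracking(
    {"model": "vertex_ai/claude-opus-4-6@default", "model_group": None}
)
```

Here with_group resolves to the alias and without_group to the
deployment-level name.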

Fixes the same root cause reported in #15223 and #10052.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 19:37:58 -07:00