litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-25 21:06:55 +00:00

Author	SHA1	Message	Date
Ephrim Stanley	ae0769b1df	fix: guard empty-dict team limits and malformed int in deployment default limits - Change `if team_limit:` to `if team_limit is not None:` in both get_key_model_rpm_limit and get_key_model_tpm_limit so that an explicitly-empty team rate-limit map ({}) is returned as-is instead of silently falling through to deployment defaults (P1 fix). - Replace the bare `int()` list comprehension in _get_deployment_default_limit with a loop that catches ValueError/TypeError so malformed config strings do not raise an unhandled exception during request handling (P2 fix). - Add corresponding unit tests for both edge cases. Co-Authored-By: Claude (claude-sonnet-4-6) <noreply@anthropic.com>	2026-03-19 07:40:47 -04:00
Ephrim Stanley	477c54184b	perf: avoid unconditional router lookups in success handler Replace bare _get_deployment_default_tpm/rpm_limit calls in the async_log_success_event condition with get_key_model_tpm/rpm_limit (model_name=model_group). The higher-level getters short-circuit on key/team metadata hits before ever reaching the router, so requests that don't use deployment defaults incur no extra router lookup. Remove the now-unused bare helper imports. Also fix invalid `int = None` type hints in test helper signatures to `Optional[int] = None`. Co-Authored-By: Claude (claude-sonnet-4-6) <noreply@anthropic.com>	2026-03-19 02:07:50 -04:00
Ephrim Stanley	36dc893770	fix: address review feedback on default tpm/rpm limits - Use min() across all matching deployments instead of first-wins when resolving default_api_key_tpm/rpm_limit for a model group, so load-balanced setups with different per-deployment limits always apply the most conservative value - Replace the global SensitiveDataMasker non_sensitive_overrides change with a targeted excluded_keys set at the remove_sensitive_info_from_deployment call site, avoiding unintended suppression of other fields - Update the v1 parallel request limiter to pass model_name to get_key_model_tpm/rpm_limit so deployment defaults apply there too - Add 4 tests covering multi-deployment min semantics Co-Authored-By: Claude (claude-sonnet-4-6) <noreply@anthropic.com>	2026-03-19 01:43:27 -04:00
Ephrim Stanley	cac685014f	feat: add proxy-wide default tpm/rpm limits per deployment Adds `default_api_key_tpm_limit` and `default_api_key_rpm_limit` to `GenericLiteLLMParams` so operators can set per-deployment rate limit defaults in config.yaml. When a key has no model-specific tpm/rpm limit configured, the proxy falls back to these deployment defaults (Case 2 in spec). Key-level limits always take priority (Case 1). - Extends `get_key_model_tpm_limit` / `get_key_model_rpm_limit` with a `model_name` param and a priority-4 deployment-default fallback - Passes `model_name=requested_model` in the parallel request limiter so the fallback is triggered at enforcement time - Adds `"limit"` to `SensitiveDataMasker` non-sensitive overrides so `*_limit` fields are not masked in `/model/info` responses - Adds 17 unit tests covering both spec cases and the `/model/info` path Co-Authored-By: Claude (claude-sonnet-4-6) <noreply@anthropic.com>	2026-03-19 01:30:18 -04:00
brtydse100	dd1ea3d39e	Support multiple headers mapped to the customer user role (#23664) * added the header mapping feature * added tests * final cleanup * final cleanup * added missing test and logic * fixed header sending bug * Update litellm/proxy/auth/auth_utils.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * added back init file in responses + fixed test_auth_utils.py in local_testing --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-03-15 14:20:45 +05:30
Jay Prajapati	582d324a76	fix(proxy): support slashes in google generateContent model names (#19737) * fix(proxy): support slashes in google route params * fix(proxy): extract google model ids with slashes * test(proxy): cover google model ids with slashes	2026-01-25 22:59:50 -08:00
Ishaan Jaff	117c7dd158	[Feat] Claude Code - Add End-user tracking with Claude Code (#19171) * add claude code customer usage tracking * fix get end user tracking claude code * TestGetCustomerIdFromStandardHeaders	2026-01-15 17:57:10 -08:00
Harshit Jain	8a683d9a6a	Add fix for bedrock_cache, metadata and max_model_budget (#18872)	2026-01-10 01:09:00 +05:30

8 Commits
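
The four most recent commits describe a fallback order for per-model rate limits: key-level limits first, then team-level limits (where an explicitly empty map counts as a configured value), then the most conservative deployment default, with malformed config strings skipped instead of raising. The following is a minimal sketch of that behavior, not the code from the repository. The function names `deployment_default_limit` and `resolve_limit_map`, their signatures, and the truthiness check on the key-level map are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Optional


def deployment_default_limit(raw_values: Iterable[object]) -> Optional[int]:
    """Return the smallest valid limit across deployments, or None.

    Hypothetical helper: malformed entries (e.g. "oops" or None) are
    skipped so a bad config string cannot raise during request handling.
    """
    parsed = []
    for raw in raw_values:
        try:
            parsed.append(int(raw))
        except (ValueError, TypeError):
            continue  # ignore values that cannot be read as an int
    # min() applies the most conservative limit in load-balanced setups
    return min(parsed) if parsed else None


def resolve_limit_map(
    key_limits: Optional[Dict[str, int]],
    team_limits: Optional[Dict[str, int]],
    model_name: str,
    deployment_raw_limits: Iterable[object],
) -> Optional[Dict[str, int]]:
    """Hypothetical resolver mirroring the priority order in the commits."""
    # Key-level limits take priority (assumed here to be a truthiness check).
    if key_limits:
        return key_limits
    # `is not None` rather than truthiness: an explicitly empty team map
    # ({}) is returned as-is and does not fall through to the defaults.
    if team_limits is not None:
        return team_limits
    # Last resort: the deployment default for the requested model group.
    default = deployment_default_limit(deployment_raw_limits)
    return {model_name: default} if default is not None else None
```

With these assumed helpers, `resolve_limit_map(None, {}, "gpt-x", ["100"])` returns the empty team map unchanged, while `resolve_limit_map(None, None, "gpt-x", ["100", "40"])` falls back to the smaller of the two deployment defaults.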