litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-17 22:48:35 +00:00

Author	SHA1	Message	Date
Ishaan Jaff	57544f1662	[Feat] Adds IAM role assumption support for AWS Secret Manager (#16887 ) * add AWS fields for KeyManagementSettings * docs IAM roles * use aws iam auth on secret manager v2 * fix: load_aws_secret_manager * test_secret_manager_with_iam_role_settings	2025-11-20 12:38:48 -08:00
Cesar Garcia	5e70c78b94	fix(cost-tracking): support base_model lookup in litellm_metadata for Responses API (#16778 ) Cost tracking was failing for Responses API when using custom deployment names with base_model configuration. The issue occurred because: - Chat Completions API stores model_info in 'metadata' - Responses API stores model_info in 'litellm_metadata' - Cost calculator only checked 'metadata', missing Responses API costs Changes: - Updated _get_base_model_from_metadata() to check both metadata locations - Added comprehensive unit tests covering all scenarios - Maintains backward compatibility (metadata takes precedence) Fixes #16772	2025-11-18 19:53:18 -08:00
Ishaan Jaffer	95b1608970	test_get_valid_models_with_custom_llm_provider	2025-11-15 09:43:10 -08:00
Krish Dholakia	06906534b3	feat(audio_transcriptions/): calculate duration of audio file for cost calculation + feat (image_generations): cost tracking accuracy improved with output_format, quality, size values fixed per openai model * feat(audio_transcriptions/): calculate duration of audio file for cost calculation Fixes https://github.com/BerriAI/litellm/issues/11846 Closes https://github.com/BerriAI/litellm/issues/14605 * fix(cost_calculator.py): correctly use base model, when set Fixes issue where azure base model was being ignored * feat(cost_calculator.py): fix default cost tracking quality param for image generation * feat(image_generations/): return output_format, quality, size aligns response to openai spec and improves cost tracking accuracy * fix(cost_calculator.py): refactor cost calculation for image generation to use image response instead of hidden params * build: update build * fix: fix cost calculation * build: update poetry lock * fix: fix ruff checks * fix: fix aembedding * fix: fix ruff errors * fix: modify to catch errors * fix: test * fix: loosen test to handle openai lib out of sync * fix: fix base models * fix: fix usage object	2025-11-08 16:24:31 -08:00
Sameer Kankute	faae0ff0dc	Fix Azure DALL-E-3 health check content policy violation by using safe default prompt (#16329 ) * Add custom health check prompt support * Add constant for health check prompt * Add constant for health check prompt	2025-11-07 15:30:56 -08:00
Ishaan Jaff	a6b0993405	[Feat] Secret Manager - Hashicorp, add auth via approle (#16374 ) * add _verify_required_credentials_exist and _auth_via_approle * test_hashicorp_secret_manager_approle_auth * docs hcorp auth	2025-11-07 14:39:33 -08:00
Ishaan Jaffer	89a73b853a	fix cyber ark	2025-11-06 16:26:14 -08:00
Alexsander Hamir	8ee9b1bc93	feat: Add configurable mount name and path prefix for HashiCorp Vault (#16253 ) - Add HCP_VAULT_MOUNT_NAME env var to override default 'secret' mount - Add HCP_VAULT_PATH_PREFIX env var to add prefix to secret paths - Update get_url() method to construct URLs with configurable mount and prefix - Add test coverage for custom mount names and path prefixes - Maintain backward compatibility with existing configurations This allows users to configure Vault paths like: - Custom mount: {VAULT_ADDR}/v1/{MOUNT_NAME}/data/{SECRET} - With prefix: {VAULT_ADDR}/v1/secret/data/{PREFIX}/{SECRET} - Both: {VAULT_ADDR}/v1/{MOUNT_NAME}/data/{PREFIX}/{SECRET} Resolves issue where mount name was hardcoded and path prefixes weren't supported.	2025-11-05 16:06:07 -08:00
Ishaan Jaff	466e7d178c	[Feat] Cyber Ark - Add Key Rotations support (#16289 ) * KeyManagementSystem add cyberark * add CyberArkSecretManager * add CyberArkSecretManager * add CyberArkSecretManager * docs add CyberArkSecretManager * docs * refactor to use get_secret_from_manager * fix async roate for cyber ark, re-use base class * fixes * cyber ark * docs fix * docs fix * docs cyberark * fix linting * fix get_secret_from_manager	2025-11-05 14:03:43 -08:00
Ishaan Jaff	9a372bfad6	[Feat] Add CyberArk Secrets Manager Integration (#16278 ) * KeyManagementSystem add cyberark * add CyberArkSecretManager * add CyberArkSecretManager * add CyberArkSecretManager * docs add CyberArkSecretManager * docs * refactor to use get_secret_from_manager * Potential fix for code scanning alert no. 3645: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 3650: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 3649: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 3646: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2025-11-05 14:00:45 -08:00
Deepanshu Lulla	812ea03d28	Add tags and descriptions support to aws secrets manager (#16224 ) * Add tags and descriptions support to aws secrets manager * add tags --------- Co-authored-by: deepanshu <deepanshu.lulla@hq.bill.com>	2025-11-04 16:11:51 -08:00
Ishaan Jaffer	94c2c28f3d	claude-sonnet-4-5-20250929 fix	2025-10-31 18:20:52 -07:00
Chris Gibbons	2bef7c3662	fix: Preserve Bedrock inference profile IDs in health checks (#15947 ) * fix: Preserve Bedrock inference profile IDs in health checks - Fixes issue where health checks were stripping inference profile IDs - Preserves cross-region inference profile prefixes (us., eu., apac., jp., au., us-gov., global.) - Strips only AWS region routing while preserving routes and handlers - Resolves both issue #15807 and inference profile requirement errors - Adds comprehensive tests for all Bedrock model format combinations Issue #15807 attempted to fix regional Bedrock model health checks but was too aggressive, stripping cross-region inference profile prefixes that AWS requires. This caused errors: "Invocation of model ID X with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile." The fix now correctly: - Strips AWS regions (us-west-2, eu-central-1, etc.) from routing - Preserves CRIS prefixes (us., eu., etc.) required by AWS - Preserves routes (converse/, invoke/) - Preserves handlers (llama/, deepseek_r1/) - Only affects Bedrock models (checked via startswith) Test coverage includes 20+ scenarios for all Bedrock model format combinations. * Remove unused traceback import	2025-10-27 19:44:45 -07:00
Ishaan Jaffer	d8e5938f54	test_azure_img_gen_health_check	2025-10-25 10:35:47 -07:00
Ishaan Jaffer	778e10119c	test_azure_img_gen_health_check	2025-10-25 10:27:10 -07:00
Ishaan Jaffer	0bedf1c0a7	fix tests	2025-10-25 10:19:24 -07:00
Ishaan Jaffer	ae7b13550e	test_models_by_provider	2025-10-23 09:10:41 -07:00
nuernber	799a2b624a	use proper bedrock model name in health check (#15808 )	2025-10-22 15:24:57 -07:00
Ishaan Jaff	9135e748a0	[Feat ] /ocr - Add mode + Health check support for OCR models (#15767 ) * get_mode_handlers * use get_mode_handlers * test_ahealth_check_ocr * Add OCR mode to test models * docs OCR Health Checks * fix connection endpoint	2025-10-21 16:58:37 -07:00
Ishaan Jaffer	8cb66168bc	test fix	2025-10-10 19:57:17 -07:00
Georg Wölflein	dbfa8ec921	Fix end user cost tracking in the responses API (#15124 ) #13860	2025-10-02 15:13:57 -07:00
Ishaan Jaff	f6d7683261	[Feat] LiteLLM Overhead metric tracking - Add support for tracking litellm overhead on cache hits (#15045 ) * test_litellm_overhead * vertex track overhead * fix config.yaml used for testing * test_litellm_overhead_stream * add update_response_metadata for caching handler * add CachingDetails * fix update_response_metadata import * add CachingDetails metrics * add CachingDetails * test_litellm_overhead_cache_hit * test_litellm_overhead_cache_hit * test_litellm_overhead_cache_hit	2025-09-29 17:33:27 -07:00
Ishaan Jaffer	e0172b86e2	test_litellm_overhead_non_streaming	2025-09-29 15:48:32 -07:00
Ishaan Jaff	619577d4e8	[Feat] Add litellm overhead metric for VertexAI (#15040 ) * test_litellm_overhead * vertex track overhead * fix config.yaml used for testing * test_litellm_overhead_stream * add update_response_metadata for caching handler * Revert "add update_response_metadata for caching handler" This reverts commit f2a891f2b448b878a5dbf4b5b0a6166c807b3705.	2025-09-29 15:15:25 -07:00
Ishaan Jaffer	bbf5761b49	tets health check	2025-09-27 12:06:26 -07:00
Alexsander Hamir	eaa04cd8ce	fix: use fastuuid helper (#14903 ) * fix: use fastuuid helper across the codebase First batch of changes, simple drop in replacement. * second batch of changes * fixed: script mistake on helper file	2025-09-25 15:47:01 -07:00
Ishaan Jaffer	ba1cd3f0d2	Revert "feature: generic object pool (#14702 )" This reverts commit `60800698f2`.	2025-09-24 21:36:49 -07:00
Ishaan Jaff	d9bf6a8c53	Revert "Fix: make `pondpond` as optional dependency for `proxy` extras, disab…" (#14880 ) This reverts commit `e75d8b711e`.	2025-09-24 21:34:42 -07:00
Alex Shoop	e75d8b711e	Fix: make `pondpond` as optional dependency for `proxy` extras, disable object pooling gracefully (#14863 ) * pondpond optional dep proxy extra * lock	2025-09-24 17:09:30 -07:00
Krish Dholakia	d4540d31c1	Merge branch 'main' into fix/streaming-tool-call-indices	2025-09-21 21:24:22 -07:00
Alexsander Hamir	60800698f2	feature: generic object pool (#14702 ) * add: generic object pool & tests Introduced a reusable object pool that can be applied across the codebase. Note: memory growth is managed via eviction settings—using a hard cap could reduce performance, so eviction is the preferred safeguard. * fix: simpler tests	2025-09-18 18:32:45 -07:00
Ishaan Jaffer	c6afa904bb	fix: test_completion_with_no_model	2025-09-18 10:17:09 -07:00
Ishaan Jaffer	1e1d174733	fix: test_completion_with_no_model	2025-09-18 10:13:32 -07:00
Tim Elfrink	c5ca2afec3	Add test for tool call sequential index assignment - Test multiple tool calls without explicit indices receive sequential indices - Verify Delta class assigns indices 0, 1, 2... instead of defaulting all to 0 - Add comprehensive assertions for tool call details preservation - Cover provider-agnostic streaming response scenarios	2025-09-15 21:11:13 +02:00
Krrish Dholakia	d05f58721e	test: remove end of life model from tests	2025-09-09 21:01:45 -07:00
Ishaan Jaff	d37be48a80	test: llama-3.3-70b-versatile	2025-09-01 20:14:12 -07:00
Krish Dholakia	3e764ec268	Merge pull request #13808 from mainred/validate_api_version feat(utils.py): accept 'api_version' as param for validate_environment	2025-08-22 23:59:38 -07:00
Ishaan Jaff	e93e266f84	[Performance] Use O(1) Set lookups for model routing (#13879 ) * o(1) lookups * Revert "o(1) lookups" This reverts commit 620d14246980813366b4b1f1c0ce396b528dd9df. * o(1) lookups * Revert "o(1) lookups" This reverts commit 676a9f5bcc3c2b9fa31e0a9fdf00389739b3052f. * o(1) lookups * register_model fix * test_aget_valid_models * lambda ai models fix * test_utils.py * test fix vertex ai	2025-08-21 22:56:46 -07:00
Qingchuan Hao	f2a6be390b	feat(utils.py): accept 'api_version' as param for validate_environment	2025-08-20 14:29:58 +00:00
Krrish Dholakia	f544a4e238	test: update test	2025-07-29 21:08:36 -07:00
Robert Gambee	52b2984792	[Bug Fix] Always include tool calls in output of trim_messages (#11517 ) * Check content and order of trimmed messages * Assert tool calls are preserved if below max_tokens * Unreverse order of tool calls * Return tool calls alongside other messages * Write test for trimming untokenizable field * Return original messages in case of exception	2025-07-17 16:01:59 -07:00
Krish Dholakia	d202ce229b	Prevent writing default user setting updates to yaml (error in non-root env) + Use central team member budget when max_budget_in_team set on UI (#12533 ) * fix(proxy_setting_endpoints.py): require store model in db is enabled for setting user default settings * test(test_proxy_server.py): update test * fix(reset_budget_job.py): initial commit adding reset budget logic for team members * test: update unit testing * test(test_proxy_budget_reset.py): validate team member budget was reset * test(test_reset_budget_job.py): update unit tests * test: update tests	2025-07-12 10:13:07 -07:00
Jugal D. Bhatt	aa14d26da4	fix slack alerts (#12464 ) * fix slack alerts * remvoe print * add unit test	2025-07-10 08:58:47 -07:00
Ishaan Jaff	c31a7d3ab7	fix new utils tests	2025-07-04 18:30:50 -07:00
Cole McIntosh	a7594196cd	fix: support Cursor IDE tool_choice format {"type": "auto"} (#12168 ) * fix: support Cursor IDE tool_choice format {"type": "auto"} - Update validate_chat_completion_tool_choice to normalize {"type": "auto"} to "auto" - Handles Cursor IDE sending non-standard tool_choice format - Add comprehensive tests for tool choice validation Fixes #12098 * fix: return full tool_choice object for Cursor IDE format Based on PR feedback, updated validate_chat_completion_tool_choice to return the full tool_choice dictionary instead of just extracting the type string. This maintains consistency with downstream code that expects the full object structure. - Changed behavior: {"type": "auto"} now returns {"type": "auto"} instead of "auto" - Updated tests to reflect the new expected behavior - Ensures compatibility with code that passes tool_choice to optional_params Addresses feedback from PR #12168	2025-06-30 12:39:58 -07:00
Ishaan Jaff	0c19414b36	[⚡️ Python SDK import] - reduce python sdk import time by .3s (#12140 ) * use 1 file for KeyManagementSystem * move key management settings * fix import locs * test_proxy_types_not_imported * test the import loc * fix import item * fix imports * fix import loc * fix imports	2025-06-28 14:57:10 -07:00
Ishaan Jaff	6b623f9c98	test whitelisted models	2025-06-28 14:46:16 -07:00
Bougou Nisou	58dda44fda	feat: enhance redaction functionality for EmbeddingResponse (#12088 )	2025-06-27 21:30:26 -07:00
Laurien	0c50f8bcc9	Update enduser spend and budget reset date based on budget duration (#8460 )	2025-06-08 08:39:14 -07:00
Krrish Dholakia	e5f228abd5	fix(utils.py): handle litellm proxy case for checking model info	2025-06-06 09:24:41 -07:00
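
Note on commit 8ee9b1bc93 (configurable mount name and path prefix for HashiCorp Vault): the three URL shapes listed in that commit message can be illustrated with a small sketch. This is not the repository's `get_url()` implementation; the function name `build_vault_url` and its parameters are hypothetical, and the only thing taken from the commit message is the URL layout `{VAULT_ADDR}/v1/{MOUNT_NAME}/data/{PREFIX}/{SECRET}` with `secret` as the default mount.

```python
from typing import Optional


def build_vault_url(
    vault_addr: str,
    secret_name: str,
    mount_name: str = "secret",         # default mount, per the commit message
    path_prefix: Optional[str] = None,  # optional prefix, per the commit message
) -> str:
    """Hypothetical helper: assemble a Vault KV-style URL from its parts."""
    # Drop a trailing slash on the address so the join does not double it.
    parts = [vault_addr.rstrip("/"), "v1", mount_name, "data"]
    if path_prefix:
        # Trim surrounding slashes so "team-a" and "/team-a/" behave the same.
        parts.append(path_prefix.strip("/"))
    parts.append(secret_name)
    return "/".join(parts)


# Default mount, no prefix: {VAULT_ADDR}/v1/secret/data/{SECRET}
default_url = build_vault_url("https://vault.example.com", "my-key")

# Custom mount and prefix: {VAULT_ADDR}/v1/{MOUNT_NAME}/data/{PREFIX}/{SECRET}
custom_url = build_vault_url(
    "https://vault.example.com/", "my-key", mount_name="kv", path_prefix="/team-a/"
)
```

In the commit itself the mount and prefix come from the `HCP_VAULT_MOUNT_NAME` and `HCP_VAULT_PATH_PREFIX` environment variables; the sketch takes them as plain arguments to stay self-contained.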

88 Commits