litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 00:48:01 +00:00

Author	SHA1	Message	Date
Cesar Garcia	5e70c78b94	fix(cost-tracking): support base_model lookup in litellm_metadata for Responses API (#16778) Cost tracking was failing for Responses API when using custom deployment names with base_model configuration. The issue occurred because: - Chat Completions API stores model_info in 'metadata' - Responses API stores model_info in 'litellm_metadata' - Cost calculator only checked 'metadata', missing Responses API costs Changes: - Updated _get_base_model_from_metadata() to check both metadata locations - Added comprehensive unit tests covering all scenarios - Maintains backward compatibility (metadata takes precedence) Fixes #16772	2025-11-18 19:53:18 -08:00
Ishaan Jaffer	95b1608970	test_get_valid_models_with_custom_llm_provider	2025-11-15 09:43:10 -08:00
Ishaan Jaffer	94c2c28f3d	claude-sonnet-4-5-20250929 fix	2025-10-31 18:20:52 -07:00
Ishaan Jaffer	ae7b13550e	test_models_by_provider	2025-10-23 09:10:41 -07:00
Ishaan Jaffer	8cb66168bc	test fix	2025-10-10 19:57:17 -07:00
Georg Wölflein	dbfa8ec921	Fix end user cost tracking in the responses API (#15124) #13860	2025-10-02 15:13:57 -07:00
Krish Dholakia	d4540d31c1	Merge branch 'main' into fix/streaming-tool-call-indices	2025-09-21 21:24:22 -07:00
Ishaan Jaffer	c6afa904bb	fix: test_completion_with_no_model	2025-09-18 10:17:09 -07:00
Ishaan Jaffer	1e1d174733	fix: test_completion_with_no_model	2025-09-18 10:13:32 -07:00
Tim Elfrink	c5ca2afec3	Add test for tool call sequential index assignment - Test multiple tool calls without explicit indices receive sequential indices - Verify Delta class assigns indices 0, 1, 2... instead of defaulting all to 0 - Add comprehensive assertions for tool call details preservation - Cover provider-agnostic streaming response scenarios	2025-09-15 21:11:13 +02:00
Krrish Dholakia	d05f58721e	test: remove end of life model from tests	2025-09-09 21:01:45 -07:00
Ishaan Jaff	d37be48a80	test: llama-3.3-70b-versatile	2025-09-01 20:14:12 -07:00
Krish Dholakia	3e764ec268	Merge pull request #13808 from mainred/validate_api_version feat(utils.py): accept 'api_version' as param for validate_environment	2025-08-22 23:59:38 -07:00
Ishaan Jaff	e93e266f84	[Performance] Use O(1) Set lookups for model routing (#13879) * o(1) lookups * Revert "o(1) lookups" This reverts commit 620d14246980813366b4b1f1c0ce396b528dd9df. * o(1) lookups * Revert "o(1) lookups" This reverts commit 676a9f5bcc3c2b9fa31e0a9fdf00389739b3052f. * o(1) lookups * register_model fix * test_aget_valid_models * lambda ai models fix * test_utils.py * test fix vertex ai	2025-08-21 22:56:46 -07:00
Qingchuan Hao	f2a6be390b	feat(utils.py): accept 'api_version' as param for validate_environment	2025-08-20 14:29:58 +00:00
Krrish Dholakia	f544a4e238	test: update test	2025-07-29 21:08:36 -07:00
Robert Gambee	52b2984792	[Bug Fix] Always include tool calls in output of trim_messages (#11517) * Check content and order of trimmed messages * Assert tool calls are preserved if below max_tokens * Unreverse order of tool calls * Return tool calls alongside other messages * Write test for trimming untokenizable field * Return original messages in case of exception	2025-07-17 16:01:59 -07:00
Ishaan Jaff	c31a7d3ab7	fix new utils tests	2025-07-04 18:30:50 -07:00
Ishaan Jaff	6b623f9c98	test whitelisted models	2025-06-28 14:46:16 -07:00
Bougou Nisou	58dda44fda	feat: enhance redaction functionality for EmbeddingResponse (#12088)	2025-06-27 21:30:26 -07:00
Ishaan Jaff	7d47417906	test: fixes	2025-05-31 12:42:56 -07:00
Ishaan Jaff	0590b1eb3a	[Fix] Prometheus Metrics - Do not track end_user by default + expose flag to enable tracking end_user on prometheus (#11192) * fix: testing for disabling end user on metrics * fix: fixes for test_prometheus_factory * Delete litellm/model_prices_and_context_window_backup.json * fix: issues with merge conflicts * fix: test_get_end_user_id_for_cost_tracking_prometheus_only * Update tests/test_litellm/integrations/test_prometheus.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-27 17:06:58 -07:00
Ishaan Jaff	86cdb8382b	[Feat] Use aiohttp transport by default - 97% lower median latency (#11097) * fix: add flag for disabling use_aiohttp_transport * feat: add _create_async_transport * feat: fixes for transport * add httpx-aiohttp * feat: fixes for transport * refactor: fixes for transport * build: fix deps * fixes: test fixes * fix: ensure aiohttp does not auto set content type * test: test fixes * feat: add LiteLLMAiohttpTransport * fix: fixes for responses API handling * test: fixes for responses API handling * test: fixes for responses API handling * feat: fixes for transport * fix: base embedding handler * test: test_async_http_handler_force_ipv4 * test: fix failing deepeval test * fix: add YARL for bedrock urls * fix: issues with transport * fix: comment out linting issues * test fix * test: XAI is unstable * test: fixes for using respx * test: XAI fixes * test: XAI fixes * test: infinity testing fixes * docs(config_settings.md): document param * test: test_openai_image_edit_litellm_sdk * test: remove deprecated test * bump respx==0.22.0 * test: test_xai_message_name_filtering * test: fix anthropic test after bumping httpx * use n 4 for mapped tests (#11109) * fix: use 1 session per event loop * test: test_client_session_helper * fix: linting error * fix: resolving GET requests on httpx 0.28.1 * test fixes proxy unit tests * fix: add ssl verify settings * fix: proxy unit tests * fix: refactor * tests: basic unit tests for aiohttp transports * tests: fixes xai --------- Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>	2025-05-23 22:55:35 -07:00
Krrish Dholakia	33814917fe	fix(token_counter.py): handle empty lists	2025-05-02 08:07:08 -07:00
Carlos Freund	cb177dbd7a	Fix and rewrite of token_counter (#10409) * added tests messages_with_counts: Made tolerance explicit for each test. But they match the new implementation(which beats the old) * new token counter impl * compare old and new implementation in test * delete old token counter * moved tests to /tests/litellm/litellm_core_utils * use existing types * docstrings * warn about using default params on unknown model. * created type for the token_counter_function * check key == "content" * throw error on invalid detail-type, ignore type-warning. * fix imports	2025-05-01 23:34:37 -07:00
Ruperto A. Martinez	298a3574f4	Add supports_pdf_input: true to Claude 3.7 bedrock models (#9917) * Add supports_pdf_input: true to Claude 3.7 bedrock models * update unit test --------- Co-authored-by: RupertoXTI <rmartinez@xtillion.com>	2025-05-01 14:56:54 -07:00
Krish Dholakia	33ead69c0a	Support checking provider `/models` endpoints on proxy `/v1/models` endpoint (#9958) * feat(utils.py): support global flag for 'check_provider_endpoints' enables setting this for `/models` on proxy * feat(utils.py): add caching to 'get_valid_models' Prevents checking endpoint repeatedly * fix(utils.py): ensure mutations don't impact cached results * test(test_utils.py): add unit test to confirm cache invalidation logic * feat(utils.py): get_valid_models - support passing litellm params dynamically Allows for checking endpoints based on received credentials * test: update test * feat(model_checks.py): pass router credentials to get_valid_models - ensures it checks correct credentials * refactor(utils.py): refactor for simpler functions * fix: fix linting errors * fix(utils.py): fix test * fix(utils.py): set valid providers to custom_llm_provider, if given * test: update test * fix: fix ruff check error	2025-04-14 23:23:20 -07:00
Ishaan Jaff	f9ce754817	[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning (#9923) * add supports_reasoning for xai models * add "supports_reasoning": true for o1 series models * add supports_reasoning util * add litellm.supports_reasoning * add supports reasoning for claude 3-7 models * add deepseek as supports reasoning * test_supports_reasoning * add supports reasoning to model group info * add supports_reasoning * docs supports reasoning * fix supports_reasoning test * "supports_reasoning": false, * fix test * supports_reasoning	2025-04-11 17:56:04 -07:00
Krish Dholakia	ccbac691e5	Support discovering gemini, anthropic, xai models by calling their `/v1/model` endpoint (#9530) * fix: initial commit for adding provider model discovery to gemini * feat(gemini/): add model discovery for gemini/ route * docs(set_keys.md): update docs to show you can check available gemini models as well * feat(anthropic/): add model discovery for anthropic api key * feat(xai/): add model discovery for XAI enables checking what models an xai key can call * ci: bump ci config yml * fix(topaz/common_utils.py): fix linting error * fix: fix linting error for python38	2025-03-27 22:50:48 -07:00
Krish Dholakia	c0845fec1f	Add OpenAI gpt-4o-transcribe support (#9517) * refactor: introduce new transformation config for gpt-4o-transcribe models * refactor: expose new transformation configs for audio transcription * ci: fix config yml * feat(openai/transcriptions): support provider config transformation on openai audio transcriptions allows gpt-4o and whisper audio transformation to work as expected * refactor: migrate fireworks ai + deepgram to new transform request pattern * feat(openai/): working support for gpt-4o-audio-transcribe * build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map * build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions` * fix(get_supported_openai_params.py): fix return * refactor(deepgram/): migrate unit test to deepgram handler * refactor: cleanup unused imports * fix(get_supported_openai_params.py): fix linting error * test: update test	2025-03-26 23:10:25 -07:00
Ishaan Jaff	1d7accce9e	test_supports_web_search	2025-03-22 13:49:35 -07:00
Krrish Dholakia	8ef9129556	fix(types/utils.py): support openai 'file' message type Closes https://github.com/BerriAI/litellm/issues/9365	2025-03-19 23:13:51 -07:00
Krish Dholakia	ab7c4d1a0e	Litellm dev bedrock anthropic 3 7 v2 (#8843) * feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_Content transformation Closes https://github.com/BerriAI/litellm/issues/8777 * fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7 Resolves https://github.com/BerriAI/litellm/issues/8777 * feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock * fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py simpler logic * feat(bedrock/): fix streaming to return blocks in consistent format * fix: fix linting error * test: fix test * feat(factory.py): fix bedrock thinking block translation on tool calling allows passing the thinking blocks back to bedrock for tool calling * fix(types/utils.py): don't exclude provider_specific_fields on model dump ensures consistent responses * fix: fix linting errors * fix(convert_dict_to_response.py): pass reasoning_content on root * fix: test * fix(streaming_handler.py): add helper util for setting model id * fix(streaming_handler.py): fix setting model id on model response stream chunk * fix(streaming_handler.py): fix linting error * fix(streaming_handler.py): fix linting error * fix(types/utils.py): add provider_specific_fields to model stream response * fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response * fix(streaming_handler.py): fix check * fix: fix test * fix(types/utils.py): ensure messages content is always openai compatible * fix(types/utils.py): fix delta object to always be openai compatible only introduce new params if variable exists * test: fix bedrock nova tests * test: skip flaky test * test: skip flaky test in ci/cd	2025-02-26 16:05:33 -08:00
Krish Dholakia	09462ba80c	Add cohere v2/rerank support (#8421) (#8605) * Add cohere v2/rerank support (#8421) * Support v2 endpoint cohere rerank * Add tests and docs * Make v1 default if old params used * Update docs * Update docs pt 2 * Update tests * Add e2e test * Clean up code * Use inheritence for new config * Fix linting issues (#8608) * Fix cohere v2 failing test + linting (#8672) * Fix test and unused imports * Fix tests * fix: fix linting errors * test: handle tgai instability * fix: skip service unavailable err * test: print logs for unstable test * test: skip unreliable tests --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com>	2025-02-22 22:25:29 -08:00
Krish Dholakia	251467a525	add bedrock llama vision support + cohere / infinity rerank - 'return_documents' support (#8684) * build(model_prices_and_context_window.json): mark bedrock llama as supporting vision based on docs * Add price for Cerebras llama3.3-70b (#8676) * docs(readme.md): fix contributing docs point people to new mock directory testing structure s/o @vibhavbhat * build: update contributing readme * docs(readme.md): improve docs * docs(readme.md): cleanup readme on tests/ * docs(README.md): cleanup doc * feat(infinity/): support returning documents when return_documents=True * test(test_rerank.py): add e2e testing for cohere rerank * fix: fix linting errors * fix(together_ai/): fix together ai transformation * fix: fix linting error * fix: fix linting errors * fix: fix linting errors * test: mark cohere as flaky * build: fix model supports check * test: fix test * test: mark flaky test * fix: fix test * test: fix test --------- Co-authored-by: Yury Koleda <fut.wrk@gmail.com>	2025-02-20 21:23:54 -08:00
Krish Dholakia	2340f1b31f	Pass router tags in request headers - `x-litellm-tags` (#8609) * feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header allow tag based routing + spend tracking via request headers * docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking * docs(tag_routing.md): add to docs * fix(utils.py): only pass str values for openai metadata param * fix(utils.py): drop non-str values for metadata param to openai preview-feature, otel span was being sent in	2025-02-18 08:26:22 -08:00
Ishaan Jaff	2753de1458	(Bug Fix + Better Observability) - BudgetResetJob: (#8562) * use class ResetBudgetJob * refactor reset budget job * update reset_budget job * refactor reset budget job * fix LiteLLM_UserTable * refactor reset budget job * add telemetry for reset budget job * dd - log service success/failure on DD * add detailed reset budget reset info on DD * initialize_scheduled_background_jobs * refactor reset budget job * trigger service failure hook when fails to reset a budget for team, key, user * fix resetBudgetJob * unit testing for ResetBudgetJob * test_duration_in_seconds_basic * testing for triggering service logging * fix logs on test teams fail * remove unused imports * fix import duration in s * duration_in_seconds	2025-02-15 16:13:08 -08:00
Krish Dholakia	ce3ead6f91	Log applied guardrails on LLM API call (#8452) * fix(litellm_logging.py): support saving applied guardrails in logging object allows list of applied guardrails to be logged for proxy admin's knowledge * feat(spend_tracking_utils.py): log applied guardrails to spend logs makes it easy for admin to know what guardrails were applied on a request * ci(config.yml): uninstall posthog from ci/cd * test: fix tests * test: update test	2025-02-10 22:57:30 -08:00
Krish Dholakia	b5850b6b65	Handle azure deepseek reasoning response (#8288) (#8366) * Handle azure deepseek reasoning response (#8288) * Handle deepseek reasoning response * Add helper method + unit test * Fix: Follow infinity api url format (#8346) * Follow infinity api url format * Update test_infinity.py * fix(infinity/transformation.py): fix linting error --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com> Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>	2025-02-07 17:45:51 -08:00
Krish Dholakia	f031926b82	fix(utils.py): handle key error in msg validation (#8325) * fix(utils.py): handle key error in msg validation * Support running Aim Guard during LLM call (#7918) * support running Aim Guard during LLM call * Rename header * adjust docs and fix type annotations * fix(timeout.md): doc fix for openai example on dynamic timeouts --------- Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>	2025-02-06 18:13:46 -08:00
Ishaan Jaff	b812286534	(fix) - proxy reliability, ensure duplicate callbacks are not added to proxy (#8067) * refactor _add_callbacks_from_db_config * fix check for _custom_logger_exists_in_litellm_callbacks * move loc of test utils * run ci/cd again * test_add_custom_logger_callback_to_specific_event_with_duplicates_callbacks * fix _custom_logger_class_exists_in_success_callbacks * unit testing for test_add_callbacks_from_db_config * test_custom_logger_exists_in_callbacks_individual_functions * fix config.yml * fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison	2025-01-28 21:01:56 -08:00

41 Commits
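
Commit 5e70c78b94 describes a base_model lookup that checks both 'metadata' and 'litellm_metadata', with 'metadata' taking precedence. A minimal sketch of that precedence rule follows. The function name `get_base_model` and the exact dict shape (`model_info` nested under each metadata key, carrying a `base_model` entry) are assumptions made for illustration; this is not the body of litellm's `_get_base_model_from_metadata()`.

```python
from typing import Optional


def get_base_model(request_kwargs: dict) -> Optional[str]:
    """Hypothetical helper: find base_model in either metadata location."""
    # 'metadata' is checked first so it wins when both are present;
    # 'litellm_metadata' is the fallback described for the Responses API.
    for key in ("metadata", "litellm_metadata"):
        container = request_kwargs.get(key) or {}
        model_info = container.get("model_info") or {}
        base_model = model_info.get("base_model")
        if base_model:
            return base_model
    # Neither location carries a base_model.
    return None
```

Iterating over the two keys in a fixed order keeps the precedence explicit, and the `or {}` guards tolerate keys that are missing or set to None.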