litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 00:48:01 +00:00

Author	SHA1	Message	Date
Ishaan Jaffer	7f79abb552	test_aastreaming_tool_calls_valid_json_str	2025-10-31 19:05:31 -07:00
Ishaan Jaffer	33371d18f4	test fix claude-sonnet-4-5-20250929	2025-10-28 19:05:13 -07:00
Ishaan Jaffer	1b49dba1dd	fix claude-sonnet-4-5	2025-10-28 17:37:08 -07:00
Ishaan Jaffer	6350c20d9f	test_azure_streaming_and_function_calling	2025-10-25 12:19:26 -07:00
Ishaan Jaffer	cff70ece5a	test_azure_astreaming_and_function_calling	2025-10-25 12:18:53 -07:00
Ishaan Jaffer	0bedf1c0a7	fix tests	2025-10-25 10:19:24 -07:00
Ishaan Jaffer	a6ea8a5984	test_openai_stream_options_call	2025-09-27 15:01:25 -07:00
Ishaan Jaffer	30a3795e78	test_vertex_ai_stream	2025-09-27 12:42:37 -07:00
Ishaan Jaffer	919d680e18	test_completion_azure_function_calling_stream	2025-09-27 12:38:18 -07:00
Alexsander Hamir	eaa04cd8ce	fix: use fastuuid helper (#14903 ) * fix: use fastuuid helper across the codebase First batch of changes, simple drop in replacement. * second batch of changes * fixed: script mistake on helper file	2025-09-25 15:47:01 -07:00
Krrish Dholakia	d05f58721e	test: remove end of life model from tests	2025-09-09 21:01:45 -07:00
Ishaan Jaff	c709d7505d	test fix: test_parallel_streaming_requests	2025-09-06 16:07:30 -07:00
Krrish Dholakia	aaf9c38a10	test: skip test - ran out of credits	2025-08-14 15:01:26 -07:00
Krish Dholakia	6afaf5721a	[Fix] Streaming - consistent 'finish_reason' chunk index (#13560 ) * feat(model_response_utils.py): new function to check if modelresponsestream is empty used for checking https://github.com/BerriAI/litellm/issues/13348 * fix(streaming_handler.py): skip chunk if empty Fixes https://github.com/BerriAI/litellm/issues/13348 * fix(streaming_handler.py): add is_empty logic to async flow	2025-08-12 23:21:57 -07:00
Ishaan Jaff	984f91f4f5	test_completion_gemini_stream	2025-08-07 13:24:00 -07:00
Ishaan Jaff	eeed03a78f	test fix: gcp deprecated gemini-1.5-flash	2025-08-06 08:43:45 -07:00
Krish Dholakia	324cfe8bdc	fix(streaming_handler.py): include cost in streaming usage object (#13319 ) Fixes https://github.com/BerriAI/litellm/issues/12689	2025-08-05 18:38:31 -07:00
Krrish Dholakia	378db1b62d	test: remove o1-preview	2025-07-28 17:47:57 -07:00
Krish Dholakia	1737cf4257	VertexAI - camelcase optional params for image generation + Anthropic - streaming, always ensure assistant role set on only first chunk (#12889 ) * fix(vertex_ai/image_generation): transform `_` param to camelcase Fixes https://github.com/BerriAI/litellm/issues/12690 * test(test_vertex_image_generation.py): add unit tests * fix(streaming_handler.py): assert only 1 assistant chunk in stream Fixes https://github.com/BerriAI/litellm/issues/12616 * fix(streaming_handler.py): fix check	2025-07-27 10:09:43 -07:00
Ishaan Jaff	bf300f8ca7	Revert "Litellm dev 07 21 2025 p1 (#12848 )" This reverts commit `e4e10aa4ed`.	2025-07-22 18:28:36 -07:00
Krish Dholakia	e4e10aa4ed	Litellm dev 07 21 2025 p1 (#12848 ) * fix(main.py): fix async retryer Fixes https://github.com/BerriAI/litellm/issues/12830 * fix(forward_clientside_headers_by_model_group.py): filter out 'content-type' from forwardable headers clientside content-type != proxy content type, can cause requests to hang * test(tests/): update tests	2025-07-21 22:09:39 -07:00
Ishaan Jaff	4a7b9dee5f	test fix - anthropic deprecated claude 2	2025-07-21 18:22:39 -07:00
Ishaan Jaff	437f4765b4	test_completion_mistral_api_mistral_large_function_call_with_streaming	2025-07-03 14:58:28 -07:00
Krish Dholakia	c0319d0d01	Litellm dev fix gemini web search tracking (#12288 ) * feat(stream_chunk_builder_utils.py): correctly return web_search_requests on stream chunk builder * fix(types/utils.py): handle prompttokendetails * fix(stream_chunk_builder_utils.py): fix ruff check error * test: try-except rate limit error * fix: fix import	2025-07-03 12:27:14 -07:00
Krrish Dholakia	a198d4a39f	test: change mistral model service tier exceeded	2025-07-02 21:11:02 -07:00
Krish Dholakia	ccc085faee	Merge in - Gemini streaming - thinking content parsing - return in `reasoning_content` (#11298 ) * fix(base_routing_strategy.py): compress increments to redis - reduces write ops * fix(base_routing_strategy.py): make get and reset in memory keys atomic * fix(base_routing_strategy.py): don't reset keys - causes discrepency on subsequent requests to instance * fix(parallel_request_limiter.py): retrieve values of previous slots from cache more accurate rate limiting with sliding window * fix: fix test * fix: fix linting error * fix(gemini/): fix streaming handler for function calling Closes https://github.com/BerriAI/litellm/pull/11294 * fix: fix linting error * test: update test * fix(vertex_and_google_ai_studio_gemini.py): return none on skipped chunk * fix(streaming_handler.py): skip none chunks on async streaming	2025-06-02 23:14:38 -07:00
Akim Tsvigun	acaa80294c	Integration with Nebius AI Studio added (#11143 ) * integration with Nebius AI Studio added * Merged with main * Reviewer's comments resolved * spelling error fixed * accidental change reverted	2025-05-27 11:05:22 -07:00
Ishaan Jaff	580e221000	fix ai21 test	2025-05-07 21:26:35 -07:00
Krrish Dholakia	66cf75cd5d	test: handle internal server errors	2025-05-01 16:47:30 -07:00
Krrish Dholakia	cec138c47e	test: remove redundant tests	2025-05-01 16:46:21 -07:00
Krish Dholakia	1ea046cc61	test: update tests to new deployment model (#10142 ) * test: update tests to new deployment model * test: update model name * test: skip cohere rbac issue test * test: update test - replace gpt-4o model	2025-04-18 14:22:12 -07:00
Ishaan Jaff	b3f37b860d	test fix azure deprecated mistral ai	2025-04-15 21:42:40 -07:00
Krish Dholakia	f899b828cf	Support openrouter `reasoning_content` on streaming (#9094 ) * feat(convert_dict_to_response.py): support openrouter format of reasoning content * fix(transformation.py): fix openrouter streaming with reasoning content Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962 * fix: fix type error	2025-03-09 20:03:59 -07:00
Krrish Dholakia	320cb1d51a	docs: cleanup 'signature_delta' from docs	2025-03-05 23:53:38 -08:00
Krish Dholakia	ec4f665e29	Return `signature` on anthropic streaming + migrate to `signature` field instead of `signature_delta` [MINOR bump] (#9021 ) * Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797) Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * test: update test to enforce signature found * feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic * fix: fix linting error --------- Co-authored-by: Martin Krasser <krasserm@googlemail.com>	2025-03-05 19:33:54 -08:00
Krish Dholakia	3de4209569	fix caching on main branch (#8858 ) * fix(streaming_handler.py): fix is delta empty check to handle empty str * fix(streaming_handler.py): fix delta chunk on final response	2025-02-26 19:16:34 -08:00
Krish Dholakia	ab7c4d1a0e	Litellm dev bedrock anthropic 3 7 v2 (#8843 ) * feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_Content transformation Closes https://github.com/BerriAI/litellm/issues/8777 * fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7 Resolves https://github.com/BerriAI/litellm/issues/8777 * feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock * fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py simpler logic * feat(bedrock/): fix streaming to return blocks in consistent format * fix: fix linting error * test: fix test * feat(factory.py): fix bedrock thinking block translation on tool calling allows passing the thinking blocks back to bedrock for tool calling * fix(types/utils.py): don't exclude provider_specific_fields on model dump ensures consistent responses * fix: fix linting errors * fix(convert_dict_to_response.py): pass reasoning_content on root * fix: test * fix(streaming_handler.py): add helper util for setting model id * fix(streaming_handler.py): fix setting model id on model response stream chunk * fix(streaming_handler.py): fix linting error * fix(streaming_handler.py): fix linting error * fix(types/utils.py): add provider_specific_fields to model stream response * fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response * fix(streaming_handler.py): fix check * fix: fix test * fix(types/utils.py): ensure messages content is always openai compatible * fix(types/utils.py): fix delta object to always be openai compatible only introduce new params if variable exists * test: fix bedrock nova tests * test: skip flaky test * test: skip flaky test in ci/cd	2025-02-26 16:05:33 -08:00
Krish Dholakia	017c482d7b	fix(o_series_transformation.py): fix optional param check for o-serie… (#8787 ) * fix(o_series_transformation.py): fix optional param check for o-series models o3-mini and o-1 do not support parallel tool calling * fix(utils.py): support 'drop_params' for 'thinking' param across models allows switching to older claude versions (or non-anthropic models) and param to be safely dropped * fix: fix passing thinking param in optional params allows dropping thinking_param where not applicable * test: update old model * fix(utils.py): fix linting errors * fix(main.py): add param to acompletion	2025-02-26 12:26:55 -08:00
Krish Dholakia	142b195784	Add anthropic thinking + reasoning content support (#8778 ) * feat(anthropic/chat/transformation.py): add anthropic thinking param support * feat(anthropic/chat/transformation.py): support returning thinking content for anthropic on streaming responses * feat(anthropic/chat/transformation.py): return list of thinking blocks (include block signature) allows usage in tool call responses * fix(types/utils.py): extract and map reasoning_content from anthropic as content str * test: add testing to ensure thinking_blocks are returned at the root * fix(anthropic/chat/handler.py): return thinking blocks on streaming - include signature * feat(factory.py): handle anthropic thinking blocks translation if in assistant response * test: handle openai internal instability * test: handle openai audio instability * ci: pin anthropic dep * test: handle openai audio instability * fix: fix linting error * refactor(anthropic/chat/transformation.py): refactor function to remain <50 LOC * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error	2025-02-24 21:54:30 -08:00
Ishaan Jaff	46469c6087	set timeout for deepseek testing	2025-01-27 21:25:28 -08:00
Krish Dholakia	6bafdbc546	Litellm dev 01 25 2025 p4 (#8006 ) * feat(main.py): use asyncio.sleep for mock_Timeout=true on async request adds unit testing to ensure proxy does not fail if specific Openai requests hang (e.g. recent o1 outage) * fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming Fixes https://github.com/BerriAI/litellm/issues/7942 * Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming" This reverts commit 7a052a64e3642616405e71350627e2e4f66615b4. * fix(deepseek-r-1): return reasoning_content as a top-level param ensures compatibility with existing tools that use it * fix: fix linting error	2025-01-26 08:01:05 -08:00
Krish Dholakia	76795dba39	Deepseek r1 support + watsonx qa improvements (#7907 ) * fix(types/utils.py): support returning 'reasoning_content' for deepseek models Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218 * fix(convert_dict_to_response.py): return deepseek response in provider_specific_field allows for separating openai vs. non-openai params in model response * fix(utils.py): support 'provider_specific_field' in delta chunk as well allows deepseek reasoning content chunk to be returned to user from stream as well Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218 * fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route * fix(watsonx/): fix watsonx_text/ route with space id * fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls * fix(utils.py): rename to '..fields' * fix: fix linting errors * fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default response from being non-oai compatible * fix: cleanup unused imports * docs(deepseek.md): add docs for deepseek reasoning model	2025-01-21 23:13:15 -08:00
Krish Dholakia	c4ff0b6487	refactor: make bedrock image transformation requests async (#7840 ) * refactor: initial commit for using separate sync vs. async transformation routes for bedrock ensures no blocking calls e.g. when converting image url to b64 * perf(converse_transformation.py): make bedrock converse transformation async asyncify's the bedrock message transformation - useful for handling image urls for bedrock * fix(converse_handler.py): fix logging for async streaming * style: cleanup unused imports	2025-01-17 20:14:15 -08:00
Krrish Dholakia	32538f09fc	test: cleanup test	2025-01-05 14:18:29 -08:00
Ishaan Jaff	137879ffea	vertex testing use pathrise-convert-1606954137718	2025-01-05 14:00:17 -08:00
Krish Dholakia	0120176541	Litellm dev 12 30 2024 p2 (#7495 ) * test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model * fix(base_llm_unit_tests.py): handle azure o1 preview response format tests skip as o1 on azure doesn't support tool calling yet * fix: initial commit of azure o1 handler using openai caller simplifies calling + allows fake streaming logic alr. implemented for openai to just work * feat(azure/o1_handler.py): fake o1 streaming for azure o1 models azure does not currently support streaming for o1 * feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info enables user to toggle on when azure allows o1 streaming without needing to bump versions * style(router.py): remove 'give feedback/get help' messaging when router is used Prevents noisy messaging Closes https://github.com/BerriAI/litellm/issues/5942 * fix(types/utils.py): handle none logprobs Fixes https://github.com/BerriAI/litellm/issues/328 * fix(exception_mapping_utils.py): fix error str unbound error * refactor(azure_ai/): move to openai_like chat completion handler allows for easy swapping of api base url's (e.g. ai.services.com) Fixes https://github.com/BerriAI/litellm/issues/7275 * refactor(azure_ai/): move to base llm http handler * fix(azure_ai/): handle differing api endpoints * fix(azure_ai/): make sure all unit tests are passing * fix: fix linting errors * fix: fix linting errors * fix: fix linting error * fix: fix linting errors * fix(azure_ai/transformation.py): handle extra body param * fix(azure_ai/transformation.py): fix max retries param handling * fix: fix test * test(test_azure_o1.py): fix test * fix(llm_http_handler.py): support handling azure ai unprocessable entity error * fix(llm_http_handler.py): handle sync invalid param error for azure ai * fix(azure_ai/): streaming support with base_llm_http_handler * fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai * fix: fix linting errors * fix(llm_http_handler.py): fix linting error * fix(azure_ai/): handle cohere tool call invalid index param error	2025-01-01 18:57:29 -08:00
Krish Dholakia	31ace870a2	Litellm dev 12 28 2024 p1 (#7463 ) * refactor(utils.py): migrate amazon titan config to base config * refactor(utils.py): refactor bedrock meta invoke model translation to use base config * refactor(utils.py): move bedrock ai21 to base config * refactor(utils.py): move bedrock cohere to base config * refactor(utils.py): move bedrock mistral to use base config * refactor(utils.py): move all provider optional param translations to using a config * docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy * fix(utils.py): handle scenario where custom llm provider is none / empty * fix: fix get config * test(test_otel_load_tests.py): widen perf margin * fix(utils.py): fix get provider config check to handle custom llm's * fix(utils.py): fix check	2024-12-28 20:26:00 -08:00
Krish Dholakia	39dabb2e89	Litellm dev 12 24 2024 p4 (#7407 ) * fix(invoke_handler.py): fix mock response iterator to handle tool calling returns tool call if returned by model response * fix(prometheus.py): add new 'tokens_by_tag' metric on prometheus allows tracking 'token usage' by task * feat(prometheus.py): add input + output token tracking by tag * feat(prometheus.py): add tag based deployment failure tracking allows admin to track failure by use-case	2024-12-24 20:24:06 -08:00
Krish Dholakia	3671829e39	Complete 'requests' library removal (#7350 ) * refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure * refactor(watsonx/completion/handler.py): move to using base llm http handler removes 'requests' library usage * fix(watsonx_text/transformation.py): fix result transformation migrates to transformation.py, for usage with base llm http handler * fix(streaming_handler.py): migrate watsonx streaming to transformation.py ensures streaming works with base llm http handler * fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic * fix(watsonx/): fix chat route post completion route refactor * refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well * refactor(base.py): remove requests library usage from litellm * build(pyproject.toml): remove requests library usage * fix: fix linting errors * fix: fix linting errors * fix(types/utils.py): fix validation errors for modelresponsestream * fix(replicate/handler.py): fix linting errors * fix(litellm_logging.py): handle modelresponsestream object * fix(streaming_handler.py): fix modelresponsestream args * fix: remove unused imports * test: fix test * fix: fix test * test: fix test * test: fix tests * test: fix test * test: fix patch target * test: fix test	2024-12-22 07:21:25 -08:00
Krish Dholakia	70a9ea99f2	Controll fallback prompts client-side (#7334 ) * feat(router.py): support passing model-specific messages in fallbacks * docs(routing.md): separate router timeouts into separate doc allow for 1 fallbacks doc (across proxy/router) * docs(routing.md): cleanup router docs * docs(reliability.md): cleanup docs * docs(reliability.md): cleaned up fallback doc just have 1 doc across sdk/proxy simplifies docs * docs(reliability.md): add setting model-specific fallback prompts * fix: fix linting errors * test: skip test causing openai rate limit errros * test: fix test * test: run vertex test first to catch error	2024-12-20 19:09:53 -08:00

70 Commits