litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-20 23:47:50 +00:00

Author	SHA1	Message	Date
Krish Dholakia	b5850b6b65	Handle azure deepseek reasoning response (#8288 ) (#8366 ) * Handle azure deepseek reasoning response (#8288) * Handle deepseek reasoning response * Add helper method + unit test * Fix: Follow infinity api url format (#8346) * Follow infinity api url format * Update test_infinity.py * fix(infinity/transformation.py): fix linting error --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com> Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>	2025-02-07 17:45:51 -08:00
Krish Dholakia	f651d51f26	Litellm dev 02 07 2025 p2 (#8377 ) * fix(caching_routes.py): mask redis password on `/cache/ping` route * fix(caching_routes.py): fix linting erro * fix(caching_routes.py): fix linting error on caching routes * fix: fix test - ignore mask_dict - has a breakpoint * fix(azure.py): add timeout param + elapsed time in azure timeout error * fix(http_handler.py): add elapsed time to http timeout request makes it easier to debug how long request took before failing	2025-02-07 17:30:38 -08:00
Byron Grogan	5a42be43e0	fix: add azure/o1-2024-12-17 to model_prices_and_context_window.json (#8371)	2025-02-07 16:22:33 -08:00
Krish Dholakia	dfbbf0bde8	fix: dictionary changed size during iteration error (#8327) (#8341) Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com> Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>	2025-02-07 16:20:28 -08:00
Krish Dholakia	5d170162d3	fix(nvidia_nim/embed.py): add 'dimensions' support (#8302 ) * fix(nvidia_nim/embed.py): add 'dimensions' support Fixes https://github.com/BerriAI/litellm/issues/8238 * fix(proxy_Server.py): initialize router redis cache if setup on proxy Fixes https://github.com/BerriAI/litellm/issues/6602 * test: add unit testing for new helper function	2025-02-07 16:19:32 -08:00
Krrish Dholakia	16be203283	build(pyproject.toml): bump version	2025-02-07 09:25:58 -08:00
Nikolaiev Dmytro	346d8a9132	Update deepseek API prices for 2025-02-08 (#8363)	2025-02-07 08:25:35 -08:00
Krrish Dholakia	c4cfd5eb1f	build(ui): updates	2025-02-06 23:25:09 -08:00
Krrish Dholakia	790c6eb02a	bump: version 1.60.6 → 1.60.7	2025-02-06 23:24:38 -08:00
Krrish Dholakia	9f426a6b1a	build(ui/): update ui build	2025-02-06 23:24:25 -08:00
Krish Dholakia	6b8b49451f	Fix azure max retries error (#8340 ) * fix(azure.py): ensure max_retries=0 is respected Fixes https://github.com/BerriAI/litellm/issues/6129 * fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0 * test(test_azure_openai.py): add unit testing for azure_text/ route * fix(azure.py): fix passing max retries on streaming * fix(azure.py): fix azure max retries on async completion + streaming * fix(completion/handler.py): fix azure text async completion + streaming * test(test_azure_openai.py): ensure azure openai max retries always respected * test(test_azure_o_series.py): add testing to ensure max retries always respected * Added gemini providers for 2.0-flash and 2.0-flash lite (#8321) * Update model_prices_and_context_window.json added gemini providers for 2.0-flash and 2.0-flash light * Update model_prices_and_context_window.json fixed URL --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Convert tool use arguments to string before counting tokens (#6989) In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine. * build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing * build(model_prices_and_context_window.json): add gemini commercial rate limits * fix(utils.py): fix linting error * refactor(utils.py): refactor to maintain function size --------- Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com> Co-authored-by: Josh Morrow <josh@jcmorrow.com>	2025-02-06 23:20:48 -08:00
Krish Dholakia	d720744656	Litellm dev 02 06 2025 p3 (#8343 ) * feat(handle_jwt.py): initial commit to allow scope based model access * feat(handle_jwt.py): allow model access based on token scopes allow admin to control model access from IDP * test(test_jwt.py): add unit testing for scope based model access * docs(token_auth.md): add scope based model access to docs * docs(token_auth.md): update docs * docs(token_auth.md): update docs * build: add gemini commercial rate limits * fix: fix linting error	2025-02-06 23:15:33 -08:00
Krish Dholakia	f87ab251b0	UI Updates (#8345 ) * fix(.globals.css): revert .md hard set caused regression in invitation link display (and possibly other places) * Fix keys not showing on refresh for internal users (#8312) * [Bug] UI: Newly created key does not display on the View Key Page (#8039) - Fixed issue where all keys appeared blank for admin users. - Implemented filtering of data via team settings to ensure all keys are displayed correctly. * Fix: - Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`. - Ensured other teams still follow the original validation rules. * - added some classes in global.css - added text wrap in output of request,response and metadata in index.tsx - fixed styles of table in table.tsx * - added full payload when we open single log entry - added Combined Info Card in index.tsx * fix: keys not showing on refresh for internal user * fixed user id passed as null when keyuser is you (#8271) * fix(user_dashboard.tsx): ensure non admin can't view other keys --------- Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com> Co-authored-by: Jaswanth Karani <karani.jaswanth@gmail.com>	2025-02-06 22:41:20 -08:00
Ishaan Jaff	e3aab50ab3	docs assembly ai	2025-02-06 21:30:36 -08:00
Ishaan Jaff	7739be340b	fix assembly pass through cost tracking v1.60.6	2025-02-06 21:20:59 -08:00
Ishaan Jaff	229f270dd6	docs assembly ai eu endpoints	2025-02-06 21:13:40 -08:00
Ishaan Jaff	ab761b9dc8	bump: version 1.60.5 → 1.60.6	2025-02-06 21:06:07 -08:00
Ishaan Jaff	778bbcdd9c	fix test_get_model_info_gemini	2025-02-06 21:05:47 -08:00
Ishaan Jaff	7706ff1f1e	ui new build	2025-02-06 18:31:21 -08:00
Ishaan Jaff	65c91cbbbc	(QA+UI) - e2e flow for adding assembly ai passthrough endpoints (#8337 ) * add initial test for assembly ai * start using PassthroughEndpointRouter * migrate to lllm passthrough endpoints * add assembly ai as a known provider * fix PassthroughEndpointRouter * fix set_pass_through_credentials * working EU request to assembly ai pass through endpoint * add e2e test assembly * test_assemblyai_routes_with_bad_api_key * clean up pass through endpoint router * e2e testing for assembly ai pass through * test assembly ai e2e testing * delete assembly ai models * fix code quality * ui working assembly ai api base flow * fix install assembly ai * update model call details with kwargs for pass through logging * fix tracking assembly ai model in response * _handle_assemblyai_passthrough_logging * fix test_initialize_deployment_for_pass_through_unsupported_provider * TestPassthroughEndpointRouter * _get_assembly_transcript * fix assembly ai pt logging tests * fix assemblyai_proxy_route * fix _get_assembly_region_from_url	2025-02-06 18:27:54 -08:00
Ishaan Jaff	5dcb87a88b	(bug fix router.py) - safely handle `choices=[]` on llm responses (#8342) * test fix test_router_with_empty_choices * fix _should_raise_content_policy_error	2025-02-06 18:22:08 -08:00
Ishaan Jaff	d2fec8bf13	databricks/meta-llama-3.3-70b-instruct	2025-02-06 18:21:56 -08:00
Krish Dholakia	f031926b82	fix(utils.py): handle key error in msg validation (#8325 ) * fix(utils.py): handle key error in msg validation * Support running Aim Guard during LLM call (#7918) * support running Aim Guard during LLM call * Rename header * adjust docs and fix type annotations * fix(timeout.md): doc fix for openai example on dynamic timeouts --------- Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>	2025-02-06 18:13:46 -08:00
Anton Abilov	fac1d2ccef	Fixed meta llama 3.3 key for Databricks API (#8093) See correct key reference here: https://docs.databricks.com/en/machine-learning/model-serving/foundation-model-overview.html#pay-per-token	2025-02-06 18:05:49 -08:00
Ishaan Jaff	b535c9bdc0	(Bug Fix - Langfuse) - fix for when model response has `choices=[]` (#8339 ) * refactor _get_langfuse_input_output_content * test_langfuse_logging_completion_with_malformed_llm_response * fix _get_langfuse_input_output_content * fixes for langfuse linting * unit testing for get chat/text content for langfuse * fix _should_raise_content_policy_error	2025-02-06 18:02:26 -08:00
Rok Benko	3ec9c28fb7	Update local_debugging.md (#8308)	2025-02-06 16:19:32 -08:00
Wanis Elabbar	15ac5f3c32	Fix pricing for Gemini 2.0 Flash 001 (#8320) Gemini 2.0 Flash (price / price with Batch API): 1M input tokens $0.15 / $0.075; 1M input audio tokens $1.00 / $0.50; 1M output text tokens $0.60 / $0.30. https://cloud.google.com/vertex-ai/generative-ai/pricing#token-based-pricing	2025-02-06 16:17:29 -08:00
Luis Sanchez	1b4f0f7192	Add aistudio GEMINI 2.0 to model_prices_and_context_window.json (#8335)	2025-02-06 16:16:54 -08:00
exiao	85491a0bab	Add Arize Cookbook for Turning on LiteLLM Proxy (#8336) * Add files via upload * Update arize_integration.md	2025-02-06 16:16:28 -08:00
Krish Dholakia	bcfa641b81	Add gemini-2.0-flash pricing + model info (#8303 ) * add gemini-2.0-flash-001 (#8289) * build(model_prices_and_context_window.json): add gemini-2.0-flash-001 to model cost map Adds new gemini model with token based pricing to model cost map --------- Co-authored-by: kushagro <kush@orby.ai>	2025-02-05 20:49:26 -08:00
Tyler Wagner	5e921804b9	fix: docs links (#8294) Fixed the docs links in the enterprise md.	2025-02-05 20:41:20 -08:00
Krish Dholakia	b4e5c0de69	Improve rpm check on keys (#8301 ) * fix(parallel_request_limiter.py): initial commit that solves the rpm limit check on keys Fixes https://github.com/BerriAI/litellm/issues/6938 * fix(parallel_request_limiter.py): simpler approach - just increment RPM in pre call hook instead of on success * fix(parallel_request_limiter.py): pass testing * fix: fix linting error * fix(parallel_request_limiter.py): fix parallel request check for keys	2025-02-05 20:23:08 -08:00
Krish Dholakia	7e873538f6	Fix edit team on ui (#8295 ) * fix(columns.tsx): fix request logs team column to indicate the value is the alias not the id * fix(team_info.tsx): add edit team logic to team info page * fix(team_info.tsx): re-enable updating team settings on UI Fixes https://github.com/BerriAI/litellm/issues/8281 * fix(team_info.tsx): fix save changes on team update * fix(teams.tsx): allow edit button to still act as a quick action button -> drop user into settings page for team * test(config.yml): run dev ui during testing make sure no ui regressions are pushed on main * build: update ci/cd * ci(config.yml): fix test * ci: fix ci * ci: update * ci: fix * ci: another attempt to get nvm working in ci/cd * ci: fix ci * ci: test update * ci: test update 2 * ci: test 3 * fix(team_info.tsx): fix linting error	2025-02-05 20:13:17 -08:00
Krish Dholakia	443ae55904	Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling (#8292 ) * fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in Fixes https://github.com/BerriAI/litellm/issues/8241 * fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls * refactor(gpt_transformation.py): refactor out json schema converstion to base config keeps logic consistent across providers * fix(o_series_transformation.py): support o3 mini native streaming Fixes https://github.com/BerriAI/litellm/issues/8274 * fix(gpt_transformation.py): remove unused variables * test: update test	2025-02-05 19:38:58 -08:00
Ishaan Jaff	515598114c	bump: version 1.60.4 → 1.60.5 v1.60.5	2025-02-05 19:02:45 -08:00
Ishaan Jaff	03f738eff6	fix test_models_by_provider	2025-02-05 19:01:00 -08:00
Ishaan Jaff	818792228c	(Refactor) - migrate bedrock invoke to `BaseLLMHTTPHandler` class (#8290 ) * initial transform for invoke * invoke transform_response * working - able to make request * working get_complete_url * working - invoke now runs on llm_http_handler * fix unused imports * track litellm overhead ms * working stream request * sign_request transform * sign_request update * use has_async_custom_stream_wrapper property * use get_async_custom_stream_wrapper in base llm http handler * fix make_call in invoke handler * fix invoke with streaming get_async_custom_stream_wrapper * working bedrock async streaming with invoke * fix make call handler for bedrock * test_all_model_configs * fix test_bedrock_custom_prompt_template * sync streaming for bedrock invoke * fix _add_stream_param_to_request_body * test_async_text_completion_bedrock * fix transform_request * fix get_supported_openai_params * fix test supports tool choice * fix test_supports_tool_choice * add unit test coverage for bedrock invoke transform * fix location of transformation files * update import loc * fix bedrock invoke unit tests * fix import for max completion tokens	2025-02-05 18:58:55 -08:00
Ishaan Jaff	e41bc5f32b	fixed issues #8126 and #8127 (#8275) (#8299) Co-authored-by: Jaswanth Karani <karani.jaswanth@gmail.com>	2025-02-05 18:52:58 -08:00
Ishaan Jaff	b76b380bc8	fix add back sambanova/Qwen2.5-72B-Instruct	2025-02-05 18:44:17 -08:00
Ishaan Jaff	ffd890e744	add assembly ai cost tracking (#8298)	2025-02-05 18:43:37 -08:00
Ishaan Jaff	e42fcf4d03	(UI) - Add Assembly AI provider to UI (#8297) * add assembly ai to ui * specify api base for assembly ai	2025-02-05 18:42:51 -08:00
Ishaan Jaff	6cef115bb0	(Security fix) - remove code block that inserts master key hash into DB (#8268 ) * remove code block upserting master key hash to db * run test to check if key upserted into db * run ci/cd again * litellm_proxy_security_tests * litellm_proxy_security_tests * run prisma entrypoint * ci/cd run again * fix test master key not in db	2025-02-05 17:25:42 -08:00
Zhaohan Dong	88e7046165	Added compatibility guidance, etc. for xAI Grok model (#8282 ) * Various updates Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com> * Update xAI branding Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com> * Revert changes Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com> --------- Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>	2025-02-05 17:21:47 -08:00
waterstark	fbe3c58372	Added a guide for users who want to use LiteLLM with AI/ML API. (#7058 ) * Added a guide for users who want to use LiteLLM with AI/ML. * Minor changes * Minor changes * Fix sidebars.js --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>	2025-02-05 06:20:35 -08:00
Krish Dholakia	8d3a942fbd	Litellm staging (#8270 ) * fix(opik.py): cleanup * docs(opik_integration.md): cleanup opik integration docs * fix(redact_messages.py): fix redact messages check header logic ensures stringified bool value in header is still asserted to true allows dynamic message redaction * feat(redact_messages.py): support `x-litellm-enable-message-redaction` request header allows dynamic message redaction v1.60.4 v1.60.2-dev1	2025-02-04 22:35:48 -08:00
Krish Dholakia	3c813b3a87	Fix deepseek calling - refactor to use base_llm_http_handler (#8266 ) * refactor(deepseek/): move deepseek to base llm http handler Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457 * fix(gpt_transformation.py): support stream parsing for gpt-like calls * test(test_deepseek_completion.py): add async streaming test * fix(gpt_transformation.py): fix import * fix(gpt_transformation.py): return full api base and content type	2025-02-04 22:30:00 -08:00
Ishaan Jaff	51b9a02615	run ci/cd again	2025-02-04 22:19:57 -08:00
Ishaan Jaff	e3b0fd7061	bump: version 1.60.3 → 1.60.4	2025-02-04 22:03:18 -08:00
Krish Dholakia	4e34fc3bf8	[BETA] Support OIDC `role` based access to proxy (#8260 ) * feat(proxy/_types.py): add new jwt field params allows users + services to auth into proxy * feat(handle_jwt.py): allow team role proxy access allows proxy admin to set allowed team roles * fix(proxy/_types.py): add 'routes' to role based permissions allow proxy admin to restrict what routes a team can access easily * feat(handle_jwt.py): support more flexible role based route access v2 on role based 'allowed_routes' * test(test_jwt.py): add unit test for rbac for proxy routes * feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True` * docs(token_auth.md): add documentation on controlling model access via OIDC Roles * test: increase time delay before retrying * test: handle model overloaded for test	2025-02-04 21:59:39 -08:00
Krrish Dholakia	7f06b88192	fix(internal_user_endpoints.py): fix try-except for team not in db	2025-02-04 21:57:43 -08:00

19382 Commits