Commit Graph

606 Commits

Author SHA1 Message Date
Krish Dholakia ba39f9e360 Helicone base url support + fix for embedding cache hits on str input (#11211)
* fix(helicone.py): add helicone api base support

Fixes https://github.com/BerriAI/litellm/issues/10825

* test: add unit test for cache hit response on embedding calls

* fix(caching_handler.py): fix handling cache hit on embedding when input is string

Fixes LIT-197

* docs(helicone_integration.md): document new helicone api base param
2025-05-28 22:02:55 -07:00
Krish Dholakia 7072466775 VertexAI - codeExecution tool support + anyOf handling (#11195)
* fix(vertex_and_google_ai_studio_gemini.py): handle both camel case and underscores in the tool for vertex ai code execution

support vertex ai code execution

* docs(vertex.md): add code execution example to vertex ai

* fix(vertex_ai/common_utils.py): when anyof in field, just select anyof - don't include other k,v pairs - vertex throws error

Fixes https://github.com/BerriAI/litellm/issues/11164

* fix(common_utils.py): add title field inside anyof - to retain some description

Addresses https://github.com/BerriAI/litellm/issues/11164#issuecomment-2914728385
2025-05-27 21:23:14 -07:00
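Note on 7072466775: the anyOf handling described in this commit can be sketched as below. This is a minimal illustration, assuming the tool parameter schema is a plain JSON-schema-style dict. The helper name `keep_only_anyof` is hypothetical, and copying the parent `title` into each variant is an assumption about how "some description" is retained; it is not taken from the LiteLLM source.

```python
from typing import Any


def keep_only_anyof(field: dict[str, Any]) -> dict[str, Any]:
    """Reduce a schema field to its `anyOf` key when one is present.

    Hypothetical helper for illustration only; not LiteLLM's implementation.
    """
    if "anyOf" not in field:
        return dict(field)  # nothing to strip, return a shallow copy

    title = field.get("title")
    variants = []
    for variant in field["anyOf"]:
        variant = dict(variant)  # copy so the caller's schema is not mutated
        # Assumption: carry the parent's title into variants that lack one,
        # so some description survives once the sibling keys are dropped.
        if title is not None and "title" not in variant:
            variant["title"] = title
        variants.append(variant)

    # Every other key/value pair next to anyOf is intentionally discarded.
    return {"anyOf": variants}


schema_field = {
    "title": "Limit",
    "description": "Maximum number of rows",
    "default": None,
    "anyOf": [{"type": "integer"}, {"type": "null"}],
}
print(keep_only_anyof(schema_field))
```

The sketch returns a new dict rather than editing in place, so the original schema can still be used for providers that accept the extra keys.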
Ishaan Jaff 6c36dc269b test: fix test_vertexai_model_garden_model_completion 2025-05-27 18:51:50 -07:00
Akim Tsvigun acaa80294c Integration with Nebius AI Studio added (#11143)
* integration with Nebius AI Studio added

* Merged with main

* Reviewer's comments resolved

* spelling error fixed

* accidental change reverted
2025-05-27 11:05:22 -07:00
Ishaan Jaff 4d2edc4e7a [Fixes] Aiohttp transport fixes - add handling for aiohttp.ClientPayloadError and ssl_verification settings (#11162)
* fix: AiohttpResponseStream transport

* fix: use AiohttpResponseStream transport by default

* fix: AiohttpResponseStream transport

* fixes: mapping aiohttp exceptions

* fixes: aiohttp rollout

* fixes: add support ssl_verify for aiohttp

* fixes: add support ssl_verify for aiohttp

* fixes: remove duplicates
2025-05-26 21:14:35 -07:00
Krish Dholakia 010a4d44af Fix passing standard optional params (#11124)
* fix(main.py): use processed non-default-params as standard input params for langfuse

Fixes https://github.com/BerriAI/litellm/issues/11072

Fixes https://github.com/BerriAI/litellm/issues/11096

* fix(main.py): rename variable to be more accurate

* test(test_langfuse_e2e_test.py): add router unit test for langfuse e2e testing

Prevent https://github.com/BerriAI/litellm/issues/11072 from happening again

* build: update lock

* fix(utils.py): refactor optional params function

make it easier to get the standardized non default params

* fix(utils.py): improve process non default params function

* fix(main.py): include provider specific params in processed non default params used in logging

ensures user can see any provider specific params on langfuse
2025-05-24 12:12:31 -07:00
Ishaan Jaff 86cdb8382b [Feat] Use aiohttp transport by default - 97% lower median latency (#11097)
* fix: add flag for disabling use_aiohttp_transport

* feat: add _create_async_transport

* feat: fixes for transport

* add httpx-aiohttp

* feat: fixes for transport

* refactor: fixes for transport

* build: fix deps

* fixes: test fixes

* fix: ensure aiohttp does not auto set content type

* test: test fixes

* feat: add LiteLLMAiohttpTransport

* fix: fixes for responses API handling

* test: fixes for responses API handling

* test: fixes for responses API handling

* feat: fixes for transport

* fix: base embedding handler

* test: test_async_http_handler_force_ipv4

* test: fix failing deepeval test

* fix: add YARL for bedrock urls

* fix: issues with transport

* fix: comment out linting issues

* test fix

* test: XAI is unstable

* test: fixes for using respx

* test: XAI fixes

* test: XAI fixes

* test: infinity testing fixes

* docs(config_settings.md): document param

* test: test_openai_image_edit_litellm_sdk

* test: remove deprecated test

* bump respx==0.22.0

* test: test_xai_message_name_filtering

* test: fix anthropic test after bumping httpx

* use n 4 for mapped tests (#11109)

* fix: use 1 session per event loop

* test: test_client_session_helper

* fix: linting error

* fix: resolving GET requests on httpx 0.28.1

* test fixes proxy unit tests

* fix: add ssl verify settings

* fix: proxy unit tests

* fix: refactor

* tests: basic unit tests for aiohttp transports

* tests: fixes xai

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-05-23 22:55:35 -07:00
Tornike Gurgenidze db4183715a feat: add embeddings to CustomLLM (#10980)
* feat: add embeddings to CustomLLM

* feat: add aembedding to custom llm
2025-05-22 22:55:46 -07:00
Krrish Dholakia 469d395177 test: update groq test - change on their end 2025-05-22 15:02:01 -07:00
slytechnical 98e9db340c [Feature] Add supports_computer_use to the model list (#10881)
* Add support for supports_computer_use in model info

* Corrected list of supports_computer_use models

* Further fix computer use compatible claude models, fix existing test that predated supports_computer_use in the model list

* Move computer use test case into existing test_utils file

* Moved tests in to test_utils.py
2025-05-20 17:07:43 -07:00
Krrish Dholakia b122ead5b3 test: update tests 2025-05-20 13:08:47 -07:00
Krrish Dholakia 4e3c8ae94f test: update test due to cohere ssl issues 2025-05-19 20:07:57 -07:00
Krish Dholakia cc626ad3ec Handle openai gpt file data + add openai 'supports_pdf_input' to all vision models + Support bedrock tool cache pointing (#10897)
* fix(openai/gpt_transformation.py): handle missing filename for openai file data call

* fix(openai/gpt_transformation.py): clean handling for sync + async pdf url transformation flows

Fixes https://github.com/BerriAI/litellm/issues/10820

* build(model_prices_and_context_window.json): add 'supports_pdf_input' for all openai models which have 'vision' support

Follows openai guidelines

* feat(bedrock/chat): support cache pointing tool calls on Bedrock

Closes https://github.com/BerriAI/litellm/pull/10613

* fix: fix linting error
2025-05-17 07:29:01 -07:00
Dror Baron 93639df4c3 Feat/support anonymize in aim guardrail (#10757)
* Enable update/delete org members on UI  (#8560)

* feat(organization_endpoints.py): expose new `/organization/delete` endpoint. Cascade org deletion to member, teams and keys

Ensures any org deletion is handled correctly

* test(test_organizations.py): add simple test to ensure org deletion works

* feat(organization_endpoints.py): expose /organization/update endpoint, and define response models for org delete + update

* fix(organizations.tsx): support org delete on UI + move org/delete endpoint to use DELETE

* feat(organization_endpoints.py): support `/organization/member_update` endpoint

Allow admin to update member's role within org

* feat(organization_endpoints.py): support deleting member from org

* test(test_organizations.py): add e2e test to ensure org member flow works

* fix(organization_endpoints.py): fix code qa check

* fix(schema.prisma): don't introduce ondelete:cascade - breaking change

* docs(organization_endpoints.py): document missing params

* support anonymize and deanonymize

* use new response schema

* don't use detected because action already means there are detections

* log to debug

* CR fixes

* lint

* add tests

* use single quotes in deanonymization

* remove engage action case

* set max entities to 100 to prevent memory leak

* add test case for de-anonymization of llm response

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-05-15 22:18:58 -07:00
Ishaan Jaff dc16e47df6 [UI] Allow adding Bedrock, Presidio, Lakera, AIM guardrails on UI (#10874)
* ui fix bedrock guard

* polish: logo should appear after selecting provider

* fix ui config bedrock

* fix: refactor - use specific configs per provider

* fix: refactor - use specific configs per provider

* feat: ui, show provider specific params for guardrails

* fix: updated type of LiteLLM params for guardrails

* fix: updated type of LiteLLM params for guardrails

* ui, use endpoint for adding presidio, bedrock guardrails

* fix: linting error

* add llama guard and secret detector on UI

* add aim on ui

* allow adding lakera AI on litellm ui

* fix: fixes for params to init guardrails

* test: test_guardrail_info_response

* test: test_initialize_presidio_guardrail

* fix: init guardrails

* fix: init guardrails

* add showSearch

* working bedrock guard
2025-05-15 21:22:56 -07:00
Ishaan Jaff 33f11f1479 fix arize config tests 2025-05-13 20:21:14 -07:00
Ishaan Jaff 8142c20c98 [Feat] Allow specifying PII Entities Config when using Presidio Guardrails (#10810)
* refactor: use analyze_text, anonymize_text

* feat: allow defining pii_entities_config for presidio

* feat: use entities config for presidio analyze request

* feat: add test_presidio_pii.py

* testing: add guardrails testing job

* feat: allow blocking specific entities pii

* test: use 1 file for presidio guard tests

* fix: presidio pii tests

* test: presidio blocked entity

* clean up docs

* docs presidio pii parsing

* fix: raise_exception_if_blocked_entities_detected

* fix: linting errors
2025-05-13 19:48:56 -07:00
Ishaan Jaff 3130c4f8f9 [Refactor] Move LLM Guard, Secret Detection to Enterprise Pip package (#10782)
* refactor: move guardrails to pip

* refactor: move guardrails to pip

* testing fix: move guardrails to pip

* git commit setup_litellm_enterprise_pip
2025-05-13 09:42:22 -07:00
Krish Dholakia d37cc63250 Add new model provider Novita AI (#7582) (#9527)
* Add new model provider Novita AI (#7582)

* feat: add new model provider Novita AI

* feat: use deepseek r1 model for examples in Novita AI docs

* fix: fix tests

* fix: fix tests for novita

* fix: fix novita transformation

* ci: fix ci yaml

* fix: fix novita transformation and test (#10056)

---------

Co-authored-by: Jason <ggbbddjm@gmail.com>
2025-05-12 21:49:30 -07:00
Ishaan Jaff 643d2a8ccb [Feat] Option to force/always use the litellm proxy (#10559) (#10633) (#10773)
* [Feat] Option to force/always use the litellm proxy (#10559) (#10633)

* fix: add use_litellm_proxy

* fix: update LiteLLMProxyChatConfig

* fix get llm provider logic

* tests get llm provider logic

* add dynamic use_litellm_proxy

* docs forcing litellm proxy usage

* fix: _should_use_litellm_proxy_by_default

* fixes: get_custom_llm_provider

---------

Co-authored-by: Antoine Legrand <2t.antoine@gmail.com>
2025-05-12 20:22:54 -07:00
Krish Dholakia beae5cfea9 Litellm staging 05 10 2025 - openai pdf url support + sagemaker chat content length error fix (#10724)
* Support pdf url's to openai (#10640)

* fix(gpt_transformation.py): support pdf url input to openai

pass as base64 as openai doesn't support image url's

* fix(openai.py): support async message transformation

allows async get request to convert url to base64

* fix(gpt_transformation.py): fix linting errors and use common components across sync + async flows

* fix: fix linting errors

* fix(openai.py): pop correct var

* Fix sagemaker chat calls - content length error  (#10607)

* fix(sagemaker_chat/): support passing dynamic aws params

previously being ignored

* refactor(sagemaker/chat): more refactoring

* fix(sagemaker_chat/): make sure streaming is correctly handled post-refactor

* refactor: more refactoring to support using signed json str

* fix(sagemaker/chat): working sync streaming post refactor

* fix(sagemaker/chat): support async streaming post refactor

* fix(llm_http_handler.py): await async function

* fix: remove print statements

* test: update test

* test: update test

* fix(llm_http_handler.py): retain passing in data as json str

* test: update test

* fix(base_model_iterator.py): fix linting error

* test: test auth

* fix: fix linting error

* test: update test

* test: update translation test

* fix(gpt_transformation.py): handle awaitable/non-awaitable object

* fix: handle async flow for message transformation on openai compatible api's

* test: cleanup testing

* test: update test

* test(test_router.py): use model with higher quota

* test: simplify test

* test: update test
2025-05-10 17:41:57 -07:00
Krish Dholakia 7210b713dc Add target model name validation (#10722)
* fix(auth_checks.py): enforce auth checks on target model names

ensures user has access to models they are trying to call

* test(test_auth_utils.py): add unit tests for auth check

* fix(exception_mapping_utils.py): handle mistral 429 exception

* fix: fix linting error

* fix(auth_checks.py): add max fallback depth
2025-05-10 14:27:06 -07:00
Krish Dholakia 6f32189093 fix(caching_handler.py): fix embedding str caching result (#10700)
* fix(caching_handler.py): fix embedding str caching result

Fixes issue where str caching results were not being correctly assembled on str input

* feat(azure/image_generation): Support dropping response_format for azure gpt-image-1

Fixes LIT-118

* test(test_utils.py): add unit testing

* test: rename file to avoid testing conflict
2025-05-09 23:37:02 -07:00
Krrish Dholakia 49deea0df9 test: update test 2025-05-08 21:12:14 -07:00
Ishaan Jaff 88f5f9b7f8 fix ai21 test 2025-05-07 21:45:57 -07:00
Ishaan Jaff 580e221000 fix ai21 test 2025-05-07 21:26:35 -07:00
Ishaan Jaff 85d843ab4d test fix: vertex deprecated a model 2025-05-03 16:22:24 -07:00
Krish Dholakia 7273bb442a UI - allow reassigning team to other org (#10527)
* feat(team_info.tsx): allow user to reassign team to another org

* style(team_info.tsx): fix org id styling

* feat(team_endpoints.py): add validation check before migrating team to another org

ensure model access, budgets and membership is respected

* fix(team_endpoints.py): update model migration validation to check if org has 'all-proxy-models' access

* fix(organization_view.tsx): show teams belonging to org

* feat(team_endpoints.py): handle wildcard model check on org migration

* fix(team_endpoints.py): nest router check

* test: update testing - use model with higher quota

* build: update poetry lock
2025-05-03 08:44:43 -07:00
Carlos Freund cb177dbd7a Fix and rewrite of token_counter (#10409)
* added tests

messages_with_counts: Made tolerance explicit for each test. But they match the new implementation(which beats the old)

* new token counter impl

* compare old and new implementation in test

* delete old token counter

* moved tests to /tests/litellm/litellm_core_utils

* use existing types

* docstrings

* warn about using default params on unknown model.

* created type for the token_counter_function

* check key == "content"

* throw error on invalid detail-type, ignore type-warning.

* fix imports
2025-05-01 23:34:37 -07:00
Krish Dholakia 9cc39af131 Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai (#10492)
* refactor(vertex_ai/llama): handle response transformation within config

Allows us to handle https://github.com/BerriAI/litellm/issues/10441#issuecomment-2844975599

* fix(vertex_ai/llama): handle tool call in content

Fixes https://github.com/BerriAI/litellm/issues/10441

* fix(vertex_ai/llama): return 'tool_calls' as finish reason if tool call returned

vertex ai returns stop

* feat(vertex_ai/): cost tracking for vertex_ai/meta/llama-4

* ci(test-linting.yml): pin openai version

* build: reorder pinning

* ci(pyproject.toml): limit openai version

temporary patch as new version has linting errors

* ci(pyproject.toml): limit openai version

temporary patch around linting errors

* ci(limit-openai-version): temporary patch

* fix: fix linting errors

* fix: fix linting error

* fix(parallel_request_limiter_v2.py): add team based multi-instance rate limiting

* fix: fix linting errors

* build(pyproject.toml): modify pin

* ci: bump pin
2025-05-01 22:47:06 -07:00
Ishaan Jaff de7870cb54 Add llamafile as a provider (#10203) (#10482)
* Update docs for OpenAI compatible providers, add Llamafile docs, include Llamafile in the sidebar

* Add Llamafile as an LlmProviders enum

* Add llamafile as a OpenAI compatible provider (in the list of compatible providers)

* Add Llamafile chat config and tests

* Wire up Llamafile

Co-authored-by: Peter Wilson <peter@mozilla.ai>
2025-05-01 18:36:55 -07:00
Krrish Dholakia 66cf75cd5d test: handle internal server errors 2025-05-01 16:47:30 -07:00
Krrish Dholakia cec138c47e test: remove redundant tests 2025-05-01 16:46:21 -07:00
Krrish Dholakia 4ab0ee0b65 test: more testing fixes 2025-05-01 15:36:13 -07:00
Ruperto A. Martinez 298a3574f4 Add supports_pdf_input: true to Claude 3.7 bedrock models (#9917)
* Add supports_pdf_input: true to Claude 3.7 bedrock models

* update unit test

---------

Co-authored-by: RupertoXTI <rmartinez@xtillion.com>
2025-05-01 14:56:54 -07:00
Krish Dholakia 6ad483dde7 Litellm dev 04 30 2025 p1 (#10462)
* fix(exception_mapping_utils.py): correctly pass through 504 status code

openai also raises a 504 status code

* build(model_prices_and_context_window.json): add gpt-4o-mini-tts to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9591

* fix(cost_calculator.py): fix input cost calculation for gpt-4o-mini-tts

Fixes https://github.com/BerriAI/litellm/issues/9591

* test: testing updates
2025-04-30 22:11:12 -07:00
Krish Dholakia 711601e22a Add key-level multi-instance tpm/rpm/max parallel request limiting (#10458)
* fix: initial commit of v2 parallel request limiter hook

enables multi-instance rate limiting to work

* fix: subsequent commit with additional refactors

* fix(parallel_request_limiter_v2.py): cleanup initial call hook

simplify it

* fix(parallel_request_limiter_v2.py): working v2 parallel request limiter

* fix: more updates - still not passing testing

* fix(test_parallel_request_limiter_v2.py): update test + add conftest

* fix: fix ruff checks

* fix(parallel_request_limiter_v2.py): use pull via pattern method to load in keys instance wouldn't have seen yet

Fixes issue where redis syncing was not pulling key until instance had seen it

* test: update testing to cover tpm and rpm

* fix(parallel_request_limiter_v2.py): fix ruff errors

* fix(proxy/hooks/__init__.py): feature flag export

* fix(proxy/hooks/__init__.py): fix linting error

* ci(config.yml): add tests/enterprise to ci/cd

* fix: fix ruff check

* test: update testing
2025-04-30 21:32:31 -07:00
Krish Dholakia 9e35ca2010 Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits (#10424)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* fix(caching_handler.py): handle str + list cache

Fixes issue on cache hits for embedding when initial cached input was str

* test(test_caching.py): add e2e test on caching with individual item and then list

* fix(caching_handler.py): set usage tokens for cache hits

enables token counting to work

* fix(caching_handler.py): combine usage between cached result and embedding response

Handles case of new input to embedding response

* fix: cleanup

* test: move to gpt-4o-new-test

* test: update test
2025-04-29 21:21:28 -07:00
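Note on 9e35ca2010: the three caching behaviours named in this commit (a string input hitting an entry later requested in a list, usage tokens set on cache hits, and usage combined on partial hits) can be sketched as below. This is a simplified illustration under stated assumptions: the cache is a plain dict keyed by input text, and `Usage`, `normalize_input`, `embed_with_cache` and `fake_embed` are hypothetical names, not LiteLLM's API.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    prompt_tokens: int = 0
    total_tokens: int = 0


def combine_usage(a: Usage, b: Usage) -> Usage:
    """Add token counts from two usage records."""
    return Usage(a.prompt_tokens + b.prompt_tokens, a.total_tokens + b.total_tokens)


def normalize_input(value):
    """Treat a bare string as a one-item list so str and list inputs share entries."""
    return [value] if isinstance(value, str) else list(value)


def embed_with_cache(inputs, cache, embed_fn):
    """Return (vectors, usage), calling embed_fn only for texts missing from cache.

    Assumption: embed_fn(texts) returns one (vector, token_count) pair per text.
    """
    texts = normalize_input(inputs)
    misses = [t for t in texts if t not in cache]

    # Usage for cache hits comes from the token counts stored with each entry.
    cached_usage = Usage()
    for t in texts:
        if t in cache:
            tokens = cache[t][1]
            cached_usage = combine_usage(cached_usage, Usage(tokens, tokens))

    # Usage for misses comes from the fresh embedding call.
    fresh_usage = Usage()
    if misses:
        for text, (vector, tokens) in zip(misses, embed_fn(misses)):
            cache[text] = (vector, tokens)
            fresh_usage = combine_usage(fresh_usage, Usage(tokens, tokens))

    vectors = [cache[t][0] for t in texts]
    return vectors, combine_usage(cached_usage, fresh_usage)


def fake_embed(texts):
    """Stand-in embedder: vector is the text length, tokens are whitespace words."""
    return [([float(len(t))], len(t.split())) for t in texts]


cache = {}
print(embed_with_cache("hello world", cache, fake_embed))             # str input, full miss
print(embed_with_cache(["hello world", "a b c"], cache, fake_embed))  # partial hit
```

In the second call only "a b c" reaches the embedder, while the reported usage still covers both inputs.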
Krish Dholakia d783190e04 Update fireworks ai pricing (#10425)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* test: testing updates

* test: update test

* test: update test
2025-04-29 20:58:05 -07:00
Krish Dholakia bf9382a182 Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' (#10351)
* test(test_amazing_vertex_completion.py): try to repro https://github.com/BerriAI/litellm/issues/10319

* fix(common_utils.py): handle edge case on tools

Fixes https://github.com/BerriAI/litellm/issues/10319

* test: add unit testing for infinite loops

* fix(amazon_stability3_transformation.py): support 'stable-image-core' transformation

Fixes https://github.com/BerriAI/litellm/issues/8488

* test: add unit testing for stable image core model

* test: update test
2025-04-28 14:22:29 -07:00
Krrish Dholakia a649f10e63 test: update test to not use gemini-pro
google removed it
2025-04-23 11:31:09 -07:00
Krrish Dholakia 8184124217 test: update testing 2025-04-23 11:21:50 -07:00
Krrish Dholakia 174a1aa007 test: update test 2025-04-23 10:51:18 -07:00
Krrish Dholakia f5996b2f6b test: update test to skip 'gemini-pro' - model deprecated 2025-04-23 00:01:02 -07:00
Ishaan Jaff 7cb95bcc96 [Bug Fix] caching does not account for thinking or reasoning_effort config (#10140)
* _get_litellm_supported_chat_completion_kwargs

* test caching with thinking
2025-04-21 22:39:40 -07:00
Ishaan Jaff 3c463f6715 test fix - output_cost_per_reasoning_token was added to model cost map 2025-04-19 10:02:25 -07:00
Krish Dholakia 2508ca71cb Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Ishaan Jaff 0a35c208d7 test assistants fixes 2025-04-19 08:09:45 -07:00
Ishaan Jaff a62805f98f fixes for assistants API tests 2025-04-19 07:59:53 -07:00
Ishaan Jaff 5bf76f0bb1 test fixes for azure assistants 2025-04-19 07:36:40 -07:00