Commit Graph

597 Commits

Author SHA1 Message Date
slytechnical 98e9db340c [Feature] Add supports_computer_use to the model list (#10881)
* Add support for supports_computer_use in model info

* Corrected list of supports_computer_use models

* Further fix computer use compatible claude models, fix existing test that predated supports_computer_use in the model list

* Move computer use test case into existing test_utils file

* Moved tests into test_utils.py
2025-05-20 17:07:43 -07:00
Krrish Dholakia b122ead5b3 test: update tests 2025-05-20 13:08:47 -07:00
Krrish Dholakia 4e3c8ae94f test: update test due to cohere ssl issues 2025-05-19 20:07:57 -07:00
Krish Dholakia cc626ad3ec Handle openai gpt file data + add openai 'supports_pdf_input' to all vision models + Support bedrock tool cache pointing (#10897)
* fix(openai/gpt_transformation.py): handle missing filename for openai file data call

* fix(openai/gpt_transformation.py): clean handling for sync + async pdf url transformation flows

Fixes https://github.com/BerriAI/litellm/issues/10820

* build(model_prices_and_context_window.json): add 'supports_pdf_input' for all openai models which have 'vision' support

Follows openai guidelines

* feat(bedrock/chat): support cache pointing tool calls on Bedrock

Closes https://github.com/BerriAI/litellm/pull/10613

* fix: fix linting error
2025-05-17 07:29:01 -07:00
Dror Baron 93639df4c3 Feat/support anonymize in aim guardrail (#10757)
* Enable update/delete org members on UI  (#8560)

* feat(organization_endpoints.py): expose new `/organization/delete` endpoint. Cascade org deletion to member, teams and keys

Ensures any org deletion is handled correctly

* test(test_organizations.py): add simple test to ensure org deletion works

* feat(organization_endpoints.py): expose /organization/update endpoint, and define response models for org delete + update

* fix(organizations.tsx): support org delete on UI + move org/delete endpoint to use DELETE

* feat(organization_endpoints.py): support `/organization/member_update` endpoint

Allow admin to update member's role within org

* feat(organization_endpoints.py): support deleting member from org

* test(test_organizations.py): add e2e test to ensure org member flow works

* fix(organization_endpoints.py): fix code qa check

* fix(schema.prisma): don't introduce ondelete:cascade - breaking change

* docs(organization_endpoints.py): document missing params

* support anonymize and deanonymize

* use new response schema

* don't use detected because action already means there are detections

* log to debug

* CR fixes

* lint

* add tests

* use single quotes in deanonymization

* remove engage action case

* set max entities to 100 to prevent memory leak

* add test case for de-anonymization of llm response

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-05-15 22:18:58 -07:00
Ishaan Jaff dc16e47df6 [UI] Allow adding Bedrock, Presidio, Lakera, AIM guardrails on UI (#10874)
* ui fix bedrock guard

* polish: logo should appear after selecting provider

* fix ui config bedrock

* fix: refactor - use specific configs per provider

* fix: refactor - use specific configs per provider

* feat: ui, show provider specific params for guardrails

* fix: updated type of LiteLLM params for guardrails

* fix: updated type of LiteLLM params for guardrails

* ui, use endpoint for adding presidio, bedrock guardrails

* fix: linting error

* add llama guard and secret detector on UI

* add aim on ui

* allow adding lakera AI on litellm ui

* fix: fixes for params to init guardrails

* test: test_guardrail_info_response

* test: test_initialize_presidio_guardrail

* fix: init guardrails

* fix: init guardrails

* add showSearch

* working bedrock guard
2025-05-15 21:22:56 -07:00
Ishaan Jaff 33f11f1479 fix arize config tests 2025-05-13 20:21:14 -07:00
Ishaan Jaff 8142c20c98 [Feat] Allow specifying PII Entities Config when using Presidio Guardrails (#10810)
* refactor: use analyze_text, anonymize_text

* feat: allow defining pii_entities_config for presidio

* feat: use entities config for presidio analyze request

* feat: add test_presidio_pii.py

* testing: add guardrails testing job

* feat: allow blocking specific entities pii

* test: use 1 file for presidio guard tests

* fix: presidio pii tests

* test: presidio blocked entity

* clean up docs

* docs presidio pii parsing

* fix: raise_exception_if_blocked_entities_detected

* fix: linting errors
2025-05-13 19:48:56 -07:00
Ishaan Jaff 3130c4f8f9 [Refactor] Move LLM Guard, Secret Detection to Enterprise Pip package (#10782)
* refactor: move guardrails to pip

* refactor: move guardrails to pip

* testing fix: move guardrails to pip

* git commit setup_litellm_enterprise_pip
2025-05-13 09:42:22 -07:00
Krish Dholakia d37cc63250 Add new model provider Novita AI (#7582) (#9527)
* Add new model provider Novita AI (#7582)

* feat: add new model provider Novita AI

* feat: use deepseek r1 model for examples in Novita AI docs

* fix: fix tests

* fix: fix tests for novita

* fix: fix novita transformation

* ci: fix ci yaml

* fix: fix novita transformation and test (#10056)

---------

Co-authored-by: Jason <ggbbddjm@gmail.com>
2025-05-12 21:49:30 -07:00
Ishaan Jaff 643d2a8ccb [Feat] Option to force/always use the litellm proxy (#10559) (#10633) (#10773)
* [Feat] Option to force/always use the litellm proxy (#10559) (#10633)

* fix: add use_litellm_proxy

* fix: update LiteLLMProxyChatConfig

* fix get llm provider logic

* tests get llm provider logic

* add dynamic use_litellm_proxy

* docs forcing litellm proxy usage

* fix: _should_use_litellm_proxy_by_default

* fixes: get_custom_llm_provider

---------

Co-authored-by: Antoine Legrand <2t.antoine@gmail.com>
2025-05-12 20:22:54 -07:00
Krish Dholakia beae5cfea9 Litellm staging 05 10 2025 - openai pdf url support + sagemaker chat content length error fix (#10724)
* Support pdf url's to openai (#10640)

* fix(gpt_transformation.py): support pdf url input to openai

pass as base64 as openai doesn't support image url's

* fix(openai.py): support async message transformation

allows async get request to convert url to base64

* fix(gpt_transformation.py): fix linting errors and use common components across sync + async flows

* fix: fix linting errors

* fix(openai.py): pop correct var

* Fix sagemaker chat calls - content length error  (#10607)

* fix(sagemaker_chat/): support passing dynamic aws params

previously being ignored

* refactor(sagemaker/chat): more refactoring

* fix(sagemaker_chat/): make sure streaming is correctly handled post-refactor

* refactor: more refactoring to support using signed json str

* fix(sagemaker/chat): working sync streaming post refactor

* fix(sagemaker/chat): support async streaming post refactor

* fix(llm_http_handler.py): await async function

* fix: remove print statements

* test: update test

* test: update test

* fix(llm_http_handler.py): retain passing in data as json str

* test: update test

* fix(base_model_iterator.py): fix linting error

* test: test auth

* fix: fix linting error

* test: update test

* test: update translation test

* fix(gpt_transformation.py): handle awaitable/non-awaitable object

* fix: handle async flow for message transformation on openai compatible api's

* test: cleanup testing

* test: update test

* test(test_router.py): use model with higher quota

* test: simplify test

* test: update test
2025-05-10 17:41:57 -07:00
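The PDF URL change in the commit above describes fetching the file and passing it on as base64. A minimal sketch of that idea, assuming a plain data URI is the target shape; `fetch_bytes` and `to_base64_data_uri` are hypothetical helper names for illustration, not functions from the litellm codebase:

```python
import base64
import urllib.request


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes behind a URL (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def to_base64_data_uri(data: bytes, mime_type: str = "application/pdf") -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

A caller would chain the two, for example `to_base64_data_uri(fetch_bytes(pdf_url))`, and an async variant would swap the download step for an awaitable HTTP client call.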
Krish Dholakia 7210b713dc Add target model name validation (#10722)
* fix(auth_checks.py): enforce auth checks on target model names

ensures user has access to models they are trying to call

* test(test_auth_utils.py): add unit tests for auth check

* fix(exception_mapping_utils.py): handle mistral 429 exception

* fix: fix linting error

* fix(auth_checks.py): add max fallback depth
2025-05-10 14:27:06 -07:00
Krish Dholakia 6f32189093 fix(caching_handler.py): fix embedding str caching result (#10700)
* fix(caching_handler.py): fix embedding str caching result

Fixes issue where str caching results were not being correctly assembled on str input

* feat(azure/image_generation): Support dropping response_format for azure gpt-image-1

Fixes LIT-118

* test(test_utils.py): add unit testing

* test: rename file to avoid testing conflict
2025-05-09 23:37:02 -07:00
Krrish Dholakia 49deea0df9 test: update test 2025-05-08 21:12:14 -07:00
Ishaan Jaff 88f5f9b7f8 fix ai21 test 2025-05-07 21:45:57 -07:00
Ishaan Jaff 580e221000 fix ai21 test 2025-05-07 21:26:35 -07:00
Ishaan Jaff 85d843ab4d test fix: vertex deprecated a model 2025-05-03 16:22:24 -07:00
Krish Dholakia 7273bb442a UI - allow reassigning team to other org (#10527)
* feat(team_info.tsx): allow user to reassign team to another org

* style(team_info.tsx): fix org id styling

* feat(team_endpoints.py): add validation check before migrating team to another org

ensure model access, budgets and membership are respected

* fix(team_endpoints.py): update model migration validation to check if org has 'all-proxy-models' access

* fix(organization_view.tsx): show teams belonging to org

* feat(team_endpoints.py): handle wildcard model check on org migration

* fix(team_endpoints.py): nest router check

* test: update testing - use model with higher quota

* build: update poetry lock
2025-05-03 08:44:43 -07:00
Carlos Freund cb177dbd7a Fix and rewrite of token_counter (#10409)
* added tests

messages_with_counts: Made tolerance explicit for each test. The counts match the new implementation (which beats the old).

* new token counter impl

* compare old and new implementation in test

* delete old token counter

* moved tests to /tests/litellm/litellm_core_utils

* use existing types

* docstrings

* warn about using default params on unknown model.

* created type for the token_counter_function

* check key == "content"

* throw error on invalid detail-type, ignore type-warning.

* fix imports
2025-05-01 23:34:37 -07:00
Krish Dholakia 9cc39af131 Add vertex ai meta llama 4 support + handle tool call result in content for vertex ai (#10492)
* refactor(vertex_ai/llama): handle response transformation within config

Allows us to handle https://github.com/BerriAI/litellm/issues/10441#issuecomment-2844975599

* fix(vertex_ai/llama): handle tool call in content

Fixes https://github.com/BerriAI/litellm/issues/10441

* fix(vertex_ai/llama): return 'tool_calls' as finish reason if tool call returned

vertex ai returns stop

* feat(vertex_ai/): cost tracking for vertex_ai/meta/llama-4

* ci(test-linting.yml): pin openai version

* build: reorder pinning

* ci(pyproject.toml): limit openai version

temporary patch as new version has linting errors

* ci(pyproject.toml): limit openai version

temporary patch around linting errors

* ci(limit-openai-version): temporary patch

* fix: fix linting errors

* fix: fix linting error

* fix(parallel_request_limiter_v2.py): add team based multi-instance rate limiting

* fix: fix linting errors

* build(pyproject.toml): modify pin

* ci: bump pin
2025-05-01 22:47:06 -07:00
Ishaan Jaff de7870cb54 Add llamafile as a provider (#10203) (#10482)
* Update docs for OpenAI compatible providers, add Llamafile docs, include Llamafile in the sidebar

* Add Llamafile as an LlmProviders enum

* Add llamafile as a OpenAI compatible provider (in the list of compatible providers)

* Add Llamafile chat config and tests

* Wire up Llamafile

Co-authored-by: Peter Wilson <peter@mozilla.ai>
2025-05-01 18:36:55 -07:00
Krrish Dholakia 66cf75cd5d test: handle internal server errors 2025-05-01 16:47:30 -07:00
Krrish Dholakia cec138c47e test: remove redundant tests 2025-05-01 16:46:21 -07:00
Krrish Dholakia 4ab0ee0b65 test: more testing fixes 2025-05-01 15:36:13 -07:00
Ruperto A. Martinez 298a3574f4 Add supports_pdf_input: true to Claude 3.7 bedrock models (#9917)
* Add supports_pdf_input: true to Claude 3.7 bedrock models

* update unit test

---------

Co-authored-by: RupertoXTI <rmartinez@xtillion.com>
2025-05-01 14:56:54 -07:00
Krish Dholakia 6ad483dde7 Litellm dev 04 30 2025 p1 (#10462)
* fix(exception_mapping_utils.py): correctly pass through 504 status code

openai also raises a 504 status code

* build(model_prices_and_context_window.json): add gpt-4o-mini-tts to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9591

* fix(cost_calculator.py): fix input cost calculation for gpt-4o-mini-tts

Fixes https://github.com/BerriAI/litellm/issues/9591

* test: testing updates
2025-04-30 22:11:12 -07:00
Krish Dholakia 711601e22a Add key-level multi-instance tpm/rpm/max parallel request limiting (#10458)
* fix: initial commit of v2 parallel request limiter hook

enables multi-instance rate limiting to work

* fix: subsequent commit with additional refactors

* fix(parallel_request_limiter_v2.py): cleanup initial call hook

simplify it

* fix(parallel_request_limiter_v2.py): working v2 parallel request limiter

* fix: more updates - still not passing testing

* fix(test_parallel_request_limiter_v2.py): update test + add conftest

* fix: fix ruff checks

* fix(parallel_request_limiter_v2.py): use pull via pattern method to load in keys instance wouldn't have seen yet

Fixes issue where redis syncing was not pulling key until instance had seen it

* test: update testing to cover tpm and rpm

* fix(parallel_request_limiter_v2.py): fix ruff errors

* fix(proxy/hooks/__init__.py): feature flag export

* fix(proxy/hooks/__init__.py): fix linting error

* ci(config.yml): add tests/enterprise to ci/cd

* fix: fix ruff check

* test: update testing
2025-04-30 21:32:31 -07:00
Krish Dholakia 9e35ca2010 Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits (#10424)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* fix(caching_handler.py): handle str + list cache

Fixes issue on cache hits for embedding when initial cached input was str

* test(test_caching.py): add e2e test on caching with individual item and then list

* fix(caching_handler.py): set usage tokens for cache hits

enables token counting to work

* fix(caching_handler.py): combine usage between cached result and embedding response

Handles case of new input to embedding response

* fix: cleanup

* test: move to gpt-4o-new-test

* test: update test
2025-04-29 21:21:28 -07:00
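The three caching fixes in the commit above (str vs. list input, usage tokens on cache hits, combined usage on partial hits) can be illustrated with a small sketch. This is not the `caching_handler.py` implementation; `embed_with_cache`, the per-item cache layout, and the `embed_fn` signature are all assumptions made for the example:

```python
from typing import Callable, Dict, List, Tuple, Union

Vector = List[float]
# Hypothetical provider call: returns one (vector, prompt_tokens) pair per input.
EmbedFn = Callable[[List[str]], List[Tuple[Vector, int]]]


def embed_with_cache(
    inputs: Union[str, List[str]],
    cache: Dict[str, Tuple[Vector, int]],
    embed_fn: EmbedFn,
) -> Tuple[List[Vector], int]:
    """Return one vector per input, in order, plus the combined token usage."""
    # Normalise a bare string to a one-item list so both call styles share cache keys.
    items = [inputs] if isinstance(inputs, str) else list(inputs)

    # Only call the provider for inputs the cache has not seen (deduplicated, order kept).
    misses = [text for text in dict.fromkeys(items) if text not in cache]
    if misses:
        for text, result in zip(misses, embed_fn(misses)):
            cache[text] = result

    vectors = [cache[text][0] for text in items]
    # Usage covers cached and freshly embedded items alike.
    total_tokens = sum(cache[text][1] for text in items)
    return vectors, total_tokens
```

Keying the cache per item rather than per request is the design choice that lets a value cached from a string call be reused when the same text later appears inside a list.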
Krish Dholakia d783190e04 Update fireworks ai pricing (#10425)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* test: testing updates

* test: update test

* test: update test
2025-04-29 20:58:05 -07:00
Krish Dholakia bf9382a182 Handle more gemini tool calling edge cases + support bedrock 'stable-image-core' (#10351)
* test(test_amazing_vertex_completion.py): try to repro https://github.com/BerriAI/litellm/issues/10319

* fix(common_utils.py): handle edge case on tools

Fixes https://github.com/BerriAI/litellm/issues/10319

* test: add unit testing for infinite loops

* fix(amazon_stability3_transformation.py): support 'stable-image-core' transformation

Fixes https://github.com/BerriAI/litellm/issues/8488

* test: add unit testing for stable image core model

* test: update test
2025-04-28 14:22:29 -07:00
Krrish Dholakia a649f10e63 test: update test to not use gemini-pro
google removed it
2025-04-23 11:31:09 -07:00
Krrish Dholakia 8184124217 test: update testing 2025-04-23 11:21:50 -07:00
Krrish Dholakia 174a1aa007 test: update test 2025-04-23 10:51:18 -07:00
Krrish Dholakia f5996b2f6b test: update test to skip 'gemini-pro' - model deprecated 2025-04-23 00:01:02 -07:00
Ishaan Jaff 7cb95bcc96 [Bug Fix] caching does not account for thinking or reasoning_effort config (#10140)
* _get_litellm_supported_chat_completion_kwargs

* test caching with thinking
2025-04-21 22:39:40 -07:00
Ishaan Jaff 3c463f6715 test fix - output_cost_per_reasoning_token was added to model cost map 2025-04-19 10:02:25 -07:00
Krish Dholakia 2508ca71cb Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Ishaan Jaff 0a35c208d7 test assistants fixes 2025-04-19 08:09:45 -07:00
Ishaan Jaff a62805f98f fixes for assistants API tests 2025-04-19 07:59:53 -07:00
Ishaan Jaff 5bf76f0bb1 test fixes for azure assistants 2025-04-19 07:36:40 -07:00
Ishaan Jaff b9756bf006 test_completion_azure 2025-04-19 07:24:11 -07:00
Krrish Dholakia 652e1b7f0f test: update test 2025-04-18 20:36:15 -07:00
Krrish Dholakia 3e87ec4f16 test: replace removed fireworks ai models 2025-04-18 14:23:16 -07:00
Krish Dholakia 1ea046cc61 test: update tests to new deployment model (#10142)
* test: update tests to new deployment model

* test: update model name

* test: skip cohere rbac issue test

* test: update test - replace gpt-4o model
2025-04-18 14:22:12 -07:00
Krrish Dholakia 415abfc222 test: update test 2025-04-18 13:13:58 -07:00
Krrish Dholakia f7dd688035 test: handle cohere rbac issue (verified happens on calling azure directly) 2025-04-18 08:42:12 -07:00
Ishaan Jaff 257e78ffb5 test fix vertex_ai/mistral-large@2407 2025-04-16 21:52:52 -07:00
Ishaan Jaff 198922b26f test fixes for vertex mistral, this model was deprecated on vertex 2025-04-16 20:51:45 -07:00
Ishaan Jaff c38146e180 test fix 2025-04-16 20:13:31 -07:00