227 Commits

Author SHA1 Message Date
steve-gore-snapdocs 88240c4cba Fix Anthropic token counting for VertexAI (#16171)
* transform anthropic messages in gemini handler

* initial

* linting

* remove extra test

* maintain consistency

* more tests

* Revert "transform anthropic messages in gemini handler"

This reverts commit 805e60fd2887991bb4b4554b9394437b874835f9.

* don't lint file we aren't changing

* cleanup

* cleanup

* Cleanup
2025-11-02 09:02:07 -08:00
Ishaan Jaffer cb57455172 test_foward_litellm_user_info_to_backend_llm_call 2025-10-27 13:48:23 -07:00
Krish Dholakia 2bd41dc034 Guardrails - Responses API, Image Gen, Text completions, Audio transcriptions, Audio Speech, Rerank, Anthropic Messages API support via the unified apply_guardrails function (#15706)
* fix(presidio.py): handle content as a list of texts

covers openai + anthropic messages api

* fix(presidio.py): safe get messages

* test: add unit testing for presidio guardrails

* fix(unified_guardrail.py): initial commit

* fix(enkryptai.py): implement apply_guardrail to enkrypt guardrail

* fix(unified_guardrail.py): support unified guardrail on input

* feat(unified_guardrail.py): add post call success hook implementation

allows us to just have 1 place to handle llm translation to guardrail api spec

* refactor: refactor initial unified guardrail component

* refactor: more refactoring

* feat(responses/): add guardrails to responses api

allows existing guardrails to work for new llm endpoints

* docs(adding_guardrail_support.md): document new guardrail endpoint support

* test: add unit tests

* feat(image_generation/): add guardrail support for image generation endpoint

* feat(openai/text_completion): support guardrails on `/v1/completions` API

* docs: document guardrails support on new endpoints

* docs: clarify when guardrails run

* feat(openai/speech): add guardrail support for input

* docs(rerank/): add guardrail support on input query

* fix: fix ruff check
2025-10-25 13:38:57 -07:00
Ishaan Jaffer 0bedf1c0a7 fix tests 2025-10-25 10:19:24 -07:00
Carlo Alberto Ferraris 8b1424166b attempt to avoid/minimize deadlocks (#15281)
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-10-24 12:22:38 -07:00
Ishaan Jaff f55745fc5e [Fix] Forward anthropic-beta headers to Bedrock, VertexAI (#15700)
* [Fix] Forward anthropic-beta headers to Bedrock and other cross-provider scenarios (#15623)

* add_provider_specific_headers_to_request

* fix add_provider_specific_headers_to_request

* test_provider_specific_header_multi_provider

* test_provider_specific_header_in_request

---------

Co-authored-by: Jack Venberg <jack.venberg@rover.com>
2025-10-18 16:26:32 -07:00
Nagailic Sergiu (Nikro) 6842d705d5 fix(token-counter): extract model_info from deployment for custom_tokenizer (#15657) (#15680) 2025-10-17 19:38:45 -07:00
Achintya Rajan 264f1cded1 Merge branch 'main' into litellm_view_key_pagination_calls_fix 2025-10-06 18:10:57 -07:00
Krrish Dholakia 63cb2764fe test: fix raise 2025-10-04 16:11:22 -07:00
= 6ba077593f Update test_key_generate_prisma.py 2025-10-04 14:36:19 -07:00
= 5e03ef7382 fixes bloated key alias network calls with lean endpoint 2025-10-04 14:32:15 -07:00
Ishaan Jaffer 9c29f35c4b test_end_user_jwt_auth 2025-10-02 18:48:11 -07:00
Ishaan Jaffer ce57f59531 test_gemini_pass_through_endpoint 2025-09-27 17:17:12 -07:00
Ishaan Jaffer 0ec7dace79 test_embedding 2025-09-27 16:57:27 -07:00
Ishaan Jaffer 3c5e0abaf2 async_log_success_event 2025-09-27 14:17:13 -07:00
Ishaan Jaffer 6aa35ec999 test text-embedding-ada-002 2025-09-27 12:41:35 -07:00
Ishaan Jaffer c27beb74b9 test fix 2025-09-27 12:40:34 -07:00
Ishaan Jaffer 284a8549a1 test_chat_completion 2025-09-27 11:43:20 -07:00
Ishaan Jaffer 3baa3aff1b test fix 2025-09-27 10:38:35 -07:00
Mubashir Osmani 625ed3f8cf fix: prisma client state retries (#14925)
* added qwen models and gpt-5-codex

* fix flaky test

* fix failing test

* Added retries to prisma client state

* fix: prisma client state retries in pods

* Revert "fix failing test"

This reverts commit dbec4988a2627257fd05b905e216225664517f32.

* Revert "fix flaky test"

This reverts commit b0ac2f2dc35ca433af0c82f3cda770d6981caff4.

* Revert "added qwen models and gpt-5-codex"

This reverts commit 9a8a8f2d47ab4dc8aecb0cd9a6a4f82ed81bb056.

* Revert "fix: prisma client state retries in pods"

This reverts commit 04e58e5ca1a489916e3b49e9b674f5c6713fd7cd.

* fix lint

* Revert "fix lint"

This reverts commit 5303d52a5e3bee7e131dcabd098e94f0613a7bb9.

* fixed lint
2025-09-25 21:54:00 -07:00
Alexsander Hamir eaa04cd8ce fix: use fastuuid helper (#14903)
* fix: use fastuuid helper across the codebase

First batch of changes, simple drop in replacement.

* second batch of changes

* fixed: script mistake on helper file
2025-09-25 15:47:01 -07:00
Mubashir Osmani a7a6381926 fix: flaky passthrough tests (#14692)
* fix: flaky passthrough tests

* Revert "fix: flaky passthrough tests"

This reverts commit ffe692e017600a8853ab7c31f95485958ab74c5f.

* fix: serialize prisma objects
2025-09-18 15:35:14 -07:00
Krish Dholakia bfaab8ad7e Merge pull request #14557 from timelfrink/fix/issue-14478-bedrock-count-tokens-endpoint
Implement AWS Bedrock CountTokens API support
2025-09-17 23:51:06 -07:00
Tim Elfrink c234b13275 Apply code formatting and linting fixes
- Apply Black formatting to all Bedrock CountTokens files
- Clean up imports and remove unused variables in tests
- Fix indentation and simplify test structure
- Fix pyright type error with type ignore annotation
- All tests continue to pass after cleanup
2025-09-18 08:28:17 +02:00
Tim Elfrink e74ac35b5d Add comprehensive tests for Bedrock CountTokens functionality
- Add endpoint integration test in test_proxy_token_counter.py
- Add unit tests for transformation logic in bedrock/count_tokens/
- Test model extraction from request body vs endpoint path
- Test input format detection (converse vs invokeModel)
- Test request transformation from Anthropic to Bedrock format
- All tests follow existing codebase patterns and pass successfully
2025-09-18 08:16:56 +02:00
Mubashir Osmani 8b804303ed fix: ci/cd tests + lint errors (#14646)
* fix: lint errors + tests

* fixed ci tests

* fixed tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-09-17 17:06:43 -07:00
Sameer Kankute 69c01488bd remove not needed names (#14641) 2025-09-17 14:26:48 -07:00
Krish Dholakia 635dc72211 Merge pull request #14604 from Sameerlite/litellm_gemini_api_base_update
Litellm gemini api base update
2025-09-16 22:38:44 -07:00
Alexsander Hamir 02db2e8ae8 [Performance] RPS Improvement +500 RPS when sending the user field (#14616)
* perf tool

* fix: cache type issue

* fix: exception hanging & cache setting

1. Removed unhandled exceptions
2. Set cache value to dict
2025-09-16 16:18:23 -07:00
Sameerlite f08fc45a0f add base url support for gemini 2025-09-16 15:15:24 +05:30
Sameer Kankute 1a123b2cd5 Litellm gemini cli bug fix (#14451)
* Fix gemini cli error

* Add reasoning request support

* Added better handling

* remove other PR code

* refactored code for better structure following

---------

Co-authored-by: sameer@berri.ai <sameer@berri.ai>
2025-09-12 11:55:26 -07:00
Krrish Dholakia c45ede7187 test: update test 2025-09-09 21:31:34 -07:00
Ishaan Jaff 2cc85936ed Revert "Security fix - prevent proxy_admin_viewer from modifying other user's credentials + remove hardcoded sensitive keys from test repo" (#14362) 2025-09-08 18:40:54 -07:00
Krrish Dholakia 06d472f205 test: fix tests 2025-09-06 21:59:02 -07:00
Krish Dholakia 2716fa7981 Merge branch 'main' into litellm_dev_09_01_2025_p2 2025-09-06 19:03:25 -07:00
Ishaan Jaff 49532d6d8b Revert "[Feat]Cancel upstream on client disconnect (#14295)" (#14304)
This reverts commit 51de2ebb64.
2025-09-06 15:29:15 -07:00
Krish Dholakia 447f1ea6bc Merge pull request #14118 from iabhi4/fix-14100
bug(auth): support for ES256/ES384/ES512 and EdDSA JWT verification
2025-09-06 09:57:26 -07:00
katsuhiro muto 51de2ebb64 [Feat]Cancel upstream on client disconnect (#14295)
* cancel upstream on client disconnect

* add comments

* add test

* set timeout in constraints.py

* Guard against missing 'type' key

* update dependency to fix uvicorn bugs
2025-09-06 08:58:51 -07:00
Krrish Dholakia ff955c73a8 refactor: remove db credential comment 2025-09-01 19:54:47 -07:00
iabhi4 75e698feef bug(auth): support for ES256/ES384/ES512 and EdDSA JWT verification 2025-08-31 15:58:26 -07:00
Ishaan Jaff 1249385a99 [Feat] GEMINI CLI - Add Token Counter for VertexAI Models (#13558)
* add VertexAIModelInfo

* working API call to vertex ai

* add count_tokens MODE

* _construct_url

* test_vertex_ai_gemini_token_counting_with_contents
2025-08-12 20:53:47 -07:00
Ishaan Jaff afe159bb8b [Feat] GEMINI CLI Integration - Add /countTokens endpoint support (#13545)
* stash changes for token counter

* working TokenCountRequest

* working acount_tokens

* add GoogleAIStudioTokenCounter

* re-use validate_environment

* fixes count_tokens

* fixes google_count_tokens

* fixes token counter base class

* fix TokenCountResponse

* fix - use BaseTokenCounter

* add should_use_token_counting_api

* fixes for GoogleAIStudioTokenCounter

* fixes for should_use_token_counting_api

* fixes for google_count_tokens

* fixes for /messages count_tokens

* fixes for should_use_token_counting_api

* working e2e gemini token counter

* ruff check fixes

* fixes for token counter

* fixes for TokenCountResponse

* cleanup TokenCountRequest

* add TokenCountDetailsResponse

* fix use well typed Responses

* fix typing for TokenCountDetailsResponse

* test_vertex_ai_gemini_token_counting_with_contents

* fixes for TokenCountDetailsResponse

* test fixes

* test_factory_registration

* test_proxy_token_counter.py

* TestGoogleAIStudioTokenCounter

* fix token_counter
2025-08-12 16:19:58 -07:00
Krish Dholakia 0da25fadc0 Exclude none fields on /chat/completion - fixes n8n bug + Allow calling /v1/models when end user over budget (#13320)
* fix(proxy_server.py): exclude none fields before returning

Fixes https://github.com/BerriAI/litellm/issues/13055

* test: add unit tests

* feat(auth_checks.py): allow info routes to work when end user over budget

Fixes https://github.com/BerriAI/litellm/issues/13286
2025-08-05 21:39:46 -07:00
Jugal D. Bhatt 609fa9f5ca [LLM Translation + Coding tools] Added litellm claude code count tokens support (#13261)
* Added litellm claude code count tokens support

* fix mypy

* create helper

* Revert construct

* revert construct

* fix return

* Add return none

* change to factory approach

* refactor to BaseModelInfo

* enum fix
2025-08-05 10:57:24 -07:00
Krrish Dholakia 952c2b5215 test: update test 2025-08-01 09:07:53 -07:00
Ishaan Jaff 79be436c2b [Feat] Background Health Checks - Allow disabling background health checks for a specific (#13186)
* disable background health checks for specific models

* test_background_health_check_skip_disabled_models

* Disable Background Health Checks For Specific Models
2025-07-31 13:48:35 -07:00
Krrish Dholakia 7e5bc8af28 test: update test 2025-07-29 21:35:44 -07:00
Ishaan Jaff 50466e0077 test_user_api_key_auth 2025-07-29 18:01:40 -07:00
Ishaan Jaff a6f7c70185 [Feat] Allow using query_params for setting API Key for generateContent routes (#13100)
* fix is_generate_content_route

* fix route checks

* fix get_api_key
2025-07-29 14:11:06 -07:00
Krrish Dholakia ff0b40a22b test: fix test 2025-07-27 09:52:22 -07:00