Commit Graph

11 Commits

Author SHA1 Message Date
Ishaan Jaffer 85d4000af6 test_vertex_ai_partner_models_token_counting_endpoint 2025-11-26 11:37:55 -08:00
Carlo Alberto Ferraris b50fcc4b56 vertex ai: use the correct domain for the global location when counting tokens (#17116) 2025-11-25 19:22:20 -08:00
Bowen Liang 4e12e3f90d fix typo of orginal (#16255) 2025-11-04 18:55:44 -08:00
steve-gore-snapdocs 88240c4cba Fix Anthropic token counting for VertexAI (#16171)
* transform anthropic messages in gemini handler

* initial

* linting

* remove extra test

* maintain consistency

* more tests

* Revert "transform anthropic messages in gemini handler"

This reverts commit 805e60fd2887991bb4b4554b9394437b874835f9.

* don't lint file we aren't changing

* cleanup

* cleanup

* Cleanup
2025-11-02 09:02:07 -08:00
Tim Elfrink c234b13275 Apply code formatting and linting fixes
- Apply Black formatting to all Bedrock CountTokens files
- Clean up imports and remove unused variables in tests
- Fix indentation and simplify test structure
- Fix pyright type error with type ignore annotation
- All tests continue to pass after cleanup
2025-09-18 08:28:17 +02:00
Tim Elfrink e74ac35b5d Add comprehensive tests for Bedrock CountTokens functionality
- Add endpoint integration test in test_proxy_token_counter.py
- Add unit tests for transformation logic in bedrock/count_tokens/
- Test model extraction from request body vs endpoint path
- Test input format detection (converse vs invokeModel)
- Test request transformation from Anthropic to Bedrock format
- All tests follow existing codebase patterns and pass successfully
2025-09-18 08:16:56 +02:00
Ishaan Jaff 1249385a99 [Feat] GEMINI CLI - Add Token Counter for VertexAI Models (#13558)
* add VertexAIModelInfo

* working API call to vertex ai

* add count_tokens MODE

* _construct_url

* test_vertex_ai_gemini_token_counting_with_contents
2025-08-12 20:53:47 -07:00
Ishaan Jaff afe159bb8b [Feat] GEMINI CLI Integration - Add /countTokens endpoint support (#13545)
* stash changes for token counter

* working TokenCountRequest

* working acount_tokens

* add GoogleAIStudioTokenCounter

* re-use validate_environment

* fixes count_tokens

* fixes google_count_tokens

* fixes token counter base class

* fix TokenCountResponse

* fix - use BaseTokenCounter

* add should_use_token_counting_api

* fixes for GoogleAIStudioTokenCounter

* fixes for should_use_token_counting_api

* fixes for google_count_tokens

* fixes for /messages count_tokens

* fixes for should_use_token_counting_api

* working e2e gemini token counter

* ruff check fixes

* fixes for token counter

* fixes for TokenCountResponse

* cleanup TokenCountRequest

* add TokenCountDetailsResponse

* fix use well typed Responses

* fix typing for TokenCountDetailsResponse

* test_vertex_ai_gemini_token_counting_with_contents

* fixes for TokenCountDetailsResponse

* test fixes

* test_factory_registration

* test_proxy_token_counter.py

* TestGoogleAIStudioTokenCounter

* fix token_counter
2025-08-12 16:19:58 -07:00
Jugal D. Bhatt 609fa9f5ca [LLM Translation + Coding tools] Added litellm claude code count tokens support (#13261)
* Added litellm claude code count tokens support

* fix mypy

* create helper

* Revert construct

* revert construct

* fix return

* Add return none

* change to factory approach

* refactor to BaseModelInfo

* enum fix
2025-08-05 10:57:24 -07:00
Krish Dholakia e68bb4e051 Litellm dev 12 12 2024 (#7203)
* fix(azure/): support passing headers to azure openai endpoints

Fixes https://github.com/BerriAI/litellm/issues/6217

* fix(utils.py): move default tokenizer to just openai

hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls

* fix(router.py): fix pattern matching router - add generic "*" to it as well

Fixes issue where generic "*" model access group wouldn't show up

* fix(pattern_match_deployments.py): match to more specific pattern

match to more specific pattern

allows setting generic wildcard model access group and excluding specific models more easily

* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty

don't delete all router models b/c of empty list

Fixes https://github.com/BerriAI/litellm/issues/7196

* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api

* fix(fireworks_ai/): support passing response_format + tool call in same message

Addresses https://github.com/BerriAI/litellm/issues/7135

* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"

This reverts commit 6a30dc692986a513cfb99c7a10c7cd34d8b93a4f.

* test: fix test

* fix(replicate/): fix replicate default retry/polling logic

* test: add unit testing for router pattern matching

* test: update test to use default oai tokenizer

* test: mark flaky test

* test: skip flaky test
2024-12-13 08:54:03 -08:00
Krish Dholakia 27e18358ab fix(pattern_match_deployments.py): default to user input if unable to… (#6632)
* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards

* test: fix test

* test: reset test name

* test: update conftest to reload proxy server module between tests

* ci(config.yml): move langfuse out of local_testing

reduce ci/cd time

* ci(config.yml): cleanup langfuse ci/cd tests

* fix: update test to not use global proxy_server app module

* ci: move caching to a separate test pipeline

speed up ci pipeline

* test: update conftest to check if proxy_server attr exists before reloading

* build(conftest.py): don't block on inability to reload proxy_server

* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well

* fix(encrypt_decrypt_utils.py): use function to get salt key

* test: mark flaky test

* test: handle anthropic overloaded errors

* refactor: create separate ci/cd pipeline for proxy unit tests

make ci/cd faster

* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs

* ci(config.yml): generate prisma binaries for proxy unit tests

* test: readd vertex_key.json

* ci(config.yml): remove `-s` from proxy_unit_test cmd

speed up test

* ci: remove any 'debug' logging flag

speed up ci pipeline

* test: fix test

* test(test_braintrust.py): rerun

* test: add delay for braintrust test
2024-11-08 00:55:57 +05:30