litellm/tests/local_testing
Krish Dholakia 9e35ca2010 Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits (#10424)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* fix(caching_handler.py): handle str + list cache

Fixes issue on cache hits for embedding when initial cached input was str

* test(test_caching.py): add e2e test on caching with individual item and then list

* fix(caching_handler.py): set usage tokens for cache hits

Enables token counting to work on cache hits

* fix(caching_handler.py): combine usage between cached result and embedding response

Handles the case where some inputs are served from the cache and new inputs still need an embedding response

* fix: cleanup

* test: move to gpt-4o-new-test

* test: update test
2025-04-29 21:21:28 -07:00
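
The three caching fixes described in the commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in caching_handler.py: the function `cached_embedding`, the `cache` dict layout, and the `embed_fn` callback are all hypothetical names introduced here. It shows a bare string being normalized to a one-item list so that a later list request can hit the same entry, usage tokens being reported for cache hits, and usage being combined on a partial cache hit.

```python
from typing import Callable, Dict, List, Tuple, Union

Embedding = List[float]
# Hypothetical cache entry: the vector plus the prompt tokens it originally cost.
CacheEntry = Tuple[Embedding, int]
# Hypothetical backend: takes texts, returns (vectors, per-text token counts).
EmbedFn = Callable[[List[str]], Tuple[List[Embedding], List[int]]]


def cached_embedding(
    inputs: Union[str, List[str]],
    cache: Dict[str, CacheEntry],
    embed_fn: EmbedFn,
) -> dict:
    # Normalize a bare str to a one-item list so str and list requests
    # share the same per-item cache keys.
    items = [inputs] if isinstance(inputs, str) else list(inputs)

    hits = [text for text in items if text in cache]
    misses = [text for text in items if text not in cache]

    # Only call the backend for inputs that are not cached yet.
    if misses:
        vectors, token_counts = embed_fn(misses)
        for text, vector, tokens in zip(misses, vectors, token_counts):
            cache[text] = (vector, tokens)

    # Every item is now in the cache. Summing the stored token counts gives
    # usage for pure cache hits and combined usage for partial hits.
    prompt_tokens = sum(cache[text][1] for text in items)

    return {
        "data": [cache[text][0] for text in items],
        "usage": {"prompt_tokens": prompt_tokens, "total_tokens": prompt_tokens},
        "cache_hits": len(hits),
    }
```

Calling this first with a single string and then with a list containing that string plus a new one should send only the new string to `embed_fn`, while the reported usage covers both items.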