Commit Graph

5 Commits

Author SHA1 Message Date
Ishaan Jaff 7656cb3d6e test fix 2025-09-01 17:04:47 -07:00
Krish Dholakia 78997c2e35 Anthropic - working mid-stream fallbacks (#13149)
* fix(router.py): add acompletion_streaming_iterator inside router

allows router to catch errors mid-stream for fallbacks

Work for https://github.com/BerriAI/litellm/issues/6532

* fix(router.py): working mid-stream fallbacks

* fix(router.py): more iterations

* fix(router.py): working mid-stream fallbacks with fallbacks set on router

* fix(router.py): pass prior content back in new request as assistant prefix message

* fix(router.py): add a system prompt to help guide non-prefix supporting models to use the continued text correctly

* fix(common_utils.py): support converting `prefix: true` for non-prefix supporting models

* fix: reduce LOC in function

* test(test_router.py): add unit tests for new function

* test: add basic unit test

* fix(router.py): ensure return type of fallback stream is compatible with CustomStreamWrapper

prevent client code from breaking

* fix: cleanup

* test: update test

* fix: fix linting error
2025-07-31 21:22:49 -07:00
Ishaan Jaff c7f14e936a (code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia 7e5085dc7b Litellm dev 11 21 2024 (#6837)
* Fix Vertex AI function calling invoke: use JSON format instead of protobuf text format. (#6702)

* test: test tool_call conversion when arguments is empty dict

Fixes https://github.com/BerriAI/litellm/issues/6833

* fix(openai_like/handler.py): return more descriptive error message

Fixes https://github.com/BerriAI/litellm/issues/6812

* test: skip overloaded model

* docs(anthropic.md): update anthropic docs to show how to route to any new model

* feat(groq/): fake stream when 'response_format' param is passed

Groq doesn't support streaming when response_format is set

* feat(groq/): add response_format support for groq

Closes https://github.com/BerriAI/litellm/issues/6845

* fix(o1_handler.py): remove fake streaming for o1

Closes https://github.com/BerriAI/litellm/issues/6801

* build(model_prices_and_context_window.json): add groq llama3.2b model pricing

Closes https://github.com/BerriAI/litellm/issues/6807

* fix(utils.py): fix handling ollama response format param

Fixes https://github.com/BerriAI/litellm/issues/6848#issuecomment-2491215485

* docs(sidebars.js): refactor chat endpoint placement

* fix: fix linting errors

* test: fix test

* test: fix test

* fix(openai_like/handler): handle max retries

* fix(streaming_handler.py): fix streaming check for openai-compatible providers

* test: update test

* test: correctly handle model is overloaded error

* test: update test

* test: fix test

* test: mark flaky test

---------

Co-authored-by: Guowang Li <Guowang@users.noreply.github.com>
2024-11-22 01:53:52 +05:30
Krrish Dholakia 3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00