21313 Commits

Author SHA1 Message Date
Ishaan Jaff dabbb58cd8 test_nova_optional_params_tool_choice 2025-04-04 22:20:04 -07:00
Krish Dholakia 5099aac1a5 Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia e3b231bc11 fix(litellm-proxy-extras/utils.py): check migrations from correct directory + place prisma schema inside litellm-proxy-extras dir (#9767)
Allows prisma migrate deploy to work as expected on new db's
2025-04-04 22:11:07 -07:00
Ishaan Jaff 220fa23d2b watsonx/ibm/granite-3-8b-instruct 2025-04-04 21:46:02 -07:00
Ishaan Jaff e2bb203075 update watsonx/ibm/granite-3-8b-instruct 2025-04-04 21:45:04 -07:00
Ishaan Jaff f0f2f819bd Merge pull request #9760 from BerriAI/litellm_prometheus_error_monitoring
[Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error
2025-04-04 21:37:28 -07:00
Ishaan Jaff b7cd4cef07 test_get_exception_class_name 2025-04-04 21:32:55 -07:00
Ishaan Jaff df4593d58b test prom unit tests 2025-04-04 21:30:05 -07:00
Ishaan Jaff f4353973bd Merge pull request #9766 from BerriAI/litellm_add_auth_metrics_endpoint
[Security feature] Allow adding authentication on /metrics endpoints
2025-04-04 21:28:18 -07:00
Ishaan Jaff b89ed69257 Merge branch 'main' into litellm_add_auth_metrics_endpoint 2025-04-04 21:28:06 -07:00
Ishaan Jaff f402e9bbd1 _get_exception_class_name 2025-04-04 21:23:21 -07:00
Ishaan Jaff 8559bcc252 DB Transaction Queue Health Metrics 2025-04-04 21:16:12 -07:00
Ishaan Jaff 8c3670e192 Merge pull request #9719 from BerriAI/litellm_metrics_pod_lock_manager
[Reliability] Emit operational metrics for new DB Transaction architecture
2025-04-04 21:12:06 -07:00
Ishaan Jaff df51d8bcfa Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 21:11:39 -07:00
Ishaan Jaff fc4c453cb9 test_no_auth_metrics_when_disabled 2025-04-04 21:02:29 -07:00
Krrish Dholakia 7cd7bdbd0f build: fix model cost map 2025-04-04 20:48:29 -07:00
Krrish Dholakia 5826108c9a build: bump 2025-04-04 20:45:27 -07:00
caramulrooney 3e9066e91d Update model_prices_and_context_window.json (#9620)
Add watsonx/ibm/granite-3-8b-instruct
2025-04-04 20:44:06 -07:00
Hugo Liu 08f9e1447b fix(asr-groq): add groq whisper models to model cost map (#9648)
Co-authored-by: liuhu <liuhu@huami.com>
2025-04-04 20:43:46 -07:00
Chaos Yu 001043ba05 make sure metadata is available and has a value (#9764) 2025-04-04 20:39:12 -07:00
Ishaan Jaff eaad3b2402 PrometheusAuthMiddleware 2025-04-04 20:37:53 -07:00
Krish Dholakia af42e5855f Gemini image generation output support (#9646)
* fix(gemini/transformation.py): make GET request to get uri details, if they cannot be inferred

* fix: fix linting errors

* Revert "fix: fix linting errors"

This reverts commit 926a5a527ff27a107b39da8f5a26b0ee8e2d9884.

* fix(gemini/transformation.py): modalities param support

Partially resolves https://github.com/BerriAI/litellm/issues/9237

* feat(google_ai_studio/): add image generation support

Closes https://github.com/BerriAI/litellm/issues/9237

* fix: fix types

* fix: fix ruff check
2025-04-04 20:37:48 -07:00
Ishaan Jaff 86b473d267 allow adding auth on /metrics endpoint 2025-04-04 20:37:17 -07:00
Krish Dholakia 90a4dfab3c fix(xai/chat/transformation.py): filter out 'name' param for xai non-… (#9761)
* fix(xai/chat/transformation.py): filter out 'name' param for xai non-user roles

Fixes https://github.com/BerriAI/litellm/issues/9720

* test fix test_hf_chat_template

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-04-04 20:37:08 -07:00
Krish Dholakia d66db2207b Allow team members to see team models (#9742)
* fix(proxy_server.py): allow team member to see team models

* fix(model_dashboard.tsx): show edit + delete icons as disabled if user is not admin and did not create the models

* fix(proxy_server.py): fix ruff function size error

* fix(proxy_server.py): fix user model filter check
2025-04-04 20:36:48 -07:00
Ishaan Jaff 96ce5dbf7d _should_run_auth_on_metrics_endpoint 2025-04-04 20:32:04 -07:00
Ishaan Jaff c7523818b4 PrometheusAuthMiddleware 2025-04-04 20:27:17 -07:00
Krrish Dholakia b5851769fc fix: fix import 2025-04-04 20:26:11 -07:00
Krrish Dholakia 6395bd8d65 test: mark flaky test 2025-04-04 20:25:05 -07:00
Ishaan Jaff f16c531002 _mount_metrics_endpoint 2025-04-04 19:54:20 -07:00
Krish Dholakia c555c15ad7 fix(router.py): support reusable credentials via passthrough router (#9758)
* fix(router.py): support reusable credentials via passthrough router

enables reusable vertex credentials to be used in passthrough

* test: fix test

* test(test_router_adding_deployments.py): add unit testing
2025-04-04 18:40:14 -07:00
Ishaan Jaff 253060cb09 allow requiring auth for /metrics endpoint 2025-04-04 17:35:02 -07:00
Ishaan Jaff 8d76da03fe Merge pull request #9759 from BerriAI/litellm_reliability_fix_db_txs
[Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism
2025-04-04 17:12:20 -07:00
Ishaan Jaff c402db9057 prometheus emit llm provider on failure metric 2025-04-04 17:07:43 -07:00
Ishaan Jaff 150e77cd7d Merge branch 'main' into litellm_reliability_fix_db_txs 2025-04-04 16:46:46 -07:00
Ishaan Jaff d3018a4c28 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:46:32 -07:00
Ishaan Jaff 5c2bc796b1 test fix test_hf_chat_template 2025-04-04 16:45:20 -07:00
Ishaan Jaff 901d6fe7b7 add operational metrics for pod lock manager v2 arch 2025-04-04 16:41:07 -07:00
Krish Dholakia e1f7bcb47d Fix VertexAI Credential Caching issue (#9756)
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects

Fixes https://github.com/BerriAI/litellm/issues/7904

* fix: passing unit tests

* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls

prevents credential caching issue across both flows

* test: fix test

* fix(vertex_llm_base.py): handle project id in default case

* fix(factory.py): don't pass cache control if not set

bedrock invoke does not support this

* test: fix test

* fix(vertex_llm_base.py): add .exception message in load_auth

* fix: fix ruff error
2025-04-04 16:38:08 -07:00
Ishaan Jaff bde88b3ba6 fix type error 2025-04-04 16:34:43 -07:00
Ishaan Jaff 1cdee4b331 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:33:16 -07:00
Ishaan Jaff decb6649ec test_queue_flush_limit 2025-04-04 16:29:06 -07:00
Ishaan Jaff e77a178a37 test_queue_size_reduction_with_large_volume 2025-04-04 16:21:29 -07:00
Ishaan Jaff dc063fdfec test_queue_size_reduction_with_large_volume 2025-04-04 15:59:35 -07:00
Ishaan Jaff 5bed0b7557 aggregated values 2025-04-04 15:55:14 -07:00
Ishaan Jaff eb48cbdec6 aggregate_queue_updates 2025-04-04 15:54:07 -07:00
Ishaan Jaff cdd351a03b Merge pull request #9745 from BerriAI/litellm_sso_fixes_dev
[Feat] Allow assigning SSO users to teams on MSFT SSO
2025-04-04 15:40:19 -07:00
Ishaan Jaff 888446256c fix vertex failing test 2025-04-04 15:37:48 -07:00
Ishaan Jaff 93068cb142 flush_all_updates_from_in_memory_queue 2025-04-04 15:34:56 -07:00
Ishaan Jaff 065477abb4 add _get_aggregated_spend_update_queue_item 2025-04-04 15:32:27 -07:00