Commit Graph

54 Commits

Author SHA1 Message Date
Sameer Kankute 8fd0c81e5b Add cost tracking for streaming in vertex ai 2025-11-20 15:08:38 +05:30
Ishaan Jaffer 159db27d5c fix test claude-sonnet-4-5-20250929 2025-10-31 18:13:29 -07:00
Sameer Kankute c1369a07ba Add per model group header forwarding for Bedrock Invoke API (#16042) 2025-10-30 20:10:17 -07:00
Sameer Kankute 85d4142845 Fix litellm_param based costing 2025-10-08 21:14:23 +05:30
Krish Dholakia 64083111d3 (Feat) Add Vertex AI Live API WebSocket Passthrough with Cost Tracking
2025-09-30 21:14:16 -07:00
Ishaan Jaffer 04b3ac89b8 test: QueryParams 2025-09-30 18:45:38 -07:00
Sameerlite ce0b815959 fix test 2025-09-27 02:08:09 +05:30
Sameerlite 61a450f2e2 fix lint 2025-09-27 01:16:09 +05:30
Sameerlite 67e7ad5aa9 Add vertex live api passthrough with cost tracking 2025-09-27 00:55:47 +05:30
Ishaan Jaffer 706b9214c0 fix: test_init_kwargs_for_pass_through_endpoint_basic 2025-09-18 07:59:05 -07:00
Ishaan Jaff 433d1a4947 [Bug fix] - Fix /messages fallback from Anthropic API -> Bedrock API (#13946)
* use helper get_provider_specific_headers

* fix get_provider_specific_headers

* test_anthropic_messages_fallbacks

* bedrock/us.anthropic.claude-sonnet-4

* fix: get_provider_specific_headers

* TestProviderSpecificHeaderUtils

* test_anthropic_messages_fallbacks
2025-08-25 13:44:54 -07:00
Ishaan Jaff b78495d398 [Fix] Ensure /messages works when using `bedrock/converse/<model>` with LiteLLM (#13627)
* get_bedrock_provider_config_for_messages_api

* fixes for get_bedrock_provider_config_for_messages_api

* test_anthropic_messages_litellm_router_bedrock

* fix merge conflicts

* fix - refactor based on jugal's comment
2025-08-14 16:50:05 -07:00
Krish Dholakia 039c8a922c Azure api_version="preview" support + Bedrock cost tracking via Anthropic /v1/messages (#13072)
* fix(azure/chat/gpt_transformation.py): support api_version="preview"

Fixes https://github.com/BerriAI/litellm/issues/12945

* Fix anthropic passthrough logging handler model fallback for streaming requests (#13022)

* fix: anthropic passthrough logging handler model fallback for streaming requests

- Add fallback logic to retrieve model from logging_obj.model_call_details when request_body.model is empty
- Fixes issue #12933 where streaming requests to anthropic passthrough endpoints would crash due to missing model field
- Ensures downstream logging and cost calculation work correctly for all streaming scenarios
- Maintains backwards compatibility with existing non-streaming requests

* test: add minimal tests for anthropic passthrough logging handler model fallback

- Add unit tests for the model fallback logic in _handle_logging_anthropic_collected_chunks
- Test existing behavior when request_body.model is present
- Test fallback logic when request_body.model is empty but logging_obj.model_call_details has model
- Test edge cases where both sources are empty or missing
- Ensure backwards compatibility and graceful degradation

* fix(anthropic_passthrough_logging_handler.py): add provider to model name (accurate cost tracking)

* fix(anthropic_passthrough_logging_handler.py): don't reset custom llm provider, if already set

* fix: fix check

---------

Co-authored-by: Haggai Shachar <haggai.shachar@backline.ai>
2025-07-29 08:13:55 -07:00
Ishaan Jaff 847c4514c4 test fix - test_anthropic_messages_passthrough.py 2025-06-30 21:56:31 -07:00
Ishaan Jaff d65a9fdcc7 [Bug Fix] Using /messages with lowest latency routing (#12180)
* add validate_anthropic_api_metadata

* fixes for lowest latency deployment

* add _select_metadata_field

* test_anthropic_messages_litellm_router_latency_metadata_tracking
2025-06-30 15:57:19 -07:00
Ishaan Jaff 75298af605 [Bug Fix] Cost tracking and logging via the /v1/messages API are not working when using Claude Code (#11928)
* add test_anthropic_messages_litellm_router_streaming_with_logging to base tests

* move test

* fixes for base ant tests

* working bedrock ant logging

* use BaseAnthropicMessagesStreamingIterator

* use common iterator for messages streaming

* TestAnthropicDirectAPI

* test_anthropic_claude3_transformation.py

* fix code QA checks

* fix logging for anthropic messages in SLP

* fix TestAnthropicOpenAIAPI

* remove hard coded usage for adapter

* test_anthropic_messages_litellm_router_streaming_with_logging
2025-06-20 18:08:35 -07:00
Ishaan Jaff 931b2e4875 [Bug Fix] Fix model_group tracked for /v1/messages and /moderations (#11933)
* fixes _get_router_metadata_variable_name

* fixes _update_kwargs_before_fallbacks

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_moderations_api_logging

* fix _pass_through_moderation_endpoint_factory
2025-06-20 14:51:50 -07:00
Ishaan Jaff 9ec6df59e4 fixes for pass through tests 2025-06-18 21:47:37 -07:00
Krish Dholakia c92b6c175c Prometheus - fix request increment + add route tracking for streaming requests (#11731)
* fix(prometheus.py): remove request increment from inside the log success event

it's only done on post-call success/failure

* fix(litellm_logging.py): add additional validation step for checking if 'stream' is true

prevent double counting on non-stream requests

* test: add unit testing to ensure stream is not incorrectly set to true

* feat(litellm_logging.py): emit request route in standard logging payload

used by prometheus streaming metrics for route

* fix: fix otel test

* fix: fix linting errors

* test: update test

* fix: fix linting error
2025-06-14 16:26:48 -07:00
Ishaan Jaff 362e358a77 [Feat] Allow using litellm.completion with /v1/messages API Spec (use gpt-4, gemini etc with claude code) (#11502)
* feat: add anthropic stream wrapper

* feat: add AnthropicExperimentalPassThroughConfig

* feat: working non streaming anthropic

* feat: working streaming anthropic-litellm bridge

* test - anthropic OpenAI bridge tests

* fix: add sync support for anthropic_messages

* fix: using is async check

* fix: ensure streams are SSE

* fix: imports

* fix code qa check

* fix: linting errors

* test_sync_openai_messages

* cleanup remove stash file
2025-06-06 20:35:53 -07:00
Ishaan Jaff 3a6802fef1 [Feat] - Add Support for Showing Passthrough endpoint Error Logs on LiteLLM UI (#10990)
* fix: add error logging for passthrough endpoints

* feat: add error logging for passthrough endpoints

* fix: post_call_failure_hook track errors on pt

* fix: use constant for MAXIMUM_TRACEBACK_LINES_TO_LOG

* docs MAXIMUM_TRACEBACK_LINES_TO_LOG

* test: ensure failure callback triggered

* fix: move _init_kwargs_for_pass_through_endpoint
2025-05-20 18:29:39 -07:00
Ishaan Jaff eeb27d70c1 [Fix] Allow using dynamic aws_region with /messages on Bedrock (#10779)
* fix: fix get_complete_url

* test: test_anthropic_messages_bedrock_dynamic_region
2025-05-12 20:22:38 -07:00
Ishaan Jaff 51930c07c5 [Fix]: /messages - allow using dynamic AWS params (#10769)
* fix: dynamic AWS params added for messages routes

* Update tests/pass_through_unit_tests/test_anthropic_messages_passthrough.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-12 14:09:17 -07:00
Ishaan Jaff e5a08a5ae1 [Feat] Add streaming support for using bedrock invoke models with /v1/messages (#10710)
* add basic bedrock transform

* test_anthropic_messages_streaming_bedrock_invoke

* fix: typing ant

* fix: get async response iterator

* fix: code quality check
2025-05-09 18:56:23 -07:00
Ishaan Jaff a0ee31edf8 [Feat] Add support for using Bedrock Invoke models in /v1/messages format (#10681)
* fix: add transform_anthropic_messages_request

* fix: add get_requested_response_api_optional_param

* fix: use base llm http handler for anthropic messages

* fix: add anthropic transform response

* fix: transform_anthropic_messages_response

* fix: fixes for anthropic messages

* fix: code qa fixes

* fix: pass thinking to anthropic

* fix: linting

* fixes

* feat: add folder for bedrock invoke messages

* feat: init bedrock invoke messages for anthropic claude family

* test: add bedrock invoke test for us anthropic

* test: test_anthropic_messages_non_streaming_bedrock_invokec

* feat: update anthropic messages transforms

* feat: update anthropic messages transforms

* Update litellm/utils.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: test_anthropic_messages_non_streaming

* fix: linting override

* fix: linting error

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-08 21:16:47 -07:00
Ishaan Jaff 9d8f570f14 [Refactor] Anthropic /v1/messages endpoint - Refactor to use base llm http handler and transformations (#10677)
* fix: add transform_anthropic_messages_request

* fix: add get_requested_response_api_optional_param

* fix: use base llm http handler for anthropic messages

* fix: add anthropic transform response

* fix: transform_anthropic_messages_response

* fix: fixes for anthropic messages

* fix: code qa fixes

* fix: pass thinking to anthropic

* fix: linting

* fixes
2025-05-08 17:56:50 -07:00
Ishaan Jaff 931c0c760c fix test_anthropic_messages_litellm_router_streaming_with_logging 2025-05-03 09:34:32 -07:00
Ishaan Jaff 4eac0f64f3 [Feat] Pass through endpoints - ensure PassthroughStandardLoggingPayload is logged and contains method, url, request/response body (#10194)
* ensure passthrough_logging_payload is filled in kwargs

* test_assistants_passthrough_logging

* test_assistants_passthrough_logging

* test_assistants_passthrough_logging

* test_threads_passthrough_logging

* test _init_kwargs_for_pass_through_endpoint

* _init_kwargs_for_pass_through_endpoint
2025-04-21 19:46:22 -07:00
Ishaan Jaff bd39a395f1 use new anthropic interface 2025-03-31 14:31:09 -07:00
Ishaan Jaff 9eb9a369bb working anthropic API tests 2025-03-26 17:34:41 -07:00
Krrish Dholakia 6b2f385ddf test: update tests 2025-03-22 12:56:42 -07:00
Krrish Dholakia 6f719d0461 test: fix test 2025-03-22 12:50:58 -07:00
Krrish Dholakia 3ce3689282 test: migrate testing 2025-03-22 12:48:53 -07:00
Krrish Dholakia 94d3413335 refactor(llm_passthrough_endpoints.py): refactor vertex passthrough to use common llm passthrough handler.py 2025-03-22 10:42:46 -07:00
Ishaan Jaff f47987e673 (Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthropic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
Ishaan Jaff 047d1b1208 (Bug Fix) - Accurate token counting for /anthropic/ API Routes on LiteLLM Proxy (#8880)
* fix _create_anthropic_response_logging_payload

* fix - pass through don't create standard logging payload

* fix logged key hash

* test_init_kwargs_for_pass_through_endpoint_basic

* test_unit_test_anthropic_pass_through

* fix anthropic pass through logging handler

* test_stream_token_counting_anthropic_with_include_usage

* convert_str_chunk_to_generic_chunk

* _build_complete_streaming_response

* test_anthropic_basic_completion_with_headers

* test_anthropic_streaming_with_headers

* improve test for pass through token counting
2025-02-27 15:43:03 -08:00
Ishaan Jaff 24df2331ec (fix) Anthropic pass through cost tracking (#8874)
* fix _create_anthropic_response_logging_payload

* fix - pass through don't create standard logging payload

* fix logged key hash

* test_init_kwargs_for_pass_through_endpoint_basic

* test_unit_test_anthropic_pass_through

* fix anthropic pass through logging handler
2025-02-27 15:42:43 -08:00
Ishaan Jaff 65c91cbbbc (QA+UI) - e2e flow for adding assembly ai passthrough endpoints (#8337)
* add initial test for assembly ai

* start using PassthroughEndpointRouter

* migrate to llm passthrough endpoints

* add assembly ai as a known provider

* fix PassthroughEndpointRouter

* fix set_pass_through_credentials

* working EU request to assembly ai pass through endpoint

* add e2e test assembly

* test_assemblyai_routes_with_bad_api_key

* clean up pass through endpoint router

* e2e testing for assembly ai pass through

* test assembly ai e2e testing

* delete assembly ai models

* fix code quality

* ui working assembly ai api base flow

* fix install assembly ai

* update model call details with kwargs for pass through logging

* fix tracking assembly ai model in response

* _handle_assemblyai_passthrough_logging

* fix test_initialize_deployment_for_pass_through_unsupported_provider

* TestPassthroughEndpointRouter

* _get_assembly_transcript

* fix assembly ai pt logging tests

* fix assemblyai_proxy_route

* fix _get_assembly_region_from_url
2025-02-06 18:27:54 -08:00
Ishaan Jaff 915cc064c5 fix test test_is_assemblyai_route 2025-02-03 21:58:32 -08:00
Ishaan Jaff 8fd60a420d (Feat) - New pass through add assembly ai passthrough endpoints (#8220)
* add assembly ai pass through request

* fix assembly pass through

* fix test_assemblyai_basic_transcribe

* fix assemblyai auth check

* test_assemblyai_transcribe_with_non_admin_key

* working assembly ai test

* working assembly ai proxy route

* use helper func to pass through logging

* clean up logging assembly ai

* test: update test to handle gemini token counter change

* fix(factory.py): fix bedrock http:// handling

* add unit testing for assembly pt handler

* docs assembly ai pass through endpoint

* fix proxy_pass_through_endpoint_tests

* fix standard_passthrough_logging_object

* fix ASSEMBLYAI_API_KEY

* test test_assemblyai_proxy_route_basic_post

* test_assemblyai_proxy_route_get_transcript

* fix is_assemblyai_route

* test_is_assemblyai_route

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-03 21:54:32 -08:00
Ishaan Jaff b6d61ec22b (Feat) pass through vertex - allow using credentials defined on litellm router for vertex pass through (#8100)
* test_add_vertex_pass_through_deployment

* VertexPassThroughRouter

* fix use_in_pass_through

* VertexPassThroughRouter

* fix vertex_credentials

* allow using _initialize_deployment_for_pass_through

* test_add_vertex_pass_through_deployment

* _set_default_vertex_config

* fix verbose_proxy_logger

* fix use_in_pass_through

* fix _get_token_and_url

* test_get_vertex_location_from_url

* test_get_vertex_credentials_none

* run pt unit testing again

* fix add_vertex_credentials

* test_adding_deployments.py

* rename file
2025-01-29 17:54:02 -08:00
Krrish Dholakia 8ab1335ae0 test: fix unit test 2025-01-16 21:11:17 -08:00
Krish Dholakia c57266c9dc test: initial commit enforcing testing on all anthropic pass through … (#7794)
* test: initial commit enforcing testing on all anthropic pass through functions

prevents future regressions

* test(test_unit_test_anthropic_pass_through.py): add unit test for '_get_user_from_metadata' function

* test(test_unit_test_anthropic_passthrough.py): add unit test for handle_logging_anthropic_collected_chunks

* test(test_unit_test_anthropic_pass_through): add coverage for all anthropic pass through functions
2025-01-15 22:02:35 -08:00
Krish Dholakia 80d6bbec29 Litellm dev 01 14 2025 p2 (#7772)
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking

* fix(anthropic/chat/transformation.py): use returned provider model for anthropic

handles anthropic `-latest` tag in request body throwing cost calculation errors

ensures we can be accurate in our model cost tracking

* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing

* test: update test to use assumption that user_api_key_dict can get anthropic user id

* test: fix test

* fix: fix test

* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block

can't guarantee user api key dict always has end user id - too many code paths

* fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object

* fix(auth_check.py): fix linting error

* test: fix auth check

* fix(auth_utils.py): fix get end user id to handle metadata = None
2025-01-15 21:34:50 -08:00
Ishaan Jaff 137879ffea vertex testing use pathrise-convert-1606954137718 2025-01-05 14:00:17 -08:00
Ishaan Jaff f3b13a9af3 (feat) Add Bedrock knowledge base pass through endpoints (#7267)
* bugfix: Proxy Routing for Bedrock Knowledgebase URLs are incorrect (#7097)

* Fixing routing bug where bedrock knowledgebase urls were being generated incorrectly

* Preparing for PR

* Preparing for PR

* Preparing for PR

---------

Co-authored-by: Luke Birk <lb0737@att.com>

* fix _is_bedrock_agent_runtime_route

* docs - Query Knowledge Base

* test_is_bedrock_agent_runtime_route

* fix bedrock_proxy_route

---------

Co-authored-by: LBirk <2731718+LBirk@users.noreply.github.com>
Co-authored-by: Luke Birk <lb0737@att.com>
2024-12-16 22:19:34 -08:00
Krish Dholakia 816f0ef8d2 LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051)
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations

ensures cost tracking is reliable - handles edge cases of parsing model cost map

* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models

Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329

* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map

Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html

* fix(converse_transformation.py): support amazon nova tool use

* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)

* feat(opentelemetry): add LLM request type attribute to spans

* lint

* fix: curl usage (#7038)

curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D

references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D

* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(streaming_chunk_builder.py): handle initial id being empty string

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint

* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints

* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk

* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk

* fix(litellm_logging.py): use standard logging payload if present in kwargs

prevent datadog logging error for pass through endpoints

* docs(bedrock.md): add rerank api usage example to docs

* bugfix/change dummy tool name format (#7053)

* fix viewing keys (#7042)

* ui new build

* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)

* bye (#6982)

* (fix) litellm router.aspeech (#6962)

* doc Migrating Databases

* fix aspeech on router

* test_audio_speech_router

* test_audio_speech_router

* docs show supported providers on batches api doc

* change dummy tool name format

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix: fix linting errors

* test: update test

* fix(litellm_logging.py): fix pass through check

* fix(test_otel_logging.py): fix test

* fix(cost_calculator.py): update handling for cost per second

* fix(cost_calculator.py): fix cost check

* test: fix test

* (fix) adding public routes when using custom header (#7045)

* get_api_key_from_custom_header

* add test_get_api_key_from_custom_header

* fix testing use 1 file for test user api key auth

* fix test user api key auth

* test_custom_api_key_header_name

* build: update ui build

---------

Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-12-06 14:29:53 -08:00
Ishaan Jaff 8fd3bf34d8 (feat) pass through llm endpoints - add PATCH support (vertex context caching requires for update ops) (#6924)
* add PATCH for pass through endpoints

* test_pass_through_routes_support_all_methods
2024-11-26 14:39:13 -08:00
Ishaan Jaff c285132ad6 (docs) Simplify /vertex_ai/ pass through docs (#6910)
* simplify vertex pass through docs

* allow using known path for setting up pass throughs

* add unit testing for vtx pass through auth
2024-11-25 23:57:50 -08:00
Ishaan Jaff 552c0dd7a4 (fix) pass through endpoints - run logging async + use thread pool executor for sync logging callbacks (#6907)
* run pass through logging async

* fix use thread_pool_executor for pass through logging

* test_pass_through_request_logging_failure_with_stream

* fix anthropic pt logging test

* test_pass_through_request_logging_failure
2024-11-25 22:52:05 -08:00