211 Commits

Author SHA1 Message Date
WilsonSunBritten 0721345703 Switch to string constant based truncation 2025-08-28 16:28:36 -06:00
Ishaan Jaff b9132968b2 [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS (#13905)
* [Performance] Reduce Significant CPU overhead from litellm_logging.py (#13895)

* fix: litellm.configured_cold_storage_logger

* fix Session Management - Non-OpenAI Models docs

* ruff fix

* test fix

* create LoggingWorker

* add GLOBAL_LOGGING_WORKER for async task handling

* fix logging tests

* add conftest

* fix conftest

* test fix location of encode bedrock runtime modelid arn

* fix conftest.py

* tuning LoggingWorker

* conftest.py

* fix conftest batches/

* test_async_chat_azure

* event_loop

* test_bedrock_streaming_passthrough_test2

* fix GLOBAL_LOGGING_WORKER

* logging worker

* add flush for global logging worker

* Revert "fix GLOBAL_LOGGING_WORKER"

This reverts commit d254f508f48935652f054777652938ad71976cce.

* fix conftest clear_queue

* fix conftest clear_queue

* setup_and_teardown for llm translation

* docs AWS_REGION

* test_async_chat_azure

* change test DIR

* run ci/cd again

* use 1 job for litellm_router_unit_testing

* fix space

* fix litellm_router_unit_testing

* test_aaarouter_dynamic_cooldown_message_retry_time

* litellm_router_unit_testing

* conftest.py clearing qu

* fixes litellm_router_unit_testing

* fixes clear_queue

* fix router_unit_tests

* remove conftest

* add back conftest for router

* fix event loop test

* test fix

* fixes for LoggingWorker

* ruff fix
2025-08-23 13:13:23 -07:00
Ishaan Jaff d96df5e9be Revert "test_stream_token_counting_anthropic_with_include_usage"
This reverts commit c3aee1194b.
2025-08-16 13:07:00 -07:00
Ishaan Jaff c3aee1194b test_stream_token_counting_anthropic_with_include_usage 2025-08-16 13:06:31 -07:00
Krrish Dholakia 8bb5ee2ba5 test: update unit test 2025-08-16 12:39:03 -07:00
Ishaan Jaff 4d941c914e [Feat] Responses API Session Handling - Multi media support (#13347)
* rename ResponsesSessionHandler

* use ResponsesSessionHandler

* test session handler

* refactor ResponsesSessionHandler

* fix get_proxy_server_request_from_spend_log

* use constant for LITELLM_TRUNCATED_PAYLOAD_FIELD

* add _should_check_cold_storage_for_full_payload

* add get_class_type_for_custom_logger_name

* get_active_custom_logger_for_callback_name

* add get_proxy_server_request_from_cold_storage to CustomLogger

* add ColdStorageHandler

* start using cold storage integration

* add get_proxy_server_request_from_cold_storage

* fixes from manual testing

* s3 v2 fix getting region name

* ChatCompletionImageUrlObject

* use _get_configured_cold_storage_custom_logger

* fixes for _should_check_cold_storage_for_full_payload

* fix _download_object_from_s3

* test_s3_v2_with_cold_storage

* add cold_storage_object_key to StandardLoggingMetadata

* use get_proxy_server_request_from_cold_storage_with_object_key

* add cold_storage_object_key to SpendLogsMetadata

* add cold_storage_object_key

* get_proxy_server_request_from_cold_storage_with_object_key

* use get_proxy_server_request_from_cold_storage_with_object_key

* test responses API

* add get_proxy_server_request_from_cold_storage_with_object_key

* session handler fixes

* test session handler

* fix ruff checks

* _download_object_from_s3

* cleanup

* test

* lint fix

* test_e2e_cold_storage_successful_retrieval

* test_e2e_generate_cold_storage_object_key_successful

* test_async_gcs_pub_sub_v1

* test fix

* test fix

* test fix

* test_standard_logging_metadata_has_cold_storage_object_key_field

* test_sanitize_request_body_for_spend_logs_payload_basic

* test_transform_input_image_item_to_image_item_with_image_data
2025-08-07 10:59:53 -07:00
Edward D'Amato 30fc5b871c feat(integrations): allow setting of braintrust callback base url (#13368)
* feat(integrations): allow setting of braintrust callback base url

* chore(misc): remove extra additions due to merge
2025-08-07 08:40:11 -07:00
Ishaan Jaff ee70d593c1 [Feat] Allow redacting message / response content for specific logging integrations - DD LLM Observability (#13158)
* fix redact_standard_logging_payload

* add StandardCustomLoggerInitParams

* allow defining DatadogLLMObsInitParams

* fix init DataDogLLMObsLogger

* fix import

* update redact_standard_logging_payload_from_model_call_details

* test_dd_llms_obs_redaction

* docs DD logging

* docs DD

* docs DD

* Redacting Messages, Response docs DD LLM Obs

* fix redaction logic

* fix create_llm_obs_payload

* fix logging response

* fixes

* ruff fix

* fix test

* test_dd_llms_obs_redaction

* test_create_llm_obs_payload

* redact_standard_logging_payload_from_model_call_details

* img - dd_llm_obs

* docs DD

* fix linting

* fix linting

* fix mypy

* test_create_llm_obs_payload

* test_create_llm_obs_payload

* fix mock_env_vars

* fix _handle_anthropic_messages_response_logging
2025-07-31 16:44:16 -07:00
Jugal D. Bhatt c7774ba495 add fix for redaction (#13005) 2025-07-25 17:48:48 -07:00
Jugal D. Bhatt a46b9d376f [Prometheus] Move Prometheus to enterprise folder (#12659)
* fix tools fetch for keys

* add prometheus to enterprise

* remove old prom

* remove old prom

* fix tests

* safe imports

* add if

* fix enterprise test

* rename imports

* added label import

* added label import

* move tests to enterprise

* fix tests

* add log

* build: update versions

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-07-18 11:54:47 -07:00
Ishaan Jaff 33c84846e9 [Refactor] Vector Stores - Use class VectorStorePreCallHook for all Vector Store Integrations (#12715)
* add VectorStorePreCallHook

* vector_store_pre_call_hook

* add pop_vector_stores_to_run

* async_get_chat_completion_prompt

* working e2e tests

* test_e2e_bedrock_knowledgebase_retrieval_with_completion

* delete old files

* fix logging test

* VectorStorePreCallHook

* fix ruff check

* vector_store_pre_call_hook

* linting error fixes
2025-07-17 16:31:58 -07:00
Ishaan Jaff d8327b4740 [Bug Fix] [Bug]: Knowledge Base Call returning error (#12628)
* bug fix using vector stores as tools

* test_e2e_bedrock_knowledgebase_retrieval_with_llm_api_call_with_tools
2025-07-15 21:33:49 -07:00
Krish Dholakia 749051105b Team Members - reset budget, if duration set + Prometheus - support tag based metrics (#12534)
* fix(internal_user_endpoints.py): initial commit removing logic to create new budget for new user if default max budget in team set

* feat(proxy_setting_endpoints.py): update team member budget when set via default internal user endpoint

removes need to create a unique budget per user

* feat(proxy_server.py): set team max member budget on startup, if set on config.yaml

* fix(prometheus.py): support custom tags for tracking on prometheus

Allows tracking user agent values on prometheus metrics

* test(test_internal_user_endpoints.py): fix test
2025-07-11 22:54:16 -07:00
Krish Dholakia c4af2eb5e2 MCP - usage tracking (#12397)
* fix(common_daily_activity.py): initial commit with working mock BE endpoint for mcp usage

* feat(ui/): show mcp server activity on UI

allows admin to know which mcp's are being used

* feat(common_daily_activity.py): return activity by key

* feat(ui/): show top api keys for a given model / mcp server

allow user to know which key is driving spend

* fix(common_daily_activity.py): use known mcp server names

* feat(server.py): log the namespaced tool name (includes server prefix)

allow accurate cost tracking

* feat(db_spend_update_writer.py): log by mcp_namespaced_tool_name

store aggregate daily activity by mcp_namespaced_tool_name

Enables cost / usage tracking by mcp tool name

* fix(server.py): add key/user metadata to mcp calls

* refactor(common_daily_activity.py): update to return mcp activity in API

* fix(common_daily_activity.py): handle empty key

* fix(common_daily_activity.py): track when api key is empty

* test(test_spend_management_endpoints.py): update tests

* fix: fix ui linting error

* fix: fix linting errors

* test: add missing key

* build(schema.prisma): add mcp tool tracking

* fix(migration.sql): add schema migration file

* feat(server.py): add request logging for mcp calls

enables storing the mcp calls

* fix(new_usage.tsx): fix linting errors

* fix: fix code qa errors

* fix(activity_metrics.tsx): fix ui linting errors post-merge

* fix(types/utils.py): fix linting error

* fix(server.py): always have name
2025-07-08 22:08:16 -07:00
Jugal D. Bhatt 6f27385edf Ensure message redaction works for responses API logging (#12291)
* add fixes to choice implementation redaction

* add isInstance check on responses API

* change datadog to revert back

* change datadog to revert back

* fix type errors

* Redaction test changes

* Redaction test changes

* Redaction test changes - remove changes
2025-07-04 15:11:20 -07:00
Ishaan Jaff 3d71b49d11 [Feat] Add failure logging support for s3 logger (#12299)
* add async_log_failure_event

* test_basic_s3_v2_logging_failure
2025-07-04 11:11:30 -07:00
Ishaan Jaff 66fafa3a7f [Feat] Polish - add better error validation when users configure prometheus metrics and labels to control cardinality (#12182)
* self._pretty_print_invalid_metric_error

* docs prometheus.md

* test prom validation checks

* update metric name

* fix _pretty_print_validation_errors

* fix linting

* test prometheus

* test fixes - prometheus
2025-07-01 20:17:17 -07:00
Ishaan Jaff d727d63a81 [Feat] Add new AWS SQS Logging Integration (#12176)
* add aws_sqs

* add sqs controls

* add SQS to registry

* fix url lib parse

* fixes AWS SQS

* test_async_sqs_logger_flush

* fix test

* fix SQS logger auth

* add AWS SQS

* add aws sqs

* docs logging

* test_async_sqs_logger_flush

* test_async_sqs_logger_flush

* add SQS logger

* update SQS logging

* use constants for SQS
2025-06-30 14:02:49 -07:00
Ishaan Jaff 2bb8048864 [Feat] Add OpenAI Search Vector Store Operation (#12018)
* add BaseVectorStoreTransformation

* fix BaseVectorStoreTransformation

* add OpenAIVectorStoreTransformation

* fix transform

* add search, asearch vector stores

* add skeleton for vector store searching

* fix VectorStoreSearchOptionalRequestParams

* fix VectorStoreRequestUtils

* fix litellm.asearch/litellm.search

* fix BaseVectorStoreConfig

* add vector_store_search_handler to llm http handler

* use llm http handler for searching vector stores

* fix base vector store config

* fix vector_store_search_handler

* async_vector_store_search_handler

* add conftest

* add BaseVectorStoreTest

* move litellm.integrations.vector_store_integrations

* fix working OAI OpenAIVectorStoreConfig

* add Search vector store

* add OpenAI Vector Stores
2025-06-24 15:52:43 -07:00
Ishaan Jaff 8c5fb6f539 [Feat] Enterprise - Allow dynamically disabling callbacks in request headers (#11985)
* Add support for disabling callbacks via x-litellm-disable-callbacks header

* add _is_callback_disabled_via_headers

* add get_proxy_server_request_headers

* _is_callback_disabled_via_headers

* X_LITELLM_DISABLE_CALLBACKS

* add EnterpriseCallbackControls

* use EnterpriseCallbackControls

* use CustomLoggerRegistry

* use CustomLoggerRegistry

* CustomLoggerRegistry

* EnterpriseCallbackControls

* TestEnterpriseCallbackControls

* docs clean up

* docs dynamic callbacks

* doc fixes

* fix code qa checks

* fix CustomLoggerRegistry

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-23 14:32:05 -07:00
Ishaan Jaff 931b2e4875 [Bug Fix] Fix model_group tracked for /v1/messages and /moderations (#11933)
* fixes _get_router_metadata_variable_name

* fixes _update_kwargs_before_fallbacks

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_moderations_api_logging

* fix _pass_through_moderation_endpoint_factory
2025-06-20 14:51:50 -07:00
Krrish Dholakia 3540984965 test: fix tests 2025-06-19 16:35:31 -07:00
Ishaan Jaff bcdb53920a [Fix] Bug Fix for using prom metrics config (#11779)
* fixes for using config for prom metrics

* test_set_llm_deployment_success_metrics_with_label_filtering

* fixes for deployment failure

* fix code qa checks

* test_async_post_call_success_hook
2025-06-17 14:44:24 -07:00
Krrish Dholakia bb907b5ecc test: fix test 2025-06-16 18:49:41 -07:00
Krish Dholakia 7a128e2017 VertexAI Anthropic - streaming passthrough cost tracking (#11734)
* feat(vertex_passthrough_logging_handler.py): initial anthropic passthrough streaming cost tracking support

* fix: fix linting errors

* test: update test
2025-06-15 01:16:43 -07:00
Krish Dholakia c92b6c175c Prometheus - fix request increment + add route tracking for streaming requests (#11731)
* fix(prometheus.py): remove request increment from inside the log success event

it's only done on post-call success/failure

* fix(litellm_logging.py): add additional validation step for checking if 'stream' is true

prevent double counting on non-stream requests

* test: add unit testing to ensure stream is not incorrectly set to true

* feat(litellm_logging.py): emit request route in standard logging payload

used by prometheus streaming metrics for route

* fix: fix otel test

* fix: fix linting errors

* test: update test

* fix: fix linting error
2025-06-14 16:26:48 -07:00
Ishaan Jaff ad82792c4b fix lf OTEL 2025-06-14 15:43:38 -07:00
Krrish Dholakia 3608db5ffe fix(prometheus.py): update tests 2025-06-06 09:12:54 -07:00
Ishaan Jaff 23627d6a26 [Fix] [Bug]: Knowledge Base Call returning error (#11467)
* fix:get_and_pop_recognised_vector_store_tools

* test: tools with vector stores

* test - bedrock kb tools

* fix: add clear comment

* fix: vector store tools
2025-06-05 18:24:36 -07:00
Ishaan Jaff a1f3a1c5dc [Feat] Performance - Don't create 1 task for every hanging request alert (#11385)
* feat: add async_get_oldest_n_keys in memory cache

* fix: add add_request_to_hanging_request_check

* test: alerting

* feat: v2 hanging request check

* fix: HangingRequestData

* fix: AlertingHangingRequestCheck

* fix: check_for_hanging_requests

* fix: use correct metadata location for hanging requests

* fix: formatting alert

* test hanging request check

* fix: add guard flags for background tasks alerting
2025-06-03 21:12:54 -07:00
Krrish Dholakia 2fe0a2750b test: ensure aws region correctly set 2025-06-03 20:58:21 -07:00
Ishaan Jaff 41a2a62511 fix: bedrock kb test 2025-06-03 11:55:41 -07:00
Ishaan Jaff 3db272b6d2 [Perf] - Add Async + Batched S3 Logging (#11340)
* fix: add s3 v2 async

* fix: add s3 v2 async

* fix: add s3 v2 async

* test: s3 v2 logging

* fixes: s3 logging

* fixes: s3 logging use max upload batch size

* fixes: s3 logging tests

* fixes: s3 logging tests

* fixes: s3 logging tests
2025-06-02 21:52:34 -07:00
Ishaan Jaff 7d47417906 test: fixes 2025-05-31 12:42:56 -07:00
Krish Dholakia 1995c7aad5 fix(utils.py): support non default params for audio transcription (#11212)
* fix(utils.py): support non default params for audio transcription

allows passing provider specific params straight through on transcription calls

* fix(gpt_transformation.py): fix o_series model routing

call _transform_request on async event

* refactor: refactor tests

* test(test_azure_chat_o_series_transformation.py): add unit test for azure o series error

* test: update test

* test: update json

* fix: fix multiple keyword error
2025-05-28 22:24:02 -07:00
Ishaan Jaff 0590b1eb3a [Fix] Prometheus Metrics - Do not track end_user by default + expose flag to enable tracking end_user on prometheus (#11192)
* fix: testing for disabling end user on metrics

* fix: fixes for test_prometheus_factory

* Delete litellm/model_prices_and_context_window_backup.json

* fix: issues with merge conflicts

* fix: test_get_end_user_id_for_cost_tracking_prometheus_only

* Update tests/test_litellm/integrations/test_prometheus.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-27 17:06:58 -07:00
Krish Dholakia 066a502b89 Litellm dev 05 26 2025 p1 (#11165)
* fix(utils.py): prevent leaking sensitive keys to langfuse

Fixes https://github.com/BerriAI/litellm/issues/11150

* test(langfuse/): unit test preventing future bedrock key leaks

Fixes https://github.com/BerriAI/litellm/issues/11150

* test(test_langfuse_e2e_test.py): add unit test for vertex - make sure no key leaks occur

* ci(test-litellm.yml): add pytest retry to github workflow

* fix(proxy_server.py): support forwarding `/sso/key/generate` to the server root path url

Fixes https://github.com/BerriAI/litellm/issues/10761

* fix(proxy_server.py): don't rewrite absolute path (PROXY_BASE_URL) with relative path (SERVER_ROOT_PATH)

This causes issues when using a custom path with sso, when doing redirects

* fix(utils.py): ignore token - will mistakenly redact 'max_tokens' as well
2025-05-26 22:00:48 -07:00
Krish Dholakia 010a4d44af Fix passing standard optional params (#11124)
* fix(main.py): use processed non-default-params as standard input params for langfuse

Fixes https://github.com/BerriAI/litellm/issues/11072

Fixes https://github.com/BerriAI/litellm/issues/11096

* fix(main.py): rename variable to be more accurate

* test(test_langfuse_e2e_test.py): add router unit test for langfuse e2e testing

Prevent https://github.com/BerriAI/litellm/issues/11072 from happening again

* build: update lock

* fix(utils.py): refactor optional params function

make it easier to get the standardized non default params

* fix(utils.py): improve process non default params function

* fix(main.py): include provider specific params in processed non default params used in logging

ensures user can see any provider specific params on langfuse
2025-05-24 12:12:31 -07:00
Ishaan Jaff 86cdb8382b [Feat] Use aiohttp transport by default - 97% lower median latency (#11097)
* fix: add flag for disabling use_aiohttp_transport

* feat: add _create_async_transport

* feat: fixes for transport

* add httpx-aiohttp

* feat: fixes for transport

* refactor: fixes for transport

* build: fix deps

* fixes: test fixes

* fix: ensure aiohttp does not auto set content type

* test: test fixes

* feat: add LiteLLMAiohttpTransport

* fix: fixes for responses API handling

* test: fixes for responses API handling

* test: fixes for responses API handling

* feat: fixes for transport

* fix: base embedding handler

* test: test_async_http_handler_force_ipv4

* test: fix failing deepeval test

* fix: add YARL for bedrock urls

* fix: issues with transport

* fix: comment out linting issues

* test fix

* test: XAI is unstable

* test: fixes for using respx

* test: XAI fixes

* test: XAI fixes

* test: infinity testing fixes

* docs(config_settings.md): document param

* test: test_openai_image_edit_litellm_sdk

* test: remove deprecated test

* bump respx==0.22.0

* test: test_xai_message_name_filtering

* test: fix anthropic test after bumping httpx

* use n 4 for mapped tests (#11109)

* fix: use 1 session per event loop

* test: test_client_session_helper

* fix: linting error

* fix: resolving GET requests on httpx 0.28.1

* test fixes proxy unit tests

* fix: add ssl verify settings

* fix: proxy unit tests

* fix: refactor

* tests: basic unit tests for aiohttp transports

* tests: fixes xai

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-05-23 22:55:35 -07:00
Krish Dholakia 2efaa3cf36 Expose /list and /info endpoints for Audit Log events (#11102)
* feat(audit_logging_endpoints.py): expose list endpoint to show all audit logs

make it easier for user to retrieve individual endpoints

* feat(enterprise/): add audit logging endpoint

* feat(audit_logging_endpoints.py): expose new GET `/audit/{id}` endpoint

make it easier to retrieve and view individual audit logs

* feat(key_management_event_hooks.py): correctly show the key of the user who initiated the change

* fix(key_management_event_hooks.py): add key rotations as an audit log event

* test(test_audit_logging_endpoints.py): add simple unit testing for audit log endpoint

* fix: testing fixes

* fix: fix ruff check
2025-05-23 22:54:59 -07:00
Ishaan Jaff 13bbf11ab0 test: fix failing deepeval test 2025-05-23 14:40:39 -07:00
Ishaan Jaff 754a94db97 Revert "Revert "Support passing prompt_label to langfuse (#11018)""
This reverts commit 0be7e7d088.
2025-05-22 14:14:39 -07:00
Ishaan Jaff 0be7e7d088 Revert "Support passing prompt_label to langfuse (#11018)"
This reverts commit 2b50b43ae2.
2025-05-22 14:11:19 -07:00
Krish Dholakia 2b50b43ae2 Support passing prompt_label to langfuse (#11018)
* fix: add prompt label support to prompt management hook

* feat: support 'prompt_label' parameter for langfuse prompt management

Closes https://github.com/BerriAI/litellm/discussions/9003#discussioncomment-13221555

* fix(litellm_logging.py): deep copy optional params to avoid mutation while logging

* fix(log-consistent-optional-param-values-across-providers): ensures params can be used for finetuning from providers

* fix: fix linting error

* test: update test

* test: update langfuse tests

* fix(litellm_logging.py): avoid deepcopying optional params

might contain thread object
2025-05-21 22:27:36 -07:00
Ishaan Jaff 14321a2708 [Feat] Prometheus - Track route on proxy_* metrics (#10992)
* fix: trace route on prometheus metrics

* fix: show route on prometheus metrics for total fails

* test: trace route on metrics

* fix: tests for route in prom metrics

* test: fix test metrics

* test: fix test_proxy_failure_metrics
2025-05-20 22:55:55 -07:00
Ishaan Jaff 298912bd38 [UI] - Add Guardrail Tracing to LiteLLM SpendLogs (#10893)
* feat: trace guardrail SLP in spendLogs

* test: trace guardrail SLP in spendLogs

* add guardrail viewer

* checkpoint - working guardrail view on logs

* ui add guardrail view to SpendLogs

* test: fixes guardrails

* trace: fixes guardrails
2025-05-16 12:20:20 -07:00
Ishaan Jaff 42e6e664b2 [Refactor] Make Pagerduty a free feature (#10857)
* refactor: make pagerduty free

* refactor: make pagerduty free

* fix: pagerduty loc

* fix: linting error
2025-05-15 10:12:06 -07:00
Ishaan Jaff 2a994d7016 [Feat] Presidio Improvements - Allow adding presidio api base on UI, Test presidio on Test Key, fixes for running presidio hook (#10840)
* feat: add GuardrailProviderSpecificParams

* feat: add apply_guardrail helper for presidio

* ui cleanup

* fixes pii config on ui

* fixes for adding presidio pii

* refactor: InitializeGuardrails

* feat: init guardrails from DB

* allow running guardrails from test key pg

* fix: running a request with guardrails on UI

* fix: types/guardrails.py

* test: test_presidio_pre_call_hook_with_different_call_types

* test: test_initialize_presidio_guardrail

* test: fix custom guardrail tests
2025-05-14 17:41:33 -07:00
Krish Dholakia a421316e56 fix(litellm_logging.py): log custom headers in requester metadata (#10818)
* fix(litellm_logging.py): log custom headers in requester metadata

allows passing along custom headers from client to logging integration - e.g. `x-correlation-id`

* refactor: move enterprise code out of OSS package

work towards simplified CE version of docker image

* test: update test

* fix: fix linting error
2025-05-13 23:04:37 -07:00
Ishaan Jaff a4fb1da2d9 fix: pass application/json for GenericAPILogger (#10772)
* fix: pass application/json for GenericAPILogger

* fix: test_generic_api_callback
2025-05-12 14:15:33 -07:00