Commit Graph

45 Commits

Author SHA1 Message Date
Mateo Wang 2c733c00f5 chore(ci): modernize model references in tests and configs (#27856)
* test: modernize models used in CircleCI e2e test suites

Replaces obsolete models (gpt-4o, gpt-4o-mini, gpt-3.5-turbo,
claude-3-5-sonnet-20240620, claude-sonnet-4-20250514) with current
equivalents across the e2e_openai_endpoints and
proxy_e2e_anthropic_messages_tests CircleCI jobs.

- gpt-4o -> gpt-5.5 (responses API e2e tests)
- gpt-4o-mini -> gpt-5-mini (websocket responses, oai_misc_config)
- gpt-4o-mini-2024-07-18 -> gpt-4.1-mini-2025-04-14 (fine-tuning,
  still actively fine-tunable)
- gpt-4 / gpt-3.5-turbo target_model_names example -> gpt-5.5 /
  gpt-5-mini
- bedrock claude-3-5-sonnet-20240620 batch entry -> haiku-4-5-20251001
  (also aligning oai_misc_config model_name with what
  test_bedrock_batches_api.py actually requests)
- bedrock claude-sonnet-4-20250514 (deprecated, retires 2026-06-15)
  -> claude-sonnet-4-5-20250929

* test: point bedrock-claude-sonnet-4 alias at Sonnet 4.6, not 4.5

Greptile/Cursor flagged that after the previous commit, the
bedrock-claude-sonnet-4 alias collided with bedrock-claude-sonnet-4.5
(both pointed to claude-sonnet-4-5-20250929). Rename to
bedrock-claude-sonnet-4.6 and point it at the Sonnet 4.6 Bedrock ID
(us.anthropic.claude-sonnet-4-6, already in the litellm model
registry) so the alias name matches the underlying model version.

* test: modernize models across remaining CI-mounted configs & tests

Expands the modernization sweep to all CircleCI-mounted proxy configs
and to test directories where the model literal is a fixture/route key
(not the test's subject).

Config changes:
- proxy_server_config.yaml: bump gpt-3.5-turbo / gpt-3.5-turbo-1106 /
  gpt-4o / gemini-1.5-flash / dall-e-3 underlying models; rename
  gpt-3.5-turbo-end-user-test alias to gpt-5-mini-end-user-test; bump
  text-embedding-ada-002 underlying to text-embedding-3-small. User-
  facing aliases (gpt-3.5-turbo, gpt-4, text-embedding-ada-002, etc.)
  preserved for backward compatibility with tests.
- simple_config.yaml, otel_test_config.yaml, spend_tracking_config.yaml:
  bump gpt-3.5-turbo underlying to gpt-5-mini.
- pass_through_config.yaml: claude-3-5-sonnet / claude-3-7-sonnet /
  claude-3-haiku entries replaced with claude-sonnet-4-5 / claude-
  haiku-4-5 / claude-opus-4-7.
- oai_misc_config.yaml: align alias name with the gpt-5-mini rename.

Test changes (proactive: claude-sonnet-4-20250514 / claude-opus-4-
20250514 retire 2026-06-15):
- tests/llm_translation/test_anthropic_completion.py: bump 3 references
  + paired Vertex AI ID to claude-sonnet-4-5.
- tests/llm_translation/test_optional_params.py: bump 2 references.
- tests/pass_through_unit_tests/test_anthropic_messages_passthrough.py
  and test_bedrock_anthropic_messages_test.py: bump router fixtures
  using the deprecated model IDs.
- tests/pass_through_unit_tests/base_anthropic_messages_tool_search_test.py:
  modernize docstring examples.
- tests/test_end_users.py: update references to renamed alias.

* test: modernize placeholder model literals in router_unit_tests

Mass replace_all on fixture/placeholder model literals across the
router_unit_tests/ suite (model name is a routing key / label, not the
test subject). Sub-agent sweep so far — additional commits will follow
for logging_callback_tests/, enterprise/, top-level tests/test_*.py,
and other CI-mounted dirs.

Mappings applied:
- gpt-3.5-turbo -> gpt-5-mini
- gpt-4 (bare) -> gpt-5.5
- gpt-4o (bare) -> gpt-5
- text-embedding-ada-002 -> text-embedding-3-small
- claude-3-sonnet-20240229 / claude-3-opus-20240229 /
  claude-3-haiku-20240307 / claude-3-5-sonnet-20240620 ->
  claude-sonnet-4-5-20250929 / claude-opus-4-7 /
  claude-haiku-4-5-20251001 as appropriate

Explicitly preserved:
- gpt-4o-mini-* variants (transcribe, tts, etc.) where they're current
- gpt-4-turbo / gpt-4-vision-preview / gpt-4-0613 (subject literals)
- JSONL batch body literals
- Mock LLM response model fields (must match upstream)
- Fake/mock identifiers

* test: modernize placeholder model literals across remaining CI suites

Sub-agent sweep across logging_callback_tests/, guardrails_tests/,
enterprise/, pass_through_unit_tests/, otel_tests/,
llm_responses_api_testing/, batches_tests/, spend_tracking_tests/,
litellm_utils_tests/, unified_google_tests/, and a few top-level
tests/test_*.py files where the model literal is a fixture or
placeholder (router model_list, mock standard logging payload, mock
callback data) rather than the test's subject.

Mappings applied (see scope notes below):
- gpt-3.5-turbo -> gpt-5-mini
- gpt-4 (bare) -> gpt-5.5
- gpt-4o (bare) -> gpt-5.5 (corrected from initial gpt-5 — bare gpt-5
  is not a valid OpenAI alias; only gpt-5.5 / gpt-5.4 / gpt-5.2-codex
  / gpt-5-mini exist)
- gpt-4o-mini (bare) -> gpt-5-mini
- text-embedding-ada-002 -> text-embedding-3-small
- claude-3-sonnet-20240229 -> claude-sonnet-4-5-20250929
- claude-3-opus-20240229 -> claude-opus-4-7
- claude-3-haiku-20240307 -> claude-haiku-4-5-20251001
- claude-3-5-sonnet-20240620/20241022 -> claude-sonnet-4-5-20250929
- claude-3-7-sonnet-20250219 -> claude-sonnet-4-6
- gemini-1.5-flash -> gemini-2.5-flash
- gemini-1.5-pro -> gemini-2.5-pro

Explicitly preserved (not modernized):
- llm_translation/ tests where model is the SUBJECT (provider-specific
  translation/transformation logic). Only the deprecated 20250514
  references were already bumped in a prior commit.
- Cost-calc / tokenizer subject tests in test_utils.py (skip-ranges
  documented by the sub-agent).
- Bedrock model IDs in test_health_check.py path-stripping tests.
- JSONL batch request bodies and mock LLM response bodies (must match
  upstream literal).
- Langfuse expected-request-body JSON fixtures (cost values are exact-
  match-asserted; changing the model would shift response_cost).
- gpt-3.5-turbo-instruct (text-completion endpoint; no modern OpenAI
  equivalent).
- Top-level tests calling the proxy through user-facing aliases
  (gpt-3.5-turbo, gpt-4, text-embedding-ada-002, dall-e-3) — aliases
  in proxy_server_config.yaml stay; only the underlying model was
  bumped.
- tests/test_gpt5_azure_temperature_support.py (the test's whole point
  is model-name handling).
- Fake / mock / openai/fake identifiers.

Notable side fixes:
- test_spend_accuracy_tests.py: UPSTREAM_MODEL now matches what
  spend_tracking_config.yaml's proxy actually routes to (gpt-5-mini),
  resolving a latent inconsistency.
- proxy_server_config.yaml: bare `gpt-5` alias renamed to `gpt-5.5`
  (bare gpt-5 is not a valid OpenAI alias).
- test_batches_logging_unit_tests.py: explicit_models list entries
  kept distinct (gpt-5-mini + gpt-5.5) after bulk rename.

* test: fix CI failures from model modernization sweep

CI surfaced 4 categories of regression from the bulk modernization:

1. Azure deployment names are customer-specific. Reverted:
   - tests/litellm_utils_tests/test_health_check.py: azure/text-
     embedding-3-small -> azure/text-embedding-ada-002 (the CI Azure
     account does not have a text-embedding-3-small deployment).
   - tests/logging_callback_tests/test_custom_callback_router.py:
     same revert for two router fixtures driving aembedding.

2. gpt-5 family does not accept temperature != 1. Tests that pass a
   custom temperature swapped from gpt-5-mini to gpt-4.1-mini (modern
   non-reasoning OpenAI mini that still accepts temperature/logprobs):
   - tests/logging_callback_tests/test_datadog.py
   - tests/logging_callback_tests/test_langsmith_unit_test.py
   - tests/logging_callback_tests/test_otel_logging.py

3. proxy_server_config.yaml's gpt-3.5-turbo-large alias was routing to
   gpt-5.5 (a reasoning model that rejects logprobs). The proxy test
   tests/test_openai_endpoints.py::test_chat_completion_streaming
   exercises logprobs/top_logprobs through that alias. Bumped the
   underlying model to gpt-4.1 (non-reasoning, still modern).

4. tests/logging_callback_tests/test_gcs_pub_sub.py asserts against a
   pinned JSON fixture (gcs_pub_sub_body/spend_logs_payload.json) with
   hardcoded model="gpt-4o" and a model-specific spend value. Reverted
   the litellm.acompletion calls in the test to model="gpt-4o" so the
   fixture's exact-match assertions still hold.

5. tests/pass_through_unit_tests/test_anthropic_messages_passthrough.py:
   anthropic.messages.create routing to openai/gpt-5-mini returned an
   empty content[0] with max_tokens=100 (reasoning-token consumption).
   Swapped to openai/gpt-4.1-mini.

* test: fix Assistants API model + 2 cursor[bot] review nits

1. pass_through_unit_tests/test_custom_logger_passthrough.py: gpt-5.5
   isn't accepted by the /v1/assistants endpoint
   ("unsupported_model"). Switch to gpt-4.1-mini (modern, Assistants-
   API-supported, non-reasoning).

2. example_config_yaml/pass_through_config.yaml: the previous sweep
   bumped the claude-3-7-sonnet alias to claude-opus-4-7, which is a
   tier change (Sonnet -> Opus). Map to claude-sonnet-4-6 to keep the
   Sonnet tier intact. (Cursor bugbot review.)

3. example_config_yaml/simple_config.yaml: model_name was left as
   gpt-3.5-turbo while the underlying was bumped to gpt-5-mini, which
   muddles the "simple" example. Make both sides gpt-5-mini so the
   most basic example is a straight 1:1 mapping again. (Cursor bugbot
   review.)

* fix: revert gpt-4/gpt-3.5-turbo alias underlying to non-reasoning models

tests/test_openai_endpoints.py::test_completion calls the proxy alias
"gpt-4" with temperature=0, and other tests call gpt-3.5-turbo with
custom temperature / logprobs / the legacy /v1/completions endpoint.
The earlier modernization mapped both aliases to gpt-5.5 / gpt-5-mini,
which are reasoning models that reject temperature != 1 and don't
expose /v1/completions. Map the aliases to gpt-4.1 / gpt-4.1-mini
(modern non-reasoning OpenAI models) instead — keeps user-facing
aliases preserved while picking a current underlying that still
supports the parameters/endpoints the tests exercise.
2026-05-15 15:44:28 -07:00
Ishaan Jaffer e8461b5b97 style: run black formatter on files from main merge 2026-04-17 13:02:59 -07:00
Emerson Gomes cba3bcf1a9 fix(logging): avoid shared callback list references (#20984) 2026-02-13 18:32:41 +05:30
Ishaan Jaffer 2b069a343b test_init_custom_logger_compatible_class_as_callback 2025-12-06 16:21:50 -08:00
Ishaan Jaffer 2bc5d93f23 use_callback_in_llm_call 2025-10-01 18:32:37 -07:00
Ishaan Jaffer ddf918cb42 use_callback_in_llm_call 2025-09-27 11:55:41 -07:00
Ishaan Jaffer d739d226ed fix: test 2025-09-19 16:28:09 -07:00
Edward D'Amato 30fc5b871c feat(integrations): allow setting of braintrust callback base url (#13368)
* feat(integrations): allow setting of braintrust callback base url

* chore(misc): remove extra additions due to merge
2025-08-07 08:40:11 -07:00
Ishaan Jaff d727d63a81 [Feat] Add new AWS SQS Logging Integration (#12176)
* add aws_sqs

* add sqs controls

* add SQS to registry

* fix url lib parse

* fixes AWS SQS

* test_async_sqs_logger_flush

* fix test

* fix SQS logger auth

* add AWS SQS

* add aws sqs

* docs logging

* test_async_sqs_logger_flush

* test_async_sqs_logger_flush

* add SQS logger

* update SQS logging

* use constants for SQS
2025-06-30 14:02:49 -07:00
Ishaan Jaff 8c5fb6f539 [Feat] Enterprise - Allow dynamically disabling callbacks in request headers (#11985)
* Add support for disabling callbacks via x-litellm-disable-callbacks header

* add _is_callback_disabled_via_headers

* add get_proxy_server_request_headers

* _is_callback_disabled_via_headers

* X_LITELLM_DISABLE_CALLBACKS

* add EnterpriseCallbackControls

* use EnterpriseCallbackControls

* use CustomLoggerRegistry

* use CustomLoggerRegistry

* CustomLoggerRegistry

* EnterpriseCallbackControls

* TestEnterpriseCallbackControls

* docs clean up

* docs dynamic callbacks

* doc fixes

* fix code qa checks

* fix CustomLoggerRegistry

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-23 14:32:05 -07:00
Krrish Dholakia bb907b5ecc test: fix test 2025-06-16 18:49:41 -07:00
Krish Dholakia 7a128e2017 VertexAI Anthropic - streaming passthrough cost tracking (#11734)
* feat(vertex_passthrough_logging_handler.py): initial anthropic passthrough streaming cost tracking support

* fix: fix linting errors

* test: update test
2025-06-15 01:16:43 -07:00
Ishaan Jaff ad82792c4b fix lf OTEL 2025-06-14 15:43:38 -07:00
Ishaan Jaff 3db272b6d2 [Perf] - Add Async + Batched S3 Logging (#11340)
* fix: add s3 v2 async

* fix: add s3 v2 async

* fix: add s3 v2 async

* test: s3 v2 logging

* fixes: s3 logging

* fixes: s3 logging use max upload batch size

* fixes: s3 logging tests

* fixes: s3 logging tests

* fixes: s3 logging tests
2025-06-02 21:52:34 -07:00
Krish Dholakia 2efaa3cf36 Expose /list and /info endpoints for Audit Log events (#11102)
* feat(audit_logging_endpoints.py): expose list endpoint to show all audit logs

make it easier for user to retrieve individual endpoints

* feat(enterprise/): add audit logging endpoint

* feat(audit_logging_endpoints.py): expose new GET `/audit/{id}` endpoint

make it easier to retrieve view individual audit logs

* feat(key_management_event_hooks.py): correctly show the key of the user who initiated the change

* fix(key_management_event_hooks.py): add key rotations as an audit log event

'

* test(test_audit_logging_endpoints.py): add simple unit testing for audit log endpoint

* fix: testing fixes

* fix: fix ruff check
2025-05-23 22:54:59 -07:00
Ishaan Jaff 13bbf11ab0 test: fix failing deepeval test 2025-05-23 14:40:39 -07:00
Ishaan Jaff 42e6e664b2 [Refactor] Make Pagerduty a free feature (#10857)
* refactor: make pagerduty free

* refactor: make pagerduty free

* fix: pagerduty loc

* fix: linting error
2025-05-15 10:12:06 -07:00
Ishaan Jaff d13117992c fix: test_init_custom_logger_compatible_class_as_callback 2025-05-10 17:26:12 -07:00
Ishaan Jaff 3731ee436a [Refactor] Use pip package for enterprise/ folder (#10709)
* init enterprise pip

* init enterprise pip

* init enterprise pip

* test: enterprise pip

* add litellm-enterprise to pip

* litellm ent check

* litellm ent check

* fix import email router

* fix setup_litellm_enterprise_pip

* fix local testing with enterprise pip
2025-05-09 17:18:48 -07:00
Ishaan Jaff dd32860d62 [Feat] V2 Emails - Fixes for sending emails when creating keys + Resend API support (#10602)
* working email integration

* fix get_custom_loggers_for_type

* add SendKeyCreatedEmailEvent type

* bug fix, only send 1 email when creating key for user

* polish for emails for key created

* polish for key created email

* fix test_init_custom_logger_compatible_class_as_callback

* testing resend email integration

* testing fixes for email integration
2025-05-06 22:50:48 -07:00
Ishaan Jaff 489f1a6c25 [Feat] v2 Custom Logger API Endpoints (#10575)
* fixes for generic api logger

* tests for generic api logger

* test_generic_api_callback_multiple_logs

* allow health checking generic api endpoints

* docs generic api endpoint for logging

* allow setting headers for generic api  callback

* fix for test_init_custom_logger_compatible_class_as_callback

* fix linting
2025-05-05 16:57:55 -07:00
Ishaan Jaff 988e20aa36 [QA] Bedrock Vector Stores Integration - Allow using with registry + in OpenAI API spec with tools (#10516)
* refactor KB implementation to use central registry

* allow passing tools when making KB calls

* test fixes

* linting fix

* fix kb tests

* QA for KB stored in DB

* fix, use litellm_credential_name when adding KB on litellm UI

* QA list endpoint vector stores

* allow using UI creds with KBs
2025-05-03 08:30:38 -07:00
Ishaan Jaff f30871ef13 [Feat] Add support for using Bedrock Knowledge Bases with LiteLLM /chat/completions requests (#10413)
* add make_bedrock_kb_retrieve_request

* working bedrock KB hook

* working bedrock KB hook

* test_openai_with_knowledge_base_mock_openai

* fix linting

* fix BedrockKnowledgeBaseHook

* docs using bedrock kb with litellm

* docs kb with litellm

* fix bedrock kb test

* DynamicPromptManagementParamLiteral

* fix _should_run_prompt_management_hooks_without_prompt_id

* test_init_custom_logger_compatible_class_as_callback
2025-04-29 17:29:02 -07:00
Ishaan Jaff 44264ab6d6 fix failing agent ops test 2025-04-22 14:39:50 -07:00
Ishaan Jaff c1a642ce20 [UI] Allow setting prompt cache_control_injection_points (#10000)
* test_anthropic_cache_control_hook_system_message

* test_anthropic_cache_control_hook.py

* should_run_prompt_management_hooks

* fix should_run_prompt_management_hooks

* test_anthropic_cache_control_hook_specific_index

* fix test

* fix linting errors

* ChatCompletionCachedContent

* initial commit for cache control

* fixes ui design

* fix inserting cache_control_injection_points

* fix entering cache control points

* fixes for using cache control on ui + backend

* update cache control settings on edit model page

* fix init custom logger compatible class

* fix linting errors

* fix linting errors

* fix get_chat_completion_prompt
2025-04-14 21:17:42 -07:00
Krish Dholakia 21ea52105a Support arize phoenix on litellm proxy (#7756) (#8715)
* Update opentelemetry.py

wip

* Update test_opentelemetry_unit_tests.py

* fix a few paths and tests

* fix path

* Update litellm_logging.py

* accidentally removed code

* Add type for protocol

* Add and update tests

* minor changes

* update and add additional arize phoenix test

* update existing test

* address feedback

* use standard_logging_object

* address feedback

Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>
2025-02-22 20:55:11 -08:00
Ishaan Jaff f77882948d test_init_custom_logger_compatible_class_as_callback 2025-01-24 21:27:22 -08:00
Krish Dholakia 4911cd80a1 fix(utils.py): move adding custom logger callback to success event in… (#7905)
* fix(utils.py): move adding custom logger callback to success event into separate function + don't add success callback to failure event

if user is explicitly choosing 'success' callback, don't log failure as well

* test(test_utils.py): add unit test to ensure custom logger callback only adds callback to specific event

* fix(utils.py): remove string from list of callbacks once corresponding callback class is added

prevents floating values - simplifies testing

* fix(utils.py): fix linting error

* test: cleanup args before test

* test: fix test

* test: update test

* test: fix test
2025-01-22 21:49:09 -08:00
Krrish Dholakia 55546f403b Revert "fix: fix test"
This reverts commit 4e672f6269.
2025-01-22 18:21:07 -08:00
Krrish Dholakia 4e672f6269 fix: fix test 2025-01-22 18:19:49 -08:00
Ishaan Jaff 5c870c0c51 (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions (#7680)
* fix handle_sync_success_callbacks_for_async_calls

* fix handle_sync_success_callbacks_for_async_calls

* fix linting / testing errors

* use handle_sync_success_callbacks_for_async_calls

* add unit testing for logging fixes
2025-01-10 17:57:22 -08:00
Ishaan Jaff 03b1db5a7d (Feat) - Add PagerDuty Alerting Integration (#7478)
* define basic types

* fix verbose_logger.exception statement

* fix basic alerting

* test pager duty alerting

* test_pagerduty_alerting_high_failure_rate

* PagerDutyAlerting

* async_log_failure_event

* use pre_call_hook

* add _request_is_completed helper util

* update AlertingConfig

* rename PagerDutyInternalEvent

* _send_alert_if_thresholds_crossed

* use pagerduty as _custom_logger_compatible_callbacks_literal

* fix slack alerting imports

* fix imports in slack alerting

* PagerDutyAlerting

* fix _load_alerting_settings

* test_pagerduty_hanging_request_alerting

* working pager duty alerting

* fix linting

* doc pager duty alerting

* update hanging_response_handler

* fix import location

* update failure_threshold

* update async_pre_call_hook

* docs pagerduty

* test - callback_class_str_to_classType

* fix linting errors

* fix linting + testing error

* PagerDutyAlerting

* test_pagerduty_hanging_request_alerting

* fix unused imports

* docs pager duty

* @pytest.mark.flaky(retries=6, delay=2)

* test_model_info_bedrock_converse_enforcement
2025-01-01 07:12:51 -08:00
Krish Dholakia 41e5b3aa8d HumanLoop integration for Prompt Management (#7479)
* feat(humanloop.py): initial commit for humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* feat(humanloop.py): working e2e humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* fix(humanloop.py): fix linting errors

* fix: fix linting erro

* fix: fix test

* test: handle filenotfound error
2024-12-30 22:26:03 -08:00
Ishaan Jaff 17d5ff2fa4 (fix) initializing OTEL Logging on LiteLLM Proxy - ensure OTEL logger is initialized only once (#7435)
* add otel to _custom_logger_compatible_callbacks_literal

* remove extra code

* fix _get_custom_logger_settings_from_proxy_server

* update unit tests
2024-12-26 21:17:19 -08:00
Ishaan Jaff a790d43116 [Bug Fix]: ImportError: cannot import name 'T' from 're' (#7314)
* fix unused imports

* add test for python 3.12

* re introduce error - as a test

* update config for ci/cd

* fix python 13 install

* bump pyyaml

* bump numpy

* fix embedding requests

* bump pillow dep

* bump version

* bump pydantic

* bump tiktoken

* fix import

* fix python 3.13 import

* fix unused imports in tests/*
2024-12-19 13:09:30 -08:00
Ishaan Jaff 3c984ed60e (feat) Add Azure Blob Storage Logging Integration (#7265)
* add path to http handler

* AzureBlobStorageLogger

* test_azure_blob_storage

* use constants for Azure storage

* use helper get_azure_ad_token_from_entrata_id

* azure blob storage support

* get_azure_ad_token_from_azure_storage

* fix import

* azure logging

* docs azure storage

* add docs on azure blobs

* add premium user check

* add azure_storage  as identified logging callback

* async_upload_payload_to_azure_blob_storage

* docs azure storage

* callback_class_str_to_classType
2024-12-16 22:18:22 -08:00
Krish Dholakia 19a4273fda feat(langfuse/): support langfuse prompt management (#7073)
* feat(langfuse/): support langfuse prompt management

Initial working commit for langfuse prompt management support

Closes https://github.com/BerriAI/litellm/issues/6269

* test: update test

* fix(litellm_logging.py): suppress linting error
2024-12-06 23:10:22 -08:00
Krish Dholakia 7e9d8b58f6 LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870)
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.

* fix(utils.py): allow disabling end user cost tracking with new param

Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small

* docs(configs.md): add disable_end_user_cost_tracking reference to docs

* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role

Enables admin to restrict key creation, and assign team admins to handle distributing keys

* test(test_key_management.py): add unit testing for personal / team key restriction checks

* docs: add docs on restricting key creation

* docs(finetuned_models.md): add new guide on calling finetuned models

* docs(input.md): cleanup anthropic supported params

Closes https://github.com/BerriAI/litellm/issues/6856

* test(test_embedding.py): add test for passing extra headers via embedding

* feat(cohere/embed): pass client to async embedding

* feat(rerank.py): add `/v1/rerank` if missing for cohere base url

Closes https://github.com/BerriAI/litellm/issues/6844

* fix(main.py): pass extra_headers param to openai

Fixes https://github.com/BerriAI/litellm/issues/6836

* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set

Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically

* fix(handler.py): fix linting error

* fix: fix typing

* build: add conftest to proxy_admin_ui_tests/

* test: fix test

* fix: fix linting errors

* test: fix test

* fix: fix pass through testing
2024-11-23 15:17:40 +05:30
Krish Dholakia 3beecfb0d4 LiteLLM Minor Fixes & Improvements (11/13/2024) (#6729)
* fix(utils.py): add logprobs support for together ai

Fixes

https://github.com/BerriAI/litellm/issues/6724

* feat(pass_through_endpoints/): add anthropic/ pass-through endpoint

adds new `anthropic/` pass-through endpoint + refactors docs

* feat(spend_management_endpoints.py): allow /global/spend/report to query team + customer id

enables seeing spend for a customer in a team

* Add integration with MLflow Tracing (#6147)

* Add MLflow logger

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Streaming handling

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* lint

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* address comments and fix issues

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* address comments and fix issues

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Move logger construction code

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Add docs

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* async handlers

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* new picture

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* fix(mlflow.py): fix ruff linting errors

* ci(config.yml): add mlflow to ci testing

* fix: fix test

* test: fix test

* Litellm key update fix (#6710)

* fix(caching): convert arg to equivalent kwargs in llm caching handler

prevent unexpected errors

* fix(caching_handler.py): don't pass args to caching

* fix(caching): remove all *args from caching.py

* fix(caching): consistent function signatures + abc method

* test(caching_unit_tests.py): add unit tests for llm caching

ensures coverage for common caching scenarios across different implementations

* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one

* fix(router.py): drop redis password requirement

* fix(proxy_server.py): fix faulty slack alerting check

* fix(langfuse.py): avoid copying functions/thread lock objects in metadata

fixes metadata copy error when parent otel span in metadata

* test: update test

* fix(key_management_endpoints.py): fix /key/update with metadata update

* fix(key_management_endpoints.py): fix key_prepare_update helper

* fix(key_management_endpoints.py): reset value to none if set in key update

* fix: update test

'

* Litellm dev 11 11 2024 (#6693)

* fix(__init__.py): add 'watsonx_text' as mapped llm api route

Fixes https://github.com/BerriAI/litellm/issues/6663

* fix(opentelemetry.py): fix passing parallel tool calls to otel

Fixes https://github.com/BerriAI/litellm/issues/6677

* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling

reduces bugs in repo

* fix(__init__.py): update provider-model mapping to include all known provider-model mappings

Fixes https://github.com/BerriAI/litellm/issues/6669

* feat(anthropic): support passing document in llm api call

* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function

* fix(factory.py): fix linting error

* add clear doc string for GCS bucket logging

* Add docs to export logs to Laminar (#6674)

* Add docs to export logs to Laminar

* minor fix: newline at end of file

* place laminar after http and grpc

* (Feat) Add langsmith key based logging (#6682)

* add langsmith_api_key to StandardCallbackDynamicParams

* create a file for langsmith types

* langsmith add key / team based logging

* add key based logging for langsmith

* fix langsmith key based logging

* fix linting langsmith

* remove NOQA violation

* add unit test coverage for all helpers in test langsmith

* test_langsmith_key_based_logging

* docs langsmith key based logging

* run langsmith tests in logging callback tests

* fix logging testing

* test_langsmith_key_based_logging

* test_add_callback_via_key_litellm_pre_call_utils_langsmith

* add debug statement langsmith key based logging

* test_langsmith_key_based_logging

* (fix) OpenAI's optional messages[].name  does not work with Mistral API  (#6701)

* use helper for _transform_messages mistral

* add test_message_with_name to base LLMChat test

* fix linting

* add xAI on Admin UI (#6680)

* (docs) add benchmarks on 1K RPS  (#6704)

* docs litellm proxy benchmarks

* docs GCS bucket

* doc fix - reduce clutter on logging doc title

* (feat) add cost tracking stable diffusion 3 on Bedrock  (#6676)

* add cost tracking for sd3

* test_image_generation_bedrock

* fix get model info for image cost

* add cost_calculator for stability 1 models

* add unit testing for bedrock image cost calc

* test_cost_calculator_with_no_optional_params

* add test_cost_calculator_basic

* correctly allow size Optional

* fix cost_calculator

* sd3 unit tests cost calc

* fix raise correct error 404 when /key/info is called on non-existent key  (#6653)

* fix raise correct error on /key/info

* add not_found_error error

* fix key not found in DB error

* use 1 helper for checking token hash

* fix error code on key info

* fix test key gen prisma

* test_generate_and_call_key_info

* test fix test_call_with_valid_model_using_all_models

* fix key info tests

* bump: version 1.52.4 → 1.52.5

* add defaults used for GCS logging

* LiteLLM Minor Fixes & Improvements (11/12/2024)  (#6705)

* fix(caching): convert arg to equivalent kwargs in llm caching handler

prevent unexpected errors

* fix(caching_handler.py): don't pass args to caching

* fix(caching): remove all *args from caching.py

* fix(caching): consistent function signatures + abc method

* test(caching_unit_tests.py): add unit tests for llm caching

ensures coverage for common caching scenarios across different implementations

* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one

* fix(router.py): drop redis password requirement

* fix(proxy_server.py): fix faulty slack alerting check

* fix(langfuse.py): avoid copying functions/thread lock objects in metadata

fixes metadata copy error when parent otel span in metadata

* test: update test

* bump: version 1.52.5 → 1.52.6

* (feat) helm hook to sync db schema  (#6715)

* v0 migration job

* fix job

* fix migrations job.yml

* handle standalone DB on helm hook

* fix argo cd annotations

* fix db migration helm hook

* fix migration job

* doc fix Using Http/2 with Hypercorn

* (fix proxy redis) Add redis sentinel support  (#6154)

* add sentinel_password support

* add doc for setting redis sentinel password

* fix redis sentinel - use sentinel password

* Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714)

Fixes #6713

* (fix) using Anthropic `response_format={"type": "json_object"}`  (#6721)

* add support for response_format=json anthropic

* add test_json_response_format to baseLLM ChatTest

* fix test_litellm_anthropic_prompt_caching_tools

* fix test_anthropic_function_call_with_no_schema

* test test_create_json_tool_call_for_response_format

* (feat) Add cost tracking for Azure Dall-e-3 Image Generation  + use base class to ensure basic image generation tests pass  (#6716)

* add BaseImageGenTest

* use 1 class for unit testing

* add debugging to BaseImageGenTest

* TestAzureOpenAIDalle3

* fix response_cost_calculator

* test_basic_image_generation

* fix img gen basic test

* fix _select_model_name_for_cost_calc

* fix test_aimage_generation_bedrock_with_optional_params

* fix undo changes cost tracking

* fix response_cost_calculator

* fix test_cost_azure_gpt_35

* fix remove dup test (#6718)

* (build) update db helm hook

* (build) helm db pre sync hook

* (build) helm db sync hook

* test: run test_team_logging firdst

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

* test: update test

* test: skip anthropic overloaded error

* test: cleanup test

* test: update tests

* test: fix test

* test: handle gemini overloaded model error

* test: handle internal server error

* test: handle anthropic overloaded error

* test: handle claude instability

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>
2024-11-15 11:18:31 +05:30
Ishaan Jaff 030ece8c3f (Feat) New Logging integration - add Datadog LLM Observability support (#6449)
* add type for dd llm obs request ob

* working dd llm obs

* datadog use well defined type

* clean up

* unit test test_create_llm_obs_payload

* fix linting

* add datadog_llm_observability

* add datadog_llm_observability

* docs DD LLM obs

* run testing again

* document DD_ENV

* test_create_llm_obs_payload
2024-10-28 22:01:32 +05:30
Ishaan Jaff cdda7c243f (refactor) prometheus async_log_success_event to be under 100 LOC (#6416)
* unit testig for prometheus

* unit testing for success metrics

* use 1 helper for _increment_token_metrics

* use helper for _increment_remaining_budget_metrics

* use _increment_remaining_budget_metrics

* use _increment_top_level_request_and_spend_metrics

* use helper for _set_latency_metrics

* remove noqa violation

* fix test prometheus

* test prometheus

* unit testing for all prometheus helper functions

* fix prom unit tests

* fix unit tests prometheus

* fix unit test prom
2024-10-24 16:41:09 +04:00
Ishaan Jaff d1f457d17a (testing) add test coverage for init custom logger class (#6341)
* working test for init custom logger

* add test coverage for custom_logger_compatible_class_as_callback
2024-10-21 15:56:32 +05:30
Ishaan Jaff bd9e29b8b9 working test for init custom logger 2024-10-21 14:33:52 +05:30
Ishaan Jaff 24a3090ff6 fix init logger tests 2024-10-21 14:25:19 +05:30
Ishaan Jaff 11adc12326 add unit tests for init callbacks 2024-10-21 14:20:37 +05:30