Commit Graph

26 Commits

Author SHA1 Message Date
Ishaan Jaff c3e673b627 [Feat] Add github co-pilot as a new LLM API provider (#12325)
* Litellm dev 03 05 2025 contributor prs (#9079)

* feat: add support for copilot provider

* test: add tests for github copilot

* chore: clean up github copilot authenticator

* test: add test for github copilot authenticator

* test: add test for github copilot for sonnet 3.7 thought model

* Fix #7629 - Add tzdata package to Dockerfile (#8915)

* Add tzdata package to Dockerfile

* Move tzdata to python requirement.txt

* feat: add support for copilot provider (#8577)

* feat: add support for copilot provider

* test: add tests for github copilot

* chore: clean up github copilot authenticator

* test: add test for github copilot authenticator

* test: add test for github copilot for sonnet 3.7 thought model

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* feat: add model information for copilot models

* fix: fix linting errors

* test: remove integration test for github_copilot + fix missing mock

* fix: use print to make sure the logger message shown

* test: remove debug print

* fix lint (#11112)

* Add init files to make test directories Python packages and update import paths in test_token_counter.py (#11119)

* Update litellm/model_prices_and_context_window_backup.json

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

---------

Co-authored-by: Son H. Nguyen <nhs.000.dev@gmail.com>
Co-authored-by: subnet.dev <50828879+subnet-dev@users.noreply.github.com>
Co-authored-by: Son H. Nguyen <33925625+nhs000@users.noreply.github.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* refactor github copilot

* test_github_copilot_transformation.py

* test_github_copilot_authenticator.py

* add GitHub Copilot

* fix order

* doc fix

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Son H. Nguyen <nhs.000.dev@gmail.com>
Co-authored-by: subnet.dev <50828879+subnet-dev@users.noreply.github.com>
Co-authored-by: Son H. Nguyen <33925625+nhs000@users.noreply.github.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2025-07-04 13:12:16 -07:00
Ishaan Jaff 39955129f5 fix mapped tests (#12320)
* fix - use flush llm client cache

* faster mapped tests

* test_async_multiple_response_ids_routing

* fix tests

* test_ateam_member_update_admin_requires_premium

* regular mapped tests

* Revert "Fix: Initialize JSON logging for all loggers when JSON_LOGS=True (#12206)"

This reverts commit 2c60c316ec.

* reset num workers
2025-07-04 10:04:43 -07:00
Cole McIntosh 2c60c316ec Fix: Initialize JSON logging for all loggers when JSON_LOGS=True (#12206)
When JSON_LOGS=True is set, error logs were not being formatted as JSON despite
the configuration. This was because the logging initialization code configured
individual loggers but failed to properly initialize all loggers with the JSON
formatter.

This fix ensures that when json_logs is enabled, the _initialize_loggers_with_handler()
function is called to:
- Configure all loggers (root, LiteLLM, Router, Proxy) with JSON formatter
- Disable logger propagation to prevent duplicate entries
- Set up exception handlers for JSON formatting
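
A minimal sketch of the first two points, using only the standard library. The names `JsonFormatter` and `initialize_loggers_with_handler` and the logger names are illustrative assumptions, not the actual LiteLLM code; exception-hook setup is left out.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["stacktrace"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def initialize_loggers_with_handler(handler: logging.Handler, names: list[str]) -> None:
    """Attach one shared handler to every named logger and stop propagation."""
    for name in names:
        logger = logging.getLogger(name)
        logger.handlers.clear()
        logger.addHandler(handler)
        # Without this, a record would also travel to parent loggers
        # and be written a second time by their handlers.
        logger.propagate = False
```

With `propagate` left at its default, a child logger such as `demo.router` would emit each record once itself and once more through `demo`, which is the duplicate-entry problem the list above refers to.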

Fixes LIT-267
2025-07-02 12:18:28 -07:00
Jugal D. Bhatt d322e772f0 Litellm add sentry scrubbing (#12210)
* add sentry scrubbing

* add new constants

* remove_unused_import

* sentry scrubbing test

* added unit test
2025-07-01 20:42:36 -07:00
Tim O'Farrell e50789730d Fix allow strings in calculate cost (#12200)
* Allow strings in calculate cost

Sometimes the cost per unit is a string (e.g., a value like "3e-7" read from config.yaml).

* Add comprehensive tests for string cost value handling

- Added test_string_cost_values() to test basic string cost conversion functionality
- Added test_calculate_cost_component_with_string_values() to test the calculate_cost_component function directly
- Added test_string_cost_values_edge_cases() to test mixed string/float costs and error handling
- Added test_string_cost_values_with_threshold() to test string costs with threshold pricing
- Enhanced _get_token_base_cost() to handle string-to-float conversion for base costs and threshold costs
- Enhanced generic_cost_per_token() to handle string-to-float conversion for audio and reasoning token costs
- All tests cover scientific notation (e.g., '3e-7'), decimal notation (e.g., '0.000001'), and error handling for invalid strings
- Maintains backward compatibility with existing float cost values
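
A rough sketch of the conversion described above, assuming a single normalizing helper. `to_float_cost` and `cost_component` are hypothetical names for illustration and do not reproduce the functions changed in this commit.

```python
from typing import Optional, Union

CostValue = Union[str, int, float, None]


def to_float_cost(value: CostValue) -> Optional[float]:
    """Hypothetical helper: normalize a per-unit cost to float, keeping None as None."""
    if value is None:
        return None
    try:
        # float() accepts scientific ("3e-7") and decimal ("0.000001") strings,
        # as well as ints and floats.
        return float(value)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"invalid cost value: {value!r}") from exc


def cost_component(tokens: int, cost_per_token: CostValue) -> float:
    """Cost of one usage component; a missing price contributes nothing."""
    rate = to_float_cost(cost_per_token)
    return 0.0 if rate is None else tokens * rate
```

Existing float inputs pass through unchanged, so callers that already supply numbers see the same results.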

* Dry up code

* Fixed case where number was an integer

* Allowing None

---------

Co-authored-by: openhands <openhands@all-hands.dev>
2025-07-01 11:37:15 -07:00
Krish Dholakia 49ed3065f6 VertexAI Anthropic - streaming cost tracking w/ prompt caching fixes (#12188)
* fix(rebuild-usage-object---ensure-cache_tokens-is-set): Ensures cache tokens is correctly set

Fixes https://github.com/BerriAI/litellm/issues/12149

* test(test_stream_chunk_builder_utils.py): add unit test to ensure cached tokens is part of stream chunk builder

Ensures standardized values are used
2025-06-30 22:00:10 -07:00
Krish Dholakia f7af8902b0 /v1/messages - Remove hardcoded model name on streaming + Tags - enable setting custom header tags (#12131)
* fix(anthropic/experimental_pass_through): use given model name when returning streaming chunks

don't hardcode model name on streaming

confusing for user

* fix(anthropic/streaming_iterator.py): remove scope of import

* feat(litellm_logging.py): allow admin to specify additional headers for using as spend tags

Closes https://github.com/BerriAI/litellm/issues/12129

* test(test_litellm_logging.py): add unit tests

* feat(openweb_ui.md): add custom tag tutorial to docs

* docs(cost_tracking.md): add tag based usage UI screenshot

* test: update test

* fix: fix import
2025-06-28 21:49:35 -07:00
Cole McIntosh 0b95fb63cc Add Azure OpenAI assistant features cost tracking (#12045)
* Add Azure OpenAI assistant features cost tracking

Implements cost tracking for Azure's new assistant features:
- File Search: $0.1 USD per 1 GB/Day (storage-based pricing)
- Code Interpreter: $0.03 USD per session
- Computer Use: $0.003 input + $0.012 output per 1K tokens

Features:
- Provider-specific pricing (Azure vs OpenAI)
- Model-specific pricing overrides via JSON config
- Environment variable configuration
- Backwards compatible with existing OpenAI pricing
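
As a worked illustration of the prices listed above, the arithmetic can be sketched as follows. The function names and the way usage is passed in are assumptions made for this example only.

```python
# Prices quoted in this commit message (USD).
FILE_SEARCH_PER_GB_DAY = 0.1
CODE_INTERPRETER_PER_SESSION = 0.03
COMPUTER_USE_INPUT_PER_1K = 0.003
COMPUTER_USE_OUTPUT_PER_1K = 0.012


def file_search_cost(storage_gb: float, days: float) -> float:
    """Storage-based pricing: gigabytes held, multiplied by days held."""
    return storage_gb * days * FILE_SEARCH_PER_GB_DAY


def code_interpreter_cost(sessions: int) -> float:
    """Flat price per session."""
    return sessions * CODE_INTERPRETER_PER_SESSION


def computer_use_cost(input_tokens: int, output_tokens: int) -> float:
    """Token pricing with separate input and output rates per 1K tokens."""
    return (
        (input_tokens / 1000) * COMPUTER_USE_INPUT_PER_1K
        + (output_tokens / 1000) * COMPUTER_USE_OUTPUT_PER_1K
    )
```

For example, 2 GB stored for 3 days comes to roughly 0.60 USD of file search, and 2,000 input plus 1,000 output tokens of computer use comes to roughly 0.018 USD.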

* Add comprehensive tests for Azure assistant features cost tracking

- Unit tests for file search, code interpreter, computer use, vector store
- Integration tests for combined cost calculation
- Provider-specific pricing tests (Azure vs OpenAI)
- Model-specific pricing override tests
- Edge case handling (None inputs, zero values)
- All 17 tests passing

* Fix test and ensure all Azure assistant cost tracking tests pass

- Fixed integration test approach
- All 17 tests now passing
- Comprehensive coverage of Azure assistant features cost tracking

* Enhance cost tracking for Azure assistant features

- Safely convert and extract parameters for file search, computer use, and code interpreter sessions.
- Ensure model_info is consistently converted to a dictionary format.
- Improve error handling for input values to prevent type-related issues.
- Maintain compatibility with existing cost calculation methods.

* Refactor cost tracking for Azure assistant features

- Introduced separate methods for handling costs related to web search, file search, vector store, computer use, and code interpreter.
- Enhanced parameter extraction and conversion for file search and computer use.
- Improved error handling and type safety throughout the cost calculation process.
- Maintained compatibility with existing cost calculation methods while streamlining the overall structure.
2025-06-27 21:33:00 -07:00
Ishaan Jaff ebf6395bc1 [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM (#12119)
* add ELEVENLABS as a provider

* add deepgram to main.py

* add ElevenLabsException

* add ElevenLabsAudioTranscriptionConfig

* add transform_audio_transcription_response

* TestElevenLabsAudioTranscription

* add elevenlabs/scribe_v1 to model cost map

* add ElevenLabsAudioTranscriptionConfig

* add AudioTranscriptionRequestData

* add ElevenLabs transform

* use AudioTranscriptionRequestData

* refactoring fixes

* add ProcessedAudioFile util for reading audio files

* test_elevenlabs_diarize_parameter_passthrough

* docs eleven labs

* docs fixes

* fix code qa checks

* fixes - audio transcription

* ui - add ElevenLabs logo

* add elevenlabs logo

* docs - ElevenLabs

* test fix elevenlabs
2025-06-27 17:50:49 -07:00
Ishaan Jaff f04808e293 [Bug Fix] Exception mapping for context window exceeded - should catch anthropic exceptions (#12113)
* fix is_error_str_context_window_exceeded

* test_is_error_str_context_window_exceeded

* fix is_error_str_context_window_exceeded
2025-06-27 15:49:07 -07:00
Davis Featherstone 51074ffcae Fix Azure-OpenAI Vision API Compliance (#12075)
* Fix Azure-OpenAI Vision API Compliance

* Linting Fix
2025-06-26 22:54:15 -07:00
Cole McIntosh 8965cc3b6c Fix unpack_defs handling of nested $ref inside anyOf items (#11964)
* refactor(unpack_defs): enhance handling of schema properties and anyOf structures

- Improved the unpack_defs function to handle top-level properties and nested structures more effectively.
- Added recursion for items in schemas and refined the handling of anyOf branches to ensure proper unpacking of references.
- Streamlined the logic for resolving $ref keys and managing nested schemas.

* test(unpack_defs): add test for resolving nested $ref in anyOf items

- Introduced a new test to verify that unpack_defs correctly resolves references within items of anyOf structures, addressing a specific bug scenario (Issue #11372).
- The test includes a minimal schema to ensure proper unpacking and validation of the resolved items schema.

* refactor(unpack_defs): implement a generic resolver for $ref entries

- Redesigned the unpack_defs function to provide a more robust and dependency-free implementation for resolving all $ref entries in JSON schemas.
- Introduced a depth-first traversal method that efficiently handles nested structures, including anyOf, allOf, and items, while avoiding infinite recursion.
- Enhanced memory management by resolving nodes in-place without creating a full dereferenced copy, improving performance and reducing overhead.
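
A simplified sketch of depth-first `$ref` resolution with a guard against cycles. `resolve_refs` is a hypothetical name, and unlike the in-place approach described above this version returns a new structure to keep the example short.

```python
from typing import Any


def resolve_refs(
    node: Any, defs: dict[str, Any], active: frozenset[str] = frozenset()
) -> Any:
    """Depth-first: replace {"$ref": "#/$defs/Name"} nodes with the referenced schema."""
    if isinstance(node, list):
        # Covers anyOf / allOf branches, which are lists of schemas.
        return [resolve_refs(item, defs, active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str):
        name = ref.rsplit("/", 1)[-1]
        if name in active or name not in defs:
            # Leave cyclic or unknown references untouched instead of recursing forever.
            return node
        return resolve_refs(defs[name], defs, active | {name})
    # Covers properties, items, and any other nested schema object.
    return {key: resolve_refs(value, defs, active) for key, value in node.items()}
```

The `active` set holds the definitions currently being expanded on the path from the root, so a self-referencing definition is expanded once and its inner reference is left as a `$ref`.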

* Remove test for unpack_defs resolving nested references in anyOf items from test_utils.py

* Add test for unpack_defs resolving nested references in anyOf items

This commit introduces a new test to ensure that the unpack_defs function correctly resolves $ref references within items of anyOf schemas, addressing issue #11372. The test verifies that the unpacked schema contains the expected properties and structure.
2025-06-24 09:08:15 -07:00
Krish Dholakia a89397a798 Litellm dev 06 23 2025 p1 (#11989)
* fix(litellm_logging.py): fix using router model id for logging calls

Fixes https://github.com/BerriAI/litellm/issues/11975#issuecomment-2995882238

* test(test_litellm_logging.py): add unit test for custom price tracking

* fix(vertex_ai/): don't send invalid format parameter to vertex

causes calls to fail

* fix(vertex_ai_context_caching.py): if cached content present and tools in message, cache tools as well

gemini throws errors if tools passed in alongside cached content

* test: add unit tests

* fix: fix linting errors

* test: test_vertex_ai_common_utils.py

update test

* fix(streaming_handler.py): unset response cost when creating model response
2025-06-23 22:33:06 -07:00
Krish Dholakia 308e82d885 LiteLLM SDK <-> Proxy improvement (don't transform message client-side) + Bedrock - handle qs:.. in base64 file data + Tag Management - support adding public model names (#11908)
* fix(factory.py): handle qs:.. in mime type

Fixes https://github.com/BerriAI/litellm/issues/11839

* feat(litellm_proxy/): don't transform messages client-side

leave litellm proxy messages untouched - allow proxy to handle transformation

prevents double transformation

* feat(tag_management_endpoints.py): support adding models to tag by adding model_name

Closes https://github.com/BerriAI/litellm/issues/11884

* test(test_tag_management_endpoints.py): add unit tests for adding new model by public model name

* test: update test
2025-06-19 22:34:18 -07:00
Krish Dholakia 0d09c8ec96 Litellm dev 06 18 2025 p1 (#11872)
* fix(spend_tracking_utils.py): add user agent tags from standard logging payload, in spend logs payload

* feat(litellm_logging.py): identify user agent tags as `User-Agent: ..` and allow admin to disable storing user agent as tag

* fix(azure_ai/): pass content type header in azure ai request

Fixes https://github.com/BerriAI/litellm/issues/11227

* test: add unit test

* fix(router.py): fix passing dynamic credentials to retrieve batch

Fixes batch retrieval when using router

* test: add more unit tests
2025-06-18 21:24:36 -07:00
Krish Dholakia c92b6c175c Prometheus - fix request increment + add route tracking for streaming requests (#11731)
* fix(prometheus.py): remove request increment from inside the log success event

it's only done on post-call success/failure

* fix(litellm_logging.py): add additional validation step for checking if 'stream' is true

prevent double counting on non-stream requests

* test: add unit testing to ensure stream is not incorrectly set to true

* feat(litellm_logging.py): emit request route in standard logging payload

used by prometheus streaming metrics for route

* fix: fix otel test

* fix: fix linting errors

* test: update test

* fix: fix linting error
2025-06-14 16:26:48 -07:00
Krrish Dholakia 31a73be03f fix(litellm_logging.py): skip should_run_logging check on streaming
2025-06-13 21:19:24 -07:00
Krish Dholakia 07472ce21f Logging: prevent double logging logs when bridge is used (anthropic <-> chat completion OR chat completion <-> responses api) (#11687)
* feat(anthropic/passthrough): pass dynamic api key/api base params to litellm.completion

allows calls to work with config.yaml

* fix(responses_api/transformation): fix passing dynamic params to responses api from .completion()

Allows responses api to work with config.yaml

* fix(langfuse.py): fix responses api usage logging to langfuse

* refactor(litellm_logging.py): add more generic solution for responses api usage logging

ensures it works across all logging integrations

* fix(litellm_logging.py): patch for anthropic messages not returning a pydantic object

it should ideally return a pydantic object, which would simplify checks and reduce errors

* fix(handler.py): correctly bubble up empty choices errors to litellm.completion

causes downstream errors as it is expected there is at least one choice set

* feat(litellm_logging.py): prevent double logging litellm responses

ensures accurate spend tracking for calls when bridges are used

* fix(litellm_logging.py): ensure logging is consistently enforced across all call types

* fix: patch - set calltype before entering bridge api

ensures logging object is applying the correct logic on the event hooks

* fix(types/router.py): loosen type hint for mock response

* change space_key header to space_id for Arize (#11595)

* feat(schema): add additional indexes to LiteLLM_SpendLogs for improved query performance (#11675)

* Revert "feat(schema): add additional indexes to LiteLLM_SpendLogs for improve…" (#11683)

This reverts commit 2a7f113fde.

* [Feat] Use dedicated Rest endpoints for list, calling MCP tools  (#11684)

* fix: (fix) use specific rest endpoints for MCP

* ui - use rest mcp endpoints

* fix imports

* docs DISABLE_AIOHTTP_TRUST_ENV

* docs(caching.md): remove batch redis get recommendation - old code path, no longer necessary

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini not passing audio token usage data

* Chat Completions <-> Responses API Bridge Improvements (#11685)

* feat(anthropic/passthrough): pass dynamic api key/api base params to litellm.completion

allows calls to work with config.yaml

* fix(responses_api/transformation): fix passing dynamic params to responses api from .completion()

Allows responses api to work with config.yaml

* fix(langfuse.py): fix responses api usage logging to langfuse

* refactor(litellm_logging.py): add more generic solution for responses api usage logging

ensures it works across all logging integrations

* fix(litellm_logging.py): patch for anthropic messages not returning a pydantic object

it should ideally return a pydantic object, which would simplify checks and reduce errors

* fix(handler.py): correctly bubble up empty choices errors to litellm.completion

causes downstream errors as it is expected there is at least one choice set

* fix(response_metadata.py): allow model_info to be none

* fix(litellm_logging.py): copy object before mutating

* fix: fix lint check

* fix: fix linting error

* fix: fix linting error

---------

Co-authored-by: vanities <mischkeaa@gmail.com>
Co-authored-by: Cole McIntosh <82463175+colesmcintosh@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-06-12 23:07:36 -07:00
Krish Dholakia 39de3610be fix(internal_user_endpoints.py): support user with + in email on us… (#11601)
* fix(internal_user_endpoints.py): support user with `+` in email on user info

ensures user is correctly parsed from input
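
The commit message does not spell out the cause. One common way a `+` gets lost, shown here as an assumption with a made-up address, is that form-style query strings decode `+` as a space:

```python
from urllib.parse import parse_qs, quote

# Hypothetical request: the "+" in the address is sent unescaped.
raw_query = "user_id=jane+test@example.com"
parsed = parse_qs(raw_query)["user_id"][0]
# parsed now contains a space where the "+" was.

# Percent-encoding the value on the client keeps the "+" intact.
encoded_query = "user_id=" + quote("jane+test@example.com", safe="")
round_tripped = parse_qs(encoded_query)["user_id"][0]
```

Whether the endpoint fixes this on the server or expects an encoded value is not stated here; the sketch only illustrates why such an address can be misparsed.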

* fix(factory.py): support vertex function call args as None

handles empty string in args for vertex gemini calls

* docs(langfuse_integration.md): pin langfuse sdk version on docs

* fix(vertex_ai/): return empty dict, instead of none when empty string given
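
A small sketch of the behaviour described in the two vertex fixes above, assuming the arguments arrive as a JSON string. `parse_tool_call_args` is a hypothetical helper, not the function changed in this commit.

```python
import json
from typing import Any, Optional


def parse_tool_call_args(arguments: Optional[str]) -> dict[str, Any]:
    """Hypothetical helper: None or an empty string becomes {} instead of None."""
    if arguments is None or not arguments.strip():
        return {}
    return json.loads(arguments)
```

Returning an empty dict keeps downstream code from having to special-case `None` when a tool is called without arguments.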

* refactor: reduce function size

* fix: fix linting errors

* fix: revert check

* fix(internal_user_endpoints.py): fix check

* test: update tests

* test: update tests
2025-06-10 22:13:10 -07:00
Krish Dholakia 8dd8615a54 Ensure consistent 'created' across all chunks + set tool call id for ollama streaming calls (#11528)
* fix(streaming_handler.py): maintain same 'created' across all chunks

Fixes https://github.com/BerriAI/litellm/issues/11437

* test: add unit test to ensure created is always the same across all chunks
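
One way to keep `created` stable across a stream, sketched with plain dicts standing in for chunk objects. `with_stable_created` is a hypothetical name and this is not the streaming handler's actual code.

```python
import time
from typing import Iterable, Iterator, Optional


def with_stable_created(chunks: Iterable[dict]) -> Iterator[dict]:
    """Stamp every chunk with the 'created' value of the first chunk."""
    created: Optional[int] = None
    for chunk in chunks:
        if created is None:
            # Fall back to the current time if the first chunk carries no timestamp.
            created = chunk.get("created") or int(time.time())
        yield {**chunk, "created": created}
```

Each yielded chunk is a copy, so the caller's original chunk objects are not mutated.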

* fix(types/utils.py): set a tool call id, if missing in delta tool call

Ensures stream chunk builder can reconstruct tool calls correctly

Fixes https://github.com/BerriAI/litellm/issues/11262

* fix(responses/transformation.py): support passing mcp server tool call to anthropic

allows switching between openai and anthropic for mcp tool calling

* fix(ollama/chat/transformation.py): set tool call id's when missing
2025-06-07 20:50:07 -07:00
Krish Dholakia 603bd73a17 Gemini - web search cost tracking + Update max output tokens for nova models
* fix(vertex_and_google_ai_studio_gemini.py): add web search request tracking

Enables cost calculation for google web search

* fix(vertex_and_gemini): use common processing logic across stream / non-stream calls

* fix(vertex_and_google_ai_studio_gemini.py): fix initial choice

* fix: fix linting error

* fix: add initial support for google search cost tracking

* fix(tool_call_cost_tracking.py): working tool cost tracking for gemini

* fix(vertex_ai/gemini/cost_calculator.py): add google web search tool cost tracking for vertex ai

Closes LIT-210

* fix: fix check

* build(model_prices_and_context_window.json): fix amazon nova max output tokens

Closes https://github.com/BerriAI/litellm/issues/11441

* fix: fix ruff check
2025-06-05 23:25:18 -07:00
Ishaan Jaff 99c91fe41f [Feat]: Performance add DD profiler to monitor python profile of LiteLLM CPU% (#11375)
* feat: add DD profile

* fix: test_should_use_dd_profiler

* docs dd profiler

* docs DD profiler
2025-06-03 12:03:08 -07:00
Ishaan Jaff d7f19bbfe3 [Bug]: Performance Fix Max langfuse clients reached: 20 is greater than 20 (#11285)
* fix: initializing langfuse clients

* fix: initializing langfuse clients

* tests: tests for langfuse cache
2025-05-30 22:34:39 -07:00
Vinnie-Singleton-NN 178a614d4a Add sentry sample rate (#10283)
* Add SENTRY_API_SAMPLE_RATE configuration option for Sentry SDK

* removed print line

* Update Sentry documentation with sample rate information
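
A sketch of how such an option might be read from the environment. Only the variable name `SENTRY_API_SAMPLE_RATE` comes from this commit; the helper name, the default of 1.0, and the range check are assumptions.

```python
import os
from typing import Mapping, Optional


def read_sample_rate(
    env: Optional[Mapping[str, str]] = None, default: float = 1.0
) -> float:
    """Hypothetical helper: parse SENTRY_API_SAMPLE_RATE as a float in [0, 1]."""
    source = os.environ if env is None else env
    raw = source.get("SENTRY_API_SAMPLE_RATE")
    if raw is None or raw == "":
        return default
    rate = float(raw)
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"sample rate must be between 0 and 1, got {rate}")
    return rate
```

The resulting value would typically be passed to the Sentry SDK as its `sample_rate` setting when the SDK is initialized.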

---------

Co-authored-by: Vinnie <vinnie@Vinnies-MacBook-Pro.local>
2025-05-28 16:44:10 -07:00
Ishaan Jaff e606bfe31d [Feat - Contributor PR] Add Video support for Bedrock Converse (#11166)
* feat: add video support for bedrock converse api (#11043)

* fixes: bedrock add video support

* fixes: bedrock add video support

---------

Co-authored-by: yytdfc <fuchen@foxmail.com>
2025-05-26 20:17:07 -07:00
Krish Dholakia ef42461c1e Litellm fix GitHub action testing (#11163)
* test: add __init__.py files

* refactor: rename test folder to avoid naming conflict

* test: update workflows

* test: update tests

* test: update imports

* test: update tests

* test: remove unused import

* ci(test-litellm.yml): add pytest retry to github workflow

* test: fix test
2025-05-26 14:41:42 -07:00