Commit Graph

54 Commits

Author SHA1 Message Date
Ishaan Jaffer 01fd4d7cef fix fireworks test 2025-11-26 18:58:32 -08:00
Krish Dholakia 06906534b3 feat(audio_transcriptions/): calculate duration of audio file for cost calculation + feat (image_generations): cost tracking accuracy improved with output_format, quality, size values fixed per openai model
* feat(audio_transcriptions/): calculate duration of audio file for cost calculation

Fixes https://github.com/BerriAI/litellm/issues/11846

Closes https://github.com/BerriAI/litellm/issues/14605

* fix(cost_calculator.py): correctly use base model, when set

Fixes issue where azure base model was being ignored

* feat(cost_calculator.py): fix default cost tracking quality param for image generation

* feat(image_generations/): return output_format, quality, size

aligns response to openai spec and improves cost tracking accuracy

* fix(cost_calculator.py): refactor cost calculation for image generation to use image response instead of hidden params

* build: update build

* fix: fix cost calculation

* build: update poetry lock

* fix: fix ruff checks

* fix: fix aembedding

* fix: fix ruff errors

* fix: modify to catch errors

* fix: test

* fix: loosen test to handle openai lib out of sync

* fix: fix base models

* fix: fix usage object
2025-11-08 16:24:31 -08:00
Krish Dholakia 202eaeb1a2 Revert "(feat) Audio transcription - cost tracking + (feat) image generation …" (#16409)
This reverts commit c96da44265.
2025-11-08 15:38:16 -08:00
Krish Dholakia c96da44265 (feat) Audio transcription - cost tracking + (feat) image generation - accurate cost tracking based on output_format/quality/size
* feat(audio_transcriptions/): calculate duration of audio file for cost calculation

Fixes https://github.com/BerriAI/litellm/issues/11846

Closes https://github.com/BerriAI/litellm/issues/14605

* fix(cost_calculator.py): correctly use base model, when set

Fixes issue where azure base model was being ignored

* feat(cost_calculator.py): fix default cost tracking quality param for image generation

* feat(image_generations/): return output_format, quality, size

aligns response to openai spec and improves cost tracking accuracy

* fix(cost_calculator.py): refactor cost calculation for image generation to use image response instead of hidden params

* build: update build

* fix: fix cost calculation

* build: update poetry lock

* fix: fix ruff checks

* fix: fix aembedding

* fix: fix ruff errors

* fix: modify to catch errors

* fix: test

* fix: loosen test to handle openai lib out of sync
2025-11-08 15:30:46 -08:00
Ishaan Jaffer 214c10f6ef test_completion_cost_databricks_embedding 2025-10-25 11:47:03 -07:00
Ishaan Jaffer 6aa35ec999 test text-embedding-ada-002 2025-09-27 12:41:35 -07:00
Krrish Dholakia 0e747aaaf1 test: fix test 2025-09-16 19:20:12 -07:00
Krrish Dholakia d05f58721e test: remove end of life model from tests 2025-09-09 21:01:45 -07:00
Ishaan Jaff d37be48a80 test: llama-3.3-70b-versatile 2025-09-01 20:14:12 -07:00
Jugal D. Bhatt aea0605eed [LLM Translation] Fix Realtime API endpoint for no intent (#13476)
* fix intent params

* Add responses

* fix unrelated test

* test fix - fireworks API endpoint is down

* test fix fireworks ai is having an active outage

* test_completion_cost_databricks

* dbrx fix test API currently not responding

* Update OpenAI Realtime handler to use the correct endpoint and include all query parameters. Adjusted error messages for missing API base and key. Updated health check URL construction to pass model as a query parameter.

* Enhance OpenAI Realtime handler tests to ensure model parameter inclusion in WebSocket URL. Added new tests to verify correct URL construction with model and additional parameters, preventing 'missing_model' errors. Updated existing tests for consistency.

* Remove debug print statements for API base and key in OpenAIRealtime handler to clean up the code.

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-08-14 16:24:14 -07:00
Ishaan Jaff 461cd0c30a test_completion_cost_deepseek 2025-07-23 13:16:12 -07:00
Krrish Dholakia 4ab0ee0b65 test: more testing fixes 2025-05-01 15:36:13 -07:00
Krish Dholakia 6ad483dde7 Litellm dev 04 30 2025 p1 (#10462)
* fix(exception_mapping_utils.py): correctly pass through 504 status code

openai also raises a 504 status code

* build(model_prices_and_context_window.json): add gpt-4o-mini-tts to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9591

* fix(cost_calculator.py): fix input cost calculation for gpt-4o-mini-tts

Fixes https://github.com/BerriAI/litellm/issues/9591

* test: testing updates
2025-04-30 22:11:12 -07:00
Krish Dholakia d783190e04 Update fireworks ai pricing (#10425)
* build(model_prices_and_context_window.json): add fireworks ai new 0-4b pricing tier

* build(model_prices_and_context_window.json): add more fireworks ai models

* test: update testing

* test: testing updates

* test: update test

* test: update test
2025-04-29 20:58:05 -07:00
Krrish Dholakia 652e1b7f0f test: update test 2025-04-18 20:36:15 -07:00
Krrish Dholakia 3e87ec4f16 test: replace removed fireworks ai models 2025-04-18 14:23:16 -07:00
Ishaan Jaff 65f8015221 test fix - azure deprecated azure ai mistral 2025-04-15 21:08:55 -07:00
Krish Dholakia 0dbd663877 fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)
* fix(cost_calculator.py): handle custom pricing at deployment level for router

* test: add unit tests

* fix(router.py): show custom pricing on UI

check correct model str

* fix: fix linting error

* docs(custom_pricing.md): clarify custom pricing for proxy

Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740

* test: update code qa test

* fix: cleanup traceback

* fix: handle litellm param custom pricing

* test: update test

* fix(cost_calculator.py): add router model id to list of potential model names

* fix(cost_calculator.py): fix router model id check

* fix: router.py - maintain older model registry approach

* fix: fix ruff check

* fix(router.py): router get deployment info

add custom values to mapped dict

* test: update test

* fix(utils.py): update only if value is non-null

* test: add unit test
2025-04-09 22:13:10 -07:00
Ishaan Jaff d8f47fc9e5 databricks/databricks-meta-llama-3-3-70b-instruct 2025-04-07 20:16:24 -07:00
Ishaan Jaff 7262606411 test_completion_cost_databricks 2025-04-05 13:30:17 -07:00
Ishaan Jaff d87bb9bb6e test_completion_cost_databricks 2025-04-05 13:13:25 -07:00
Ishaan Jaff 1638872762 databricks/databricks-meta-llama-3.3-70b-instruct" 2025-04-05 13:12:21 -07:00
Krish Dholakia 722f3ff0e6 fix(cost_calculator.py): allows checking received + sent model name when checking for cost calculation (#9669)
Fixes issue introduced by https://github.com/BerriAI/litellm/commit/dfb838eaff82301d4101d09982fbbb251bbc1ce1#r154667517
2025-03-31 21:29:48 -07:00
Krish Dholakia 4351c77253 Support Gemini audio token cost tracking + fix openai audio input token cost tracking (#9535)
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object

enables accurate cost tracking

* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it

Google has moved away from this for gemini-2.0 models

* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough

* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token

enables vertex ai cost tracking to work with audio tokens

* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set

* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token

more consistent behaviour across providers

* test: add unit test for gemini audio token cost calculation

* ci: bump ci config

* test: fix test
2025-03-26 17:26:25 -07:00
Krrish Dholakia e2ae504a81 test: skip flaky tests 2025-03-11 19:43:04 -07:00
Ishaan Jaff 6e3b21775f test_cost_azure_openai_prompt_caching 2025-03-08 16:19:28 -08:00
Krish Dholakia 5e386c28b2 Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia 251467a525 add bedrock llama vision support + cohere / infinity rerank - 'return_documents' support (#8684)
* build(model_prices_and_context_window.json): mark bedrock llama as supporting vision based on docs

* Add price for Cerebras llama3.3-70b (#8676)

* docs(readme.md): fix contributing docs

point people to new mock directory testing structure s/o @vibhavbhat

* build: update contributing readme

* docs(readme.md): improve docs

* docs(readme.md): cleanup readme on tests/

* docs(README.md): cleanup doc

* feat(infinity/): support returning documents when return_documents=True

* test(test_rerank.py): add e2e testing for cohere rerank

* fix: fix linting errors

* fix(together_ai/): fix together ai transformation

* fix: fix linting error

* fix: fix linting errors

* fix: fix linting errors

* test: mark cohere as flaky

* build: fix model supports check

* test: fix test

* test: mark flaky test

* fix: fix test

* test: fix test

---------

Co-authored-by: Yury Koleda <fut.wrk@gmail.com>
2025-02-20 21:23:54 -08:00
Krish Dholakia b682dc4ec8 Add cost tracking for rerank via bedrock (#8691)
* feat(bedrock/rerank): infer model region if model given as arn

* test: add unit testing to ensure bedrock region name inferred from arn on rerank

* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result

Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137

* test(test_bedrock_completion.py): add testing for bedrock cohere rerank

* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking

* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map

* fix(cost_calculator.py): bedrock/common_utils.py

get base model from model w/ arn -> handles rerank model

* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing

* feat(bedrock/rerank): migrate bedrock config to basererank config

* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"

This reverts commit 84fae1f1679a209a3e9cdcea593ed683fdb96acc.

* test: add testing to ensure large doc / queries are correctly counted

* Revert "test: add testing to ensure large doc / queries are correctly counted"

This reverts commit 4337f1657e13a6d35527a400e3be17c11d4b662b.

* fix(migrate-jina-ai-to-rerank-config): enables cost tracking

* refactor(jina_ai/): finish migrating jina ai to base rerank config

enables cost tracking

* fix(jina_ai/rerank): e2e jina ai rerank cost tracking

* fix: cleanup dead code

* fix: fix python3.8 compatibility error

* test: fix test

* test: add e2e testing for azure ai rerank

* fix: fix linting error

* test: mark cohere as flaky
2025-02-20 21:00:18 -08:00
Krish Dholakia ea985dda0b fix(model_cost_map): fix json parse error on model cost map + add unit test (#8629)
Fixes https://github.com/BerriAI/litellm/pull/8619#issuecomment-2666693045
2025-02-18 11:18:16 -08:00
Krish Dholakia 9c20c69915 Fix bedrock model pricing + add unit test using bedrock pricing api (#7978)
* test(test_completion_cost.py): add unit testing to ensure all bedrock models with region name have cost tracked

* feat: initial script to get bedrock pricing from amazon api

ensures bedrock pricing is accurate

* build(model_prices_and_context_window.json): correct bedrock model prices based on api check

ensures accurate bedrock pricing

* ci(config.yml): add bedrock pricing check to ci/cd

ensures litellm always maintains up-to-date pricing for bedrock models

* ci(config.yml): add beautiful soup to ci/cd

* test: bump groq model

* test: fix test
2025-01-28 17:57:49 -08:00
Krish Dholakia 03eef5a2a0 Fix custom pricing - separate provider info from model info (#7990)
* fix(utils.py): initial commit fixing custom cost tracking

refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly

* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None

some providers support features like vision across all models

* fix(utils.py): refactor to use _supports_factory

* test: update testing

* fix: fix linting errors

* test: fix testing
2025-01-25 21:49:28 -08:00
Krish Dholakia 8ca3229b26 Ensure base_model cost tracking works across all endpoints (#7989)
* test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking

* test(test_completion_cost.py): add sdk test to ensure custom pricing works

* fix(main.py): add base model cost tracking support for embedding calls

Enables base model cost tracking for embedding calls when base model set as a litellm_param

* fix(litellm_logging.py): update logging object with litellm params - including base model, if given

ensures base model param is always tracked

* fix(main.py): fix linting errors
2025-01-24 21:05:26 -08:00
Krish Dholakia c6e9240405 Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Krish Dholakia becd4bc748 Litellm dev 01 11 2025 p3 (#7702)
* fix(__init__.py): fix init to exclude pricing-only model cost values from real model names

prevents bad health checks on wildcard routes

* fix(get_llm_provider.py): fix to handle calling bedrock_converse models
2025-01-11 20:06:54 -08:00
Krish Dholakia 4af23353d6 Allow assigning teams to org on UI + OpenAI omni-moderation cost model tracking (#7566)
* feat(cost_calculator.py): add cost tracking ($0) for openai moderations endpoint

removes sentry cost tracking errors caused by this

* build(teams.tsx): allow assigning teams to orgs
2025-01-08 16:58:21 -08:00
Krish Dholakia 4e69711411 Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Krish Dholakia c3edfc2c92 LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
Krish Dholakia 179d2f56b7 LiteLLM Minor Fixes & Improvements (12/16/2024) - p1 (#7263)
* fix(factory.py): skip empty text blocks for bedrock user messages

Fixes https://github.com/BerriAI/litellm/issues/7169

* Add support for Gemini 2.0 GoogleSearch tool (#7257)

* Add support for google_search tool in gemini 2.0

* Add/modify tests

* Fix grounding check

* Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST

* Swap order of tools

* DFix formatting

* fix(get_api_base.py): return api base in streaming response

Fixes https://github.com/BerriAI/litellm/issues/7249

Closes https://github.com/BerriAI/litellm/pull/7250

* fix(cost_calculator.py): only set base model to model if not none

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix(cost_calculator.py): enforce stricter order when picking model for cost calculation

* fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided

* fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given

* fix(cost_calculator.py): handle `custom_llm_provider-` scenario

* fix(cost_calculator.py): e2e working tts cost tracking

ensures initial message is passed in, to cost calculator

* fix(factory.py): suppress linting errors

* fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model

* fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly

* test: handle none env var value in flaky test

* fix(litellm_logging.py): fix linting errors

---------

Co-authored-by: Sam B <samlingx@gmail.com>
2024-12-17 15:33:36 -08:00
Krish Dholakia 224ead1531 fix(utils.py): fix openai-like api response format parsing (#7273)
* fix(utils.py): fix openai-like api response format parsing

Fixes issue passing structured output to litellm_proxy/ route

* fix(cost_calculator.py): fix whisper transcription cost calc to use file duration, not response time

'

* test: skip test if credentials not found
2024-12-17 12:49:09 -08:00
Krish Dholakia ec36353b41 fix(main.py): fix retries being multiplied when using openai sdk (#7221)
* fix(main.py): fix retries being multiplied when using openai sdk

Closes https://github.com/BerriAI/litellm/pull/7130

* docs(prompt_management.md): add langfuse prompt management doc

* feat(team_endpoints.py): allow teams to add their own models

Enables teams to call their own finetuned models via the proxy

* test: add better enforcement check testing for `/model/new` now that teams can add their own models

* docs(team_model_add.md): tutorial for allowing teams to add their own models

* test: fix test
2024-12-14 11:56:55 -08:00
Krish Dholakia 350cfc36f7 Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Krish Dholakia b11bc0374e Litellm dev 11 20 2024 (#6838)
* feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint

Closes https://github.com/BerriAI/litellm/issues/5651

* docs: add missing params to swagger + api documentation test

* docs: add documentation for all key endpoints

documents all params on swagger

* docs(internal_user_endpoints.py): document all /user/new params

Ensures all params are documented

* docs(team_endpoints.py): add missing documentation for team endpoints

Ensures 100% param documentation on swagger

* docs(organization_endpoints.py): document all org params

Adds documentation for all params in org endpoint

* docs(customer_endpoints.py): add coverage for all params on /customer endpoints

ensures all /customer/* params are documented

* ci(config.yml): add endpoint doc testing to ci/cd

* fix: fix internal_user_endpoints.py

* fix(internal_user_endpoints.py): support 'duration' param

* fix(partner_models/main.py): fix anthropic re-raise exception on vertex

* fix: fix pydantic obj

* build(model_prices_and_context_window.json): add new vertex claude model names

vertex claude changed model names - causes cost tracking errors
2024-11-21 05:20:37 +05:30
Krish Dholakia cb2563e3c0 Litellm dev 10 22 2024 (#6384)
* fix(utils.py): add 'disallowed_special' for token counting on .encode()

Fixes error when '<
endoftext
>' in string

* Revert "(fix) standard logging metadata + add unit testing  (#6366)" (#6381)

This reverts commit 8359cb6fa9.

* add new 35 mode lcard (#6378)

* Add claude 3 5 sonnet 20241022 models for all provides (#6380)

* Add Claude 3.5 v2 on Amazon Bedrock and Vertex AI.

* added anthropic/claude-3-5-sonnet-20241022

* add new 35 mode lcard

---------

Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: lowjiansheng <15527690+lowjiansheng@users.noreply.github.com>

* test(skip-flaky-google-context-caching-test): google is not reliable. their sample code is also not working

* Fix metadata being overwritten in speech() (#6295)

* fix: adding missing redis cluster kwargs (#6318)

Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>

* Add support for `max_completion_tokens` in Azure OpenAI (#6376)

Now that Azure supports `max_completion_tokens`, no need for special handling for this param and let it pass thru. More details: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#api-support

* build(model_prices_and_context_window.json): add voyage-finance-2 pricing

Closes https://github.com/BerriAI/litellm/issues/6371

* build(model_prices_and_context_window.json): fix llama3.1 pricing model name on map

Closes https://github.com/BerriAI/litellm/issues/6310

* feat(realtime_streaming.py): just log specific events

Closes https://github.com/BerriAI/litellm/issues/6267

* fix(utils.py): more robust checking if unmapped vertex anthropic model belongs to that family of models

Fixes https://github.com/BerriAI/litellm/issues/6383

* Fix Ollama stream handling for tool calls with None content (#6155)

* test(test_max_completions): update test now that azure supports 'max_completion_tokens'

* fix(handler.py): fix linting error

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paul Gauthier <paul@paulg.com>
Co-authored-by: John HU <hszqqq12@gmail.com>
Co-authored-by: Ali Arian <113945203+ali-arian@users.noreply.github.com>
Co-authored-by: Ali Arian <ali.arian@breadfinancial.com>
Co-authored-by: Anand Taralika <46954145+taralika@users.noreply.github.com>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
2024-10-22 21:18:54 -07:00
Krish Dholakia 7cc12bd5c6 LiteLLM Minor Fixes & Improvements (10/18/2024) (#6320)
* fix(converse_transformation.py): handle cross region model name when getting openai param support

Fixes https://github.com/BerriAI/litellm/issues/6291

* LiteLLM Minor Fixes & Improvements (10/17/2024)  (#6293)

* fix(ui_sso.py): fix faulty admin only check

Fixes https://github.com/BerriAI/litellm/issues/6286

* refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing

Prevent future regressions

* feat(prompt_factory): support 'ensure_alternating_roles' param

Closes https://github.com/BerriAI/litellm/issues/6257

* fix(proxy/utils.py): add dailytagspend to expected views

* feat(auth_utils.py): support setting regex for clientside auth credentials

Fixes https://github.com/BerriAI/litellm/issues/6203

* build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing

* feat(argilla.py): add argilla logging integration

Closes https://github.com/BerriAI/litellm/issues/6201

* fix: fix linting errors

* fix: fix ruff error

* test: fix test

* fix: update vertex ai assumption - parts not always guaranteed (#6296)

* docs(configs.md): add argila env var to docs

* docs(user_keys.md): add regex doc for clientside auth params

* docs(argilla.md): add doc on argilla logging

* docs(argilla.md): add sampling rate to argilla calls

* bump: version 1.49.6 → 1.49.7

* add gpt-4o-audio models to model cost map (#6306)

* (code quality) add ruff check PLR0915 for `too-many-statements`  (#6309)

* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* doc fix Turn on / off caching per Key. (#6297)

* (feat) Support `audio`,  `modalities` params (#6304)

* add audio, modalities param

* add test for gpt audio models

* add get_supported_openai_params for GPT audio models

* add supported params for audio

* test_audio_output_from_model

* bump openai to openai==1.52.0

* bump openai on pyproject

* fix audio test

* fix test mock_chat_response

* handle audio for Message

* fix handling audio for OAI compatible API endpoints

* fix linting

* fix mock dbrx test

* (feat) Support audio param in responses streaming (#6312)

* add audio, modalities param

* add test for gpt audio models

* add get_supported_openai_params for GPT audio models

* add supported params for audio

* test_audio_output_from_model

* bump openai to openai==1.52.0

* bump openai on pyproject

* fix audio test

* fix test mock_chat_response

* handle audio for Message

* fix handling audio for OAI compatible API endpoints

* fix linting

* fix mock dbrx test

* add audio to Delta

* handle model_response.choices.delta.audio

* fix linting

* build(model_prices_and_context_window.json): add gpt-4o-audio audio token cost tracking

* refactor(model_prices_and_context_window.json): refactor 'supports_audio' to be 'supports_audio_input' and 'supports_audio_output'

Allows for flag to be used for openai + gemini models (both support audio input)

* feat(cost_calculation.py): support cost calc for audio model

Closes https://github.com/BerriAI/litellm/issues/6302

* feat(utils.py): expose new `supports_audio_input` and `supports_audio_output` functions

Closes https://github.com/BerriAI/litellm/issues/6303

* feat(handle_jwt.py): support single dict list

* fix(cost_calculator.py): fix linting errors

* fix: fix linting error

* fix(cost_calculator): move to using standard openai usage cached tokens value

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-19 22:23:27 -07:00
Ishaan Jaff 1994100028 (fix) prompt caching cost calculation OpenAI, Azure OpenAI (#6231)
* fix prompt caching cost calculation

* fix testing for prompt cache cost calc
2024-10-15 18:55:31 +05:30
Krish Dholakia 6005450c8f LiteLLM Minor Fixes & Improvements (10/09/2024) (#6139)
* fix(utils.py): don't return 'none' response headers

Fixes https://github.com/BerriAI/litellm/issues/6123

* fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls

Fixes https://github.com/BerriAI/litellm/issues/6136

* fix(cost_calculator.py): set default character value to none

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196

* fix(google.py): fix cost per token / cost per char conversion

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287

* build(model_prices_and_context_window.json): update gemini pricing

Fixes https://github.com/BerriAI/litellm/issues/6133

* build(model_prices_and_context_window.json): update gemini pricing

* fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled

Stores unredacted response in cache

* build(model_prices_and_context_window.json): update gemini-1.5-flash pricing

* fix(cost_calculator.py): fix default prompt_character count logic

Fixes error in gemini cost calculation

* fix(cost_calculator.py): fix cost calc for tts models
2024-10-10 00:42:11 -07:00
Krish Dholakia fac3b2ee42 Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors

* fix: fix additional type-checking errors

* fix: additional type-checking error fixes

* fix: fix additional type-checking errors

* fix: additional type-check fixes

* fix: fix all type-checking errors + add pyright to ci/cd

* fix: fix incorrect import

* ci(config.yml): use mypy on ci/cd

* fix: fix type-checking errors in utils.py

* fix: fix all type-checking errors on main.py

* fix: fix mypy linting errors

* fix(anthropic/cost_calculator.py): fix linting errors

* fix: fix mypy linting errors

* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff ab0b536143 (feat) add azure openai cost tracking for prompt caching (#6077)
* add azure o1 models to model cost map

* add azure o1 cost tracking

* fix azure cost calc

* add get llm provider test
2024-10-05 15:04:18 +05:30
Ishaan Jaff 3682f661d8 (feat) add cost tracking for OpenAI prompt caching (#6055)
* add cache_read_input_token_cost for prompt caching models

* add prompt caching for latest models

* add openai cost calculator

* add openai prompt caching test

* fix lint check

* add not on how usage._cache_read_input_tokens is used

* fix cost calc whisper openai

* use output_cost_per_second

* add input_cost_per_second
2024-10-05 14:20:15 +05:30