717 Commits

Author SHA1 Message Date
dcieslak19973 d480cea8b0 Add azure_ai cohere rerank v3.5 (#12283)
* Add azure_ai cohere rerank v3.5

* Fix CI error
2025-07-03 10:01:45 -07:00
codeugar 06c86d6130 Update model_prices_and_context_window.json (#11972)
add
--
"deepseek/deepseek-r1": {
        "max_tokens": 8192,
        "max_input_tokens": 65536,
        "max_output_tokens": 8192,
        "input_cost_per_token": 5.5e-07,
        "input_cost_per_token_cache_hit": 1.4e-07,
        "output_cost_per_token": 2.19e-06,
        "litellm_provider": "deepseek",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_assistant_prefill": true,
        "supports_tool_choice": true,
        "supports_reasoning": true,
        "supports_prompt_caching": true
    },
    "deepseek/deepseek-v3": {
        "max_tokens": 8192,
        "max_input_tokens": 65536,
        "max_output_tokens": 8192,
        "input_cost_per_token": 2.7e-07,
        "input_cost_per_token_cache_hit": 7e-08,
        "cache_read_input_token_cost": 7e-08,
        "cache_creation_input_token_cost": 0.0,
        "output_cost_per_token": 1.1e-06,
        "litellm_provider": "deepseek",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_assistant_prefill": true,
        "supports_tool_choice": true,
        "supports_prompt_caching": true
    },
--
Tencent's custom DeepSeek deployments are named "deepseek-r1" and "deepseek-v3".
Thanks very much!
2025-06-27 21:36:59 -07:00
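The per-token fields in the `deepseek/deepseek-r1` entry quoted above can be turned into a request cost with simple arithmetic. The following is a minimal sketch, not LiteLLM's own calculator: the helper name `estimate_cost` is hypothetical, and billing cached prompt tokens at `input_cost_per_token_cache_hit` is an assumption about how that field is meant to be used.

```python
# Rates copied from the "deepseek/deepseek-r1" entry in the commit message.
DEEPSEEK_R1 = {
    "input_cost_per_token": 5.5e-07,
    "input_cost_per_token_cache_hit": 1.4e-07,
    "output_cost_per_token": 2.19e-06,
}


def estimate_cost(entry, prompt_tokens, completion_tokens, cached_tokens=0):
    """Hypothetical helper: estimate the USD cost of one request.

    Assumption: cached prompt tokens are billed at the cache-hit rate,
    and all remaining prompt tokens at the regular input rate.
    """
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    # Fall back to the regular input rate when no cache-hit rate is listed.
    cache_rate = entry.get(
        "input_cost_per_token_cache_hit", entry["input_cost_per_token"]
    )
    return (
        uncached_tokens * entry["input_cost_per_token"]
        + cached_tokens * cache_rate
        + completion_tokens * entry["output_cost_per_token"]
    )


if __name__ == "__main__":
    print(estimate_cost(DEEPSEEK_R1, prompt_tokens=1000, completion_tokens=500))
```

With 1,000 prompt tokens and 500 completion tokens the estimate is 1000 × 5.5e-07 + 500 × 2.19e-06, or about 0.001645 USD.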
Cole McIntosh 0b95fb63cc Add Azure OpenAI assistant features cost tracking (#12045)
* Add Azure OpenAI assistant features cost tracking

Implements cost tracking for Azure's new assistant features:
- File Search: $0.1 USD per 1 GB/Day (storage-based pricing)
- Code Interpreter: $0.03 USD per session
- Computer Use: $0.003 input + $0.012 output per 1K tokens

Features:
- Provider-specific pricing (Azure vs OpenAI)
- Model-specific pricing overrides via JSON config
- Environment variable configuration
- Backwards compatible with existing OpenAI pricing

* Add comprehensive tests for Azure assistant features cost tracking

- Unit tests for file search, code interpreter, computer use, vector store
- Integration tests for combined cost calculation
- Provider-specific pricing tests (Azure vs OpenAI)
- Model-specific pricing override tests
- Edge case handling (None inputs, zero values)
- All 17 tests passing

* Fix test and ensure all Azure assistant cost tracking tests pass

- Fixed integration test approach
- All 17 tests now passing
- Comprehensive coverage of Azure assistant features cost tracking

* Enhance cost tracking for Azure assistant features

- Safely convert and extract parameters for file search, computer use, and code interpreter sessions.
- Ensure model_info is consistently converted to a dictionary format.
- Improve error handling for input values to prevent type-related issues.
- Maintain compatibility with existing cost calculation methods.

* Refactor cost tracking for Azure assistant features

- Introduced separate methods for handling costs related to web search, file search, vector store, computer use, and code interpreter.
- Enhanced parameter extraction and conversion for file search and computer use.
- Improved error handling and type safety throughout the cost calculation process.
- Maintained compatibility with existing cost calculation methods while streamlining the overall structure.
2025-06-27 21:33:00 -07:00
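The three Azure prices listed in the commit above combine as plain arithmetic. This is a rough sketch using only the rates stated in the message; the function name and parameters are hypothetical and do not mirror the actual LiteLLM implementation.

```python
# Rates taken from the commit message above (USD).
FILE_SEARCH_PER_GB_DAY = 0.1
CODE_INTERPRETER_PER_SESSION = 0.03
COMPUTER_USE_INPUT_PER_1K = 0.003
COMPUTER_USE_OUTPUT_PER_1K = 0.012


def azure_assistant_feature_cost(
    storage_gb=0.0,
    days=0.0,
    code_interpreter_sessions=0,
    computer_use_input_tokens=0,
    computer_use_output_tokens=0,
):
    """Hypothetical helper: break down the cost of Azure assistant features."""
    file_search = FILE_SEARCH_PER_GB_DAY * storage_gb * days
    code_interpreter = CODE_INTERPRETER_PER_SESSION * code_interpreter_sessions
    computer_use = (
        COMPUTER_USE_INPUT_PER_1K * computer_use_input_tokens / 1000
        + COMPUTER_USE_OUTPUT_PER_1K * computer_use_output_tokens / 1000
    )
    return {
        "file_search": file_search,
        "code_interpreter": code_interpreter,
        "computer_use": computer_use,
        "total": file_search + code_interpreter + computer_use,
    }


if __name__ == "__main__":
    print(azure_assistant_feature_cost(2, 3, 4, 10_000, 5_000))
```

For 2 GB stored over 3 days, 4 code interpreter sessions, and 10,000 input plus 5,000 output computer-use tokens, the parts come to roughly 0.60, 0.12, and 0.09 USD.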
Ishaan Jaff ebf6395bc1 [Feat] Add Eleven Labs - Speech To Text Support on LiteLLM (#12119)
* add ELEVENLABS as a provider

* add deepgram to main.py

* add ElevenLabsException

* add ElevenLabsAudioTranscriptionConfig

* add transform_audio_transcription_response

* TestElevenLabsAudioTranscription

* add elevenlabs/scribe_v1 to model cost map

* add ElevenLabsAudioTranscriptionConfig

* add AudioTranscriptionRequestData

* add ElevenLabs transform

* use AudioTranscriptionRequestData

* refactoring fixes

* add ProcessedAudioFile util for reading audio files

* test_elevenlabs_diarize_parameter_passthrough

* docs eleven labs

* docs fixes

* fix code qa checks

* fixes - audio transcription

* ui - add ElevenLabs logo

* add elevenlabs logo

* docs - ElevenLabs

* test fix elevenlabs
2025-06-27 17:50:49 -07:00
Krish Dholakia 8bd1f8f6ab Add o3 and o4-mini deep research models (#12109)
* build(model_prices_and_context_window.json): add o3-deep-research models

* build(model_prices_and_context_window.json): add o4-deep-research model

* build(model_prices_and_context_window.json): add o4-mini-deep-research versioned model
2025-06-27 09:58:53 -07:00
Krrish Dholakia 0e96f412a1 build(model_prices_and_context_window.json): mark azure o3-pro as responses api model
Fixes https://github.com/BerriAI/litellm/issues/12059
2025-06-26 10:41:03 -07:00
Krish Dholakia 1a4ad8bf18 Update mistral 'supports_response_schema' field + Fix ollama embedding (#12024)
* build(model_prices_and_context_window.json): update all mistral models (besides codestral-mamba) to indicate support for response schema

Closes https://github.com/BerriAI/litellm/issues/12012

* fix(route_llm_request.py): if llm router is not initialized, go straight through to litellm sdk

Fixes https://github.com/BerriAI/litellm/issues/12008

* test: add unit test

* fix(ollama_embeddings): fix unnecessary await

Fixes https://github.com/BerriAI/litellm/issues/11997

* test: update ollama embedding tests
2025-06-25 07:20:13 -07:00
Marty Sullivan a5ce1cd49b add azure o3-pro pricing (#11990) 2025-06-24 10:57:24 -07:00
Cole McIntosh eacb4dfdef Add Mistral 3.2 24B to model mapping (#11926)
* feat(model_prices_and_context_window.json): add mistral-small-3.2-24b-instruct model with token costs and chat mode support

* fix(model_prices_and_context_window.json): update model paths to include 'openrouter' prefix for mistral-small-3.1 and 3.2
2025-06-23 14:54:39 -07:00
Cole McIntosh 02a095d4db feat: implement Perplexity citation tokens and search queries cost calculation (#11938)
* feat: add citation_cost_per_token and search_queries_cost_per_1000 fields to ModelInfoBase

- Add citation_cost_per_token field to ModelInfoBase for Perplexity citation token costs
- Add search_queries_cost_per_1000 field to ModelInfoBase for Perplexity search query costs
- Update _get_model_info_helper to include these fields in model info responses
- Enables proper cost calculation for Perplexity-specific usage metrics

* feat: update Perplexity sonar-deep-research model pricing configuration

- Update input/output token costs to / per million tokens respectively
- Add reasoning token cost at  per million tokens
- Add citation_cost_per_token at  per million tokens (same as input)
- Add search_queries_cost_per_1000 at $0.005 per 1000 search queries
- Remove deprecated search_context_cost_per_query structure
- Aligns with Perplexity's updated pricing model for deep research capabilities

* feat: implement Perplexity-specific cost calculator

- Create cost_per_token function for Perplexity provider
- Calculate standard input/output token costs
- Add citation token cost calculation using citation_cost_per_token rate
- Add reasoning token cost calculation with fallback to completion_tokens_details
- Add search query cost calculation using search_queries_cost_per_1000 rate
- Return separate prompt_cost and completion_cost for accurate billing
- Handles all Perplexity-specific usage metrics: citation_tokens, num_search_queries, reasoning_tokens
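
The calculation described in this list can be sketched as follows. Only `citation_cost_per_token`, `search_queries_cost_per_1000`, `citation_tokens`, `num_search_queries`, and `reasoning_tokens` come from the commit message; the function name, the `reasoning_cost_per_token` key, the plain-dict inputs, and the choice to count citation and search costs on the prompt side are assumptions made for illustration. The rates in the usage example are placeholders, not Perplexity's prices.

```python
def perplexity_cost_sketch(rates, usage):
    """Hypothetical sketch: return (prompt_cost, completion_cost) in USD.

    Assumptions: citation tokens and search queries are counted as prompt-side
    costs, reasoning tokens as a completion-side cost billed on top of the
    regular completion tokens.
    """
    prompt_cost = usage.get("prompt_tokens", 0) * rates.get("input_cost_per_token", 0.0)
    prompt_cost += usage.get("citation_tokens", 0) * rates.get("citation_cost_per_token", 0.0)
    # The search rate is quoted per 1000 queries, so scale the query count down.
    prompt_cost += (
        usage.get("num_search_queries", 0) / 1000
        * rates.get("search_queries_cost_per_1000", 0.0)
    )

    completion_cost = usage.get("completion_tokens", 0) * rates.get("output_cost_per_token", 0.0)
    completion_cost += usage.get("reasoning_tokens", 0) * rates.get("reasoning_cost_per_token", 0.0)
    return prompt_cost, completion_cost


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    example_rates = {
        "input_cost_per_token": 1e-06,
        "output_cost_per_token": 2e-06,
        "citation_cost_per_token": 1e-06,
        "reasoning_cost_per_token": 3e-06,
        "search_queries_cost_per_1000": 5.0,
    }
    example_usage = {
        "prompt_tokens": 100,
        "completion_tokens": 50,
        "citation_tokens": 20,
        "reasoning_tokens": 10,
        "num_search_queries": 2,
    }
    print(perplexity_cost_sketch(example_rates, example_usage))
```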

* feat: integrate Perplexity cost calculator with main cost calculation system

- Import perplexity_cost_per_token function in main cost calculator
- Add perplexity provider case to cost_per_token function
- Enables automatic routing of Perplexity cost calculations to provider-specific logic
- Maintains compatibility with existing cost calculation patterns
- Supports all Perplexity-specific cost metrics through unified interface

* feat: enhance Perplexity response transformation to extract cost-related fields

- Override transform_response method to extract Perplexity-specific usage fields
- Add _enhance_usage_with_perplexity_fields method to process API responses
- Extract citation_tokens from citations array using character-based estimation (~4 chars/token)
- Extract num_search_queries from both usage field and root level with priority handling
- Create usage object when none exists to ensure cost fields are always captured
- Handle empty citations and missing fields gracefully
- Enables automatic extraction of cost metrics from Perplexity API responses
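
Two of the steps in this list lend themselves to a short sketch: the character-based token estimate and the lookup of `num_search_queries` in two places. Both function names are hypothetical, the response is modelled as a plain dict, rounding down is an assumption, and so is the choice to let the `usage` field win over the root-level value (the commit only says there is "priority handling").

```python
def estimate_citation_tokens(citations, chars_per_token=4):
    """Estimate tokens from citation strings at roughly 4 characters per token.

    Assumption: non-string entries are ignored and the result is rounded down.
    """
    total_chars = sum(len(c) for c in citations or [] if isinstance(c, str))
    return total_chars // chars_per_token


def extract_num_search_queries(response):
    """Read num_search_queries from usage first, then from the root level.

    Assumption: the usage value takes priority when both are present.
    """
    usage = response.get("usage") or {}
    value = usage.get("num_search_queries")
    if value is None:
        value = response.get("num_search_queries")
    return value


if __name__ == "__main__":
    print(estimate_citation_tokens(["https://example.com/a", "https://example.com/b"]))
    print(extract_num_search_queries({"num_search_queries": 7}))
```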

* test: add comprehensive test suite for Perplexity cost calculation features

Add 82 comprehensive tests across 3 test files:

- test_perplexity_cost_calculator.py (59 tests):
  * Cost calculation with citation tokens, search queries, reasoning tokens
  * Various combinations and edge cases
  * Integration with main cost calculator
  * Model info access and validation
  * Zero values and missing fields handling

- test_perplexity_chat_transformation.py (12 tests):
  * Citation token extraction from API responses
  * Search query extraction from usage and root fields
  * Priority handling and field aggregation
  * Empty citations and missing fields handling
  * Token estimation accuracy validation

- test_perplexity_integration.py (11 tests):
  * End-to-end cost calculation workflows
  * High-volume and edge case scenarios
  * Model info integration validation
  * Case-insensitive provider matching
  * Transformation preservation of existing fields

Ensures reliability and correctness of all Perplexity cost features with comprehensive coverage of happy path, edge cases, and error conditions.

* fix: remove unused Union import from Perplexity transformation

- Remove unused typing.Union import from litellm/llms/perplexity/chat/transformation.py
- Fixes F401 linting error: 'typing.Union imported but unused'
- Maintains only necessary imports: Any, List, Optional, Tuple

* Fix JSON schema validation and use web_search_requests field

- Add citation_cost_per_token and search_queries_cost_per_1000 to JSON schema
- Update Perplexity transformation to use web_search_requests in PromptTokensDetailsWrapper
- Update Perplexity cost calculator to read from web_search_requests field
- Maintain backward compatibility while using standard LiteLLM fields

* Fix type errors in Perplexity cost calculator

- Add null checks for token counts and cost values to prevent None multiplication errors
- Use .get() with fallback values instead of direct dictionary access
- Ensure all arithmetic operations handle None values safely

This fixes the failing job 44517525148 type errors.

* Refactor Perplexity cost calculation tests to improve accuracy and consistency

- Replace absolute difference assertions with math.isclose for better precision in cost comparisons
- Update tests to utilize PromptTokensDetailsWrapper for handling web search requests
- Ensure all test cases correctly reflect the new structure of usage fields, enhancing clarity and maintainability
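
The switch to `math.isclose` matters because per-token costs are very small. The snippet below is an illustration only; the `1e-6` absolute tolerance is an assumed example, not the threshold the original tests used.

```python
import math

# Floating-point rounding: the sum is 0.30000000000000004, not exactly 0.3.
computed = 0.1 + 0.2
exact_match = computed == 0.3              # False
close_match = math.isclose(computed, 0.3)  # True (default rel_tol=1e-09)

# Per-token costs are tiny, so a fixed absolute tolerance can hide real errors.
expected_cost = 2.0e-07
wrong_cost = 1.0e-07  # off by a factor of two

# An absolute-difference check with a loose tolerance lets the error through.
passes_abs_check = abs(wrong_cost - expected_cost) < 1e-6  # True

# math.isclose compares relative to the magnitude of the values and rejects it.
passes_isclose = math.isclose(wrong_cost, expected_cost)   # False

if __name__ == "__main__":
    print(exact_match, close_match, passes_abs_check, passes_isclose)
```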

* fix: address type hinting issues in PerplexityChatConfig usage handling

- Add type ignore comments to model_response.usage assignments to resolve type checking errors
- Ensures compatibility with type definitions while maintaining existing functionality

* Update model pricing configuration in JSON backup

- Add citation_cost_per_token and search_queries_cost_per_1000 fields to enhance cost tracking
- Remove deprecated search_context_cost_per_query structure to streamline pricing model
- Aligns with recent updates in Perplexity's pricing strategy

* Update search queries cost structure in model_prices_and_context_window.json to use search_context_cost_per_query

* Refactor search queries cost structure in model_prices_and_context_window_backup.json and update related code to use search_queries_cost_per_query. Remove deprecated search_queries_cost_per_1000 references across model info and tests.

* Enhance cost calculation in cost_calculator.py by introducing a safe float casting function to handle potential None and invalid values. Update cost calculations for input, citation, output, reasoning, and search query tokens to use this new function, ensuring more robust handling of model pricing data.

* Refactor cost calculation in cost_calculator.py to support both legacy and current search cost keys. Enhance handling of search cost values by accommodating both dictionary and float formats, ensuring robust cost computation for search queries.
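
The safe float casting and the dual-key lookup described in the last two items could look roughly like this. It is a sketch under several assumptions: the helper names are hypothetical, the lookup order between `search_context_cost_per_query` and `search_queries_cost_per_query` is a guess, and the dictionary sub-key `search_context_size_low` is assumed rather than taken from the commit.

```python
def safe_float(value, default=0.0):
    """Cast to float, returning a default for None or invalid values."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def search_cost_per_query(model_info, context_size="low"):
    """Hypothetical lookup of the per-query search cost.

    Tries both key names and accepts either a plain number or a dict keyed
    by context size. Key order and dict sub-key names are assumptions.
    """
    for key in ("search_context_cost_per_query", "search_queries_cost_per_query"):
        value = model_info.get(key)
        if isinstance(value, dict):
            value = value.get(f"search_context_size_{context_size}")
        cost = safe_float(value, default=None)
        if cost is not None:
            return cost
    return 0.0


if __name__ == "__main__":
    print(search_cost_per_query({"search_context_cost_per_query": 0.005}))
    print(search_cost_per_query({"search_queries_cost_per_query": "0.01"}))
```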

* Update test cases to reflect changes in cost structure, renaming search_queries_cost_per_query to search_context_cost_per_query for consistency with recent refactor. Ensure assertions in tests align with updated cost keys.

* Update test_perplexity_integration.py to rename search_queries_cost_per_query to search_context_cost_per_query, ensuring consistency with recent cost structure changes. Adjust assertions to align with updated cost keys.
2025-06-23 14:15:25 -07:00
Erv Walter aaa41d1e24 Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens) (#11937)
* Initial plan for issue

* Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens)

Co-authored-by: ervwalter <768790+ervwalter@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ervwalter <768790+ervwalter@users.noreply.github.com>
2025-06-20 23:17:46 -07:00
Ishaan Jaff 99d851544a [Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API (#11934)
* fix get_complete_url

* fixes _is_azure_v1_api_version

* test_azure_responses_api_preview_api_version

* TestAzureResponsesAPIConfig

* add azure/codex-mini

* fix azure/codex-mini

* Update litellm/llms/azure/responses/transformation.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix linting

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-20 18:08:44 -07:00
Ishaan Jaff 19f13c842f add gemini-2.5-pro (#11927) 2025-06-20 11:29:06 -07:00
Krish Dholakia 40cc61c8f3 build(model_prices_and_context_window.json): mark all gemini-2.5 mode… (#11907)
* build(model_prices_and_context_window.json): mark all gemini-2.5 models as supporting pdf input

Closes https://github.com/BerriAI/litellm/issues/11881

* fix(anthropic_transformation.py): set custom llm provider custom property

Fixes https://github.com/BerriAI/litellm/issues/11861

* test: add unit test for checking supports_reasoning

* test: add test for vertex ai flow

* feat(bedrock/anthropic): ensure thinking param correctly passed for bedrock/invoke
2025-06-19 21:07:25 -07:00
lgruen-vcgs e1c77e70c2 Add AWS Bedrock profiles for the APAC region (#11883)
Likely fixes #6905, #9228, and addresses https://github.com/BerriAI/litellm/issues/11057#issuecomment-2903257768.
2025-06-19 20:10:04 -07:00
Ishaan Jaff 0fe8bf2fc2 fix gemini-2.5-flash-lite-preview-06-17 2025-06-19 17:07:34 -07:00
fatih akyon 5b6ba871a5 [Bug Fix] add missing flash-2.5-flash-lite for gemini provider, fix gemini-2.5-flash pricing (#11901) 2025-06-19 16:38:17 -07:00
Krrish Dholakia 649636b26b build(model_prices_and_context_window.json): ensure tpm/rpm limits are int
Closes https://github.com/BerriAI/litellm/issues/11882
2025-06-19 14:58:16 -07:00
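A check like the one this commit implies can be sketched in a few lines. The function name is hypothetical, and the `tpm` and `rpm` key names are assumed from the commit title rather than quoted from the JSON file.

```python
def non_int_rate_limits(cost_map):
    """Return (model, key, value) for every tpm/rpm value that is not an int."""
    problems = []
    for model, entry in cost_map.items():
        for key in ("tpm", "rpm"):
            value = entry.get(key)
            if value is None:
                continue
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append((model, key, value))
    return problems


if __name__ == "__main__":
    sample = {
        "example/model-a": {"tpm": 1000000.0, "rpm": 10},
        "example/model-b": {"tpm": 250000, "rpm": 5},
    }
    print(non_int_rate_limits(sample))
```

A value such as `1000000.0` parses as a float from JSON, which is the kind of entry this check would flag.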
Ishaan Jaff e1764af890 fix meta_llama/Llama-3.3-8B-Instruct 2025-06-19 13:44:05 -07:00
Krrish Dholakia b080220d02 build: fix gemini-2.5-pro rate limits 2025-06-18 22:56:56 -07:00
Krrish Dholakia dfafa986ea build(model_prices_and_context_window.json): add gemini google ai studio rate limits 2025-06-18 22:55:54 -07:00
Low Jian Sheng ca6fa63362 Fix gemini 2.5 flash config (#11830)
* fix gemini 2.5 flash config

* add gemini 2.5 flash
2025-06-18 20:16:48 -07:00
salah alzubi d7e53edc26 Update model_prices_and_context_window.json (#11803)
-- Updated pricing for Gemini Flash
-- Updated a few Openrouter models
-- Updated pricing for Gemini Flash Lite
2025-06-17 17:12:08 -07:00
Emerson Gomes b21f4a3f74 Add Vertex Imagen-4 models (#11767) 2025-06-16 10:08:51 -07:00
Krish Dholakia 0908618a19 Litellm stable release 06 14 2025 (#11737)
* docs: initial commit with stable release changelog notes

* docs: style updates

* docs(index.md): updated changelog

* docs(index.md): cleanup

* docs(index.md): add general proxy improvements

* docs: index.md

cleanup
2025-06-14 16:56:29 -07:00
nevin b7cb66ee8f Fixed grok-3-mini to not use stop tokens (#11563)
* fixed grok-3-mini to not use stop tokens

* added xai config test
2025-06-14 14:26:43 -07:00
Cole McIntosh 6b9754e2aa Merge pull request #11642 from colesmcintosh/mistral-reasoning
Enhance Mistral model support with reasoning capabilities
2025-06-12 16:42:53 -06:00
Ishaan Jaff 27cc503185 add gpt-4o-mini-transcribe (#11676) 2025-06-12 15:30:25 -07:00
Cole McIntosh 12a61fce4a [Feat] Enhance Mistral model support with reasoning capabilities
* Added support for reasoning parameters in magistral models, including "reasoning_effort" and "thinking".
* Updated the MistralConfig class to handle reasoning system prompts.
* Implemented tests to verify reasoning functionality and ensure correct parameter mapping for magistral models.
* Enhanced the model prices JSON to reflect new reasoning capabilities.
2025-06-11 17:13:06 -06:00
Ishaan Jaff 52ef96261f [UI] Add Deepgram provider to supported providers list and mappings (#11634)
* Add Deepgram provider to supported providers list and mappings

* add logo

* Add deepgram to model cost map

* ui - require api key for deepgram

* fix logo path

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-11 12:12:12 -07:00
Krrish Dholakia e4ac1cdef2 build(model_prices_and_context_window.json): fix o3-pro mode to 'responses' 2025-06-11 09:08:58 -07:00
Krish Dholakia 56f481a47e Add new o3 models pricing (#11606)
* build(model_prices_and_context_window.json): add o3-pro pricing

* build(model_prices_and_context_window.json): add updated o3 model pricing

* build(model_prices_and_context_window.json): add new o3-pro model version
2025-06-10 16:33:11 -07:00
Cole McIntosh 3919b64209 Add new Mistral models to pricing and context window JSON: add 'mistral/magistral-medium-2506' and 'mistral/magistral-small-2506' with token limits and cost details 2025-06-10 08:38:24 -06:00
Krish Dholakia 25c0d39307 Add VertexAI claude-opus-4 + Assign users to orgs on creation (#11572)
* build(model_prices_and_context_window.json): add 'claude-opus-4' on vertexai (no @)

* build(model_prices_and_context_window.json): add claude sonnet 4 without 2

*@

* feat(internal_user_endpoints.py): assign user to orgs on user creation

allows user to be a member of orgs on creation - work to enable default orgs on UI

* fix(internal_user_endpoints.py): fix http_request
2025-06-09 23:24:06 -07:00
Ishaan Jaff 9241fca2f5 Fix: Adds support for choosing the default region based on where the model is available (#11566)
* fix: vtx default region for global only models

* track gemini-2.5-pro-preview-05-06

* fix is_global_only_vertex_model

* test_is_global_only_vertex_model

* test_get_vertex_region_global_only_model

* fix json format

* fix get_supported_regions
2025-06-09 18:29:44 -07:00
Cole McIntosh abe4c8fe4c feat: add gpt-4o-audio-preview model configuration to model_prices_and_context_window.json (#11560) 2025-06-09 14:56:36 -07:00
Ishaan Jaff eb02cf1a2d Revert "Nebius model pricing info updated (#11445)" (#11493)
This reverts commit 32281de91f.
2025-06-06 11:04:21 -07:00
Akim Tsvigun 32281de91f Nebius model pricing info updated (#11445) 2025-06-06 10:43:04 -07:00
Ishaan Jaff 2aa75e1403 add codex-mini-latest (#11492) 2025-06-06 10:39:09 -07:00
Peter Dave Hello b452f82045 Add Google Gemini 2.5 Pro Preview 06-05 (#11447) 2025-06-06 09:28:53 -07:00
Krish Dholakia 603bd73a17 Gemini - web search cost tracking + Update max output tokens for nova models
* fix(vertex_and_google_ai_studio_gemini.py): add web search request tracking

Enables cost calculation for google web search

* fix(vertex_and_gemini): use common processing logic across stream / non-stream calls

* fix(vertex_And_google_ai_studio_Gemini.py): fix initial choice

* fix: fix linting error

* fix: add initial support for google search cost tracking

* fix(tool_call_cost_tracking.py): working tool cost tracking for gemini

* fix(vertex_ai/gemini/cost_calculator.py): add google web search tool cost tracking for vertex ai

Closes LIT-210

* fix: fix check

* build(model_prices_and_context_window.json): fix amazon nova max output tokens

Closes https://github.com/BerriAI/litellm/issues/11441

* fix: fix ruff check
2025-06-05 23:25:18 -07:00
Krrish Dholakia 505d2fe0c7 build: bump 2025-06-05 00:08:53 -07:00
Jimmy Tsai 4019f79808 feat: add deepseek-r1 family model configuration to pricing JSON (#11394) 2025-06-04 22:39:06 -07:00
Cole McIntosh 7bbd8262ed Add Claude 4 Sonnet & Opus, DeepSeek R1, and fix Llama Vision model pricing configurations (#11339)
* fix: update model path for llama-v3p2-90b-vision-instruct in pricing configuration (missing fireworks_ai/ prefix)

* feat: add deepseek-r1-0528 model configuration to pricing JSON

* feat: add configurations for new Claude 4 model alias to pricing JSON

* undo prefix change

* fix: update supports_response_schema to false in pricing JSON for litellm_provider

* update supports_tool_choice and supports_response_schema

* Update model configuration to disable function calling and tool choice for multiple models in fireworks_ai. Adjusted supported parameters in FireworksAIConfig to conditionally include tools and tool_choice based on model compatibility.

* Refactor FireworksAIConfig to use supports_function_calling from utils

* Enhance FireworksAIConfig to conditionally support tool_choice based on model capabilities
2025-06-03 20:39:47 -07:00
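The conditional parameter support described in the last three items of this commit can be illustrated with a small sketch. The capability table, the model names, the base parameter list, and the function name are all hypothetical; the real FireworksAIConfig reads these flags through LiteLLM's own utilities.

```python
# Hypothetical capability table standing in for the pricing JSON flags.
MODEL_CAPABILITIES = {
    "fireworks_ai/example-tool-model": {
        "supports_function_calling": True,
        "supports_tool_choice": True,
    },
    "fireworks_ai/example-plain-model": {
        "supports_function_calling": False,
        "supports_tool_choice": False,
    },
}

# Illustrative base list; not the actual set of Fireworks AI parameters.
BASE_PARAMS = ["max_tokens", "temperature", "top_p", "stream", "stop"]


def get_supported_params(model):
    """Add tools / tool_choice only when the model's flags allow them."""
    capabilities = MODEL_CAPABILITIES.get(model, {})
    params = list(BASE_PARAMS)
    if capabilities.get("supports_function_calling"):
        params.append("tools")
    if capabilities.get("supports_tool_choice"):
        params.append("tool_choice")
    return params


if __name__ == "__main__":
    print(get_supported_params("fireworks_ai/example-tool-model"))
    print(get_supported_params("fireworks_ai/example-plain-model"))
```

Unknown models get only the base parameters in this sketch, which is a conservative default chosen for the example.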
Marty Sullivan d247a390bd add gemini-embeddings-001 model prices and context window (#11332)
* add gemini-embeddings-001 model prices and context window

* use scientific notation
2025-06-03 15:59:30 -07:00
Cole McIntosh 621d609879 feat: add cerebras/qwen-3-32b model pricing and capabilities to model_prices_and_context_window.json (#11373) 2025-06-03 11:32:13 -07:00
Cole McIntosh 94650c10fe feat: Add support for Cohere Embed v4.0 model (#11329)
- Updated model_prices_and_context_window.json to include embed-v4.0 with relevant pricing and metadata.
- Added embed-v4.0 to cohere_embedding_models in constants.py.
- Implemented comprehensive tests for Cohere Embed v4.0 in test_cohere.py, covering basic functionality, input types, error handling, and optional parameters.
2025-06-02 11:25:29 -07:00
Krish Dholakia 06484f6e5a Xai, VertexAI, Google AI Studio - live web search support in OpenAI format (#11251)
* build(model_prices_and_context_window.json): fix 'supports_web_search' flag - openai only supports it on 2 models - gpt-4o-search-preview and gpt-4o-mini-search-preview

* feat(xai/chat): add xai web search options param support

* test: add max tokens to test

xai output very verbose

* build(xai/): add web search support for all xai models

* build(model_prices_and_cost.json): add gemini-2.0 supports web search

* feat(gemini/): map openai 'web_search_options' to google's 'googlesearch' tool

* build(model_prices_and_context_window.json): add supports_web_search for vertex_ai/gemini-2 models

* fix: fix circular reference error

* fix(convert_dict_to_response.py): handle scenario where xai returns finish reason as 'stop' for tool calls

* fix: reduce function size

* fix: import session handling

* Revert "fix: import session handling"

This reverts commit deb257dc10.

* fix: linting pin mypy

* [Feat]: Guardrails - Add streaming for bedrock post guard (#11247)

* feat: add streaming for bedrock post guard

* fix: bedrock guardrails

* fix: add clear comments

* Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: clean up bedrock guardrails

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* [Fix] Responses API - Session management (#11254)

* fix: import session handling

* fix: imports for session handler

* tests: tests for session handler

* Update enterprise/litellm_enterprise/enterprise_callbacks/session_handler.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* bump: bump litellm enterprise

* fixes: test_create_user_default_budget

* fix(xai/): filter 'strict' on tool call

* test: update test for new error string

* fix(utils.py): default to None if not set in model cost map

ensures consistent usage of 'supports_[x]' flags

* fix(fireworks_ai/): support fireworks ai document inlining on pdf's sent via openai 'file' message type

* test: update test

* test: name filter_value_from_dict

* fix(fireworks_ai/): handle cache control flag in messages

* fix(xai/chat): fix check

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-31 14:26:16 -07:00
Krish Dholakia 8fb2779c9e build(model_prices_and_context_window.json): add supports parallel function calling to all gemini models (#11225) 2025-05-28 22:32:02 -07:00
Regis David Souza Mesquita 56c32ef503 Update mistral-medium prices and context sizes (#10729)
* Update mistral-medium prices and context sizes

While testing the Mistral model, I noticed a discrepancy in the pricing shown on the logs screen. After reviewing the code, I confirmed that the pricing values were incorrect.

This PR corrects the input and output token pricing for the latest Mistral model and adds the newly released mistral-medium-2505 version.

* Adds tool calling flag to mistral-medium

* Adds mistral-medium price updates to the main model price file

* Update model_prices_and_context_window_backup.json

sets mistral medium alias to the old values as it probably points to the old version.

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* Update model_prices_and_context_window.json
2025-05-28 16:42:28 -07:00