litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-24 19:39:12 +00:00

Author	SHA1	Message	Date
dcieslak19973	d480cea8b0	Add azure_ai cohere rerank v3.5 (#12283 ) * Add azure_ai cohere rerank v3.5 * Fix CI error	2025-07-03 10:01:45 -07:00
codeugar	06c86d6130	Update model_prices_and_context_window.json (#11972 ) add -- "deepseek/deepseek-r1": { "max_tokens": 8192, "max_input_tokens": 65536, "max_output_tokens": 8192, "input_cost_per_token": 5.5e-07, "input_cost_per_token_cache_hit": 1.4e-07, "output_cost_per_token": 2.19e-06, "litellm_provider": "deepseek", "mode": "chat", "supports_function_calling": true, "supports_assistant_prefill": true, "supports_tool_choice": true, "supports_reasoning": true, "supports_prompt_caching": true }, "deepseek/deepseek-v3": { "max_tokens": 8192, "max_input_tokens": 65536, "max_output_tokens": 8192, "input_cost_per_token": 2.7e-07, "input_cost_per_token_cache_hit": 7e-08, "cache_read_input_token_cost": 7e-08, "cache_creation_input_token_cost": 0.0, "output_cost_per_token": 1.1e-06, "litellm_provider": "deepseek", "mode": "chat", "supports_function_calling": true, "supports_assistant_prefill": true, "supports_tool_choice": true, "supports_prompt_caching": true }, -- tencent custom deploy deepseek named "deepseek-r1" and "deepseek-v3". Thanks very much !	2025-06-27 21:36:59 -07:00
Cole McIntosh	0b95fb63cc	Add Azure OpenAI assistant features cost tracking (#12045 ) * Add Azure OpenAI assistant features cost tracking Implements cost tracking for Azure's new assistant features: - File Search: $0.1 USD per 1 GB/Day (storage-based pricing) - Code Interpreter: $0.03 USD per session - Computer Use: $0.003 input + $0.012 output per 1K tokens Features: - Provider-specific pricing (Azure vs OpenAI) - Model-specific pricing overrides via JSON config - Environment variable configuration - Backwards compatible with existing OpenAI pricing * Add comprehensive tests for Azure assistant features cost tracking - Unit tests for file search, code interpreter, computer use, vector store - Integration tests for combined cost calculation - Provider-specific pricing tests (Azure vs OpenAI) - Model-specific pricing override tests - Edge case handling (None inputs, zero values) - All 17 tests passing * Fix test and ensure all Azure assistant cost tracking tests pass - Fixed integration test approach - All 17 tests now passing - Comprehensive coverage of Azure assistant features cost tracking * Enhance cost tracking for Azure assistant features - Safely convert and extract parameters for file search, computer use, and code interpreter sessions. - Ensure model_info is consistently converted to a dictionary format. - Improve error handling for input values to prevent type-related issues. - Maintain compatibility with existing cost calculation methods. * Refactor cost tracking for Azure assistant features - Introduced separate methods for handling costs related to web search, file search, vector store, computer use, and code interpreter. - Enhanced parameter extraction and conversion for file search and computer use. - Improved error handling and type safety throughout the cost calculation process. - Maintained compatibility with existing cost calculation methods while streamlining the overall structure.	2025-06-27 21:33:00 -07:00
Ishaan Jaff	ebf6395bc1	[Feat] Add Eleven Labs - Speech To Text Support on LiteLLM (#12119 ) * add ELEVENLABS as a provider * add deepgram to main.py * add ElevenLabsException * add ElevenLabsAudioTranscriptionConfig * add transform_audio_transcription_response * TestElevenLabsAudioTranscription * add elevenlabs/scribe_v1 to model cost map * add ElevenLabsAudioTranscriptionConfig * add AudioTranscriptionRequestData * add ElevenLabs transform * use AudioTranscriptionRequestData * refactoring fixes * add ProcessedAudioFile util for reading audio files * test_elevenlabs_diarize_parameter_passthrough * docs eleven labs * docs fixes * fix code qa checks * fixes - audio transcription * ui - add ElevenLabs logo * add elevenlabs logo * docs - ElevenLabs * test fix elevenlabs	2025-06-27 17:50:49 -07:00
Krish Dholakia	8bd1f8f6ab	Add o3 and o4-mini deep research models (#12109 ) * build(model_prices_and_context_window.json): add o3-deep-research models * build(model_prices_and_context_window.json): add o4-deep-research model * build(model_prices_and_context_window.json): add o4-mini-deep-research versioned model	2025-06-27 09:58:53 -07:00
Krrish Dholakia	0e96f412a1	build(model_prices_and_context_window.json): mark azure o3-pro as responses api model Fixes https://github.com/BerriAI/litellm/issues/12059	2025-06-26 10:41:03 -07:00
Krish Dholakia	1a4ad8bf18	Update mistral 'supports_response_schema' field + Fix ollama embedding (#12024 ) * build(model_prices_and_context_window.json): update all mistral models (besides codestral-mamba) to indicate support for response schema Closes https://github.com/BerriAI/litellm/issues/12012 * fix(route_llm_request.py): if llm router is not initialized, go straight through to litellm sdk Fixes https://github.com/BerriAI/litellm/issues/12008 * test: add unit test * fix(ollama_embeddings): fix unnecessary await Fixes https://github.com/BerriAI/litellm/issues/11997 * test: update ollama embedding tests	2025-06-25 07:20:13 -07:00
Marty Sullivan	a5ce1cd49b	add azure o3-pro pricing (#11990 )	2025-06-24 10:57:24 -07:00
Cole McIntosh	eacb4dfdef	Add Mistral 3.2 24B to model mapping (#11926 ) * feat(model_prices_and_context_window.json): add mistral-small-3.2-24b-instruct model with token costs and chat mode support * fix(model_prices_and_context_window.json): update model paths to include 'openrouter' prefix for mistral-small-3.1 and 3.2	2025-06-23 14:54:39 -07:00
Cole McIntosh	02a095d4db	feat: implement Perplexity citation tokens and search queries cost calculation (#11938 ) * feat: add citation_cost_per_token and search_queries_cost_per_1000 fields to ModelInfoBase - Add citation_cost_per_token field to ModelInfoBase for Perplexity citation token costs - Add search_queries_cost_per_1000 field to ModelInfoBase for Perplexity search query costs - Update _get_model_info_helper to include these fields in model info responses - Enables proper cost calculation for Perplexity-specific usage metrics * feat: update Perplexity sonar-deep-research model pricing configuration - Update input/output token costs to / per million tokens respectively - Add reasoning token cost at per million tokens - Add citation_cost_per_token at per million tokens (same as input) - Add search_queries_cost_per_1000 at $0.005 per 1000 search queries - Remove deprecated search_context_cost_per_query structure - Aligns with Perplexity's updated pricing model for deep research capabilities * feat: implement Perplexity-specific cost calculator - Create cost_per_token function for Perplexity provider - Calculate standard input/output token costs - Add citation token cost calculation using citation_cost_per_token rate - Add reasoning token cost calculation with fallback to completion_tokens_details - Add search query cost calculation using search_queries_cost_per_1000 rate - Return separate prompt_cost and completion_cost for accurate billing - Handles all Perplexity-specific usage metrics: citation_tokens, num_search_queries, reasoning_tokens * feat: integrate Perplexity cost calculator with main cost calculation system - Import perplexity_cost_per_token function in main cost calculator - Add perplexity provider case to cost_per_token function - Enables automatic routing of Perplexity cost calculations to provider-specific logic - Maintains compatibility with existing cost calculation patterns - Supports all Perplexity-specific cost metrics through 
unified interface * feat: enhance Perplexity response transformation to extract cost-related fields - Override transform_response method to extract Perplexity-specific usage fields - Add _enhance_usage_with_perplexity_fields method to process API responses - Extract citation_tokens from citations array using character-based estimation (~4 chars/token) - Extract num_search_queries from both usage field and root level with priority handling - Create usage object when none exists to ensure cost fields are always captured - Handle empty citations and missing fields gracefully - Enables automatic extraction of cost metrics from Perplexity API responses * test: add comprehensive test suite for Perplexity cost calculation features Add 82 comprehensive tests across 3 test files: - test_perplexity_cost_calculator.py (59 tests): * Cost calculation with citation tokens, search queries, reasoning tokens * Various combinations and edge cases * Integration with main cost calculator * Model info access and validation * Zero values and missing fields handling - test_perplexity_chat_transformation.py (12 tests): * Citation token extraction from API responses * Search query extraction from usage and root fields * Priority handling and field aggregation * Empty citations and missing fields handling * Token estimation accuracy validation - test_perplexity_integration.py (11 tests): * End-to-end cost calculation workflows * High-volume and edge case scenarios * Model info integration validation * Case-insensitive provider matching * Transformation preservation of existing fields Ensures reliability and correctness of all Perplexity cost features with comprehensive coverage of happy path, edge cases, and error conditions. 
* fix: remove unused Union import from Perplexity transformation - Remove unused typing.Union import from litellm/llms/perplexity/chat/transformation.py - Fixes F401 linting error: 'typing.Union imported but unused' - Maintains only necessary imports: Any, List, Optional, Tuple * Fix JSON schema validation and use web_search_requests field - Add citation_cost_per_token and search_queries_cost_per_1000 to JSON schema - Update Perplexity transformation to use web_search_requests in PromptTokensDetailsWrapper - Update Perplexity cost calculator to read from web_search_requests field - Maintain backward compatibility while using standard LiteLLM fields * Fix type errors in Perplexity cost calculator - Add null checks for token counts and cost values to prevent None multiplication errors - Use .get() with fallback values instead of direct dictionary access - Ensure all arithmetic operations handle None values safely This fixes the failing job 44517525148 type errors. * Refactor Perplexity cost calculation tests to improve accuracy and consistency - Replace absolute difference assertions with math.isclose for better precision in cost comparisons - Update tests to utilize PromptTokensDetailsWrapper for handling web search requests - Ensure all test cases correctly reflect the new structure of usage fields, enhancing clarity and maintainability * fix: address type hinting issues in PerplexityChatConfig usage handling - Add type ignore comments to model_response.usage assignments to resolve type checking errors - Ensures compatibility with type definitions while maintaining existing functionality * Update model pricing configuration in JSON backup - Add citation_cost_per_token and search_queries_cost_per_1000 fields to enhance cost tracking - Remove deprecated search_context_cost_per_query structure to streamline pricing model - Aligns with recent updates in Perplexity's pricing strategy * Update search queries cost structure in model_prices_and_context_window.json to use 
search_context_cost_per_query * Refactor search queries cost structure in model_prices_and_context_window_backup.json and update related code to use search_queries_cost_per_query. Remove deprecated search_queries_cost_per_1000 references across model info and tests. * Enhance cost calculation in cost_calculator.py by introducing a safe float casting function to handle potential None and invalid values. Update cost calculations for input, citation, output, reasoning, and search query tokens to use this new function, ensuring more robust handling of model pricing data. * Refactor cost calculation in cost_calculator.py to support both legacy and current search cost keys. Enhance handling of search cost values by accommodating both dictionary and float formats, ensuring robust cost computation for search queries. * Update test cases to reflect changes in cost structure, renaming search_queries_cost_per_query to search_context_cost_per_query for consistency with recent refactor. Ensure assertions in tests align with updated cost keys. * Update test_perplexity_integration.py to rename search_queries_cost_per_query to search_context_cost_per_query, ensuring consistency with recent cost structure changes. Adjust assertions to align with updated cost keys.	2025-06-23 14:15:25 -07:00
Erv Walter	aaa41d1e24	Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens) (#11937 ) * Initial plan for issue * Update Azure o3 pricing to match OpenAI pricing ($2/$8 per 1M tokens) Co-authored-by: ervwalter <768790+ervwalter@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ervwalter <768790+ervwalter@users.noreply.github.com>	2025-06-20 23:17:46 -07:00
Ishaan Jaff	99d851544a	[Feat] Add Azure Codex Models on LiteLLM + new /v1 preview Azure OpenAI API (#11934 ) * fix get_complete_url * fixes _is_azure_v1_api_version * test_azure_responses_api_preview_api_version * TestAzureResponsesAPIConfig * add azure/codex-mini * fix azure/codex-mini * Update litellm/llms/azure/responses/transformation.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix linting --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-06-20 18:08:44 -07:00
Ishaan Jaff	19f13c842f	add gemini-2.5-pro (#11927 )	2025-06-20 11:29:06 -07:00
Krish Dholakia	40cc61c8f3	build(model_prices_and_context_window.json): mark all gemini-2.5 mode… (#11907 ) * build(model_prices_and_context_window.json): mark all gemini-2.5 models as supporting pdf input Closes https://github.com/BerriAI/litellm/issues/11881 * fix(anthropic_transformation.py): set custom llm provider custom property Fixes https://github.com/BerriAI/litellm/issues/11861 * test: add unit test for checking supports_reasoning * test: add test for vertex ai flow * feat(bedrock/anthropic): ensure thinking param correctly passed for bedrock/invoke	2025-06-19 21:07:25 -07:00
lgruen-vcgs	e1c77e70c2	Add AWS Bedrock profiles for the APAC region (#11883 ) Likely fixes #6905, #9228, and addresses https://github.com/BerriAI/litellm/issues/11057#issuecomment-2903257768.	2025-06-19 20:10:04 -07:00
Ishaan Jaff	0fe8bf2fc2	fix gemini-2.5-flash-lite-preview-06-17	2025-06-19 17:07:34 -07:00
fatih akyon	5b6ba871a5	[Bug Fix] add missing `flash-2.5-flash-lite` for gemini provider, fix `gemini-2.5-flash` pricing (#11901 )	2025-06-19 16:38:17 -07:00
Krrish Dholakia	649636b26b	build(model_prices_and_context_window.json): ensure tpm/rpm limits are int Closes https://github.com/BerriAI/litellm/issues/11882	2025-06-19 14:58:16 -07:00
Ishaan Jaff	e1764af890	fix meta_llama/Llama-3.3-8B-Instruct	2025-06-19 13:44:05 -07:00
Krrish Dholakia	b080220d02	build: fix gemini-2.5-pro rate limits	2025-06-18 22:56:56 -07:00
Krrish Dholakia	dfafa986ea	build(model_prices_and_context_window.json): add gemini google ai studio rate limits	2025-06-18 22:55:54 -07:00
Low Jian Sheng	ca6fa63362	Fix gemini 2.5 flash config (#11830 ) * fix gemini 2.5 flash config * add gemini 2.5 flash	2025-06-18 20:16:48 -07:00
salah alzubi	d7e53edc26	Update model_prices_and_context_window.json (#11803 ) -- Updated pricing for Gemini Flash -- Updated a few Openrouter models -- Updated pricing for Gemini Flash Lite	2025-06-17 17:12:08 -07:00
Emerson Gomes	b21f4a3f74	Add Vertex Imagen-4 models (#11767 )	2025-06-16 10:08:51 -07:00
Krish Dholakia	0908618a19	Litellm stable release 06 14 2025 (#11737 ) * docs: initial commit with stable release changelog notes * docs: style updates * docs(index.md): updated changelog * docs(index.md): cleanup * docs(index.md): add general proxy improvements * docs: index.md cleanup	2025-06-14 16:56:29 -07:00
nevin	b7cb66ee8f	Fixed grok-3-mini to not use stop tokens (#11563 ) * fixed grok-3-mini to not use stop tokens * added xai config test	2025-06-14 14:26:43 -07:00
Cole McIntosh	6b9754e2aa	Merge pull request #11642 from colesmcintosh/mistral-reasoning Enhance Mistral model support with reasoning capabilities	2025-06-12 16:42:53 -06:00
Ishaan Jaff	27cc503185	add gpt-4o-mini-transcribe (#11676 )	2025-06-12 15:30:25 -07:00
Cole McIntosh	12a61fce4a	[Feat] Enhance Mistral model support with reasoning capabilities * Added support for reasoning parameters in magistral models, including "reasoning_effort" and "thinking". * Updated the MistralConfig class to handle reasoning system prompts. * Implemented tests to verify reasoning functionality and ensure correct parameter mapping for magistral models. * Enhanced the model prices JSON to reflect new reasoning capabilities.	2025-06-11 17:13:06 -06:00
Ishaan Jaff	52ef96261f	[UI] Add Deepgram provider to supported providers list and mappings (#11634 ) * Add Deepgram provider to supported providers list and mappings * add logo * Add deepgram to model cost map * ui - require api key for deepgram * fix logo path --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com>	2025-06-11 12:12:12 -07:00
Krrish Dholakia	e4ac1cdef2	build(model_prices_and_context_window.json): fix o3-pro mode to 'responses'	2025-06-11 09:08:58 -07:00
Krish Dholakia	56f481a47e	Add new o3 models pricing (#11606 ) * build(model_prices_and_context_window.json): add o3-pro pricing * build(model_prices_and_context_window.json): add updated o3 model pricing * build(model_prices_and_context_window.json): add new o3-pro model version	2025-06-10 16:33:11 -07:00
Cole McIntosh	3919b64209	Add new Mistral models to pricing and context window JSON: add 'mistral/magistral-medium-2506' and 'mistral/magistral-small-2506' with token limits and cost details	2025-06-10 08:38:24 -06:00
Krish Dholakia	25c0d39307	Add VertexAI `claude-opus-4` + Assign users to orgs on creation (#11572 ) * build(model_prices_and_context_window.json): add 'claude-opus-4' on vertexai (no @) * build(model_prices_and_context_window.json): add claude sonnet 4 without 2 @ feat(internal_user_endpoints.py): assign user to orgs on user creation allows user to be a member of orgs on creation - work to enable default orgs on UI * fix(internal_user_endpoints.py): fix http_request	2025-06-09 23:24:06 -07:00
Ishaan Jaff	9241fca2f5	Fix: Adds support for choosing the default region based on where the model is available (#11566 ) * fix: vtx default region for global only models * track gemini-2.5-pro-preview-05-06 * fix is_global_only_vertex_model * test_is_global_only_vertex_model * test_get_vertex_region_global_only_model * fix json format * fix get_supported_regions	2025-06-09 18:29:44 -07:00
Cole McIntosh	abe4c8fe4c	feat: add gpt-4o-audio-preview model configuration to model_prices_and_context_window.json (#11560 )	2025-06-09 14:56:36 -07:00
Ishaan Jaff	eb02cf1a2d	Revert "Nebius model pricing info updted (#11445 )" (#11493 ) This reverts commit `32281de91f`.	2025-06-06 11:04:21 -07:00
Akim Tsvigun	32281de91f	Nebius model pricing info updted (#11445 )	2025-06-06 10:43:04 -07:00
Ishaan Jaff	2aa75e1403	add codex-mini-latest (#11492 )	2025-06-06 10:39:09 -07:00
Peter Dave Hello	b452f82045	Add Google Gemini 2.5 Pro Preview 06-05 (#11447 )	2025-06-06 09:28:53 -07:00
Krish Dholakia	603bd73a17	Gemini - web search cost tracking + Update max output tokens for nova models * fix(vertex_and_google_ai_studio_gemini.py): add web search request tracking Enables cost calculation for google web search * fix(vertex_and_gemini): use common processing logic across stream / non-stream calls * fix(vertex_And_google_ai_studio_Gemini.py): fix initial choice * fix: fix linting error * fix: add initial support for google search cost tracking * fix(tool_call_cost_tracking.py): working tool cost tracking for gemini * fix(vertex_ai/gemini/cost_calculator.py): add google web search tool cost tracking for vertex ai Closes LIT-210 * fix: fix check * build(model_prices_and_context_window.json): fix amazon nova max output tokens Closes https://github.com/BerriAI/litellm/issues/11441 * fix: fix ruff check	2025-06-05 23:25:18 -07:00
Krrish Dholakia	505d2fe0c7	build: bump	2025-06-05 00:08:53 -07:00
Jimmy Tsai	4019f79808	feat: add deepseek-r1 family model configuration to pricing JSON (#11394 )	2025-06-04 22:39:06 -07:00
Cole McIntosh	7bbd8262ed	Add Claude 4 Sonnet & Opus, DeepSeek R1, and fix Llama Vision model pricing configurations (#11339 ) * fix: update model path for llama-v3p2-90b-vision-instruct in pricing configuration (missing fireworks_ai/ prefix) * feat: add deepseek-r1-0528 model configuration to pricing JSON * feat: add configurations for new Claude 4 model alias to pricing JSON * undo prefix change * fix: update supports_response_schema to false in pricing JSON for litellm_provider * update supports_tool_choice and supports_response_schema * Update model configuration to disable function calling and tool choice for multiple models in fireworks_ai. Adjusted supported parameters in FireworksAIConfig to conditionally include tools and tool_choice based on model compatibility. * Refactor FireworksAIConfig to use supports_function_calling from utils * Enhance FireworksAIConfig to conditionally support tool_choice based on model capabilities	2025-06-03 20:39:47 -07:00
Marty Sullivan	d247a390bd	add gemini-embeddings-001 model prices and context window (#11332 ) * add gemini-embeddings-001 model prices and context window * use scientific notation	2025-06-03 15:59:30 -07:00
Cole McIntosh	621d609879	feat: add cerebras/qwen-3-32b model pricing and capabilities to model_prices_and_context_window.json (#11373 )	2025-06-03 11:32:13 -07:00
Cole McIntosh	94650c10fe	feat: Add support for Cohere Embed v4.0 model (#11329 ) - Updated model_prices_and_context_window.json to include embed-v4.0 with relevant pricing and metadata. - Added embed-v4.0 to cohere_embedding_models in constants.py. - Implemented comprehensive tests for Cohere Embed v4.0 in test_cohere.py, covering basic functionality, input types, error handling, and optional parameters.	2025-06-02 11:25:29 -07:00
Krish Dholakia	06484f6e5a	Xai, VertexAI, Google AI Studio - live web search support in OpenAI format (#11251 ) * build(model_prices_and_context_window.json): fix 'supports_web_search' flag - openai only supports it on 2 models - gpt-4o-search-preview and gpt-4o-mini-search-preview * feat(xai/chat): add xai web search options param support * test: add max tokens to test xai output very verbose * build(xai/): add web search support for all xai models * build(model_prices_and_cost.json): add gemini-2.0 supports web search * feat(gemini/): map openai 'web_search_options' to google's 'googlesearch' tool * build(model_prices_and_context_window.json): add supports_web_search for vertex_ai/gemini-2 models * fix: fix circular reference error * fix(convert_dict_to_response.py): handle scenario where xai returns finish reason as 'stop' for tool calls * fix: reduce function size * fix: import session handling * Revert "fix: import session handling" This reverts commit `deb257dc10`. * fix: linting pin mypy * [Feat]: Guardrails - Add streaming for bedrock post guard (#11247) * feat: add streaming for bedrock post guard * fix: bedrock guardrails * fix: add clear comments * Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: clean up bedrock guardrails --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * [Fix] Responses API - Session management (#11254) * fix: import session handling * fix: imports for session handler * tests: tests for session handler * Update enterprise/litellm_enterprise/enterprise_callbacks/session_handler.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * bump: bump litellm enterprise * fixes: 
test_create_user_default_budget * fix(xai/): filter 'strict' on tool call * test: update test for new error string * fix(utils.py): default to None if not set in model cost map ensures consistent usage of 'supports_[x]' flags * fix(fireworks_ai/): support fireworks ai document inlining on pdf's sent via openai 'file' message type * test: update test * test: name filter_value_from_dict * fix(fireworks_ai/): handle cache control flag in messages * fix(xai/chat): fix check --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-31 14:26:16 -07:00
Krish Dholakia	8fb2779c9e	build(model_prices_and_context_window.json): add supports parallel function calling to all gemini models (#11225 )	2025-05-28 22:32:02 -07:00
Regis David Souza Mesquita	56c32ef503	Update mistral-medium prices and context sizes (#10729 ) * Update mistral-medium prices and context sizes While testing the Mistral model, I noticed a discrepancy in the pricing shown on the logs screen. After reviewing the code, I confirmed that the pricing values were incorrect. This PR corrects the input and output token pricing for the latest Mistral model and adds the newly released mistral-medium-2505 version. * Adds tool calling flag to mistral-medium * Adds mistral-medium price updates to the main model price file * Update model_prices_and_context_window_backup.json sets mistral medium alias to the old values as it probably points to the old version. * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * Update model_prices_and_context_window.json	2025-05-28 16:42:28 -07:00

717 Commits