Commit Graph

10 Commits

Author SHA1 Message Date
Cole McIntosh b6589a72c9 fix: Add size parameter support for Vertex AI image generation (#12292)
- Added 'size' to supported parameters for vertex_ai in get_optional_params_image_gen
- Implemented mapping from OpenAI size format (e.g., '1024x1024') to Vertex AI aspectRatio format (e.g., '1:1')
- Supports common aspect ratios: 1:1 (square), 16:9 (landscape), 9:16 (portrait)
- Added comprehensive test coverage for the size parameter mapping
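
The size-to-aspectRatio mapping described above could look roughly like the sketch below. The function name `size_to_aspect_ratio` and the nearest-ratio rule are assumptions made for illustration; the commit message only states that OpenAI-style sizes are mapped to the 1:1, 16:9, and 9:16 ratios.

```python
from typing import Optional

# Aspect ratios named in the commit message. Choosing the nearest one is an
# assumption of this sketch, not necessarily what the commit implements.
SUPPORTED_ASPECT_RATIOS = {"1:1": 1 / 1, "16:9": 16 / 9, "9:16": 9 / 16}


def size_to_aspect_ratio(size: str) -> Optional[str]:
    """Map an OpenAI-style 'WIDTHxHEIGHT' string to the closest supported ratio."""
    try:
        width_text, height_text = size.lower().split("x")
        width, height = int(width_text), int(height_text)
    except ValueError:
        return None  # not in 'WIDTHxHEIGHT' form
    if width <= 0 or height <= 0:
        return None
    ratio = width / height
    # Pick the supported ratio whose numeric value is nearest to the request.
    return min(
        SUPPORTED_ASPECT_RATIOS,
        key=lambda name: abs(SUPPORTED_ASPECT_RATIOS[name] - ratio),
    )
```

Returning `None` for unparseable input leaves the caller free to decide whether to omit the parameter or raise an error.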

Fixes LIT-279: Vertex AI Image Generation Aspect Ratio Support
2025-07-03 21:43:37 -07:00
Cole McIntosh 8965cc3b6c Fix unpack_defs handling of nested $ref inside anyOf items (#11964)
* refactor(unpack_defs): enhance handling of schema properties and anyOf structures

- Improved the unpack_defs function to handle top-level properties and nested structures more effectively.
- Added recursion for items in schemas and refined the handling of anyOf branches to ensure proper unpacking of references.
- Streamlined the logic for resolving $ref keys and managing nested schemas.

* test(unpack_defs): add test for resolving nested $ref in anyOf items

- Introduced a new test to verify that unpack_defs correctly resolves references within items of anyOf structures, addressing a specific bug scenario (Issue #11372).
- The test includes a minimal schema to ensure proper unpacking and validation of the resolved items schema.

* refactor(unpack_defs): implement a generic resolver for $ref entries

- Redesigned the unpack_defs function to provide a more robust and dependency-free implementation for resolving all $ref entries in JSON schemas.
- Introduced a depth-first traversal method that efficiently handles nested structures, including anyOf, allOf, and items, while avoiding infinite recursion.
- Enhanced memory management by resolving nodes in-place without creating a full dereferenced copy, improving performance and reducing overhead.
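
A minimal sketch of such a depth-first `$ref` resolver follows. The name `resolve_refs`, the `#/$defs/` prefix check, and the cycle guard via an `active` set are assumptions for illustration. Unlike the in-place approach described above, this sketch copies each definition when it substitutes it, which keeps the example short.

```python
import copy
from typing import Any, Dict, Optional, Set


def resolve_refs(node: Any, defs: Dict[str, Any], active: Optional[Set[str]] = None) -> Any:
    """Replace local '#/$defs/<name>' references with their definitions."""
    active = set() if active is None else active
    if isinstance(node, list):
        # Covers anyOf / allOf / oneOf branches, which are lists of schemas.
        for index, item in enumerate(node):
            node[index] = resolve_refs(item, defs, active)
        return node
    if not isinstance(node, dict):
        return node  # scalars need no resolution
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/$defs/"):
        name = ref[len("#/$defs/"):]
        if name in active or name not in defs:
            return node  # cyclic or unknown reference: leave it untouched
        target = copy.deepcopy(defs[name])
        # Track the definition being expanded so self-references terminate.
        return resolve_refs(target, defs, active | {name})
    for key in list(node):
        # Covers properties, items, and any other nested schema position.
        node[key] = resolve_refs(node[key], defs, active)
    return node


schema = {
    "$defs": {"Item": {"type": "object", "properties": {"name": {"type": "string"}}}},
    "type": "object",
    "properties": {
        "values": {
            "anyOf": [
                {"type": "array", "items": {"$ref": "#/$defs/Item"}},
                {"type": "null"},
            ]
        }
    },
}
resolved = resolve_refs(schema, schema["$defs"])
```

The example schema mirrors the bug scenario named in the commit: a `$ref` nested inside `items` of an `anyOf` branch.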

* Remove test for unpack_defs resolving nested references in anyOf items from test_utils.py

* Add test for unpack_defs resolving nested references in anyOf items

This commit introduces a new test to ensure that the unpack_defs function correctly resolves $ref references within items of anyOf schemas, addressing issue #11372. The test verifies that the unpacked schema contains the expected properties and structure.
2025-06-24 09:08:15 -07:00
Cole McIntosh 02a095d4db feat: implement Perplexity citation tokens and search queries cost calculation (#11938)
* feat: add citation_cost_per_token and search_queries_cost_per_1000 fields to ModelInfoBase

- Add citation_cost_per_token field to ModelInfoBase for Perplexity citation token costs
- Add search_queries_cost_per_1000 field to ModelInfoBase for Perplexity search query costs
- Update _get_model_info_helper to include these fields in model info responses
- Enables proper cost calculation for Perplexity-specific usage metrics

* feat: update Perplexity sonar-deep-research model pricing configuration

- Update input and output token costs, priced per million tokens
- Add reasoning token cost, priced per million tokens
- Add citation_cost_per_token, priced per million tokens (same rate as input)
- Add search_queries_cost_per_1000 at $0.005 per 1000 search queries
- Remove deprecated search_context_cost_per_query structure
- Aligns with Perplexity's updated pricing model for deep research capabilities

* feat: implement Perplexity-specific cost calculator

- Create cost_per_token function for Perplexity provider
- Calculate standard input/output token costs
- Add citation token cost calculation using citation_cost_per_token rate
- Add reasoning token cost calculation with fallback to completion_tokens_details
- Add search query cost calculation using search_queries_cost_per_1000 rate
- Return separate prompt_cost and completion_cost for accurate billing
- Handles all Perplexity-specific usage metrics: citation_tokens, num_search_queries, reasoning_tokens
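
The calculation listed above can be sketched as follows. The function name `perplexity_cost_sketch`, the plain-dict inputs, the `output_cost_per_reasoning_token` key, and the choice to bill search queries under the completion cost are all assumptions for illustration. The rates in the usage example are placeholders, not Perplexity's actual prices.

```python
from typing import Any, Dict, Tuple


def perplexity_cost_sketch(usage: Dict[str, Any], rates: Dict[str, Any]) -> Tuple[float, float]:
    """Return (prompt_cost, completion_cost) from usage counts and per-unit rates."""

    def rate(key: str) -> float:
        return float(rates.get(key) or 0.0)  # missing or None rate counts as 0

    def count(key: str) -> int:
        return int(usage.get(key) or 0)  # missing or None count counts as 0

    # Prompt side: input tokens plus citation tokens.
    prompt_cost = count("prompt_tokens") * rate("input_cost_per_token")
    prompt_cost += count("citation_tokens") * rate("citation_cost_per_token")

    # Completion side: output tokens, reasoning tokens, and search queries
    # (where the search cost belongs is an assumption of this sketch).
    completion_cost = count("completion_tokens") * rate("output_cost_per_token")
    completion_cost += count("reasoning_tokens") * rate("output_cost_per_reasoning_token")
    completion_cost += count("num_search_queries") / 1000 * rate("search_queries_cost_per_1000")
    return prompt_cost, completion_cost


# Placeholder rates chosen only to make the arithmetic easy to follow.
example_rates = {
    "input_cost_per_token": 1e-6,
    "output_cost_per_token": 2e-6,
    "citation_cost_per_token": 1e-6,
    "output_cost_per_reasoning_token": 3e-6,
    "search_queries_cost_per_1000": 5.0,
}
example_usage = {
    "prompt_tokens": 1000,
    "completion_tokens": 200,
    "citation_tokens": 500,
    "reasoning_tokens": 100,
    "num_search_queries": 2,
}
prompt_cost, completion_cost = perplexity_cost_sketch(example_usage, example_rates)
```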

* feat: integrate Perplexity cost calculator with main cost calculation system

- Import perplexity_cost_per_token function in main cost calculator
- Add perplexity provider case to cost_per_token function
- Enables automatic routing of Perplexity cost calculations to provider-specific logic
- Maintains compatibility with existing cost calculation patterns
- Supports all Perplexity-specific cost metrics through unified interface

* feat: enhance Perplexity response transformation to extract cost-related fields

- Override transform_response method to extract Perplexity-specific usage fields
- Add _enhance_usage_with_perplexity_fields method to process API responses
- Extract citation_tokens from citations array using character-based estimation (~4 chars/token)
- Extract num_search_queries from both usage field and root level with priority handling
- Create usage object when none exists to ensure cost fields are always captured
- Handle empty citations and missing fields gracefully
- Enables automatic extraction of cost metrics from Perplexity API responses
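
The extraction steps above can be sketched as below. The helper names are hypothetical, and letting the usage field take priority over the root-level field is an assumption, since the commit message does not say which source wins.

```python
from typing import Any, Dict, List, Optional

CHARS_PER_TOKEN = 4  # rough ratio quoted in the commit message


def estimate_citation_tokens(citations: List[str]) -> int:
    """Estimate citation tokens from total character count (~4 chars per token)."""
    total_chars = sum(len(c) for c in citations if isinstance(c, str))
    return total_chars // CHARS_PER_TOKEN


def extract_num_search_queries(raw_response: Dict[str, Any]) -> Optional[int]:
    """Read num_search_queries from 'usage' first, then from the response root."""
    usage = raw_response.get("usage") or {}
    for source in (usage, raw_response):  # assumption: usage wins over root
        value = source.get("num_search_queries")
        if value is not None:
            return int(value)
    return None
```

An empty citations list yields an estimate of zero tokens, which matches the graceful handling of empty citations mentioned above.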

* test: add comprehensive test suite for Perplexity cost calculation features

Add 82 comprehensive tests across 3 test files:

- test_perplexity_cost_calculator.py (59 tests):
  * Cost calculation with citation tokens, search queries, reasoning tokens
  * Various combinations and edge cases
  * Integration with main cost calculator
  * Model info access and validation
  * Zero values and missing fields handling

- test_perplexity_chat_transformation.py (12 tests):
  * Citation token extraction from API responses
  * Search query extraction from usage and root fields
  * Priority handling and field aggregation
  * Empty citations and missing fields handling
  * Token estimation accuracy validation

- test_perplexity_integration.py (11 tests):
  * End-to-end cost calculation workflows
  * High-volume and edge case scenarios
  * Model info integration validation
  * Case-insensitive provider matching
  * Transformation preservation of existing fields

Ensures reliability and correctness of all Perplexity cost features with comprehensive coverage of happy path, edge cases, and error conditions.

* fix: remove unused Union import from Perplexity transformation

- Remove unused typing.Union import from litellm/llms/perplexity/chat/transformation.py
- Fixes F401 linting error: 'typing.Union imported but unused'
- Maintains only necessary imports: Any, List, Optional, Tuple

* Fix JSON schema validation and use web_search_requests field

- Add citation_cost_per_token and search_queries_cost_per_1000 to JSON schema
- Update Perplexity transformation to use web_search_requests in PromptTokensDetailsWrapper
- Update Perplexity cost calculator to read from web_search_requests field
- Maintain backward compatibility while using standard LiteLLM fields

* Fix type errors in Perplexity cost calculator

- Add null checks for token counts and cost values to prevent None multiplication errors
- Use .get() with fallback values instead of direct dictionary access
- Ensure all arithmetic operations handle None values safely

This fixes the type errors in failing job 44517525148.

* Refactor Perplexity cost calculation tests to improve accuracy and consistency

- Replace absolute difference assertions with math.isclose for better precision in cost comparisons
- Update tests to utilize PromptTokensDetailsWrapper for handling web search requests
- Ensure all test cases correctly reflect the new structure of usage fields, enhancing clarity and maintainability
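
As a small illustration of why `math.isclose` suits cost comparisons, the sketch below compares two floats that differ only by rounding error. The variable names and values are made up for the example.

```python
import math

expected_cost = 0.3
computed_cost = 0.1 + 0.2  # accumulates binary floating-point rounding error

# Exact equality fails because of the rounding error in the sum.
exact_match = computed_cost == expected_cost

# A relative tolerance accepts values that agree to about nine significant digits.
close_match = math.isclose(computed_cost, expected_cost, rel_tol=1e-9)
```

A relative tolerance scales with the magnitude of the values, which is useful when per-token costs are on the order of millionths of a dollar.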

* fix: address type hinting issues in PerplexityChatConfig usage handling

- Add type ignore comments to model_response.usage assignments to resolve type checking errors
- Ensures compatibility with type definitions while maintaining existing functionality

* Update model pricing configuration in JSON backup

- Add citation_cost_per_token and search_queries_cost_per_1000 fields to enhance cost tracking
- Remove deprecated search_context_cost_per_query structure to streamline pricing model
- Aligns with recent updates in Perplexity's pricing strategy

* Update search queries cost structure in model_prices_and_context_window.json to use search_context_cost_per_query

* Refactor search queries cost structure in model_prices_and_context_window_backup.json and update related code to use search_queries_cost_per_query. Remove deprecated search_queries_cost_per_1000 references across model info and tests.

* Enhance cost calculation in cost_calculator.py by introducing a safe float casting function to handle potential None and invalid values. Update cost calculations for input, citation, output, reasoning, and search query tokens to use this new function, ensuring more robust handling of model pricing data.
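
A safe float cast of the kind described above might look like this minimal sketch. The name `safe_float`, the default of 0.0, and the decision to treat NaN and infinity as invalid are assumptions, not the code from the commit.

```python
import math
from typing import Any


def safe_float(value: Any, default: float = 0.0) -> float:
    """Cast value to float, falling back to default for None or invalid input."""
    try:
        result = float(value)
    except (TypeError, ValueError):
        return default  # None, non-numeric strings, unsupported types
    # Assumption: non-finite values are not meaningful prices.
    return result if math.isfinite(result) else default
```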

* Refactor cost calculation in cost_calculator.py to support both legacy and current search cost keys. Enhance handling of search cost values by accommodating both dictionary and float formats, ensuring robust cost computation for search queries.

* Update test cases to reflect changes in cost structure, renaming search_queries_cost_per_query to search_context_cost_per_query for consistency with recent refactor. Ensure assertions in tests align with updated cost keys.

* Update test_perplexity_integration.py to rename search_queries_cost_per_query to search_context_cost_per_query, ensuring consistency with recent cost structure changes. Adjust assertions to align with updated cost keys.
2025-06-23 14:15:25 -07:00
Krish Dholakia 369922ef90 Convert scientific notation str to int + Bubble up azure content filter results (#11655)
* fix(utils.py): convert stringified numbers to numbers

Closes https://github.com/BerriAI/litellm/issues/11266
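
Converting a stringified number, including scientific notation, could be sketched as below. The helper name `parse_numeric_string` and its return conventions are hypothetical and are not taken from `utils.py`.

```python
from typing import Optional, Union


def parse_numeric_string(text: str) -> Optional[Union[int, float]]:
    """Parse '42', '1e5', or '1.5e-3' into a number; return None if not numeric."""
    try:
        number = float(text)  # float() accepts scientific notation
    except (TypeError, ValueError):
        return None
    if number.is_integer():
        return int(number)  # '1e5' becomes the int 100000
    return number
```

Parsing through `float` first is what makes strings such as `'1e5'` usable as integers, since `int('1e5')` raises `ValueError`.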

* fix(convert_dict_to_model_response_object/): bubble up azure content_filter_results

* fix: fix linting error

* fix: fix linting errors

* fix(types/utils.py): ensure choices is correctly set

* fix: delete field if not set

* fix: expand scope of choicelogprobs value
2025-06-11 23:07:22 -07:00
Ishaan Jaff 9241fca2f5 Fix: Adds support for choosing the default region based on where the model is available (#11566)
* fix: vtx default region for global only models

* track gemini-2.5-pro-preview-05-06

* fix is_global_only_vertex_model

* test_is_global_only_vertex_model

* test_get_vertex_region_global_only_model

* fix json format

* fix get_supported_regions
2025-06-09 18:29:44 -07:00
Krrish Dholakia e5f228abd5 fix(utils.py): handle litellm proxy case for checking model info
2025-06-06 09:24:41 -07:00
Krrish Dholakia 505d2fe0c7 build: bump
2025-06-05 00:08:53 -07:00
Krish Dholakia 06484f6e5a Xai, VertexAI, Google AI Studio - live web search support in OpenAI format (#11251)
* build(model_prices_and_context_window.json): fix 'supports_web_search' flag - openai only supports it on 2 models - gpt-4o-search-preview and gpt-4o-mini-search-preview

* feat(xai/chat): add xai web search options param support

* test: add max tokens to test

xai output very verbose

* build(xai/): add web search support for all xai models

* build(model_prices_and_cost.json): add gemini-2.0 supports web search

* feat(gemini/): map openai 'web_search_options' to google's 'googlesearch' tool

* build(model_prices_and_context_window.json): add supports_web_search for vertex_ai/gemini-2 models

* fix: fix circular reference error

* fix(convert_dict_to_response.py): handle scenario where xai returns finish reason as 'stop' for tool calls

* fix: reduce function size

* fix: import session handling

* Revert "fix: import session handling"

This reverts commit deb257dc10.

* fix: linting pin mypy

* [Feat]: Guardrails - Add streaming for bedrock post guard (#11247)

* feat: add streaming for bedrock post guard

* fix: bedrock guardrails

* fix: add clear comments

* Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: clean up bedrock guardrails

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* [Fix] Responses API - Session management (#11254)

* fix: import session handling

* fix: imports for session handler

* tests: tests for session handler

* Update enterprise/litellm_enterprise/enterprise_callbacks/session_handler.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* bump: bump litellm enterprise

* fixes: test_create_user_default_budget

* fix(xai/): filter 'strict' on tool call

* test: update test for new error string

* fix(utils.py): default to None if not set in model cost map

ensures consistent usage of 'supports_[x]' flags

* fix(fireworks_ai/): support fireworks ai document inlining on pdf's sent via openai 'file' message type

* test: update test

* test: name filter_value_from_dict

* fix(fireworks_ai/): handle cache control flag in messages

* fix(xai/chat): fix check

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-31 14:26:16 -07:00
Krish Dholakia 4c82dd9b27 Ollama Chat - parse tool calls on streaming (#11171)
* fix(user_api_key_auth.py): fix else block

Fixes https://github.com/BerriAI/litellm/issues/11170

* refactor(ollama/chat): refactor to base config pattern

easier to maintain fixes

* fix(ollama/chat): support tool call parsing on streaming

Closes https://github.com/BerriAI/litellm/issues/11104

* test: update import location

* fix: cleanup unused import

* fix: fix ruff check error

* test: update import

* test: update test on ci

* ci: cleanup

* fix: fix check

* fix: fix api key check order

* test: fix import

* ci: fix script

* test: fix imports

* fix: fix tests
2025-05-27 16:14:49 -07:00
Krish Dholakia ef42461c1e Litellm fix GitHub action testing (#11163)
* test: add __init__.py files

* refactor: rename test folder to avoid naming conflict

* test: update workflows

* test: update tests

* test: update imports

* test: update tests

* test: remove unused import

* ci(test-litellm.yml): add pytest retry to github workflow

* test: fix test
2025-05-26 14:41:42 -07:00