Commit Graph

141 Commits

Author SHA1 Message Date
Krish Dholakia 290e2528cd Schedule budget resets at expectable times (#10331) (#10333)
* Schedule budget resets at expectable times (#10331)

* Enhance budget reset functionality with timezone support and standardized reset times

- Added `get_next_standardized_reset_time` function to calculate budget reset times based on specified durations and timezones.
- Introduced `timezone_utils.py` to manage timezone retrieval and budget reset time calculations.
- Updated budget reset logic in `reset_budget_job.py`, `internal_user_endpoints.py`, `key_management_endpoints.py`, and `team_endpoints.py` to utilize the new timezone-aware reset time calculations.
- Added unit tests for the new reset time functionality in `test_duration_parser.py`.
- Updated `.gitignore` to include `test.py` and made minor formatting adjustments in `docker-compose.yml` for consistency.

* Fixed linting

* Fix for mypy

* Fixed testcase for reset

* fix(duration_parser.py): move off zoneinfo - doesn't work with python 3.8

* test: update test

* refactor: improve budget reset time calculation and update related tests for accuracy

* clean up imports in team_endpoints.py

* test: update budget remaining hours assertions to reflect new reset time logic

* build(model_prices_and_context_window.json): update model

---------

Co-authored-by: Prathamesh Saraf <pratamesh1867@gmail.com>
2025-04-29 20:59:44 -07:00
Krish Dholakia 839878f4f5 Support x-litellm-api-key header param + allow key at max budget to call non-llm api endpoints (#10392)
* fix(user_api_key_auth.py): fix passing `x-litellm-api-key` to user api key auth

Support using this when given, or bearer token when given

 Fixes issue with auth on vertex passthrough

* test(test_user_api_key_auth.py): use new fastapi.security check

* fix(user_api_key_auth.py): allow key at budget, to still call non-llm api endpoints

Fixes issue where key at budget, couldn't call `/key/info`
2025-04-29 18:57:57 -07:00
Ishaan Jaff f984089b01 [Reliability fix] Redis transaction buffer - ensure all redis queues are periodically flushed (#10393)
* test_e2e_size_of_redis_buffer

* fix store_in_memory_spend_updates_in_redis

* _commit_spend_updates_to_db_with_redis

* daily_tag_spend_update_transactions

* pip install fakeredis==2.28.1
2025-04-28 21:36:54 -07:00
Ishaan Jaff 8ae2653280 fix calculated cache key for tests 2025-04-19 09:25:11 -07:00
Krish Dholakia 1ea046cc61 test: update tests to new deployment model (#10142)
* test: update tests to new deployment model

* test: update model name

* test: skip cohere rbac issue test

* test: update test - replace gpt-4o model
2025-04-18 14:22:12 -07:00
Ishaan Jaff 94a553dbb2 [Feat] Emit Key, Team Budget metrics on a cron job schedule (#9528)
* _initialize_remaining_budget_metrics

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* LITELLM_PROXY_ADMIN_NAME

* fix code qa checks

* test_initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* pod lock manager allow dynamic cron job ID

* fix pod lock manager

* require cronjobid for PodLockManager

* fix DB_SPEND_UPDATE_JOB_NAME acquire / release lock

* add comment on prometheus logger

* add debug statements for emitting key, team budget metrics

* test_pod_lock_manager.py

* test_initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_remaining_budget_metrics

* remove outdated test
2025-04-10 16:59:14 -07:00
Krish Dholakia 0d503ad8ad Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables (#9772)
* refactor(db_spend_update_writer.py): aggregate table is entirely different

* test(test_db_spend_update_writer.py): add unit test to ensure if disable_spend_logs is true daily user transactions is still logged

* test: fix test
2025-04-05 09:58:16 -07:00
Ishaan Jaff a64631edfb test pod lock manager 2025-04-02 14:39:40 -07:00
Ishaan Jaff 8dc792139e refactor file structure 2025-04-01 18:30:48 -07:00
Ishaan Jaff 290e837515 test_update_logs_with_spend_logs_url 2025-04-01 18:15:01 -07:00
Ishaan Jaff 4ddca7a79c Merge branch 'main' into litellm_fix_service_account_behavior 2025-04-01 12:04:28 -07:00
Ishaan Jaff c2c5dbf24f test_get_enforced_params 2025-04-01 08:41:53 -07:00
Ishaan Jaff 13aa7f75f6 test_enforced_params_check 2025-04-01 07:40:31 -07:00
Ishaan Jaff 55763ae276 test_end_user_transactions_reset 2025-04-01 07:13:25 -07:00
Ishaan Jaff 923ac2303b test_end_user_transactions_reset 2025-03-31 20:55:13 -07:00
Ishaan Jaff aa8261af89 test fixes 2025-03-31 19:33:10 -07:00
Ishaan Jaff a753fc9d9f test_long_term_spend_accuracy_with_bursts 2025-03-31 19:17:13 -07:00
Ishaan Jaff ce5f55d04e test fix update spend 2025-03-31 14:20:47 -07:00
Krish Dholakia 5c107c64dd Add gemini audio input support + handle special tokens in sagemaker response (#9640)
* fix(internal_user_endpoints.py): cleanup unused variables on beta endpoint

no team/org split on daily user endpoint

* build(model_prices_and_context_window.json): gemini-2.0-flash supports audio input

* feat(gemini/transformation.py): support passing audio input to gemini

* test: fix test

* fix(gemini/transformation.py): support audio input as a url

enables passing google cloud bucket urls

* fix(gemini/transformation.py): support explicitly passing format of file

* fix(gemini/transformation.py): expand support for inferred file types from url

* fix(sagemaker/completion/transformation.py): fix special token error when counting sagemaker tokens

* test: fix import
2025-03-29 19:23:09 -07:00
Krish Dholakia 1604f87663 install prisma migration files - connects litellm proxy to litellm's prisma migration files (#9637)
* build(README.md): initial commit adding a separate folder for additional proxy files. Meant to reduce size of core package

* build(litellm-proxy-extras/): new pip package for storing migration files

allows litellm proxy to use migration files, without adding them to core repo

* build(litellm-proxy-extras/): cleanup pyproject.toml

* build: move prisma migration files inside new proxy extras package

* build(run_migration.py): update script to write to correct folder

* build(proxy_cli.py): load in migration files from litellm-proxy-extras

Closes https://github.com/BerriAI/litellm/issues/9558

* build: add MIT license to litellm-proxy-extras

* test: update test

* fix: fix schema

* bump: version 0.1.0 → 0.1.1

* build(publish-proxy-extras.sh): add script for publishing new proxy-extras version

* build(liccheck.ini): add litellm-proxy-extras to authorized packages

* fix(litellm-proxy-extras/utils.py): move prisma migrate logic inside extra proxy pkg

easier since migrations folder already there

* build(pre-commit-config.yaml): add litellm_proxy_extras to ci tests

* docs(config_settings.md): document new env var

* build(pyproject.toml): bump relevant files when litellm-proxy-extras version changed

* build(pre-commit-config.yaml): run poetry check on litellm-proxy-extras as well
2025-03-29 15:27:09 -07:00
Krrish Dholakia 217e8d7d44 test: make script to run clearer 2025-03-29 08:23:18 -07:00
Ishaan Jaff 7e8a02099c Merge branch 'main' into litellm_use_redis_for_updates 2025-03-28 20:12:29 -07:00
Ishaan Jaff 1eaf847f8a test pod lock manager 2025-03-28 13:31:45 -07:00
Ishaan Jaff 758182fc7f fix typo on codebase 2025-03-27 22:36:00 -07:00
Krrish Dholakia 79175ddb53 test: fix test 2025-03-27 22:04:59 -07:00
Krrish Dholakia b3c7785240 test: skip flaky test - failing due to db timeouts - unrelated to test 2025-03-27 20:34:26 -07:00
Krrish Dholakia e2d4597588 test: mark flaky test 2025-03-27 20:10:57 -07:00
Ishaan Jaff 21e3b764f5 use DBSpendUpdateWriter class for managing DB spend updates 2025-03-27 16:31:23 -07:00
Krish Dholakia 11838e1c3b Litellm fix db testing (#9593)
* ci: fix test

* test: safely change db url

* fix: print db url

* test: remove delenv
2025-03-27 14:50:41 -07:00
Ishaan Jaff 0155b0eba2 Merge pull request #9533 from BerriAI/litellm_stability_fixes
[Reliability Fixes] - Gracefully handle exceptions when DB is having an outage
2025-03-26 18:57:38 -07:00
Krrish Dholakia 04490c99d7 test: fix test 2025-03-26 17:12:09 -07:00
Krrish Dholakia d4adc9764b test(test_db_schema_migration.py): ci/cd test to enforce schema migrations are documented in .sql files 2025-03-26 16:59:50 -07:00
Krish Dholakia 132d3f7baa feat(prisma-migrations): add baseline db migration file (#9565)
adds initial baseline db migration file

enables future schema changes to be documented via .sql files
2025-03-26 16:22:56 -07:00
Ishaan Jaff 6e5d2b1ac7 handle failed db connections 2025-03-25 23:14:44 -07:00
Krrish Dholakia 364ea3b7dc test: fix test 2025-03-21 22:02:39 -07:00
Krrish Dholakia 95ef5f1009 refactor(user_api_key_auth.py): move is_route_allowed to inside common_checks
ensures consistent behaviour inside api key + jwt routes
2025-03-21 17:21:07 -07:00
Krrish Dholakia 91cf3fc40d test: initial e2e testing to ensure non admin jwt token cannot create new teams 2025-03-21 16:40:18 -07:00
Ishaan Jaff 63a32ff02c test_reset_budget_job 2025-03-17 19:37:09 -07:00
Krrish Dholakia 6ed995952f fix: fix test 2025-03-14 20:28:50 -07:00
Krish Dholakia c93a5e2301 Merge pull request #9047 from BerriAI/litellm_dev_03_06_2025_p4
feat(handle_jwt.py): support multiple jwt url's
2025-03-10 22:37:35 -07:00
Ishaan Jaff 9dcc25d63b Merge branch 'main' into litellm_fix_team_model_access_checks 2025-03-10 19:05:11 -07:00
Krish Dholakia c58941d49c Merge branch 'main' into litellm_dev_03_06_2025_p4 2025-03-10 18:41:10 -07:00
Krrish Dholakia c08705517b test: fix test 2025-03-09 19:40:03 -07:00
Ishaan Jaff 73df319f4e (Clean up) - Allow switching off storing Error Logs in DB (#9084)
* fix - cleanup, dont store ErrorLogs in 2 tables

* async_post_call_failure_hook

* docs disable error logs

* disable_error_logs
2025-03-08 16:12:03 -08:00
Krrish Dholakia 805679becc feat(handle_jwt.py): support multiple jwt url's 2025-03-06 23:05:54 -08:00
Ishaan Jaff 5ead81786d test_can_team_access_model 2025-03-01 17:42:50 -08:00
Ishaan Jaff f85d5afd58 Merge branch 'main' into litellm_fix_team_model_access_checks 2025-03-01 17:36:45 -08:00
Krish Dholakia c1527ebf52 UI - Allow admin to control default model access for internal users (#8912)
* fix(create_user_button.tsx): allow admin to set models user has access to, on invite

Enables controlling model access on invite

* feat(auth_checks.py): enforce 'no-model-access' special model name on backend

prevent user from calling models if default key has no model access

* fix(chat_ui.tsx): allow user to input custom model

* fix(chat_ui.tsx): pull available models based on models key has access to

* style(create_user_button.tsx): move default model inside 'personal key creation' accordion

* fix(chat_ui.tsx): fix linting error

* test(test_auth_checks.py): add unit-test for special model name

* docs(internal_user_endpoints.py): update docstring

* fix test_moderations_bad_model

* Litellm dev 02 27 2025 p6 (#8891)

* fix(http_parsing_utils.py): orjson can throw errors on some emoji's in text, default to json.loads

* fix(sagemaker/handler.py): support passing model id on async streaming

* fix(litellm_pre_call_utils.py): Fixes https://github.com/BerriAI/litellm/issues/7237

* Fix calling claude via invoke route + response_format support for claude on invoke route (#8908)

* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route

move to using anthropic config as base

* fix(utils.py): expose anthropic config via providerconfigmanager

* fix(llm_http_handler.py): support json mode on async completion calls

* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke

* fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config

Prevents error when passing in 'response_format: {"type": "text"}

* test: fix test

* fix(utils.py): fix base invoke provider check

* fix(anthropic_claude3_transformation.py): don't pass 'stream' param

* fix: fix linting errors

* fix(converse_transformation.py): handle response_format type=text for converse

* converse_transformation: pass 'description' if set in response_format (#8907)

* test(test_bedrock_completion.py): e2e test ensuring tool description is passed in

* fix(converse_transformation.py): pass description, if set

* fix(transformation.py): Fixes https://github.com/BerriAI/litellm/issues/8767#issuecomment-2689887663

* Fix bedrock passing `response_format: {"type": "text"}` (#8900)

* fix(converse_transformation.py): ignore type: text, value in response_format

no-op for bedrock

* fix(converse_transformation.py): handle adding response format value to tools

* fix(base_invoke_transformation.py): fix 'get_bedrock_invoke_provider' to handle cross-region-inferencing models

* test(test_bedrock_completion.py): add unit testing for bedrock invoke provider logic

* test: update test

* fix(exception_mapping_utils.py): add context window exceeded error handling for databricks provider route

* fix(fireworks_ai/): support passing tools + response_format together

* fix: cleanup

* fix(base_invoke_transformation.py): fix imports

* (Feat) - Show Error Logs on LiteLLM UI  (#8904)

* fix test_moderations_bad_model

* use async_post_call_failure_hook

* basic logging errors in DB

* show status on ui

* show status on ui

* ui show request / response side by side

* stash fixes

* working, track raw request

* track error info in metadata

* fix showing error / request / response logs

* show traceback on error viewer

* ui with traceback of error

* fix async_post_call_failure_hook

* fix(http_parsing_utils.py): orjson can throw errors on some emoji's in text, default to json.loads

* test_get_error_information

* fix code quality

* rename proxy track cost callback test

* _should_store_errors_in_spend_logs

* feature flag error logs

* Revert "_should_store_errors_in_spend_logs"

This reverts commit 7f345df47762ff3be04e6fde2f13e70019ede4ee.

* Revert "feature flag error logs"

This reverts commit 0e90c022bbea3550f169118d81e60d711a4024fe.

* test_spend_logs_payload

* fix OTEL log_db_metrics

* fix import json

* fix ui linting error

* test_async_post_call_failure_hook

* test_chat_completion_bad_model_with_spend_logs

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>

* ui new build

* test_chat_completion_bad_model_with_spend_logs

* docs(release_cycle.md): document release cycle

* bump: version 1.62.0 → 1.62.1

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-02-28 23:23:03 -08:00
Ishaan Jaff 3a086cee06 (Feat) - Show Error Logs on LiteLLM UI (#8904)
* fix test_moderations_bad_model

* use async_post_call_failure_hook

* basic logging errors in DB

* show status on ui

* show status on ui

* ui show request / response side by side

* stash fixes

* working, track raw request

* track error info in metadata

* fix showing error / request / response logs

* show traceback on error viewer

* ui with traceback of error

* fix async_post_call_failure_hook

* fix(http_parsing_utils.py): orjson can throw errors on some emoji's in text, default to json.loads

* test_get_error_information

* fix code quality

* rename proxy track cost callback test

* _should_store_errors_in_spend_logs

* feature flag error logs

* Revert "_should_store_errors_in_spend_logs"

This reverts commit 7f345df47762ff3be04e6fde2f13e70019ede4ee.

* Revert "feature flag error logs"

This reverts commit 0e90c022bbea3550f169118d81e60d711a4024fe.

* test_spend_logs_payload

* fix OTEL log_db_metrics

* fix import json

* fix ui linting error

* test_async_post_call_failure_hook

* test_chat_completion_bad_model_with_spend_logs

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-28 20:10:09 -08:00
Krish Dholakia 740bd7e9ce (security fix) - Enforce model access restrictions on Azure OpenAI route (#8888)
* fix(user_api_key_auth.py): Fixes https://github.com/BerriAI/litellm/issues/8780

security fix - enforce model access checks on azure routes

* test(test_user_api_key_auth.py): add unit testing

* test(test_openai_endpoints.py): add e2e test to ensure azure routes also run through model validation checks
2025-02-27 21:24:58 -08:00