Commit Graph

17 Commits

Author SHA1 Message Date
Krrish Dholakia c3857e60f2 Store batch output file id in DB + Store batch file status in DB + (experimental) BATCH API COST TRACKING 2025-06-25 22:41:22 -07:00
Krish Dholakia e2f6fb2d7c Managed Files + Batches - filter deployments to only those where file was written + save all model file id mappings in DB (prev just 1st one) (#12048)
* test(test_router.py): initial unit test confirming router.afile_content uses dynamic api key / api base

* fix(managed_files.py): filter deployments for only those within file id mapping

ensure call works - only route to models where the file was written

* fix(proxy_server.py): fix loading in model ids from config, if config id is int

* fix(router.py): return all model file id mappings on create_file

if multiple deployments - this ensures all the file id mappings are bubbled up

Fixes issue when trying to use load-balanced deployments - only 1 file id mapping was being stored
2025-06-25 21:27:06 -07:00
Ishaan Jaff 84bacef587 test_internal_user_endpoints.py 2025-06-21 16:45:14 -07:00
Krish Dholakia 84999ddef3 UI - Fix remaining users activity if no limit + allow filtering by model access groups (#11730)
* feat(enterprise/): fix remaining users check on license

* fix(usage_indicator.tsx): if no max user set, don't render remaining user info card

only for users with user limits on their license

* fix(leftnav.tsx): only show remaining users to admin

* feat(columns.tsx): don't allow sorting on model access groups

it's a list[str]

* feat(model_dashboard.tsx): add model access group filters
2025-06-14 13:38:58 -07:00
Krish Dholakia 4611b821ec Support returning virtual key in custom auth + Handle provider-specific optional params for embedding calls (#11346)
* feat(custom_auth_auto.py): support returning a litellm virtual key from custom auth

allows admin to remap old keys to litellm virtual keys

* fix(utils.py): correctly handle optional params for openai sdk calls

Fixes https://github.com/BerriAI/litellm/issues/11126

* test: update test

* fix(utils.py): handle edge cases
2025-06-03 07:24:13 -07:00
Ishaan Jaff 7fcbb38d91 [Fix] Responses API - Session management (#11254)
* fix: import session handling

* fix: imports for session handler

* tests: tests for session handler

* Update enterprise/litellm_enterprise/enterprise_callbacks/session_handler.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-29 20:36:02 -07:00
Krish Dholakia 2efaa3cf36 Expose /list and /info endpoints for Audit Log events (#11102)
* feat(audit_logging_endpoints.py): expose list endpoint to show all audit logs

make it easier for user to retrieve individual endpoints

* feat(enterprise/): add audit logging endpoint

* feat(audit_logging_endpoints.py): expose new GET `/audit/{id}` endpoint

make it easier to retrieve individual audit logs

* feat(key_management_event_hooks.py): correctly show the key of the user who initiated the change

* fix(key_management_event_hooks.py): add key rotations as an audit log event

* test(test_audit_logging_endpoints.py): add simple unit testing for audit log endpoint

* fix: testing fixes

* fix: fix ruff check
2025-05-23 22:54:59 -07:00
Krrish Dholakia 046c6c7149 fix(user_api_key_auth.py): handle user custom auth set with no custom settings 2025-05-23 21:30:44 -07:00
Krish Dholakia e9b7059af4 Litellm add file validation (#11081)
* fix: cleanup print statement

* feat(managed_files.py): add auth check on managed files

Implemented for file retrieve + delete calls

* feat(files_endpoints.py): support returning files by model name

enables managed file support

* feat(managed_files/): filter list of files by the ones created by user

prevents user from seeing another file

* test: update test

* fix(files_endpoints.py): list_files - always default to provider based routing

* build: add new table to prisma schema
2025-05-22 23:05:45 -07:00
Krish Dholakia 70f32154c5 Litellm managed file updates combined (#11040)
* Add LiteLLM Managed file support for `retrieve`, `list` and `cancel` finetuning jobs (#11033)

* feat: initial commit adding managed file support to fine tuning endpoints

* feat(fine_tuning/endpoints.py): working call to openai finetuning route

Uses litellm managed files for finetuning api support

* feat(fine-tuning/main.py): refactor to use LiteLLMFineTuningJob pydantic object

includes 'hidden_params'

* fix: initial commit adding unified finetuning id support

return a unified finetuning id we can use to understand which deployment to route the ft request to

* test: fix test

* feat(managed_files.py): return unified finetuning job id on create finetuning job

enables retrieve, delete to work with litellm managed files

* feat(managed_files.py): support managed files for cancel ft job endpoint

* feat(fine_tuning_endpoints/endpoints.py): add managed files support to list finetuning jobs

* feat(finetuning_endpoints/main): add managed files support for retrieving ft job

Makes it easier to control permissions for ft endpoint

* LiteLLM Managed Files - Enforce validation check if user can access finetuning job (#11034)

* feat: initial commit adding managed file support to fine tuning endpoints

* feat(fine_tuning/endpoints.py): working call to openai finetuning route

Uses litellm managed files for finetuning api support

* feat(fine-tuning/main.py): refactor to use LiteLLMFineTuningJob pydantic object

includes 'hidden_params'

* fix: initial commit adding unified finetuning id support

return a unified finetuning id we can use to understand which deployment to route the ft request to

* test: fix test

* feat(managed_files.py): return unified finetuning job id on create finetuning job

enables retrieve, delete to work with litellm managed files

* feat(managed_files.py): support managed files for cancel ft job endpoint

* feat(fine_tuning_endpoints/endpoints.py): add managed files support to list finetuning jobs

* feat(finetuning_endpoints/main): add managed files support for retrieving ft job

Makes it easier to control permissions for ft endpoint

* feat(managed_files.py): store create fine-tune / batch response object in db

storing this allows us to filter files returned on list based on what user created

* feat(managed_files.py): Ensures users can't retrieve / modify each other's jobs

* fix: fix check

* fix: fix ruff check errors

* test: update to handle testing

* fix: suppress linting warning - openai 'seed' is none on azure

* test: update tests

* test: update test
2025-05-22 17:20:41 -07:00
Krish Dholakia 58f958f30a Litellm dev 05 21 2025 p2 (#11039)
* feat: initial commit adding managed file support to fine tuning endpoints

* feat(fine_tuning/endpoints.py): working call to openai finetuning route

Uses litellm managed files for finetuning api support

* feat(fine-tuning/main.py): refactor to use LiteLLMFineTuningJob pydantic object

includes 'hidden_params'

* fix: initial commit adding unified finetuning id support

return a unified finetuning id we can use to understand which deployment to route the ft request to

* test: fix test

* feat(managed_files.py): return unified finetuning job id on create finetuning job

enables retrieve, delete to work with litellm managed files

* test: update test

* fix: fix linting error

* fix: fix ruff linting error

* test: fix check
2025-05-21 21:40:53 -07:00
Krish Dholakia 6cfb6e5253 Litellm dev 05 19 2025 p3 (#10965)
* feat(model_info_view.tsx): enable updating model info for existing models on UI

Fixes LIT-154

* fix(model_info_view.tsx): instantly show model info updates on UI

* feat(proxy_server.py): enable flag on `/models` to include model access groups

This enables admin to assign model access groups to keys/teams on UI

* feat(ui/): add model access groups on ui dropdown when creating teams + keys

* refactor(parallel_request_limiter_v2.py): Migrate multi instance rate limiting to OSS

Closes https://github.com/BerriAI/litellm/issues/10052
2025-05-19 20:49:21 -07:00
Krish Dholakia 9bfd3e4819 fix(router.py): write file to all deployments (#10708)
* fix(router.py): write file to all deployments

allows unified file id to work across multiple deployments

* fix(view_logs/index.tsx): show call type in request logs

* fix(router.py): pass a deep copy of kwargs to avoid conflict across multiple runs

* fix(batch_utils.py): broaden check

* fix(router_utils.py): handle null type for function name

* fix(proxy_track_cost_callback.py): fix ruff check error

* fix(router.py): handle healthy_deployments as a dict

* feat(managed_files.py): support encoding / decoding unified batch id … (#10711)

* feat(managed_files.py): support encoding / decoding unified batch id when using managed files

allows routing retrieve batch to the right model id

* fix: fix linting error

* test: add unit tests

* fix: fix ruff check
2025-05-10 00:08:30 -07:00
Krish Dholakia b8b78f1fde Support unified file id (managed files) for batches (#10650)
* refactor(managed_files.py): move enterprise feature into enterprise folder

prevent unexpected surprises

* refactor: safely handle enterprise hooks

* fix: fix ruff check errors

* fix(files_endpoints.py): cleanup enterprise code from OSS

* refactor: complete cleanup

* fix(managed_files.py): complete cleanup

* fix(managed_files.py): instrument to be able to update deployment values post-router selection and just before making llm call

* fix: fix linting error

* fix: fix linting error
2025-05-07 23:39:40 -07:00
Krish Dholakia 552b7e4013 Add customer + model per key level multi-instance tpm/rpm limiting (#10518)
* fix(redis_cache.py): handle multiple event loops

* fix(parallel_request_limiter_v2.py): add customer tpm limiting

* fix(parallel_request_limiter.py): add customer rpm limiting

* fix(parallel_request_limiter_v2.py): add model per key + customer tpm/rpm limiting

* fix(parallel_request_limiter_v2.py): make error more informative

* fix: fix ruff error

* fix: generate new poetry lock
2025-05-03 10:28:55 -07:00
Krish Dholakia 132bdb1380 Add user + team based multi-instance rate limiting (#10497)
* fix(parallel_request_limiter_v2.py): add user multi-instance rate limiting

* fix(parallel_request_limiter_v2.py): add user multi-instance rpm limiting

* fix(parallel_request_limiter_v2.py): add team based multi-instance rate limiting
2025-05-01 22:09:26 -07:00
Krish Dholakia 711601e22a Add key-level multi-instance tpm/rpm/max parallel request limiting (#10458)
* fix: initial commit of v2 parallel request limiter hook

enables multi-instance rate limiting to work

* fix: subsequent commit with additional refactors

* fix(parallel_request_limiter_v2.py): cleanup initial call hook

simplify it

* fix(parallel_request_limiter_v2.py): working v2 parallel request limiter

* fix: more updates - still not passing testing

* fix(test_parallel_request_limiter_v2.py): update test + add conftest

* fix: fix ruff checks

* fix(parallel_request_limiter_v2.py): use pull-via-pattern method to load in keys the instance hasn't seen yet

Fixes issue where redis syncing was not pulling key until instance had seen it

* test: update testing to cover tpm and rpm

* fix(parallel_request_limiter_v2.py): fix ruff errors

* fix(proxy/hooks/__init__.py): feature flag export

* fix(proxy/hooks/__init__.py): fix linting error

* ci(config.yml): add tests/enterprise to ci/cd

* fix: fix ruff check

* test: update testing
2025-04-30 21:32:31 -07:00