Commit Graph

365 Commits

Author SHA1 Message Date
Yuneng Jiang 0f785f988b [Fix] Add missing user_api_key_project_alias to failed-response PagerDuty event
The hanging-response constructor was fixed but the sibling failed-response
constructor at line 104 was still missing this field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 11:07:04 -07:00
Yuneng Jiang f08d281641 [Fix] Resolve mypy type errors across 3 files
Add missing `user_api_key_project_alias` key to SpendLogsMetadata and
PagerDutyInternalEvent constructors, and cast `reasoning_items` to list
for safe iteration in responses transformation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:56:54 -07:00
Sameer Kankute bbd8ca3b3d feat(prometheus): add metrics for managed batch lifecycle
- Add Prometheus metrics for managed batch and file operations
- Track batch creation, file size, duration, and deletion events
- Add CheckBatchCost polling metrics (jobs polled/processed, errors)
- Record metrics in managed_files hook and check_batch_cost utility
- Metrics include labels for model, provider, user, and status

Made-with: Cursor
2026-03-27 20:30:09 +05:30
Krrish Dholakia df2a36dd27 docs: document new github + gitlab ci scripts 2026-03-25 20:17:10 -07:00
Sameer Kankute 4f1e484a9b Merge branch 'main' into litellm_dev_sameer_16_march_week
Resolve conflicts in common_request_processing.py (keep main streaming,
post_call_success_hook try/finally, deferred logging; retain skip_pre_call_logic)
and utils.py (defer + internal-call skip + sync success callbacks for all calls).

Tighten _has_post_call_guardrails for event_hook=None; align deferred
guardrail test. Sync model_prices_and_context_window_backup.json.

Pyright: narrow ignores for passthrough StreamingResponse and post_call hook.
Made-with: Cursor
2026-03-22 00:29:38 +05:30
Sameer Kankute 676a79e9f7 bump: litellm-enterprise 0.1.34 → 0.1.35 2026-03-21 20:42:34 +05:30
Sameer Kankute 32ded9b2f8 fix double-billing issue 2026-03-18 12:47:42 +05:30
Sameer Kankute 7660f39fdb fix(file_search): promote DB helper, suppress sub-call billing, add queries-plural test
- Promote _fetch_managed_vector_stores_by_uuids from @staticmethod to a module-level
  async helper get_managed_vector_store_rows_by_uuids, following the same standalone
  helper pattern as get_team_object / get_key_object so the hot-path DB read is a
  named importable function rather than an inline prisma_client.db.* call
- Pass no-log=True to both inner _call_aresponses sub-calls so they do not fire
  independent billing/monitoring callbacks; cost is accumulated in the synthesized
  response's _hidden_params for the outer responses() call
- Add test_H11b covering the primary queries (plural array) function-tool schema,
  complementing H11 which exercises only the backward-compat singular query path

Made-with: Cursor
2026-03-18 11:38:49 +05:30
Sameer Kankute 76176f2a64 fix(file_search): restore should_use_emulated helper, fix dedup, extract DB helper, clean docstring
- Re-add should_use_emulated_file_search() to emulated_handler.py so H5/H6/H7/H13 tests don't fail with ImportError
- Remove per-file-id deduplication from _build_search_results_for_include so all chunks are returned (matching OpenAI native file_search behaviour); update test_H14 to assert 2 results
- Extract raw prisma DB query in check_vector_store_ids_access into a static _fetch_managed_vector_stores_by_uuids helper so the hot request path uses a named, testable function instead of an inline prisma_client.db.* call
- Remove developer-local path from test module docstring

Made-with: Cursor
2026-03-18 11:26:27 +05:30
Sameer Kankute c735251570 feat(responses): file_search support — Phase 1 native passthrough + Phase 2 emulated fallback
Phase 1 (native passthrough):
- _decode_vector_store_ids_in_tools(): decode LiteLLM-managed unified
  vector_store_ids to provider-native IDs in file_search tools
- Split update_responses_tools_with_model_file_ids() into decode pass
  (always runs) + code_interpreter mapping pass (guarded)
- BaseResponsesAPIConfig.supports_native_file_search() → False by default;
  OpenAIResponsesAPIConfig overrides to True
- ManagedFiles.async_pre_call_hook(): batch team-level access check for
  unified vector_store_ids in file_search tools (no N+1)
- Docs: file_search section in response_api.md

Phase 2 (emulated fallback for non-native providers):
- litellm/responses/file_search/emulated_handler.py: converts file_search
  tool → function tool, intercepts tool call, runs asearch(), makes
  follow-up call, synthesizes OpenAI-format output (file_search_call +
  message + file_citation annotations)
- responses/main.py: routes to emulated handler when provider doesn't
  support file_search natively

Tests: 41 unit tests across 8 families (A-H) in test_file_search_responses.py

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:41:44 +05:30
Sameer Kankute ab377f396e Merge pull request #23718 from BerriAI/litellm_fix_vertex_ai_batch
Fix: Vertex ai Batch Output File Download Fails with 500
2026-03-16 19:05:49 +05:30
Sameer Kankute 22b333cae6 Fix downloading vertex ai files 2026-03-16 12:08:06 +05:30
yuneng-jiang 4fc0975d22 Fix flaky e2e batch test: set batch_processed=True on completion in retrieve_batch
The retrieve_batch endpoint sets batch status to "complete" but never set
batch_processed=True, permanently blocking file deletion. CheckBatchCost
(the safety net) also excluded completed batches from its primary query,
so batch_processed was never set by either path.

Three fixes:
1. update_batch_in_database sets batch_processed=True when status reaches
   "complete", with old-schema fallback retry
2. CheckBatchCost primary query no longer excludes complete/completed
   (batch_processed=False filter prevents reprocessing)
3. retrieve_batch early-return now includes "complete" (DB-normalized
   spelling) to avoid unnecessary provider re-polls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 18:18:32 -07:00
Ishaan Jaff 1b96064600 fix(proxy): prevent OOM/Prisma connection loss from unbounded managed-object poll (#23472)
* fix(proxy): cap managed-object poll size + expire stale rows + kill-switch flag to prevent OOM/Prisma connection loss

* fix(constants): simplify PROXY_BATCH_POLLING_ENABLED readability

* docs+test: document new polling env vars, add pagination+stale-cleanup tests

* fix: exclude stale_expired from batch poll queries; fix update_many assertions in tests

* fix: scope stale cleanup to file_purpose, fix file_object mocks, add CheckBatchCost tests

* fix: avoid duplicate cost logging in fallback path; guard integer constants against zero/negative values

* fix: cache _has_batch_processed_column; guard cleanup from aborting poll; narrow fallback except

* fix: add complete/completed to primary query not_in; fix vacuous test assertion

- Primary find_many was missing "complete" and "completed" in its not_in
  filter, creating asymmetry with the fallback query. A job whose status
  was set to "complete" but whose batch_processed flag update failed would
  be silently re-fetched and re-processed every cycle, emitting duplicate
  cost logs.

- test_fallback_completion_update_omits_batch_processed patched
  _is_base64_encoded_unified_file_id to return None, causing an immediate
  continue — so update() was never called and the assertion looped over an
  empty list (vacuously true). Rewrote the test to mock the full
  completion pipeline, verify update() is called exactly once, and assert
  batch_processed is absent from the update data.

- Added symmetric test (primary path) proving batch_processed IS included
  when the column exists.

Made-with: Cursor
2026-03-13 11:01:40 -07:00
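Note on the vacuous assertion described in 1b96064600: an assertion loop over an empty call list passes without checking anything. The sketch below is illustrative only and is not taken from the repository's test suite. The helper name `assert_update_omits_batch_processed` and the mock setup are hypothetical; only the idea (assert the call count first, then inspect the update data) comes from the commit message.

```python
from unittest.mock import MagicMock


def assert_update_omits_batch_processed(update_mock):
    """Check the update payload, failing loudly if update() was never called."""
    # Without this count check, the loop below would run zero times for an
    # uncalled mock and the test would pass without verifying anything.
    assert update_mock.call_count == 1, "update() should be called exactly once"
    for call in update_mock.call_args_list:
        data = call.kwargs["data"]  # hypothetical keyword name for the payload
        assert "batch_processed" not in data


# Hypothetical stand-in for the DB update method used by the poller.
update = MagicMock()
update(where={"id": "job-1"}, data={"status": "complete"})
assert_update_omits_batch_processed(update)
```

Asserting the call count before iterating is what turns the check from vacuously true into a real verification.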
github-actions[bot] 6ff693149d bump: litellm-enterprise 0.1.33 → 0.1.34 2026-03-09 11:12:05 +00:00
Harshit28j f18f4e3bbd feat: allow multiple calls from tags 2026-03-07 11:24:18 +05:30
Ryan Crabbe c4db53a98a Address review feedback: remove dead code, add error handling, strengthen test assertions
- Remove unused `completed_jobs` list (dead code after per-job update refactor)
- Wrap DB update in try/except to prevent one failed update from aborting remaining jobs
- Add test assertions verifying batch_processed, status, and file_object are written to DB
2026-03-06 09:25:50 -08:00
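Note on the error handling described in c4db53a98a: wrapping each per-job DB write in its own try/except keeps one failure from aborting the rest of the loop. The following is a minimal sketch under stated assumptions, not the actual CheckBatchCost code. `update_jobs`, `fake_update`, and the job dictionaries are hypothetical names used only for illustration.

```python
import asyncio


async def update_jobs(jobs, update_one):
    """Apply update_one to every job, collecting failures instead of raising."""
    failed = []
    for job in jobs:
        try:
            await update_one(job)
        except Exception as exc:
            # Record the failure and keep going so later jobs are still written.
            failed.append((job["id"], str(exc)))
    return failed


async def _demo():
    written = []

    async def fake_update(job):
        # Hypothetical DB write that fails for one specific job.
        if job["id"] == "b":
            raise RuntimeError("db unavailable")
        written.append(job["id"])

    failed = await update_jobs([{"id": "a"}, {"id": "b"}, {"id": "c"}], fake_update)
    return written, failed


written, failed = asyncio.run(_demo())
```

In this sketch the jobs after the failing one are still processed, and the caller gets a list of failures it can log or retry.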
Ryan Crabbe 3d55f7f6ab Fix batch list showing stale "validating" status after completion
CheckBatchCost poller updated the status column but not the file_object
JSON column. The list_batches endpoint reads status from file_object,
so batches appeared stuck in "validating" even after Azure reported
them as completed. Now update file_object alongside status in the
per-job DB write.
2026-03-06 08:54:21 -08:00
yuneng-jiang ba7a6d9bfd Merge pull request #22476 from BerriAI/litellm_audit_pagination_fix
[Fix] UI - Audit Logs: Server-side pagination, filtering, and drawer view
2026-03-03 16:52:33 -08:00
yuneng-jiang 657a60ea5b fix(audit): AND semantics for combined JSON filters; remove unused allTeams prop
- Fix object_team_id + object_key_hash combining incorrectly as OR — each
  filter now adds an AND clause wrapping an internal OR over before_value
  and updated_values, so both conditions must be satisfied simultaneously
- Rename helper to _build_json_field_or_condition to reflect its purpose
- Remove allTeams from AuditLogsProps and its call site in index.tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 14:24:55 -08:00
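Note on the filter semantics described in 657a60ea5b: each filter contributes one AND clause, and each clause is an internal OR over the `before_value` and `updated_values` columns. The sketch below shows one way such a where-clause could be assembled as plain dictionaries. The dictionary shape is modelled on Prisma-style JSON path filters and is an assumption, as are the function signatures and the JSON keys `team_id` and `token`. The real helper `_build_json_field_or_condition` may differ.

```python
def build_json_field_or_condition(key, value):
    """Match rows where either JSON column holds `value` under `key`."""
    return {
        "OR": [
            {"before_value": {"path": [key], "equals": value}},
            {"updated_values": {"path": [key], "equals": value}},
        ]
    }


def build_audit_where(object_team_id=None, object_key_hash=None):
    """Combine the optional filters so that all supplied ones must match."""
    and_clauses = []
    if object_team_id is not None:
        # "team_id" is an assumed key inside the JSON columns.
        and_clauses.append(build_json_field_or_condition("team_id", object_team_id))
    if object_key_hash is not None:
        # "token" is an assumed key inside the JSON columns.
        and_clauses.append(build_json_field_or_condition("token", object_key_hash))
    return {"AND": and_clauses} if and_clauses else {}


where = build_audit_where(object_team_id="team-1", object_key_hash="abc123")
```

Keeping the OR inside each clause and the AND at the top level is what prevents the two filters from collapsing into a single OR.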
Harshit28j d661419109 fix: support list of modes in Mode.default for tag-based guardrails 2026-03-04 01:50:28 +05:30
yuneng-jiang 705ef64ffc fix(ui): Audit logs - server-side pagination, filtering, and drawer view
- Replace client-side full-fetch loop with single react-query call using
  keepPreviousData; remove 5-second polling
- All filters (object ID, action, table, changed_by, team ID, key hash)
  now passed as query params to the backend
- Add object_team_id and object_key_hash params to /audit endpoint using
  Prisma JSON path filtering (PostgreSQL) to search inside before_value
  and updated_values JSON columns
- Migrate table from custom TanStack DataTable to AntD Table with
  server-side pagination
- Replace inline row expansion with a right-side AntD Drawer showing
  metadata and before/after diff
- Refactor uiAuditLogsCall to accept a structured options object

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 18:19:07 -08:00
Ephrim Stanley 35f6fd4223 Managed batches fixes for Gemini/Vertex 2026-02-28 20:21:04 -05:00
Ishaan Jaff b5f5b42035 bump: litellm-enterprise 0.1.32 → 0.1.33 + manual publish workflow (#22421)
* bump: litellm-enterprise 0.1.32 → 0.1.33

* ci: add manual workflow to publish litellm-enterprise to PyPI

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* ci: add manual workflow to publish litellm-proxy-extras to PyPI

* fix(ci): commit before publish, add poetry.lock update to enterprise + proxy-extras workflows

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 10:56:15 -08:00
Krrish Dholakia a26f83fd3c fix: update calendly on repo 2026-02-23 06:13:59 -08:00
Ephrim Stanley 27a33565e7 State management fixes for CheckBatchCost - Address greptile comments 2026-02-23 07:52:58 -05:00
Ephrim Stanley 7b5dc3fb9c State management fixes for CheckBatchCost 2026-02-23 07:16:25 -05:00
Harshit Jain bdf01fa283 fix mypy error 2026-02-19 14:04:00 +05:30
Sameer Kankute 03f5717456 Fixes based on greptile reviews 2026-02-18 12:19:11 +05:30
Sameer Kankute 9f5580fddd Fixes based on greptile reviews 2026-02-18 11:55:06 +05:30
Sameer Kankute 8f80b1085e Add File deletion criteria with batch references 2026-02-18 11:39:32 +05:30
Sameer Kankute 791cef6d99 fix test_chat_completion 2026-02-17 20:26:28 +05:30
Sameer Kankute 72a1bd66c7 Merge pull request #21157 from Point72/ephrimstanley/s3-logger-skip-missing-standard-logging-object
Managed batches - Misc bug fixes
2026-02-16 18:29:59 +05:30
Ephrim Stanley a3762e7d49 Addressed greptile comments to extract common helpers and return 404 2026-02-16 07:58:04 -05:00
Ephrim Stanley 7d794b567c fix: thread deployment model_info through batch cost calculation
batch_cost_calculator only checked the global cost map, ignoring
deployment-level custom pricing (input_cost_per_token_batches etc.).
Add optional model_info param through the batch cost chain and pass
it from CheckBatchCost.
2026-02-15 14:53:30 -05:00
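Note on the pricing lookup described in 7d794b567c: an optional `model_info` argument lets deployment-level pricing take precedence over the global cost map. The sketch below illustrates that precedence only. `GLOBAL_COST_MAP`, `batch_input_cost`, the model name, and the prices are hypothetical; only the field name `input_cost_per_token_batches` and the idea of an optional `model_info` parameter come from the commit message.

```python
# Hypothetical global cost map with made-up prices, for illustration only.
GLOBAL_COST_MAP = {
    "example-model": {"input_cost_per_token_batches": 0.000002},
}

PRICE_KEY = "input_cost_per_token_batches"


def batch_input_cost(model, prompt_tokens, model_info=None):
    """Price input tokens, preferring deployment-level pricing when present."""
    if model_info is not None and PRICE_KEY in model_info:
        # Deployment-level custom pricing overrides the global entry.
        per_token = model_info[PRICE_KEY]
    else:
        per_token = GLOBAL_COST_MAP[model][PRICE_KEY]
    return prompt_tokens * per_token


default_cost = batch_input_cost("example-model", 1000)
custom_cost = batch_input_cost("example-model", 1000, model_info={PRICE_KEY: 0.000001})
```

Making the parameter optional keeps existing callers working while allowing a caller that knows the deployment to pass its pricing through.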
Ephrim Stanley a5626768a3 Add comments 2026-02-14 19:46:56 -05:00
Ephrim Stanley 4d87cb8fe3 Fix deleted managed files returning 403 instead of 404 2026-02-14 19:12:03 -05:00
Ishaan Jaffer 703bf5add0 BUMP Enterprise PIP 2026-02-14 13:40:48 -08:00
Ephrim Stanley e59c8d22af fix: afile_retrieve returns unified ID for batch output files 2026-02-14 10:07:12 -05:00
Ephrim Stanley 5433ae7e8c Fix: bypass managed files access check in batch polling by calling afile_content directly 2026-02-14 00:30:35 -05:00
Ephrim Stanley 358180eb2d Fix: pass deployment credentials to afile_retrieve in managed_files post-call hook 2026-02-14 00:16:34 -05:00
Sameer Kankute aaf5938864 Merge pull request #21089 from BerriAI/litellm_vector_store_endpoints
Add target_model_names for vector store endpoints
2026-02-13 22:18:29 +05:30
Sameer Kankute 4ad0ecd9eb Fix mypy issues 2026-02-13 22:02:27 +05:30
Shivam Rawat 1321cd276b removed /models and v1/models from llm api routes (#20988) 2026-02-13 18:32:41 +05:30
Sameer Kankute be69ba270e Add using managed vector store creds for vector store files endpoint 2026-02-13 10:33:49 +05:30
Sameer Kankute 26c0624fd7 Add managed vector store hooks 2026-02-13 10:03:31 +05:30
Sameer Kankute fa48166b10 Add _PROXY_LiteLLMManagedVectorStores class 2026-02-13 09:57:17 +05:30
Sameer Kankute ce4bebbedf Changed asyncio.create_task() to await for storing batch objects 2026-02-10 12:42:39 +05:30
Sameer Kankute 9bdb163269 Add error file ids as managed files 2026-02-10 12:06:40 +05:30
yuneng-jiang b3f0dccf56 enterprise build 2026-02-05 21:31:20 -08:00