Commit Graph

1806 Commits

Author SHA1 Message Date
Yuneng Jiang a074d1d68b [Infra] Mirror litellm_table_patch source changes (no binaries)
Cherry-pick source-only changes from litellm_table_patch, excluding
build artifacts from the incident response period.

- Remove destructive DROP COLUMN migration (20260311180521_schema_sync)
- Remove now-unnecessary restore migration (20260327232350)
- Bump litellm-proxy-extras 0.4.60 → 0.4.61
- Add regression test to block future DROP COLUMN migrations
- Fix double error handling in getTeamPermissionsCall

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:45:12 -07:00
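Note on the regression test mentioned above: the log does not show the test itself, so the following is only a minimal sketch of how a check that blocks `DROP COLUMN` migrations could look. The function names and the `migration.sql` file layout are assumptions made for illustration, not code taken from the repository.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches "DROP COLUMN" in any letter case, with any whitespace in between.
DROP_COLUMN = re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE)


def find_drop_column(sql: str) -> List[int]:
    """Return 1-based line numbers whose SQL (ignoring `--` comments) drops a column."""
    hits = []
    for number, line in enumerate(sql.splitlines(), start=1):
        code = line.split("--", 1)[0]  # discard a trailing line comment
        if DROP_COLUMN.search(code):
            hits.append(number)
    return hits


def scan_migrations(root: Path) -> List[Tuple[str, int]]:
    """Hypothetical helper: scan every migration.sql under `root` for DROP COLUMN."""
    offenders = []
    for path in sorted(root.rglob("migration.sql")):
        for number in find_drop_column(path.read_text(encoding="utf-8")):
            offenders.append((str(path), number))
    return offenders


sample = (
    'ALTER TABLE "t" DROP COLUMN "c";\n'
    "-- DROP COLUMN mentioned only in a comment\n"
    'ALTER TABLE "t" ADD COLUMN "d" TEXT;'
)
flagged = find_drop_column(sample)
```

A test could then assert that `scan_migrations(...)` returns an empty list for the migrations directory. The comment stripping here is deliberately simple and would also cut a `--` that appears inside a string literal.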
yuneng-jiang fa65433c8c bump: version 1.82.5 → 1.82.6 2026-03-21 22:56:09 -07:00
yuneng-jiang 071c8641de bump: version 0.4.59 → 0.4.60 2026-03-21 22:54:41 -07:00
yuneng-jiang 88a4c7aeaf bump: version 0.4.58 → 0.4.59 2026-03-21 22:54:38 -07:00
Sameer Kankute 676a79e9f7 bump: litellm-enterprise 0.1.34 → 0.1.35 2026-03-21 20:42:34 +05:30
joereyna e5dc912367 bump: version 1.82.4 → 1.82.5 2026-03-19 19:58:04 -07:00
yuneng-jiang 1e6abf8142 Merge branch 'main' into litellm_yj_march_17_2026 2026-03-18 15:13:41 -07:00
yuneng-jiang f88e51e1b9 bump: version 0.4.57 → 0.4.58 2026-03-18 14:07:22 -07:00
jyeros 2b6a32e0cd Add tests 2026-03-18 15:48:59 -05:00
yuneng-jiang 9fa1809c30 bump: version 0.4.56 → 0.4.57 2026-03-17 17:37:04 -07:00
yuneng-jiang 709581c5f9 bump: version 1.82.3 → 1.82.4 2026-03-17 17:31:45 -07:00
yuneng-jiang 278c9babc6 [Infra] Merging RC Branch with Main (#23786)
* fix(test): add missing mocks for test_streamable_http_mcp_handler_mock

The test was missing mocks for extract_mcp_auth_context and set_auth_context,
causing the handler to fail silently in the except block instead of reaching
session_manager.handle_request. This mirrors the fix already applied to the
sibling test_sse_mcp_handler_mock.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): route OpenAI models through chat completions in pass-through tests

The test_anthropic_messages_openai_model_streaming_cost_injection test fails
because the OpenAI Responses API returns 400 for requests routed through the
Anthropic Messages endpoint. Setting LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true
routes OpenAI models through the stable chat completions path instead.
Cost injection still works since it happens at the proxy level.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(ci): fix assemblyai custom auth and router wildcard test flakiness

1. custom_auth_basic.py: Add user_role='proxy_admin' so the custom auth
   user can access management endpoints like /key/generate. The test
   test_assemblyai_transcribe_with_non_admin_key was hidden behind an
   earlier -x failure and was never reached before.

2. test_router_utils.py: Add flaky(retries=3) and increase sleep from 1s
   to 2s for test_router_get_model_group_usage_wildcard_routes. The async
   callback needs time to write usage to cache, and 1s is insufficient on
   slower CI hardware.

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* ci: retrigger CI pipeline

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix(mypy): use LitellmUserRoles enum instead of raw string in custom_auth_basic

Fixes mypy error: Argument 'user_role' has incompatible type 'str'; expected 'LitellmUserRoles | None'

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* fix: don't close HTTP/SDK clients on LLMClientCache eviction (#22926)

* fix: don't close HTTP/SDK clients on LLMClientCache eviction

Removing the _remove_key override that eagerly called aclose()/close()
on evicted clients. Evicted clients may still be held by in-flight
streaming requests; closing them causes:

  RuntimeError: Cannot send a request, as the client has been closed.

This is a regression from commit fb72979432. Clients that are no longer
referenced will be garbage-collected naturally. Explicit shutdown cleanup
happens via close_litellm_async_clients().

Fixes production crashes after the 1-hour cache TTL expires.

* test: update LLMClientCache unit tests for no-close-on-eviction behavior

Flip the assertions: evicted clients must NOT be closed. Replace
test_remove_key_closes_async_client → test_remove_key_does_not_close_async_client
and equivalents for sync/eviction paths.

Add test_remove_key_removes_plain_values for non-client cache entries.
Remove test_background_tasks_cleaned_up_after_completion (no more _background_tasks).
Remove test_remove_key_no_event_loop variant that depended on old behavior.

* test: add e2e tests for OpenAI SDK client surviving cache eviction

Add two new e2e tests using real AsyncOpenAI clients:
- test_evicted_openai_sdk_client_stays_usable: verifies size-based eviction
  doesn't close the client
- test_ttl_expired_openai_sdk_client_stays_usable: verifies TTL expiry
  eviction doesn't close the client

Both tests sleep after eviction so any create_task()-based close would
have time to run, making the regression detectable.

Also expand the module docstring to explain why the sleep is required.

* docs(AGENTS.md): add rule — never close HTTP/SDK clients on cache eviction

* docs(CLAUDE.md): add HTTP client cache safety guideline

* [Fix] Install bsdmainutils for column command in security scans

The security_scans.sh script uses `column` to format vulnerability
output, but the package wasn't installed in the CI environment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle string callback values in prometheus multiproc setup

When callbacks are configured as a plain string (e.g., `callbacks: "my_callback"`)
instead of a list, the proxy crashes on startup with:
  TypeError: can only concatenate str (not "list") to str

Normalize each callback setting to a list before concatenating.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* bump: version 1.82.2 → 1.82.3

* fix(test): update test_startup_fails_when_db_setup_fails for opt-in enforcement

The --enforce_prisma_migration_check flag is now required to trigger
sys.exit(1) on DB migration failure, after #23675 flipped the default
behavior to warn-and-continue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(cost_calculator): use model name for per-request custom pricing when router_model_id has no pricing

When custom pricing is passed as per-request kwargs (input_cost_per_token/output_cost_per_token),
completion() registers pricing under the model name, but _select_model_name_for_cost_calc was
selecting the router deployment hash (which has no pricing data), causing response_cost to be 0.0.

Now checks whether the router_model_id entry actually has pricing before preferring it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 15:32:20 -07:00
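Note on the "handle string callback values" item in the merge above: the message describes normalizing each callback setting to a list before concatenating. A minimal sketch of that idea follows; `as_list` and `collect_callbacks` are hypothetical names, and the actual proxy code is not shown in this log.

```python
from typing import Any, List


def as_list(value: Any) -> List[Any]:
    """Normalize a callback setting to a list (hypothetical helper).

    None becomes an empty list, a plain string becomes a one-element list,
    and any other iterable is copied into a new list.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def collect_callbacks(*settings: Any) -> List[Any]:
    """Concatenate several callback settings after normalizing each one."""
    combined: List[Any] = []
    for setting in settings:
        combined += as_list(setting)
    return combined


# A plain string mixed with a list no longer raises
# "TypeError: can only concatenate str (not "list") to str".
callbacks = collect_callbacks("my_callback", ["prometheus"], None)
```

Checking for `str` before calling `list()` matters because `list("my_callback")` would split the name into single characters.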
yuneng-jiang c1dadc0c80 bumping pyJWT for security 2026-03-14 19:17:19 -07:00
yuneng-jiang 820d63f3a6 bump: version 1.82.1 → 1.82.2 2026-03-12 16:07:38 -07:00
yuneng-jiang 3a3fd64fcb bump: version 0.4.55 → 0.4.56 2026-03-12 12:46:47 -07:00
yuneng-jiang cffb2676a5 bump: version 0.4.54 → 0.4.55 2026-03-12 12:46:46 -07:00
yuneng-jiang e1674bd34f bump: version 0.4.53 → 0.4.54 2026-03-11 18:07:58 -07:00
yuneng-jiang bd914281e5 bump: version 0.4.52 → 0.4.53 2026-03-09 14:45:41 -07:00
Joshua Bronson 5801f0b97f Don't pin to exact versions of optional deps (#23052) 2026-03-07 19:29:36 -08:00
yuneng-jiang 55f448abb8 bump: version 0.4.51 → 0.4.52 2026-03-06 23:39:08 -08:00
Ryan Crabbe a9dcc1ab37 bump: version 0.4.50 → 0.4.51 2026-03-06 17:55:12 -08:00
Sameer Kankute 7d790b39be Merge pull request #22765 from BerriAI/main
merge main for 030326
2026-03-04 17:40:42 +05:30
Chesars 5005773909 fix(deps): relax python-multipart version constraint to >=0.0.22
The caret operator (^0.0.x) in zerover projects restricts to a single
patch version. Changed to >= to allow future patch updates.
2026-03-03 15:09:04 -03:00
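Note on the caret constraint described above: the range implied by `^X.Y.Z` can be worked out by incrementing the left-most non-zero component. The sketch below assumes a full three-part version and follows the usual caret semantics; `caret_bounds` and `caret_allows` are illustrative names, not part of any packaging tool.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def caret_bounds(version: str) -> Tuple[Version, Version]:
    """Return (inclusive lower, exclusive upper) bounds for ^version.

    Assumes `version` is a full MAJOR.MINOR.PATCH string.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if major != 0:
        upper = (major + 1, 0, 0)
    elif minor != 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return (major, minor, patch), upper


def caret_allows(constraint: str, candidate: str) -> bool:
    """Check whether `candidate` falls inside the range implied by ^constraint."""
    lower, upper = caret_bounds(constraint)
    found = tuple(int(part) for part in candidate.split("."))
    return lower <= found < upper


zerover = caret_bounds("0.0.22")
stable = caret_bounds("1.2.3")
```

Under these assumptions `^0.0.22` admits only 0.0.22 itself, which is why the commit switches to `>=0.0.22`.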
Harshit28j d07689d2d7 bump: version 1.82.0 → 1.82.1 2026-03-03 11:59:58 +05:30
Ishaan Jaff b5f5b42035 bump: litellm-enterprise 0.1.32 → 0.1.33 + manual publish workflow (#22421)
* bump: litellm-enterprise 0.1.32 → 0.1.33

* ci: add manual workflow to publish litellm-enterprise to PyPI

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* ci: add manual workflow to publish litellm-proxy-extras to PyPI

* fix(ci): commit before publish, add poetry.lock update to enterprise + proxy-extras workflows

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-28 10:56:15 -08:00
Ishaan Jaffer 65c5d5a8d3 bump: version 1.81.16 → 1.82.0 2026-02-28 10:21:07 -08:00
Ishaan Jaff 3ff70598ad fix: bump litellm-proxy-extras to 0.4.50 and fix 3 failing tests (#22417)
* fix(ci): handle inline table in pyproject.toml for litellm-proxy-extras version check

* fix: bump litellm-proxy-extras to 0.4.50 in pyproject.toml, requirements.txt, and poetry.lock

* fix(tests): set status_code=200 on JWT mocks and pass pii_tokens through data in presidio test
2026-02-28 10:20:03 -08:00
yuneng-jiang ee7b73764c bump: version 0.4.48 → 0.4.49 2026-02-26 20:29:43 -08:00
Sameer Kankute 678200ee48 Bump litellm version to 1.81.16 2026-02-26 18:18:03 +05:30
yuneng-jiang 1132176289 bump: version 0.4.47 → 0.4.48 2026-02-25 12:08:15 -08:00
Sameer Kankute 370dfdc514 bump: version 1.81.14 → 1.81.15 2026-02-24 09:26:18 +05:30
Sameer Kankute 8714b9ee8f bump: version 0.4.46 → 0.4.47 2026-02-24 08:53:50 +05:30
Ishaan Jaffer 2acc93e451 BUMP 2026-02-21 15:28:18 -08:00
Ryan Crabbe 5bcaeabfd8 Merge origin/main into litellm_fix_streaming_connection_pool_leak
Resolve conflict in test_proxy_server.py: keep both async_data_generator
cleanup tests and store_model_in_db DB config override tests.
2026-02-21 12:44:50 -08:00
Ishaan Jaff 08ae43ace1 fix(migrations): add ensure_project_id migration + bump litellm-proxy-extras to 0.4.46 (#21800)
* fix(migrations): add ensure_project_id_verification_token migration

Ensures project_id column exists on LiteLLM_VerificationToken. The original
migration (20251113000000_add_project_table) adds this column, but may have
been skipped if LiteLLM_ProjectTable already existed and the migration was
resolved as idempotent. Uses IF NOT EXISTS for safety.

* bump: litellm-proxy-extras 0.4.45 → 0.4.46
2026-02-21 12:15:21 -08:00
yuneng-jiang a8d37b9385 bump: version 0.4.44 → 0.4.45 2026-02-20 15:39:44 -08:00
yuneng-jiang 2eded1dda6 bump: version 0.4.43 → 0.4.44 2026-02-19 10:50:23 -08:00
yuneng-jiang 2d928f8c54 fixing merge 2026-02-19 10:46:58 -08:00
yuneng-jiang 9eaaa9740b bump: version 0.4.42 → 0.4.43 2026-02-19 10:28:35 -08:00
yuneng-jiang c911cfbabf Merge remote-tracking branch 'origin' into litellm_key_last_active_tracking 2026-02-19 10:27:48 -08:00
yuneng-jiang 23547df8a1 bump: version 0.4.41 → 0.4.42 2026-02-19 10:11:19 -08:00
yuneng-jiang 35797d1f75 bump: version 0.4.40 → 0.4.41 2026-02-19 10:05:38 -08:00
yuneng-jiang 90a11038f8 bump: version 0.4.40 → 0.4.41 2026-02-18 23:15:53 -08:00
Ryan Crabbe 988e43662e fix: cap uvicorn version to <1.0.0 per Greptile review 2026-02-18 14:19:21 -08:00
Julio Quinteros Pro e37befd5b4 fix: address Greptile feedback - remove duplicates and fix deprecations
- Remove duplicate @pytest.fixture decorator on setup_and_teardown
- Delete conftest_improved.py (duplicate file, pytest only loads conftest.py)
- Remove deprecated event_loop fixture override
- Add asyncio_default_fixture_loop_scope config in pyproject.toml (modern approach)

This fixes pytest-asyncio >=0.22 deprecation warnings while maintaining
session-scoped event loop behavior.
2026-02-17 18:20:43 -03:00
Julio Quinteros Pro 48105e650b fix: remove pytest-retry to avoid conflicting retry plugins
- Remove pytest-retry from dev dependencies in pyproject.toml
- Add pytest-xdist as proper dev dependency (was only in pip install)
- Update CI workflow to reflect proper dependency management
- Prevents conflict between pytest-retry and pytest-rerunfailures

Having both pytest-retry and pytest-rerunfailures installed simultaneously
causes unpredictable behavior and excessive retries.
2026-02-17 16:55:37 -03:00
Julio Quinteros Pro 02126c5aac fix: remove pytest-retry configuration to eliminate duplicate retries
- Remove retries=20 and retry_delay=5 from pytest.ini_options
- These settings are for pytest-retry plugin (different from pytest-rerunfailures)
- Having both pytest-retry + pytest-rerunfailures causes excessive retries
- CI workflow now uses only pytest-rerunfailures with --reruns flag
2026-02-17 16:08:40 -03:00
Julio Quinteros Pro 7a097ae97f fix(ci): reduce parallelism and add retry logic
- Reduce workers from 4 to 2 to avoid race conditions
- Add --reruns with 2-3 retries per test group
- Increase timeout from 15 to 20 minutes
- Add better test isolation
2026-02-17 15:43:43 -03:00
yuneng-jiang c4f0fc9819 bump: version 1.81.12 → 1.81.13 2026-02-16 20:13:26 -08:00
yuneng-jiang 6371b30bfd bump: version 0.4.39 → 0.4.40 2026-02-16 11:20:59 -08:00