litellm/tests/test_litellm/proxy/db/db_transaction_queue
Joe Reyna f92490c308 fix: make PodLockManager.release_lock atomic compare-and-delete (re-land #21226) (#24466)
* fix: make PodLockManager.release_lock atomic compare-and-delete

Re-lands #21226 (reverted in #21469).

release_lock() previously did GET + compare + DEL in separate calls,
leaving a window where another pod could reacquire the lock between
the GET and DEL, causing a stale owner to delete a live lock.

Fix: use a Redis Lua script for atomic compare-and-delete. Script
registration is cached per PodLockManager instance. Falls back to
the old GET+DEL path for cache backends that don't expose
async_register_script.
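
A minimal sketch of the race and the atomic alternative, assuming a
hypothetical in-memory FakeRedis stand-in so it runs without a server.
RELEASE_LOCK_LUA is the conventional compare-and-delete script, not
necessarily the exact script in this change; compare_and_delete()
only models its single-step semantics.

```python
import asyncio

# Conventional compare-and-delete script (an assumption, not copied from the PR).
RELEASE_LOCK_LUA = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def delete(self, key):
        return 1 if self.store.pop(key, None) is not None else 0

    async def compare_and_delete(self, key, expected):
        # Models the Lua script: no await between the check and the delete,
        # so no other coroutine can interleave.
        if self.store.get(key) == expected:
            del self.store[key]
            return 1
        return 0


async def racy_release(r, key, pod_id, between=None):
    """Old path: GET, compare, DEL as separate calls."""
    current = await r.get(key)
    if between is not None:
        await between()  # another pod acts inside the GET/DEL window
    if current == pod_id:
        await r.delete(key)
        return True
    return False


async def demo():
    # Racy path: pod-a's lock expires and pod-b reacquires between GET and DEL.
    racy = FakeRedis()
    racy.store["lock"] = "pod-a"

    async def pod_b_reacquires():
        racy.store["lock"] = "pod-b"

    await racy_release(racy, "lock", "pod-a", between=pod_b_reacquires)

    # Atomic path: the stale owner's release is rejected.
    atomic = FakeRedis()
    atomic.store["lock"] = "pod-b"
    released = await atomic.compare_and_delete("lock", "pod-a")
    return racy.store, atomic.store, released


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the racy run the stale owner deletes pod-b's live lock; in the
atomic run the lock survives and the release reports 0.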

The original change was reverted because e2e tests ran in CI without
Redis. Those tests now carry
@pytest.mark.skip(reason="Requires Redis connection."), so this
re-land is safe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add Lua fallback on execution error + test coverage gaps

Address Greptile review feedback on #24466:

1. Wrap Lua script execution in try/except: if Redis has dropped its
   loaded scripts (e.g. after a restart) or scripting is disabled, fall
   back to GET+DEL rather than letting the exception propagate and
   leave the lock held until its TTL expires. Reset the cached script
   handle so the next call re-registers.

2. Add test_release_lock_lua_path_emits_released_event — verifies
   _emit_released_lock_event is called when Lua path returns 1.

3. Add test_release_lock_falls_back_to_get_del_when_lua_execution_fails
   — verifies the fallback path is taken and script handle is reset.
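
The fallback behaviour in item 1 could look roughly like the sketch
below. Only async_register_script is named in this change; LockReleaser,
FlakyCache, get, delete, and the keys/args call shape are hypothetical
names chosen for illustration, and the real PodLockManager may differ.

```python
import asyncio

LUA = "if redis.call('GET', KEYS[1]) == ARGV[1] then return redis.call('DEL', KEYS[1]) end return 0"


class FlakyCache:
    """Hypothetical cache backend whose script execution can be made to fail."""

    def __init__(self, fail_script=False):
        self.store = {}
        self.fail_script = fail_script
        self.registrations = 0

    async def async_register_script(self, source):
        self.registrations += 1

        async def run(keys, args):
            if self.fail_script:
                raise RuntimeError("NOSCRIPT")  # e.g. scripts lost on restart
            if self.store.get(keys[0]) == args[0]:
                del self.store[keys[0]]
                return 1
            return 0

        return run

    async def get(self, key):
        return self.store.get(key)

    async def delete(self, key):
        self.store.pop(key, None)


class LockReleaser:
    """Hypothetical holder of the cached script handle."""

    def __init__(self, cache):
        self.cache = cache
        self._script = None  # registered lazily, cached per instance

    async def release(self, key, pod_id):
        register = getattr(self.cache, "async_register_script", None)
        if register is not None:
            try:
                if self._script is None:
                    self._script = await register(LUA)
                return await self._script(keys=[key], args=[pod_id]) == 1
            except Exception:
                # Drop the handle so the next call re-registers the script.
                self._script = None
        # Fallback: non-atomic GET + DEL, better than holding the lock until TTL.
        if await self.cache.get(key) == pod_id:
            await self.cache.delete(key)
            return True
        return False
```

The fallback reintroduces the GET/DEL window, so it trades atomicity
for liveness only while the script path is unavailable.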

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 17:33:21 -07:00