mirror of
https://github.com/tiennm99/litellm.git
synced 2026-07-03 17:08:43 +00:00
f92490c308
* fix: make PodLockManager.release_lock atomic compare-and-delete

Re-lands #21226 (reverted in #21469).

release_lock() previously did GET + compare + DEL in separate calls, leaving a window where another pod could reacquire the lock between the GET and DEL, causing a stale owner to delete a live lock.

Fix: use a Redis Lua script for atomic compare-and-delete. Script registration is cached per PodLockManager instance. Falls back to the old GET+DEL path for cache backends that don't expose async_register_script.

Original revert was due to e2e tests running in CI without Redis. Those tests now carry @pytest.mark.skip(reason="Requires Redis connection.") so this re-land is safe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add Lua fallback on execution error + test coverage gaps

Address Greptile review feedback on #24466:

1. Wrap Lua script execution in try/except: if Redis clears loaded scripts (restart) or scripting is disabled, fall back to GET+DEL rather than letting the exception propagate and leave the lock held until TTL. Reset cached script handle so the next call re-registers.

2. Add test_release_lock_lua_path_emits_released_event: verifies _emit_released_lock_event is called when Lua path returns 1.

3. Add test_release_lock_falls_back_to_get_del_when_lua_execution_fails: verifies the fallback path is taken and script handle is reset.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
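
For readers who want to see the shape of the release path described above, here is a minimal sketch. It is not the actual PodLockManager code. The names `InMemoryCache`, `LockReleaser`, `async_get`, `async_delete`, and the `keys=`/`args=` call signature of the registered script are assumptions made for illustration; only `async_register_script` and the compare-and-delete idea come from the commit message. The in-memory cache emulates the Lua script's effect so the example runs without Redis.

```python
import asyncio
from typing import Any, Dict, Optional

# Standard Redis compare-and-delete script: delete KEYS[1] only if it
# still holds the caller's value ARGV[1]. Redis runs it atomically.
COMPARE_AND_DELETE_LUA = """
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
end
return 0
"""


class InMemoryCache:
    """Hypothetical stand-in for a Redis-backed cache (assumption)."""

    def __init__(self, scripts_work: bool = True) -> None:
        self.store: Dict[str, str] = {}
        self.scripts_work = scripts_work

    async def async_get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    async def async_delete(self, key: str) -> int:
        return 1 if self.store.pop(key, None) is not None else 0

    def async_register_script(self, script: str) -> Any:
        # Emulates what the Lua script would do on a real Redis server.
        async def run(keys: list, args: list) -> int:
            if not self.scripts_work:
                raise RuntimeError("script not loaded")
            if self.store.get(keys[0]) == args[0]:
                del self.store[keys[0]]
                return 1
            return 0

        return run


class LockReleaser:
    """Hypothetical sketch of the release logic, not the real class."""

    def __init__(self, cache: Any, pod_id: str) -> None:
        self.cache = cache
        self.pod_id = pod_id
        self._script: Any = None  # cached script handle, one per instance

    async def release_lock(self, key: str) -> bool:
        register = getattr(self.cache, "async_register_script", None)
        if register is not None:
            try:
                if self._script is None:
                    self._script = register(COMPARE_AND_DELETE_LUA)
                return await self._script(keys=[key], args=[self.pod_id]) == 1
            except Exception:
                # Drop the handle so the next call re-registers the script,
                # then fall through to the fallback below.
                self._script = None
        # Non-atomic fallback: GET, compare, DEL as separate calls.
        if await self.cache.async_get(key) == self.pod_id:
            return await self.cache.async_delete(key) == 1
        return False
```

In this sketch a pod that does not own the lock gets `False` back and the key is left in place, while a failure during script execution resets the cached handle and still releases the lock through the GET+DEL path, so it is not left held until its TTL expires.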