mirror of
https://github.com/tiennm99/litellm.git
synced 2026-07-04 03:13:54 +00:00
7c5e2e8389
* fix(proxy): make async_post_call_response_headers_hook consistent across all endpoints

The response headers hook had 5 gaps that prevented callbacks from reliably extracting routing metadata across endpoint types:

1. Hook never fired for /audio/transcriptions (endpoint bypasses base_process_llm_request)
2. custom_llm_provider not accessible in hook data for any endpoint
3. custom_llm_provider not stamped in ResponsesAPIResponse._hidden_params (unlike chat completions)
4. model_info under inconsistent keys (metadata vs litellm_metadata)
5. request_headers always None at all call sites

This adds a litellm_call_info parameter to the hook that normalizes routing metadata (custom_llm_provider, model_info, api_base, model_id) regardless of endpoint type. Also stamps custom_llm_provider on Responses API responses, adds the hook call to the transcription handler, and passes request_headers at all call sites.

Supersedes PR #21385.

* fix(proxy): address review feedback: safer backwards compat and None guards

- Replace try/except TypeError with inspect.signature() check for litellm_call_info backwards compatibility. This avoids masking real TypeErrors inside callback implementations and prevents double invocation with inconsistent parameters.
- Use (data.get("key") or {}) instead of data.get("key", {}) to guard against keys that exist with an explicit None value, which would cause AttributeError on the subsequent .get() call.

* fix(proxy): cache inspect.signature result for callback compat check

Move the inspect.signature() call into a module-level helper with a dict cache keyed by callback identity. Avoids repeated introspection per request per callback in the hot path.

* fix(proxy): use class identity for signature cache key

Key the _CALLBACK_ACCEPTS_CALL_INFO cache by id(type(cb)) instead of id(cb) to avoid stale entries from Python address reuse after GC. All instances of the same callback class share the same method signature, so class identity is both safer and more cache-efficient.