litellm/tests/test_litellm/interactions
Mateo Wang 8acf64e16c fix(interactions): never drop streamed text deltas; always emit terminal completion (#28394)
* fix(interactions): never drop streamed text deltas; always emit terminal completion

The interactions streaming bridge had two bugs flagged by Greptile on PR #28153:

1. The first OutputTextDeltaEvent (and the second, when no ResponseCreatedEvent
   precedes the deltas) was consumed to emit a synthetic interaction.created /
   step.start event, but the chunk's text payload was never forwarded as a
   step.delta. The text only reappeared in the terminal step.stop, which
   defeats the purpose of incremental streaming.

2. When the upstream Responses API stream ended via StopIteration without a
   ResponseCompletedEvent, the iterator emitted step.stop but never the
   terminal interaction.completed event carrying the full collected text.

This refactors the iterator to translate each upstream chunk into a list of
events (instead of a single event) and buffers them in a deque. A text delta
now expands into [interaction.created, step.start, step.delta] on the first
chunk so no token is dropped, and the StopIteration / StopAsyncIteration
fallback always flushes a terminal interaction.completed event when one
hasn't already been sent.
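
A minimal sketch of the list-of-events-plus-deque shape described above,
not the actual litellm code. TextDelta, Completed, BridgeIterator and the
dict event layout are hypothetical stand-ins for illustration only; ids are
left out to keep the example short.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class TextDelta:  # hypothetical stand-in for OutputTextDeltaEvent
    delta: str


@dataclass
class Completed:  # hypothetical stand-in for ResponseCompletedEvent
    pass


class BridgeIterator:
    """Translates each upstream chunk into zero or more buffered events."""

    def __init__(self, upstream):
        self._upstream = iter(upstream)
        self._pending: Deque[Dict] = deque()
        self._text: List[str] = []
        self._started = False
        self._completed = False
        self._exhausted = False

    def _terminal_events(self) -> List[Dict]:
        # Emit the terminal pair at most once, whichever path gets here first.
        if self._completed:
            return []
        self._completed = True
        return [
            {"type": "step.stop"},
            {"type": "interaction.completed", "text": "".join(self._text)},
        ]

    def _translate(self, chunk) -> List[Dict]:
        events: List[Dict] = []
        if isinstance(chunk, TextDelta):
            if not self._started:
                # First delta: synthesize the opening events, then still
                # forward the chunk's own text below.
                self._started = True
                events.append({"type": "interaction.created"})
                events.append({"type": "step.start"})
            self._text.append(chunk.delta)
            events.append({"type": "step.delta", "text": chunk.delta})
        elif isinstance(chunk, Completed):
            events.extend(self._terminal_events())
        return events

    def __iter__(self):
        return self

    def __next__(self) -> Dict:
        while not self._pending:
            if self._exhausted:
                raise StopIteration
            try:
                chunk = next(self._upstream)
            except StopIteration:
                # EOF fallback: flush the terminal events if not yet sent.
                self._exhausted = True
                self._pending.extend(self._terminal_events())
                continue
            self._pending.extend(self._translate(chunk))
        return self._pending.popleft()
```

Because every chunk maps to a list, the first delta can yield three events
without losing its text, and the EOF path reuses the same buffer as the
normal completion path.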

Both behaviors are covered by new unit tests:
- test_no_text_token_is_dropped_during_streaming
- test_response_created_then_text_delta_emits_step_start_and_delta
- test_stop_iteration_fallback_emits_completion_event
- test_response_completed_emits_stop_then_completion (no double-emit)

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* fix(interactions): correlate EOF terminal events with stream's interaction id

The StopIteration fallback path previously built the terminal step.stop
event (legacy content.stop) with id=None and the terminal
interaction.completed event with a memory-address fallback string. Neither
matched the item_id used by the earlier interaction.created / step.start /
step.delta events in the same stream, so downstream consumers correlating
events by id would see a mismatch.

Persist the interaction id derived from the first upstream chunk (item_id
on an OutputTextDeltaEvent, or response.id on a ResponseCreatedEvent) and
reuse it when flushing the terminal events on EOF.

Author: mateo-berri <277851410+mateo-berri@users.noreply.github.com>

* ci(windows): raise UV_HTTP_TIMEOUT to 300s for uv sync

The using_litellm_on_windows job has been hitting flaky PyPI download
timeouts during 'uv sync --frozen --group dev'. A different package fails
on each rerun (six, pydantic-core), but all surface the same uv error:

  Failed to download distribution due to network timeout.
  Try increasing UV_HTTP_TIMEOUT (current value: 30s).

uv's default 30s per-request timeout is too tight for the Windows runner
on this project (50+ deps, several multi-MB wheels), so bump it to 300s
to let slow individual downloads complete instead of failing the build.

* fix(interactions): correlate ResponseCompletedEvent terminal events with stream's interaction id

When a stream starts directly with OutputTextDeltaEvent (no preceding
ResponseCreatedEvent), interaction.created carries item_id while
interaction.completed previously carried response.id from
ResponseCompletedEvent. The two ids can differ, leaving consumers that
correlate events by id unable to match the start and completion events.

Fall back to self._interaction_id (set on the first chunk that derives
an id) before response.id, mirroring the EOF terminal path.
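
The id precedence used by both terminal paths can be sketched as follows.
This is an illustration under assumptions, not the litellm implementation:
InteractionIdTracker and its method names are hypothetical, and only the
"first id wins, response.id is the fallback" rule comes from the text above.

```python
from typing import Optional


class InteractionIdTracker:
    """Remembers the first id seen on a stream (hypothetical helper)."""

    def __init__(self) -> None:
        self.interaction_id: Optional[str] = None

    def observe(self, candidate: Optional[str]) -> None:
        # Keep only the first non-empty id (item_id on a text delta, or
        # response.id on a created event); later chunks never overwrite it.
        if self.interaction_id is None and candidate:
            self.interaction_id = candidate

    def terminal_id(self, response_id: Optional[str] = None) -> Optional[str]:
        # Prefer the persisted stream id; fall back to response.id only
        # when no earlier chunk supplied one.
        return self.interaction_id or response_id


tracker = InteractionIdTracker()
tracker.observe("item_1")   # stream started directly with a text delta
tracker.observe("resp_9")   # a later id does not replace the first one
completed_id = tracker.terminal_id(response_id="resp_9")
```

With this ordering, interaction.completed carries the same id as the
interaction.created event that opened the stream, even when the completion
chunk reports a different response.id.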

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
2026-05-20 16:41:40 -07:00