Author SHA1 Message Date
Alexsander Hamir eb5031da1e [Perf] Fix bottlenecks degrading realtime endpoint performance (#16670)
* Cache realtime websocket request body

Move the realtime request payload builder out of the websocket handler and wrap it with an LRU cache so repeated connections reuse the same bytes object. This keeps the JSON formatting cost down while bounding memory usage.
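
As a rough illustration of the caching described above, the sketch below wraps a payload builder in `functools.lru_cache`. The helper name `build_realtime_request_body`, the payload shape, and the cache size are assumptions made for the example, not the code from this commit.

```python
import json
from functools import lru_cache


# Hypothetical helper: the name, the payload fields, and maxsize=128 are
# assumptions for illustration only.
@lru_cache(maxsize=128)
def build_realtime_request_body(model: str) -> bytes:
    """Serialize the payload once per distinct model name."""
    return json.dumps({"model": model}).encode("utf-8")


first = build_realtime_request_body("example-realtime-model")
second = build_realtime_request_body("example-realtime-model")

# A cache hit returns the very same bytes object, so the JSON is not rebuilt.
assert first is second
```

Because `maxsize` is finite, the least recently used entries are evicted once the cache fills up, which is what keeps memory bounded.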

* Optimize realtime websocket caching

Refactored /v1/realtime to use cached helpers for both the JSON body and query params, introduced a reusable request-scope template, and optimized header handling to avoid redundant work.

* Refine realtime websocket header handling

* Reuse websocket scope headers in auth

* Refactor realtime request body helper

Move the realtime request body formatter into proxy common utils so it can be reused across modules. Reuse it in the websocket auth flow to share LRU caching and avoid ad hoc byte builders.

* fix: revert to old pattern

The old pattern was necessary; we can just return the optimized function instead.

* Reuse SSL context for realtime

Create a shared SSLContext for OpenAI realtime websocket dials and pass it into websockets.connect so we stop re-reading verify paths on every session.
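
A minimal sketch of the idea, assuming a module-level context and a small helper that supplies it as the `ssl` keyword argument accepted by `websockets.connect`. The names `SHARED_SSL_CONTEXT` and `realtime_connect_kwargs` are illustrative, not taken from the commit.

```python
import ssl

# Built once, so CA verify paths are loaded a single time rather than on
# every websocket session. The name is an assumption for this example.
SHARED_SSL_CONTEXT = ssl.create_default_context()


def realtime_connect_kwargs() -> dict:
    """Keyword arguments to forward to websockets.connect."""
    return {"ssl": SHARED_SSL_CONTEXT}


# Intended use inside an async function (not executed here):
#     async with websockets.connect(url, **realtime_connect_kwargs()) as ws:
#         ...
```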

* feat: reuse shared TLS context for realtime websockets

- add `SHARED_REALTIME_SSL_CONTEXT` helper so all realtime websocket clients share the same TLS settings
- wire the shared context into OpenAI, Azure, custom HTTPX handlers, and realtime health checks
- update realtime tests to assert that the expected SSL context is passed to `websockets.connect`

This keeps TLS configuration consistent and avoids recreating SSL contexts per connection.

* Reuse HTTP SSL context for realtime

Remove the standalone realtime SSL helper, expose a shared context directly from the HTTP handler, and point all realtime websocket clients and tests to it. Add the websocket header comparison tool.

* Lazy-load shared realtime SSL context

Fix circular imports introduced by eagerly instantiating the shared TLS context. Make the HTTP handler lazily create the context and have realtime clients/tests fetch it on demand, keeping configuration consistent without breaking startup.
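
The lazy pattern can be sketched as an accessor that builds the context on first call instead of at import time. The function name `get_shared_ssl_context` is hypothetical, and `functools.lru_cache` is just one way to memoize a zero-argument factory; the commit may do this differently.

```python
import ssl
from functools import lru_cache


@lru_cache(maxsize=1)
def get_shared_ssl_context() -> ssl.SSLContext:
    """Create the context on first use; later calls return the same object.

    Nothing is constructed when this module is imported, so importing it
    cannot trigger work in other modules that are still initializing.
    """
    return ssl.create_default_context()
```

Callers fetch the context on demand, for example `ssl=get_shared_ssl_context()`, rather than importing a prebuilt module-level object.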

* add: unit test for realtime LRU caches

* fix: merge conflict with imports
2025-11-22 10:01:02 -08:00