* [Test] Add Azure async chat completion timeout test. WIP
* Capture TTFT for /v1/messages streaming responses
The pass-through streaming path for /v1/messages (Anthropic, Bedrock,
Vertex AI, Azure AI, Minimax) logged completion_start_time only after
the entire stream finished. async_success_handler then fell back to
end_time, making TTFT equal to total duration or null in the UI and
Prometheus.
Record the timestamp of the first chunk in async_sse_wrapper and
propagate it to model_call_details before the logging handler runs,
so gen_ai.response.time_to_first_token reflects the real first-chunk
latency.
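
For illustration, the first-chunk capture can be sketched roughly as
below. This is a simplified stand-in, not the actual async_sse_wrapper:
the function name, the injectable clock, and the use of a plain dict for
model_call_details are assumptions made for the example.

```python
import time
from typing import Any, AsyncIterator, Callable, Dict


async def sse_wrapper_with_ttft(
    stream: AsyncIterator[bytes],
    model_call_details: Dict[str, Any],
    clock: Callable[[], float] = time.time,
) -> AsyncIterator[bytes]:
    """Yield chunks unchanged, stamping the arrival time of the first one.

    Hypothetical helper: the timestamp is written when the first chunk
    arrives, so a logging handler that runs after the stream ends does
    not need to fall back to end_time.
    """
    first_chunk_seen = False
    async for chunk in stream:
        if not first_chunk_seen:
            first_chunk_seen = True
            # Record the time once, before handing the chunk to the caller.
            model_call_details["completion_start_time"] = clock()
        yield chunk
```

TTFT is then completion_start_time minus the request start time, rather
than the total stream duration.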
Fixes #25598
* [Refactor] Implement timeout resolution logic in completion function
Fetch ``request_timeout`` from litellm_settings
* remove stale test case
* remove extra print statement
* default request timeout value in constants to 600s to match timeout defaults handled in the proxy
* fix request timeout if using default value from constants.py
* update code structure, test cases
* only override if the global timeout sets timeout to 6000s
* update code structure, move hard-coded values to constants, and make the resolve function more readable by moving the fallback logic to a separate function
* modify default timeout values, replacing hard-coded ones with the defined default constants
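
A minimal sketch of the resolution order these commits describe,
assuming the precedence is per-call timeout, then ``request_timeout``
from litellm_settings, then the 600s constant. The function and constant
names are hypothetical, and the special-case override for a 6000s global
timeout is not modeled here.

```python
from typing import Optional

# Assumed name; the commits only state that the default is 600s.
DEFAULT_REQUEST_TIMEOUT_SECONDS = 600.0


def _fallback_timeout(settings_timeout: Optional[float], default: float) -> float:
    """Fallback logic kept separate: settings value if present, else the default."""
    return settings_timeout if settings_timeout is not None else default


def resolve_timeout(
    call_timeout: Optional[float],
    settings_timeout: Optional[float],
    default: float = DEFAULT_REQUEST_TIMEOUT_SECONDS,
) -> float:
    """Return the timeout to use for one completion call, in seconds."""
    if call_timeout is not None:
        # An explicit per-call value always wins.
        return float(call_timeout)
    return float(_fallback_timeout(settings_timeout, default))
```

For example, resolve_timeout(None, 120) falls back to the settings value,
and resolve_timeout(None, None) falls back to the constant.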
---------
Co-authored-by: harish876 <harishgokul01@gmail.com>
Co-authored-by: Joaquin Hui Gomez <joaquinhuigomez@users.noreply.github.com>