* fix(router.py): support base model for model group usage
allows model group info to show accurate cost information for azure models
* fix(router.py): fix changes
* test: add unit tests
* build(pyproject.toml): bump openai version requirements
support custom tool from responses api
Closes https://github.com/BerriAI/litellm/issues/13391
* docs(responses_api.md): add verbosity + free-form function calling parameters
* docs(responses_api.md): add cfg + minimal reasoning to docs
Closes https://github.com/BerriAI/litellm/issues/13391
* docs(responses_api.md): add proxy examples to docs
* refactor: fix ruff error
* add MCP guardrails doc to mcp.md
* add button to reload models
* Added button changes
* added button for scheduling reload
* add multi pod support to reloading the model price json
* fix ruff
* feat(proxy/utils.py): track pre-call hooks in OTEL
some pre-call hooks can add latency under high traffic; make sure this is tracked
* fix(router.py): move redis call on deployment_callback_on_success to pipeline operation
reduces p99 latency by half when redis is enabled
* fix(parallel_request_limiter_v3.py): only run check if any item has rate limits set
Prevents unnecessary latency added by rate limit checks
* test: add unit tests
* Latency Improvements: only track tpm/rpm usage when set on deployment + LLM Caching - use an in-memory cache to reduce redis calls + OTEL - track time spent on LLM caching (#13472)
* fix(router.py): only track usage for deployments with tpm/rpm set
ensures additional latency is avoided for models without tpm/rpm set
* fix(caching_handler.py): log time spent on request get cache to OTEL
enables easy debugging of call latency
* fix(caching_handler.py): use dual cache object for in-memory caching + trace redis call within caching handler
* fix(caching_handler.py): working in-memory cache for redis calls
ensures dual cache works when redis cache is set up for llm calls
makes calls quicker by only checking redis when the in-memory cache misses for an llm api call
* test: remove redundant test
* test: add unit tests
* feat(reasoning): support 'minimal' effort type for OpenAI
* fix(reasoning): correctly map 'minimal' effort to Reasoning object
* chore(dependencies): update OpenAI package version to 1.99.5 in pyproject.toml and requirements.txt
* chore(dependencies): update poetry.lock for OpenAI package version 1.99.5 and Poetry version 2.1.3
* fix sso logout
- add a new login page with sso button
* lint fix
* lint fix
* lint fix
* fix tests
* fix test
* Revert "fix test"
This reverts commit 74eb7345710892d5a9d02baec0ef389b98d0dde3.
* Reapply "fix test"
This reverts commit 72d0b2d4c62f6bb9351a7656ff88efc2ba91aef7.
* add host to add modal
* close modal after save is clicked and auto-refresh
* show old values in edit modal
* send the whole payload on edit
* Update settings.tsx
* resolve conflict
* fix conflict
* merge main
* first draft of notifications added to settings
* add error compatibility by taking errors from the backend
- db errors
- auth errors
* add support for different types of errors
* minor
* name change
* email alerts page notifications modified
* remove unused code
* fix(access group): allow access group on mcp tool retrieval
* fix(test): fix broken tests and add test case for access group
* fix(mypy): fix typing issues