* fix(router.py): add acompletion_streaming_iterator inside router
allows router to catch errors mid-stream for fallbacks
Work for https://github.com/BerriAI/litellm/issues/6532
* fix(router.py): working mid-stream fallbacks
* fix(router.py): more iterations
* fix(router.py): working mid-stream fallbacks with fallbacks set on router
* fix(router.py): pass prior content back in new request as assistant prefix message
* fix(router.py): add a system prompt to help guide non-prefix supporting models to use the continued text correctly
* fix(common_utils.py): support converting `prefix: true` for non-prefix supporting models
* fix: reduce LOC in function
* test(test_router.py): add unit tests for new function
* test: add basic unit test
* fix(router.py): ensure return type of fallback stream is compatible with CustomStreamWrapper
prevent client code from breaking
* fix: cleanup
* test: update test
* fix: fix linting error
If the user specified in the configuration e.g. "user_header_name:
X-OpenWebUI-User-Email", here we were looking for a dict key
"X-OpenWebUI-User-Email" when the dict actually contained
"x-openwebui-user-email".
Switch to iteration and case insensitive string comparison instead to
fix this.
This fixes customer budget enforcement when the customer ID is passed
in as a header rather than as a "user" value in the body.
* fix(main.py): fix async retryer
Fixes https://github.com/BerriAI/litellm/issues/12830
* fix(forward_clientside_headers_by_model_group.py): filter out 'content-type' from forwardable headers
clientside content-type != proxy content type, can cause requests to hang
* test(tests/): update tests
* fix(prompt_templates/factory.py): handle anthropic cache control on individual tool results
Fixes issue where cache control on individual tool result was being ignored
* test(test_vertex_And_google_ai_studio_gemini.py): initial unit test covering translation for grounding metadata on streaming chunk
* fix(vertex_and_google_ai_studio.py): ensure grounding metadata is preserved on streaming
Closes https://github.com/BerriAI/litellm/issues/10237
* fix(core_helpers.py): include usage in expected openai keys
* refactor(aim.py): refactor to support adding aim guardrails on UI
* fix(base.py): add ui_friendly_name to config model
* feat(ui/): support loading new guardrails from backend api call
removes need to onboard each guardrail to ui
* fix: don't show optional params if not set and don't show ui_friendly_name (internal param0
* fix(ui/add_guardrail_form.tsx): ensure dynamic provider value is used
* fix(ui/): just one-time update the provider map dictionary
* fix(ui/): show masked api base / api key on guardrail update
* refactor(aporia_ai/): refactor to show on UI
* feat(aporia_ai/): add aporia ai guardrail to UI
* refactor(guardrails_ai/): refactor to add via UI
* refactor(lasso.py): refactor to enable adding lasso guardrails via UI
* feat(pangea.py): add pangea guardrail on UI
* feat(panw): add panw prisma airs through UI
* test: update tests
* fix: fix ruff linting error
* test: update tests
* fix: add missing docs
* fix: fix guardrail init
* fix: suppress linting errors
* fix(proxy_server.py): fix linting error
* fix(litellm_pre_call_utils.py): add user agent tags to spend logs in standard logging payload logic
avoid clash when tag based routing is enabled
* test: remove redundant test
* test: rename oidc test to run earlier
quicker debuging
* fix(azure.py): return more detailed error message
* fix(azure/common_utils.py): use default scope, if scope is none
fixes oidc test
* fix: always default to cognitiveservices.azure.com
* test: update test