- Revised the reasoning support indicators in the Mistral model documentation for clarity.
- Improved the `_add_reasoning_system_prompt_if_needed` method to handle both string and list content types for system messages, ensuring the reasoning prompt is correctly prepended.
- Added a new test case to verify the functionality of adding the reasoning system prompt when the existing content is a list.
* fix(internal_user_endpoints.py): support user with `+` in email on user info
ensures user is correctly parsed from input
* fix(factory.py): support vertex function call args as None
handles empty string in args for vertex gemini calls
* docs(langfuse_integration.md): pin langfuse sdk version on docs
* fix(vertex_ai/): return empty dict, instead of none when empty string given
* refactor: reduce function size
* fix: fix linting errors
* fix: revert check
* fix(internal_user_endpoints.py): fix check
* test: update tests
* test: update tests
* chore(pangea-guardrail): Fix typo in debug message.
* docs(pangea-guardrail): Fix YAML example in pangea.md (README)."
* docs(pangea-guardrail): Update pangea.md (README).
* chore(pangea-guardrail): Format with Black.
* Feature/lasso guardrail (#9002)
* first version of lasso guardrail in litellm
* update to the new Lasso API
* change prod api_base and kill the request when lasso detect issue.
* change test for now api, local test pass
* add async tests
* all tests pass
* add docs for the new lasso guardrail
* Remove support for modes other than pre_call in Lasso guardrail
* code structure and naming
* only pre_call docs
* fix lint errors
* move test to the new location follows the same directory structure as litellm/.
* add lasso guard
* docs lasso docs
* add lasso guardrail
* fix lasso guardrail
---------
Co-authored-by: oroxenberg <oro@lasso.security>
* Update web_search.md to include new supported providers and models, enhance web search options, and improve documentation for using web search with various AI models.
* Update LiteLLM version in web_search.md to reflect the latest stable release.
* Fix formatting in web_search.md for model declaration consistency.
* docs(web_search.md): add configuration options for web search in config.yaml
This update introduces sections for setting default and custom web search options in the proxy config file, including examples for different models and context sizes. A note clarifies that users can override these settings in API requests.
* refactor: comment out circuit breaker
causes incorrect rate limiting in high traffic
* fix(base_routing_strategy.py): don't reset value if redis val is lower than current in-memory value
Fixes issue where redis might be trailing in-memory value
* fix(parallel_request_limiter_v2.py): if in-memory higher than redis, don't reset value; add previous slot keys to redis increment to correctly 'get' them
* fix(parallel_request_limiter_v3.py): v3 implementation of parallel request limiter
does not use background redis syncing - increments redis in call
simplify rate limiting logic, to improve accuracy
* fix: fix ruff errors
* fix(parallel_request_limiter_v3.py): don't decrement limit on post call success - causes double decrements
* fix(parallel_request_limiter_v3.py): working accurate multi-instance logic
ensured just 100 requests allowed on 100 users, 10 ramp up, 100 rpm limit key, 2 instances
* fix(parallel_request_limiter_v3.py): working accurate rate limiting with time window resets
allows rate limiting to work across multiple windows
* test: add unit tests for v3 rate limiter
* fix(parallel_request_limiter_v3.py): return window value into in-memory cache
allows in-memory cache checks to be used correctly
* refactor(parallel_request_limiter_v3.py): refactor rate limiting to work for multiple window/counter key pairs
enables using for user/team/model rate limiting
* feat(parallel_request_limiter_v3.py): working rate limiting, across key/user/team/end-user
* fix(parallel_request_limiter_v3.py): add model specific rate limiting
* fix(parallel_request_limiter_v3.py): ignore if no rate limits set
skip unecessary rate limit checks - if no limits set
* fix(parallel_request_limiter_v3.py): initial commit bringing token rate limits back
* fix(parallel_request_limiter_v3.py): increment by value in list + update assertions to handle tokens + max parallel requests
* test(parallel_request_limiter_v3.py): more testing
* fix(parallel_request_limiter.py): working in-memory cache limiter
* fix(redis_cache.py): ignore linting error - use safe hasattr
* fix(parallel_request_limiter_v3.py): fix linting error
* refactor: remove redundant parallel_Request_limiter_v2.py
old / inaccurate implementation
* test: update tests
* style: cleanup
* test: update test
* docs(config_settings.md): document new env var
* test(test_base_routing_strategy.py): update test
* Update web_search.md to include new supported providers and models, enhance web search options, and improve documentation for using web search with various AI models.
* Update LiteLLM version in web_search.md to reflect the latest stable release.
* Fix formatting in web_search.md for model declaration consistency.
* fix(ui_sso.py): update user as proxy admin in db table, when checking for proxy_admin_id
Fixes issue where existing internal user, unable to make calls when set as proxy admin id
* fix(utils.py): fix custom base path
* fix(proxy_server.py): working swagger on custom base
removes the swagger monkey patch - this seems to render the swagger on custom base paths
* fix(ui/): working custom auth uptil login success event
* fix(ui/): working custom server root path for login
* fix(proxy_server.py): create typed dict for ui returned token
allows better documentation of expected params
* refactor(proxy_server.py): refactor all ui login endpoints to use same returned ui token object
* feat(ui_sso.py): add server root path to ui token
* feat(ui_sso.py): allows ui to call correct endpoint
* fix(networking.tsx): update proxy base url with custom root path
* fix(networking.tsx): handle updating proxy base url for non-local instances
* refactor: remove uneccessary references to proxybaseurl in ui code - reduce potential for errors
* fix: fix linting error
* fix(onboarding_link.tsx): fix onboarding link when custom server path is set
* feat(ui_discovery_endpoints.py): add new public .well-known/ route for litellm ui config
returns the server root path and proxy base url for constructing api calls
* feat(_types.py): add litellm well known config as public route
allows ui to query it
* fix(/_types.py): add .well-known config to as public route
* fix(page.tsx): create pattern for loading in ui config before making network requests
ensures requests are formatted correctly
* fix(page.tsx): call credential endpoint once ui config is loaded
* fix(page.tsx): route correctly to litellm dashboard from new user login
* fix(page.tsx): remove hardcoded `/litellm` for /sso/key/generate request
* fix(proxy_server.py): re-add moderations endpoint
* fix(proxy_server.py): mount __next__ at / and /litellm
allows it to work when proxy is mounted on root
* docs(contributing.md): remove /ui on ui doc - it will now run on root
* docs(custom_root_ui.md): add docs on custom root path