3827 Commits

Author SHA1 Message Date
Ishaan Jaff 6657012f07 docs release note 2025-06-12 11:11:00 -07:00
Cole McIntosh bee41c1961 Update Mistral documentation and enhance reasoning prompt handling
- Revised the reasoning support indicators in the Mistral model documentation for clarity.
- Improved the `_add_reasoning_system_prompt_if_needed` method to handle both string and list content types for system messages, ensuring the reasoning prompt is correctly prepended.
- Added a new test case to verify the functionality of adding the reasoning system prompt when the existing content is a list.
2025-06-12 11:17:48 -06:00
Cole McIntosh c5f91b9d77 Merge branch 'BerriAI:main' into mistral-reasoning 2025-06-12 11:04:57 -06:00
Krrish Dholakia e4c89135f5 docs(index.md): clarify pip install will be live by eod 2025-06-11 19:04:17 -07:00
Krrish Dholakia bdb1222a57 docs(index.md): remove pip install - not live yet 2025-06-11 19:03:31 -07:00
Krrish Dholakia 1bd2b03b4d docs(index.md): update docs to indicate v1.72.2-stable is now live 2025-06-11 19:03:11 -07:00
Cole McIntosh a30ed8ce0b [Feat] Add Mistral AI reasoning capabilities docs 2025-06-11 18:20:17 -06:00
Krish Dholakia 39de3610be fix(internal_user_endpoints.py): support user with + in email on us… (#11601)
* fix(internal_user_endpoints.py): support user with `+` in email on user info

ensures user is correctly parsed from input

* fix(factory.py): support vertex function call args as None

handles empty string in args for vertex gemini calls

* docs(langfuse_integration.md): pin langfuse sdk version on docs

* fix(vertex_ai/): return empty dict, instead of none when empty string given

* refactor: reduce function size

* fix: fix linting errors

* fix: revert check

* fix(internal_user_endpoints.py): fix check

* test: update tests

* test: update tests
2025-06-10 22:13:10 -07:00
Ishaan Jaff 2d0ea74cf4 [Bug Fix] No module named 'diskcache' (#11600)
* (build) show clear error when disk cache does not exist

* docs disk cache

* add caching to pyproject
2025-06-10 14:54:11 -07:00
Ishaan Jaff 55cd5f096c [Feat] LiteLLM Allow setting Uvicorn Keep Alive Timeout (#11594)
* Add keepalive timeout option for uvicorn server configuration

* docs Keepalive Timeout

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-10 13:30:19 -07:00
Konstantin Lapine 62e6cb315b Pangea/kl/udpate readme (#11570)
* chore(pangea-guardrail): Fix typo in debug message.

* docs(pangea-guardrail): Fix YAML example in pangea.md (README).

* docs(pangea-guardrail): Update pangea.md (README).

* chore(pangea-guardrail): Format with Black.
2025-06-10 08:29:12 -07:00
Ishaan Jaff c6d0878160 [Feat] Add Lasso Guardrail to LiteLLM (#11565)
* Feature/lasso guardrail (#9002)

* first version of lasso guardrail in litellm

* update to the new Lasso API

* change prod api_base and kill the request when lasso detects an issue.

* change tests for new api, local tests pass

* add async tests

* all tests pass

* add docs for the new lasso guardrail

* Remove support for modes other than pre_call in Lasso guardrail

* code structure and naming

* only pre_call docs

* fix lint errors

* move test to the new location, following the same directory structure as litellm/.

* add lasso guard

* docs lasso docs

* add lasso guardrail

* fix lasso guardrail

---------

Co-authored-by: oroxenberg <oro@lasso.security>
2025-06-09 18:47:26 -07:00
Krrish Dholakia 230dd70604 docs(data_security.md): data_security.md
update to indicate litellm does have soc 2 type2
2025-06-09 17:53:11 -07:00
Marc Abramowitz 3bd36238dc Simplify management_cli.md CLI docs (#10799)
Offer just 1 easy option for installing with `uv tool` so people can get
started quickly
2025-06-09 17:38:06 -07:00
Ishaan Jaff cd8ec4556f [Feat] Add reasoning_effort support for perplexity models (#11562)
* fix: add reasoning_effort for pplx

* docs pplx reasoning

* [tests] add mock tests for pplx reasoning (#11564)

* Add tests for Perplexity reasoning models and effort parameter

* tests perplexity reasoning effort

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* test pplx reasoning effort

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-09 17:07:31 -07:00
Cole McIntosh 322aefb97d Update documentation for configuring web search options in config.yaml (#11537)
* Update web_search.md to include new supported providers and models, enhance web search options, and improve documentation for using web search with various AI models.

* Update LiteLLM version in web_search.md to reflect the latest stable release.

* Fix formatting in web_search.md for model declaration consistency.

* docs(web_search.md): add configuration options for web search in config.yaml

This update introduces sections for setting default and custom web search options in the proxy config file, including examples for different models and context sizes. A note clarifies that users can override these settings in API requests.
2025-06-09 15:19:57 -07:00
fengbohello 400acd4297 docs: fix database_url config description (#11547) 2025-06-09 15:18:37 -07:00
Krrish Dholakia 35bc2d52a3 docs(index.md): document rate limiting improvements 2025-06-07 19:51:26 -07:00
Krrish Dholakia ff5a52fe3f docs(index.md): reference anthropic mcp docs on release notes 2025-06-07 19:27:54 -07:00
Krrish Dholakia 483f835c65 docs(anthropic.md): document anthropic mcp tool calling support 2025-06-07 19:21:14 -07:00
Ishaan Jaff bb8a5e752e docs 1.72.2 notes 2025-06-07 18:32:51 -07:00
Ishaan Jaff 3f41b84408 docs add audit logs to release note 2025-06-07 18:27:27 -07:00
Ishaan Jaff 6f6f9bf58e docs cleanup 2025-06-07 17:22:40 -07:00
Ishaan Jaff 4fc92244b5 [Docs] v1.72.2.rc (#11519)
* v1-72-2.rc

* docs v1.72.2.rc

* docs 1.72.2.rc

* docs update

* docs Bug Fixes

* add TLDR section

* add table

* docs release notes

* docs v1-72-2-stable

* docs /v1/messages

* docs hf rerank
2025-06-07 17:20:15 -07:00
Ishaan Jaff 3351f0513e Update Anthropic unified docs with multi-provider examples and proxy usage (#11523)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-06-07 15:24:14 -07:00
Krish Dholakia c42740a4b9 Simplify experimental multi-instance rate limiter - more accurate (#11424)
* refactor: comment out circuit breaker

causes incorrect rate limiting in high traffic

* fix(base_routing_strategy.py): don't reset value if redis val is lower than current in-memory value

Fixes issue where redis might be trailing in-memory value

* fix(parallel_request_limiter_v2.py): if in-memory higher than redis, don't reset value; add previous slot keys to redis increment to correctly 'get' them

* fix(parallel_request_limiter_v3.py): v3 implementation of parallel request limiter

does not use background redis syncing - increments redis in call

simplify rate limiting logic to improve accuracy

* fix: fix ruff errors

* fix(parallel_request_limiter_v3.py): don't decrement limit on post call success - causes double decrements

* fix(parallel_request_limiter_v3.py): working accurate multi-instance logic

ensured just 100 requests allowed on 100 users, 10 ramp up, 100 rpm limit key, 2 instances

* fix(parallel_request_limiter_v3.py): working accurate rate limiting with time window resets

allows rate limiting to work across multiple windows

* test: add unit tests for v3 rate limiter

* fix(parallel_request_limiter_v3.py): return window value into in-memory cache

allows in-memory cache checks to be used correctly

* refactor(parallel_request_limiter_v3.py): refactor rate limiting to work for multiple window/counter key pairs

enables using for user/team/model rate limiting

* feat(parallel_request_limiter_v3.py): working rate limiting, across key/user/team/end-user

* fix(parallel_request_limiter_v3.py): add model specific rate limiting

* fix(parallel_request_limiter_v3.py): ignore if no rate limits set

skip unnecessary rate limit checks - if no limits set

* fix(parallel_request_limiter_v3.py): initial commit bringing token rate limits back

* fix(parallel_request_limiter_v3.py): increment by value in list + update assertions to handle tokens + max parallel requests

* test(parallel_request_limiter_v3.py): more testing

* fix(parallel_request_limiter.py): working in-memory cache limiter

* fix(redis_cache.py): ignore linting error - use safe hasattr

* fix(parallel_request_limiter_v3.py): fix linting error

* refactor: remove redundant parallel_Request_limiter_v2.py

old / inaccurate implementation

* test: update tests

* style: cleanup

* test: update test

* docs(config_settings.md): document new env var

* test(test_base_routing_strategy.py): update test
2025-06-07 11:10:55 -07:00
Cole McIntosh 1d86fc84fe Update web search documentation for new provider support (xAI, VertexAI, Google AI Studio) (#11515)
* Update web_search.md to include new supported providers and models, enhance web search options, and improve documentation for using web search with various AI models.

* Update LiteLLM version in web_search.md to reflect the latest stable release.

* Fix formatting in web_search.md for model declaration consistency.
2025-06-07 09:13:03 -07:00
Tu Vu 3b7746a13b Update the correct test directory in contributing_code.md (#11511) 2025-06-07 07:35:01 -07:00
Ishaan Jaff 5299c4bb6e docs - stable release v1.72.0 2025-06-06 20:58:02 -07:00
Tu Vu bb45844ad8 Update model version in deploy.md (#11506) 2025-06-06 20:35:14 -07:00
Tu Vu bc0e93e8a1 Remove retired version gpt-3.5 from configs.md (#11508) 2025-06-06 20:34:45 -07:00
Ishaan Jaff 4749008fbe docs: add redis version requirement (#11499) 2025-06-06 14:22:47 -07:00
Fadil Rahman fb5f2c5441 Add batch polling to python code in batches docs (#11286) 2025-06-06 10:49:23 -07:00
AyrennC 900c6e5d7b [Docs] Add audio / tts section for gemini and vertex (#11306)
* added audio and tts doc for gemini

* updated gemini and vertex audio gen doc to be more concise
2025-06-06 10:48:50 -07:00
Krrish Dholakia b70574017c docs: document new env vars 2025-06-06 09:59:13 -07:00
Krrish Dholakia 1ca85161a5 docs(users.md): clarify how budgets are applied 2025-06-05 23:12:24 -07:00
RMeans 742405f6cf Add pangea to guardrails sidebar (#11464) 2025-06-05 18:11:52 -07:00
Ishaan Jaff 18ea65218b [Feat] Make batch size for maximum retention in spend logs a controllable parameter (#11459)
* feat: add SPEND_LOG_CLEANUP_BATCH_SIZE

* docs update

* test: test_cleanup_batch_size_env_var
2025-06-05 17:11:51 -07:00
Krish Dholakia d05eda0311 Custom Root Path Improvements: don't require reserving /litellm route (#11460)
* fix(proxy_server.py): initial commit with asset prefix rewriting for custom base path

Closes https://github.com/BerriAI/litellm/issues/11451

* docs(litellm_proxy.md): clarify version requirement

* fix(proxy_server.py): replace litellm well known route with custom server root path

Ensures UI calls correct endpoint

* build(ui/): update ui build
2025-06-05 16:36:47 -07:00
Krrish Dholakia 30f1c5e852 docs: clarify pre-release 2025-06-05 15:00:04 -07:00
Low Jian Sheng a3e5bc4856 Support no reasoning option for gemini models (#11393)
* support no reasoning for gemini models

* change none to disable

* remove print statements

* update docs
2025-06-05 00:11:45 -07:00
Sha a301ef873e added gemini url context support (#11351)
* added gemini url context support

* lint issue fix
2025-06-04 23:56:21 -07:00
Krrish Dholakia 26891c23c5 docs: update docs 2025-06-04 11:42:53 -07:00
Krish Dholakia e0fa33f099 UI / SSO - Update proxy admin id role in DB + Handle SSO redirects with custom root path (#11384)
* fix(ui_sso.py): update user as proxy admin in db table, when checking for proxy_admin_id

Fixes issue where an existing internal user was unable to make calls when set as proxy admin id

* fix(utils.py): fix custom base path
2025-06-03 21:16:55 -07:00
Ishaan Jaff 99c91fe41f [Feat]: Performance add DD profiler to monitor python profile of LiteLLM CPU% (#11375)
* feat: add DD profile

* fix: test_should_use_dd_profiler

* docs dd profiler

* docs DD profiler
2025-06-03 12:03:08 -07:00
Krrish Dholakia d5842edf09 docs(vllm.md): add vllm - model list loadbalancing tutorial to docs 2025-06-03 09:38:33 -07:00
AnilAren 2486743904 Doc : Nvidia embedding models (#11352)
* fix: bedrock ai21 jamba models will work now

* Update supported_embedding.md

* Update supported_embedding.md
2025-06-03 09:17:07 -07:00
Ishaan Jaff a366f9247a docs update s3 logger 2025-06-02 21:53:47 -07:00
Ishaan Jaff 3db272b6d2 [Perf] - Add Async + Batched S3 Logging (#11340)
* fix: add s3 v2 async

* fix: add s3 v2 async

* fix: add s3 v2 async

* test: s3 v2 logging

* fixes: s3 logging

* fixes: s3 logging use max upload batch size

* fixes: s3 logging tests

* fixes: s3 logging tests

* fixes: s3 logging tests
2025-06-02 21:52:34 -07:00
Krish Dholakia 00be76abf4 UI - Custom Server Root Path (Multiple Fixes) (#11337)
* fix(proxy_server.py): working swagger on custom base

removes the swagger monkey patch - this seems to render the swagger on custom base paths

* fix(ui/): working custom auth up until login success event

* fix(ui/): working custom server root path for login

* fix(proxy_server.py): create typed dict for ui returned token

allows better documentation of expected params

* refactor(proxy_server.py): refactor all ui login endpoints to use same returned ui token object

* feat(ui_sso.py): add server root path to ui token

* feat(ui_sso.py): allows ui to call correct endpoint

* fix(networking.tsx): update proxy base url with custom root path

* fix(networking.tsx): handle updating proxy base url for non-local instances

* refactor: remove unnecessary references to proxybaseurl in ui code - reduce potential for errors

* fix: fix linting error

* fix(onboarding_link.tsx): fix onboarding link when custom server path is set

* feat(ui_discovery_endpoints.py): add new public .well-known/ route for litellm ui config

returns the server root path and proxy base url for constructing api calls

* feat(_types.py): add litellm well known config as public route

allows ui to query it

* fix(/_types.py): add .well-known config as public route

* fix(page.tsx): create pattern for loading in ui config before making network requests

ensures requests are formatted correctly

* fix(page.tsx): call credential endpoint once ui config is loaded

* fix(page.tsx): route correctly to litellm dashboard from new user login

* fix(page.tsx): remove hardcoded `/litellm` for /sso/key/generate request

* fix(proxy_server.py): re-add moderations endpoint

* fix(proxy_server.py): mount __next__ at / and /litellm

allows it to work when proxy is mounted on root

* docs(contributing.md): remove /ui on ui doc - it will now run on root

* docs(custom_root_ui.md): add docs on custom root path
2025-06-02 17:48:03 -07:00