* fix(caching_routes.py): mask redis password on `/cache/ping` route
* fix(caching_routes.py): fix linting erro
* fix(caching_routes.py): fix linting error on caching routes
* fix: fix test - ignore mask_dict - has a breakpoint
* fix(azure.py): add timeout param + elapsed time in azure timeout error
* fix(http_handler.py): add elapsed time to http timeout request
makes it easier to debug how long request took before failing
* fix(azure.py): ensure max_retries=0 is respected
Fixes https://github.com/BerriAI/litellm/issues/6129
* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0
* test(test_azure_openai.py): add unit testing for azure_text/ route
* fix(azure.py): fix passing max retries on streaming
* fix(azure.py): fix azure max retries on async completion + streaming
* fix(completion/handler.py): fix azure text async completion + streaming
* test(test_azure_openai.py): ensure azure openai max retries always respected
* test(test_azure_o_series.py): add testing to ensure max retries always respected
* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)
* Update model_prices_and_context_window.json
added gemini providers for 2.0-flash and 2.0-flash light
* Update model_prices_and_context_window.json
fixed URL
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Convert tool use arguments to string before counting tokens (#6989)
In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine.
* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing
* build(model_prices_and_context_window.json): add gemini commercial rate limits
* fix(utils.py): fix linting error
* refactor(utils.py): refactor to maintain function size
---------
Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
* feat(handle_jwt.py): initial commit to allow scope based model access
* feat(handle_jwt.py): allow model access based on token scopes
allow admin to control model access from IDP
* test(test_jwt.py): add unit testing for scope based model access
* docs(token_auth.md): add scope based model access to docs
* docs(token_auth.md): update docs
* docs(token_auth.md): update docs
* build: add gemini commercial rate limits
* fix: fix linting error
* fix(.globals.css): revert .md hard set
caused regression in invitation link display (and possibly other places)
* Fix keys not showing on refresh for internal users (#8312)
* [Bug] UI: Newly created key does not display on the View Key Page (#8039)
- Fixed issue where all keys appeared blank for admin users.
- Implemented filtering of data via team settings to ensure all keys are displayed correctly.
* Fix:
- Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`.
- Ensured other teams still follow the original validation rules.
* - added some classes in global.css
- added text wrap in output of request,response and metadata in index.tsx
- fixed styles of table in table.tsx
* - added full payload when we open single log entry
- added Combined Info Card in index.tsx
* fix: keys not showing on refresh for internal user
* fixed user id passed as null when keyuser is you (#8271)
* fix(user_dashboard.tsx): ensure non admin can't view other keys
---------
Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>
Co-authored-by: Jaswanth Karani <karani.jaswanth@gmail.com>
* add initial test for assembly ai
* start using PassthroughEndpointRouter
* migrate to lllm passthrough endpoints
* add assembly ai as a known provider
* fix PassthroughEndpointRouter
* fix set_pass_through_credentials
* working EU request to assembly ai pass through endpoint
* add e2e test assembly
* test_assemblyai_routes_with_bad_api_key
* clean up pass through endpoint router
* e2e testing for assembly ai pass through
* test assembly ai e2e testing
* delete assembly ai models
* fix code quality
* ui working assembly ai api base flow
* fix install assembly ai
* update model call details with kwargs for pass through logging
* fix tracking assembly ai model in response
* _handle_assemblyai_passthrough_logging
* fix test_initialize_deployment_for_pass_through_unsupported_provider
* TestPassthroughEndpointRouter
* _get_assembly_transcript
* fix assembly ai pt logging tests
* fix assemblyai_proxy_route
* fix _get_assembly_region_from_url
* fix(utils.py): handle key error in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* refactor _get_langfuse_input_output_content
* test_langfuse_logging_completion_with_malformed_llm_response
* fix _get_langfuse_input_output_content
* fixes for langfuse linting
* unit testing for get chat/text content for langfuse
* fix _should_raise_content_policy_error
* add gemini-2.0-flash-001 (#8289)
* build(model_prices_and_context_window.json): add gemini-2.0-flash-001 to model cost map
Adds new gemini model with token based pricing to model cost map
---------
Co-authored-by: kushagro <kush@orby.ai>
* fix(columns.tsx): fix request logs team column to indicate the value is the alias not the id
* fix(team_info.tsx): add edit team logic to team info page
* fix(team_info.tsx): re-enable updating team settings on UI
Fixes https://github.com/BerriAI/litellm/issues/8281
* fix(team_info.tsx): fix save changes on team update
* fix(teams.tsx): allow edit button to still act as a quick action button -> drop user into settings page for team
* test(config.yml): run dev ui during testing
make sure no ui regressions are pushed on main
* build: update ci/cd
* ci(config.yml): fix test
* ci: fix ci
* ci: update
* ci: fix
* ci: another attempt to get nvm working in ci/cd
* ci: fix ci
* ci: test update
* ci: test update 2
* ci: test 3
* fix(team_info.tsx): fix linting error
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in
Fixes https://github.com/BerriAI/litellm/issues/8241
* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls
* refactor(gpt_transformation.py): refactor out json schema converstion to base config
keeps logic consistent across providers
* fix(o_series_transformation.py): support o3 mini native streaming
Fixes https://github.com/BerriAI/litellm/issues/8274
* fix(gpt_transformation.py): remove unused variables
* test: update test
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* remove code block upserting master key hash to db
* run test to check if key upserted into db
* run ci/cd again
* litellm_proxy_security_tests
* litellm_proxy_security_tests
* run prisma entrypoint
* ci/cd run again
* fix test master key not in db
* Added a guide for users who want to use LiteLLM with AI/ML.
* Minor changes
* Minor changes
* Fix sidebars.js
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* refactor(deepseek/): move deepseek to base llm http handler
Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457
* fix(gpt_transformation.py): support stream parsing for gpt-like calls
* test(test_deepseek_completion.py): add async streaming test
* fix(gpt_transformation.py): fix import
* fix(gpt_transformation.py): return full api base and content type
* feat(proxy/_types.py): add new jwt field params
allows users + services to auth into proxy
* feat(handle_jwt.py): allow team role proxy access
allows proxy admin to set allowed team roles
* fix(proxy/_types.py): add 'routes' to role based permissions
allow proxy admin to restrict what routes a team can access easily
* feat(handle_jwt.py): support more flexible role based route access
v2 on role based 'allowed_routes'
* test(test_jwt.py): add unit test for rbac for proxy routes
* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`
* docs(token_auth.md): add documentation on controlling model access via OIDC Roles
* test: increase time delay before retrying
* test: handle model overloaded for test