litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-29 09:08:27 +00:00

Files

Krish Dholakia c42740a4b9 Simplify experimental multi-instance rate limiter - more accurate (#11424 )

* refactor: comment out circuit breaker

causes incorrect rate limiting in high traffic

* fix(base_routing_strategy.py): don't reset value if redis val is lower than current in-memory value

Fixes issue where redis might be trailing in-memory value

* fix(parallel_request_limiter_v2.py): if in-memory higher than redis, don't reset value; add previous slot keys to redis increment to correctly 'get' them

* fix(parallel_request_limiter_v3.py): v3 implementation of parallel request limiter

does not use background redis syncing - increments redis in call

simplify rate limiting logic, to improve accuracy

* fix: fix ruff errors

* fix(parallel_request_limiter_v3.py): don't decrement limit on post call success - causes double decrements

* fix(parallel_request_limiter_v3.py): working accurate multi-instance logic

ensured that exactly 100 requests are allowed with 100 users, a ramp-up of 10, a key with a 100 rpm limit, and 2 instances

* fix(parallel_request_limiter_v3.py): working accurate rate limiting with time window resets

allows rate limiting to work across multiple windows

* test: add unit tests for v3 rate limiter

* fix(parallel_request_limiter_v3.py): return window value into in-memory cache

allows in-memory cache checks to be used correctly

* refactor(parallel_request_limiter_v3.py): refactor rate limiting to work for multiple window/counter key pairs

enables using for user/team/model rate limiting

* feat(parallel_request_limiter_v3.py): working rate limiting, across key/user/team/end-user

* fix(parallel_request_limiter_v3.py): add model specific rate limiting

* fix(parallel_request_limiter_v3.py): ignore if no rate limits set

skip unnecessary rate limit checks if no limits are set

* fix(parallel_request_limiter_v3.py): initial commit bringing token rate limits back

* fix(parallel_request_limiter_v3.py): increment by value in list + update assertions to handle tokens + max parallel requests

* test(parallel_request_limiter_v3.py): more testing

* fix(parallel_request_limiter.py): working in-memory cache limiter

* fix(redis_cache.py): ignore linting error - use safe hasattr

* fix(parallel_request_limiter_v3.py): fix linting error

* refactor: remove redundant parallel_request_limiter_v2.py

old / inaccurate implementation

* test: update tests

* style: cleanup

* test: update test

* docs(config_settings.md): document new env var

* test(test_base_routing_strategy.py): update test

2025-06-07 11:10:55 -07:00
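
The commit message above describes rate limiting over window/counter key pairs with time-window resets, and a rule that an in-memory value is not reset when the Redis value is lower. The following is a minimal sketch of those two ideas under stated assumptions, not the code in parallel_request_limiter_v3.py: the class `FixedWindowLimiter`, the helper `merge_counter`, the key naming scheme, and the check-then-increment order are all hypothetical, and a plain dict stands in for Redis.

```python
import time
from typing import Callable, Dict, List, Tuple


class FixedWindowLimiter:
    """Hypothetical fixed-window limiter; a dict stands in for the shared store."""

    def __init__(self, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.window_seconds = window_seconds
        self.clock = clock
        self.store: Dict[str, float] = {}  # assumption: replaces Redis for illustration

    def _current(self, window_key: str, counter_key: str, now: float) -> float:
        # Start a fresh window (and zero the counter) when none exists
        # or the previous window has expired.
        start = self.store.get(window_key)
        if start is None or now - start >= self.window_seconds:
            self.store[window_key] = now
            self.store[counter_key] = 0
        return self.store[counter_key]

    def allow(self, scopes: List[Tuple[str, int]]) -> bool:
        """Admit a request only if every (scope name, limit) pair is under its limit."""
        now = self.clock()
        pairs = [(f"{name}:window", f"{name}:count", limit) for name, limit in scopes]
        # Check every scope first so a rejected request increments nothing.
        for window_key, counter_key, limit in pairs:
            if self._current(window_key, counter_key, now) >= limit:
                return False
        for _, counter_key, _ in pairs:
            self.store[counter_key] += 1
        return True


def merge_counter(in_memory: int, shared: int) -> int:
    # Keep the higher value so a trailing shared store does not reset local state.
    return max(in_memory, shared)
```

A call such as `FixedWindowLimiter().allow([("key:abc", 100), ("user:u1", 1000)])` would check a key-level and a user-level limit together; the scope names here are illustrative only.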

.litellm_cache

…

example_config_yaml

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_configs

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_model_response_typing

LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590 )

2024-11-07 04:17:05 +05:30

adroit-crow-413218-bc47f303efc9.json

vertex testing use pathrise-convert-1606954137718

2025-01-05 14:00:17 -08:00

azure_fine_tune.jsonl

…

batch_job_results_furniture.jsonl

…

cache_unit_tests.py

(code refactor) - Add BaseRerankConfig. Use BaseRerankConfig for cohere/rerank and azure_ai/rerank (#7319 )

2024-12-19 17:03:34 -08:00

conftest.py

ci(conftest.py): reset conftest.py for local_testing/ (#6657 )

2024-11-08 19:14:16 +05:30

create_mock_standard_logging_payload.py

[Bug Fix]: Errors in LiteLLM When Using Embeddings Model with Usage-Based Routing (#7390 )

2024-12-23 17:42:24 -08:00

data_map.txt

…

eagle.wav

…

example.jsonl

VertexAI non-jsonl file storage support (#9781 )

2025-04-09 14:01:48 -07:00

gettysburg.wav

…

large_text.py

…

model_cost.json

…

openai_batch_completions_router.jsonl

…

openai_batch_completions.jsonl

…

speech_vertex.mp3

…

stream_chunk_testdata.py

…

test_acompletion_fallbacks.py

(core sdk fix) - fix fallbacks stuck in infinite loop (#7751 )

2025-01-13 19:34:34 -08:00

test_acompletion.py

Complete o3 model support (#8183 )

2025-02-02 22:36:37 -08:00

test_acooldowns_router.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_add_function_to_prompt.py

LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590 )

2024-11-07 04:17:05 +05:30

test_add_update_models.py

Allow team admins to add/update/delete models on UI + show api base and model id on request logs (#9572 )

2025-03-27 12:06:31 -07:00

test_aim_guardrails.py

Feat/support anonymize in aim guardrail (#10757 )

2025-05-15 22:18:58 -07:00

test_alangfuse.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_amazing_vertex_completion.py

test: update to handle gemini-flash empty responses

2025-06-06 13:37:29 -07:00

test_anthropic_prompt_caching.py

LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828 )

2025-02-02 23:17:50 -08:00

test_arize_ai.py

Merge branch 'main' into litellm_arize_dynamic_logging

2025-03-18 22:13:35 -07:00

test_arize_phoenix.py

fix arize config tests

2025-05-13 20:21:14 -07:00

test_assistants.py

test assistants fixes

2025-04-19 08:09:45 -07:00

test_async_fn.py

refactor(sagemaker/): separate chat + completion routes + make them b… (#7151 )

2024-12-10 19:40:05 -08:00

test_audio_speech.py

Litellm dev 2024 12 20 p1 (#7335 )

2024-12-20 21:22:31 -08:00

test_auth_utils.py

Fix: Respect user_header_name property for budget selection and user identification (#11419 )

2025-06-06 14:21:02 -07:00

test_azure_content_safety.py

(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208 )

2024-10-14 16:34:01 +05:30

test_azure_openai.py

test: fixes

2025-05-31 12:42:56 -07:00

test_azure_perf.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_bad_params.py

test_completion_invalid_param_cohere

2025-04-02 06:49:11 -07:00

test_basic_python_version.py

Litellm dev 01 10 2025 p2 (#7679 )

2025-01-10 21:50:53 -08:00

test_batch_completion_return_exceptions.py

…

test_batch_completions.py

Litellm dev contributor prs 01 31 2025 (#8168 )

2025-02-01 09:05:20 -08:00

test_blocked_user_list.py

(docs) add docstrings for all /key, /user, /team, /customer endpoints (#6804 )

2024-11-18 19:44:06 -08:00

test_braintrust.py

Litellm dev 01 07 2025 p3 (#7635 )

2025-01-08 11:46:24 -08:00

test_budget_manager.py

Litellm ruff linting enforcement (#5992 )

2024-10-01 19:44:20 -04:00

test_caching_handler.py

test(test_caching_handler.py): move to in-memory cache - prevent redis flakiness from impacting ci/cd

2025-03-28 13:32:04 -07:00

test_caching_ssl.py

test: update tests

2025-05-20 13:08:47 -07:00

test_caching.py

Simplify experimental multi-instance rate limiter - more accurate (#11424 )

2025-06-07 11:10:55 -07:00

test_class.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_completion_cost.py

test: more testing fixes

2025-05-01 15:36:13 -07:00

test_completion_with_retries.py

fix(main.py): fix retries being multiplied when using openai sdk (#7221 )

2024-12-14 11:56:55 -08:00

test_completion.py

test_lm_studio_completion

2025-06-06 20:41:00 -07:00

test_config.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_cost_calc.py

test(test_cost_calc.py): fix test to handle llm api errors

2024-12-24 16:49:02 -08:00

test_custom_api_logger.py

…

test_custom_callback_input.py

Update fireworks ai pricing (#10425 )

2025-04-29 20:58:05 -07:00

test_custom_callback_router.py

Litellm dev 04 30 2025 p1 (#10462 )

2025-04-30 22:11:12 -07:00

test_custom_llm.py

feat: add embeddings to CustomLLM (#10980 )

2025-05-22 22:55:46 -07:00

test_custom_logger.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_disk_cache_unit_tests.py

LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705 )

2024-11-12 22:50:51 +05:30

test_dual_cache.py

(code refactor) - Add BaseRerankConfig. Use BaseRerankConfig for cohere/rerank and azure_ai/rerank (#7319 )

2024-12-19 17:03:34 -08:00

test_dynamic_rate_limit_handler.py

LiteLLM Minor Fixes & Improvements (10/15/2024) (#6242 )

2024-10-16 07:32:06 -07:00

test_dynamodb_logs.py

…

test_embedding.py

Integration with Nebius AI Studio added (#11143 )

2025-05-27 11:05:22 -07:00

test_exceptions.py

Add ExceptionCheckers class for improved error string detection

2025-06-05 17:15:53 -06:00

test_file_types.py

…

test_function_call_parsing.py

…

test_function_calling.py

test: handle internal server errors

2025-05-01 16:47:30 -07:00

test_function_setup.py

…

test_gcs_bucket.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_get_llm_provider.py

[Feat] Option to force/always use the litellm proxy (#10559 ) (#10633 ) (#10773 )

2025-05-12 20:22:54 -07:00

test_get_model_file.py

LiteLLM Minor Fixes & Improvements (10/05/2024) (#6083 )

2024-10-05 18:59:11 -04:00

test_get_model_info.py

[Feature] Add supports_computer_use to the model list (#10881 )

2025-05-20 17:07:43 -07:00

test_get_optional_params_embeddings.py

…

test_get_optional_params_functions_not_supported.py

…

test_google_ai_studio_gemini.py

…

test_guardrails_ai.py

LiteLLM Minor Fixes & Improvements (10/15/2024) (#6242 )

2024-10-16 07:32:06 -07:00

test_health_check.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_helicone_integration.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_http_parsing_utils.py

(Bug fix) - reading /parsing request body when on hypercorn (#8734 )

2025-02-25 15:18:04 -08:00

test_img_resize.py

fix: Support WebP image format and avoid token calculation error (#7182 )

2024-12-12 14:32:39 -08:00

test_lakera_ai_prompt_injection.py

Merge pull request #9222 from BerriAI/litellm_snowflake_pr_mar_13

2025-03-13 21:35:39 -07:00

test_langchain_ChatLiteLLM.py

…

test_langsmith.py

Litellm dev 11 30 2024 (#6974 )

2024-12-02 21:03:33 -08:00

test_least_busy_routing.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_litellm_max_budget.py

…

test_literalai.py

Litellm Minor Fixes & Improvements (10/03/2024) (#6049 )

2024-10-03 18:02:28 -04:00

test_llm_guard.py

[Refactor] Move LLM Guard, Secret Detection to Enterprise Pip package (#10782 )

2025-05-13 09:42:22 -07:00

test_load_test_router_s3.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_loadtest_router.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_logfire.py

…

test_logging.py

LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590 )

2024-11-07 04:17:05 +05:30

test_longer_context_fallback.py

…

test_lowest_cost_routing.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_lowest_latency_routing.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_lunary.py

…

test_max_tpm_rpm_limiter.py

(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208 )

2024-10-14 16:34:01 +05:30

test_mem_leak.py

LiteLLM Minor Fixes & Improvements (10/30/2024) (#6519 )

2024-11-02 00:44:32 +05:30

test_mem_usage.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_mock_request.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_model_alias_map.py

test: fix test

2025-04-16 07:57:10 -07:00

test_model_max_token_adjust.py

…

test_multiple_deployments.py

…

test_ollama_local_chat.py

…

test_ollama_local.py

…

test_ollama.py

[Fixes] Aiohttp transport fixes - add handling for aiohttp.ClientPayloadError and ssl_verification settings (#11162 )

2025-05-26 21:14:35 -07:00

test_openai_moderations_hook.py

(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208 )

2024-10-14 16:34:01 +05:30

test_opik.py

[Feat] Observability integration - Opik by Comet (#6062 )

2024-10-10 18:27:50 +05:30

test_parallel_request_limiter.py

Add all /key/generate api params to UI + add metadata fields on team AND org add/update (#8667 )

2025-02-19 21:13:06 -08:00

test_pass_through_endpoints.py

oops

2025-03-11 08:27:36 -04:00

test_profiling_router.py

…

test_prometheus_service.py

Embedding caching fixes - handle str -> list cache, set usage tokens for cache hits, combine usage tokens on partial cache hits (#10424 )

2025-04-29 21:21:28 -07:00

test_prometheus.py

done

2025-01-22 20:19:31 +09:00

test_prompt_caching.py

LiteLLM Minor Fixes & Improvements (12/05/2024) (#7037 )

2024-12-05 00:02:31 -08:00

test_prompt_injection_detection.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_promptlayer_integration.py

LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590 )

2024-11-07 04:17:05 +05:30

test_provider_specific_config.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_pydantic_namespaces.py

…

test_pydantic.py

…

test_register_model.py

…

test_router_batch_completion.py

(code quality) run ruff rule to ban unused imports (#7313 )

2024-12-19 12:33:42 -08:00

test_router_budget_limiter.py

test: update tests

2025-05-20 13:08:47 -07:00

test_router_caching.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_client_init.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_cooldowns.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_custom_routing.py

…

test_router_debug_logs.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_fallback_handlers.py

(Feat) - return x-litellm-attempted-fallbacks in responses from litellm proxy (#8558 )

2025-02-15 14:54:23 -08:00

test_router_fallbacks.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_get_deployments.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_init.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_max_parallel_requests.py

fix(lowest_tpm_rpm_routing.py): fix parallel rate limit check (#6577 )

2024-11-05 22:03:44 +05:30

test_router_pattern_matching.py

(code quality) run ruff rule to ban unused imports (#7313 )

2024-12-19 12:33:42 -08:00

test_router_policy_violation.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_retries.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_tag_routing.py

build: merge litellm_dev_03_01_2025_p2

2025-03-03 23:05:41 -08:00

test_router_timeout.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_utils.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_router_with_fallbacks.py

…

test_router.py

Litellm staging 05 10 2025 - openai pdf url support + sagemaker chat content length error fix (#10724 )

2025-05-10 17:41:57 -07:00

test_rules.py

Litellm ruff linting enforcement (#5992 )

2024-10-01 19:44:20 -04:00

test_sagemaker.py

test: mock sagemaker tests

2025-03-21 16:21:18 -07:00

test_scheduler.py

…

test_secret_detect_hook.py

[Refactor] Move LLM Guard, Secret Detection to Enterprise Pip package (#10782 )

2025-05-13 09:42:22 -07:00

test_simple_shuffle.py

…

test_spend_calculate_endpoint.py

…

test_stream_chunk_builder.py

test: update groq test - change on their end

2025-05-22 15:02:01 -07:00

test_streaming.py

Merge in - Gemini streaming - thinking content parsing - return in reasoning_content (#11298 )

2025-06-02 23:14:38 -07:00

test_supabase_integration.py

Litellm ruff linting enforcement (#5992 )

2024-10-01 19:44:20 -04:00

test_team_config.py

…

test_text_completion.py

Add key-level multi-instance tpm/rpm/max parallel request limiting (#10458 )

2025-04-30 21:32:31 -07:00

test_timeout.py

Update fireworks ai pricing (#10425 )

2025-04-29 20:58:05 -07:00

test_together_ai.py

…

test_tpm_rpm_routing_v2.py

test: update tests to new deployment model (#10142 )

2025-04-18 14:22:12 -07:00

test_traceloop.py

test: skip redundant test

2025-02-10 22:13:58 -08:00

test_ui_sso_helper_utils.py

LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293 )

2024-10-17 22:09:11 -07:00

test_unit_test_caching.py

(Bug fix) - don't log messages in model_parameters in StandardLoggingPayload (#8932 )

2025-03-01 13:39:45 -08:00

test_update_spend.py

test_batch_update_spend

2025-04-01 07:12:29 -07:00

test_validate_environment.py

…

test_wandb.py

LiteLLM Minor Fixes & Improvements (11/05/2024) (#6590 )

2024-11-07 04:17:05 +05:30

test_whisper.py

refactor: update model handling in Azure and OpenAI audio transcription classes (#11333 )

2025-06-02 16:25:51 -07:00

user_cost.json

…

vertex_ai.jsonl

…

vertex_batch_completions.jsonl

(feat) add Vertex Batches API support in OpenAI format (#7032 )

2024-12-04 19:40:28 -08:00

vertex_key.json

ci/cd update vertex acct

2025-01-05 13:43:32 -08:00

whitelisted_bedrock_models.txt

Add supports_pdf_input: true to Claude 3.7 bedrock models (#9917 )

2025-05-01 14:56:54 -07:00