mirror of
https://github.com/tiennm99/litellm.git
synced 2026-07-04 21:08:09 +00:00
docs aiohttp benchmarks
@@ -0,0 +1,34 @@
# LiteLLM v1.71.1 Benchmarks

## Overview

This document presents performance benchmarks comparing LiteLLM v1.71.1, which uses the new aiohttp transport, with prior LiteLLM versions that use httpx.

**Related PR:** [#11097](https://github.com/BerriAI/litellm/pull/11097)
## Testing Methodology

The load test used the following parameters:

- **Request Rate:** 200 RPS (requests per second)
- **User Ramp-Up:** 200 concurrent users
- **Transport Comparison:** httpx (existing) vs. aiohttp (new implementation)
- **Number of LiteLLM Pods/Instances:** 1
- **Machine Specs:** 2 vCPUs, 4 GB RAM
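To make the parameters above concrete, here is a minimal sketch of a closed-loop load generator that reports the same metrics used in this document (RPS, median, 95th and 99th percentile latency). It is not the harness used for these benchmarks, which the source does not describe. `fake_request` is a hypothetical stand-in that only sleeps, and `run_load_test`, `users`, and `duration_s` are names chosen for illustration.

```python
import asyncio
import statistics
import time


async def fake_request(delay_s: float = 0.01) -> None:
    """Hypothetical stand-in for one HTTP call to the proxy (assumption)."""
    await asyncio.sleep(delay_s)


async def user(deadline: float, latencies: list[float]) -> None:
    """One simulated user: send requests back to back until the deadline."""
    while time.perf_counter() < deadline:
        start = time.perf_counter()
        await fake_request()
        latencies.append(time.perf_counter() - start)


async def run_load_test(users: int = 200, duration_s: float = 1.0) -> dict[str, float]:
    """Run `users` concurrent simulated users and summarize the results."""
    latencies: list[float] = []
    started = time.perf_counter()
    deadline = started + duration_s
    await asyncio.gather(*(user(deadline, latencies) for _ in range(users)))
    elapsed = time.perf_counter() - started
    # quantiles(n=100) returns 99 cut points: index 49 is p50, 94 is p95, 98 is p99.
    cuts = statistics.quantiles(latencies, n=100)
    return {
        "requests": float(len(latencies)),
        "rps": len(latencies) / elapsed,
        "median_ms": cuts[49] * 1000,
        "p95_ms": cuts[94] * 1000,
        "p99_ms": cuts[98] * 1000,
    }


if __name__ == "__main__":
    print(asyncio.run(run_load_test()))
```

In a real run, `fake_request` would be replaced by an actual request against the LiteLLM instance, and the numbers would depend on the machine and the upstream endpoint.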
## Benchmark Results

| Metric | httpx (Existing) | aiohttp (LiteLLM v1.71.1) | Improvement | Calculation |
|--------|------------------|---------------------------|-------------|-------------|
| **RPS** | 50.2 | 224 | **+346%** ✅ | (224 - 50.2) / 50.2 × 100 ≈ 346% |
| **Median Latency** | 2,500 ms | 74 ms | **-97%** ✅ | (74 - 2500) / 2500 × 100 ≈ -97% |
| **95th Percentile Latency** | 5,600 ms | 250 ms | **-96%** ✅ | (250 - 5600) / 5600 × 100 ≈ -96% |
| **99th Percentile Latency** | 6,200 ms | 330 ms | **-95%** ✅ | (330 - 6200) / 6200 × 100 ≈ -95% |
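The Calculation column applies the usual relative-change formula, (new - old) / old × 100. As a quick check, the sketch below recomputes each row from the httpx and aiohttp values in the table. The helper name `percent_change` and the `results` dictionary are illustrative, not part of LiteLLM.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# (httpx, aiohttp) pairs copied from the benchmark table.
results = {
    "RPS": (50.2, 224),
    "Median latency (ms)": (2500, 74),
    "p95 latency (ms)": (5600, 250),
    "p99 latency (ms)": (6200, 330),
}

for metric, (old, new) in results.items():
    # Prints +346%, -97%, -96%, -95% in table order.
    print(f"{metric}: {percent_change(old, new):+.0f}%")
```

The table values are rounded to the nearest whole percent; the unrounded figures are about 346.2%, -97.0%, -95.5%, and -94.7%.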
## Key Improvements

- **4.5x increase** in requests per second (from 50.2 to 224 RPS)
- **97% reduction** in median response time (from 2.5 seconds to 74 ms)
- **96% reduction** in 95th percentile latency (from 5.6 seconds to 250 ms)
- **95% reduction** in 99th percentile latency (from 6.2 seconds to 330 ms)
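The "4.5x" figure and the "+346%" figure describe the same RPS result in two forms: a throughput ratio of r corresponds to a percentage increase of (r - 1) × 100. A small sketch of that conversion, using illustrative helper names:

```python
def speedup_factor(old: float, new: float) -> float:
    """Ratio of the new throughput to the old throughput."""
    return new / old


def factor_to_percent_increase(factor: float) -> float:
    """Convert a ratio such as 4.5x into a percentage increase."""
    return (factor - 1) * 100


factor = speedup_factor(50.2, 224)  # RPS: httpx vs. aiohttp
print(f"{factor:.1f}x faster, i.e. {factor_to_percent_increase(factor):+.0f}%")
```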