* fix: lazy load utils.py imports
Lazy-load most functions and response types from utils.py to avoid loading
tiktoken and other heavy dependencies at import time. This significantly
reduces memory usage when importing completion from litellm.
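
A minimal sketch of the lazy-loading idea, assuming a module-level `__getattr__` hook (PEP 562). The module name `lazy_utils`, the helper `make_lazy_module`, and the stdlib targets in the mapping are hypothetical stand-ins for illustration; they are not the names used in utils.py.

```python
import importlib
import types

# Hypothetical mapping: public name -> (module that defines it, attribute name).
# Stdlib targets stand in for heavy dependencies such as tiktoken.
_LAZY_ATTRS = {
    "dumps": ("json", "dumps"),
    "Decimal": ("decimal", "Decimal"),
}


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache the resolved object so later lookups skip this hook.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    return module


lazy_utils = make_lazy_module("lazy_utils", _LAZY_ATTRS)
```

Nothing in the mapping is imported until an attribute such as `lazy_utils.dumps` is first read, which is the property that keeps import-time memory low.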
* fix: prevent memory leak in aiohttp connection pooling
Add connection limits to aiohttp TCPConnector to prevent unbounded
connection growth that causes memory leaks. Without these limits,
aiohttp's _wrap_create_connection can accumulate connections
indefinitely in long-running processes.
Changes:
- Set default limit of 300 total connections and 50 per host
- Apply limits to shared proxy session initialization
- Apply limits to HTTP handler transport creation
- Configurable via AIOHTTP_CONNECTOR_LIMIT and
  AIOHTTP_CONNECTOR_LIMIT_PER_HOST environment variables
- Set either variable to 0 for unlimited connections (not recommended
  for production)
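
The changes above can be sketched roughly as follows. The helper name `connector_limits` and its parsing rules are assumptions made for illustration; only the environment variable names and the defaults of 300 and 50 come from this commit.

```python
import os

DEFAULT_LIMIT = 300          # total connections, per this commit
DEFAULT_LIMIT_PER_HOST = 50  # connections per host, per this commit


def connector_limits(env=None):
    """Return keyword arguments for a connector, read from the environment.

    Hypothetical helper: unset or empty variables fall back to the defaults,
    and 0 is passed through unchanged (meaning "unlimited").
    """
    env = os.environ if env is None else env

    def read(name, default):
        raw = env.get(name)
        if raw is None or raw.strip() == "":
            return default
        value = int(raw)
        if value < 0:
            raise ValueError(f"{name} must be >= 0, got {value}")
        return value

    return {
        "limit": read("AIOHTTP_CONNECTOR_LIMIT", DEFAULT_LIMIT),
        "limit_per_host": read(
            "AIOHTTP_CONNECTOR_LIMIT_PER_HOST", DEFAULT_LIMIT_PER_HOST
        ),
    }
```

The result could be passed as `aiohttp.TCPConnector(**connector_limits())`; `limit` and `limit_per_host` are existing TCPConnector parameters, and aiohttp treats 0 as no limit.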
This fix covers:
- All standard LLM provider API calls (OpenAI, Anthropic, etc.)
- Proxy server shared session
- Most guardrail HTTP calls
Impact: Prevents memory exhaustion in high-traffic deployments and
long-running proxy servers that make thousands of API calls.
Testing: Verified connection limits are applied correctly and
existing functionality remains unchanged.
Add Agent Lightning, Microsoft's open-source framework for training
AI agents with RL, APO, and SFT. Uses LiteLLM Proxy for LLM routing
and trace collection.
* fix(initial-commit): add a way to get the right response type based on the API route
* feat(unified_guardrail.py): support streaming guardrails
* test: update tests
* fix: fix linting errors
* test: update tests
* test: add failing tests for organization budget enforcement bug
Add comprehensive tests exposing that organization-level budgets are
retrieved but never enforced during request authentication. Tests verify:
1. Basic org budget exceeded scenario (team under budget, org over)
2. Multiple teams collectively exceeding org budget
3. Organization budget fields exist but are never checked
4. Inconsistency between team budget enforcement (works) and org (doesn't)
Tests intentionally fail to document the bug. Will be fixed in next commit.
Related to organization_max_budget not being enforced in auth_checks.py
* fix: enforce organization budget in auth checks
Add organization budget enforcement to common_checks() in auth_checks.py.
Previously, organization_max_budget was retrieved from DB but never checked,
allowing teams to collectively exceed their organization's budget limit.
Changes:
- Add _organization_max_budget_check() function following team budget pattern
- Call org budget check after team budget check in common_checks()
- Add "organization_budget" to budget_alerts type literals
- Update tests to verify org budget is enforced
Budget hierarchy is now properly enforced:
Organization Budget (hard ceiling)
  └─ Team Budget (sub-allocation)
      └─ Team Member Budget (per-user within team)
          └─ Key Budget (per-key)
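
A minimal sketch of the hierarchy check described above, not the code in auth_checks.py. `BudgetScope`, `BudgetExceededError`, and `enforce_budgets` are hypothetical names, and the strict `spend > max_budget` comparison is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


class BudgetExceededError(Exception):
    """Raised when a scope has spent more than its budget allows."""

    def __init__(self, scope: str, spend: float, max_budget: float):
        super().__init__(
            f"{scope} budget exceeded: spend={spend}, max_budget={max_budget}"
        )
        self.scope = scope


@dataclass
class BudgetScope:
    name: str
    spend: float
    max_budget: Optional[float] = None  # None means no budget is configured


def enforce_budgets(scopes: Iterable[BudgetScope]) -> None:
    """Check each scope in order and raise on the first one over budget."""
    for scope in scopes:
        if scope.max_budget is not None and scope.spend > scope.max_budget:
            raise BudgetExceededError(scope.name, scope.spend, scope.max_budget)


# Organization spend is the combined spend of its teams, so a team that is
# under its own budget can still be rejected by the organization ceiling.
organization = BudgetScope("organization", spend=120.0, max_budget=100.0)
team = BudgetScope("team", spend=30.0, max_budget=50.0)
```

Calling `enforce_budgets([team, organization])` raises for the organization scope even though the team is within its own limit, which is the case the failing tests were written to expose.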
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: add organization_id to budget alerts, fix enum comparison and linting of newly added code
- Add organization_id field to CallInfo class for better alert context
- Include organization_id in budget alerts (token, soft, team, org)
- Fix event_group enum comparison (was comparing enum to string)
- Add OrganizationBudgetAlert class for organization budget alerting
- Add organization_budget to test parameterizations
- Apply Black formatting to slack_alerting.py
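
The enum comparison bug can be illustrated with a small sketch. `EventGroup` is a hypothetical plain `Enum`, assumed here in place of the real event_group type; a `str`-mixin enum would behave differently.

```python
from enum import Enum


class EventGroup(Enum):
    # Hypothetical members for illustration only.
    TEAM = "team"
    ORGANIZATION = "organization"


event_group = EventGroup.ORGANIZATION

# Buggy form: a plain Enum member never equals a bare string,
# so this branch is silently skipped.
buggy_match = event_group == "organization"

# Fixed forms: compare member to member, or compare the underlying value.
member_match = event_group == EventGroup.ORGANIZATION
value_match = event_group.value == "organization"
```

Here `buggy_match` is False while `member_match` and `value_match` are True.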
Both frameworks integrate with LiteLLM:
- Google ADK uses LiteLLM for model-agnostic agent building
- Harbor uses LiteLLM for agent evaluation across providers
* docs: update getting started page
- Add Core Functions table with link to full list
- Add Responses API section
- Add Async section with acompletion() example
- Add "Switch Providers with One Line" example
- Clarify Basic Usage supports multiple endpoints
- Update models to current versions (openai/gpt-4o, anthropic/claude-sonnet-4)
- Use provider/model format throughout
- Fix deprecated import: from openai.error -> from openai
- Keep original structure: community key, More details links, observability env vars
* Cleanup: Remove orphan docs pages and Docusaurus template files
- Remove orphan getting_started.md (not linked in sidebar)
- Remove Docusaurus template intro.md
- Remove tutorial-basics/ directory (Docusaurus template)
- Remove tutorial-extras/ directory (Docusaurus template)
Replace Pydantic v1 `.dict()` method with v2 `.model_dump()` to fix
PydanticDeprecatedSince20 warnings. The `.dict()` method is deprecated
in Pydantic v2 and will be removed in v3.
Fixes #5987
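
For code that has to accept both v1-style and v2-style models, one possible approach is a small duck-typed helper. This is a sketch of a different technique from the direct replacement made in this commit; `to_plain_dict` and the two stub classes are hypothetical and stand in for real Pydantic models so the example needs no third-party install.

```python
def to_plain_dict(model):
    """Prefer the v2-style model_dump(); fall back to the v1-style dict()."""
    dump = getattr(model, "model_dump", None)
    if callable(dump):
        return dump()
    return model.dict()


class V2StyleModel:
    # Stub exposing only the Pydantic v2 method name.
    def model_dump(self):
        return {"model": "gpt-4o", "api": "v2"}


class V1StyleModel:
    # Stub exposing only the Pydantic v1 method name.
    def dict(self):
        return {"model": "gpt-4o", "api": "v1"}


v2_result = to_plain_dict(V2StyleModel())
v1_result = to_plain_dict(V1StyleModel())
```

With a real Pydantic v2 model the helper takes the `model_dump()` branch, so the deprecated `.dict()` path is never hit and no PydanticDeprecatedSince20 warning is emitted.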
* fix: conditionally pass enable_cleanup_closed to aiohttp TCPConnector
Fixes deprecation warning on Python 3.12.7+ and 3.13.1+ where
enable_cleanup_closed is no longer needed since the underlying
CPython SSL connection leak bug was fixed.
See: https://github.com/python/cpython/pull/118960
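
A minimal sketch of the version gate, assuming the version boundaries stated above. The function name `needs_cleanup_closed` is hypothetical, and the treatment of 3.13.0 as still affected follows from the "3.13.1+" wording.

```python
import sys


def needs_cleanup_closed(version_info=None):
    """Return True if this Python still needs enable_cleanup_closed.

    Assumption: the CPython fix is present in 3.12.7+ on the 3.12 line
    and in 3.13.1+ on the 3.13 line, so everything else is affected.
    """
    v = tuple(sys.version_info[:3]) if version_info is None else tuple(version_info)
    return v < (3, 12, 7) or (3, 13, 0) <= v < (3, 13, 1)


def connector_kwargs(version_info=None):
    """Only include enable_cleanup_closed on interpreters that need it."""
    kwargs = {}
    if needs_cleanup_closed(version_info):
        kwargs["enable_cleanup_closed"] = True
    return kwargs
```

The returned dict could be merged into the arguments for `aiohttp.TCPConnector`; omitting the flag on fixed interpreters is what avoids the deprecation warning.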
* chore: add aiohttp source reference to AIOHTTP_NEEDS_CLEANUP_CLOSED