# INSTRUCTIONS FOR LITELLM
This document provides comprehensive instructions for AI agents working in the LiteLLM repository.
## OVERVIEW
LiteLLM is a unified interface for 100+ LLMs that:
- Translates inputs to provider-specific completion, embedding, and image generation endpoints
- Provides consistent OpenAI-format output across all providers
- Includes retry/fallback logic across multiple deployments (Router)
- Offers a proxy server (LLM Gateway) with budgets, rate limits, and authentication
- Supports advanced features like function calling, streaming, caching, and observability
## REPOSITORY STRUCTURE
### Core Components
- `litellm/` - Main library code
  - `llms/` - Provider-specific implementations (OpenAI, Anthropic, Azure, etc.)
  - `proxy/` - Proxy server implementation (LLM Gateway)
  - `router_utils/` - Load balancing and fallback logic
  - `types/` - Type definitions and schemas
  - `integrations/` - Third-party integrations (observability, caching, etc.)
### Key Directories
- `tests/` - Comprehensive test suites
- `docs/my-website/` - Documentation website
- `ui/litellm-dashboard/` - Admin dashboard UI
- `enterprise/` - Enterprise-specific features
## DEVELOPMENT GUIDELINES
### MAKING CODE CHANGES
1. **Provider Implementations**: When adding/modifying LLM providers:
   - Follow existing patterns in `litellm/llms/{provider}/`
   - Implement proper transformation classes that inherit from `BaseConfig`
   - Support both sync and async operations
   - Handle streaming responses appropriately
   - Include proper error handling with provider-specific exceptions
2. **Type Safety**:
   - Use proper type hints throughout
   - Update type definitions in `litellm/types/`
   - Ensure compatibility with both Pydantic v1 and v2
3. **Testing**:
   - Add tests in appropriate `tests/` subdirectories
   - Include both unit tests and integration tests
   - Test provider-specific functionality thoroughly
   - Consider adding load tests for performance-critical changes
### MAKING CODE CHANGES FOR THE UI (IGNORE FOR BACKEND)
1. **Tremor is DEPRECATED; do not use Tremor components in new features/changes**
   - The only exception is the Tremor Table component and its required Tremor Table subcomponents.
2. **Use Common Components as much as possible**:
   - These are usually defined in the `common_components` directory
   - Use these components as much as possible and avoid building new components unless needed
3. **Testing**:
   - The codebase uses **Vitest** and **React Testing Library**
   - **Query Priority Order**: Use query methods in this order: `getByRole`, `getByLabelText`, `getByPlaceholderText`, `getByText`, `getByTestId`
   - **Always use `screen`** instead of destructuring from `render()` (e.g., use `screen.getByText()` not `getByText`)
   - **Wrap user interactions in `act()`**: Always wrap `fireEvent` calls with `act()` to ensure React state updates are properly handled
   - **Use `query` methods for absence checks**: Use `queryBy*` methods (not `getBy*`) when expecting an element to NOT be present
   - **Test names must start with "should"**: All test names should follow the pattern `it("should ...")`
   - **Mock external dependencies**: Check `setupTests.ts` for global mocks and mock child components/networking calls as needed
   - **Structure tests properly**:
     - First test should verify the component renders successfully
     - Subsequent tests should focus on functionality and user interactions
     - Use `waitFor` for async operations that aren't already awaited
   - **Avoid using `querySelector`**: Prefer React Testing Library queries over direct DOM manipulation
### IMPORTANT PATTERNS
1. **Function/Tool Calling**:
   - LiteLLM standardizes tool calling across providers
   - OpenAI format is the standard, with transformations for other providers
   - See `litellm/llms/anthropic/chat/transformation.py` for complex tool handling
2. **Streaming**:
   - All providers should support streaming where possible
   - Use consistent chunk formatting across providers
   - Handle both sync and async streaming
3. **Error Handling**:
   - Use provider-specific exception classes
   - Maintain consistent error formats across providers
   - Include proper retry logic and fallback mechanisms
4. **Configuration**:
   - Support both environment variables and programmatic configuration
   - Use `BaseConfig` classes for provider configurations
   - Allow dynamic parameter passing
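
The configuration pattern above (programmatic values alongside environment variables) can be sketched roughly as follows. This is a minimal illustration, not LiteLLM's actual resolution code: `resolve_api_key` and the `EXAMPLE_PROVIDER_API_KEY` variable name are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    env_var: str = "EXAMPLE_PROVIDER_API_KEY",  # hypothetical variable name
) -> str:
    """Return the explicitly passed key if present, else fall back to the environment."""
    source = os.environ if env is None else env
    resolved = api_key or source.get(env_var)
    if not resolved:
        raise ValueError(f"No API key passed and {env_var} is not set")
    return resolved
```

Passing `env` explicitly keeps the helper easy to test without mutating `os.environ`; a value passed by the caller always wins over the environment.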
## PROXY SERVER (LLM GATEWAY)
The proxy server is a critical component that provides:
- Authentication and authorization
- Rate limiting and budget management
- Load balancing across multiple models/deployments
- Observability and logging
- Admin dashboard UI
- Enterprise features

Key files:
- `litellm/proxy/proxy_server.py` - Main server implementation
- `litellm/proxy/auth/` - Authentication logic
- `litellm/proxy/management_endpoints/` - Admin API endpoints

**Database (proxy)**: Use Prisma model methods (`prisma_client.db.<model>.upsert`, `.find_many`, `.find_unique`, etc.), not raw SQL (`execute_raw`/`query_raw`). See COMMON PITFALLS for details.
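
A rough sketch of why model methods are easier to test than raw SQL: the call below goes through `prisma_client.db.litellm_tooltable.upsert`, so a test can swap in a mock and assert on structured arguments. The helper name `upsert_tool` and the field names (`tool_name`, `description`) are assumptions made for illustration; check the Prisma schema for the real columns.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def upsert_tool(prisma_client, tool_name: str, description: str):
    """Hypothetical helper: write through the Prisma model instead of raw SQL."""
    return await prisma_client.db.litellm_tooltable.upsert(
        where={"tool_name": tool_name},  # field names are assumed, not from the schema
        data={
            "create": {"tool_name": tool_name, "description": description},
            "update": {"description": description},
        },
    )


# In a test, the model method is replaced by an AsyncMock; no database is needed.
fake_client = MagicMock()
fake_client.db.litellm_tooltable.upsert = AsyncMock(return_value={"tool_name": "search"})

result = asyncio.run(upsert_tool(fake_client, "search", "Web search tool"))
```

With `execute_raw`, the same test would have to compare SQL strings and positional parameters, which is where ordering and casting mistakes tend to hide.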
## MCP (MODEL CONTEXT PROTOCOL) SUPPORT
LiteLLM supports MCP for agent workflows:
- MCP server integration for tool calling
- Transformation between OpenAI and MCP tool formats
- Support for external MCP servers (Zapier, Jira, Linear, etc.)
- See `litellm/experimental_mcp_client/` and `litellm/proxy/_experimental/mcp_server/`
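
To make the OpenAI/MCP tool transformation concrete, here is a minimal sketch of one direction of the mapping. It assumes the MCP tool shape (`name`, `description`, `inputSchema`) and the OpenAI chat-completions tool shape (`type: "function"` with a nested `function` object). The function name `mcp_tool_to_openai_tool` is hypothetical, and the helpers in `litellm/experimental_mcp_client/` may be structured differently.

```python
from typing import Any, Dict


def mcp_tool_to_openai_tool(mcp_tool: Dict[str, Any]) -> Dict[str, Any]:
    """Map an MCP tool definition onto the OpenAI chat-completions tool shape."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # MCP names the JSON Schema "inputSchema"; OpenAI names it "parameters".
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


example = mcp_tool_to_openai_tool(
    {
        "name": "create_issue",
        "description": "Create a ticket",
        "inputSchema": {
            "type": "object",
            "properties": {"title": {"type": "string"}},
            "required": ["title"],
        },
    }
)
```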
## RUNNING SCRIPTS
Use `poetry run python script.py` to run Python scripts in the project environment (for non-test files).
## GITHUB TEMPLATES
When opening issues or pull requests, follow these templates:
### Bug Reports (`.github/ISSUE_TEMPLATE/bug_report.yml`)
- Describe what happened vs. expected behavior
- Include relevant log output
- Specify LiteLLM version
- Indicate if you're part of an ML Ops team (helps with prioritization)
### Feature Requests (`.github/ISSUE_TEMPLATE/feature_request.yml`)
- Clearly describe the feature
- Explain motivation and use case with concrete examples
### Pull Requests (`.github/pull_request_template.md`)
- Add at least 1 test in `tests/litellm/`
- Ensure `make test-unit` passes
## TESTING CONSIDERATIONS
1. **Provider Tests**: Test against real provider APIs when possible
2. **Proxy Tests**: Include authentication, rate limiting, and routing tests
3. **Performance Tests**: Load testing for high-throughput scenarios
4. **Integration Tests**: End-to-end workflows including tool calling
## DOCUMENTATION
- Keep documentation in sync with code changes
- Update provider documentation when adding new providers
- Include code examples for new features
- Update changelog and release notes
## SECURITY CONSIDERATIONS
- Handle API keys securely
- Validate all inputs, especially for proxy endpoints
- Consider rate limiting and abuse prevention
- Follow security best practices for authentication
## ENTERPRISE FEATURES
- Some features are enterprise-only
- Check `enterprise/` directory for enterprise-specific code
- Maintain compatibility between open-source and enterprise versions
## COMMON PITFALLS TO AVOID
1. **Breaking Changes**: LiteLLM has many users - avoid breaking existing APIs
2. **Provider Specifics**: Each provider has unique quirks - handle them properly
3. **Rate Limits**: Respect provider rate limits in tests
4. **Memory Usage**: Be mindful of memory usage in streaming scenarios
5. **Dependencies**: Keep dependencies minimal and well-justified
6. **UI/Backend Contract Mismatch**: When adding a new entity type to the UI, always check whether the backend endpoint accepts a single value or an array. Match the UI control accordingly (single-select vs. multi-select) to avoid silently dropping user selections
7. **Missing Tests for New Entity Types**: When adding a new entity type (e.g., in `EntityUsage`, `UsageViewSelect`), always add corresponding tests in the existing test files and update any icon/component mocks
8. **Raw SQL in proxy DB code**: Do not use `execute_raw` or `query_raw` for proxy database access. Use Prisma model methods (e.g. `prisma_client.db.litellm_tooltable.upsert()`, `.find_many()`, `.find_unique()`) so behavior stays consistent with the schema, the client stays mockable in tests, and you avoid the pitfalls of hand-written SQL (parameter ordering, type casting, schema drift)
9. **Do not hardcode model-specific flags**: Put model-specific capability flags in `model_prices_and_context_window.json` and read them via `get_model_info` (or existing helpers like `supports_reasoning`). This prevents users from needing to upgrade LiteLLM each time a new model supports a feature.
**Example of BAD** (hardcoded model checks):
```python
@staticmethod
def _is_effort_supported_model(model: str) -> bool:
    """Check if the model supports the output_config.effort parameter..."""
    model_lower = model.lower()
    if AnthropicConfig._is_claude_4_6_model(model):
        return True
    return any(
        v in model_lower for v in ("opus-4-5", "opus_4_5", "opus-4.5", "opus_4.5")
    )
```
**Example of GOOD** (config-driven or helper that reads from config):
```python
if (
    "claude-3-7-sonnet" in model
    or AnthropicConfig._is_claude_4_6_model(model)
    or supports_reasoning(
        model=model,
        custom_llm_provider=self.custom_llm_provider,
    )
):
    ...
```
Using helpers like `supports_reasoning` (which read from `model_prices_and_context_window.json` / `get_model_info`) allows future model updates to "just work" without code changes.
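
For intuition, the config-driven lookup can be reduced to the stand-alone sketch below. It is not LiteLLM's implementation: the inline JSON is a simplified stand-in for `model_prices_and_context_window.json`, the model names are made up, and `model_supports` is a hypothetical helper playing the role of `supports_reasoning` / `get_model_info`.

```python
import json

# Simplified stand-in for entries in model_prices_and_context_window.json.
# Model names here are illustrative only.
MODEL_INFO_JSON = """
{
  "example-model-a": {"supports_reasoning": true},
  "example-model-b": {"supports_reasoning": false}
}
"""

MODEL_INFO = json.loads(MODEL_INFO_JSON)


def model_supports(model: str, flag: str) -> bool:
    """Look up a capability flag; unknown models or flags default to False."""
    return bool(MODEL_INFO.get(model, {}).get(flag, False))
```

Adding support for a new model then means adding a JSON entry, with no change to the calling code.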
10. **Never close HTTP/SDK clients on cache eviction**: Do not add `close()`, `aclose()`, or `create_task(close_fn())` inside `LLMClientCache._remove_key()` or any cache eviction path. Evicted clients may still be held by in-flight requests; closing them causes `RuntimeError: Cannot send a request, as the client has been closed.` in production after the cache TTL (1 hour) expires. Connection cleanup is handled at shutdown by `close_litellm_async_clients()`. See PR #22247 for the full incident history.
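
The eviction rule above can be illustrated with a toy cache. `FakeClient` and `SketchClientCache` are hypothetical stand-ins, not LiteLLM classes; the point is only that eviction drops the cache's reference and leaves the client open for any request still holding it.

```python
from typing import Dict, Optional


class FakeClient:
    """Stand-in for an HTTP/SDK client; not a real LiteLLM class."""

    def __init__(self) -> None:
        self.closed = False

    def send(self, payload: str) -> str:
        if self.closed:
            raise RuntimeError("Cannot send a request, as the client has been closed.")
        return f"sent:{payload}"

    def close(self) -> None:
        self.closed = True


class SketchClientCache:
    """Toy cache: eviction removes the entry and does nothing else."""

    def __init__(self) -> None:
        self._store: Dict[str, FakeClient] = {}

    def set(self, key: str, client: FakeClient) -> None:
        self._store[key] = client

    def get(self, key: str) -> Optional[FakeClient]:
        return self._store.get(key)

    def remove_key(self, key: str) -> None:
        # Deliberately no close() here: a request may still hold this client.
        self._store.pop(key, None)


cache = SketchClientCache()
cache.set("openai", FakeClient())
in_flight = cache.get("openai")      # a request grabs the client
cache.remove_key("openai")           # TTL expiry evicts it mid-request
response = in_flight.send("hello")   # still works: eviction did not close it
```

Had `remove_key` called `close()`, the final `send` would raise the `RuntimeError` quoted above.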
## HELPFUL RESOURCES
- Main documentation: https://docs.litellm.ai/
- Provider-specific docs in `docs/my-website/docs/providers/`
- Admin UI for testing proxy features
## WHEN IN DOUBT
- Follow existing patterns in the codebase
- Check similar provider implementations
- Ensure comprehensive test coverage
- Update documentation appropriately
- Consider backward compatibility impact
## Cursor Cloud specific instructions
### Environment
- Poetry is installed in `~/.local/bin`; the update script ensures it is on `PATH`.
- Python 3.12 and Node 22 are pre-installed.
- The virtual environment lives under `~/.cache/pypoetry/virtualenvs/`.
### Running the proxy server
Start the proxy with a config file:
```bash
poetry run litellm --config dev_config.yaml --port 4000
```
The proxy takes ~15-20 seconds to fully start (it runs Prisma migrations on boot). Wait for `/health` to return before sending requests. Without a PostgreSQL `DATABASE_URL`, the proxy connects to a default Neon dev database embedded in the `litellm-proxy-extras` package.
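
One way to wait for startup is to poll the health endpoint until it answers. The sketch below uses only the standard library; `wait_until_ready` and `http_ok` are hypothetical helper names, and the URL in the usage note is an assumption based on the port used above. If the endpoint requires an API key in your setup, the probe would also need an `Authorization` header.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all count as "not ready".
        return False


def wait_until_ready(
    url: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll `probe(url)` until it succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe(url):
            return True
        time.sleep(interval_s)
    return False
```

For example, `wait_until_ready("http://localhost:4000/health")` returns `True` once the proxy responds, or `False` after 60 seconds. The `probe` parameter exists so the loop can be tested without a running server.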
### Running tests
See `CLAUDE.md` and the `Makefile` for standard commands. Key notes:
- `psycopg-binary` must be installed (`poetry run pip install psycopg-binary`) because the pytest-postgresql plugin requires it and the lock file only includes `psycopg` (no binary).
- `openapi-core` must be installed (`poetry run pip install openapi-core`) for the OpenAPI compliance tests in `tests/test_litellm/interactions/`.
- The `--timeout` pytest flag is NOT available; don't pass it.
- Unit tests: `poetry run pytest tests/test_litellm/ -x -vv -n 4`
- Black `--check` may report pre-existing formatting issues; this does not block test runs.
- If `poetry install` fails with "pyproject.toml changed significantly since poetry.lock was last generated", run `poetry lock` first to regenerate the lock file.
### Lint
```bash
cd litellm && poetry run ruff check .
```
Ruff is the primary fast linter. For the full lint suite (including mypy, black, circular imports), run `make lint` per `CLAUDE.md`.
### UI Dashboard development
- The UI is at `ui/litellm-dashboard/`. Run `npm run dev` from that directory for the Next.js dev server on port 3000.
- The proxy at port 4000 serves a **pre-built** static UI from `litellm/proxy/_experimental/out/`. After making UI code changes, run `npm run build` in the dashboard directory, then copy the output so the proxy serves the updated UI: `cp -r ui/litellm-dashboard/out/* litellm/proxy/_experimental/out/`.
- SVGs used as provider logos (loaded via `<img>` tags) must NOT use `fill="currentColor"` — replace with an explicit color like `#000000` or use the `-color` variant from lobehub icons, since CSS color inheritance does not work inside `<img>` elements.
- Provider logos live in `ui/litellm-dashboard/public/assets/logos/` (source) and `litellm/proxy/_experimental/out/assets/logos/` (pre-built). Both locations must have the file for it to work in dev and proxy-served modes.
- UI Vitest tests: `cd ui/litellm-dashboard && npx vitest run`