Files
litellm/tests/test_litellm/proxy/management_endpoints/usage_endpoints
Ishaan Jaff 360643e213 [Feat] UI - Allow using AI to understand Usage patterns (#22042)
* Add Ask AI chat component to Usage page

- Create UsageAIChatModal component with streaming chat interface
- Integrate with existing model hub for model selection
- Pass usage data context (spend, models, providers, keys) to AI
- Add Ask AI button next to Export Data button in global view
- Add tests for the new component and integration

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Convert Ask AI from modal to right-side sliding panel

- Replace UsageAIChatModal with UsageAIChatPanel
- Panel slides in from right side, usage page stays visible
- Full-height panel with header, model selector, chat area, and input
- Smooth CSS transition for open/close animation
- Update tests for new panel component (34 tests passing)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Remove build output directory from tracking

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add backend AI usage chat endpoint with tool calling

Backend:
- New /usage/ai/chat SSE streaming endpoint
- AI agent has get_usage_data tool that queries /user/daily/activity/aggregated
- Follows same architecture as policy AI suggest (litellm.acompletion + tools)
- Non-admin users are restricted to their own data
- 12 backend unit tests

Frontend:
- Panel now calls /usage/ai/chat backend endpoint via SSE
- Removed direct OpenAI client calls from frontend
- Added usageAiChatStream networking function following enrichPolicyTemplateStream pattern

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Make model selection optional, default to gpt-4o-mini on backend

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add team/tag tools, status indicators, and improved AI agent

- AI agent now has 3 tools: get_usage_data, get_team_usage_data, get_tag_usage_data
- Stream status events (Thinking... Fetching... Analyzing...) to UI
- Frontend shows spinner + status text during tool execution
- Better system prompt guiding tool selection
- Entity summariser for team/tag data with ranked breakdowns
- 13 backend tests, 34 frontend tests passing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix: inject today's date into system prompt so AI resolves relative dates correctly

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Show tool calls as distinct steps + render markdown in responses

- Backend emits tool_call events with tool_name, label, args, and status
- Frontend shows each tool call as a step with ✓/spinner/✗ indicator
- Tool call steps show icon, label, date range, and filters
- AI responses rendered with ReactMarkdown (bold, lists, tables, code)
- Cursor-like UX: Thinking → tool calls → Analyzing → streamed answer

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Refactor backend for code quality: proper types, constants, all functions ≤50 LOC

- TypedDict for SSE events (SSEStatusEvent, SSEToolCallEvent, etc.) and ToolHandler
- Constants for table names, entity fields, temperature, page sizes, top-N limits
- Shared _query_activity() eliminates duplicated fetch logic
- _accumulate_breakdown() + _ranked_lines() replace inline aggregation loops
- Extracted _process_tool_call() and _stream_final_response() from main stream fn
- Black + Ruff clean, all 15 functions verified ≤50 LOC
- Replaced Tremor Button with Antd Button in panel (Tremor deprecated per AGENTS.md)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Address greptile review: security fixes and input validation

- Restrict team/tag tools to admin-only users (non-admins only get get_usage_data)
- Constrain ChatMessage.role to Literal['user', 'assistant'] to prevent system prompt injection
- Add test for base tools restriction (non-admin gets 1 tool, admin gets 3)
- Issues 3 (unused imports) and 4 (inline datetime) were already fixed in prior commit

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Address greptile round 2: sanitize errors, defense-in-depth allowlist, revert tsconfig

- Sanitize error messages: generic 'An internal error occurred' sent to client,
  full exception logged server-side via verbose_proxy_logger
- Defense-in-depth: _process_tool_call validates fn_name against role-based
  allowlist before dispatch (even though LLM only receives allowed tools)
- Revert tsconfig.json jsx back to 'preserve' (Next.js recommended default)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Role-scoped system prompt + additional test coverage

- System prompt is now role-aware: admin sees all 3 tool descriptions,
  non-admin only sees get_usage_data (consistent with tool filtering)
- Added tests: non-admin prompt excludes team/tag tools, date injection
- 15 backend tests, 34 frontend tests passing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix LLM arg validation + cap conversation size at 20 messages

- _resolve_fetch_kwargs uses .get() with ValueError for missing dates
  (handles malformed LLM tool arguments gracefully)
- MAX_CHAT_MESSAGES = 20 constant; backend truncates to last 20
- Frontend also sends only last 20 messages per request
- Prevents excessive token usage and context-length errors

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
2026-02-24 16:40:04 -08:00
..