mirror of
https://github.com/tiennm99/litellm.git
synced 2026-06-24 15:38:19 +00:00
360643e213
* Add Ask AI chat component to Usage page

- Create UsageAIChatModal component with streaming chat interface
- Integrate with existing model hub for model selection
- Pass usage data context (spend, models, providers, keys) to AI
- Add Ask AI button next to Export Data button in global view
- Add tests for the new component and integration

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Convert Ask AI from modal to right-side sliding panel

- Replace UsageAIChatModal with UsageAIChatPanel
- Panel slides in from right side, usage page stays visible
- Full-height panel with header, model selector, chat area, and input
- Smooth CSS transition for open/close animation
- Update tests for new panel component (34 tests passing)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Remove build output directory from tracking

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add backend AI usage chat endpoint with tool calling

Backend:
- New /usage/ai/chat SSE streaming endpoint
- AI agent has get_usage_data tool that queries /user/daily/activity/aggregated
- Follows same architecture as policy AI suggest (litellm.acompletion + tools)
- Non-admin users are restricted to their own data
- 12 backend unit tests

Frontend:
- Panel now calls /usage/ai/chat backend endpoint via SSE
- Removed direct OpenAI client calls from frontend
- Added usageAiChatStream networking function following enrichPolicyTemplateStream pattern

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Make model selection optional, default to gpt-4o-mini on backend

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Add team/tag tools, status indicators, and improved AI agent

- AI agent now has 3 tools: get_usage_data, get_team_usage_data, get_tag_usage_data
- Stream status events (Thinking... Fetching... Analyzing...) to UI
- Frontend shows spinner + status text during tool execution
- Better system prompt guiding tool selection
- Entity summariser for team/tag data with ranked breakdowns
- 13 backend tests, 34 frontend tests passing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix: inject today's date into system prompt so AI resolves relative dates correctly

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Show tool calls as distinct steps + render markdown in responses

- Backend emits tool_call events with tool_name, label, args, and status
- Frontend shows each tool call as a step with ✓/spinner/✗ indicator
- Tool call steps show icon, label, date range, and filters
- AI responses rendered with ReactMarkdown (bold, lists, tables, code)
- Cursor-like UX: Thinking → tool calls → Analyzing → streamed answer

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Refactor backend for code quality: proper types, constants, all functions ≤50 LOC

- TypedDict for SSE events (SSEStatusEvent, SSEToolCallEvent, etc.) and ToolHandler
- Constants for table names, entity fields, temperature, page sizes, top-N limits
- Shared _query_activity() eliminates duplicated fetch logic
- _accumulate_breakdown() + _ranked_lines() replace inline aggregation loops
- Extracted _process_tool_call() and _stream_final_response() from main stream fn
- Black + Ruff clean, all 15 functions verified ≤50 LOC
- Replaced Tremor Button with Antd Button in panel (Tremor deprecated per AGENTS.md)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Address greptile review: security fixes and input validation

- Restrict team/tag tools to admin-only users (non-admins only get get_usage_data)
- Constrain ChatMessage.role to Literal['user', 'assistant'] to prevent system prompt injection
- Add test for base tools restriction (non-admin gets 1 tool, admin gets 3)
- Issues 3 (unused imports) and 4 (inline datetime) were already fixed in prior commit

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Address greptile round 2: sanitize errors, defense-in-depth allowlist, revert tsconfig

- Sanitize error messages: generic 'An internal error occurred' sent to client, full exception logged server-side via verbose_proxy_logger
- Defense-in-depth: _process_tool_call validates fn_name against role-based allowlist before dispatch (even though LLM only receives allowed tools)
- Revert tsconfig.json jsx back to 'preserve' (Next.js recommended default)

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Role-scoped system prompt + additional test coverage

- System prompt is now role-aware: admin sees all 3 tool descriptions, non-admin only sees get_usage_data (consistent with tool filtering)
- Added tests: non-admin prompt excludes team/tag tools, date injection
- 15 backend tests, 34 frontend tests passing

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

* Fix LLM arg validation + cap conversation size at 20 messages

- _resolve_fetch_kwargs uses .get() with ValueError for missing dates (handles malformed LLM tool arguments gracefully)
- MAX_CHAT_MESSAGES = 20 constant; backend truncates to last 20
- Frontend also sends only last 20 messages per request
- Prevents excessive token usage and context-length errors

Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaff <ishaan-jaff@users.noreply.github.com>
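
Illustrative sketch (not the code from this commit): the guards described above, namely the role-based tool allowlist, the 20-message cap, and the date check on LLM-supplied tool arguments, could look roughly like the following. The tool names and MAX_CHAT_MESSAGES come from the commit message. The helper names (allowed_tools, check_tool_call, truncate_messages, resolve_dates) and the start_date/end_date argument keys are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Tuple

# Value stated in the commit message.
MAX_CHAT_MESSAGES = 20

# Tool names taken from the commit message; the grouping is an assumption.
BASE_TOOLS: Tuple[str, ...] = ("get_usage_data",)
ADMIN_TOOLS: Tuple[str, ...] = BASE_TOOLS + (
    "get_team_usage_data",
    "get_tag_usage_data",
)


def allowed_tools(is_admin: bool) -> Tuple[str, ...]:
    """Return the tool names a caller may use (hypothetical helper)."""
    return ADMIN_TOOLS if is_admin else BASE_TOOLS


def check_tool_call(fn_name: str, is_admin: bool) -> None:
    """Defense-in-depth: reject a tool name outside the role's allowlist,
    even if the model was only ever offered the allowed tools."""
    if fn_name not in allowed_tools(is_admin):
        raise PermissionError(f"tool {fn_name!r} is not permitted for this role")


def truncate_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the most recent MAX_CHAT_MESSAGES messages."""
    return messages[-MAX_CHAT_MESSAGES:]


def resolve_dates(args: Dict[str, Any]) -> Tuple[str, str]:
    """Read dates with .get() so malformed LLM arguments raise a clear
    ValueError instead of a KeyError. The key names are assumed."""
    start = args.get("start_date")
    end = args.get("end_date")
    if not start or not end:
        raise ValueError("start_date and end_date are required")
    return start, end
```

Checking the allowlist again at dispatch time is redundant by design: filtering the tool list keeps disallowed tools out of the prompt, while the dispatch check covers the case where the model emits a tool name it was never given.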