goclaw

mirror of https://github.com/tiennm99/goclaw.git synced 2026-06-11 10:10:53 +00:00

Author	SHA1	Message	Date
viettranx	120fc2d09c	fix(media): chain provider format, post-write verification, group media history (#206) Cherry-picked valuable changes from PR #206: - hasReadImageProvider supports chain format {"providers":[...]} config - create_image/video/audio verify file persistence after write with diagnostic logging - HistoryEntry gains Media field + CollectMedia() for group media context on @mention - Zalo extractContentAndMedia refactored: all media types via DetectMIMEType/BuildMediaTags, 20MB limit - Discord/Zalo pass media paths to Record() and collect historical media on @mention - Zalo send_helpers logs directory contents when checkFileSize stat fails	2026-03-18 08:12:10 +07:00
Luan Vu	405a753239	fix: resolve media provider type from DB instead of guessing from name (#154) Media tools (create_image, create_video, create_audio, read_audio, read_video, read_document) routed API calls based on provider name pattern matching (e.g. strings.HasPrefix(name, "gemini")). This breaks when users give custom names to DB providers: a Gemini provider named "chatgpt-sap-het" would be misrouted to the OpenAI-compat endpoint, causing 404 errors. Fix: carry the DB provider_type through OpenAIProvider, resolve it via typedProvider interface in ExecuteWithChain, and inject as _provider_type param for callProvider routing. Name-based heuristic kept as fallback for config-file providers that don't have a DB type. Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>	2026-03-11 18:32:51 +07:00
Luan Vu	0592be359d	fix: remove legacy per-agent imageGen/vision override from tools_config (#153) The per-agent `imageGen` and `vision` fields in `ToolPolicySpec` (stored in agents.tools_config JSONB) were added in `d5cc5a7` (Feb 26) as the original way to configure image/vision providers. When the media provider chain system was introduced in `5815437` (Mar 8), these fields were kept "for backward compat" but became dead code with no UI to manage them. This causes a hard-to-debug issue: if an agent's tools_config contains stale imageGen/vision data (set via API or leftover from DB), it silently overrides the global provider chain configured in the builtin tools UI. Users see the correct chain in the UI but the tool calls a completely different provider/model, with no indication of why. Changes: - Remove Vision and ImageGen fields + struct definitions from ToolPolicySpec - Remove associated context helpers (WithVisionConfig, WithImageGenConfig, etc.) - Remove per-agent override injection in agent loop - Simplify create_image and read_image to use chain as sole source of truth - UI: whitelist known tools_config fields on save to clean stale DB data Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>	2026-03-11 17:37:55 +07:00
Luan Vu	fa5f51e72e	fix: allow OAuth providers in media tool chain (read_audio, read_image, etc.) (#150) ExecuteWithChain previously required all providers to implement credentialProvider (APIKey/APIBase). OAuth-based providers like CodexProvider (ChatGPT OAuth) don't expose static credentials, causing all media tools to fail with "does not expose API credentials". Make credentialProvider optional (nil when unsupported). Each callProvider gracefully falls back to the provider's Chat() API when credentials are unavailable. Generation tools (create_image, create_video, create_audio) return a clear error since they require direct API access with no Chat fallback. Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>	2026-03-11 16:40:35 +07:00
viettranx	bdb60de7ae	chore: upgrade Go 1.25 → 1.26 and apply go fix modernizations - Update go.mod and Dockerfile to Go 1.26 - Apply `go fix ./...` stdlib modernizations across 170+ files - Add `go fix` to post-implementation checklist in CLAUDE.md - Fix go fix misapplied rewrite in loop_history.go	2026-03-10 00:09:15 +07:00
viettranx	5815437f78	feat(tools): add media provider chain with ordered fallback and retry Refactor create_image and create_video to use a shared provider chain system. Each tool now supports an ordered list of providers with per-entry timeout, max retries, and provider-specific params. Includes MiniMax and DashScope image/video generation implementations. - New media_provider_chain.go: shared chain resolution, retry execution, limitedReadAll - create_image: refactored to ExecuteWithChain, added MiniMax + DashScope providers - create_video: refactored to ExecuteWithChain, added MiniMax async video generation - Backward compatible with legacy {provider, model} settings format	2026-03-08 20:09:43 +07:00
viettranx	0f2737ce53	feat(media): persistent media storage, read_document tool, and pipeline refactor - Add persistent media storage (internal/media/) replacing temp file deletion - Add MediaRef type for lightweight media references in session messages - Refactor media pipeline to use bus.MediaFile{Path, MimeType} across all channels - Add read_document builtin tool for PDF/DOCX/XLSX analysis via Gemini native API - Move image sanitization from Telegram to shared agent/media layer - Add media reload for multi-turn conversations (images from last 5 messages) - Add reply-to-message media resolution for Telegram (re-download on reply) - Add media inventory to compaction summary to preserve awareness after truncation - Fix coreToolSummaries for read_image, read_document, create_image tools - Add real-time trace update events via WebSocket broadcast - Improve trace detail UI with media refs and tool result display	2026-03-08 14:00:34 +07:00
viettranx	96845d1e44	fix(media): set result.Media on create_image and add MEDIA: fallback in subagent exec Root cause: create_image tool only set ForLLM:"MEDIA:path" but never populated result.Media. The main agent loop parses MEDIA: prefix via parseMediaResult(), but the subagent exec loop only checked result.Media — so media paths were silently lost for all subagent/spawn workflows. This caused the entire downstream pipeline (task.Media → AnnounceQueue → PublishInbound → ContentSuffix → session) to receive empty media, making images invisible in WS chat despite Telegram working fine. Fixes: - create_image.go: set result.Media = []string{imagePath} - subagent_exec.go: add MEDIA: prefix fallback parsing as safety net	2026-03-08 00:06:13 +07:00
viettranx	b62d46e50e	refactor(lint): apply Go best practices across codebase - Use errors.Is() instead of direct sentinel comparison (13 instances) - Convert if/else-if chains to switch/case for same-variable comparisons - Remove redundant bitwise OR with zero - Add post-implementation checklist to CLAUDE.md	2026-03-07 20:51:39 +07:00
viettranx	6ed62b8506	feat: channel-isolated workspace, resolvePath fix, create_image workspace, summoner Expertise section, bus Topic constants - Fix resolvePath for nested non-existent dirs (use resolveThroughExistingAncestors) - Channel-isolated workspace: user_agent_profiles.workspace stores channel prefix, used as source of truth with backward compat for existing users - Loop caches workspace per-user with CacheKindUserWorkspace invalidation via pubsub - ContractHome/ExpandHome for portable ~-based paths in DB - create_image saves to workspace/generated/YYYY-MM-DD/ instead of OS temp dir - SOUL.md template: add ## Expertise section for domain knowledge - Summoner buildEditPrompt: section guide, complete file output, frontmatter update - Bus: Topic* constants for Subscribe/Broadcast keys, CacheKind* for payload kinds - Teams, delegates, sessions, agent links: various enhancements Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 10:52:32 +07:00
viettranx	45ea0ee9a4	feat: Add native Gemini image generation support and refine media path stripping in agent output.	2026-02-28 18:44:51 +07:00
Michael	370c290642	fix(tools): use /images/generations endpoint for Gemini and OpenAI image gen (#9) create_image exclusively used /chat/completions with modalities:["image","text"] which only works on OpenRouter. Gemini returns HTTP 400: "Image generation is not yet supported on the chat.completions endpoint" OpenAI's DALL-E models also require /images/generations, not /chat/completions. Fix: route OpenRouter through /chat/completions (supports modalities), route all other providers (Gemini, OpenAI, etc.) through the standard /images/generations endpoint with response_format:"b64_json". Also update default Gemini model from deprecated gemini-2.0-flash-exp to gemini-2.5-flash-image.	2026-02-28 13:08:32 +07:00
viettranx	86d58e1021	feat: Introduce a new upgrade command and enhance built-in tool settings with provider and model configuration.	2026-02-27 11:38:04 +07:00
viettranx	d65d792646	feat: Implement built-in tool management with persistence, API, and UI.	2026-02-27 10:19:19 +07:00
viettranx	d5cc5a745d	feat: Implement vision capabilities and image generation tools, adding media handling, dedicated configurations, and trace optimization for image data.	2026-02-26 22:28:27 +07:00

15 Commits
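
Commit 96845d1e44 describes a tool result that carried its file path only as a `MEDIA:path` string in `ForLLM`, plus a prefix-parsing fallback for consumers that read `result.Media`. A rough sketch of such a fallback is below. The `ToolResult` shape and the `mediaPathsOf` helper are assumptions for illustration; the repository's `parseMediaResult()` may behave differently.

```go
package main

import "strings"

// ToolResult is a hypothetical, trimmed-down tool result. Only the two
// fields mentioned in the commit message are modelled here.
type ToolResult struct {
	ForLLM string   // text handed back to the model, may contain "MEDIA:<path>" lines
	Media  []string // explicit media paths, preferred when present
}

const mediaPrefix = "MEDIA:"

// mediaPathsOf returns the explicit Media paths when they are set.
// Otherwise it falls back to scanning ForLLM line by line for the
// MEDIA: prefix, so paths are not lost when a tool forgot to set Media.
func mediaPathsOf(r ToolResult) []string {
	if len(r.Media) > 0 {
		return r.Media
	}
	var paths []string
	for _, line := range strings.Split(r.ForLLM, "\n") {
		rest, ok := strings.CutPrefix(strings.TrimSpace(line), mediaPrefix)
		if !ok {
			continue
		}
		if p := strings.TrimSpace(rest); p != "" {
			paths = append(paths, p)
		}
	}
	return paths
}
```

Preferring the explicit field and treating the prefix scan as a safety net mirrors the order of the two fixes listed in the commit message.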