goclaw

mirror of https://github.com/tiennm99/goclaw.git synced 2026-06-18 03:30:53 +00:00

Author	SHA1	Message	Date
viettranx	120fc2d09c	fix(media): chain provider format, post-write verification, group media history (#206) Cherry-picked valuable changes from PR #206: - hasReadImageProvider supports chain format {"providers":[...]} config - create_image/video/audio verify file persistence after write with diagnostic logging - HistoryEntry gains Media field + CollectMedia() for group media context on @mention - Zalo extractContentAndMedia refactored: all media types via DetectMIMEType/BuildMediaTags, 20MB limit - Discord/Zalo pass media paths to Record() and collect historical media on @mention - Zalo send_helpers logs directory contents when checkFileSize stat fails	2026-03-18 08:12:10 +07:00
Luan Vu	405a753239	fix: resolve media provider type from DB instead of guessing from name (#154) Media tools (create_image, create_video, create_audio, read_audio, read_video, read_document) routed API calls based on provider name pattern matching (e.g. strings.HasPrefix(name, "gemini")). This breaks when users give custom names to DB providers — a Gemini provider named "chatgpt-sap-het" would be misrouted to the OpenAI-compat endpoint, causing 404 errors. Fix: carry the DB provider_type through OpenAIProvider, resolve it via typedProvider interface in ExecuteWithChain, and inject as _provider_type param for callProvider routing. Name-based heuristic kept as fallback for config-file providers that don't have a DB type. Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>	2026-03-11 18:32:51 +07:00
Luan Vu	fa5f51e72e	fix: allow OAuth providers in media tool chain (read_audio, read_image, etc.) (#150) ExecuteWithChain previously required all providers to implement credentialProvider (APIKey/APIBase). OAuth-based providers like CodexProvider (ChatGPT OAuth) don't expose static credentials, causing all media tools to fail with "does not expose API credentials". Make credentialProvider optional (nil when unsupported). Each callProvider gracefully falls back to the provider's Chat() API when credentials are unavailable. Generation tools (create_image, create_video, create_audio) return a clear error since they require direct API access with no Chat fallback. Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>	2026-03-11 16:40:35 +07:00
viettranx	bdb60de7ae	chore: upgrade Go 1.25 → 1.26 and apply go fix modernizations - Update go.mod and Dockerfile to Go 1.26 - Apply `go fix ./...` stdlib modernizations across 170+ files - Add `go fix` to post-implementation checklist in CLAUDE.md - Fix go fix misapplied rewrite in loop_history.go	2026-03-10 00:09:15 +07:00
viettranx	9d0af657e5	fix(tools): correct media provider params and UI fixes - MiniMax audio: fix invalid params (sample_rate/bitrate as int, auto instrumental when no lyrics, pass duration_seconds) - MiniMax/DashScope image: map aspect_ratio to provider size format - Gemini video: read person_generation from chain params instead of hardcode - GetParamInt: add string-to-int coercion for UI select values - UI: fix combobox portal selection bug, dropdown overflow clipping, provider chain form spacing, skill dialog min-height - Update bitrate options to numeric bps values in params schema	2026-03-08 22:32:08 +07:00
viettranx	5815437f78	feat(tools): add media provider chain with ordered fallback and retry Refactor create_image and create_video to use a shared provider chain system. Each tool now supports an ordered list of providers with per-entry timeout, max retries, and provider-specific params. Includes MiniMax and DashScope image/video generation implementations. - New media_provider_chain.go: shared chain resolution, retry execution, limitedReadAll - create_image: refactored to ExecuteWithChain, added MiniMax + DashScope providers - create_video: refactored to ExecuteWithChain, added MiniMax async video generation - Backward compatible with legacy {provider, model} settings format	2026-03-08 20:09:43 +07:00
viettranx	e1a6801a7a	fix(tools): correct Veo API, media ref ordering, video tag, and model verify - Fix create_video: use predictLongRunning API instead of generateContent (async polling flow: POST → poll every 10s → download video from URI) - Fix durationSeconds as int (not string) per actual Gemini API requirement - Fix MediaRef collection order: historical first, current last, so refs[len-1] always returns the most recent file (fixes read_audio picking up old file instead of current voice message) - Remove misleading "video not yet supported" text from Telegram handler that prevented LLM from calling read_video tool - Add isNonChatModel() to skip chat-based verify for generation models (veo-, dall-e-, imagen-, gemini--image)	2026-03-08 15:21:08 +07:00
viettranx	691ddce8fb	feat(tools): add read_audio, read_video, create_video tools and fix system prompt tool filtering - Add read_audio tool with Gemini File API, OpenAI input_audio, and fallback support - Add read_video tool with Gemini File API and base64 fallback for video analysis - Add create_video tool with Gemini Veo and OpenRouter chat completions support - Add shared gemini_file_api.go for upload → poll → generateContent pipeline - Add shared openai_compat_call.go for custom JSON chat completions - Fix system prompt showing denied tools: use filteredToolNames() instead of tools.List() - Wire audio/video MediaRef context propagation in agent loop - Register new tools in seed data, policy groups, and web UI settings - Enforce duration (max 30s) and aspect_ratio limits on create_video	2026-03-08 14:43:18 +07:00

8 Commits