5 Commits

Author SHA1 Message Date
Luan Vu 405a753239 fix: resolve media provider type from DB instead of guessing from name (#154)
Media tools (create_image, create_video, create_audio, read_audio,
read_video, read_document) routed API calls by pattern-matching the
provider name (e.g. strings.HasPrefix(name, "gemini")). This breaks
when users give DB providers custom names: a Gemini provider named
"chatgpt-sap-het" would be misrouted to the OpenAI-compat endpoint,
causing 404 errors.

Fix: carry the DB provider_type through OpenAIProvider, resolve it via
typedProvider interface in ExecuteWithChain, and inject as _provider_type
param for callProvider routing. Name-based heuristic kept as fallback
for config-file providers that don't have a DB type.

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 18:32:51 +07:00
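
The routing change in #154 can be pictured with a small sketch. This is not the project's actual code: the method name ProviderType, the namedProvider interface, the dbProvider and configProvider types, and the "openai_compat" label are all assumptions made for illustration. Only typedProvider and the strings.HasPrefix fallback come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// typedProvider is named in the commit message; the method name
// ProviderType is an assumption for this sketch.
type typedProvider interface {
	ProviderType() string
}

// namedProvider is a hypothetical minimal provider interface.
type namedProvider interface {
	Name() string
}

// dbProvider stands in for a provider loaded from the DB, which
// carries an explicit type next to its user-chosen name.
type dbProvider struct {
	name         string
	providerType string
}

func (p dbProvider) Name() string         { return p.name }
func (p dbProvider) ProviderType() string { return p.providerType }

// configProvider stands in for a config-file provider with no DB type.
type configProvider struct{ name string }

func (p configProvider) Name() string { return p.name }

// resolveProviderType prefers the explicit type and only falls back
// to the name-based heuristic when no type is available.
func resolveProviderType(p namedProvider) string {
	if tp, ok := p.(typedProvider); ok && tp.ProviderType() != "" {
		return tp.ProviderType()
	}
	if strings.HasPrefix(p.Name(), "gemini") {
		return "gemini"
	}
	return "openai_compat" // hypothetical default label
}

func main() {
	custom := dbProvider{name: "chatgpt-sap-het", providerType: "gemini"}
	fmt.Println(resolveProviderType(custom))                               // gemini
	fmt.Println(resolveProviderType(configProvider{name: "gemini-flash"})) // gemini
	fmt.Println(resolveProviderType(configProvider{name: "my-openai"}))    // openai_compat
}
```

With the explicit type taking precedence, the custom-named Gemini provider from the commit message resolves to "gemini" even though its name suggests otherwise.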
Luan Vu fa5f51e72e fix: allow OAuth providers in media tool chain (read_audio, read_image, etc.) (#150)
ExecuteWithChain previously required all providers to implement
credentialProvider (APIKey/APIBase). OAuth-based providers like
CodexProvider (ChatGPT OAuth) don't expose static credentials,
causing all media tools to fail with "does not expose API credentials".

Make credentialProvider optional (nil when unsupported). Each
callProvider gracefully falls back to the provider's Chat() API
when credentials are unavailable. Generation tools (create_image,
create_video, create_audio) return a clear error since they require
direct API access with no Chat fallback.

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 16:40:35 +07:00
viettranx bdb60de7ae chore: upgrade Go 1.25 → 1.26 and apply go fix modernizations
- Update go.mod and Dockerfile to Go 1.26
- Apply `go fix ./...` stdlib modernizations across 170+ files
- Add `go fix` to post-implementation checklist in CLAUDE.md
- Fix a rewrite that `go fix` misapplied in loop_history.go
2026-03-10 00:09:15 +07:00
viettranx 01d75ac7fe refactor(tools): migrate read_* tools to provider chain and add media models
Refactor read_image, read_document, read_video, read_audio to use
ResolveMediaProviderChain + ExecuteWithChain for consistent fallback behavior.
Add hardcoded model lists for MiniMax, DashScope, and Suno providers.
2026-03-08 20:10:10 +07:00
viettranx 0f2737ce53 feat(media): persistent media storage, read_document tool, and pipeline refactor
- Add persistent media storage (internal/media/) replacing temp file deletion
- Add MediaRef type for lightweight media references in session messages
- Refactor media pipeline to use bus.MediaFile{Path, MimeType} across all channels
- Add read_document builtin tool for PDF/DOCX/XLSX analysis via Gemini native API
- Move image sanitization from Telegram to shared agent/media layer
- Add media reload for multi-turn conversations (images from last 5 messages)
- Add reply-to-message media resolution for Telegram (re-download on reply)
- Add media inventory to compaction summary to preserve awareness after truncation
- Fix coreToolSummaries for read_image, read_document, create_image tools
- Add real-time trace update events via WebSocket broadcast
- Improve trace detail UI with media refs and tool result display
2026-03-08 14:00:34 +07:00
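
The media reload item above (images from the last 5 messages) could be sketched as follows. The MediaRef type is named in the commit, but its fields here, the Message type, and the recentImageRefs helper are assumptions for illustration, not the project's actual definitions.

```go
package main

import (
	"fmt"
	"strings"
)

// MediaRef is named in the commit; these fields are assumed.
type MediaRef struct {
	ID       string
	MimeType string
}

// Message is a hypothetical session message carrying media references.
type Message struct {
	Role  string
	Media []MediaRef
}

// recentImageRefs collects image references from the last `window`
// messages so they can be reloaded for a multi-turn conversation.
func recentImageRefs(history []Message, window int) []MediaRef {
	if window <= 0 {
		return nil
	}
	start := len(history) - window
	if start < 0 {
		start = 0
	}
	var refs []MediaRef
	for _, m := range history[start:] {
		for _, r := range m.Media {
			if strings.HasPrefix(r.MimeType, "image/") {
				refs = append(refs, r)
			}
		}
	}
	return refs
}

func main() {
	history := []Message{
		{Role: "user", Media: []MediaRef{{ID: "old", MimeType: "image/png"}}},
		{Role: "assistant"},
		{Role: "user", Media: []MediaRef{{ID: "doc", MimeType: "application/pdf"}}},
		{Role: "assistant"},
		{Role: "user", Media: []MediaRef{{ID: "new", MimeType: "image/jpeg"}}},
		{Role: "assistant"},
	}
	// The window of 5 excludes the first message, so only "new" is returned.
	fmt.Println(len(recentImageRefs(history, 5)))
}
```

Storing lightweight references in the session and resolving them on demand is what makes a bounded window like this cheap to apply on each turn.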