* fix(providers): auto-clamp max_tokens on model rejection + fix verify for reasoning models
When OpenAI-compat models reject max_tokens as too large (e.g. gpt-3.5-turbo
supports 4096 but we send 8192), parse the model's stated limit from the 400
error, clamp the value, and retry once. This fixes agent creation for models
with lower output token limits without hardcoding model names.
Also increase the provider verify endpoint's max_tokens from 1 to 50 so
reasoning models (gpt-5, o-series) have enough headroom for internal
reasoning during the check call.
Closes#248, closes#245
* refactor(providers): extract chat retry closure + fix clamp log key
- Extract duplicate retry closure into chatRequestFn() to follow DRY
- Fix slog logging wrong key: body["max_tokens"] was nil for reasoning
models that use max_completion_tokens — now uses clampedLimit() helper
- Remove unnecessary _ = resp in provider verify endpoint
---------
Co-authored-by: viettranx <viettranx@gmail.com>
- Lead agents: auto-resolve team workspace as default (relative paths)
- Dispatched members: team workspace as default via req.TeamWorkspace
- Direct-chat members: own workspace default, team workspace accessible
- Add dataDir field to Loop/LoopConfig for global workspace root
- System prompt shows team workspace absolute path for model guidance
- Remove orphan task detector (superseded by post-turn dispatch)
- Log warning on OpenAI tool call argument parse failures
- Move cache_control from request root (ignored by API) to per-block
placement on last system block and last tool definition
- Change system prompt time format to date-only for better cache stability
- Add builtin datetime tool for precise timestamps (cron, memory, etc.)
- Add atMs past-time validation in cron handleUpdate (was only in handleAdd)
- Update cron description to guide model to use datetime tool first
Remove ModelThinkingCapable interface and ChatRequest.ModelSupportsThinking
hint field — DashScope handles per-model checks internally via its own
whitelist. Fix double applyThinkingGuard on ChatStream→Chat tool fallback
by calling OpenAIProvider.Chat directly.
* feat(providers/dashscope): guard enable_thinking injection by per-model capability check
Introduces ModelThinkingCapable interface and ModelSupportsThinking field
on ChatRequest so DashScope can skip thinking-param injection for models
that do not support it (e.g. qwen3-plus, qwen3-turbo), preventing
\"model not supported\" API errors.
- types.go: add ModelThinkingCapable interface + ModelSupportsThinking *bool on ChatRequest
- dashscope.go: add dashscopeThinkingModels whitelist + ModelSupportsThinking(); honour pre-computed hint
- loop.go: detect ModelThinkingCapable and set hint on ChatRequest before LLM call
- provider_models.go: add qwen3.5-plus / qwen3.5-turbo to DashScope model list
- dashscope_test.go: full test suite for whitelist, injection, hint override, budget mapping
* Fix review code.
---------
Co-authored-by: Nguyen Gia Hoang Vinh <vinhngh@runsystem.net>
Replace all hardcoded ~/.goclaw path constructions with configurable
sources (cfg.ResolvedDataDir() for service dirs, cfg.Agents.Defaults.Workspace
for agent workspaces). This fixes data persistence issues in Docker
deployments where paths differ from local dev.
- Add DataDir field to Config with ResolvedDataDir() resolver
- Add ResolvedDataDirFromEnv() package-level helper for packages without Config
- Populate StoreConfig.SkillsStorageDir (was never set, caused hardcoded fallback)
- Agent workspaces now use subdirectory format (workspace/{key}) for volume compatibility
- Remove dead GOCLAW_SESSIONS_STORAGE env/config (sessions moved to PostgreSQL)
- Fix deploy-stg.sh trailing space after backslash + remove deprecated GOCLAW_MODE
- Add GOCLAW_SKILLS_DIR override in docker-compose for volume persistence
GPT-5 and o-series models reject the legacy max_tokens parameter and
require max_completion_tokens instead. Additionally, gpt-5-mini,
gpt-5-nano, and o-series models only support the default temperature (1).
This patch adds model-aware branching in buildRequestBody() to:
- Send max_completion_tokens instead of max_tokens for gpt-5*/o1/o3/o4
- Skip temperature parameter for gpt-5-mini/gpt-5-nano/o-series
Co-authored-by: Marcelo Emmerich <marcelo@agenticsystems.de>
* feat(providers): add ACP provider for orchestrating external coding agents (#189)
Implement native Go ACP (Agent Client Protocol) client as a new Provider.
Enables GoClaw to orchestrate any ACP-compatible agent (Claude Code, Codex
CLI, Gemini CLI) as a subprocess via JSON-RPC 2.0 over stdio.
- Add bidirectional JSON-RPC 2.0 transport over stdio pipes
- Add subprocess process pool with idle TTL reaping and crash recovery
- Add ACP session lifecycle (initialize, session/new, session/prompt)
- Add tool bridge for agent-initiated fs/terminal/permission requests
- Add workspace sandboxing, shell deny patterns, and env var filtering
- Wire config-based and DB-based provider registration paths
- Export DefaultDenyPatterns from tools package for reuse
* feat(providers): add changelog entry for ACP provider integration
* fix(tools): prevent workspace traversal bypass via /tmp/ fallback in resolveMediaPath
Reject paths containing ".." in the isInTempDir fallback to prevent
workspace escape where traversal path still resolves inside /tmp/.
* fix(tools): block workspace-sibling paths in resolveMediaPath /tmp/ fallback
When workspace is inside /tmp/, traversal paths like workspace/../X
resolve to /tmp/ siblings that pass isInTempDir. Reject paths inside
the workspace parent directory to prevent this escape.
* feat(providers): add ACP provider web UI and live reload via pubsub
Web UI for creating/editing ACP providers with dedicated form fields
(binary, args, idle TTL, permission mode, work directory). ACP providers
now update immediately without gateway restart via cache invalidation
pubsub pattern.
Frontend:
- New ACPSection form component with i18n (en/vi/zh)
- Provider form dialog integration with ACP state management
- ACP type badge on providers list page
- Settings field added to provider TypeScript types
Backend:
- ACP models handler (claude/codex/gemini) without API key requirement
- Binary path validation + LookPath verification in verify handler
- Provider CRUD emits cache.invalidate events via msgBus
- Subscriber in gateway_managed.go re-registers ACP providers from DB
- ACP core improvements from code review (helpers, jsonrpc, process,
terminal, tool_bridge)
---------
Co-authored-by: viettranx <viettranx@gmail.com>
* feat(workspace): add team shared workspace for file collaboration
- Add workspace_write and workspace_read tools for agents to share files across team members
- Create team_workspaces DB table with migration 000017 (file metadata, pinning, tags)
- Implement PostgreSQL store layer for workspace CRUD operations
- Add RPC handlers for workspace list/read/delete from web UI
- Build React workspace tab with file listing, content preview, and delete
- Propagate workspace channel/chatID scope through delegation chain
- Auto-allow workspace tools in agent tool policy when agent belongs to a team
- Inject team workspace guidance into system prompt for team agents
- Add /reset command handler for clearing session history
- Harden MCP bridge context middleware to reject headers when no gateway token
- Add i18n strings for workspace UI in en/vi/zh locales
* feat(teams): add comprehensive task management with followup reminders and recovery
- Add task followup/reminder system with auto-set on lead agent reply and auto-clear when user responds on channel
- Add task recovery ticker to re-dispatch stale/pending tasks periodically
- Add task scopes, filtering by status/channel/chatID, and task events
- Add WS RPC handlers for task CRUD, assignments, comments, events, and bulk operations (teams_tasks.go)
- Add task detail dialog, settings UI for followup config, and scope filtering in web dashboard
- Add migrations 000018 (team_tasks_v2) and 000019 (task_followup)
- Extend team_tasks_tool with await_reply, clear_followup actions
- Auto-complete/fail team tasks when delegate agent finishes
- Add workspace file listing and team tool manager enhancements
* docs(teams): add team system architecture and playbook ideas documentation
- Add TEAM_SYSTEM.md with full architecture design covering task management, shared workspace, and delegation engine subsystems
- Add TEAM_PLAYBOOK_IDEAS.md outlining future team coordination layers (playbook, member capabilities, auto-learned patterns)
- Document data models, status flows, tool actions, followup reminder system, task ticker, execution locking, and workspace scope model
* fix(teams): resolve 6 critical bugs in team task system
- Fix unblock SQL: check array_length after array_remove (not before)
- Enforce single-team leadership in team creation
- Add requireLead() for approve/reject tool actions
- Validate cross-team dependency references in blocked_by
- Add team_id to handoff route for multi-team isolation
- Set blocked_by DEFAULT '{}' to prevent NULL array issues
* refactor(workspace): use stable userID as scope key instead of connection UUID
Workspace scope changed from (team_id, channel, chat_id) to (team_id, userID).
Fixes workspace fragmentation across WS tab refreshes and reconnections.
* feat(teams): add V1/V2 versioning with feature gating and optimized prompts
- IsTeamV2() helper gates advanced features (locking, followup, review, audit)
- V2 tool actions rejected for V1 teams with clear error message
- Ticker, gateway consumer, delegation hooks respect version flag
- TEAM.md renders v1/v2 sections conditionally
- Tool descriptions and params optimized (~38% token reduction)
- UI: version toggle in settings, V2 Beta badge, conditional rendering
- i18n: version modal keys for en/vi/zh
* fix(migration): use VARCHAR(255) for user ID columns and add metadata JSONB
- assignee_user_id, user_id, actor_id: TEXT → VARCHAR(255)
- Add metadata JSONB to team_task_comments and team_task_attachments
---------
Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
Media tools (create_image, create_video, create_audio, read_audio,
read_video, read_document) routed API calls based on provider name
pattern matching (e.g. strings.HasPrefix(name, "gemini")). This breaks
when users give custom names to DB providers — a Gemini provider named
"chatgpt-sap-het" would be misrouted to the OpenAI-compat endpoint,
causing 404 errors.
Fix: carry the DB provider_type through OpenAIProvider, resolve it via
typedProvider interface in ExecuteWithChain, and inject as _provider_type
param for callProvider routing. Name-based heuristic kept as fallback
for config-file providers that don't have a DB type.
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
- Update go.mod and Dockerfile to Go 1.26
- Apply `go fix ./...` stdlib modernizations across 170+ files
- Add `go fix` to post-implementation checklist in CLAUDE.md
- Fix go fix misapplied rewrite in loop_history.go
Fixes#94. The stream-json scanner never checked scanner.Err() after the
scan loop, so if a line exceeded the 1MB buffer limit the scanner would
stop with bufio.ErrTooLong and the response would be silently truncated.
- Check scanner.Err() after the loop and return an explicit error
- Increase max buffer from 1MB to 10MB to handle large tool outputs
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
* fix(mcp-bridge): add per-session agent context and HMAC verification
- Add per-session MCP config with X-Agent-ID/X-User-ID headers instead
of shared global config file
- Sign bridge context headers with HMAC-SHA256 to prevent forgery
- Add bridgeContextMiddleware to verify signatures on MCP bridge requests
- Store MCP configs in ~/.goclaw/mcp-configs/ outside agent workDir
- Use atomic writes (tmp + rename) for MCP config files
- Fix provider rename leaving ghost registry entries
- Remove provider_type from mutable fields on update
- Tighten temp dir permissions from 0755 to 0700
* feat(mcp-bridge): propagate channel routing context through MCP bridge
- Pass channel, chat_id, and peer_kind from agent loop to CLI provider options
- Inject X-Channel, X-Chat-ID, X-Peer-Kind headers in bridge context middleware
- Add BridgeContext struct to bundle per-call context for MCP config generation
- Include channel routing headers in per-session MCP config files
- Expose "message" tool via MCP bridge for cross-channel messaging
- Add extract helpers for new option keys in claude_cli_session.go
* feat(mcp-bridge): forward media attachments to outbound message bus
- Wire MessageBus into gateway server and MCP bridge handler
- Publish tool result media files to outbound bus for channel delivery
- Extract channel/chatID/peerKind from tool context for proper routing
- Add mimeFromExt helper for content-type detection on attachments
* feat(mcp-bridge): inject per-agent DB-backed MCP servers into Claude CLI config
- Add MCPServerLookup type to resolve agent-specific MCP servers from DB
- Wire MCPServerStore through provider registration and HTTP handler
- Extract mcpServerEntryToConfig helper to deduplicate transport config logic
- Add JSON-to-Go helpers (jsonToStringSlice, jsonToStringMap) for DB fields
- Merge per-agent MCP servers at config write time without overriding static entries
* fix(mcp-bridge): use Media struct fields and prefer explicit MimeType
- Map Media.Path to attachment URL instead of treating Media as string
- Use Media.MimeType when available, fall back to extension-based detection
* refactor(providers): deduplicate option extractors and extract bridge media forwarding
- Replace per-field extractors (extractSessionKey, extractAgentID, etc.) with generic extractStringOpt/extractBoolOpt
- Add bridgeContextFromOpts helper to build BridgeContext in one call
- Extract forwardMediaToOutbound from inline block in makeToolHandler
- Change NewBridgeServer msgBus param from variadic to explicit pointer
* fix(providers): validate provider_type on update instead of silently dropping it
- Add explicit validation against ValidProviderTypes with 400 response
- Remove silent delete(updates, "provider_type") that hid invalid values
- Caller now receives clear error when submitting unsupported provider_type
* fix(providers): add header injection validation to MCP bridge headers
- Extend CRLF/null-byte checks to agentID, channel, chatID, and peerKind
- Previously only userID had header injection prevention
- Prevents HTTP header injection via crafted values in MCP config
* fix(mcp-bridge): sign all context fields in HMAC and remove legacy code
- Sign all 5 bridge context fields (agentID|userID|channel|chatID|peerKind)
in HMAC instead of only agentID|userID to prevent channel routing forgery
- Propagate context.Context into MCPServerLookup to respect request
cancellation instead of using context.Background()
- Remove legacy BuildCLIMCPConfig, WithClaudeCLIMCPConfig, mcpConfigPath,
and mcpCleanup (dead code since system is PG-only)
- Use mime.TypeByExtension before custom fallback in mimeFromExt
- Add debug log when media forwarding is skipped due to missing context
- Add thread-safety comment to SetMCPServerLookup
---------
Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
Co-authored-by: viettranx <viettranx@gmail.com>
- Add persistent media storage (internal/media/) replacing temp file deletion
- Add MediaRef type for lightweight media references in session messages
- Refactor media pipeline to use bus.MediaFile{Path, MimeType} across all channels
- Add read_document builtin tool for PDF/DOCX/XLSX analysis via Gemini native API
- Move image sanitization from Telegram to shared agent/media layer
- Add media reload for multi-turn conversations (images from last 5 messages)
- Add reply-to-message media resolution for Telegram (re-download on reply)
- Add media inventory to compaction summary to preserve awareness after truncation
- Fix coreToolSummaries for read_image, read_document, create_image tools
- Add real-time trace update events via WebSocket broadcast
- Improve trace detail UI with media refs and tool result display
Add Claude CLI as an LLM provider (subscription-based, no API key needed).
The CLI manages session history, tool execution, and context while GoClaw
forwards messages and streams responses.
Key features:
- Claude CLI provider with session persistence (--resume)
- MCP bridge server exposing GoClaw tools to CLI via streamable-http
- Security hooks (shell deny patterns, workspace path restrictions)
- Per-session mutex preventing concurrent CLI calls
- Onboard wizard for Claude CLI setup and auth verification
- Web UI for adding/managing Claude CLI provider with auth status
- Provider registry Close() for proper shutdown cleanup
Security:
- CLI path validation (only "claude" or absolute paths from DB)
- Token auth middleware for MCP bridge endpoint
- Shell injection prevention in hook scripts (single-quoted paths)
- Relative path resolution before workspace boundary checks
- Resource leak prevention on provider replace/unregister
Co-authored-by: nhokboo <nhokboo@users.noreply.github.com>
Split agent summoning into two sequential LLM calls:
- Step 1: Generate SOUL.md (personality)
- Step 2: Generate IDENTITY.md + USER_PREDEFINED.md using SOUL.md as context
This enables real-time progress updates in the UI as each file completes.
On retry, previously generated files are detected by comparing against
the default template and skipped, so only missing files are regenerated.
Also increase HTTP client timeout from 120s to 300s for both OpenAI and
Anthropic providers. Some models (e.g. Kimi) with deep reasoning can
take 160s+ for non-streaming requests, which exceeded the 120s limit
and caused cascading retry failures until the context deadline expired.
Default bufio.Scanner limit (64KB) can truncate large SSE lines from
tool call arguments or thinking content, causing silent parse failures.
Set buffer to 1MB to match the scale of extended thinking responses.
Also remove a duplicate scanner.Err() check that was accidentally left in.
When stream_options.include_usage is enabled, the OpenAI API sends the
final usage chunk with an empty choices array. The streaming parser
skipped chunks with no choices before checking for usage data, causing
token counts to always be 0 in traces.
Fix: extract usage data before the empty-choices skip.
Also fixes two related SSE parsing issues:
- SSE data prefix now handles both "data: " and "data:" (some providers
like Kimi omit the space after the colon, per SSE spec)
- Added scanner.Err() check to surface stream read errors (timeout,
connection reset) instead of silently returning partial results
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
- SSE: handle "data:" without space (Kimi and other providers)
- Add scanner.Err() check to detect stream read failures
- Echo reasoning_content for thinking models (Kimi, DeepSeek)
- Add Thinking field to Message struct for reasoning passback
- Add GOCLAW_OPENAI_BASE_URL env var override (parity with Anthropic)
Implement full thinking mode system: per-agent thinking_level config (off/low/medium/high)
stored in other_config JSONB, per-provider param injection (Anthropic budget_tokens, OpenAI
reasoning_effort, DashScope enable_thinking+thinking_budget), Anthropic extended thinking with
streaming parse and tool use block preservation via RawAssistantContent, thinking token tracking
in trace spans, and Web UI with agent config selector, chat thinking block rendering, and trace
thinking token display. No DB migration needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add DashScope (Qwen) native provider with tools+streaming fallback
- Add Bailian Coding provider with hardcoded model list (no /v1/models API)
- Parse reasoning_content in OpenAI-compat streaming/non-streaming responses
- Emit ChatEventThinking events in agent loop for thinking models
- Add vision support for DashScope (qwen3-vl)
- Fix provider form dialog not updating API base URL when switching types
- Update README provider count from 11+ to 13+
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allow overriding the Anthropic API base URL via GOCLAW_ANTHROPIC_BASE_URL
env var, config JSON, or DB provider record. Enables use of Anthropic-
compatible proxies and custom endpoints.
Also adds Makefile shortcuts for docker compose (up/down/logs).
- Add tool loop detection (toolloop.go): tracks repeated no-progress tool calls
using SHA256 hashing of args+results. Warning at 5 identical calls, force stop
at 10. Prevents Gemini models from burning tokens in infinite loops.
- Inject AVAILABILITY.md negative context when agent has no team/delegation
targets, so models don't waste iterations probing unavailable capabilities.
- Fix Dockerfile: create /app/.goclaw directory so Docker volume initializes
with correct goclaw:goclaw ownership instead of root:root.
- Update collapseToolCallsWithoutSig comments for clarity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gemini 2.5 flash doesn't return thought_signature, causing
collapseToolCallsWithoutSig to convert tool_calls to "(Used tool ...)"
text that Gemini imitates instead of making real tool calls.
Now strips tool_calls and folds results into plain user messages
so the model retains context without an imitable pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gemini 2.5 flash doesn't return thought_signature in OpenAI-compat responses,
causing collapseToolCallsWithoutSig to convert ALL tool_calls to "(Used tool ...)"
text. Gemini then imitates this pattern instead of making actual tool calls.
Now drops tool_call cycles entirely, preserving only the assistant's text content.
Also fixes word-wrap in trace detail dialog (break-words instead of break-all)
and uses <pre> with whitespace-pre-wrap for trace-level input/output previews.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Implement DM pairing flow for Discord and WhatsApp (checkDMPolicy, sendPairingReply with debounce)
- Add "pairing" case to BaseChannel.CheckPolicy() to reject instead of falling through to "open"
- Add group pending history tracking to Discord (record when not mentioned, prepend context when mentioned)
- Resolve Discord display names with priority: server nickname > global name > username
- Fix Gemini collapse format to prevent model from imitating tool call patterns
- Fix formatTokens crash on null/undefined input
Multi-agent AI gateway with WebSocket RPC, HTTP API, and messaging channel integrations.
Go port of OpenClaw with multi-tenant PostgreSQL, per-user isolation, security hardening,
and production observability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>