* fix(subagent): inherit parent agent's provider instead of alphabetical fallback
Subagents previously used a fixed provider (alphabetically first from the
registry, often "anthropic") regardless of which provider the parent agent
used. This caused invalid combos like anthropic/glm-5 when a zai-coding
agent spawned subagents.
- Pass provider registry to SubagentManager for runtime resolution
- Inject parent provider name into context (WithParentProvider)
- Resolve activeProvider from parent context before LLM call
- Fix trace spans to show actual resolved provider, not default
* fix(providers): api_base fallback from config/env for DB providers
DB providers with empty api_base now inherit from config/env vars
(e.g., GOCLAW_ANTHROPIC_BASE_URL). Prevents proxy API keys from being
sent to the real provider API endpoint.
- Add APIBaseForType() method on ProvidersConfig
- registerProvidersFromDB falls back to config when api_base is empty
- ProvidersHandler uses resolveAPIBase() for model listing
- Add api_base, display_name, settings to provider validation whitelist
* fix(tracing): pass resolved provider name to subagent span emitters
- emitSubagentSpanStart now accepts providerName param instead of
reading sm.provider.Name() — ensures root subagent span reflects
the inherited parent provider, not the fallback default
- registerInMemory now uses resolveAPIBase() so DB providers with
empty api_base inherit the config/env fallback (same as startup path)
---------
Co-authored-by: viettranx <viettranx@gmail.com>
Replace all hardcoded ~/.goclaw path constructions with configurable
sources (cfg.ResolvedDataDir() for service dirs, cfg.Agents.Defaults.Workspace
for agent workspaces). This fixes data persistence issues in Docker
deployments where paths differ from local dev.
- Add DataDir field to Config with ResolvedDataDir() resolver
- Add ResolvedDataDirFromEnv() package-level helper for packages without Config
- Populate StoreConfig.SkillsStorageDir (was never set, caused hardcoded fallback)
- Agent workspaces now use subdirectory format (workspace/{key}) for volume compatibility
- Remove dead GOCLAW_SESSIONS_STORAGE env/config (sessions moved to PostgreSQL)
- Fix deploy-stg.sh trailing space after backslash + remove deprecated GOCLAW_MODE
- Add GOCLAW_SKILLS_DIR override in docker-compose for volume persistence
* feat(providers): add ACP provider for orchestrating external coding agents (#189)
Implement native Go ACP (Agent Client Protocol) client as a new Provider.
Enables GoClaw to orchestrate any ACP-compatible agent (Claude Code, Codex
CLI, Gemini CLI) as a subprocess via JSON-RPC 2.0 over stdio.
- Add bidirectional JSON-RPC 2.0 transport over stdio pipes
- Add subprocess process pool with idle TTL reaping and crash recovery
- Add ACP session lifecycle (initialize, session/new, session/prompt)
- Add tool bridge for agent-initiated fs/terminal/permission requests
- Add workspace sandboxing, shell deny patterns, and env var filtering
- Wire config-based and DB-based provider registration paths
- Export DefaultDenyPatterns from tools package for reuse
* feat(providers): add changelog entry for ACP provider integration
* fix(tools): prevent workspace traversal bypass via /tmp/ fallback in resolveMediaPath
Reject paths containing ".." in the isInTempDir fallback to prevent
workspace escape where traversal path still resolves inside /tmp/.
* fix(tools): block workspace-sibling paths in resolveMediaPath /tmp/ fallback
When workspace is inside /tmp/, traversal paths like workspace/../X
resolve to /tmp/ siblings that pass isInTempDir. Reject paths inside
the workspace parent directory to prevent this escape.
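The two /tmp/ fallback fixes above combine into a check along these lines. This is a sketch under assumptions: allowTempFallback and isUnder are hypothetical names, the temp directory is hard-coded to /tmp, and Unix path separators are assumed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const tempDir = "/tmp"

// isUnder reports whether path equals dir or lies below it (both already cleaned).
func isUnder(path, dir string) bool {
	return path == dir || strings.HasPrefix(path, dir+string(filepath.Separator))
}

// allowTempFallback decides whether a path that failed the workspace check
// may still be accepted because it lives in the temp directory.
func allowTempFallback(path, workspace string) bool {
	// Reject traversal before Clean() hides it. This is deliberately strict
	// and also rejects unusual names such as "a..b".
	if strings.Contains(path, "..") {
		return false
	}
	clean := filepath.Clean(path)
	if !isUnder(clean, tempDir) {
		return false
	}
	// When the workspace itself lives under the temp dir, its siblings would
	// pass the temp check, so reject anything inside the workspace's parent.
	ws := filepath.Clean(workspace)
	if isUnder(ws, tempDir) && isUnder(clean, filepath.Dir(ws)) {
		return false
	}
	return true
}

func main() {
	fmt.Println(allowTempFallback("/tmp/upload.png", "/home/app/workspace"))
	fmt.Println(allowTempFallback("/tmp/ws/../secret", "/tmp/ws"))
	fmt.Println(allowTempFallback("/tmp/other/file", "/tmp/ws"))
}
```

Checking for ".." on the raw string matters because filepath.Clean would otherwise turn /tmp/ws/../secret into an innocuous-looking /tmp/secret.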
* feat(providers): add ACP provider web UI and live reload via pubsub
Web UI for creating/editing ACP providers with dedicated form fields
(binary, args, idle TTL, permission mode, work directory). ACP providers
now update immediately without gateway restart via cache invalidation
pubsub pattern.
Frontend:
- New ACPSection form component with i18n (en/vi/zh)
- Provider form dialog integration with ACP state management
- ACP type badge on providers list page
- Settings field added to provider TypeScript types
Backend:
- ACP models handler (claude/codex/gemini) without API key requirement
- Binary path validation + LookPath verification in verify handler
- Provider CRUD emits cache.invalidate events via msgBus
- Subscriber in gateway_managed.go re-registers ACP providers from DB
- ACP core improvements from code review (helpers, jsonrpc, process,
terminal, tool_bridge)
---------
Co-authored-by: viettranx <viettranx@gmail.com>
Add complete i18n support for channel config tab — field labels, help text,
and select option translations for en/vi/zh. Enable DM streaming by default
for Telegram and Slack channels.
* feat(workspace): add team shared workspace for file collaboration
- Add workspace_write and workspace_read tools for agents to share files across team members
- Create team_workspaces DB table with migration 000017 (file metadata, pinning, tags)
- Implement PostgreSQL store layer for workspace CRUD operations
- Add RPC handlers for workspace list/read/delete from web UI
- Build React workspace tab with file listing, content preview, and delete
- Propagate workspace channel/chatID scope through delegation chain
- Auto-allow workspace tools in agent tool policy when agent belongs to a team
- Inject team workspace guidance into system prompt for team agents
- Add /reset command handler for clearing session history
- Harden MCP bridge context middleware to reject headers when no gateway token
- Add i18n strings for workspace UI in en/vi/zh locales
* feat(teams): add comprehensive task management with followup reminders and recovery
- Add task followup/reminder system with auto-set on lead agent reply and auto-clear when user responds on channel
- Add task recovery ticker to re-dispatch stale/pending tasks periodically
- Add task scopes, filtering by status/channel/chatID, and task events
- Add WS RPC handlers for task CRUD, assignments, comments, events, and bulk operations (teams_tasks.go)
- Add task detail dialog, settings UI for followup config, and scope filtering in web dashboard
- Add migrations 000018 (team_tasks_v2) and 000019 (task_followup)
- Extend team_tasks_tool with await_reply, clear_followup actions
- Auto-complete/fail team tasks when delegate agent finishes
- Add workspace file listing and team tool manager enhancements
* docs(teams): add team system architecture and playbook ideas documentation
- Add TEAM_SYSTEM.md with full architecture design covering task management, shared workspace, and delegation engine subsystems
- Add TEAM_PLAYBOOK_IDEAS.md outlining future team coordination layers (playbook, member capabilities, auto-learned patterns)
- Document data models, status flows, tool actions, followup reminder system, task ticker, execution locking, and workspace scope model
* fix(teams): resolve 6 critical bugs in team task system
- Fix unblock SQL: check array_length after array_remove (not before)
- Enforce single-team leadership in team creation
- Add requireLead() for approve/reject tool actions
- Validate cross-team dependency references in blocked_by
- Add team_id to handoff route for multi-team isolation
- Set blocked_by DEFAULT '{}' to prevent NULL array issues
* refactor(workspace): use stable userID as scope key instead of connection UUID
Workspace scope changed from (team_id, channel, chat_id) to (team_id, userID).
Fixes workspace fragmentation across WS tab refreshes and reconnections.
* feat(teams): add V1/V2 versioning with feature gating and optimized prompts
- IsTeamV2() helper gates advanced features (locking, followup, review, audit)
- V2 tool actions rejected for V1 teams with clear error message
- Ticker, gateway consumer, delegation hooks respect version flag
- TEAM.md renders v1/v2 sections conditionally
- Tool descriptions and params optimized (~38% token reduction)
- UI: version toggle in settings, V2 Beta badge, conditional rendering
- i18n: version modal keys for en/vi/zh
* fix(migration): use VARCHAR(255) for user ID columns and add metadata JSONB
- assignee_user_id, user_id, actor_id: TEXT → VARCHAR(255)
- Add metadata JSONB to team_task_comments and team_task_attachments
---------
Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
Inject user follow-up messages into the running agent loop at turn
boundaries instead of queueing them for a new run. This preserves
context so the LLM sees both tool results and user follow-ups together.
- Add InjectedMessage type and drainInjectChannel helper
- Add InjectCh to ActiveRun with buffered channel (cap=5)
- Drain injection channel at two points in agent loop (after tool
results and before no-tool-calls exit)
- Route steer/new_task intents to InjectMessage with scheduler fallback
- WebSocket: inject into running loop when session is busy
- Remove IntentClassify config toggle (always on)
- Web UI: show send + stop buttons side by side during agent run
- i18n: add injection acknowledgment messages (en/vi/zh)
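A non-blocking drain over a buffered channel, as listed above, might look like the following sketch. InjectedMessage and drainInjectChannel are named in this commit, but the field, the signatures and the tryInject helper are assumptions.

```go
package main

import "fmt"

// InjectedMessage is a user follow-up delivered into a running loop.
// The single field is illustrative.
type InjectedMessage struct {
	Content string
}

// drainInjectChannel collects every message currently buffered without blocking.
func drainInjectChannel(ch <-chan InjectedMessage) []InjectedMessage {
	var out []InjectedMessage
	for {
		select {
		case msg, ok := <-ch:
			if !ok {
				return out // channel closed
			}
			out = append(out, msg)
		default:
			return out // nothing pending right now
		}
	}
}

// tryInject enqueues without blocking. A false result means the buffer is
// full and the caller should fall back to the scheduler. (Hypothetical helper.)
func tryInject(ch chan<- InjectedMessage, msg InjectedMessage) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan InjectedMessage, 5) // cap=5 as in the commit
	for i := 0; i < 6; i++ {
		ok := tryInject(ch, InjectedMessage{Content: fmt.Sprintf("follow-up %d", i)})
		fmt.Println(i, ok)
	}
	fmt.Println("drained:", len(drainInjectChannel(ch)))
}
```

Draining at turn boundaries means the next LLM call sees the follow-ups alongside the latest tool results, while the bounded buffer keeps a chatty user from stalling the sender.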
* feat(telegram): support custom Bot API server URL
Add api_server credential field for Telegram channels to allow
using a self-hosted Telegram Bot API server (e.g. for 2GB upload
limits). Uses telego.WithAPIServer() option.
* feat(telegram): local Bot API media download + config improvements
- Support local Bot API --local mode: detect absolute file paths and
copy directly from filesystem (requires shared volume mount)
- Fall back to HTTP download via custom API server URL for non-local mode
- Move api_server/proxy from credentials to config (non-secret fields);
credentials still accepted for backward compat
- Add media_max_mb config field (friendlier than media_max_bytes);
deprecated media_max_bytes still works
- Default to 200 MB max when local Bot API is configured (vs 20 MB cloud)
- UI: move api_server/proxy to config section, use MB for media size
* fix: use 127.0.0.1 instead of localhost in API server placeholder
Docker containers cannot resolve localhost to the host machine.
127.0.0.1 works reliably with Docker's default bridge network.
---------
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
The per-agent `imageGen` and `vision` fields in `ToolPolicySpec` (stored
in agents.tools_config JSONB) were added in d5cc5a7 (Feb 26) as the
original way to configure image/vision providers. When the media provider
chain system was introduced in 5815437 (Mar 8), these fields were kept
"for backward compat" but became dead code with no UI to manage them.
This causes a hard-to-debug issue: if an agent's tools_config contains
stale imageGen/vision data (set via API or leftover from DB), it silently
overrides the global provider chain configured in the builtin tools UI.
Users see the correct chain in the UI but the tool calls a completely
different provider/model, with no indication of why.
Changes:
- Remove Vision and ImageGen fields + struct definitions from ToolPolicySpec
- Remove associated context helpers (WithVisionConfig, WithImageGenConfig, etc.)
- Remove per-agent override injection in agent loop
- Simplify create_image and read_image to use chain as sole source of truth
- UI: whitelist known tools_config fields on save to clean stale DB data
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
- A1+C2: Include token usage in run.completed event payload for WS clients
- A2: Cost tracking with model pricing config, cost calculation, and cost summary API
- A3: Budget enforcement per agent with monthly budget limits (migration 000015)
- C1: External wake/trigger API (POST /v1/agents/{id}/wake) for orchestrators
- C3: Activity audit trail with structured logging and queryable API
- UI: Activity page, cost stat card on overview, budget section in agent detail
- i18n: Complete en/vi/zh translations for all new features
* feat(cron): configurable default timezone for cron expressions
Cron expressions (e.g. "0 8 * * *") are evaluated relative to a timezone.
Without an explicit per-job timezone, they default to the server's system
timezone, which may not match the user's local time — especially in Docker
containers (default UTC) or multi-region deployments.
This adds a `default_timezone` setting to `CronConfig` (IANA format, e.g.
"Asia/Ho_Chi_Minh") that is applied as fallback when a cron job has no
explicit `schedule.tz`. The setting is configurable via the UI config page
(Integrations → Cron Scheduler) and hot-reloads on config changes.
Backend:
- Add `DefaultTimezone` field to `CronConfig`
- Add `SetDefaultTimezone()` to `CronStore` interface + PG implementation
- Apply default TZ in `AddJob()` when `schedule.TZ` is empty
- Wire at startup + subscribe to config change events for hot reload
- Update cron tool description so LLM knows about gateway default
Frontend:
- Add timezone dropdown (20 common IANA timezones) to Cron config section
- Add i18n keys for en, vi, zh
* fix(cron): apply default timezone to existing jobs via computeNextRun
Pass defaultTZ as fallback to computeNextRun so existing cron jobs
(with timezone = NULL in DB) also use the gateway's configured default
timezone when computing next_run_at. This ensures old jobs benefit
from the timezone setting without needing a DB migration or backfill.
---------
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
* feat(providers): add Ollama local and Ollama Cloud provider support
Adds two new provider variants:
ollama — local/self-hosted Ollama instance
- Gated on providers.ollama.host in config (or GOCLAW_OLLAMA_HOST env)
- No API key required; Ollama's OpenAI-compat endpoint accepts any Bearer value
- Defaults to http://localhost:11434/v1 (configurable for LAN/remote hosts)
- Default model: llama3.3
ollama-cloud — Ollama Cloud (managed remote inference)
- Gated on providers.ollama_cloud.api_key (or GOCLAW_OLLAMA_CLOUD_API_KEY env)
- Bearer token from ollama.com/settings/keys
- Default base URL: https://ollama.com/v1 (overridable via api_base)
- Default model: llama3.3
Both variants use the existing NewOpenAIProvider (OpenAI-compat) — no new
provider struct needed. Both are registered from config file and DB (via
llm_providers table with ProviderOllama / ProviderOllamaCloud types).
OllamaCloud.APIKey follows all existing secret handling patterns:
MaskedCopy, StripSecrets, StripMaskedSecrets.
* feat(providers): wire Ollama into web UI and fix DB registration
- Add ollama + ollama_cloud to PROVIDER_TYPES constants (dropdowns)
- Fix setup wizard: skip API key requirement for Ollama local (isOllama)
- Fix bootstrap status: recognize Ollama local as no-API-key provider
- Add ollama_cloud to config-page KNOWN_PROVIDERS list
- Fix gateway_providers.go: move ProviderOllama before APIKey=='' guard
so DB-registered local Ollama providers actually register at startup
(same pattern as ClaudeCLI, which also needs no API key)
The onboard wizard sets GOCLAW_PROVIDER and GOCLAW_MODEL in .env for
initial bootstrap. Previously these env vars always overrode the config
file value via envStr(), making it impossible to change the default
provider/model through the Dashboard — every save was silently reverted
by ApplyEnvOverrides().
Change envStr to envFallback for these two fields: the env var is only
applied when the config file has no value (empty string). Once the user
saves a provider/model via the Dashboard, the config-file value wins.
Also:
- Stabilize ProviderModelSelect auto-select effect (useRef + useMemo)
- Add toast feedback on config save success/failure
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
Add Provider, Model, MaxTokens to PendingCompactionConfig so users can
override the LLM used for pending message summarization via the config
UI. Falls back to agent's provider/model when not set. Increase default
max_tokens from 512 to 4096. Add allowEmpty prop to ProviderModelSelect
to prevent auto-selecting first provider when empty means "use default".
- Fix compact endpoint using random provider instead of agent's configured provider+model
- Wire auto-compaction for all 5 channel types (telegram, discord, slack, feishu, zalo_personal)
via PendingCompactable interface and InstanceLoader
- Add global PendingCompactionConfig (threshold, keep_recent) to ChannelsConfig
- Wire global config through InstanceLoader and PendingMessagesHandler
- Increase compaction timeout from 45s to 180s for slow providers
- Add pending compaction config card to Behavior tab in config page
- Add HowItWorksCard (expanded by default) and toast notifications to pending messages page
- Add i18n support for all new strings (en/vi/zh)
- Update go.mod and Dockerfile to Go 1.26
- Apply `go fix ./...` stdlib modernizations across 170+ files
- Add `go fix` to post-implementation checklist in CLAUDE.md
- Fix go fix misapplied rewrite in loop_history.go
- Add tool status display on channels during tool execution (streaming preview + reactions)
- Emit agent.activity events at phase transitions (thinking, tool_exec, compacting)
- Enrich delegation progress with per-member activity and tool info
- Add LLM-based intent classifier for DM status queries when agent is busy
- Keyword fast-path for cancel/status patterns (no LLM cost)
- Falls back to LLM classification with 5s timeout
- Supports status_query (immediate reply) and cancel (abort run) intents
- Register/unregister runs in makeSchedulerRunFunc for channel inbound tracking
- Add sessionRuns secondary index in Router for O(1) IsSessionBusy lookups
- Add intent_classify config toggle (global default + per-agent override)
- Add tool_status config toggle for channel tool status display
- Add i18n keys and translations (en/vi/zh) for status messages
- Add web UI config toggles for intent_classify and tool_status
* feat: add Z.ai provider support (general API + coding plan)
Add Z.ai (GLM) as a new LLM provider with two variants:
- `zai`: general API (api.z.ai/api/paas/v4)
- `zai_coding`: coding plan (api.z.ai/api/coding/paas/v4)
Reuses OpenAIProvider — Z.ai API is OpenAI-compatible with Bearer
token auth, SSE streaming, and reasoning_content support.
Includes: store constants, config struct fields, env var loading
(GOCLAW_ZAI_API_KEY, GOCLAW_ZAI_CODING_API_KEY), secret masking,
config + DB registration, onboard wizard, and UI provider types.
Default model: glm-5
Closes #100
* docs: add Z.ai provider entries to providers documentation
* fix(agent): use ChannelType in system prompt for proper channel context
The system prompt was using the channel instance name (e.g. "zep-lao") instead
of the platform type (e.g. "zalo_personal"), causing the LLM to not understand
which messaging platform it's running on. This led to context confusion where
the bot would ask users which channel to send to instead of using the current one.
Changes:
- Add ChannelType field to RunRequest and SystemPromptConfig
- Thread channel type from consumer/cron → agent loop → system prompt
- Add WithToolChannelType/ToolChannelTypeFromCtx for tool context
- Register channel types for both config-based and DB-loaded instances
- Fix Zalo group thread type detection with approvedGroups cache
- Update cron handler to resolve channel type for cron-triggered runs
* refactor(channels): add Type() to Channel interface, remove channelTypes map
Move channel type from a separate map in Manager to the Channel interface
itself. BaseChannel.Type() falls back to Name() for config-based channels
where name == type. Extracts resolveChannelType helper to DRY up 6
repeated resolution blocks across consumer and cron handlers.
* feat(zalo): add pending group history for conversation context
Zalo personal groups now record non-@mentioned messages in a ring buffer
(default 50, configurable via history_limit). When the bot IS mentioned,
pending history is flushed as context — matching Telegram/Discord/Feishu.
Separated mention gating from policy gating in checkGroupPolicy for
cleaner control flow.
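A bounded history with ring-buffer semantics could be sketched as below. The type and method names are hypothetical, and a plain slice stands in for a true circular buffer to keep the example short.

```go
package main

import "fmt"

// historyBuffer keeps the most recent `limit` non-mentioned messages of a group.
type historyBuffer struct {
	limit int
	items []string
}

// newHistoryBuffer applies the default of 50 when no positive limit is given.
func newHistoryBuffer(limit int) *historyBuffer {
	if limit <= 0 {
		limit = 50
	}
	return &historyBuffer{limit: limit}
}

// Record appends a message, evicting the oldest one when the buffer is full.
func (h *historyBuffer) Record(msg string) {
	if len(h.items) == h.limit {
		copy(h.items, h.items[1:])
		h.items = h.items[:h.limit-1]
	}
	h.items = append(h.items, msg)
}

// Flush returns the pending messages oldest-first and clears the buffer.
// It is meant to be called when the bot is mentioned.
func (h *historyBuffer) Flush() []string {
	out := h.items
	h.items = nil
	return out
}

func main() {
	h := newHistoryBuffer(3)
	for _, m := range []string{"a", "b", "c", "d"} {
		h.Record(m)
	}
	fmt.Println(h.Flush()) // the three most recent messages
	fmt.Println(len(h.Flush()))
}
```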
Extract shared media utilities (MediaInfo, BuildMediaTags, TranscribeAudio,
DetectMIMEType) into internal/channels/media/ and refactor Telegram to use
them. Add full inbound/outbound media support to Discord and Feishu channels
(STT transcription, document extraction, media tags, voice agent routing).
Add WebSocket media upload/serve endpoints and MIME-aware media tags in
chat.send. Split large channel files for maintainability.
Add optional Redis cache support via `go build -tags redis`, following
the same paired-stub pattern as OTel and Tailscale. The Cache[V] interface
is unchanged; Redis and in-memory implementations are injected at startup
without altering usage logic.
- Add RedisCache[V] implementation with JSON serialization, fail-open on errors
- Add gateway_redis.go / gateway_redis_noop.go paired wiring files
- Refactor GroupWriterCache and ContextFileInterceptor to accept injected caches
- Add GOCLAW_REDIS_DSN env var, docker-compose.redis.yml overlay
- Update Dockerfile and GitHub Actions with ENABLE_REDIS build arg
- Add Redis variant to CI matrix (5 variants: latest, otel, tsnet, redis, full)
Add Claude CLI as an LLM provider (subscription-based, no API key needed).
The CLI manages session history, tool execution, and context while GoClaw
forwards messages and streams responses.
Key features:
- Claude CLI provider with session persistence (--resume)
- MCP bridge server exposing GoClaw tools to CLI via streamable-http
- Security hooks (shell deny patterns, workspace path restrictions)
- Per-session mutex preventing concurrent CLI calls
- Onboard wizard for Claude CLI setup and auth verification
- Web UI for adding/managing Claude CLI provider with auth status
- Provider registry Close() for proper shutdown cleanup
Security:
- CLI path validation (only "claude" or absolute paths from DB)
- Token auth middleware for MCP bridge endpoint
- Shell injection prevention in hook scripts (single-quoted paths)
- Relative path resolution before workspace boundary checks
- Resource leak prevention on provider replace/unregister
Co-authored-by: nhokboo <nhokboo@users.noreply.github.com>
Add block.reply event that delivers intermediate assistant text to non-streaming
channels during multi-tool iterations. Includes 2-tier config toggle:
gateway-level default (disabled) + per-channel override (inherit/on/off).
Backend:
- Emit block.reply events from agent loop between tool iterations
- Add BlockReply *bool to GatewayConfig and all 6 channel config structs
- Add BlockReplyChannel interface with ResolveBlockReply() resolution
- Guard delivery in HandleAgentEvent by RunContext.BlockReplyEnabled
- Resolve config at RegisterRun time, pass to consumer goroutine
- Conditional dedup: skip final message if identical to last block reply
UI:
- Gateway settings: Switch toggle for global default
- Per-channel: tri-state select (Inherit from gateway / Enabled / Disabled)
- Protocol: BLOCK_REPLY constant in AgentEventTypes
- Form: coerceBoolSelects for proper JSON boolean serialization
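The 2-tier toggle maps naturally onto *bool, where nil means "inherit". A minimal sketch, assuming a free function rather than the actual ResolveBlockReply() method on the channel interface:

```go
package main

import "fmt"

// resolveBlockReply applies the per-channel override when set, otherwise the
// gateway default, otherwise disabled.
func resolveBlockReply(gateway, channel *bool) bool {
	if channel != nil {
		return *channel // explicit on/off on the channel
	}
	if gateway != nil {
		return *gateway // channel inherits the gateway setting
	}
	return false // gateway default is disabled
}

// boolPtr is a small helper for building tri-state values.
func boolPtr(b bool) *bool { return &b }

func main() {
	fmt.Println(resolveBlockReply(nil, nil))                     // nothing configured
	fmt.Println(resolveBlockReply(boolPtr(true), nil))           // inherit: on
	fmt.Println(resolveBlockReply(boolPtr(true), boolPtr(false))) // channel override: off
}
```

This is also why the form needs real JSON booleans: a string "false" would not unmarshal into *bool, and an absent field must stay nil to keep "inherit" distinct from "off".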
* feat(browser): add remote Chrome sidecar support for Docker deployments
When running in Docker, Chrome is not installed in the runtime image.
This adds support for connecting to a remote Chrome via CDP (Chrome
DevTools Protocol) using a Docker Compose sidecar overlay, following
the existing pattern used by sandbox, OTel, and Tailscale overlays.
Changes:
- Add RemoteURL field to BrowserToolConfig
- Add GOCLAW_BROWSER_REMOTE_URL env var (auto-enables browser tool)
- Browser Manager: remote CDP connection with hostname-to-IP resolution
(required by Chrome M113+ DNS rebinding protection), auto-reconnect
on dead connections, disconnect-only on Stop (sidecar stays alive)
- Auto-start browser on first tool action (no explicit "start" needed)
- Add docker-compose.browser.yml overlay (zenika/alpine-chrome:124)
- Add unit tests for CDP resolution and Manager lifecycle
Usage:
docker compose -f docker-compose.yml -f docker-compose.managed.yml \
-f docker-compose.browser.yml up -d --build
Closes #56
* feat(browser): fix onboard summary and config serialization for remote mode
- onboard.go: show "remote: ws://..." instead of "headless" when RemoteURL is set
- onboard_auto.go: serialize remote_url field in generated config
---------
Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
Security: fix cross-agent MCP tool leak by cloning tool registry before MCP registration.
MCP: enforce mcp_ prefix on all tool names, add cache invalidation on server/grant changes,
add grant management endpoints, add group:mcp policy support for per-agent allowlisting.
Skills: persist full YAML frontmatter, auto-promote/demote visibility on grant/revoke,
simplify versioning, handle ZIP wrapper directories, expand tilde in skillsDir path.
Fixes: wrap DeleteSkill cascade in transaction, use atomic NOT EXISTS for revoke-demote,
create cancel context before storing server in map.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add blocked_domains to web_fetch policy (always enforced regardless of allow_all/allowlist mode)
- Refactor domain matching into shared matchDomainList() for allowlist and blocklist
- Enhance server IP scrubbing: register decimal IP, dashed pattern, and reverse DNS hostnames
- Add SOUL.md scope enforcement to delegation ExtraPrompt so agents refuse out-of-scope tasks
- Add blocked domains UI textarea in dashboard tools config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Part A — Channel quota limiter (managed mode):
- DB-backed per-user/group request quotas with in-memory 60s TTL cache
- Config merge priority: Groups > Channels > Providers > Default
- Per-group quota override via channels.telegram.groups[chatID].quota
- Migration 000009: index on channel_requests for quota queries
- Hot-reload quota config via pub/sub (TopicConfigChanged)
Part B — Per-run tool call budget:
- Soft stop at configurable limit (default 25, per-agent override)
- MaxToolCalls field on AgentDefaults + AgentSpec + LoopConfig
- LLM gets one final call to summarize when budget exceeded
Part C — Web UI + config page refactor:
- QuotaSection with provider/channel dropdowns (useProviders, useChannelInstances)
- Config page refactored to vertical sidebar tabs layout
- Categories: General, Quota, Agents, Tools, Connections, Advanced, Raw Editor
- Fixed config.patch RPC to serialize raw JSON + baseHash correctly
- Config change pub/sub broadcast from handleApply/handlePatch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SSE: handle "data:" without space (Kimi and other providers)
- Add scanner.Err() check to detect stream read failures
- Echo reasoning_content for thinking models (Kimi, DeepSeek)
- Add Thinking field to Message struct for reasoning passback
- Add GOCLAW_OPENAI_BASE_URL env var override (parity with Anthropic)
- Add sendMessageDraft transport (disabled pending Telegram client fix for
"reply to deleted message" artifact — tdesktop#10315, bugs.telegram.org/c/561)
- Split stream_mode into dm_stream/group_stream boolean flags (both default false)
- DM messages no longer set reply_to_message_id (cleaner UX, matching TS)
- Progressive placeholder editing for DMs: "Thinking..." → stream chunks → final
- Update web UI with separate DM/Group streaming toggles
fix(agent): prevent false MEDIA: detection in tool output
parseMediaResult() used strings.Index to find "MEDIA:" anywhere in tool output,
causing false positives when external content (e.g. GitHub releases page)
contained commit messages like "return MEDIA: path from screenshot".
Changed to strings.HasPrefix to only match at start of output.
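The change amounts to something like the following sketch. The return shape of parseMediaResult is an assumption; only the switch from strings.Index to strings.HasPrefix comes from this commit.

```go
package main

import (
	"fmt"
	"strings"
)

const mediaPrefix = "MEDIA:"

// parseMediaResult returns the media path only when the tool output starts
// with the marker, so the marker appearing inside fetched content is ignored.
func parseMediaResult(output string) (string, bool) {
	if !strings.HasPrefix(output, mediaPrefix) {
		return "", false
	}
	return strings.TrimSpace(output[len(mediaPrefix):]), true
}

func main() {
	fmt.Println(parseMediaResult("MEDIA: /tmp/shot.png"))
	fmt.Println(parseMediaResult("release notes: return MEDIA: path from screenshot"))
}
```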
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(channels): implement Zalo Personal Chat (ZCA) protocol layer
Implement complete Zalo Personal Chat integration including:
- Message protocol layer (request/response/event types)
- Connection management with auth flow
- Message sending/receiving with text and media support
- User/group management and sync
- Telegram-style contact and conversation handling
- Comprehensive unit tests with 85%+ coverage
Architecture follows existing channel patterns (Telegram, Feishu) with
raw API calls for session management and message delivery. Includes
error handling, rate limiting awareness, and logging.
* feat(channels): add Zalo Personal channel integration layer
Wire protocol package to GoClaw's channel system:
- channel.go: Channel struct, Start/Stop/Send, listenLoop, message handlers
- auth.go: credential resolution (preloaded > file > QR), persistence
- policy.go: DM/group policy, @mention gating, pairing with debounce
- factory.go: managed mode factory (requires credentials, no QR)
- cmd/gateway.go: register standalone + managed factory
* feat(ui): add Zalo Personal channel type to web dashboard
Add zalo_personal to channel type dropdown, credential fields
(IMEI, cookie, userAgent), and config schema (DM/group policy,
require_mention, allow_from).
* feat(channels): add WebSocket QR login for Zalo Personal channel
Add real-time QR code login flow for zalo_personal channel instances
in managed mode. Users create an instance without credentials, then
trigger QR login from the web dashboard.
Backend:
- New RPC method zalo.personal.qr.start with per-instance mutex
- QR PNG pushed via client-scoped WS events (not broadcast)
- Credentials encrypted and saved to DB on successful scan
- Cache invalidation triggers automatic channel reload/start
- Factory returns nil,nil for missing credentials (skip, not error)
- Instance loader handles nil-channel gracefully
Frontend:
- ZaloPersonalQRDialog with auto-start, retry, and auto-close
- QR button in channel instances table for zalo_personal type
- Credential fields no longer required (auto-populated via QR)
* fix(channels): skip redundant LoginWithCredentials after QR login
QR flow already validates session via qrCheckSession + qrGetUserInfo.
Calling LoginWithCredentials again conflicts with the active QR session
state, causing "empty response" errors. Credentials are validated when
the channel starts instead. Also rename log prefix from "zca" to
"Zalo Personal".
* fix(channels): fix Zalo Personal cookie domain for login API
BuildCookieJar only set cookies for chat.zalo.me but the login API
uses wpa.chat.zalo.me. Cookies weren't sent to the subdomain, causing
"empty response" on channel startup. Now sets cookies for both hosts.
* fix(channels): move UTF-8 check after gzip decompression in Zalo listener
The UTF-8 validity check in decryptAESGCMPayload ran on raw decrypted
bytes before gzip decompression, causing all encType=2 (AES-GCM+gzip)
messages to fail with "decrypted payload is not valid UTF-8".
Move the check to decryptEventData so it runs after all processing
(decryption + decompression) is complete.
* feat(channels): add QR-only onboarding and contacts picker for Zalo Personal
- Remove credential text fields for zalo_personal, show QR auth info banner
- Add has_credentials boolean to HTTP and WS mask functions
- Implement FetchFriends/FetchGroups protocol (encrypted Zalo API)
- Add zalo.personal.contacts WS RPC method with parallel fetch
- Create ZaloContactsPicker component with search, selection, manual entry
- Integrate picker in channel instance edit dialog for allow_from config
* refactor(channels): rename zca error prefix to zalo_personal across protocol package
* fix(channels): unwrap inner response envelope in Zalo contacts decryption
The Zalo API returns double-wrapped responses: outer envelope contains
encrypted base64 data, which when decrypted yields another Response
envelope with error_code and data fields. The decryptDataField helper
was returning the raw decrypted bytes without unwrapping the inner
envelope, causing json unmarshal failures when parsing friends/groups.
* fix(channels): pass version 0 for group details to get full data
The Zalo group info endpoint uses a version-based caching mechanism.
Passing the actual version from step 1 causes the server to return
the group in "unchangedsGroup" with empty "gridInfoMap". By passing
version 0 for all groups, we force the server to return full group
info including name, avatar, and member count.
* fix(ui): auto-load contacts on modal reopen to resolve display names
When the edit modal is reopened with already-selected contact IDs,
contacts are now auto-fetched so badges show display names instead
of raw numeric IDs.
* fix(channels): handle gzip-compressed response in Zalo SendMessage
SendMessage used io.ReadAll + json.Unmarshal directly but the response
is gzip-compressed (Accept-Encoding: gzip header). Use readJSON() which
handles gzip decompression, fixing "invalid character '\x1f'" errors.
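A minimal sketch of a gzip-aware JSON reader in the spirit of readJSON(). The real helper's signature is not shown here, so decodeJSON, maybeGunzip and errorCodeOf are hypothetical, and sniffing the 0x1f 0x8b magic bytes (rather than checking a Content-Encoding header) is an assumption.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// maybeGunzip inflates body only when it starts with the gzip magic bytes.
func maybeGunzip(body []byte) ([]byte, error) {
	if len(body) < 2 || body[0] != 0x1f || body[1] != 0x8b {
		return body, nil
	}
	zr, err := gzip.NewReader(bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// decodeJSON decompresses if needed, then unmarshals into v.
func decodeJSON(body []byte, v any) error {
	plain, err := maybeGunzip(body)
	if err != nil {
		return fmt.Errorf("gunzip: %w", err)
	}
	return json.Unmarshal(plain, v)
}

// errorCodeOf reads the error_code field from a possibly compressed body.
func errorCodeOf(body []byte) (int, error) {
	var env struct {
		ErrorCode int `json:"error_code"`
	}
	if err := decodeJSON(body, &env); err != nil {
		return 0, err
	}
	return env.ErrorCode, nil
}

// compress gzips s so the example has a compressed body to decode.
func compress(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

func main() {
	code, err := errorCodeOf(compress(`{"error_code":7}`))
	fmt.Println(code, err)
}
```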
* fix(channels): decrypt encrypted send response in Zalo SendMessage
The Zalo send message API response is encrypted, like the responses of all other endpoints.
Parse outer envelope, decrypt the data field, then extract msgId from
the decrypted inner response.
* feat(channels): improve Zalo listener reliability and UI channel wizard
- Migrate WebSocket client from gorilla to coder/websocket, eliminating
unsafe/reflect hacks for RSV1 decompression and buffer inspection
- Add channel-level restart with exponential backoff (2s→60s cap, max 10)
so channels auto-recover instead of stopping permanently
- Reset listener retry counters after 60s stable connection to prevent
long-lived connections from exhausting retry budget
- Add code 3000 (duplicate session) recovery with 60s initial delay
- Detect silent disconnects via read deadline (2.5x ping interval)
- Fix Stop() to always cancel context, preventing reconnect timer leaks
- Refactor UI channel form into wizard-based flow with registry pattern
- Auto-refresh channel status after create/update dialog closes
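A minimal sketch of the restart schedule and the silent-disconnect timeout from the list above. The 2s base, 60s cap, 10-attempt budget and 2.5x factor come from the commit message; the doubling factor and the function names are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseDelay   = 2 * time.Second
	maxDelay    = 60 * time.Second
	maxAttempts = 10
)

// restartDelay returns the wait before restart attempt n (1-based).
// ok is false once the retry budget is exhausted.
func restartDelay(attempt int) (d time.Duration, ok bool) {
	if attempt < 1 || attempt > maxAttempts {
		return 0, false
	}
	d = baseDelay
	for i := 1; i < attempt; i++ {
		d *= 2 // assumption: delay doubles per attempt
		if d >= maxDelay {
			return maxDelay, true
		}
	}
	return d, true
}

// silentDisconnectTimeout is the read deadline: 2.5x the ping interval.
func silentDisconnectTimeout(pingInterval time.Duration) time.Duration {
	return pingInterval * 5 / 2
}

func main() {
	for n := 1; n <= maxAttempts+1; n++ {
		d, ok := restartDelay(n)
		fmt.Println(n, d, ok)
	}
	fmt.Println(silentDisconnectTimeout(20 * time.Second))
}
```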
* refactor(channels): move Zalo RPC methods to zalomethods package
Move Zalo personal channel RPC handlers from internal/gateway/methods to
internal/channels/zalo/personal/zalomethods, improving code organization
and removing prefix redundancy. Rename types: ZaloPersonalQRMethods →
QRMethods, ZaloPersonalContactsMethods → ContactsMethods.
- Move zalo_personal_qr.go → zalomethods/qr.go
- Move zalo_personal_contacts.go → zalomethods/contacts.go
- Update imports in cmd/gateway.go (2 call sites)
- Update internal/channels/zalo/personal imports
* feat(channels): add typing indicator to Zalo Personal channel
Show "typing..." in Zalo while the LLM processes messages, matching
the Telegram/Discord pattern. Uses the shared typing.Controller with
4s keepalive (Zalo typing expires ~5s) and 60s TTL safety net.
* feat(channels): handle image attachments in Zalo Personal channel
- Add Raw field to Content struct to preserve non-string JSON payloads
- Add Attachment struct with IsImage() detection (ext + Zalo CDN paths)
- Add AttachmentText() for human-readable placeholders (image/file/other)
- Download image attachments to temp files for agent vision pipeline
- Non-image files get text placeholder only (no download)
- Fix URL query param stripping in file extension detection
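A minimal sketch of the extension detection with query stripping. The extension set and the names fileExt and isImageURL are assumptions, and the Zalo CDN path check is left out because its patterns are not given here.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// imageExts is an assumed set of image extensions.
var imageExts = map[string]bool{
	".jpg": true, ".jpeg": true, ".png": true, ".gif": true, ".webp": true,
}

// fileExt returns the lowercase extension of the URL path. Parsing the URL
// first keeps query strings and fragments out of the result.
func fileExt(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return strings.ToLower(path.Ext(u.Path))
}

// isImageURL reports whether the URL path ends in a known image extension.
func isImageURL(rawURL string) bool {
	return imageExts[fileExt(rawURL)]
}

func main() {
	fmt.Println(fileExt("https://example.com/a/photo.JPG?sig=abc.def"))
	fmt.Println(isImageURL("https://example.com/doc.pdf?thumb=1.png"))
}
```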
* fix(channels): switch Zalo WS client to gorilla/websocket with cookie jar fix
coder/websocket did not propagate session cookies for wss:// URLs,
causing Zalo backend to reject connections with "zpw_sek not found".
Switch to gorilla/websocket which handles wss→https scheme conversion
natively. Add wsJar safety wrapper and fix Close() mutex consistency.
Also update Makefile `up` target to use --no-cache builds.
* fix(channels): inject cookies manually for Zalo WS connection
Replace wsJar wrapper with direct cookie injection from chat.zalo.me
base domain. Fixes host-only cookies (zpw_sek) not matching WS
subdomains (ws*-msg.chat.zalo.me) due to Go cookiejar limitations.
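A minimal sketch of why manual injection is needed, using only the standard library. A cookie set without a Domain attribute is stored as host-only, so the jar returns nothing for a subdomain; reading the cookies for the base URL and writing them into a Cookie header sidesteps that. The helper names and the ws1-msg host are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
	"strings"
)

const baseURL = "https://chat.zalo.me"

// jarWithHostOnlyCookie stores one cookie on the base host with no Domain
// attribute, which makes it host-only.
func jarWithHostOnlyCookie(name, value string) http.CookieJar {
	jar, err := cookiejar.New(nil)
	if err != nil {
		panic(err)
	}
	u, err := url.Parse(baseURL)
	if err != nil {
		panic(err)
	}
	jar.SetCookies(u, []*http.Cookie{{Name: name, Value: value}})
	return jar
}

// cookieHeader renders the cookies the jar would send to rawURL.
func cookieHeader(jar http.CookieJar, rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	var parts []string
	for _, c := range jar.Cookies(u) {
		parts = append(parts, c.Name+"="+c.Value)
	}
	return strings.Join(parts, "; ")
}

// wsHeaders builds handshake headers from the base-domain cookies.
func wsHeaders(jar http.CookieJar) http.Header {
	h := http.Header{}
	if v := cookieHeader(jar, baseURL); v != "" {
		h.Set("Cookie", v)
	}
	return h
}

func main() {
	jar := jarWithHostOnlyCookie("zpw_sek", "example")
	fmt.Printf("subdomain: %q\n", cookieHeader(jar, "https://ws1-msg.chat.zalo.me"))
	fmt.Printf("injected:  %q\n", wsHeaders(jar).Get("Cookie"))
}
```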
* fix(channels): harden Zalo Personal channel security and concurrency
- Add SSRF protection to downloadFile using CheckSSRF (URL validation,
private IP blocking, DNS pinning) with context and 30s timeout
- Protect c.sess/c.listener with sync.RWMutex to eliminate data races
during restart; add thread-safe session()/getListener() accessors
- Add stopped flag + reconnTimer to Listener to prevent zombie reconnects
after Stop(); timer cancelled on Stop(), checked before Start()
- Fix QR flow using context.Background() detached from WS client; now
derives from parent ctx so flow cancels on client disconnect
- Set initial 30s read deadline for cipher key handshake to prevent
indefinite blocking before ping loop starts
- Use defer in WSClient.Close() to prevent connection leak on panic
- Document ReadMessage ctx limitation and two-layer reconnect design
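A minimal sketch of the stopped flag and reconnTimer idea from the list above. The Listener here is a stripped-down stand-in with hypothetical method signatures, not the real type.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Listener is a reduced stand-in holding only the reconnect state.
type Listener struct {
	mu          sync.Mutex
	stopped     bool
	reconnTimer *time.Timer
}

// scheduleReconnect arms the timer unless Stop() was already called.
// It reports whether a reconnect was scheduled.
func (l *Listener) scheduleReconnect(delay time.Duration, reconnect func()) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.stopped {
		return false
	}
	if l.reconnTimer != nil {
		l.reconnTimer.Stop()
	}
	l.reconnTimer = time.AfterFunc(delay, func() {
		// Re-check the flag: Stop() may have run while the timer was pending.
		l.mu.Lock()
		stopped := l.stopped
		l.mu.Unlock()
		if !stopped {
			reconnect()
		}
	})
	return true
}

// Stop marks the listener stopped and cancels any pending reconnect.
func (l *Listener) Stop() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.stopped = true
	if l.reconnTimer != nil {
		l.reconnTimer.Stop()
		l.reconnTimer = nil
	}
}

func main() {
	l := &Listener{}
	fmt.Println(l.scheduleReconnect(time.Hour, func() {}))
	l.Stop()
	fmt.Println(l.scheduleReconnect(time.Hour, func() {}))
}
```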
* chore: remove unused gobwas/ws dependency from go.mod
gobwas/ws was a leftover from the previous coder/websocket usage,
no longer imported by any Go source files.
* fix(channels): align Zalo Personal policy defaults across UI and backend
Policy defaults were inconsistent across three layers causing group/DM
allowlist enforcement to silently fail. New() applied "allowlist" default
to local vars but never wrote back to config; checkGroupPolicy() then
read empty string and defaulted to "open", bypassing the allowlist.
UI Select components displayed schema defaults visually without
persisting them to configValues, so DB config never stored the policy.
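A minimal sketch of the backend half of this bug, assuming a reduced Config with hypothetical field names. groupAllowed falls back to "open" on an empty policy, as described above, so the allowlist is only enforced once the default is written back to the config.

```go
package main

import "fmt"

// Config is a reduced stand-in; the field names are assumptions.
type Config struct {
	GroupPolicy string
	AllowFrom   []string
}

// applyDefaults writes the default policy back into the config instead of
// keeping it in a local variable.
func applyDefaults(cfg *Config) {
	if cfg.GroupPolicy == "" {
		cfg.GroupPolicy = "allowlist"
	}
}

// groupAllowed treats an empty policy as "open", mirroring the fallback
// described in the commit message.
func groupAllowed(cfg *Config, groupID string) bool {
	policy := cfg.GroupPolicy
	if policy == "" {
		policy = "open"
	}
	if policy == "open" {
		return true
	}
	for _, id := range cfg.AllowFrom {
		if id == groupID {
			return true
		}
	}
	return false
}

func main() {
	cfg := &Config{AllowFrom: []string{"g1"}}
	fmt.Println("before defaults:", groupAllowed(cfg, "g2"))
	applyDefaults(cfg)
	fmt.Println("after defaults: ", groupAllowed(cfg, "g2"))
}
```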
When LLMs call team_tasks create + spawn in parallel, the spawn
tool receives a hallucinated task_id that fails uuid.Parse, causing
a misleading error and bypassing orphan detection.
- Include pending task IDs in spawn error message so LLM can retry
with the correct UUID
- Move spawn counting to post-execution so failed spawns don't
increment teamTaskSpawns, allowing orphan detection to fire
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port 4 missing Telegram forum/topic features from TypeScript OpenClaw (items 1-4) and add a Web UI form for them (item 5):
1. Thread-not-found fallback: retry sends without message_thread_id when
a topic is deleted (sendHTML, sendPhoto, sendVideo, sendAudio,
sendDocument, stream flush).
2. Per-topic config: hierarchical config resolution (global → wildcard
group "*" → specific group → specific topic) for groupPolicy,
requireMention, allowFrom, enabled, skills, systemPrompt.
New TelegramGroupConfig/TelegramTopicConfig structs, resolveTopicConfig()
with 10 unit tests.
3. DM topic support: preserve message_thread_id in private chats for
session isolation. New BuildDMThreadSessionKey, parseRawChatID handles
🧵 suffix.
4. createForumTopic agent tool: ForumTopicCreator interface decoupled
from telego, lazy bot resolution via channel manager.
5. Web UI: structured group/topic config form with tri-state booleans
(Inherit/Yes/No), nested collapsible group and topic entries.
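The hierarchical resolution in item 2 and the tri-state booleans in item 5 can be sketched as follows. Settings, boolPtr and resolveLayers are hypothetical names, not the real TelegramGroupConfig/TelegramTopicConfig types; a nil pointer or empty string means "inherit", and later layers override earlier ones.

```go
package main

import "fmt"

// Settings holds one layer of overrides. Zero values mean "inherit".
type Settings struct {
	GroupPolicy    string // "" = inherit
	RequireMention *bool  // nil = inherit
}

// boolPtr returns a pointer to b, for explicit Yes/No values.
func boolPtr(b bool) *bool { return &b }

// resolveLayers merges layers in order, so pass them as
// global, wildcard group "*", specific group, specific topic.
func resolveLayers(layers ...Settings) Settings {
	var out Settings
	for _, l := range layers {
		if l.GroupPolicy != "" {
			out.GroupPolicy = l.GroupPolicy
		}
		if l.RequireMention != nil {
			out.RequireMention = l.RequireMention
		}
	}
	return out
}

func main() {
	global := Settings{GroupPolicy: "open", RequireMention: boolPtr(true)}
	wildcard := Settings{}
	group := Settings{GroupPolicy: "allowlist"}
	topic := Settings{RequireMention: boolPtr(false)}

	got := resolveLayers(global, wildcard, group, topic)
	fmt.Println(got.GroupPolicy, *got.RequireMention)
}
```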
Also fix: forum group pairing reply and approval notification now
correctly set MessageThreadID so messages land in the right topic.
Send() extracts threadID from localKey suffix as fallback for cases
where metadata is absent (e.g. pairing approval via SendToChannel).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix resolvePath for nested non-existent dirs (use resolveThroughExistingAncestors)
- Channel-isolated workspace: user_agent_profiles.workspace stores channel prefix,
used as source of truth with backward compat for existing users
- Loop caches workspace per-user with CacheKindUserWorkspace invalidation via pubsub
- ContractHome/ExpandHome for portable ~-based paths in DB
- create_image saves to workspace/generated/YYYY-MM-DD/ instead of OS temp dir
- SOUL.md template: add ## Expertise section for domain knowledge
- Summoner buildEditPrompt: section guide, complete file output, frontmatter update
- Bus: Topic* constants for Subscribe/Broadcast keys, CacheKind* for payload kinds
- Teams, delegates, sessions, agent links: various enhancements
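A minimal sketch of the ContractHome/ExpandHome idea from the list above. The lowercase helpers and their explicit home parameter are assumptions (the real functions presumably look up the home directory themselves); the stored form always uses "~/" with forward slashes so it stays portable.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// contractHome replaces a leading home directory with "~" for storage.
// Paths outside home are returned unchanged.
func contractHome(p, home string) string {
	if home == "" {
		return p
	}
	home = filepath.Clean(home)
	if p == home {
		return "~"
	}
	prefix := home + string(filepath.Separator)
	if strings.HasPrefix(p, prefix) {
		return "~/" + filepath.ToSlash(strings.TrimPrefix(p, prefix))
	}
	return p
}

// expandHome turns a stored "~"-based path back into an absolute one.
func expandHome(p, home string) string {
	if p == "~" {
		return home
	}
	if strings.HasPrefix(p, "~/") {
		return filepath.Join(home, filepath.FromSlash(p[2:]))
	}
	return p
}

func main() {
	home := "/home/alice" // example value
	stored := contractHome("/home/alice/.goclaw/workspace", home)
	fmt.Println(stored)
	fmt.Println(expandHome(stored, "/root"))
}
```

Expanding against a different home directory, as in main, is what makes the stored value usable across local and Docker deployments.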
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add DashScope (Qwen) native provider with tools+streaming fallback
- Add Bailian Coding provider with hardcoded model list (no /v1/models API)
- Parse reasoning_content in OpenAI-compat streaming/non-streaming responses
- Emit ChatEventThinking events in agent loop for thinking models
- Add vision support for DashScope (qwen3-vl)
- Fix provider form dialog not updating API base URL when switching types
- Update README provider count from 11+ to 13+
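A minimal sketch of reading reasoning_content from a non-streaming response. The field name comes from the list above; its placement under choices[].message follows the usual OpenAI-compatible shape and is an assumption here, as are the type and function names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// chatResponse covers only the fields this sketch needs.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content          string `json:"content"`
			ReasoningContent string `json:"reasoning_content"`
		} `json:"message"`
	} `json:"choices"`
}

// splitReasoning returns the thinking text and the final answer from the
// first choice. thinking is empty for models that do not emit it.
func splitReasoning(body []byte) (thinking, answer string, err error) {
	var resp chatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", "", err
	}
	if len(resp.Choices) == 0 {
		return "", "", errors.New("response has no choices")
	}
	msg := resp.Choices[0].Message
	return msg.ReasoningContent, msg.Content, nil
}

func main() {
	body := []byte(`{"choices":[{"message":{"content":"42","reasoning_content":"6*7"}}]}`)
	thinking, answer, err := splitReasoning(body)
	fmt.Println(thinking, answer, err)
}
```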
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(telegram): add speech-to-text transcription for voice/audio messages with audio-aware routing
Implement an STT pipeline for inbound voice and audio messages in the Telegram channel:
- Add new STT configuration options (stt_proxy_url, stt_api_key, stt_tenant_id, stt_timeout_seconds)
- Add VoiceAgentID config to route voice messages to a dedicated speaking agent
- New stt.go module handles transcription via configurable STT proxy service
- MediaInfo struct now includes Transcript field for storing transcription results
- buildMediaTags embeds transcript in media tags when available for downstream processing
This enables the pipeline to process voice messages with full transcription rather than falling back to text-only handling.
* test(telegram): add tests for buildMediaTags and transcribeAudio functions
Allow overriding the Anthropic API base URL via GOCLAW_ANTHROPIC_BASE_URL
env var, config JSON, or DB provider record. Enables use of Anthropic-
compatible proxies and custom endpoints.
Also adds Makefile shortcuts for docker compose (up/down/logs).
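A minimal sketch of resolving the base URL from the three sources named above. pickAPIBase is a hypothetical helper that returns the first non-empty candidate; the precedence shown (DB record, then env var, then config JSON) and the placeholder URLs are assumptions, since the order is not stated here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// pickAPIBase returns the first non-blank candidate, or fallback if all
// candidates are empty.
func pickAPIBase(fallback string, candidates ...string) string {
	for _, c := range candidates {
		if s := strings.TrimSpace(c); s != "" {
			return s
		}
	}
	return fallback
}

func main() {
	dbValue := ""                                  // api_base from the DB provider record
	envValue := os.Getenv("GOCLAW_ANTHROPIC_BASE_URL")
	configValue := "https://proxy.example.test/v1" // placeholder config JSON value

	// Assumed precedence: DB, then env, then config.
	fmt.Println(pickAPIBase("https://default.example.test", dbValue, envValue, configValue))
}
```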
- Fix seedManagedData() hardcoding openai_compat for all non-Anthropic providers;
now uses resolveProviderType() mapping to correct store constants (gemini_native,
minimax_native, etc.) — fixes model listing in web UI for Gemini
- Rename GOCLAW_FEISHU_* env vars to GOCLAW_LARK_* (Lark is the global brand)
- Add WhatsApp env override (GOCLAW_WHATSAPP_BRIDGE_URL) and auto-enable
- Add missing env vars to docker-compose.yml (Cohere, Perplexity, Lark, Zalo, WhatsApp)
- Update .env.example with all providers/channels, remove unnecessary GOCLAW_PROVIDER
- Add Cohere and Perplexity to prepare-env.sh provider detection
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add env var overlay and auto-enable for Discord channel, matching the
existing Telegram pattern. Update .env.example and docker-compose.yml.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Implement DM pairing flow for Discord and WhatsApp (checkDMPolicy, sendPairingReply with debounce)
- Add "pairing" case to BaseChannel.CheckPolicy() to reject instead of falling through to "open"
- Add group pending history tracking to Discord (record when not mentioned, prepend context when mentioned)
- Resolve Discord display names with priority: server nickname > global name > username
- Fix Gemini collapse format to prevent model from imitating tool call patterns
- Fix formatTokens crash on null/undefined input