166 Commits

Author SHA1 Message Date
viettranx 4e9f155a4c feat(agent): adaptive tool timing with slow tool notification
Track per-tool execution time statistics in session metadata. When a tool
call exceeds its adaptive threshold (2x historical max, min 120s default),
send a direct outbound notification to the user.

- ToolTimingMap: parse/serialize/record/threshold from session metadata
- StartSlowTimer: fires once per tool call, auto-cancels on completion
- Team config: slow_tool toggle (default on, always direct, never leader)
- UI: toggle in team settings with i18n (en/vi/zh)
- Store: add GetSessionMetadata to session store interface
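The threshold rule above (2x historical max, 120s floor) and the one-shot timer could look roughly like the sketch below. The `toolStats` type and both function names are hypothetical; the real ToolTimingMap and StartSlowTimer signatures are not shown in this commit.

```go
package main

import (
	"fmt"
	"time"
)

// minSlowThreshold is the 120s default floor named in the commit message.
const minSlowThreshold = 120 * time.Second

// toolStats is a hypothetical per-tool record; the actual ToolTimingMap
// layout is an assumption.
type toolStats struct {
	Max time.Duration // longest execution observed so far
}

// slowThreshold returns 2x the historical max, never below the floor.
func slowThreshold(s toolStats) time.Duration {
	t := 2 * s.Max
	if t < minSlowThreshold {
		return minSlowThreshold
	}
	return t
}

// startSlowTimer calls notify once if the tool call outlives threshold.
// The caller invokes the returned cancel func when the tool completes.
func startSlowTimer(threshold time.Duration, notify func()) (cancel func()) {
	timer := time.AfterFunc(threshold, notify)
	return func() { timer.Stop() }
}

func main() {
	fmt.Println(slowThreshold(toolStats{Max: 30 * time.Second})) // 2m0s (floor)
	fmt.Println(slowThreshold(toolStats{Max: 5 * time.Minute}))  // 10m0s (2x max)
}
```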
2026-03-19 13:35:57 +07:00
Duc Nguyen 2cc9d68cdc fix(tts): config save, Edge provider, media dispatch + dark mode chat (#265)
* fix(tts): config save + Edge provider registration + dark mode chat bubbles

- Wrap TTS config payload in `raw` field for config.patch RPC (#229)
- Always register Edge TTS provider (free, no API key) instead of gating on `enabled` flag
- Fix low-contrast user message bubbles in dark mode chat

* fix(tts): skip duplicate media dispatch when temp file already delivered

When both the agent loop and the message tool dispatch the same TTS
temp file, the first dispatch succeeds and cleanup deletes it. Filter
out missing temp media files before sending to prevent "file not found"
errors and spurious error notifications on Telegram/Slack/Discord.
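A minimal sketch of the filtering step described above, assuming media is represented as a slice of file paths. `filterExisting` is a hypothetical name, and unlike the real change it checks every path rather than only temp media files.

```go
package main

import (
	"fmt"
	"os"
)

// filterExisting keeps only the paths that still exist on disk, so a file
// already delivered and cleaned up by an earlier dispatch is skipped
// instead of producing a "file not found" error.
func filterExisting(paths []string) []string {
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		if _, err := os.Stat(p); err == nil {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	f, err := os.CreateTemp("", "tts-*.mp3")
	if err != nil {
		panic(err)
	}
	f.Close()
	defer os.Remove(f.Name())

	paths := []string{f.Name(), f.Name() + ".already-deleted"}
	fmt.Println(len(filterExisting(paths))) // 1
}
```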

* feat(tts): include edge-tts in Docker image when Python enabled

Edge TTS is free (no API key) and serves as a universal TTS fallback.
Install it alongside Python in both ENABLE_PYTHON and ENABLE_FULL_SKILLS builds.

* chore(docker): expose build args from .env for compose builds

Pass ENABLE_OTEL, ENABLE_PYTHON, ENABLE_FULL_SKILLS as env-driven
build args so .env can control Docker build features without editing
docker-compose.yml directly.

* fix(tts): hot-reload TTS config on settings change via pub/sub

TTS providers were only registered at startup, so changing provider/API
key via the Web UI had no effect until container restart. Add a
tts-config-reload bus subscriber that rebuilds the TTS manager on
config changes, matching the pattern used by quota, cron, and web_fetch.
Always create a TtsTool at startup (even without providers) so the
reload subscriber can populate it when settings are first configured.

* fix(tts): protect TtsTool.UpdateManager with RWMutex to prevent data race

UpdateManager() can be called from the config reload goroutine while
Execute() reads t.manager concurrently from agent goroutines. Add
sync.RWMutex following the same pattern as WebFetchTool.UpdatePolicy().
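The locking pattern described above can be sketched as follows. The `ttsManager` type and the string return value of `Execute` are placeholders; only `TtsTool`, `UpdateManager`, `Execute`, and the use of `sync.RWMutex` come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// ttsManager stands in for the real TTS manager (assumption).
type ttsManager struct{ provider string }

// TtsTool guards its manager pointer with an RWMutex so a config reload
// can swap it while agent goroutines are executing.
type TtsTool struct {
	mu      sync.RWMutex
	manager *ttsManager
}

// UpdateManager is called from the config reload goroutine.
func (t *TtsTool) UpdateManager(m *ttsManager) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.manager = m
}

// Execute snapshots the manager under a read lock, then works on the copy
// so the lock is not held for the duration of the call.
func (t *TtsTool) Execute() string {
	t.mu.RLock()
	m := t.manager
	t.mu.RUnlock()
	if m == nil {
		return "no TTS provider configured"
	}
	return "speaking via " + m.provider
}

func main() {
	tool := &TtsTool{}
	fmt.Println(tool.Execute())
	tool.UpdateManager(&ttsManager{provider: "edge"})
	fmt.Println(tool.Execute())
}
```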

Also update setupTTS doc comment which incorrectly stated it could
return nil — Edge TTS is now always registered.

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-19 08:21:06 +07:00
viettranx 5b349db7eb feat(heartbeat): provider/model override + fix cache invalidation
- Add ProviderModelSelect to heartbeat config dialog (allowEmpty, verify button)
- Backend: accept providerName in HEARTBEAT.SET, resolve to UUID via GetProviderByName
- Add ModelOverride to RunRequest, used by Loop when set (cheaper model for heartbeat)
- Ticker passes heartbeat model override to agent RunRequest
- Fix: InvalidateCache after UpdateState so ListDue picks up new next_run_at immediately
- i18n: add sectionModel/modelHint keys (en/vi/zh)
2026-03-18 23:02:48 +07:00
Duc Nguyen dc51018563 fix: subagent provider routing + api_base fallback (#262)
* fix(subagent): inherit parent agent's provider instead of alphabetical fallback

Subagents previously used a fixed provider (alphabetically first from the
registry, often "anthropic") regardless of which provider the parent agent
used. This caused invalid combos like anthropic/glm-5 when a zai-coding
agent spawned subagents.

- Pass provider registry to SubagentManager for runtime resolution
- Inject parent provider name into context (WithParentProvider)
- Resolve activeProvider from parent context before LLM call
- Fix trace spans to show actual resolved provider, not default

* fix(providers): api_base fallback from config/env for DB providers

DB providers with empty api_base now inherit from config/env vars
(e.g., GOCLAW_ANTHROPIC_BASE_URL). Prevents proxy API keys from being
sent to the real provider API endpoint.

- Add APIBaseForType() method on ProvidersConfig
- registerProvidersFromDB falls back to config when api_base is empty
- ProvidersHandler uses resolveAPIBase() for model listing
- Add api_base, display_name, settings to provider validation whitelist

* fix(tracing): pass resolved provider name to subagent span emitters

- emitSubagentSpanStart now accepts providerName param instead of
  reading sm.provider.Name() — ensures root subagent span reflects
  the inherited parent provider, not the fallback default
- registerInMemory now uses resolveAPIBase() so DB providers with
  empty api_base inherit the config/env fallback (same as startup path)

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-18 22:40:49 +07:00
viettranx 96cfd1bf08 feat(heartbeat): improve prompting, suppression, delivery targets and session cleanup
- Rewrite heartbeat prompt to instruct agent to EXECUTE checklist tasks, not echo them
- Simplify suppression: HEARTBEAT_OK present = always suppress, absent = always deliver
- Add delivery targets RPC (heartbeat.targets) for channel/chatId picker
- Sanitize backend errors — never expose raw SQL to client
- Add session cleanup for isolated heartbeat sessions after run
- Cap StaggerOffset at 10% of interval to avoid user-visible delay
- Fix Upsert to persist next_run_at correctly
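Two of the rules above are simple enough to sketch: the simplified suppression check and the 10% stagger cap. Function names are hypothetical, and whether the real check is a substring match is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

const heartbeatOK = "HEARTBEAT_OK"

// shouldSuppress reports whether a heartbeat reply should be dropped:
// token present means suppress, token absent means deliver.
func shouldSuppress(reply string) bool {
	return strings.Contains(reply, heartbeatOK)
}

// capStagger limits a stagger offset to 10% of the heartbeat interval.
func capStagger(offset, interval time.Duration) time.Duration {
	limit := interval / 10
	if offset > limit {
		return limit
	}
	return offset
}

func main() {
	fmt.Println(shouldSuppress("All clear. HEARTBEAT_OK")) // true
	fmt.Println(capStagger(20*time.Minute, time.Hour))     // 6m0s
}
```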
2026-03-18 16:37:36 +07:00
viettranx 29816db0ab feat(heartbeat): cron wakeMode, queue-aware scheduling, lightContext
- CronPayload.WakeHeartbeat triggers heartbeat immediately after cron job completes
- Cron tool supports wake_heartbeat param on add/update actions
- Scheduler.HasActiveSessionsForAgent() detects busy agents for heartbeat skip
- RunRequest.LightContext skips loading context files during heartbeat runs
2026-03-18 13:11:58 +07:00
viettranx 08a2d95c0c feat: agent heartbeat system — periodic proactive check-ins (#245)
Phase 1 (Core):
- Migration 000022: agent_heartbeats, heartbeat_run_logs, agent_config_permissions tables
- HeartbeatStore + ConfigPermissionStore interfaces with PG implementations
- HeartbeatTicker: background poll → active hours filter → queue-aware skip → run → smart suppression → deliver/log
- Heartbeat tool: status/get/set/toggle/set_checklist/get_checklist/test/logs actions
- Permission check with wildcard scope matching + TTL cache (60s)
- RPC methods: heartbeat.get/set/toggle/test/logs/checklist.get/checklist.set
- HEARTBEAT.md routed via context file interceptor (read/write for both open + predefined agents)
- Session keys: agent:{id}:heartbeat or agent:{id}:heartbeat:{ts} (isolated)
- PromptMinimal for heartbeat sessions (like cron/subagent)
- Event broadcasting + cache invalidation via bus (heartbeat + config_perms)
- Gateway wiring: ticker init, event wiring, graceful shutdown

Phase 2 (Integration):
- wakeMode: CronPayload.WakeHeartbeat triggers heartbeat after cron job completes
- Queue-aware: Scheduler.HasActiveSessionsForAgent() skips busy agents
- Stagger: deterministic FNV offset spreads heartbeats across interval
- lightContext: RunRequest.LightContext skips context files, only injects checklist
- System prompt distinguishes cron (user-scheduled tasks) vs heartbeat (autonomous monitoring)
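The deterministic FNV stagger from Phase 2 might be computed along these lines. This is a sketch only: the hash variant (FNV-1a, 64-bit), the key being hashed (an agent ID string), and the function name are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// staggerOffset maps an agent ID to a stable offset in [0, interval), so
// heartbeats for different agents spread across the interval and the same
// agent always lands on the same offset.
func staggerOffset(agentID string, interval time.Duration) time.Duration {
	if interval <= 0 {
		return 0
	}
	h := fnv.New64a()
	h.Write([]byte(agentID)) // hash.Hash.Write never returns an error
	return time.Duration(h.Sum64() % uint64(interval))
}

func main() {
	for _, id := range []string{"agent-a", "agent-b", "agent-c"} {
		fmt.Println(id, staggerOffset(id, 30*time.Minute))
	}
}
```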
2026-03-18 13:11:44 +07:00
viettranx 5a4c72018a fix(teams): include agent display name in task create, list, and announce
- Add display_name to task create response (assignee name)
- Add owner_display_name and created_by_display_name to list/get items
- Pass to_agent_display via dispatch metadata (zero extra DB queries)
- Use display name in announce messages to leader for correct attribution
2026-03-18 11:04:56 +07:00
viettranx 49441f7305 refactor: remove dead delegate code, rename lane/channel to team/teammate
- Remove handleDelegateAnnounce() dead code (no sender emits delegate:* messages)
- Remove delegate tool reference from intent_classify.go
- Rename LaneDelegate → LaneTeam with backward-compat env var fallback
- Rename ChannelDelegate → ChannelTeammate across all team tool files
- Comment out lifecycle guards in team_tasks_lifecycle.go (TODO: reviewer workflow)
- Update string literals in cron.go, task_ticker.go
- Gate tool_status placeholder_update to non-streaming runs only
- Skip FinalizeStream on tool.call to prevent mid-run content loss
2026-03-18 11:04:45 +07:00
viettranx 843b550651 feat: runtime packages UI, pkg-helper, configurable shell deny groups (#244)
Runtime package management with security hardening:

- pkg-helper: root-privileged daemon for apk install/uninstall via Unix socket
- HTTP API: /v1/packages (list/install/uninstall/runtimes), admin role required for writes
- Shell deny groups: 15 configurable groups (per-agent overrides via context)
- Packages UI: Web page for managing system/pip/npm packages with confirmation dialogs
- Docker: privilege separation (root entrypoint → su-exec drop), init for zombie reaping
- Security: umask socket creation, persist file validation, deny pattern hardening
  (Node.js fetch/http, Python from/import, curl localhost, sensitive env vars)
- Auth: empty gateway token → admin role (dev/single-user mode)
2026-03-17 19:50:26 +07:00
viettranx b231878a85 feat(teams): add limit param to ListTasks + lightweight get-light endpoint
- Add limit parameter to ListTasks interface (dashboard=200, agent=30)
- Add teams.tasks.get-light WS method returning task only (no comments/events)
- Truncate dashboard response to exact limit (fix off-by-one from limit+1)
- Update all 7 ListTasks callers with explicit limit values
2026-03-17 18:03:10 +07:00
viettranx b735c16d93 feat(teams): split dispatched/assigned events + add completed notifications
- Change agent-side broadcasts from EventTeamTaskAssigned to
  EventTeamTaskDispatched (post-turn, fallback, unblock, retry)
- Add completed notification with leader-completion skip logic
- Add Completed field to TeamNotifyConfig with *bool backwards compat
- Differentiate dispatched messages: unblocked vs regular dispatch
- Add EventTeamTaskDispatched to audit event mapper
2026-03-17 18:02:54 +07:00
viettranx aeadb20ba7 fix(teams): deduplicate task notifications and batch with debounce
Remove premature EventTeamTaskAssigned broadcast in executeCreate() that
caused duplicate "assigned to" Telegram notifications. Assignment
notification now only fires at actual dispatch (post-turn, fallback, or
unblocked).

Add TeamNotifyQueue (2s debounce, cap 20) to batch rapid-fire task
notifications per chat — reduces N messages to 1 when leader dispatches
multiple tasks at once. In leader mode this also reduces agent turns
from N to 1.
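A rough sketch of a per-chat batching queue with a debounce window and a size cap, as described above. Everything here is illustrative: the type and method names are hypothetical, and the sketch uses a fixed window started by the first message, which may differ from how TeamNotifyQueue resets its timer.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// notifyQueue batches notifications per chat and delivers each batch once.
type notifyQueue struct {
	mu       sync.Mutex
	debounce time.Duration
	maxBatch int
	pending  map[string][]string
	timers   map[string]*time.Timer
	deliver  func(chatID string, batch []string)
}

func newNotifyQueue(debounce time.Duration, maxBatch int, deliver func(string, []string)) *notifyQueue {
	return &notifyQueue{
		debounce: debounce,
		maxBatch: maxBatch,
		pending:  make(map[string][]string),
		timers:   make(map[string]*time.Timer),
		deliver:  deliver,
	}
}

// Enqueue adds msg to the chat's batch. The first message starts the
// debounce timer; reaching maxBatch flushes immediately.
func (q *notifyQueue) Enqueue(chatID, msg string) {
	q.mu.Lock()
	q.pending[chatID] = append(q.pending[chatID], msg)
	full := len(q.pending[chatID]) >= q.maxBatch
	if _, ok := q.timers[chatID]; !ok && !full {
		q.timers[chatID] = time.AfterFunc(q.debounce, func() { q.Flush(chatID) })
	}
	q.mu.Unlock()
	if full {
		q.Flush(chatID)
	}
}

// Flush delivers and clears the chat's pending batch, if there is one.
func (q *notifyQueue) Flush(chatID string) {
	q.mu.Lock()
	batch := q.pending[chatID]
	delete(q.pending, chatID)
	if t, ok := q.timers[chatID]; ok {
		t.Stop()
		delete(q.timers, chatID)
	}
	q.mu.Unlock()
	if len(batch) > 0 {
		q.deliver(chatID, batch) // called outside the lock
	}
}

func main() {
	q := newNotifyQueue(50*time.Millisecond, 20, func(chatID string, batch []string) {
		fmt.Printf("%s: %d notifications in one message\n", chatID, len(batch))
	})
	for i := 1; i <= 3; i++ {
		q.Enqueue("chat-1", fmt.Sprintf("task %d dispatched", i))
	}
	time.Sleep(100 * time.Millisecond) // let the debounce window elapse
}
```

The commit's values would correspond to `newNotifyQueue(2*time.Second, 20, ...)`.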

Also fix: ResetTaskStatus now clears progress_percent/progress_step on
retry, and retry broadcast includes TaskNumber/Subject for correct
notification formatting.
2026-03-17 14:29:52 +07:00
viettranx 97cacfe68b feat(teams): member task progress reminder + fix broken progress notifications
- Fix progress event payload missing TaskNumber, Subject, OwnerAgentKey,
  ProgressPercent, ProgressStep — notifications were rendering empty
- Fix progress notification format to include task name (consistent with
  dispatched/failed) and guard empty ProgressStep
- Change percent tool schema from number to integer for clarity
- Add pre-run member task reminder injecting task context before LLM loop
- Add mid-loop progress nudge every 10 iterations with suggested percent
  based on iteration ratio (handles maxIter=0 unlimited case)
- Enhance leader cross-session reminder to show progress % when available
- Strengthen TEAM.md member guidance: focus, result quality, progress rules
- Add progress bar to task list table view (matches kanban card pattern)
2026-03-17 12:43:09 +07:00
viettranx d205691a13 fix(skills): hide skill_manage from LLM when skill_evolve is off
- skill_manage builtin tool default Enabled: true (available in registry)
- When skill_evolve=false: filter skill_manage from both tool definitions
  (API params) and system prompt tooling section — agent has zero awareness
- When skill_evolve=true: tool visible + system prompt guidance + nudges
- Update UI hints to reflect tool is available by default
2026-03-17 12:05:48 +07:00
viettranx b2a74ba487 feat(skills): skill_manage tool + skill_evolve learning loop (#218)
Adds skill_manage — a first-class agent tool for creating, updating, and
deleting skills from within a conversation — paired with per-agent
skill_evolve that nudges predefined agents to capture reusable workflows.

Tool (skill_manage):
- create: write skill from SKILL.md content string (auto-grant, dep scan)
- patch: find/replace producing new immutable version (advisory-locked)
- delete: soft-delete (archive in DB, move to .trash/)
- Security guard: 25 regex patterns block shell injection, credential exfil,
  path traversal, SQL injection, privilege escalation
- Ownership enforced: only skill owner can patch/delete (admin bypass)
- Content size limit: 100KB; companion file copy: 20MB, symlink-safe (WalkDir)
- Enabled: false by default — admin opt-in per agent

Learning loop (skill_evolve, predefined agents only):
- System prompt: SHOULD/SHOULD NOT guidance for skill creation
- Budget nudges: [System] prefix at 70%/90% iteration budget (ephemeral, i18n)
- Postscript: once-per-run suggestion with explicit user consent
- Config: other_config.skill_evolve + skill_nudge_interval (default 15)

Security hardening (pre-existing + new):
- CreateSkillManaged: RETURNING id + pg_advisory_xact_lock (atomic upsert)
- GetNextVersionLocked: advisory lock for race-safe patch versioning
- Ownership checks on HTTP update/delete, gateway update, 4 grant/revoke handlers
- copyOtherFiles: filepath.WalkDir for real symlink detection

UI: Skill Learning toggle + nudge interval in Agent General Tab
i18n: backend (en/vi/zh catalogs) + frontend (en/vi/zh locale files)
2026-03-17 11:38:35 +07:00
viettranx baa4bb6d45 refactor(teams): batch teammate announce via queue with dedup
Extract announce logic from handleTeammateMessage into a dedicated
announce queue that batches concurrent task completions. Handles
failure reporting and deduplicates rapid-fire announces.
2026-03-17 09:26:00 +07:00
badgerbees 365f41f81c fix: pass custom name to DashScopeProvider for correct registry lookup (#228)
2026-03-16 22:54:10 +07:00
viettranx eee79d111e feat(teams): granular progress notifications with direct/leader mode
- Replace progress_notifications toggle with granular config:
  dispatched (on), progress (on), failed (on) + delivery mode
- Direct mode: outbound to channel, no AI processing
- Leader mode: inject into leader session with NO-ACTION instructions
- Add consumer.team-notify subscriber for event forwarding
- Enrich TeamTaskEventPayload with TaskNumber, ProgressPercent/Step
- Add auto-status system prompt section
- UI: card-select for delivery mode (Zap/Bot icons), 3 toggles
2026-03-16 22:46:51 +07:00
viettranx 3f2b6e258e chore(teams): remove deprecated delegation tools
Remove delegate_search, evaluate_loop, handoff from:
- Seed data, system prompt, i18n keys/catalogs, channel events
- Consumer handler (handleHandoffAnnounce), handoff route lookup
- HandoffRouteData struct + PG implementation
- Protocol events, MCP bridge comment
- Web UI locale files (en/vi/zh)
2026-03-16 22:46:18 +07:00
viettranx b0bd4d6198 fix(pairing): fix browser approval stuck + security hardening
Squash-merge PR #225 with security fixes:

- Fix browser pairing stuck on "Waiting for approval" (stale closure:
  useState → useRef for senderID in pairing-form)
- Fix auto-kick after pairing (RequireAuth now accepts senderID,
  onAuthFailure skips logout for paired browser sessions)
- Allow browser-paired users to access HTTP APIs via X-GoClaw-Sender-Id
  header with fail-closed IsPaired check
- Remove ad-hoc IsInternalOrBrowser(), use channels.IsInternalChannel()
- Log failed HTTP pairing auth attempts for security monitoring
- Pass senderID to HttpClient for authenticated HTTP requests
2026-03-16 20:09:44 +07:00
viettranx 96898a3daa fix(teams): status filter default=all, reduce page size to 30, update WorkspaceDir callers
- status="" now returns all tasks (was active-only); add explicit "active" filter
- Reduce list/search page size from 50 to 30
- Update WorkspaceDir callers after signature change (remove unused channel param)
- Update team_tasks schema descriptions for status and page params
2026-03-16 20:06:15 +07:00
viettranx 27fb900510 refactor(tools): remove workspace_read/workspace_write, use file tools for team workspace
Remove dedicated workspace tools in favor of making existing file tools
(read_file, write_file, list_files, edit) team-workspace-aware.

- Delete workspace_tool_read.go and workspace_tool_write.go
- Clean up workspace_dir.go: export WorkspaceDir, remove dead code
  (workspaceRelPath, sanitizeFilePath, inferMimeType, templates, etc.)
- Remove workspace tool registration from gateway_managed.go
- Remove workspace tool references from policy, subagent, MCP bridge
- Add PathAllowable/PathDenyable to types.go for interface abstraction
2026-03-16 20:05:26 +07:00
viettranx 0b5124a8f1 fix(security): harden pairing auth — fail-closed, rate-limit, TTL
- Change WS pairing check from fail-open to fail-closed on DB error
  (router.go: previously granted RoleOperator on any IsPaired() error)
- Add "browser" to InternalChannels so it's properly excluded from
  outbound dispatch without ad-hoc helpers
- Rate-limit browser.pairing.status endpoint to prevent sender_id
  enumeration (reuses server RateLimiter via PairingMethods injection)
- Add expires_at column to paired_devices with 30-day TTL for
  defense-in-depth; IsPaired() now checks expiry, ListPaired() prunes
- Add confidence_score column to team_tasks, team_messages,
  team_task_comments
- Bump RequiredSchemaVersion to 21
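The fail-open to fail-closed change in the first bullet amounts to treating a lookup error the same as "not paired". A minimal sketch, with a hypothetical `roleForSender` helper and role type standing in for the real router.go logic:

```go
package main

import (
	"errors"
	"fmt"
)

type role string

const (
	roleNone     role = ""
	roleOperator role = "operator"
)

// errStoreDown simulates a database failure during the pairing lookup.
var errStoreDown = errors.New("paired_devices lookup failed")

// pairingChecker mirrors an IsPaired-style lookup (signature assumed).
type pairingChecker func(senderID string) (bool, error)

// roleForSender grants the operator role only when the lookup succeeds
// and confirms pairing. An error denies access (fail-closed).
func roleForSender(isPaired pairingChecker, senderID string) role {
	paired, err := isPaired(senderID)
	if err != nil || !paired {
		return roleNone
	}
	return roleOperator
}

func main() {
	broken := func(string) (bool, error) { return false, errStoreDown }
	healthy := func(string) (bool, error) { return true, nil }
	fmt.Printf("%q\n", roleForSender(broken, "browser-123"))  // ""
	fmt.Printf("%q\n", roleForSender(healthy, "browser-123")) // "operator"
}
```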
2026-03-16 19:55:08 +07:00
viettranx 8d6729e959 feat(teams): improve task dispatch, concurrency, and tool ergonomics
- Move task dispatch from mid-turn to post-turn to prevent dependent
  tasks from completing before the current agent's run finishes
- Add team create lock to serialize list→create flows across concurrent
  group chat sessions, preventing duplicate task creation
- Require list-before-create gate: agents must call team_tasks(list)
  before creating tasks
- Make assignee required on task creation
- Add pagination (50 per page) to task list with offset support
- Slim task list/get/search responses with dedicated structs to reduce
  context token usage
- Add task board snapshot in announce messages to leader
- Workspace: allow subdirectory paths in read/delete, show directories
  in list output
- UI: reduce kanban card title font size for better visual balance
2026-03-16 15:26:25 +07:00
viettranx 0dc3124607 fix(teams): propagate peer_kind and local_key through task dispatch chain
Team task announce was writing to wrong session (direct instead of group)
because origin_peer_kind was hardcoded as "direct" in dispatch metadata.
This caused leaders to miss completed task results in group conversations.

- Store peer_kind and local_key in task metadata at creation time
- Resolve peer_kind from context → metadata → "direct" fallback in all
  dispatch paths (tool, gateway, unblocked)
- Use actual origPeerKind in announce handler session key + request
- Add origin_local_key to gateway dispatch for forum topic routing
- Clarify ask_user guidance: bot must present question directly
- Guide members to use team_tasks progress instead of team_message
- Improve error message when non-owner calls progress action
2026-03-16 09:01:13 +07:00
viettranx 0857321a6b fix(providers): correct Anthropic prompt caching + add datetime tool
- Move cache_control from request root (ignored by API) to per-block
  placement on last system block and last tool definition
- Change system prompt time format to date-only for better cache stability
- Add builtin datetime tool for precise timestamps (cron, memory, etc.)
- Add atMs past-time validation in cron handleUpdate (was only in handleAdd)
- Update cron description to guide model to use datetime tool first
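For the cache_control fix in the first bullet, per-block placement in an Anthropic Messages request might be built as in the sketch below. The struct and function names are hypothetical and the request is trimmed to the two relevant fields. The `{"type": "ephemeral"}` marker is the form documented by Anthropic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type cacheControl struct {
	Type string `json:"type"`
}

type systemBlock struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *cacheControl `json:"cache_control,omitempty"`
}

type toolDef struct {
	Name         string         `json:"name"`
	Description  string         `json:"description"`
	InputSchema  map[string]any `json:"input_schema"`
	CacheControl *cacheControl  `json:"cache_control,omitempty"`
}

// markCacheBreakpoints puts cache_control on the last system block and the
// last tool definition instead of on the request root.
func markCacheBreakpoints(system []systemBlock, tools []toolDef) {
	if n := len(system); n > 0 {
		system[n-1].CacheControl = &cacheControl{Type: "ephemeral"}
	}
	if n := len(tools); n > 0 {
		tools[n-1].CacheControl = &cacheControl{Type: "ephemeral"}
	}
}

func main() {
	system := []systemBlock{
		{Type: "text", Text: "You are a helpful agent."},
		{Type: "text", Text: "Today is 2026-03-16."}, // date-only for cache stability
	}
	tools := []toolDef{
		{Name: "datetime", Description: "Returns the current time.", InputSchema: map[string]any{"type": "object"}},
	}
	markCacheBreakpoints(system, tools)
	out, _ := json.MarshalIndent(map[string]any{"system": system, "tools": tools}, "", "  ")
	fmt.Println(string(out))
}
```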
2026-03-16 08:14:03 +07:00
viettranx e138ac7676 fix(teams): validate blocked_by terminal state + improve leader orchestration prompt
- Add terminal-state check in executeCreate(): reject blocked_by
  referencing completed/cancelled/failed tasks with actionable error
- Add full validation in executeUpdate(): batch query via GetTasksByIDs,
  check existence + team membership + terminal state
- Add GetTasksByIDs batch query to TeamStore interface + pg implementation
- Refactor: modularize gateway, skills store, and team tools into
  focused files
- Update TEAM.md leader prompt: prefer delegation, plan full task graph
  upfront, create tasks in order with blocked_by UUIDs
2026-03-15 23:16:16 +07:00
viettranx 84b1b07634 refactor(config): centralize hardcoded ~/.goclaw paths via config resolution
Replace all hardcoded ~/.goclaw path constructions with configurable
sources (cfg.ResolvedDataDir() for service dirs, cfg.Agents.Defaults.Workspace
for agent workspaces). This fixes data persistence issues in Docker
deployments where paths differ from local dev.

- Add DataDir field to Config with ResolvedDataDir() resolver
- Add ResolvedDataDirFromEnv() package-level helper for packages without Config
- Populate StoreConfig.SkillsStorageDir (was never set, caused hardcoded fallback)
- Agent workspaces now use subdirectory format (workspace/{key}) for volume compatibility
- Remove dead GOCLAW_SESSIONS_STORAGE env/config (sessions moved to PostgreSQL)
- Fix deploy-stg.sh trailing space after backslash + remove deprecated GOCLAW_MODE
- Add GOCLAW_SKILLS_DIR override in docker-compose for volume persistence
2026-03-15 21:20:46 +07:00
Goon 75c570e951 feat(security): credentialed exec + HTTP RBAC + API key cache (#197)
- Secure CLI credential injection via AES-256-GCM encrypted env vars
- API key management with fine-grained RBAC scopes
- resolveAuth/requireAuth middleware across all 25+ HTTP handlers
- In-memory API key cache with TTL, negative caching, pubsub invalidation
- Sandbox-first execution (fails if unavailable, no silent fallback)
- Credential scrubbing, constant-time token comparison, Admin-only CLI creds
- SQL migration 000020: secure_cli_binaries + api_keys tables
- 14 unit tests for cache and RBAC with race detector
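The constant-time token comparison mentioned above is typically done with `crypto/subtle`. A sketch under assumptions: `tokensEqual` is a hypothetical name, and hashing both sides first (so the comparison does not depend on token length) is a choice made for this example, not something the commit states.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// tokensEqual compares two tokens without leaking, through timing, how
// many leading bytes match. Both inputs are hashed to a fixed length
// because ConstantTimeCompare returns early when lengths differ.
func tokensEqual(presented, expected string) bool {
	a := sha256.Sum256([]byte(presented))
	b := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(a[:], b[:]) == 1
}

func main() {
	fmt.Println(tokensEqual("s3cret-token", "s3cret-token")) // true
	fmt.Println(tokensEqual("s3cret-tokem", "s3cret-token")) // false
}
```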

Closes #197
2026-03-15 20:13:18 +07:00
viettranx f236d721a9 refactor(teams): redesign team detail page with kanban board layout
- Add board/ components: kanban board, columns, cards, toolbar, dialogs
- Add Zustand board store for task state management
- Simplify task-detail-dialog and task-list components
- Refactor team-detail-page to board-based layout
- Update team-version-modal with improved UI
- Clean up team-settings-tab
- Add i18n strings for board UI (en/vi/zh)
- Update workspace path resolution in gateway and teams_workspace
2026-03-15 17:28:54 +07:00
Viet Tran 9a9744077e refactor(teams): v2 system cleanup — remove legacy tools, fix followup, add events API (#210)
Major refactoring of the team system with multiple improvements:

## Removed legacy delegation tools
- Delete `delegate.go`, `delegate_async.go`, `delegate_sync.go`, `delegate_events.go`,
  `delegate_policy.go`, `delegate_prep.go`, `delegate_state.go`, `delegate_search_tool.go`
- Delete `evaluate_loop_tool.go`, `handoff_tool.go`
- Remove all references and registrations from tool manager and policy
- Clean up TEAM_PLAYBOOK_IDEAS.md and TEAM_SYSTEM.md (moved to docs)

## Rename await_reply → ask_user
- Rename action `await_reply` → `ask_user`, `clear_followup` → `clear_ask_user`
- Rename functions `executeAwaitReply` → `executeAskUser`, `executeClearFollowup` → `executeClearAskUser`
- Update system prompt with stronger wording to prevent model misuse
- Model was confusing "await_reply" with general waiting; "ask_user" is unambiguous

## Fix auto-followup false positives
- Add `HasActiveMemberTasks(ctx, teamID, excludeAgentID)` store method
- Guard `autoSetFollowup()` in consumer: skip when lead has active member tasks
- Prevents auto-followup when lead is orchestrating teammates (not waiting for user)

## Task identifier zero-padding
- Change format from `T-1-xxxx` → `T-001-xxxx` (3-digit minimum)
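
A minimal sketch of the padded format, assuming the identifier is built from a task number plus an opaque suffix (the meaning of `xxxx` is not specified here, and `taskIdentifier` is a hypothetical name):

```go
package main

import "fmt"

// taskIdentifier pads the task number to at least three digits.
// Larger numbers keep all their digits.
func taskIdentifier(number int, suffix string) string {
	return fmt.Sprintf("T-%03d-%s", number, suffix)
}

func main() {
	fmt.Println(taskIdentifier(1, "a1b2"))    // T-001-a1b2
	fmt.Println(taskIdentifier(1234, "a1b2")) // T-1234-a1b2
}
```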

## Refactor workspace WS handlers to filesystem-only
- Rewrite `teams.workspace.list/read/delete` to use pure filesystem (os.ReadDir/ReadFile/Remove)
- Remove DB dependency from workspace WS handlers
- Consistent with storage handler and workspace tools
- Simplify TeamWorkspaceFile type and frontend hook

## Add team events listing API
- New WS method `teams.events.list` with team_id, limit, offset params
- New HTTP endpoint `GET /v1/teams/{id}/events` with bearer auth
- New `ListTeamEvents(ctx, teamID, limit, offset)` store method
- JOIN with team_tasks for team-wide event filtering

## Extract team access policy
- New `team_access_policy.go` — centralized team tool access control

## Migration 000019: team_id columns
- Add team_id foreign key columns to relevant tables

## Other improvements
- Add team_id propagation through agent loop, tracing, sessions
- Update i18n locale files (en/vi/zh) for new tool labels
- Update frontend builtin-tools page and require-setup component
- Bump RequiredSchemaVersion for migration 000019
2026-03-15 14:53:19 +07:00
Goon 5e2fa395c7 feat(providers): add ACP provider for external coding agents (#190)
* feat(providers): add ACP provider for orchestrating external coding agents (#189)

Implement native Go ACP (Agent Client Protocol) client as a new Provider.
Enables GoClaw to orchestrate any ACP-compatible agent (Claude Code, Codex
CLI, Gemini CLI) as a subprocess via JSON-RPC 2.0 over stdio.

- Add bidirectional JSON-RPC 2.0 transport over stdio pipes
- Add subprocess process pool with idle TTL reaping and crash recovery
- Add ACP session lifecycle (initialize, session/new, session/prompt)
- Add tool bridge for agent-initiated fs/terminal/permission requests
- Add workspace sandboxing, shell deny patterns, and env var filtering
- Wire config-based and DB-based provider registration paths
- Export DefaultDenyPatterns from tools package for reuse

* feat(providers): add changelog entry for ACP provider integration

* fix(tools): prevent workspace traversal bypass via /tmp/ fallback in resolveMediaPath

Reject paths containing ".." in the isInTempDir fallback to prevent
workspace escape where traversal path still resolves inside /tmp/.
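
A sketch of the guard described above, assuming Unix-style paths. `allowTempFallback` is a hypothetical stand-in for the isInTempDir fallback inside resolveMediaPath. The substring check also rejects harmless names such as `a..b`, which is acceptable for a fallback path.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// allowTempFallback accepts a path only if it contains no ".." and its
// cleaned form lives under /tmp/.
func allowTempFallback(p string) bool {
	if strings.Contains(p, "..") {
		return false
	}
	return strings.HasPrefix(path.Clean(p), "/tmp/")
}

func main() {
	fmt.Println(allowTempFallback("/tmp/voice.mp3"))           // true
	fmt.Println(allowTempFallback("/tmp/ws/../other/secret"))  // false
	fmt.Println(allowTempFallback("/etc/passwd"))              // false
}
```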

* fix(tools): block workspace-sibling paths in resolveMediaPath /tmp/ fallback

When workspace is inside /tmp/, traversal paths like workspace/../X
resolve to /tmp/ siblings that pass isInTempDir. Reject paths inside
the workspace parent directory to prevent this escape.

* feat(providers): add ACP provider web UI and live reload via pubsub

Web UI for creating/editing ACP providers with dedicated form fields
(binary, args, idle TTL, permission mode, work directory). ACP providers
now update immediately without gateway restart via cache invalidation
pubsub pattern.

Frontend:
- New ACPSection form component with i18n (en/vi/zh)
- Provider form dialog integration with ACP state management
- ACP type badge on providers list page
- Settings field added to provider TypeScript types

Backend:
- ACP models handler (claude/codex/gemini) without API key requirement
- Binary path validation + LookPath verification in verify handler
- Provider CRUD emits cache.invalidate events via msgBus
- Subscriber in gateway_managed.go re-registers ACP providers from DB
- ACP core improvements from code review (helpers, jsonrpc, process,
  terminal, tool_bridge)

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-14 16:16:08 +07:00
viettranx 8ad425f5f8 fix: use resolved workspace dir for StorageHandler in Docker deployments
StorageHandler was hardcoded to browse ~/.goclaw/, which is empty in
Docker where volumes mount to separate paths (GOCLAW_WORKSPACE).
Use the already-resolved workspace variable so the Storage page
correctly shows workspace files regardless of deployment layout.
2026-03-14 14:21:59 +07:00
Viet Tran 1a42dc93a6 feat(teams): team system v2 with bug fixes, workspace scope, versioning, and prompt optimization (#183)
* feat(workspace): add team shared workspace for file collaboration

- Add workspace_write and workspace_read tools for agents to share files across team members
- Create team_workspaces DB table with migration 000017 (file metadata, pinning, tags)
- Implement PostgreSQL store layer for workspace CRUD operations
- Add RPC handlers for workspace list/read/delete from web UI
- Build React workspace tab with file listing, content preview, and delete
- Propagate workspace channel/chatID scope through delegation chain
- Auto-allow workspace tools in agent tool policy when agent belongs to a team
- Inject team workspace guidance into system prompt for team agents
- Add /reset command handler for clearing session history
- Harden MCP bridge context middleware to reject headers when no gateway token
- Add i18n strings for workspace UI in en/vi/zh locales

* feat(teams): add comprehensive task management with followup reminders and recovery

- Add task followup/reminder system with auto-set on lead agent reply and auto-clear when user responds on channel
- Add task recovery ticker to re-dispatch stale/pending tasks periodically
- Add task scopes, filtering by status/channel/chatID, and task events
- Add WS RPC handlers for task CRUD, assignments, comments, events, and bulk operations (teams_tasks.go)
- Add task detail dialog, settings UI for followup config, and scope filtering in web dashboard
- Add migrations 000018 (team_tasks_v2) and 000019 (task_followup)
- Extend team_tasks_tool with await_reply, clear_followup actions
- Auto-complete/fail team tasks when delegate agent finishes
- Add workspace file listing and team tool manager enhancements

* docs(teams): add team system architecture and playbook ideas documentation

- Add TEAM_SYSTEM.md with full architecture design covering task management, shared workspace, and delegation engine subsystems
- Add TEAM_PLAYBOOK_IDEAS.md outlining future team coordination layers (playbook, member capabilities, auto-learned patterns)
- Document data models, status flows, tool actions, followup reminder system, task ticker, execution locking, and workspace scope model

* fix(teams): resolve 6 critical bugs in team task system

- Fix unblock SQL: check array_length after array_remove (not before)
- Enforce single-team leadership in team creation
- Add requireLead() for approve/reject tool actions
- Validate cross-team dependency references in blocked_by
- Add team_id to handoff route for multi-team isolation
- Set blocked_by DEFAULT '{}' to prevent NULL array issues

* refactor(workspace): use stable userID as scope key instead of connection UUID

Workspace scope changed from (team_id, channel, chat_id) to (team_id, userID).
Fixes workspace fragmentation across WS tab refreshes and reconnections.

* feat(teams): add V1/V2 versioning with feature gating and optimized prompts

- IsTeamV2() helper gates advanced features (locking, followup, review, audit)
- V2 tool actions rejected for V1 teams with clear error message
- Ticker, gateway consumer, delegation hooks respect version flag
- TEAM.md renders v1/v2 sections conditionally
- Tool descriptions and params optimized (~38% token reduction)
- UI: version toggle in settings, V2 Beta badge, conditional rendering
- i18n: version modal keys for en/vi/zh

* fix(migration): use VARCHAR(255) for user ID columns and add metadata JSONB

- assignee_user_id, user_id, actor_id: TEXT → VARCHAR(255)
- Add metadata JSONB to team_task_comments and team_task_attachments

---------

Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
2026-03-13 22:41:32 +07:00
viettranx ddd4565380 fix(tools): add negative guidance for message tool and disable handoff by default
Message tool prompting now explicitly tells the LLM not to use it for
replying to the user — prevents false activations on phrases like
"gởi lại cho tôi" (send it back to me).

Handoff tool disabled by default since it's rarely used and causes
confusion with spawn. Admins can re-enable via DB/UI if needed.
2026-03-13 16:15:33 +07:00
viettranx 7f4f4a238e feat(memory): inject KG hint into memory_search results and improve KG tool prompting
- Add hasKG flag to MemorySearchTool, inject hint in results when KG is enabled
- Wire SetHasKG(true) in gateway when KG store is available
- Improve knowledge_graph_search tool description with concrete use cases
- Update system prompt KG guidance to be more actionable
2026-03-13 13:33:18 +07:00
viettranx 4c7db6e09b feat(agent): add mid-run message injection for DM and WebSocket
Inject user follow-up messages into the running agent loop at turn
boundaries instead of queueing them for a new run. This preserves
context so the LLM sees both tool results and user follow-ups together.

- Add InjectedMessage type and drainInjectChannel helper
- Add InjectCh to ActiveRun with buffered channel (cap=5)
- Drain injection channel at two points in agent loop (after tool
  results and before no-tool-calls exit)
- Route steer/new_task intents to InjectMessage with scheduler fallback
- WebSocket: inject into running loop when session is busy
- Remove IntentClassify config toggle (always on)
- Web UI: show send + stop buttons side by side during agent run
- i18n: add injection acknowledgment messages (en/vi/zh)
2026-03-13 11:55:55 +07:00
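
As an illustration of commit 4c7db6e09b above, a non-blocking drain of a buffered injection channel might be sketched as follows. `InjectedMessage` and `drainInjectChannel` are named in the commit, but the field layout and the `tryInject` helper are assumptions made for this example.

```go
package main

import "fmt"

// InjectedMessage is a user follow-up delivered while a run is in progress.
// The Content field is hypothetical; the commit only names the type.
type InjectedMessage struct {
	Content string
}

// drainInjectChannel collects every message currently buffered on ch
// without blocking. It returns nil when nothing is pending.
func drainInjectChannel(ch <-chan InjectedMessage) []InjectedMessage {
	var out []InjectedMessage
	for {
		select {
		case msg, ok := <-ch:
			if !ok {
				return out // channel closed: nothing more will arrive
			}
			out = append(out, msg)
		default:
			return out // buffer empty: hand control back to the agent loop
		}
	}
}

// tryInject is a non-blocking send (hypothetical helper). A false result
// means the buffer is full and the caller should fall back, for example
// to the scheduler.
func tryInject(ch chan<- InjectedMessage, msg InjectedMessage) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan InjectedMessage, 5) // cap=5, as stated in the commit
	for i := 0; i < 6; i++ {
		ok := tryInject(ch, InjectedMessage{Content: fmt.Sprintf("follow-up %d", i)})
		fmt.Println(i, ok) // the sixth send reports false
	}
	fmt.Println(len(drainInjectChannel(ch))) // 5
	fmt.Println(len(drainInjectChannel(ch))) // 0
}
```

Calling the drain at a turn boundary keeps the loop from stalling when no follow-up is pending, because the `default` branch returns immediately.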
Luan Vu b73f66d99b fix(tools): make MessageTool media path resolution workspace-aware (#169)
MessageTool.parseMediaPath() was hardcoded to only allow files in /tmp/,
while all other filesystem tools (read_file, write_file, edit, exec) use
workspace-aware resolvePath() with restrict_to_workspace enforcement.

This meant agents could create files in their workspace via write_file
but couldn't send them as attachments — only /tmp/ files from
create_image/create_audio worked.

Replace parseMediaPath() with resolveMediaPath() that:
- Reuses resolvePath() for consistent security (symlink, hardlink, traversal)
- Honors per-agent workspace + restrict_to_workspace from context
- Still allows /tmp/ as fallback (for create_image, create_audio, etc.)
- Supports relative paths resolved against workspace
- Updates tool description so LLM knows about MEDIA: prefix

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-12 20:43:52 +07:00
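
The path rules listed in commit b73f66d99b above can be approximated with the sketch below. It is not the project's `resolveMediaPath()`: the signature is assumed, Unix-style paths are assumed, and the symlink and hardlink checks that the commit attributes to `resolvePath()` are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// within reports whether the cleaned absolute path p lies inside base.
func within(base, p string) bool {
	rel, err := filepath.Rel(base, p)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// resolveMediaPath (simplified, assumed signature) resolves relative paths
// against the workspace, always accepts files under /tmp, and rejects
// anything else outside the workspace when restrict is set.
// It does NOT guard against symlinks or hardlinks.
func resolveMediaPath(workspace, p string, restrict bool) (string, error) {
	if p == "" {
		return "", errors.New("empty media path")
	}
	if !filepath.IsAbs(p) {
		p = filepath.Join(workspace, p)
	}
	p = filepath.Clean(p)
	if within("/tmp", p) {
		return p, nil // fallback for generated media such as create_image output
	}
	if restrict && !within(workspace, p) {
		return "", fmt.Errorf("media path %q is outside the workspace", p)
	}
	return p, nil
}

func main() {
	fmt.Println(resolveMediaPath("/data/ws", "out/report.pdf", true))
	fmt.Println(resolveMediaPath("/data/ws", "/tmp/img.png", true))
	fmt.Println(resolveMediaPath("/data/ws", "../secret.txt", true))
}
```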
viettranx 9115169c03 feat: expand audit logging via pub/sub event pattern
Replace direct ActivityStore injection with an event-driven audit system.
Handlers emit audit events via msgBus.Broadcast(); a single subscriber
with a buffered channel persists them to the activity_logs table.

Coverage expanded from 3 agent CRUD actions to ~65 audit points across
all HTTP handlers and WebSocket RPC methods including agents, providers,
skills, MCP servers, cron, sessions, teams, pairing, and more.
2026-03-12 18:34:56 +07:00
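
A minimal sketch of the pattern in commit 9115169c03 above, assuming one buffered channel and one subscriber goroutine. The `auditBus` type, the `AuditEvent` fields, and the drop-when-full behavior are assumptions; the real `msgBus.Broadcast()` signature is not shown in the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// AuditEvent is a hypothetical payload; the commit does not list its fields.
type AuditEvent struct {
	Actor  string
	Action string
}

// auditBus stands in for the message bus plus its single audit subscriber.
type auditBus struct {
	ch chan AuditEvent
	wg sync.WaitGroup
}

// newAuditBus starts one subscriber goroutine that hands every event to
// persist (in the real system, an insert into activity_logs).
func newAuditBus(buffer int, persist func(AuditEvent)) *auditBus {
	b := &auditBus{ch: make(chan AuditEvent, buffer)}
	b.wg.Add(1)
	go func() {
		defer b.wg.Done()
		for ev := range b.ch {
			persist(ev)
		}
	}()
	return b
}

// Broadcast never blocks the calling handler. Dropping the event when the
// buffer is full is an assumption of this sketch.
func (b *auditBus) Broadcast(ev AuditEvent) bool {
	select {
	case b.ch <- ev:
		return true
	default:
		return false
	}
}

// Close stops accepting events and waits until the subscriber has drained
// everything that was buffered.
func (b *auditBus) Close() {
	close(b.ch)
	b.wg.Wait()
}

func main() {
	var logs []AuditEvent
	bus := newAuditBus(64, func(ev AuditEvent) { logs = append(logs, ev) })
	bus.Broadcast(AuditEvent{Actor: "admin", Action: "agent.create"})
	bus.Broadcast(AuditEvent{Actor: "admin", Action: "provider.update"})
	bus.Close()
	fmt.Println(len(logs)) // 2
}
```

Keeping a single subscriber means handlers need no store dependency, and writes to the log table are serialized in one place.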
Goon 7a4a20b2e8 fix(discord): per-user memory scope in guild channels (#166)
* docs: add brainstorm report for discord guild-user memory

* docs: update brainstorm report with corrected root cause analysis

* feat(discord): per-user memory scope in guild channels

Fixes shared USER.md between guild members by scoping userID to
"guild:{guildID}:user:{senderID}" for Discord group messages.
Updates all group-context prefix checks (write permissions, writer
cache, cron peer kind, history filter) to include the new guild: prefix.

Closes #165
2026-03-12 16:45:30 +07:00
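
The scope key from commit 7a4a20b2e8 above can be illustrated with a short sketch. The key format is quoted from the commit; the function names and the `group:` prefix are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// guildUserScope builds the per-user scope key quoted in the commit:
// "guild:{guildID}:user:{senderID}".
func guildUserScope(guildID, senderID string) string {
	return "guild:" + guildID + ":user:" + senderID
}

// isGroupContext reports whether a scoped user ID denotes a group context.
// The "group:" prefix is an assumption; the commit only states that the
// existing prefix checks were extended to include "guild:".
func isGroupContext(userID string) bool {
	return strings.HasPrefix(userID, "group:") || strings.HasPrefix(userID, "guild:")
}

func main() {
	a := guildUserScope("1001", "42")
	b := guildUserScope("1001", "43")
	fmt.Println(a)                 // guild:1001:user:42
	fmt.Println(a != b)            // true: two members no longer share one USER.md scope
	fmt.Println(isGroupContext(a)) // true
}
```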
Viet Tran ace07509b7 feat(skills): system skills integration — toggle, dep checking, per-item install (#161)
* feat(infra): add runtime package support for skills

Install nodejs, npm, pandoc, github-cli + pre-install Python packages
(openpyxl, pandas, python-pptx, markitdown) and Node packages
(docx, pptxgenjs). Configure runtime dirs for agent pip/npm installs
with PIP_TARGET, NPM_CONFIG_PREFIX, NODE_PATH to enable dynamic
package installation in read-only container environment.

* feat(infra): add bundled skills with runtime package support

- Add 5 bundled skills: docx, pdf, pptx, xlsx, skill-creator from container skills-store
- Wire GOCLAW_BUILTIN_SKILLS_DIR env var in gateway and CLI
- Support optional runtime packages alongside dynamic skill loading
- Update Dockerfile to COPY bundled-skills at /app/bundled-skills/
- Add PIP_CACHE_DIR in docker-entrypoint.sh for clean pip installs
- Document bundled skills in 14-skills-runtime.md section 6

* feat(infra): remove ai-multimodal skill directory from bundled skills

Remove the ai-multimodal skill package as part of consolidating runtime
package support for bundled skills. This directory is no longer needed
in the bundled skills structure.

* feat(ci): add semantic release and Docker Hub publishing

Add go-semantic-release workflow to auto-create semver tags on merge to
main. Extend docker-publish to push all variants to both GHCR and
Docker Hub (digitop/goclaw).

* feat(skills): add system skills infrastructure with is_system column, dep scanning, and seeder

- Migration 000017: add is_system boolean column with partial index
- Store layer: UpsertSystemSkill, delete protection, IsSystemSkill
- ListAccessible auto-includes system skills (no grants needed)
- ListWithGrantStatus returns is_system field
- Dependency scanner: auto-detect deps from scripts/ or skill-manifest.json
- Dependency checker: verify system binaries, Python/Node packages
- Seeder: seed bundled skills into DB on startup (idempotent via hash)
- Gateway wiring: GOCLAW_BUNDLED_SKILLS_DIR env for bundled skills
- HTTP: delete guard (403), slug conflict check (409), rescan-deps endpoint
- UI: System badge, hide delete for system skills, rescan deps button
- Agent skills tab: "Always available" for system skills
- i18n: en/vi/zh keys for system skills, deps scanning

* feat(skills): conditional system prompt, skill manifests, and Zip Slip fix

- System prompt: only show package list when python3/node are available
- Add skill-manifest.json for pdf, docx, xlsx, pptx bundled skills
- Fix Zip Slip vulnerability in office/unpack.py (all 3 copies)

* refactor(skills): extract shared office code to _shared/ and deduplicate

Move office scripts (pack, unpack, validate, schemas, validators) from
duplicated copies in docx/xlsx/pptx to skills/_shared/office/ with
symlinks. Remove soffice.py (non-functional in containers) and update
SKILL.md references to use soffice binary directly. Update seeder
copyDir to follow symlinks.

Removes ~45K lines of duplicate code across 3 skills.

* fix(skills): address code review findings for system skills integration

- H1: Remove dead symlink branch in copyDir (filepath.Walk follows symlinks)
- H3: Fix rescan-deps to query ALL skills (including archived) and re-activate
  when deps become available; add ListAllSkills() + Status field to SkillInfo
- H4: Add Status field to SkillCreateParams, stop overloading Visibility
- M1: Batch Python/Node dep checks into single subprocess per runtime
- M4: Add rows.Err() check in ListSkills to prevent caching partial results

* feat(skills): async dep checking with realtime WS events

Split Seed() into sync DB upsert + async CheckDepsAsync() goroutine.
Gateway startup no longer blocks on Python/Node subprocess dep checks.

- Seed() returns seeded skills list, all initially status="active"
- CheckDepsAsync() runs in background, emits skill.deps.checked per-skill
- skill.deps.complete event emitted when all checks finish
- Each failed dep check: archives skill + BumpVersion() for immediate
  cache invalidation so next agent turn picks up the change
- UI: use-query-invalidation listens to skill.deps.* events → auto-refresh
  skills list in realtime

* feat(skills): system skills integration with toggle, dep checking, and per-item install

- Add is_system, deps, enabled columns to skills table (migration 017)
- Seed bundled core skills (pdf, docx, pptx, xlsx, skill-creator) on startup
- PYTHONPATH-based dep detection — eliminates false positives from local modules
- Per-item dep install UI with individual status (installing/success/error)
- Enable/disable toggle for core and custom skills (independent of dep status)
- Re-run dep check when skill is toggled back on
- Inline skill thresholds: 40 skills / 5000 tokens before switching to search mode
- Fix UpsertSystemSkill: backfill null file_hash without bumping DB version
- Remove redundant skill-manifest.json files (replaced by deps JSONB column)
- Show author from frontmatter in custom skills tab
- Runtime checker for python3/pip3/node/npm availability
- WS events for dep checking/installing progress
- docs: add 15-core-skills-system.md, 16-skill-publishing.md

---------

Co-authored-by: Goon <duy@wearetopgroup.com>
2026-03-12 09:20:41 +07:00
viettranx 72779646c1 fix: complete media provider type routing for Suno and fix dbTypeToMediaType map
- Set providerType on Suno case in registerProvidersFromDB (was missing,
  causing misrouting for custom-named Suno providers)
- Fix dbTypeToMediaType: replace dead "openai" key with "openai_compat"
  to match actual store constant, add "bailian" → "dashscope" mapping
- Clone entry.Params before injecting _provider_type to avoid mutating
  the original chain config
2026-03-11 18:34:38 +07:00
Luan Vu 405a753239 fix: resolve media provider type from DB instead of guessing from name (#154)
Media tools (create_image, create_video, create_audio, read_audio,
read_video, read_document) routed API calls based on provider name
pattern matching (e.g. strings.HasPrefix(name, "gemini")). This breaks
when users give custom names to DB providers — a Gemini provider named
"chatgpt-sap-het" would be misrouted to the OpenAI-compat endpoint,
causing 404 errors.

Fix: carry the DB provider_type through OpenAIProvider, resolve it via
typedProvider interface in ExecuteWithChain, and inject as _provider_type
param for callProvider routing. Name-based heuristic kept as fallback
for config-file providers that don't have a DB type.

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 18:32:51 +07:00
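
The routing fix in commit 405a753239 above amounts to preferring an explicit type over a name guess. A minimal sketch follows; the `typedProvider` name comes from the commit, while its method, the concrete structs, and the `"gemini"` type string are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider is the minimal view this sketch needs.
type Provider interface {
	Name() string
}

// typedProvider is implemented by providers that know their DB type.
// The method name is hypothetical.
type typedProvider interface {
	ProviderType() string
}

// dbProvider models a provider loaded from the database.
type dbProvider struct{ name, typ string }

func (p dbProvider) Name() string         { return p.name }
func (p dbProvider) ProviderType() string { return p.typ }

// fileProvider models a config-file provider without a DB type.
type fileProvider struct{ name string }

func (p fileProvider) Name() string { return p.name }

// resolveProviderType prefers the explicit type and only falls back to the
// name-based heuristic when no type is available.
func resolveProviderType(p Provider) string {
	if tp, ok := p.(typedProvider); ok {
		if t := tp.ProviderType(); t != "" {
			return t
		}
	}
	if strings.HasPrefix(p.Name(), "gemini") {
		return "gemini"
	}
	return "openai_compat"
}

func main() {
	// A Gemini provider with a custom name is routed by its DB type.
	fmt.Println(resolveProviderType(dbProvider{name: "chatgpt-sap-het", typ: "gemini"}))
	// Config-file providers still use the name heuristic.
	fmt.Println(resolveProviderType(fileProvider{name: "gemini-pro"}))
	fmt.Println(resolveProviderType(fileProvider{name: "my-llm"}))
}
```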
viettranx ec34f488df feat(tools): add tool alias registry for Claude Code skill compatibility
Add alias support to the tool Registry so Claude Code skills can reference
Anthropic tool names (Read, Write, Edit, Bash, etc.) and have them resolve
to GoClaw canonical tools (read_file, write_file, edit, exec, etc.).

- Registry: aliases map, RegisterAlias(), resolve(), Aliases()
- Policy engine: auto-includes alias defs when canonical tool passes filter
- System prompt: alias entries in coreToolSummaries + missing use_skill
- Legacy toolAliases migrated to Registry.RegisterAlias() at startup
2026-03-11 17:33:28 +07:00
Luan Vu 1b99406012 fix: resolve embedding provider from DB registry + per-agent config (#134)
The embedding provider resolution only matched 3 hardcoded names
(openai, openrouter, gemini), silently failing for DB-stored providers
like "openai-embedding". This caused memory chunks to be stored
without vectors even when a valid embedding provider was configured.

Changes:
- resolveEmbeddingProvider: fallback to provider registry for DB-stored
  provider names when hardcoded match fails
- gateway startup: read per-agent memory config from DB (priority over
  config file defaults) for embedding provider resolution
- memory IndexDocument: log embedding errors instead of swallowing them
- memory admin ListChunks: return full chunk text instead of truncating
  to 200 chars, avoiding confusing partial content in the UI

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 14:31:00 +07:00
Viet Tran 73389d2715 fix(ui): align usage data contracts, add timezone setting, and fix empty usage page (#146)
- Fix 6 data contract mismatches between Go backend JSON tags and React
  frontend TypeScript interfaces (field renames, response envelope changes)
- Add timezone selector to topbar with 12 common timezone options
- Replace date-fns formatting with native Intl.DateTimeFormat for
  timezone-aware chart labels (reduces bundle ~20KB)
- Add missing SnapshotTimeSeries fields (memory_docs, memory_chunks,
  kg_entities, kg_relations) that caused empty usage page
- Add error banner to usage page for API error visibility
- Sanitize backend error messages in usage HTTP handlers
- Add batch chunking (max 3000 rows) for snapshot upserts
- Remove userId display from topbar
- Add usage analytics i18n strings for en/vi/zh
2026-03-11 14:22:03 +07:00
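
The batch chunking mentioned in commit 73389d2715 above (max 3000 rows per upsert) could be done with a small generic helper. The `chunk` function is hypothetical and requires Go 1.18 or later; only the 3000-row limit comes from the commit.

```go
package main

import "fmt"

// chunk splits rows into consecutive batches of at most size elements.
// The batches share the backing array of rows; nothing is copied.
func chunk[T any](rows []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(rows); start += size {
		end := start + size
		if end > len(rows) {
			end = len(rows)
		}
		out = append(out, rows[start:end])
	}
	return out
}

func main() {
	const maxBatchRows = 3000 // limit stated in the commit
	rows := make([]int, 7000)
	for i, batch := range chunk(rows, maxBatchRows) {
		// In the real code each batch would be one upsert statement.
		fmt.Println(i, len(batch))
	}
}
```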
Viet Tran 0926d053b0 feat: add token usage tracking, cost analytics, budget enforcement, wake API, and activity audit trail (#142)
- A1+C2: Include token usage in run.completed event payload for WS clients
- A2: Cost tracking with model pricing config, cost calculation, and cost summary API
- A3: Budget enforcement per agent with monthly budget limits (migration 000015)
- C1: External wake/trigger API (POST /v1/agents/{id}/wake) for orchestrators
- C3: Activity audit trail with structured logging and queryable API
- UI: Activity page, cost stat card on overview, budget section in agent detail
- i18n: Complete en/vi/zh translations for all new features
2026-03-11 12:52:12 +07:00
Viet Tran cc00a6f193 fix: route delegate session keys to correct agent loop (#127)
Delegate session keys (delegate:{uuid8}:{agentKey}:{delegationId}) were
not parsed by makeSchedulerRunFunc, causing a fallback to the default agent
ID, which doesn't exist in managed-mode DBs. Add a switch/case to handle
both the agent: and delegate: prefixes.
2026-03-11 07:57:51 +07:00
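
The prefix handling in commit cc00a6f193 above might be sketched as below. The delegate key layout is quoted from the commit; the layout assumed for `agent:` keys (agent key in the second segment), the function name, and the assumption that agent keys contain no colons are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// agentKeyFromSession extracts the agent key from a session key.
// It returns false when the key has no recognized prefix, so the caller
// can decide on a fallback explicitly instead of silently using a default.
func agentKeyFromSession(sessionKey string) (string, bool) {
	parts := strings.Split(sessionKey, ":")
	switch parts[0] {
	case "agent":
		// Assumed layout: agent:{agentKey}:...
		if len(parts) >= 2 && parts[1] != "" {
			return parts[1], true
		}
	case "delegate":
		// Layout from the commit: delegate:{uuid8}:{agentKey}:{delegationId}
		if len(parts) == 4 && parts[2] != "" {
			return parts[2], true
		}
	}
	return "", false
}

func main() {
	fmt.Println(agentKeyFromSession("delegate:ab12cd34:researcher:d-42")) // researcher true
	fmt.Println(agentKeyFromSession("agent:assistant:telegram:123"))      // assistant true
	fmt.Println(agentKeyFromSession("cron:nightly"))
}
```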
Thieu Nguyen 8ad580521d refactor: deprecate standalone mode, managed mode is now default (#126)
* refactor: remove managed/standalone mode distinction from codebase

Standalone mode is deprecated; managed mode is now the only mode.
Remove redundant "managed mode" qualifiers from comments, docs,
and error messages. Error strings now reference "database stores"
instead of "managed mode" for clarity.

* improve(onboard): streamline onboard process and env setup

Simplify onboard wizard, extract helpers to dedicated file,
update env example and entrypoint for default managed mode,
clean up prepare-env script, update i18n catalogs.
2026-03-11 07:27:38 +07:00