Commit Graph

89 Commits

Author SHA1 Message Date
viettranx 4e9f155a4c feat(agent): adaptive tool timing with slow tool notification
Track per-tool execution time statistics in session metadata. When a tool
call exceeds its adaptive threshold (2x historical max, min 120s default),
send a direct outbound notification to the user.

- ToolTimingMap: parse/serialize/record/threshold from session metadata
- StartSlowTimer: fires once per tool call, auto-cancels on completion
- Team config: slow_tool toggle (default on, always direct, never leader)
- UI: toggle in team settings with i18n (en/vi/zh)
- Store: add GetSessionMetadata to session store interface
2026-03-19 13:35:57 +07:00
viettranx 5b349db7eb feat(heartbeat): provider/model override + fix cache invalidation
- Add ProviderModelSelect to heartbeat config dialog (allowEmpty, verify button)
- Backend: accept providerName in HEARTBEAT.SET, resolve to UUID via GetProviderByName
- Add ModelOverride to RunRequest, used by Loop when set (cheaper model for heartbeat)
- Ticker passes heartbeat model override to agent RunRequest
- Fix: InvalidateCache after UpdateState so ListDue picks up new next_run_at immediately
- i18n: add sectionModel/modelHint keys (en/vi/zh)
2026-03-18 23:02:48 +07:00
Duc Nguyen dc51018563 fix: subagent provider routing + api_base fallback (#262)
* fix(subagent): inherit parent agent's provider instead of alphabetical fallback

Subagents previously used a fixed provider (alphabetically first from the
registry, often "anthropic") regardless of which provider the parent agent
used. This caused invalid combos like anthropic/glm-5 when a zai-coding
agent spawned subagents.

- Pass provider registry to SubagentManager for runtime resolution
- Inject parent provider name into context (WithParentProvider)
- Resolve activeProvider from parent context before LLM call
- Fix trace spans to show actual resolved provider, not default

* fix(providers): api_base fallback from config/env for DB providers

DB providers with empty api_base now inherit from config/env vars
(e.g., GOCLAW_ANTHROPIC_BASE_URL). Prevents proxy API keys from being
sent to the real provider API endpoint.

- Add APIBaseForType() method on ProvidersConfig
- registerProvidersFromDB falls back to config when api_base is empty
- ProvidersHandler uses resolveAPIBase() for model listing
- Add api_base, display_name, settings to provider validation whitelist

* fix(tracing): pass resolved provider name to subagent span emitters

- emitSubagentSpanStart now accepts providerName param instead of
  reading sm.provider.Name() — ensures root subagent span reflects
  the inherited parent provider, not the fallback default
- registerInMemory now uses resolveAPIBase() so DB providers with
  empty api_base inherit the config/env fallback (same as startup path)

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-18 22:40:49 +07:00
viettranx 2504095dfe fix(agents): complete shell deny groups propagation chain
ShellDenyGroups was defined in SystemPromptConfig but lacked full propagation
through parser, Loop fields, context injection, and system prompt population.
Per-agent overrides from other_config JSONB had zero runtime effect.

Changes:
- agent_store.go: Add ParseShellDenyGroups() to extract overrides from JSONB
- loop_types.go: Add shellDenyGroups field to Loop and LoopConfig, wire in NewLoop
- resolver.go: Wire agent-parsed shell deny groups into LoopConfig
- loop.go: Inject shellDenyGroups into context via store.WithShellDenyGroups
- loop_history.go: Populate ShellDenyGroups in system prompt config
- message_test.go: Fix macOS symlink path normalization in test expectations

Fixes test failures on macOS where /var/folders symlinks to /private/var/folders.
2026-03-18 17:04:26 +07:00
viettranx c7d0bc19f8 fix(teams): auto-copy media files to team workspace on task creation, scope task_number per chat
- Add RunMediaPaths context key to track media files from current run
- Collect persisted media paths in agent loop after enrichment
- Auto-copy media files to {workspace}/attachments/ when leader creates task
- Append attached files hint in dispatch content so members know what to read
- Scope task_number per (team_id, chat_id) instead of global per team
- Fix NULL chat_id comparison with COALESCE
- Use hard link first, copy fallback to save disk space
- Validate filenames and use restrictive file permissions (0640)
2026-03-18 12:58:09 +07:00
viettranx b231878a85 feat(teams): add limit param to ListTasks + lightweight get-light endpoint
- Add limit parameter to ListTasks interface (dashboard=200, agent=30)
- Add teams.tasks.get-light WS method returning task only (no comments/events)
- Truncate dashboard response to exact limit (fix off-by-one from limit+1)
- Update all 7 ListTasks callers with explicit limit values
2026-03-17 18:03:10 +07:00
teexiii 99dd363b13 feat(mcp): lazy-activate deferred tools on direct call in search mode (#235)
* feat: Implement MCP manager for server connections, tool registration, and deferred tool loading for agents.

* feat: Add tests for deferred tool activation logic within the tool registry and agent loop.

* fix(mcp): prevent deny list bypass via lazy activation + fix idempotency race

- Add PolicyEngine.IsDenied() to check deny patterns (incl. group: expansion)
  before allowing lazily-activated deferred tools to execute
- Check IsDenied() in both single-tool and parallel execution paths in loop.go
- Make ActivateToolIfDeferred idempotent by checking activatedTools before
  returning false, preventing concurrent goroutines from being blocked
- Add tests for deny-on-first-call, group deny patterns, and idempotent
  concurrent activation

---------

Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-17 13:18:27 +07:00
viettranx 97cacfe68b feat(teams): member task progress reminder + fix broken progress notifications
- Fix progress event payload missing TaskNumber, Subject, OwnerAgentKey,
  ProgressPercent, ProgressStep — notifications were rendering empty
- Fix progress notification format to include task name (consistent with
  dispatched/failed) and guard empty ProgressStep
- Change percent tool schema from number to integer for clarity
- Add pre-run member task reminder injecting task context before LLM loop
- Add mid-loop progress nudge every 10 iterations with suggested percent
  based on iteration ratio (handles maxIter=0 unlimited case)
- Enhance leader cross-session reminder to show progress % when available
- Strengthen TEAM.md member guidance: focus, result quality, progress rules
- Add progress bar to task list table view (matches kanban card pattern)
2026-03-17 12:43:09 +07:00
viettranx d205691a13 fix(skills): hide skill_manage from LLM when skill_evolve is off
- skill_manage builtin tool default Enabled: true (available in registry)
- When skill_evolve=false: filter skill_manage from both tool definitions
  (API params) and system prompt tooling section — agent has zero awareness
- When skill_evolve=true: tool visible + system prompt guidance + nudges
- Update UI hints to reflect tool is available by default
2026-03-17 12:05:48 +07:00
viettranx b2a74ba487 feat(skills): skill_manage tool + skill_evolve learning loop (#218)
Adds skill_manage — a first-class agent tool for creating, updating, and
deleting skills from within a conversation — paired with per-agent
skill_evolve that nudges predefined agents to capture reusable workflows.

Tool (skill_manage):
- create: write skill from SKILL.md content string (auto-grant, dep scan)
- patch: find/replace producing new immutable version (advisory-locked)
- delete: soft-delete (archive in DB, move to .trash/)
- Security guard: 25 regex patterns block shell injection, credential exfil,
  path traversal, SQL injection, privilege escalation
- Ownership enforced: only skill owner can patch/delete (admin bypass)
- Content size limit: 100KB; companion file copy: 20MB, symlink-safe (WalkDir)
- Enabled: false by default — admin opt-in per agent

Learning loop (skill_evolve, predefined agents only):
- System prompt: SHOULD/SHOULD NOT guidance for skill creation
- Budget nudges: [System] prefix at 70%/90% iteration budget (ephemeral, i18n)
- Postscript: once-per-run suggestion with explicit user consent
- Config: other_config.skill_evolve + skill_nudge_interval (default 15)

Security hardening (pre-existing + new):
- CreateSkillManaged: RETURNING id + pg_advisory_xact_lock (atomic upsert)
- GetNextVersionLocked: advisory lock for race-safe patch versioning
- Ownership checks on HTTP update/delete, gateway update, 4 grant/revoke handlers
- copyOtherFiles: filepath.WalkDir for real symlink detection

UI: Skill Learning toggle + nudge interval in Agent General Tab
i18n: backend (en/vi/zh catalogs) + frontend (en/vi/zh locale files)
2026-03-17 11:38:35 +07:00
viettranx ca44b7279f feat(bootstrap): predefined agents keep full system prompt during onboarding
Predefined agents now retain all tools and system prompt sections when
BOOTSTRAP.md is present, instead of entering slim mode with only write_file.
Open agents keep the existing slim bootstrap mode.

- Gate tool filtering and IsBootstrap on agentType != "predefined"
- Add FIRST RUN reminder for predefined agents (without tool restriction)
- Skip bootstrap/user seeding for team-dispatched sessions (IsTeamSession)
- Group chats skip BOOTSTRAP.md entirely
- Track bootstrapWriteDetected + inject nudge after 2 turns without write_file
- Update templates: never reveal process, no capability listing, no "locked"
- Cache LoadContextFiles via existing agentCache/userCache (TTL 5min)
2026-03-17 09:25:23 +07:00
viettranx 514c5e0bfc refactor(teams): batch TaskTicker queries + leader notifications
- Replace per-team loop with batch SQL (v2 filter in JOIN)
- RecoverAllStaleTasks/ForceRecoverAllTasks/MarkAllStaleTasks return
  RecoveredTaskInfo for notification routing
- Notify leaders per (teamID, channel, chatID) scope with actionable hints
- Fix notifyLeaderCycleError routing (was silently DROPPED)
- Stale threshold: 24h → 2h default
- Remove per-session RecoverStaleTasks from loop.go (ticker handles it)
- Add rows.Err() check to scanRecoveredTaskInfoRows
2026-03-16 22:46:00 +07:00
viettranx 50a42ad110 feat(agent): team workspace resolution for lead/member agents
- Lead agents: auto-resolve team workspace as default (relative paths)
- Dispatched members: team workspace as default via req.TeamWorkspace
- Direct-chat members: own workspace default, team workspace accessible
- Add dataDir field to Loop/LoopConfig for global workspace root
- System prompt shows team workspace absolute path for model guidance
- Remove orphan task detector (superseded by post-turn dispatch)
- Log warning on OpenAI tool call argument parse failures
2026-03-16 20:06:01 +07:00
viettranx 8d6729e959 feat(teams): improve task dispatch, concurrency, and tool ergonomics
- Move task dispatch from mid-turn to post-turn to prevent dependent
  tasks from completing before the current agent's run finishes
- Add team create lock to serialize list→create flows across concurrent
  group chat sessions, preventing duplicate task creation
- Require list-before-create gate: agents must call team_tasks(list)
  before creating tasks
- Make assignee required on task creation
- Add pagination (50 per page) to task list with offset support
- Slim task list/get/search responses with dedicated structs to reduce
  context token usage
- Add task board snapshot in announce messages to leader
- Workspace: allow subdirectory paths in read/delete, show directories
  in list output
- UI: reduce kanban card title font size for better visual balance
2026-03-16 15:26:25 +07:00
viettranx 9468aae422 refactor(providers): simplify DashScope per-model thinking guard
Remove ModelThinkingCapable interface and ChatRequest.ModelSupportsThinking
hint field — DashScope handles per-model checks internally via its own
whitelist. Fix double applyThinkingGuard on ChatStream→Chat tool fallback
by calling OpenAIProvider.Chat directly.
2026-03-16 07:55:25 +07:00
hoangvinh14 a44dbf2ba4 feat(providers/dashscope): add Qwen 3.5 series support with per-model thinking capability (#215)
* feat(providers/dashscope): guard enable_thinking injection by per-model capability check
Introduces ModelThinkingCapable interface and ModelSupportsThinking field
on ChatRequest so DashScope can skip thinking-param injection for models
that do not support it (e.g. qwen3-plus, qwen3-turbo), preventing
\"model not supported\" API errors.
- types.go: add ModelThinkingCapable interface + ModelSupportsThinking *bool on ChatRequest
- dashscope.go: add dashscopeThinkingModels whitelist + ModelSupportsThinking(); honour pre-computed hint
- loop.go: detect ModelThinkingCapable and set hint on ChatRequest before LLM call
- provider_models.go: add qwen3.5-plus / qwen3.5-turbo to DashScope model list
- dashscope_test.go: full test suite for whitelist, injection, hint override, budget mapping

* Fix review code.

---------

Co-authored-by: Nguyen Gia Hoang Vinh <vinhngh@runsystem.net>
2026-03-16 07:43:08 +07:00
Viet Tran 9a9744077e refactor(teams): v2 system cleanup — remove legacy tools, fix followup, add events API (#210)
Major refactoring of the team system with multiple improvements:

## Removed legacy delegation tools
- Delete `delegate.go`, `delegate_async.go`, `delegate_sync.go`, `delegate_events.go`,
  `delegate_policy.go`, `delegate_prep.go`, `delegate_state.go`, `delegate_search_tool.go`
- Delete `evaluate_loop_tool.go`, `handoff_tool.go`
- Remove all references and registrations from tool manager and policy
- Clean up TEAM_PLAYBOOK_IDEAS.md and TEAM_SYSTEM.md (moved to docs)

## Rename await_reply → ask_user
- Rename action `await_reply` → `ask_user`, `clear_followup` → `clear_ask_user`
- Rename functions `executeAwaitReply` → `executeAskUser`, `executeClearFollowup` → `executeClearAskUser`
- Update system prompt with stronger wording to prevent model misuse
- Model was confusing "await_reply" with general waiting; "ask_user" is unambiguous

## Fix auto-followup false positives
- Add `HasActiveMemberTasks(ctx, teamID, excludeAgentID)` store method
- Guard `autoSetFollowup()` in consumer: skip when lead has active member tasks
- Prevents auto-followup when lead is orchestrating teammates (not waiting for user)

## Task identifier zero-padding
- Change format from `T-1-xxxx` → `T-001-xxxx` (3-digit minimum)

## Refactor workspace WS handlers to filesystem-only
- Rewrite `teams.workspace.list/read/delete` to use pure filesystem (os.ReadDir/ReadFile/Remove)
- Remove DB dependency from workspace WS handlers
- Consistent with storage handler and workspace tools
- Simplify TeamWorkspaceFile type and frontend hook

## Add team events listing API
- New WS method `teams.events.list` with team_id, limit, offset params
- New HTTP endpoint `GET /v1/teams/{id}/events` with bearer auth
- New `ListTeamEvents(ctx, teamID, limit, offset)` store method
- JOIN with team_tasks for team-wide event filtering

## Extract team access policy
- New `team_access_policy.go` — centralized team tool access control

## Migration 000019: team_id columns
- Add team_id foreign key columns to relevant tables

## Other improvements
- Add team_id propagation through agent loop, tracing, sessions
- Update i18n locale files (en/vi/zh) for new tool labels
- Update frontend builtin-tools page and require-setup component
- Bump RequiredSchemaVersion for migration 000019
2026-03-15 14:53:19 +07:00
Goon 19786166c1 fix(agent): enrich <media:image> tags with persisted media IDs for Discord image attachments (#179)
* fix(agent): enrich <media:image> tags with persisted media IDs for Discord image attachments

Discord image attachments were downloaded and persisted correctly, but
<media:image> tags in the message content remained bare (no ID attribute),
unlike <media:audio> and <media:video> tags which get enriched with media
IDs. This made it harder for the LLM to confirm image receipt and
reference specific images.

Add enrichImageIDs() that embeds persisted media IDs into <media:image>
tags, matching the existing enrichAudioIDs()/enrichVideoIDs() pattern.
Iterates refs in reverse order to correctly map multiple image refs to
their positional tags when users attach several images at once.

Closes #178

https://claude.ai/code/session_01KkE9UxcNB8eXpqRiJHRqeB

* style(agent): add missing continue in enrichImageIDs for consistency with enrichVideoIDs

https://claude.ai/code/session_01KkE9UxcNB8eXpqRiJHRqeB

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Viet Tran <viettranx@gmail.com>
2026-03-13 23:01:59 +07:00
Viet Tran 1a42dc93a6 feat(teams): team system v2 with bug fixes, workspace scope, versioning, and prompt optimization (#183)
* feat(workspace): add team shared workspace for file collaboration

- Add workspace_write and workspace_read tools for agents to share files across team members
- Create team_workspaces DB table with migration 000017 (file metadata, pinning, tags)
- Implement PostgreSQL store layer for workspace CRUD operations
- Add RPC handlers for workspace list/read/delete from web UI
- Build React workspace tab with file listing, content preview, and delete
- Propagate workspace channel/chatID scope through delegation chain
- Auto-allow workspace tools in agent tool policy when agent belongs to a team
- Inject team workspace guidance into system prompt for team agents
- Add /reset command handler for clearing session history
- Harden MCP bridge context middleware to reject headers when no gateway token
- Add i18n strings for workspace UI in en/vi/zh locales

* feat(teams): add comprehensive task management with followup reminders and recovery

- Add task followup/reminder system with auto-set on lead agent reply and auto-clear when user responds on channel
- Add task recovery ticker to re-dispatch stale/pending tasks periodically
- Add task scopes, filtering by status/channel/chatID, and task events
- Add WS RPC handlers for task CRUD, assignments, comments, events, and bulk operations (teams_tasks.go)
- Add task detail dialog, settings UI for followup config, and scope filtering in web dashboard
- Add migrations 000018 (team_tasks_v2) and 000019 (task_followup)
- Extend team_tasks_tool with await_reply, clear_followup actions
- Auto-complete/fail team tasks when delegate agent finishes
- Add workspace file listing and team tool manager enhancements

* docs(teams): add team system architecture and playbook ideas documentation

- Add TEAM_SYSTEM.md with full architecture design covering task management, shared workspace, and delegation engine subsystems
- Add TEAM_PLAYBOOK_IDEAS.md outlining future team coordination layers (playbook, member capabilities, auto-learned patterns)
- Document data models, status flows, tool actions, followup reminder system, task ticker, execution locking, and workspace scope model

* fix(teams): resolve 6 critical bugs in team task system

- Fix unblock SQL: check array_length after array_remove (not before)
- Enforce single-team leadership in team creation
- Add requireLead() for approve/reject tool actions
- Validate cross-team dependency references in blocked_by
- Add team_id to handoff route for multi-team isolation
- Set blocked_by DEFAULT '{}' to prevent NULL array issues

* refactor(workspace): use stable userID as scope key instead of connection UUID

Workspace scope changed from (team_id, channel, chat_id) to (team_id, userID).
Fixes workspace fragmentation across WS tab refreshes and reconnections.

* feat(teams): add V1/V2 versioning with feature gating and optimized prompts

- IsTeamV2() helper gates advanced features (locking, followup, review, audit)
- V2 tool actions rejected for V1 teams with clear error message
- Ticker, gateway consumer, delegation hooks respect version flag
- TEAM.md renders v1/v2 sections conditionally
- Tool descriptions and params optimized (~38% token reduction)
- UI: version toggle in settings, V2 Beta badge, conditional rendering
- i18n: version modal keys for en/vi/zh

* fix(migration): use VARCHAR(255) for user ID columns and add metadata JSONB

- assignee_user_id, user_id, actor_id: TEXT → VARCHAR(255)
- Add metadata JSONB to team_task_comments and team_task_attachments

---------

Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
2026-03-13 22:41:32 +07:00
viettranx 4c7db6e09b feat(agent): add mid-run message injection for DM and WebSocket
Inject user follow-up messages into the running agent loop at turn
boundaries instead of queueing them for a new run. This preserves
context so the LLM sees both tool results and user follow-ups together.

- Add InjectedMessage type and drainInjectChannel helper
- Add InjectCh to ActiveRun with buffered channel (cap=5)
- Drain injection channel at two points in agent loop (after tool
  results and before no-tool-calls exit)
- Route steer/new_task intents to InjectMessage with scheduler fallback
- WebSocket: inject into running loop when session is busy
- Remove IntentClassify config toggle (always on)
- Web UI: show send + stop buttons side by side during agent run
- i18n: add injection acknowledgment messages (en/vi/zh)
2026-03-13 11:55:55 +07:00
viettranx 6eb33f9cea feat: decouple memory/KG sharing from workspace folder sharing
Add independent `share_memory` config flag to control memory and
knowledge graph sharing separately from workspace folder isolation.

- Add ShareMemory field to WorkspaceSharingConfig
- Decouple WithSharedMemory(ctx) from shouldShareWorkspace() in loop.go
- Add shouldShareMemory() helper independent of workspace sharing
- Fix KG Traverse CTE to scope user_id in recursive step (pre-existing bug)
- Add memory toggle UI with violet styling in workspace sharing section
- Add i18n translations (en/vi/zh) for new memory sharing controls
- Add unit tests for shouldShareMemory() independence
2026-03-12 18:26:40 +07:00
viettranx bece4525ba feat: share memory and KG across users when workspace sharing is enabled
Memory (MEMORY.md, memory/*) and knowledge graph are now shared when
workspace sharing is active, matching the filesystem sharing behavior.
Previously memory was always per-user isolated even with shared workspace,
causing inconsistencies when collaborating on the same files.

Adds MemoryUserID(ctx) helper that returns empty userID (global scope)
when shared memory flag is set, used by memory interceptor, memory tools,
and KG search. UI warning updated to note data is not migrated on toggle.
2026-03-12 12:09:38 +07:00
viettranx 25b24ebd50 feat: configurable workspace sharing with per-agent DM/group/user controls
Add workspace_sharing config in other_config JSONB to control per-user
workspace isolation. When enabled, users share the base workspace directory
instead of isolated subfolders — configurable separately for DMs and groups,
with a per-user allowlist override.

Backend: WorkspaceSharingConfig struct, ParseWorkspaceSharing(), conditional
isolation in loop.go/loop_history.go, 7 unit tests.
Frontend: prominent always-visible config section with contact search
combobox, sticky save bar layout fix, i18n (en/vi/zh).
2026-03-12 10:54:17 +07:00
Luan Vu b488ef44d6 fix: media tag enrichment, Gemini file polling, credential merge (#158)
1. Media tag enrichment (audio/video/document):
   - Add enrichVideoIDs() — video media_id was never injected into
     <media:video> tags, causing LLM to hallucinate UUIDs
   - Fix all enrich functions to replace the LAST bare tag instead of
     the first. When group history prepends older media tags, the first
     occurrence belongs to history — injecting the current turn's ID
     there causes the LLM to reference the wrong file

2. Gemini File API polling:
   - Upload response returns fileURI immediately but file may still be
     in PROCESSING state. Check state field; only skip polling when
     file is already ACTIVE. Fixes "not in an ACTIVE state" errors

3. Channel instance credential merge:
   - Partial credential updates (e.g. updating just token) now merge
     with existing credentials instead of wiping other fields
   - Loads, decrypts, merges, re-encrypts in a single Update() call

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-12 09:35:04 +07:00
Luan Vu 7386fc8ad7 fix: read_audio fails when LLM passes literal <media:audio> as media_id (#156)
The read_audio tool errors with "audio not found" because:
1. BuildMediaTags() creates bare <media:audio> tags without IDs
2. persistMedia() generates UUIDs stored in context, not in message content
3. LLM sees <media:audio> and passes it literally as media_id parameter
4. resolveAudioFile() tries to match "<media:audio>" against UUIDs — fails

Two-part fix:
- enrichAudioIDs(): embed persisted UUIDs into <media:audio> tags (like
  enrichDocumentPaths does for documents), so LLM sees actual IDs
- resolveAudioFile(): sanitize tag-like media_id values and fallback to
  most recent audio instead of hard error on unmatched IDs

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 19:57:15 +07:00
Luan Vu 0592be359d fix: remove legacy per-agent imageGen/vision override from tools_config (#153)
The per-agent `imageGen` and `vision` fields in `ToolPolicySpec` (stored
in agents.tools_config JSONB) were added in d5cc5a7 (Feb 26) as the
original way to configure image/vision providers. When the media provider
chain system was introduced in 5815437 (Mar 8), these fields were kept
"for backward compat" but became dead code with no UI to manage them.

This causes a hard-to-debug issue: if an agent's tools_config contains
stale imageGen/vision data (set via API or leftover from DB), it silently
overrides the global provider chain configured in the builtin tools UI.
Users see the correct chain in the UI but the tool calls a completely
different provider/model, with no indication of why.

Changes:
- Remove Vision and ImageGen fields + struct definitions from ToolPolicySpec
- Remove associated context helpers (WithVisionConfig, WithImageGenConfig, etc.)
- Remove per-agent override injection in agent loop
- Simplify create_image and read_image to use chain as sole source of truth
- UI: whitelist known tools_config fields on save to clean stale DB data

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 17:37:55 +07:00
Goon c25e770d43 feat(ui): multi-skill upload with client-side validation (#149)
* feat(ui): multi-skill upload with client-side validation

Allow uploading multiple skill ZIP files at once with pre-upload
validation. JSZip parses each ZIP client-side to verify SKILL.md
presence, frontmatter format, and slug validity before upload.

- Add JSZip dependency (lazy-loaded, code-split ~30KB gzip)
- Create validate-skill-zip.ts mirroring server-side checks
- Rewrite skill-upload-dialog for multi-file with status badges
- Add concurrent validation, sequential upload with per-file progress
- Add empty SKILL.md check to backend upload handler
- Add i18n keys for all new UI strings (en/vi/zh)

* fix(ui): duplicate entries and validation hang in multi-skill upload

- Move pending list construction to assignment inside updater return
  to prevent StrictMode double-invoke from pushing duplicates
- Wrap per-file validateSkillZip in try/catch so one failure doesn't
  block Promise.all and leave entries stuck in "validating" state

* fix(ui): use static import for JSZip instead of dynamic import

Dynamic import("jszip") fails in browser - bare module specifiers
don't resolve at runtime. Use static import which Vite handles
via its module graph and code-splits automatically.

* feat(ui): add inline visibility toggle on skills table

Click the visibility badge on managed skills to cycle through
private → internal → public. File-based skills stay read-only.

* fix(ui): move dedup logic outside state updater in upload dialog

Avoids reading stale entries inside functional updater. Builds
pending list from current entries state before calling setEntries.

* fix(ui): auto-select first active agent when current agent unavailable

When agents load from API, if the current selected agent is not in the active agents list, automatically select the first available active agent instead of remaining unset. Prevents chat page from being unable to send messages when default agent selection is invalid.

* feat(ui): make agent display name editable in setup wizard

Allow users to customize the agent display name during onboarding instead of keeping it hardcoded to "GoClaw". Removed read-only state from the display name input and added a placeholder for guidance.

* feat: add document path enrichment and media filename support

Backend changes:
- enrichDocumentPaths() in agent/media.go: injects persisted file paths into <media:document> tags
- Document paths allow skills (e.g. pdf skill via exec) to access files directly
- chat.go: support new media format {path, filename} alongside legacy string paths
- Updated read_document tool description to guide agent on using path attribute
- Docker: add pypdf to Python dependencies for PDF processing
- Softened MUST language in read_* tool descriptions (changed to Call this)

Frontend changes:
- chat-input.tsx: attach filename with each uploaded file in media payload
- use-chat-send.ts: send media as {path, filename} objects instead of just paths
- i18n: add "uploaded_files" text in en, vi, zh locales
- chat-page.tsx: minor adjustment for media handling

Enables skills to process uploaded documents directly without intermediate copying.
2026-03-11 16:59:03 +07:00
Luan Vu 2fdb791802 fix: honor per-agent DB settings for restrict, subagents, memory, sandbox (#145)
Four per-agent settings stored in the database (and configurable via UI)
were silently ignored at runtime because the tool/system layer always
used the global config defaults instead.

**restrict_to_workspace**: Tools used the global config default baked at
startup. Fix: pass per-agent value through context; tools check context
override before falling back to constructor default.

**subagents_config**: ParseSubagentsConfig() existed but was never called.
All agents shared one SubagentManager with global limits. Fix: resolve
per-agent config in the agent resolver, store it on each spawned task,
and use it for limit checks, deny lists, and system prompt generation.

**memory_config**: Only the enabled toggle was read per-agent; search
weights (vector_weight, text_weight, max_results, min_score) were
hardcoded from PGMemoryStore defaults. Fix: extend MemorySearchOptions
with weight overrides, read per-agent config from context in the
memory_search tool.

**sandbox_config**: Only workspace_access was extracted per-agent; mode,
image, memory, CPU, timeout, network settings were discarded. Fix: pass
full sandbox.Config through context; Manager.Get() accepts an optional
config override for new containers.

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-11 16:05:56 +07:00
Viet Tran 0926d053b0 feat: add token usage tracking, cost analytics, budget enforcement, wake API, and activity audit trail (#142)
- A1+C2: Include token usage in run.completed event payload for WS clients
- A2: Cost tracking with model pricing config, cost calculation, and cost summary API
- A3: Budget enforcement per agent with monthly budget limits (migration 000015)
- C1: External wake/trigger API (POST /v1/agents/{id}/wake) for orchestrators
- C3: Activity audit trail with structured logging and queryable API
- UI: Activity page, cost stat card on overview, budget section in agent detail
- i18n: Complete en/vi/zh translations for all new features
2026-03-11 12:52:12 +07:00
viettranx ee31387aa1 fix(security): disable config leak detection to prevent false positives
StripConfigLeak was blocking legitimate responses when predefined agents
mentioned SOUL.md/IDENTITY.md/AGENTS.md in architecture explanations.
Improved detection logic to exclude code blocks but disabled the gate
entirely for now until a more robust approach is designed.
2026-03-10 23:11:02 +07:00
viettranx ef720ee13a feat(tracing): two-phase spans — show running state before completion
Spans are now emitted BEFORE execution starts with status "running" and
input visible, then updated when the step completes. This lets users see
what the system is doing during long-running LLM calls and tool executions.

- Add EmitSpanUpdate() to collector with separate update buffer
- Flush ordering: batch INSERT new spans first, then process updates
- Split LLM/tool/agent spans into start/end pairs (agent loop + subagent)
- Emit tool span end inside goroutines for parallel calls (no orphans)
- EmitSpanUpdate is a channel send — works after ctx cancellation
2026-03-10 13:52:24 +07:00
viettranx bdb60de7ae chore: upgrade Go 1.25 → 1.26 and apply go fix modernizations
- Update go.mod and Dockerfile to Go 1.26
- Apply `go fix ./...` stdlib modernizations across 170+ files
- Add `go fix` to post-implementation checklist in CLAUDE.md
- Fix go fix misapplied rewrite in loop_history.go
2026-03-10 00:09:15 +07:00
viettranx e593b9cf22 feat(channels): real-time agent activity status & intent classification
- Add tool status display on channels during tool execution (streaming preview + reactions)
- Emit agent.activity events at phase transitions (thinking, tool_exec, compacting)
- Enrich delegation progress with per-member activity and tool info
- Add LLM-based intent classifier for DM status queries when agent is busy
  - Keyword fast-path for cancel/status patterns (no LLM cost)
  - Falls back to LLM classification with 5s timeout
  - Supports status_query (immediate reply) and cancel (abort run) intents
- Register/unregister runs in makeSchedulerRunFunc for channel inbound tracking
- Add sessionRuns secondary index in Router for O(1) IsSessionBusy lookups
- Add intent_classify config toggle (global default + per-agent override)
- Add tool_status config toggle for channel tool status display
- Add i18n keys and translations (en/vi/zh) for status messages
- Add web UI config toggles for intent_classify and tool_status
2026-03-09 23:58:56 +07:00
Nam Nguyen Ngoc 11bed0cc01 fix(mcp-bridge): per-session security context + media forwarding (#91)
* fix(mcp-bridge): add per-session agent context and HMAC verification

- Add per-session MCP config with X-Agent-ID/X-User-ID headers instead
  of shared global config file
- Sign bridge context headers with HMAC-SHA256 to prevent forgery
- Add bridgeContextMiddleware to verify signatures on MCP bridge requests
- Store MCP configs in ~/.goclaw/mcp-configs/ outside agent workDir
- Use atomic writes (tmp + rename) for MCP config files
- Fix provider rename leaving ghost registry entries
- Remove provider_type from mutable fields on update
- Tighten temp dir permissions from 0755 to 0700

* feat(mcp-bridge): propagate channel routing context through MCP bridge

- Pass channel, chat_id, and peer_kind from agent loop to CLI provider options
- Inject X-Channel, X-Chat-ID, X-Peer-Kind headers in bridge context middleware
- Add BridgeContext struct to bundle per-call context for MCP config generation
- Include channel routing headers in per-session MCP config files
- Expose "message" tool via MCP bridge for cross-channel messaging
- Add extract helpers for new option keys in claude_cli_session.go

* feat(mcp-bridge): forward media attachments to outbound message bus

- Wire MessageBus into gateway server and MCP bridge handler
- Publish tool result media files to outbound bus for channel delivery
- Extract channel/chatID/peerKind from tool context for proper routing
- Add mimeFromExt helper for content-type detection on attachments

* feat(mcp-bridge): inject per-agent DB-backed MCP servers into Claude CLI config

- Add MCPServerLookup type to resolve agent-specific MCP servers from DB
- Wire MCPServerStore through provider registration and HTTP handler
- Extract mcpServerEntryToConfig helper to deduplicate transport config logic
- Add JSON-to-Go helpers (jsonToStringSlice, jsonToStringMap) for DB fields
- Merge per-agent MCP servers at config write time without overriding static entries

* fix(mcp-bridge): use Media struct fields and prefer explicit MimeType

- Map Media.Path to attachment URL instead of treating Media as string
- Use Media.MimeType when available, fall back to extension-based detection

* refactor(providers): deduplicate option extractors and extract bridge media forwarding

- Replace per-field extractors (extractSessionKey, extractAgentID, etc.) with generic extractStringOpt/extractBoolOpt
- Add bridgeContextFromOpts helper to build BridgeContext in one call
- Extract forwardMediaToOutbound from inline block in makeToolHandler
- Change NewBridgeServer msgBus param from variadic to explicit pointer

* fix(providers): validate provider_type on update instead of silently dropping it

- Add explicit validation against ValidProviderTypes with 400 response
- Remove silent delete(updates, "provider_type") that hid invalid values
- Caller now receives clear error when submitting unsupported provider_type

* fix(providers): add header injection validation to MCP bridge headers

- Extend CRLF/null-byte checks to agentID, channel, chatID, and peerKind
- Previously only userID had header injection prevention
- Prevents HTTP header injection via crafted values in MCP config

* fix(mcp-bridge): sign all context fields in HMAC and remove legacy code

- Sign all 5 bridge context fields (agentID|userID|channel|chatID|peerKind)
  in HMAC instead of only agentID|userID to prevent channel routing forgery
- Propagate context.Context into MCPServerLookup to respect request
  cancellation instead of using context.Background()
- Remove legacy BuildCLIMCPConfig, WithClaudeCLIMCPConfig, mcpConfigPath,
  and mcpCleanup (dead code since system is PG-only)
- Use mime.TypeByExtension before custom fallback in mimeFromExt
- Add debug log when media forwarding is skipped due to missing context
- Add thread-safety comment to SetMCPServerLookup

---------

Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
Co-authored-by: viettranx <viettranx@gmail.com>
2026-03-09 15:23:56 +07:00
Luan Vu a321af8b04 feat(ui): add tool call details with arguments, results, and thinking display (#92)
- Enhance tool call cards with expandable arguments/result sections and inline summary
- Display thinking/reasoning blocks in chat messages (uses existing ThinkingBlock component)
- Reconstruct tool details from session history for persistent display
- Emit tool result content and truncated arguments in agent events
- Fix sanitize to always return cleaned content instead of empty string

Co-authored-by: Luvu182 <208665161+Luvu182@users.noreply.github.com>
2026-03-09 12:57:55 +07:00
viettranx 967f7ae46f refactor: split gateway, consumer, onboard, and agent loop into smaller files
Extract helper functions and move existing functions to dedicated files:
- cmd/gateway.go → gateway_channels_setup.go (channel registration, RPC wiring, event subscribers)
- cmd/gateway_consumer.go → gateway_consumer_helpers.go, gateway_consumer_process.go
- cmd/gateway_managed.go → gateway_http_handlers.go (wireHTTP)
- cmd/onboard.go → onboard_resolve.go (API key resolution helpers)
- internal/agent/loop.go → loop_run.go (Run entry point)

No logic changes — only code movement between files within the same package.
2026-03-09 10:49:58 +07:00
Duc Nguyen e05a4018c9 fix: use platform type instead of instance name in system prompt + Zalo group routing (#90)
* fix(agent): use ChannelType in system prompt for proper channel context

The system prompt was using the channel instance name (e.g. "zep-lao") instead
of the platform type (e.g. "zalo_personal"), causing the LLM to not understand
which messaging platform it's running on. This led to context confusion where
the bot would ask users which channel to send to instead of using the current one.

Changes:
- Add ChannelType field to RunRequest and SystemPromptConfig
- Thread channel type from consumer/cron → agent loop → system prompt
- Add WithToolChannelType/ToolChannelTypeFromCtx for tool context
- Register channel types for both config-based and DB-loaded instances
- Fix Zalo group thread type detection with approvedGroups cache
- Update cron handler to resolve channel type for cron-triggered runs

* refactor(channels): add Type() to Channel interface, remove channelTypes map

Move channel type from a separate map in Manager to the Channel interface
itself. BaseChannel.Type() falls back to Name() for config-based channels
where name == type. Extracts resolveChannelType helper to DRY up 6
repeated resolution blocks across consumer and cron handlers.

* feat(zalo): add pending group history for conversation context

Zalo personal groups now record non-@mentioned messages in a ring buffer
(default 50, configurable via history_limit). When the bot IS mentioned,
pending history is flushed as context — matching Telegram/Discord/Feishu.

Separated mention gating from policy gating in checkGroupPolicy for
cleaner control flow.
2026-03-09 08:30:45 +07:00
viettranx d839e034af feat(tools): add exec path exemptions and tool arguments in events
- Add AllowPathExemptions to ExecTool for fine-grained deny bypass (skills-store)
- Include tool call arguments in tool.result event payloads
2026-03-08 22:40:09 +07:00
viettranx e1a6801a7a fix(tools): correct Veo API, media ref ordering, video tag, and model verify
- Fix create_video: use predictLongRunning API instead of generateContent
  (async polling flow: POST → poll every 10s → download video from URI)
- Fix durationSeconds as int (not string) per actual Gemini API requirement
- Fix MediaRef collection order: historical first, current last, so
  refs[len-1] always returns the most recent file (fixes read_audio
  picking up old file instead of current voice message)
- Remove misleading "video not yet supported" text from Telegram handler
  that prevented LLM from calling read_video tool
- Add isNonChatModel() to skip chat-based verify for generation models
  (veo-*, dall-e-*, imagen-*, gemini-*-image)
2026-03-08 15:21:08 +07:00
viettranx 691ddce8fb feat(tools): add read_audio, read_video, create_video tools and fix system prompt tool filtering
- Add read_audio tool with Gemini File API, OpenAI input_audio, and fallback support
- Add read_video tool with Gemini File API and base64 fallback for video analysis
- Add create_video tool with Gemini Veo and OpenRouter chat completions support
- Add shared gemini_file_api.go for upload → poll → generateContent pipeline
- Add shared openai_compat_call.go for custom JSON chat completions
- Fix system prompt showing denied tools: use filteredToolNames() instead of tools.List()
- Wire audio/video MediaRef context propagation in agent loop
- Register new tools in seed data, policy groups, and web UI settings
- Enforce duration (max 30s) and aspect_ratio limits on create_video
2026-03-08 14:43:18 +07:00
viettranx ea185b3f6c feat(agents): add self-evolution config and instances management for predefined agents
Self-Evolution: predefined agents can now optionally evolve their SOUL.md
(communication style/tone only) when self_evolve is enabled in other_config.
Identity, name, and operating instructions remain locked. Context propagation
flows through LoopConfig → Loop → context.WithValue → interceptor carve-out.
System prompt guides the agent on what it can/cannot evolve.

Instances Tab: new HTTP endpoints and UI tab for viewing/editing per-user
USER.md files on predefined agents. Includes owner-only access checks,
fileName validation (USER.md only), and cache invalidation.

UI: self-evolve toggle in General tab, create dialog, and setup wizard.
Agent type and evolve/static badges with tooltip explanations on cards
and detail header. TooltipProvider added to agents list and detail pages.
2026-03-08 14:27:40 +07:00
viettranx 0f2737ce53 feat(media): persistent media storage, read_document tool, and pipeline refactor
- Add persistent media storage (internal/media/) replacing temp file deletion
- Add MediaRef type for lightweight media references in session messages
- Refactor media pipeline to use bus.MediaFile{Path, MimeType} across all channels
- Add read_document builtin tool for PDF/DOCX/XLSX analysis via Gemini native API
- Move image sanitization from Telegram to shared agent/media layer
- Add media reload for multi-turn conversations (images from last 5 messages)
- Add reply-to-message media resolution for Telegram (re-download on reply)
- Add media inventory to compaction summary to preserve awareness after truncation
- Fix coreToolSummaries for read_image, read_document, create_image tools
- Add real-time trace update events via WebSocket broadcast
- Improve trace detail UI with media refs and tool result display
2026-03-08 14:00:34 +07:00
viettranx a53f3e092f fix(media): use absolute paths and relative URLs for WS media delivery
- FilesHandler now serves files by absolute path (auth-token protected)
  instead of requiring a workspace root, supporting multiple agent workspaces
- mediaToMarkdown generates relative URLs (/v1/files/...) instead of
  absolute http://host:port URLs so images work from any client origin
- Deduplicate media collection in agent loop: prefer result.Media over
  MEDIA: prefix parsing to prevent duplicate images
2026-03-08 00:41:38 +07:00
viettranx 9897dd77ed fix(ws-media): use ContentSuffix to inject images into session before run.completed
Previous approach failed because:
1. PublishOutbound with channel "ws" is silently dropped (no handler)
2. LLM strips image URLs despite instruction text
3. Post-run AddMessage races with frontend loadHistory()

New approach: ContentSuffix field on RunRequest is appended to the
assistant response inside the agent loop BEFORE saving to session
and BEFORE emitting run.completed. This guarantees image markdown
is in the session when frontend calls loadHistory().

Only affects WS channel — other channels still use ForwardMedia
and their respective outbound handlers.
2026-03-07 23:08:13 +07:00
viettranx 892ee8ea70 feat(ws-chat): add HTTP file serving, WS media delivery, and streaming UX improvements
- Add GET /v1/files/{path} endpoint with Bearer token + query param auth
- Pre-convert media to markdown HTTP URLs in announce messages for WS channel
- Add HideInput flag to skip persisting system messages in session history
- Add fallback to append image URLs if LLM strips them from announce response
- Fix stream-to-history DOM flash by promoting streamed text locally
- Fix tool call card word-wrap and expandable arguments panel
- Fix snake_case mismatch in Message types (toolCalls → tool_calls)
- Add clickable image rendering in markdown renderer
2026-03-07 22:46:04 +07:00
viettranx df684e2582 refactor(security): fix false positives and deduplicate web security hardening
- Fix opacity/font-size zero detection false positives (0.5, 0.8 matched as zero)
- Extract scanWebToolResult helper to eliminate DRY violation in agent loop
- Move hidden element detection to dedicated web_fetch_hidden.go file
- Replace map false-entries with comments for hiddenClasses documentation
2026-03-07 19:35:19 +07:00
Nam Nguyen Ngoc b901a82551 fix(security): harden web fetch/search against prompt injection and cache poisoning (#80)
- Scan web_fetch/web_search tool results for prompt injection patterns via inputGuard
- Strip hidden HTML elements (display:none, aria-hidden, sr-only classes) during conversion
- Scope web tool caches per channel to prevent cross-channel cache poisoning
- Enforce domain blocklist and allowlist checks on HTTP redirect targets
- Add untrusted content reminder to external content wrapper
- Log redirect source URL in fetch results for transparency

Co-authored-by: Nam Nguyen Ngoc <namnn.0911@gmail.com>
2026-03-07 19:31:56 +07:00
viettranx b2c4d543aa feat(providers): add Claude CLI provider with MCP bridge (#61)
Add Claude CLI as an LLM provider (subscription-based, no API key needed).
The CLI manages session history, tool execution, and context while GoClaw
forwards messages and streams responses.

Key features:
- Claude CLI provider with session persistence (--resume)
- MCP bridge server exposing GoClaw tools to CLI via streamable-http
- Security hooks (shell deny patterns, workspace path restrictions)
- Per-session mutex preventing concurrent CLI calls
- Onboard wizard for Claude CLI setup and auth verification
- Web UI for adding/managing Claude CLI provider with auth status
- Provider registry Close() for proper shutdown cleanup

Security:
- CLI path validation (only "claude" or absolute paths from DB)
- Token auth middleware for MCP bridge endpoint
- Shell injection prevention in hook scripts (single-quoted paths)
- Relative path resolution before workspace boundary checks
- Resource leak prevention on provider replace/unregister

Co-authored-by: nhokboo <nhokboo@users.noreply.github.com>
2026-03-07 02:06:39 +07:00
viettranx faa47abfb6 feat(block-reply): deliver intermediate text during tool iterations with 2-tier config (#55)
Add block.reply event that delivers intermediate assistant text to non-streaming
channels during multi-tool iterations. Includes 2-tier config toggle:
gateway-level default (disabled) + per-channel override (inherit/on/off).

Backend:
- Emit block.reply events from agent loop between tool iterations
- Add BlockReply *bool to GatewayConfig and all 6 channel config structs
- Add BlockReplyChannel interface with ResolveBlockReply() resolution
- Guard delivery in HandleAgentEvent by RunContext.BlockReplyEnabled
- Resolve config at RegisterRun time, pass to consumer goroutine
- Conditional dedup: skip final message if identical to last block reply

UI:
- Gateway settings: Switch toggle for global default
- Per-channel: tri-state select (Inherit from gateway / Enabled / Disabled)
- Protocol: BLOCK_REPLY constant in AgentEventTypes
- Form: coerceBoolSelects for proper JSON boolean serialization
2026-03-07 01:06:10 +07:00
Luan Vu 7d744eb4f2 refactor(oauth): DB-backed token storage, split codex.go, remove file-based artifacts (#65)
Replace file-based OAuth token storage with DB-backed storage using
llm_providers (access token) + config_secrets (refresh token).

- Store: Add Settings JSONB field, chatgpt_oauth provider type
- OAuth: DBTokenSource backed by provider + secrets stores
- HTTP: oauth.go uses DB stores + registers provider in-memory
- Providers: chatgpt_oauth support in registerInMemory/registerProvidersFromDB
- Config: Remove HasOAuthToken, revert envFallback→envStr
- CLI: auth commands call HTTP API on running gateway
- Split codex.go (478→189 LOC) into codex.go + codex_build.go + codex_types.go
- Frontend: Remove fake OAUTH_PROVIDER_ID, use real DB-backed providers
- Tests: Rewrite with mock stores, fix SSE mock servers
2026-03-07 00:15:30 +07:00