01 - Agent Loop
Overview
The Agent Loop implements a Think --> Act --> Observe cycle. Each agent owns a Loop instance configured with a provider, model, tools, workspace, and agent type. A user message enters as a RunRequest, passes through runLoop, and exits as a RunResult. The loop iterates up to 20 times: the LLM thinks, optionally calls tools, observes results, and repeats until it produces a final text response.
1. RunRequest Flow
The full lifecycle of a single agent run is broken into seven phases.
flowchart TD
START([RunRequest]) --> PH1
subgraph PH1["Phase 1: Setup"]
P1A[Increment activeRuns atomic counter] --> P1B[Emit run.started event]
P1B --> P1C[Create trace record]
P1C --> P1D[Inject agentType / userID / agentID into context]
P1D --> P1E0[Compute per-user workspace + WithToolWorkspace]
P1E0 --> P1E[Ensure per-user files via sync.Map cache]
P1E --> P1F[Persist agent + user IDs on session]
end
PH1 --> PH2
subgraph PH2["Phase 2: Input Validation"]
P2A["InputGuard.Scan - 6 injection patterns"] --> P2B["Message truncation at max_message_chars (default 32K)"]
end
PH2 --> PH3
subgraph PH3["Phase 3: Build Messages"]
P3A[Build system prompt - 15+ sections] --> P3B[Inject conversation summary if present]
P3B --> P3C["History pipeline: limitHistoryTurns --> pruneContextMessages --> sanitizeHistory"]
P3C --> P3D[Append current user message]
P3D --> P3E[Buffer user message locally - deferred write]
end
PH3 --> PH4
subgraph PH4["Phase 4: LLM Iteration Loop (max 20)"]
P4A[Filter tools via PolicyEngine] --> P4B["Call LLM (ChatStream or Chat)"]
P4B --> P4C[Accumulate tokens + record LLM span]
P4C --> P4D{Tool calls in response?}
P4D -->|No| EXIT[Exit loop with final content]
P4D -->|Yes| PH5
end
subgraph PH5["Phase 5: Tool Execution"]
P5A[Append assistant message with tool calls] --> P5B{Single or multiple tools?}
P5B -->|Single| P5C[Execute sequentially]
P5B -->|Multiple| P5D["Execute in parallel via goroutines, sort results by index"]
P5C & P5D --> P5E["Emit tool.call / tool.result events, record tool spans, save tool messages"]
end
PH5 --> PH4
EXIT --> PH6
subgraph PH6["Phase 6: Response Finalization"]
P6A["SanitizeAssistantContent (7-step pipeline)"] --> P6B["Detect NO_REPLY - suppress delivery if silent"]
P6B --> P6C[Flush all buffered messages atomically to session]
P6C --> P6D[Update metadata: model, provider, token counts]
end
PH6 --> PH7
subgraph PH7["Phase 7: Auto-Summarization"]
P7A{"> 50 messages OR > 75% context window?"}
P7A -->|No| P7D[Skip]
P7A -->|Yes| P7B["Memory flush (synchronous, max 5 iterations, 90s timeout)"]
P7B --> P7C["Summarize in background goroutine (120s timeout)"]
end
PH7 --> POST
subgraph POST["Post-processing"]
PP1[Emit root agent span] --> PP2["Emit run.completed or run.failed"]
PP2 --> PP3[Finish trace]
end
POST --> RESULT([RunResult])
Phase 1: Setup
- Increment the `activeRuns` atomic counter (no mutex -- true concurrency, especially in group chats with `maxConcurrent = 3`).
- Emit a `run.started` event to notify connected clients.
- Create a trace record with a generated trace UUID.
- Propagate context values: `WithAgentID()`, `WithUserID()`, `WithAgentType()`. Downstream tools and interceptors rely on these.
- Compute per-user workspace: `base + "/" + sanitize(userID)`. Inject via `WithToolWorkspace(ctx)` so all filesystem and shell tools use the correct directory.
- Ensure per-user files exist. A `sync.Map` cache guarantees the seeding function runs at most once per user.
- Persist the agent ID and user ID on the session for later reference.
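The workspace computation and the once-per-user seeding guarantee can be sketched as follows. This is a minimal illustration, not the project's code: `sanitizeUserID`, `userWorkspace`, and `userSeeder` are hypothetical names, and the sanitization rule (replace anything outside letters, digits, `-`, `_`) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// sanitizeUserID is a hypothetical stand-in for sanitize(): it keeps ASCII
// letters, digits, '-' and '_' and replaces every other rune with '_'.
func sanitizeUserID(id string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '-', r == '_':
			return r
		}
		return '_'
	}, id)
}

// userWorkspace mirrors the documented formula base + "/" + sanitize(userID).
func userWorkspace(base, userID string) string {
	return base + "/" + sanitizeUserID(userID)
}

// userSeeder runs a seeding function at most once per user ID.
type userSeeder struct {
	once sync.Map // userID -> *sync.Once
}

// ensure is safe for concurrent use: LoadOrStore returns the same *sync.Once
// for every caller with the same user ID, and Once.Do runs seed only once.
func (s *userSeeder) ensure(userID string, seed func()) {
	v, _ := s.once.LoadOrStore(userID, &sync.Once{})
	v.(*sync.Once).Do(seed)
}

func main() {
	var s userSeeder
	calls := 0
	for i := 0; i < 3; i++ {
		s.ensure("alice", func() { calls++ })
	}
	fmt.Println(userWorkspace("/data/workspaces", "tg:42"), calls)
}
```

Storing a `*sync.Once` per key, rather than a plain boolean, means concurrent first requests from the same user wait for a single seeding pass instead of racing.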
Phase 2: Input Validation
- InputGuard: scans the user message against 6 regex patterns that detect prompt injection attempts. See Section 4 for details.
- Message truncation: if the message exceeds `max_message_chars` (default 32,768), the content is truncated and the LLM receives a notification that the input was shortened. The message is never rejected outright.
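A minimal sketch of truncate-and-notify follows. The notice wording, the function name `truncateMessage`, and the choice to count runes rather than bytes are all assumptions made for illustration.

```go
package main

import "fmt"

// defaultMaxMessageChars matches the documented default for max_message_chars.
const defaultMaxMessageChars = 32768

// truncationNotice is hypothetical wording; the real notice text is not
// given in this document.
const truncationNotice = "\n\n[Message truncated: input exceeded the maximum length]"

// truncateMessage cuts msg to at most limit runes and appends a notice so the
// LLM knows the input was shortened. It never rejects the message.
func truncateMessage(msg string, limit int) (string, bool) {
	runes := []rune(msg)
	if len(runes) <= limit {
		return msg, false
	}
	return string(runes[:limit]) + truncationNotice, true
}

func main() {
	out, truncated := truncateMessage("hello world", 5)
	fmt.Printf("%q %v (default limit: %d)\n", out, truncated, defaultMaxMessageChars)
}
```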
Phase 3: Build Messages
- Build the system prompt (15+ sections). Context files are resolved dynamically based on agent type.
- Inject the conversation summary (if one exists from a previous compaction) as the first two messages.
- Run the history pipeline (3 stages, see Section 5).
- Append the current user message. Messages are buffered locally (deferred write) to avoid race conditions with concurrent runs on the same session.
Phase 4: LLM Iteration Loop
- Filter the available tools through the PolicyEngine (RBAC).
- Call the LLM. Streaming calls emit `chunk` events in real time; non-streaming calls return a single response.
- Record an LLM span for tracing with token counts and timing.
- Mid-loop compaction: if prompt tokens exceed 75% of the context window (or `MaxHistoryShare` if configured), summarize ~70% of in-memory messages, keeping the last ~30%. This happens during active iterations to prevent context overflow in long-running tasks.
- If the response contains no tool calls, exit the loop.
- If tool calls are present, proceed to Phase 5 and then loop back.
- The loop forcibly exits once the maximum number of iterations is reached (default 20, set via `maxIterations` in agent config or `req.MaxIterations` per-request).
Phase 5: Tool Execution
- Append the assistant message (with tool calls) to the message list.
- Single tool call: execute sequentially (no goroutine overhead).
- Multiple tool calls: launch parallel goroutines, collect all results, sort by original index, then process sequentially.
- Emit `tool.call` before execution and `tool.result` after.
- Record a tool span for each call. Track async tools (spawn, cron) separately.
- Save tool messages to the session.
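The single-versus-parallel execution strategy above might look like the sketch below. The `toolCall` and `toolResult` types and the `runTools` helper are hypothetical; only the pattern (goroutines, then sort by original index) comes from the text.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// toolCall and toolResult are illustrative types, not the project's own.
type toolCall struct{ ID, Name string }

type toolResult struct {
	Index  int
	ID     string
	Output string
}

// runTools executes a single call inline and multiple calls in parallel,
// then restores the original call order before returning.
func runTools(calls []toolCall, exec func(toolCall) string) []toolResult {
	if len(calls) == 1 {
		// Single tool call: no goroutine overhead.
		return []toolResult{{Index: 0, ID: calls[0].ID, Output: exec(calls[0])}}
	}
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		results []toolResult
	)
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c toolCall) {
			defer wg.Done()
			out := exec(c)
			mu.Lock()
			results = append(results, toolResult{Index: i, ID: c.ID, Output: out})
			mu.Unlock()
		}(i, c)
	}
	wg.Wait()
	// Goroutines finish in arbitrary order; sort by the original index.
	sort.Slice(results, func(a, b int) bool { return results[a].Index < results[b].Index })
	return results
}

func main() {
	calls := []toolCall{{ID: "1", Name: "read_file"}, {ID: "2", Name: "list_files"}}
	for _, r := range runTools(calls, func(c toolCall) string { return "ran " + c.Name }) {
		fmt.Println(r.Index, r.ID, r.Output)
	}
}
```

Sorting after collection keeps the tool messages in the same order as the assistant's tool calls, which matters because results are then processed sequentially.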
Phase 6: Response Finalization
- Run `SanitizeAssistantContent` -- a 7-step cleanup pipeline (see Section 3).
- Detect `NO_REPLY` in the final content. If present, suppress message delivery (silent reply).
- Flush all buffered messages atomically to the session (user message, tool messages, assistant message). This prevents concurrent runs from interleaving partial history.
- Update session metadata: model name, provider name, cumulative token counts.
Phase 7: Auto-Summarization
- Trigger condition: the history has more than 50 messages OR the estimated token count exceeds 75% of the context window.
- Per-session TryLock: before summarizing, acquire a non-blocking per-session lock. If another concurrent run is already summarizing, skip. This prevents concurrent summarization from corrupting session history.
- Memory flush first: run synchronously so the agent can persist durable memories before history is truncated. Max 5 LLM iterations, 90-second timeout.
- Summarize: launch a background goroutine with a 120-second timeout. The LLM produces a summary of all messages except the last 4. The summary is saved and the history is truncated to those 4 messages. The compaction counter is incremented.
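The trigger condition and the non-blocking per-session lock can be sketched as below. `shouldCompact` and `sessionLocks` are hypothetical names; the thresholds are the ones stated above, and `sync.Mutex.TryLock` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
)

// shouldCompact applies the documented trigger: more than 50 messages OR an
// estimated token count above 75% of the context window.
func shouldCompact(messageCount, estimatedTokens, contextWindow int) bool {
	return messageCount > 50 || float64(estimatedTokens) > 0.75*float64(contextWindow)
}

// sessionLocks holds one mutex per session key.
type sessionLocks struct {
	m sync.Map // sessionKey -> *sync.Mutex
}

// tryCompact runs compact only if no other run holds the session's lock.
// It returns false (and skips) when a compaction is already in progress.
func (l *sessionLocks) tryCompact(sessionKey string, compact func()) bool {
	v, _ := l.m.LoadOrStore(sessionKey, &sync.Mutex{})
	mu := v.(*sync.Mutex)
	if !mu.TryLock() {
		return false
	}
	defer mu.Unlock()
	compact()
	return true
}

func main() {
	var locks sessionLocks
	fmt.Println(shouldCompact(51, 0, 200000))
	ran := locks.tryCompact("session-1", func() { fmt.Println("compacting") })
	fmt.Println(ran)
}
```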
Cancel Handling
When the context is cancelled (via /stop or /stopall), the loop exits immediately:
- Trace finalization uses a `context.Background()` fallback when `ctx.Err() != nil` to ensure the final DB write succeeds.
- Trace status is set to `"cancelled"` instead of `"error"`.
- An empty outbound message triggers cleanup (stop typing indicator, clear reactions).
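A short sketch of the fallback pattern: once the run context is cancelled, any write that uses it fails immediately, so finalization switches to a fresh context. The helper names and the `"completed"` status string are assumptions; only `"cancelled"` and `"error"` appear in this document.

```go
package main

import (
	"context"
	"fmt"
)

// finalizeContext returns a context that is still usable for the final trace
// write, falling back to context.Background() if ctx is already done.
func finalizeContext(ctx context.Context) context.Context {
	if ctx.Err() != nil {
		return context.Background()
	}
	return ctx
}

// traceStatus distinguishes a user-initiated stop from a genuine failure.
// "completed" is a hypothetical name for the success status.
func traceStatus(ctx context.Context, runErr error) string {
	switch {
	case ctx.Err() != nil:
		return "cancelled"
	case runErr != nil:
		return "error"
	default:
		return "completed"
	}
}

// cancelledContext builds an already-cancelled context for demonstration.
func cancelledContext() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	ctx := cancelledContext()
	fmt.Println(traceStatus(ctx, ctx.Err()), finalizeContext(ctx).Err())
}
```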
2. System Prompt
The system prompt is assembled dynamically from 19+ sections. Two modes control the amount of content included:
- PromptFull: used for main agent runs. Includes all sections.
- PromptMinimal: used for sub-agents and cron jobs. Reduced sections (only AGENTS.md and TOOLS.md from bootstrap files).
Sections (In Build Order)
- Identity -- channel-aware context with platform type (Telegram, Zalo, etc.) and chat type (direct/group).
- First-run bootstrap -- `[MANDATORY]` notice injected if BOOTSTRAP.md is present, forcing immediate execution.
- Persona -- SOUL.md and IDENTITY.md injected early in the "primacy zone" to prevent drift in long conversations.
- Tooling -- core tool descriptions, filtered by policy and sandbox status.
- Credentialed CLI -- optional secure CLI context for credentialed exec tool access.
- Safety -- defensive preamble for handling external content, identity anchoring for predefined agents.
- Self-Evolution -- rules for predefined agents to update SOUL.md (style/tone) from user feedback.
- Skills (inline) -- skill content injected directly when the skill set is small (≤15 skills).
- Skills (search mode) -- use the `skill_search` tool when the skill set is large.
- MCP Tools (inline) -- external integration tools with real descriptions.
- MCP Tools (search mode) -- use `mcp_tool_search` when many MCP tools are available.
- Workspace -- working directory path, file structure, sandbox container workdir.
- Team Workspace -- absolute path to shared team workspace (for team agents).
- Sandbox -- Docker container instructions, available commands, policy notes.
- User Identity -- owner IDs for permission checks (full mode only).
- Time -- current UTC date/time for temporal awareness.
- Channel Formatting -- platform-specific output hints (e.g., Zalo → plain text).
- Extra Context -- additional context wrapped in `<extra_context>` tags (subagent context, etc.).
- Project Context -- bootstrap context files (remaining after persona extraction), wrapped in defensive preamble.
- Sub-Agent Spawning -- rules for launching child agents (skipped for team agents with TEAM.md).
- Runtime -- agent ID, session key, provider info, model pricing.
- Persona Reminder -- recency reinforcement to combat "lost in the middle" in long conversations.
- Memory Reminders -- prompts to run memory_search and knowledge_graph_search before answering.
3. Sanitize Output
A 7-step pipeline cleans raw LLM output before delivering it to the user.
flowchart TD
IN[Raw LLM Output] --> S1
S1["1. stripGarbledToolXML<br/>Remove broken XML tool artifacts<br/>from DeepSeek, GLM, Minimax"] --> S2
S2["2. stripDowngradedToolCallText<br/>Remove text-format tool calls:<br/>[Tool Call: ...], [Tool Result ...]"] --> S3
S3["3. stripThinkingTags<br/>Remove reasoning tags:<br/>think, thinking, thought, antThinking"] --> S4
S4["4. stripFinalTags<br/>Remove final tag wrappers,<br/>preserve inner content"] --> S5
S5["5. stripEchoedSystemMessages<br/>Remove hallucinated<br/>[System Message] blocks"] --> S6
S6["6. collapseConsecutiveDuplicateBlocks<br/>Deduplicate repeated paragraphs<br/>caused by model stuttering"] --> S7
S7["7. stripLeadingBlankLines<br/>Remove leading whitespace lines"] --> TRIM
TRIM["TrimSpace()"] --> OUT[Clean Output]
Step Details
- stripGarbledToolXML -- Some models (DeepSeek, GLM, Minimax) emit tool-call XML as plain text instead of proper structured tool calls. This step removes tags like `<tool_call>`, `<function_call>`, `<tool_use>`, `<minimax:tool_call>`, and `<parameter name=...>`. If the entire response consists of garbled XML, an empty string is returned.
- stripDowngradedToolCallText -- Removes text-format tool calls such as `[Tool Call: ...]`, `[Tool Result ...]`, and `[Historical context: ...]` along with any accompanying JSON arguments and output. Uses line-by-line scanning because Go regex does not support lookahead.
- stripThinkingTags -- Removes internal reasoning tags: `<think>`, `<thinking>`, `<thought>`, `<antThinking>`. Case-insensitive, non-greedy matching.
- stripFinalTags -- Removes `<final>` and `</final>` wrapper tags but preserves the content inside them.
- stripEchoedSystemMessages -- Removes `[System Message]` blocks that the LLM hallucinates or echoes in its response. Scans line by line, skipping content until an empty line is reached.
- collapseConsecutiveDuplicateBlocks -- Removes paragraphs that repeat consecutively (a symptom of model stuttering). Splits by `\n\n` and compares each trimmed block against its predecessor.
- stripLeadingBlankLines -- Removes whitespace-only lines at the beginning of the output while preserving indentation in the remaining content.
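Two of the steps above are simple enough to sketch directly. The functions below are illustrative re-implementations under assumed names (`stripFinalWrapper`, `collapseDuplicateBlocks`), not the code in `sanitize.go`; case-insensitive matching of the final tags is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// finalTagRe matches the opening and closing wrapper tags only, so the
// content between them is preserved.
var finalTagRe = regexp.MustCompile(`(?i)</?final>`)

// stripFinalWrapper removes <final> and </final> but keeps the inner text.
func stripFinalWrapper(s string) string {
	return finalTagRe.ReplaceAllString(s, "")
}

// collapseDuplicateBlocks splits on blank lines and drops any block whose
// trimmed text equals the trimmed text of the block before it.
func collapseDuplicateBlocks(s string) string {
	blocks := strings.Split(s, "\n\n")
	out := make([]string, 0, len(blocks))
	prev := ""
	for i, b := range blocks {
		trimmed := strings.TrimSpace(b)
		if i > 0 && trimmed != "" && trimmed == prev {
			continue // consecutive repeat: model stuttering
		}
		out = append(out, b)
		prev = trimmed
	}
	return strings.Join(out, "\n\n")
}

func main() {
	raw := "<final>Done.\n\nDone.\n\nNext step.</final>"
	fmt.Printf("%q\n", collapseDuplicateBlocks(stripFinalWrapper(raw)))
}
```

Only consecutive repeats are removed; a paragraph that legitimately reappears later in the response is kept.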
4. Input Guard
The Input Guard detects prompt injection attempts in user messages. It is a detection system -- by default it logs warnings but does not block requests.
6 Detection Patterns
| Pattern | Description | Example |
|---|---|---|
| `ignore_instructions` | Attempts to override prior instructions | "Ignore all previous instructions" |
| `role_override` | Attempts to redefine the agent's role | "You are now a different assistant" |
| `system_tags` | Injection of fake system-level tags | `<\|im_start\|>system`, `[SYSTEM]` |
| `instruction_injection` | Insertion of new directives | "New instructions:", "override:" |
| `null_bytes` | Null byte injection | `\x00` characters in the message |
| `delimiter_escape` | Attempts to escape context boundaries | "end of system", `</instructions>` |
4 Action Modes
| Action | Behavior |
|---|---|
| `"off"` | Scanning disabled entirely |
| `"log"` | Log at info level (`security.injection_detected`), continue processing |
| `"warn"` (default) | Log at warn level (`security.injection_detected`), continue processing |
| `"block"` | Log at warn level and return an error, halting the request |
All security events use the slog.Warn("security.injection_detected") convention.
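The scan-then-act behaviour can be sketched as follows. The two regular expressions are illustrative stand-ins for two of the six patterns (the real expressions are not shown in this document), and `scan`, `guard`, and `errBlocked` are hypothetical names. `log/slog` requires Go 1.21 or later.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"regexp"
)

type pattern struct {
	name string
	re   *regexp.Regexp
}

// Illustrative patterns only; the actual regexes are assumptions.
var patterns = []pattern{
	{"ignore_instructions", regexp.MustCompile(`(?i)ignore\s+(all\s+)?previous\s+instructions`)},
	{"null_bytes", regexp.MustCompile(`\x00`)},
}

var errBlocked = errors.New("input blocked: possible prompt injection")

// scan returns the names of all patterns that match msg.
func scan(msg string) []string {
	var hits []string
	for _, p := range patterns {
		if p.re.MatchString(msg) {
			hits = append(hits, p.name)
		}
	}
	return hits
}

// guard applies one of the four action modes to a message.
func guard(msg, action string) error {
	if action == "off" {
		return nil
	}
	hits := scan(msg)
	if len(hits) == 0 {
		return nil
	}
	switch action {
	case "log":
		slog.Info("security.injection_detected", "patterns", hits)
	case "block":
		slog.Warn("security.injection_detected", "patterns", hits)
		return errBlocked
	default: // "warn"
		slog.Warn("security.injection_detected", "patterns", hits)
	}
	return nil
}

func main() {
	fmt.Println(guard("Ignore all previous instructions", "warn"))
	fmt.Println(guard("Ignore all previous instructions", "block"))
}
```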
5. History Pipeline
The history pipeline prepares conversation history before sending it to the LLM. It runs in three sequential stages.
flowchart TD
RAW[Raw Session History] --> S1
S1["Stage 1: limitHistoryTurns<br/>Keep the last N user turns<br/>plus their associated assistant/tool messages"] --> S2
S2["Stage 2: pruneContextMessages<br/>2-pass tool result trimming<br/>(see Section 6)"] --> S3
S3["Stage 3: sanitizeHistory<br/>Repair broken tool_use / tool_result pairing<br/>after truncation"] --> OUT[Cleaned History]
Stage 1: limitHistoryTurns
Takes the raw session history and a historyLimit parameter. Keeps only the last N user turns along with all associated assistant and tool messages that belong to those turns. Earlier messages are discarded.
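A minimal sketch of turn limiting, assuming a turn starts at each user message and that a non-positive limit means "no limit" (both assumptions; `limitTurns` and `msg` are hypothetical names):

```go
package main

import "fmt"

type msg struct{ Role, Content string }

// limitTurns walks backwards, counting user messages, and cuts the history at
// the start of the Nth most recent user turn. Everything after that point
// (assistant and tool messages of those turns) is kept.
func limitTurns(history []msg, limit int) []msg {
	if limit <= 0 {
		return history
	}
	seen := 0
	for i := len(history) - 1; i >= 0; i-- {
		if history[i].Role == "user" {
			seen++
			if seen == limit {
				return history[i:]
			}
		}
	}
	return history // fewer than limit user turns: keep everything
}

func main() {
	h := []msg{
		{"user", "q1"}, {"assistant", "a1"},
		{"user", "q2"}, {"assistant", "calls tool"}, {"tool", "result"}, {"assistant", "a2"},
	}
	fmt.Println(len(limitTurns(h, 1)))
}
```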
Stage 2: pruneContextMessages
Applies the 2-pass context pruning algorithm described in Section 6.
Stage 3: sanitizeHistory
Repairs tool message pairing that may have been broken by truncation or compaction:
- Skip orphaned tool messages at the beginning of history (no preceding assistant message).
- For each assistant message that contains tool calls, collect the expected tool_call IDs.
- Validate that the following tool messages match those expected IDs. Drop mismatched tool messages.
- Synthesize missing tool results with placeholder text: `"[Tool result missing -- session was compacted]"`.
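The four repair rules can be sketched as a single pass over the history. The `message` type and `repairToolPairing` are hypothetical; dropping stray tool messages that appear later in the history (not only at the beginning) is an assumption of this sketch.

```go
package main

import "fmt"

const missingResult = "[Tool result missing -- session was compacted]"

// message is an illustrative shape: assistant messages list the IDs of the
// tool calls they made, and tool messages name the call they answer.
type message struct {
	Role        string
	Content     string
	ToolCallIDs []string // set on assistant messages
	ToolCallID  string   // set on tool messages
}

func repairToolPairing(history []message) []message {
	out := make([]message, 0, len(history))
	i := 0
	// Rule 1: skip orphaned tool messages at the beginning.
	for i < len(history) && history[i].Role == "tool" {
		i++
	}
	for i < len(history) {
		m := history[i]
		i++
		if m.Role == "tool" {
			continue // assumption: stray tool messages elsewhere are dropped too
		}
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolCallIDs) == 0 {
			continue
		}
		// Rules 2-3: gather the tool messages that follow, keyed by call ID.
		got := map[string]message{}
		for i < len(history) && history[i].Role == "tool" {
			got[history[i].ToolCallID] = history[i]
			i++
		}
		// Rule 4: emit one result per expected ID, synthesizing missing ones.
		for _, id := range m.ToolCallIDs {
			if tm, ok := got[id]; ok {
				out = append(out, tm)
			} else {
				out = append(out, message{Role: "tool", ToolCallID: id, Content: missingResult})
			}
		}
	}
	return out
}

func main() {
	h := []message{
		{Role: "tool", ToolCallID: "old"},
		{Role: "user", Content: "hi"},
		{Role: "assistant", ToolCallIDs: []string{"a", "b"}},
		{Role: "tool", ToolCallID: "a", Content: "ok"},
	}
	for _, m := range repairToolPairing(h) {
		fmt.Println(m.Role, m.ToolCallID, m.Content)
	}
}
```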
6. Context Pruning
Context pruning reduces oversized tool results using a 2-pass algorithm. It only activates when the estimated token-to-context-window ratio crosses a threshold.
flowchart TD
START[Estimate token ratio vs context window] --> CHECK{Ratio >= softTrimRatio 0.3?}
CHECK -->|No| DONE[No pruning needed]
CHECK -->|Yes| PASS1
PASS1["Pass 1: Soft Trim<br/>For each eligible tool result > 4000 chars:<br/>Keep first 1500 chars + last 1500 chars<br/>Replace middle with '...'"]
PASS1 --> CHECK2{"Ratio >= hardClearRatio 0.5?"}
CHECK2 -->|No| DONE
CHECK2 -->|Yes| PASS2
PASS2["Pass 2: Hard Clear<br/>Replace entire tool result content<br/>with '[Old tool result content cleared]'<br/>Stop when ratio drops below threshold"]
PASS2 --> DONE
Defaults
| Parameter | Default | Description |
|---|---|---|
| `keepLastAssistants` | 3 | Number of recent assistant messages protected from pruning |
| `softTrimRatio` | 0.3 | Token ratio threshold to trigger Pass 1 |
| `hardClearRatio` | 0.5 | Token ratio threshold to trigger Pass 2 |
| `minPrunableToolChars` | 50,000 | Minimum tool result length eligible for hard clear |
Protected Zone
The following messages are never pruned:
- System messages
- The last N assistant messages (default: 3)
- The first user message in the conversation
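The soft-trim transformation and the two threshold checks can be sketched as below, using the values from the diagram and table above. The function names are hypothetical, counting characters as runes is an assumption, and the protected zone and `minPrunableToolChars` filter are deliberately left out of this sketch.

```go
package main

import "fmt"

const (
	softTrimRatio    = 0.3
	hardClearRatio   = 0.5
	softTrimMinChars = 4000 // results longer than this are soft-trimmed
	keepHeadChars    = 1500
	keepTailChars    = 1500
	clearedText      = "[Old tool result content cleared]"
)

// softTrim keeps the first and last 1500 characters of an oversized tool
// result and replaces the middle with "...".
func softTrim(result string) string {
	r := []rune(result)
	if len(r) <= softTrimMinChars {
		return result
	}
	return string(r[:keepHeadChars]) + "..." + string(r[len(r)-keepTailChars:])
}

// passesNeeded reports which passes the thresholds call for, given the ratio
// before pruning and the ratio re-estimated after the soft trim.
func passesNeeded(ratioBefore, ratioAfterSoftTrim float64) (soft, hard bool) {
	soft = ratioBefore >= softTrimRatio
	hard = soft && ratioAfterSoftTrim >= hardClearRatio
	return soft, hard
}

func main() {
	big := string(make([]rune, 6000))
	fmt.Println(len([]rune(softTrim(big))))
	fmt.Println(passesNeeded(0.6, 0.4))
	fmt.Println(clearedText)
}
```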
7. Auto-Summarize and Compaction
The system uses a two-stage compaction strategy: mid-loop (during active iterations) and post-run (after completion).
Mid-Loop Compaction (During Iteration)
When in-memory messages exceed 75% of context window during LLM iterations, the agent immediately summarizes the first ~70% of messages in place, keeping the last ~30%. This prevents context overflow in long-running tasks without waiting for post-run summarization.
Threshold: prompt_tokens >= contextWindow * 0.75 (configurable via MaxHistoryShare)
Trigger: Once per run, inside the iteration loop (between LLM calls)
Output: In-memory messages replaced with [summary] + [recent 4 messages]
Post-Run Compaction (After Completion)
When the session history exceeds thresholds after a run completes, the session is compacted in the background.
flowchart TD
CHECK{"> 50 messages OR<br/>> 75% context window?"}
CHECK -->|No| SKIP[Skip compaction]
CHECK -->|Yes| LOCK["Per-session non-blocking lock<br/>(skip if another run already compacting)"]
LOCK -->|Lock acquired| FLUSH
LOCK -->|Already locked| SKIP
FLUSH["Step 1: Memory Flush (synchronous)<br/>Embedded agent turn with write_file tool<br/>Agent stores durable memories before truncation<br/>Uses PromptMinimal mode<br/>Max 5 iterations, 90s timeout"]
FLUSH --> SUMMARIZE
SUMMARIZE["Step 2: Summarize (background goroutine)<br/>Keep last 4 messages<br/>LLM summarizes older messages<br/>temp=0.3, max_tokens=1024, timeout 120s"]
SUMMARIZE --> SAVE
SAVE["Step 3: Save<br/>SetSummary() + TruncateHistory(4)<br/>IncrementCompaction()"]
Summary Reuse
On the next request, the saved summary is injected at the beginning of the message list as two messages:
{role: "user", content: "[Summary of earlier conversation]\n{summary}"}
{role: "assistant", content: "I understand the context..."}
This gives the LLM continuity without replaying the full history. Protected zone: the last 3 assistant messages are never pruned.
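Injecting the saved summary amounts to prepending two synthetic messages to the history. A minimal sketch, with `message` and `withSummary` as hypothetical names and the assistant text kept exactly as abbreviated above:

```go
package main

import "fmt"

type message struct{ Role, Content string }

// withSummary prepends the summary as a user/assistant pair so the message
// list still alternates roles. With no saved summary, history is unchanged.
func withSummary(summary string, history []message) []message {
	if summary == "" {
		return history
	}
	out := make([]message, 0, len(history)+2)
	out = append(out,
		message{Role: "user", Content: "[Summary of earlier conversation]\n" + summary},
		message{Role: "assistant", Content: "I understand the context..."},
	)
	return append(out, history...)
}

func main() {
	msgs := withSummary("User is planning a trip.", []message{{Role: "user", Content: "What next?"}})
	for _, m := range msgs {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```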
8. Memory Flush
Memory flush runs synchronously before post-run compaction to give the agent an opportunity to persist important information before session history is truncated.
Trigger Conditions
- Primary: compaction is about to run (message count or token ratio exceeded).
- Token threshold: only runs when session tokens are significant enough to warrant capture.
- Deduplication: runs at most once per compaction cycle, tracked by comparing compaction counter.
Mechanism
An embedded agent turn with special configuration:
- System prompt mode: `PromptMinimal` (stripped-down context).
- Message window: latest 10 messages only (not the full history).
- Available tools: `write_file` and `read_file` for memory file operations.
- Default prompt: "Pre-compaction memory flush. Store durable memories now (use memory/YYYY-MM-DD.md; create memory/ if needed). If nothing to store, reply with NO_REPLY."
- Output handling: recognizes the `NO_REPLY` convention (silent completion).
Timing
- Synchronous blocking: blocks the entire post-run path until flush LLM call completes.
- Timeout: 90 seconds for the entire flush turn (5 max iterations).
- Configurable: can be disabled or customized via the `compaction.memory_flush` config section.
Results
The agent can write findings to memory/YYYY-MM-DD.md files. These persist across session compaction and are available to future sessions via memory_search and memory_get tools.
9. Agent Router
The Agent Router manages Loop instances with a cache layer. It supports lazy resolution, TTL-based expiration, and run abort.
flowchart TD
GET["Router.Get(agentID)"] --> CACHE{"Cache hit<br/>and TTL valid?"}
CACHE -->|Yes| RETURN[Return cached Loop]
CACHE -->|No or Expired| RESOLVE{"Resolver configured?"}
RESOLVE -->|No| ERR["Error: agent not found"]
RESOLVE -->|Yes| DB["Resolver.Resolve(agentID)<br/>Load from DB, create Loop"]
DB --> STORE[Store in cache with TTL]
STORE --> RETURN
Cache Invalidation
InvalidateAgent(agentID) removes a specific agent from the cache, forcing the next Get() call to re-resolve from the database.
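The lookup flow in the diagram and the invalidation call can be sketched as a small TTL cache. This is an assumption-laden illustration: `Loop` is reduced to a stub, `newRouter` and `cacheEntry` are hypothetical, and the sketch holds one mutex across the resolver call for simplicity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Loop is a stub standing in for the real agent loop.
type Loop struct{ AgentID string }

type cacheEntry struct {
	loop    *Loop
	expires time.Time
}

type Router struct {
	mu      sync.Mutex
	ttl     time.Duration
	cache   map[string]cacheEntry
	resolve func(agentID string) (*Loop, error) // nil when no resolver is configured
}

func newRouter(ttl time.Duration, resolve func(string) (*Loop, error)) *Router {
	return &Router{ttl: ttl, cache: map[string]cacheEntry{}, resolve: resolve}
}

// Get returns a cached Loop while its TTL is valid, otherwise re-resolves it.
func (r *Router) Get(agentID string) (*Loop, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if e, ok := r.cache[agentID]; ok && time.Now().Before(e.expires) {
		return e.loop, nil
	}
	if r.resolve == nil {
		return nil, errors.New("agent not found")
	}
	loop, err := r.resolve(agentID)
	if err != nil {
		return nil, err
	}
	r.cache[agentID] = cacheEntry{loop: loop, expires: time.Now().Add(r.ttl)}
	return loop, nil
}

// InvalidateAgent drops the entry so the next Get re-resolves from the source.
func (r *Router) InvalidateAgent(agentID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.cache, agentID)
}

func main() {
	r := newRouter(time.Minute, func(id string) (*Loop, error) {
		fmt.Println("resolving", id)
		return &Loop{AgentID: id}, nil
	})
	r.Get("agent-1") // resolves
	r.Get("agent-1") // cache hit
	r.InvalidateAgent("agent-1")
	r.Get("agent-1") // resolves again
}
```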
Active Run Tracking
| Method | Behavior |
|---|---|
| `RegisterRun(runID, sessionKey, agentID, cancel)` | Register a new active run with its cancel function |
| `AbortRun(runID, sessionKey)` | Cancel a run (verifies sessionKey match before aborting) |
| `AbortRunsForSession(sessionKey)` | Cancel all active runs belonging to a session |
10. Resolver
The ManagedResolver lazy-creates Loop instances from PostgreSQL data when the Router encounters a cache miss.
flowchart TD
MISS["Router cache miss"] --> LOAD["Step 1: Load agent from DB<br/>AgentStore.GetByKey(agentKey)"]
LOAD --> PROV["Step 2: Resolve provider<br/>ProviderRegistry.Get(provider)<br/>Fallback: first provider in registry"]
PROV --> BOOT["Step 3: Load bootstrap files<br/>bootstrap.LoadFromStore(agentID)"]
BOOT --> DEFAULTS["Step 4: Apply defaults<br/>contextWindow <= 0 then 200K<br/>maxIterations <= 0 then 20"]
DEFAULTS --> CREATE["Step 5: Create Loop<br/>NewLoop(LoopConfig)"]
CREATE --> WIRE["Step 6: Wire hooks<br/>EnsureUserFilesFunc, ContextFileLoaderFunc"]
WIRE --> DONE["Return Loop to Router for caching"]
Resolved Properties
- Provider: looked up by name from the provider registry. Falls back to the first registered provider if not found.
- Bootstrap files: loaded from the workspace directory via `bootstrap.LoadWorkspaceFiles()`. Standard files: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, BOOTSTRAP.md. Additional files (MEMORY.md, USER_PREDEFINED.md, DELEGATION.md, TEAM.md, AVAILABILITY.md) loaded separately as needed. Per-user files (USER.md) created on first chat via `EnsureUserFilesFunc`.
- Agent type: `open` (per-user context, seeded from template files) or `predefined` (agent-level context plus per-user USER.md overlay).
- Per-user seeding: `EnsureUserFilesFunc` seeds template files on first chat, idempotent (skips files that already exist). Uses PostgreSQL's `xmax` trick in `GetOrCreateUserProfile` to distinguish INSERT from ON CONFLICT UPDATE, triggering seeding only for genuinely new users.
- Dynamic context loading: `ContextFileLoaderFunc` resolves context files based on agent type and request context. Returns a `[]bootstrap.ContextFile` list with truncated content for system prompt injection. For open agents: loads per-user files from workspace. For predefined agents: loads agent-level files plus per-user USER.md.
- Custom tools: `DynamicLoader.LoadForAgent()` clones the global tool registry and adds per-agent custom tools, ensuring each agent gets its own isolated set of dynamic tools.
- Team context: auto-resolved for agents that belong to a team. Lead agents get the team workspace as default workspace; non-lead members keep their own workspace with team workspace accessible via absolute path tool context.
11. Team Workspace Handling
Agents that belong to a team have access to shared team workspaces for collaboration.
Workspace Resolution
For dispatched tasks (via req.TeamWorkspace):
- The team workspace becomes the default workspace for relative path operations
- All file tools (read_file, write_file, list_files) use team workspace by default
- Agent workspace is still accessible via `WithToolTeamWorkspace()` context for absolute-path access
For direct chat (auto-resolved via team membership):
- Lead agents get team workspace as their default workspace (primary job is team coordination)
- Non-lead member agents keep their own workspace as default
- Team workspace is accessible via `WithToolTeamWorkspace()` context
Path Scoping
- Shared workspace mode (team.settings.shared_workspace): all agents in team share single workspace
- Isolated workspace mode (default): each agent gets a workspace scoped by `(teamID, chatID)` or `(teamID, userID)`
Context Variables
During runs with team context:
- `WithToolTeamWorkspace(ctx, wsDir)` -- absolute path to shared team workspace
- `WithToolWorkspace(ctx, effectiveWorkspace)` -- effective default workspace for file operations
- `WithToolTeamID(ctx, teamID)` -- team UUID string for team-scoped tool operations
- `WithToolTaskID(ctx, taskID)` -- team task ID when executing dispatched team tasks
12. Event System
The Loop publishes events via an onEvent callback. The WebSocket gateway forwards these as EventFrame messages to connected clients for real-time progress tracking.
Event Types
| Event | When | Payload |
|---|---|---|
| `run.started` | Run begins | `{"message": "..."}` |
| `activity` | Phase transitions | `{"phase": "thinking" ...}` |
| `chunk` | Streaming: each text fragment from the LLM | `{"content": "..."}` |
| `thinking` | Streaming: thinking tokens (extended thinking models) | `{"content": "..."}` |
| `tool.call` | Tool execution begins | `{"name": "...", "id": "...", "arguments": {...}}` |
| `tool.result` | Tool execution completes | `{"name": "...", "id": "...", "is_error": bool, "result": "..."}` |
| `block.reply` | Intermediate assistant content during tool iterations | `{"content": "..."}` |
| `run.retrying` | LLM provider retry after failure | `{"attempt": N, "maxAttempts": M, "error": "..."}` |
| `run.completed` | Run finishes successfully | `{"content": "...", "usage": {...}}` |
| `run.failed` | Run finishes with an error | `{"error": "..."}` |
Event Flow
sequenceDiagram
participant L as Agent Loop
participant GW as Gateway
participant C as WebSocket Client
L->>GW: emit(run.started)
GW->>C: EventFrame
loop LLM Iterations
L->>GW: emit(chunk) x N
GW->>C: EventFrame x N
L->>GW: emit(tool.call)
GW->>C: EventFrame
L->>GW: emit(tool.result)
GW->>C: EventFrame
end
L->>GW: emit(run.completed)
GW->>C: EventFrame
13. Tracing
Every agent run produces a trace with a hierarchy of spans for debugging, analysis, and cost tracking.
Span Hierarchy
flowchart TD
T["Trace (one per Run)"] --> A["Root Agent Span<br/>Covers the entire run duration"]
A --> L1["LLM Span #1<br/>provider, model, iteration number"]
A --> T1["Tool Span #1a<br/>tool name, duration"]
A --> T2["Tool Span #1b<br/>tool name, duration"]
A --> L2["LLM Span #2<br/>provider, model, iteration number"]
A --> T3["Tool Span #2a<br/>tool name, duration"]
3 Span Types
| Span Type | Description |
|---|---|
| Root Agent Span | Parent span covering the full run. Contains agent ID, session key, and final status. |
| LLM Call Span | One per LLM invocation. Records provider, model, token counts (input/output), and duration. |
| Tool Call Span | One per tool execution. Records tool name, whether it errored, and duration. |
Verbose Mode
Enabled via the GOCLAW_TRACE_VERBOSE=1 environment variable.
| Field | Normal Mode | Verbose Mode |
|---|---|---|
| `OutputPreview` | First 500 characters | First 500 characters |
| `InputPreview` | Not recorded | Full LLM input messages as JSON, truncated at 50,000 characters |
14. File Reference
| File | Responsibility |
|---|---|
| `internal/agent/loop_run.go` | Run() entry point: trace creation, span management, event emission wrapper |
| `internal/agent/loop.go` | runLoop() core loop: LLM iteration, tool execution, message buffering, event emission |
| `internal/agent/loop_history.go` | History pipeline: limitHistoryTurns, pruneContextMessages, sanitizeHistory, summary injection |
| `internal/agent/pruning.go` | Context pruning: 2-pass soft trim and hard clear algorithm |
| `internal/agent/loop_compact.go` | Mid-loop compaction: in-memory message summarization during iterations |
| `internal/agent/systemprompt.go` | System prompt assembly (19+ sections), PromptFull and PromptMinimal modes |
| `internal/agent/systemprompt_sections.go` | Individual section builders (tooling, workspace, sandbox, skills, MCP, etc.) |
| `internal/agent/resolver.go` | ManagedResolver: lazy Loop creation from PostgreSQL, provider resolution, bootstrap loading |
| `internal/agent/loop_tracing.go` | Trace and span creation, verbose mode input capture, span finalization |
| `internal/agent/input_guard.go` | Input Guard: 6 regex patterns, 4 action modes, security logging |
| `internal/agent/sanitize.go` | 7-step output sanitization pipeline |
| `internal/agent/memoryflush.go` | Pre-compaction memory flush: embedded agent turn with write_file tool |
| `internal/agent/toolloop.go` | Tool execution and loop detection (no-progress warnings) |
| `internal/bootstrap/files.go` | Bootstrap file loading and context file preparation |