Files
goclaw/docs/01-agent-loop.md
T
Viet Tran 037d18f711 docs: comprehensive audit and update of all documentation (#231)
* feat(ui): improve kanban UX, fix dialog scroll, remove delegation page

- Kanban: reorder columns (blocked after pending), show blocked-by info
  on cards, clickable blocker links in task detail, framer-motion card
  animation between columns
- Dialogs: standardize scroll pattern across all modals — header fixed,
  scrollbar flush with outer edge via negative margin trick
- Remove delegation page, types, events, i18n, routes, and all references
- Fix activity_logs NULL jsonb scan error (COALESCE)
- Board header: show text labels on action buttons (desktop)

* docs: comprehensive audit and update of all documentation

- Update Go 1.25 → 1.26, PostgreSQL 15+ → 18 across all docs
- Add 10 missing internal modules to CLAUDE.md project structure
- Expand provider docs from 2 to 6 packages (Anthropic, OpenAI, DashScope, Claude CLI, ACP, Codex)
- Add 8 missing store interfaces to data model docs (22 total)
- Update bootstrap files from 7 to 13 templates
- Expand tool inventory from ~35 to 60+ tools with media/KG/credential categories
- Fix Team Task Board: add blocked status, 3 missing actions, V2 versioning, delegate restrictions
- Remove all references to removed features: handoff, delegate_search, evaluate_loop, agent_links
- Fix lane defaults (2/4/1 → 30/50/100/30), ghost file references, models.list → providers.models
- Add SecureCLI, snapshot worker, cost calculation, pairing security docs
- Comprehensive changelog catch-up
- Trim docs/03-tools-system.md to 800-line limit
2026-03-16 22:51:57 +07:00

581 lines
29 KiB
Markdown

# 01 - Agent Loop
## Overview
The Agent Loop implements a **Think --> Act --> Observe** cycle. Each agent owns a `Loop` instance configured with a provider, model, tools, workspace, and agent type. A user message enters as a `RunRequest`, passes through `runLoop`, and exits as a `RunResult`. The loop iterates up to 20 times: the LLM thinks, optionally calls tools, observes results, and repeats until it produces a final text response.
---
## 1. RunRequest Flow
The full lifecycle of a single agent run is broken into seven phases.
```mermaid
flowchart TD
START([RunRequest]) --> PH1
subgraph PH1["Phase 1: Setup"]
P1A[Increment activeRuns atomic counter] --> P1B[Emit run.started event]
P1B --> P1C[Create trace record]
P1C --> P1D[Inject agentType / userID / agentID into context]
P1D --> P1E0[Compute per-user workspace + WithToolWorkspace]
P1E0 --> P1E[Ensure per-user files via sync.Map cache]
P1E --> P1F[Persist agent + user IDs on session]
end
PH1 --> PH2
subgraph PH2["Phase 2: Input Validation"]
P2A["InputGuard.Scan - 6 injection patterns"] --> P2B["Message truncation at max_message_chars (default 32K)"]
end
PH2 --> PH3
subgraph PH3["Phase 3: Build Messages"]
P3A[Build system prompt - 15+ sections] --> P3B[Inject conversation summary if present]
P3B --> P3C["History pipeline: limitHistoryTurns --> pruneContextMessages --> sanitizeHistory"]
P3C --> P3D[Append current user message]
P3D --> P3E[Buffer user message locally - deferred write]
end
PH3 --> PH4
subgraph PH4["Phase 4: LLM Iteration Loop (max 20)"]
P4A[Filter tools via PolicyEngine] --> P4B["Call LLM (ChatStream or Chat)"]
P4B --> P4C[Accumulate tokens + record LLM span]
P4C --> P4D{Tool calls in response?}
P4D -->|No| EXIT[Exit loop with final content]
P4D -->|Yes| PH5
end
subgraph PH5["Phase 5: Tool Execution"]
P5A[Append assistant message with tool calls] --> P5B{Single or multiple tools?}
P5B -->|Single| P5C[Execute sequentially]
P5B -->|Multiple| P5D["Execute in parallel via goroutines, sort results by index"]
P5C & P5D --> P5E["Emit tool.call / tool.result events, record tool spans, save tool messages"]
end
PH5 --> PH4
EXIT --> PH6
subgraph PH6["Phase 6: Response Finalization"]
P6A["SanitizeAssistantContent (7-step pipeline)"] --> P6B["Detect NO_REPLY - suppress delivery if silent"]
P6B --> P6C[Flush all buffered messages atomically to session]
P6C --> P6D[Update metadata: model, provider, token counts]
end
PH6 --> PH7
subgraph PH7["Phase 7: Auto-Summarization"]
P7A{"> 50 messages OR > 75% context window?"}
P7A -->|No| P7D[Skip]
P7A -->|Yes| P7B["Memory flush (synchronous, max 5 iterations, 90s timeout)"]
P7B --> P7C["Summarize in background goroutine (120s timeout)"]
end
PH7 --> POST
subgraph POST["Post-processing"]
PP1[Emit root agent span] --> PP2["Emit run.completed or run.failed"]
PP2 --> PP3[Finish trace]
end
POST --> RESULT([RunResult])
```
### Phase 1: Setup
- Increment the `activeRuns` atomic counter (no mutex -- true concurrency, especially in group chats with `maxConcurrent = 3`).
- Emit a `run.started` event to notify connected clients.
- Create a trace record with a generated trace UUID.
- Propagate context values: `WithAgentID()`, `WithUserID()`, `WithAgentType()`. Downstream tools and interceptors rely on these.
- Compute per-user workspace: `base + "/" + sanitize(userID)`. Inject via `WithToolWorkspace(ctx)` so all filesystem and shell tools use the correct directory.
- Ensure per-user files exist. A `sync.Map` cache guarantees the seeding function runs at most once per user.
- Persist the agent ID and user ID on the session for later reference.
### Phase 2: Input Validation
- **InputGuard**: scans the user message against 6 regex patterns that detect prompt injection attempts. See Section 4 for details.
- **Message truncation**: if the message exceeds `max_message_chars` (default 32,768), the content is truncated and the LLM receives a notification that the input was shortened. The message is never rejected outright.
### Phase 3: Build Messages
- Build the system prompt (15+ sections). Context files are resolved dynamically based on agent type.
- Inject the conversation summary (if one exists from a previous compaction) as the first two messages.
- Run the history pipeline (3 stages, see Section 5).
- Append the current user message. Messages are buffered locally (deferred write) to avoid race conditions with concurrent runs on the same session.
### Phase 4: LLM Iteration Loop
- Filter the available tools through the PolicyEngine (RBAC).
- Call the LLM. Streaming calls emit `chunk` events in real time; non-streaming calls return a single response.
- Record an LLM span for tracing with token counts and timing.
- **Mid-loop compaction**: if prompt tokens exceed 75% of context window (or `MaxHistoryShare` if configured), summarize ~70% of in-memory messages, keeping the last ~30%. This happens during active iterations to prevent context overflow in long-running tasks.
- If the response contains no tool calls, exit the loop.
- If tool calls are present, proceed to Phase 5 and then loop back.
- Maximum iterations before loop forcibly exits (default 20, set via `maxIterations` in agent config or `req.MaxIterations` per-request).
### Phase 5: Tool Execution
- Append the assistant message (with tool calls) to the message list.
- **Single tool call**: execute sequentially (no goroutine overhead).
- **Multiple tool calls**: launch parallel goroutines, collect all results, sort by original index, then process sequentially.
- Emit `tool.call` before execution and `tool.result` after.
- Record a tool span for each call. Track async tools (spawn, cron) separately.
- Save tool messages to the session.
### Phase 6: Response Finalization
- Run `SanitizeAssistantContent` -- a 7-step cleanup pipeline (see Section 3).
- Detect `NO_REPLY` in the final content. If present, suppress message delivery (silent reply).
- Flush all buffered messages atomically to the session (user message, tool messages, assistant message). This prevents concurrent runs from interleaving partial history.
- Update session metadata: model name, provider name, cumulative token counts.
### Phase 7: Auto-Summarization
- **Trigger condition**: the history has more than 50 messages OR the estimated token count exceeds 75% of the context window.
- **Per-session TryLock**: before summarizing, acquire a non-blocking per-session lock. If another concurrent run is already summarizing, skip. This prevents concurrent summarization from corrupting session history.
- **Memory flush first**: run synchronously so the agent can persist durable memories before history is truncated. Max 5 LLM iterations, 90-second timeout.
- **Summarize**: launch a background goroutine with a 120-second timeout. The LLM produces a summary of all messages except the last 4. The summary is saved and the history is truncated to those 4 messages. The compaction counter is incremented.
### Cancel Handling
When the context is cancelled (via `/stop` or `/stopall`), the loop exits immediately:
- Trace finalization uses `context.Background()` fallback when `ctx.Err() != nil` to ensure the final DB write succeeds.
- Trace status is set to `"cancelled"` instead of `"error"`.
- An empty outbound message triggers cleanup (stop typing indicator, clear reactions).
---
## 2. System Prompt
The system prompt is assembled dynamically from 19 sections. Two modes control the amount of content included:
- **PromptFull**: used for main agent runs. Includes all sections.
- **PromptMinimal**: used for sub-agents and cron jobs. Reduced sections (only AGENTS.md and TOOLS.md from bootstrap files).
### Sections (In Build Order)
1. **Identity** -- channel-aware context with platform type (Telegram, Zalo, etc.) and chat type (direct/group).
2. **First-run bootstrap** -- `[MANDATORY]` notice injected if BOOTSTRAP.md is present, forcing immediate execution.
3. **Persona** -- SOUL.md and IDENTITY.md injected early in the "primacy zone" to prevent drift in long conversations.
4. **Tooling** -- core tool descriptions, filtered by policy and sandbox status.
5. **Credentialed CLI** -- optional secure CLI context for credentialed exec tool access.
6. **Safety** -- defensive preamble for handling external content, identity anchoring for predefined agents.
7. **Self-Evolution** -- rules for predefined agents to update SOUL.md (style/tone) from user feedback.
8. **Skills (inline)** -- skill content injected directly when the skill set is small (≤15 skills).
9. **Skills (search mode)** -- use `skill_search` tool when the skill set is large.
10. **MCP Tools (inline)** -- external integration tools with real descriptions.
11. **MCP Tools (search mode)** -- use `mcp_tool_search` when many MCP tools are available.
12. **Workspace** -- working directory path, file structure, sandbox container workdir.
13. **Team Workspace** -- absolute path to shared team workspace (for team agents).
14. **Sandbox** -- Docker container instructions, available commands, policy notes.
15. **User Identity** -- owner IDs for permission checks (full mode only).
16. **Time** -- current UTC date/time for temporal awareness.
17. **Channel Formatting** -- platform-specific output hints (e.g., Zalo → plain text).
18. **Extra Context** -- additional context wrapped in `<extra_context>` tags (subagent context, etc.).
19. **Project Context** -- bootstrap context files (remaining after persona extraction), wrapped in defensive preamble.
20. **Sub-Agent Spawning** -- rules for launching child agents (skipped for team agents with TEAM.md).
21. **Runtime** -- agent ID, session key, provider info, model pricing.
22. **Persona Reminder** -- recency reinforcement to combat "lost in the middle" in long conversations.
23. **Memory Reminders** -- prompts to run memory_search and knowledge_graph_search before answering.
---
## 3. Sanitize Output
A 7-step pipeline cleans raw LLM output before delivering it to the user.
```mermaid
flowchart TD
IN[Raw LLM Output] --> S1
S1["1. stripGarbledToolXML<br/>Remove broken XML tool artifacts<br/>from DeepSeek, GLM, Minimax"] --> S2
S2["2. stripDowngradedToolCallText<br/>Remove text-format tool calls:<br/>[Tool Call: ...], [Tool Result ...]"] --> S3
S3["3. stripThinkingTags<br/>Remove reasoning tags:<br/>think, thinking, thought, antThinking"] --> S4
S4["4. stripFinalTags<br/>Remove final tag wrappers,<br/>preserve inner content"] --> S5
S5["5. stripEchoedSystemMessages<br/>Remove hallucinated<br/>[System Message] blocks"] --> S6
S6["6. collapseConsecutiveDuplicateBlocks<br/>Deduplicate repeated paragraphs<br/>caused by model stuttering"] --> S7
S7["7. stripLeadingBlankLines<br/>Remove leading whitespace lines"] --> TRIM
TRIM["TrimSpace()"] --> OUT[Clean Output]
```
### Step Details
1. **stripGarbledToolXML** -- Some models (DeepSeek, GLM, Minimax) emit tool-call XML as plain text instead of proper structured tool calls. This step removes tags like `<tool_call>`, `<function_call>`, `<tool_use>`, `<minimax:tool_call>`, and `<parameter name=...>`. If the entire response consists of garbled XML, an empty string is returned.
2. **stripDowngradedToolCallText** -- Removes text-format tool calls such as `[Tool Call: ...]`, `[Tool Result ...]`, and `[Historical context: ...]` along with any accompanying JSON arguments and output. Uses line-by-line scanning because Go regex does not support lookahead.
3. **stripThinkingTags** -- Removes internal reasoning tags: `<think>`, `<thinking>`, `<thought>`, `<antThinking>`. Case-insensitive, non-greedy matching.
4. **stripFinalTags** -- Removes `<final>` and `</final>` wrapper tags but preserves the content inside them.
5. **stripEchoedSystemMessages** -- Removes `[System Message]` blocks that the LLM hallucinates or echoes in its response. Scans line by line, skipping content until an empty line is reached.
6. **collapseConsecutiveDuplicateBlocks** -- Removes paragraphs that repeat consecutively (a symptom of model stuttering). Splits by `\n\n` and compares each trimmed block against its predecessor.
7. **stripLeadingBlankLines** -- Removes whitespace-only lines at the beginning of the output while preserving indentation in the remaining content.
---
## 4. Input Guard
The Input Guard detects prompt injection attempts in user messages. It is a detection system -- by default it logs warnings but does not block requests.
### 6 Detection Patterns
| Pattern | Description | Example |
|---------|-------------|---------|
| `ignore_instructions` | Attempts to override prior instructions | "Ignore all previous instructions" |
| `role_override` | Attempts to redefine the agent's role | "You are now a different assistant" |
| `system_tags` | Injection of fake system-level tags | `<\|im_start\|>system`, `[SYSTEM]` |
| `instruction_injection` | Insertion of new directives | "New instructions:", "override:" |
| `null_bytes` | Null byte injection | `\x00` characters in the message |
| `delimiter_escape` | Attempts to escape context boundaries | "end of system", `</instructions>` |
### 4 Action Modes
| Action | Behavior |
|--------|----------|
| `"off"` | Scanning disabled entirely |
| `"log"` | Log at info level (`security.injection_detected`), continue processing |
| `"warn"` (default) | Log at warn level (`security.injection_detected`), continue processing |
| `"block"` | Log at warn level and return an error, halting the request |
All security events use the `slog.Warn("security.injection_detected")` convention.
---
## 5. History Pipeline
The history pipeline prepares conversation history before sending it to the LLM. It runs in three sequential stages.
```mermaid
flowchart TD
RAW[Raw Session History] --> S1
S1["Stage 1: limitHistoryTurns<br/>Keep the last N user turns<br/>plus their associated assistant/tool messages"] --> S2
S2["Stage 2: pruneContextMessages<br/>2-pass tool result trimming<br/>(see Section 6)"] --> S3
S3["Stage 3: sanitizeHistory<br/>Repair broken tool_use / tool_result pairing<br/>after truncation"] --> OUT[Cleaned History]
```
### Stage 1: limitHistoryTurns
Takes the raw session history and a `historyLimit` parameter. Keeps only the last N user turns along with all associated assistant and tool messages that belong to those turns. Earlier messages are discarded.
### Stage 2: pruneContextMessages
Applies the 2-pass context pruning algorithm described in Section 6.
### Stage 3: sanitizeHistory
Repairs tool message pairing that may have been broken by truncation or compaction:
1. Skip orphaned tool messages at the beginning of history (no preceding assistant message).
2. For each assistant message that contains tool calls, collect the expected tool_call IDs.
3. Validate that the following tool messages match those expected IDs. Drop mismatched tool messages.
4. Synthesize missing tool results with placeholder text: `"[Tool result missing -- session was compacted]"`.
---
## 6. Context Pruning
Context pruning reduces oversized tool results using a 2-pass algorithm. It only activates when the estimated token-to-context-window ratio crosses a threshold.
```mermaid
flowchart TD
START[Estimate token ratio vs context window] --> CHECK{Ratio >= softTrimRatio 0.3?}
CHECK -->|No| DONE[No pruning needed]
CHECK -->|Yes| PASS1
PASS1["Pass 1: Soft Trim<br/>For each eligible tool result > 4000 chars:<br/>Keep first 1500 chars + last 1500 chars<br/>Replace middle with '...'"]
PASS1 --> CHECK2{"Ratio >= hardClearRatio 0.5?"}
CHECK2 -->|No| DONE
CHECK2 -->|Yes| PASS2
PASS2["Pass 2: Hard Clear<br/>Replace entire tool result content<br/>with '[Old tool result content cleared]'<br/>Stop when ratio drops below threshold"]
PASS2 --> DONE
```
### Defaults
| Parameter | Default | Description |
|-----------|---------|-------------|
| `keepLastAssistants` | 3 | Number of recent assistant messages protected from pruning |
| `softTrimRatio` | 0.3 | Token ratio threshold to trigger Pass 1 |
| `hardClearRatio` | 0.5 | Token ratio threshold to trigger Pass 2 |
| `minPrunableToolChars` | 50,000 | Minimum tool result length eligible for hard clear |
### Protected Zone
The following messages are never pruned:
- System messages
- The last N assistant messages (default: 3)
- The first user message in the conversation
---
## 7. Auto-Summarize and Compaction
The system uses a two-stage compaction strategy: **mid-loop** (during active iterations) and **post-run** (after completion).
### Mid-Loop Compaction (During Iteration)
When in-memory messages exceed 75% of context window during LLM iterations, the agent immediately summarizes the first ~70% of messages in place, keeping the last ~30%. This prevents context overflow in long-running tasks without waiting for post-run summarization.
```
Threshold: prompt_tokens >= contextWindow * 0.75 (configurable via MaxHistoryShare)
Trigger: Once per run, inside the iteration loop (between LLM calls)
Output: In-memory messages replaced with [summary] + [recent 4 messages]
```
### Post-Run Compaction (After Completion)
When the session history exceeds thresholds **after** a run completes, the session is compacted in the background.
```mermaid
flowchart TD
CHECK{"> 50 messages OR<br/>> 75% context window?"}
CHECK -->|No| SKIP[Skip compaction]
CHECK -->|Yes| LOCK["Per-session non-blocking lock<br/>(skip if another run already compacting)"]
LOCK -->|Lock acquired| FLUSH
LOCK -->|Already locked| SKIP
FLUSH["Step 1: Memory Flush (synchronous)<br/>Embedded agent turn with write_file tool<br/>Agent stores durable memories before truncation<br/>Uses PromptMinimal mode<br/>Max 5 iterations, 90s timeout"]
FLUSH --> SUMMARIZE
SUMMARIZE["Step 2: Summarize (background goroutine)<br/>Keep last 4 messages<br/>LLM summarizes older messages<br/>temp=0.3, max_tokens=1024, timeout 120s"]
SUMMARIZE --> SAVE
SAVE["Step 3: Save<br/>SetSummary() + TruncateHistory(4)<br/>IncrementCompaction()"]
```
### Summary Reuse
On the next request, the saved summary is injected at the beginning of the message list as two messages:
1. `{role: "user", content: "[Summary of earlier conversation]\n{summary}"}`
2. `{role: "assistant", content: "I understand the context..."}`
This gives the LLM continuity without replaying the full history. Protected zone: the last 3 assistant messages are never pruned.
---
## 8. Memory Flush
Memory flush runs **synchronously before post-run compaction** to give the agent an opportunity to persist important information before session history is truncated.
### Trigger Conditions
- **Primary**: compaction is about to run (message count or token ratio exceeded).
- **Token threshold**: only runs when session tokens are significant enough to warrant capture.
- **Deduplication**: runs at most once per compaction cycle, tracked by comparing compaction counter.
### Mechanism
An embedded agent turn with special configuration:
- **System prompt mode**: `PromptMinimal` (stripped-down context).
- **Message window**: latest 10 messages only (not the full history).
- **Available tools**: `write_file` and `read_file` for memory file operations.
- **Default prompt**: "Pre-compaction memory flush. Store durable memories now (use memory/YYYY-MM-DD.md; create memory/ if needed). If nothing to store, reply with NO_REPLY."
- **Output handling**: recognizes `NO_REPLY` convention (silent completion).
### Timing
- **Synchronous blocking**: blocks the entire post-run path until flush LLM call completes.
- **Timeout**: 90 seconds for the entire flush turn (5 max iterations).
- **Configurable**: can be disabled or customized via `compaction.memory_flush` config section.
### Results
The agent can write findings to `memory/YYYY-MM-DD.md` files. These persist across session compaction and are available to future sessions via `memory_search` and `memory_get` tools.
---
## 9. Agent Router
The Agent Router manages Loop instances with a cache layer. It supports lazy resolution, TTL-based expiration, and run abort.
```mermaid
flowchart TD
GET["Router.Get(agentID)"] --> CACHE{"Cache hit<br/>and TTL valid?"}
CACHE -->|Yes| RETURN[Return cached Loop]
CACHE -->|No or Expired| RESOLVE{"Resolver configured?"}
RESOLVE -->|No| ERR["Error: agent not found"]
RESOLVE -->|Yes| DB["Resolver.Resolve(agentID)<br/>Load from DB, create Loop"]
DB --> STORE[Store in cache with TTL]
STORE --> RETURN
```
### Cache Invalidation
`InvalidateAgent(agentID)` removes a specific agent from the cache, forcing the next `Get()` call to re-resolve from the database.
### Active Run Tracking
| Method | Behavior |
|--------|----------|
| `RegisterRun(runID, sessionKey, agentID, cancel)` | Register a new active run with its cancel function |
| `AbortRun(runID, sessionKey)` | Cancel a run (verifies sessionKey match before aborting) |
| `AbortRunsForSession(sessionKey)` | Cancel all active runs belonging to a session |
---
## 10. Resolver
The `ManagedResolver` lazy-creates Loop instances from PostgreSQL data when the Router encounters a cache miss.
```mermaid
flowchart TD
MISS["Router cache miss"] --> LOAD["Step 1: Load agent from DB<br/>AgentStore.GetByKey(agentKey)"]
LOAD --> PROV["Step 2: Resolve provider<br/>ProviderRegistry.Get(provider)<br/>Fallback: first provider in registry"]
PROV --> BOOT["Step 3: Load bootstrap files<br/>bootstrap.LoadFromStore(agentID)"]
BOOT --> DEFAULTS["Step 4: Apply defaults<br/>contextWindow <= 0 then 200K<br/>maxIterations <= 0 then 20"]
DEFAULTS --> CREATE["Step 5: Create Loop<br/>NewLoop(LoopConfig)"]
CREATE --> WIRE["Step 6: Wire hooks<br/>EnsureUserFilesFunc, ContextFileLoaderFunc"]
WIRE --> DONE["Return Loop to Router for caching"]
```
### Resolved Properties
- **Provider**: looked up by name from the provider registry. Falls back to the first registered provider if not found.
- **Bootstrap files**: loaded from the workspace directory via `bootstrap.LoadWorkspaceFiles()`. Standard files: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, BOOTSTRAP.md. Additional files (MEMORY.md, USER_PREDEFINED.md, DELEGATION.md, TEAM.md, AVAILABILITY.md) loaded separately as needed. Per-user files (USER.md) created on first chat via `EnsureUserFilesFunc`.
- **Agent type**: `open` (per-user context, seeded from template files) or `predefined` (agent-level context plus per-user USER.md overlay).
- **Per-user seeding**: `EnsureUserFilesFunc` seeds template files on first chat, idempotent (skips files that already exist). Uses PostgreSQL's `xmax` trick in `GetOrCreateUserProfile` to distinguish INSERT from ON CONFLICT UPDATE, triggering seeding only for genuinely new users.
- **Dynamic context loading**: `ContextFileLoaderFunc` resolves context files based on agent type and request context. Returns a `[]bootstrap.ContextFile` list with truncated content for system prompt injection. For open agents: loads per-user files from workspace. For predefined agents: loads agent-level files plus per-user USER.md.
- **Custom tools**: `DynamicLoader.LoadForAgent()` clones the global tool registry and adds per-agent custom tools, ensuring each agent gets its own isolated set of dynamic tools.
- **Team context**: auto-resolved for agents that belong to a team. Lead agents get the team workspace as default workspace; non-lead members keep their own workspace with team workspace accessible via absolute path tool context.
---
## 11. Team Workspace Handling
Agents that belong to a team have access to shared team workspaces for collaboration.
### Workspace Resolution
**For dispatched tasks** (via `req.TeamWorkspace`):
- The team workspace becomes the **default workspace** for relative path operations
- All file tools (read_file, write_file, list_files) use team workspace by default
- Agent workspace is still accessible via `WithToolTeamWorkspace()` context for absolute-path access
**For direct chat** (auto-resolved via team membership):
- Lead agents get team workspace as their default workspace (primary job is team coordination)
- Non-lead member agents keep their own workspace as default
- Team workspace is accessible via `WithToolTeamWorkspace()` context
### Path Scoping
- **Shared workspace mode** (team.settings.shared_workspace): all agents in team share single workspace
- **Isolated workspace mode** (default): each agent gets a workspace scoped by `(teamID, chatID)` or `(teamID, userID)`
### Context Variables
During runs with team context:
- `WithToolTeamWorkspace(ctx, wsDir)` — absolute path to shared team workspace
- `WithToolWorkspace(ctx, effectiveWorkspace)` — effective default workspace for file operations
- `WithToolTeamID(ctx, teamID)` — team UUID string for team-scoped tool operations
- `WithToolTaskID(ctx, taskID)` — team task ID when executing dispatched team tasks
---
## 12. Event System
The Loop publishes events via an `onEvent` callback. The WebSocket gateway forwards these as `EventFrame` messages to connected clients for real-time progress tracking.
### Event Types
| Event | When | Payload |
|-------|------|---------|
| `run.started` | Run begins | `{"message": "..."}` |
| `activity` | Phase transitions | `{"phase": "thinking"|"tool_exec"|"compacting", "iteration": N}` |
| `chunk` | Streaming: each text fragment from the LLM | `{"content": "..."}` |
| `thinking` | Streaming: thinking tokens (extended thinking models) | `{"content": "..."}` |
| `tool.call` | Tool execution begins | `{"name": "...", "id": "...", "arguments": {...}}` |
| `tool.result` | Tool execution completes | `{"name": "...", "id": "...", "is_error": bool, "result": "..."}` |
| `block.reply` | Intermediate assistant content during tool iterations | `{"content": "..."}` |
| `run.retrying` | LLM provider retry after failure | `{"attempt": N, "maxAttempts": M, "error": "..."}` |
| `run.completed` | Run finishes successfully | `{"content": "...", "usage": {...}}` |
| `run.failed` | Run finishes with an error | `{"error": "..."}` |
### Event Flow
```mermaid
sequenceDiagram
participant L as Agent Loop
participant GW as Gateway
participant C as WebSocket Client
L->>GW: emit(run.started)
GW->>C: EventFrame
loop LLM Iterations
L->>GW: emit(chunk) x N
GW->>C: EventFrame x N
L->>GW: emit(tool.call)
GW->>C: EventFrame
L->>GW: emit(tool.result)
GW->>C: EventFrame
end
L->>GW: emit(run.completed)
GW->>C: EventFrame
```
---
## 13. Tracing
Every agent run produces a trace with a hierarchy of spans for debugging, analysis, and cost tracking.
### Span Hierarchy
```mermaid
flowchart TD
T["Trace (one per Run)"] --> A["Root Agent Span<br/>Covers the entire run duration"]
A --> L1["LLM Span #1<br/>provider, model, iteration number"]
A --> T1["Tool Span #1a<br/>tool name, duration"]
A --> T2["Tool Span #1b<br/>tool name, duration"]
A --> L2["LLM Span #2<br/>provider, model, iteration number"]
A --> T3["Tool Span #2a<br/>tool name, duration"]
```
### 3 Span Types
| Span Type | Description |
|-----------|-------------|
| **Root Agent Span** | Parent span covering the full run. Contains agent ID, session key, and final status. |
| **LLM Call Span** | One per LLM invocation. Records provider, model, token counts (input/output), and duration. |
| **Tool Call Span** | One per tool execution. Records tool name, whether it errored, and duration. |
### Verbose Mode
Enabled via the `GOCLAW_TRACE_VERBOSE=1` environment variable.
| Field | Normal Mode | Verbose Mode |
|-------|-------------|--------------|
| `OutputPreview` | First 500 characters | First 500 characters |
| `InputPreview` | Not recorded | Full LLM input messages as JSON, truncated at 50,000 characters |
---
## 14. File Reference
| File | Responsibility |
|------|---------------|
| `internal/agent/loop_run.go` | Run() entry point: trace creation, span management, event emission wrapper |
| `internal/agent/loop.go` | runLoop() core loop: LLM iteration, tool execution, message buffering, event emission |
| `internal/agent/loop_history.go` | History pipeline: limitHistoryTurns, pruneContextMessages, sanitizeHistory, summary injection |
| `internal/agent/pruning.go` | Context pruning: 2-pass soft trim and hard clear algorithm |
| `internal/agent/loop_compact.go` | Mid-loop compaction: in-memory message summarization during iterations |
| `internal/agent/systemprompt.go` | System prompt assembly (19+ sections), PromptFull and PromptMinimal modes |
| `internal/agent/systemprompt_sections.go` | Individual section builders (tooling, workspace, sandbox, skills, MCP, etc.) |
| `internal/agent/resolver.go` | ManagedResolver: lazy Loop creation from PostgreSQL, provider resolution, bootstrap loading |
| `internal/agent/loop_tracing.go` | Trace and span creation, verbose mode input capture, span finalization |
| `internal/agent/input_guard.go` | Input Guard: 6 regex patterns, 4 action modes, security logging |
| `internal/agent/sanitize.go` | 7-step output sanitization pipeline |
| `internal/agent/memoryflush.go` | Pre-compaction memory flush: embedded agent turn with write_file tool |
| `internal/agent/toolloop.go` | Tool execution and loop detection (no-progress warnings) |
| `internal/bootstrap/files.go` | Bootstrap file loading and context file preparation |