Files
goclaw/docs/00-architecture-overview.md
T
viettranx 7b3ac84152 docs: update references for delegate→team/teammate rename
- Update lane name, env var, channel references across all docs
- Mark delegate tool as removed in tools system docs
- Update routing tables and architecture diagrams
2026-03-18 11:05:23 +07:00

18 KiB

00 - Architecture Overview

1. Overview

GoClaw is an AI agent gateway written in Go. It exposes a WebSocket RPC (v3) interface and an OpenAI-compatible HTTP API for orchestrating LLM-powered agents. The system uses PostgreSQL as its storage backend with full multi-tenant isolation, per-user context files, encrypted credentials, agent delegation, teams, and LLM call tracing.

2. Component Diagram

flowchart TD
    subgraph Clients
        WS[WebSocket Clients]
        HTTP[HTTP Clients]
        TG[Telegram]
        DC[Discord]
        FS[Feishu / Lark]
        ZL[Zalo OA]
        ZLP[Zalo Personal]
        WA[WhatsApp]
        SL[Slack]
    end

    subgraph Gateway["Gateway Server"]
        WSS[WebSocket Server]
        HTTPS[HTTP API Server]
        MR[Method Router]
        RL[Rate Limiter]
        RBAC[Permission Engine]
    end

    subgraph Channels["Channel Manager"]
        CM[Channel Manager]
        PA[Pairing Service]
    end

    subgraph Core["Core Engine"]
        BUS[Message Bus]
        SCHED[Scheduler -- 4 Lanes]
        AR[Agent Router]
        LOOP[Agent Loop -- Think / Act / Observe]
    end

    subgraph Providers["LLM Providers"]
        ANTH[Anthropic -- Native HTTP + SSE]
        OAI["OpenAI-Compatible -- HTTP + SSE<br/>(OpenAI, Gemini, DeepSeek, DashScope, +8)"]
        ACP["ACP -- JSON-RPC 2.0 stdio<br/>(Claude Code, Codex, Gemini CLI)"]
    end

    subgraph Tools["Tool Registry"]
        FS_T[Filesystem]
        EXEC[Exec / Shell]
        WEB[Web Search / Fetch]
        MEM[Memory]
        SUB[Subagent]
        DEL[Delegation]
        TEAM_T[Teams]
        TTS_T[TTS]
        BROW[Browser]
        SK[Skills]
        MCP_T[MCP Bridge]
        CT[Custom Tools]
    end

    subgraph Store["Store Layer"]
        SESS[SessionStore]
        AGENT_S[AgentStore]
        PROV_S[ProviderStore]
        CRON_S[CronStore]
        MEM_S[MemoryStore]
        SKILL_S[SkillStore]
        TRACE_S[TracingStore]
        MCP_S[MCPServerStore]
        CT_S[CustomToolStore]
        TM_S[TeamStore]
    end

    WS --> WSS
    HTTP --> HTTPS
    TG & DC & FS & ZL & ZLP & WA & SL --> CM

    WSS --> MR
    HTTPS --> MR
    MR --> RL --> RBAC --> AR

    CM --> BUS
    BUS --> SCHED
    SCHED --> AR
    AR --> LOOP

    LOOP --> Providers
    LOOP --> Tools
    Tools --> Store
    LOOP --> Store

3. Module Map

Module Description
internal/gateway/ WebSocket + HTTP server, client handling, method router
internal/gateway/methods/ RPC method handlers: chat, agents, teams, delegations, sessions, config, skills, cron, pairing, exec approval, usage, send
internal/agent/ Agent loop (think, act, observe), router, resolver, system prompt builder, sanitization, pruning, tracing, memory flush, DELEGATION.md + TEAM.md injection
internal/providers/ LLM providers: Anthropic (native HTTP + SSE), OpenAI-compatible (HTTP + SSE, 12+ providers), DashScope (Qwen), ACP (JSON-RPC 2.0 subprocess), Claude CLI, Codex, extended thinking support, retry logic
internal/providers/acp/ ACP protocol implementation: ProcessPool (subprocess lifecycle), ToolBridge (fs/terminal), session management
internal/tools/ Tool registry, filesystem ops, exec/shell, policy engine, subagent, delegation manager, team tools, context file + memory interceptors, credential scrubbing, rate limiting, PathDenyable
internal/tools/dynamic_loader.go Custom tool loader: LoadGlobal (startup), LoadForAgent (per-agent clone), ReloadGlobal (cache invalidation)
internal/tools/dynamic_tool.go Custom tool executor: command template rendering, shell escaping, encrypted env vars
internal/store/ Store interfaces: SessionStore, AgentStore, ProviderStore, SkillStore, MemoryStore, CronStore, PairingStore, TracingStore, MCPServerStore, TeamStore, ChannelInstanceStore, ConfigSecretsStore
internal/store/pg/ PostgreSQL implementations (database/sql + pgx/v5)
internal/bootstrap/ System prompt files (AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, BOOTSTRAP.md) + seeding + truncation
internal/config/ Config loading (JSON5) + env var overlay
internal/skills/ SKILL.md loader (5-tier hierarchy) + BM25 search + hot-reload via fsnotify
internal/channels/ Channel manager + adapters: Telegram (forum topics, STT, bot commands), Feishu/Lark (streaming cards, media), Zalo OA, Zalo Personal, Discord, WhatsApp, Slack
internal/mcp/ MCP server bridge (stdio, SSE, streamable-HTTP transports)
internal/scheduler/ Lane-based concurrency control (main, subagent, cron, team lanes) with per-session serialization
internal/memory/ Memory system (pgvector hybrid search)
internal/permissions/ RBAC policy engine (admin, operator, viewer roles)
internal/store/pg/pairing.go DM/device pairing service (8-character codes, database-backed)
internal/sandbox/ Docker-based code execution sandbox
internal/tts/ Text-to-Speech providers: OpenAI, ElevenLabs, Edge, MiniMax
internal/http/ HTTP API handlers: /v1/chat/completions, /v1/agents, /v1/skills, /v1/traces, /v1/mcp, /v1/delegations, summoner
internal/crypto/ AES-256-GCM encryption for API keys
internal/tracing/ LLM call tracing (traces + spans), in-memory buffer with periodic store flush
internal/tracing/otelexport/ Optional OpenTelemetry OTLP exporter (opt-in via build tags; adds gRPC + protobuf)
internal/cache/ Caching layer for agent state and provider responses
internal/bus/ Event pub/sub message bus for inter-component communication
internal/knowledgegraph/ Knowledge graph storage and traversal
internal/mcp/ Model Context Protocol bridge/server (stdio, SSE, streamable-HTTP)
internal/media/ Media handling utilities
internal/oauth/ OAuth authentication integration
internal/sessions/ Session management and lifecycle
internal/tasks/ Task management system
internal/upgrade/ Database schema version tracking and migrations

4. Multi-Tenant Identity Model

GoClaw uses the Identity Propagation pattern (also known as Trusted Subsystem). It does not implement authentication or authorization — instead, it trusts the upstream service that authenticates with the gateway token to provide accurate user identity.

flowchart LR
    subgraph "Upstream Service (trusted)"
        AUTH["Authenticate end-user"]
        HDR["Set X-GoClaw-User-Id header<br/>or user_id in WS connect"]
    end

    subgraph "GoClaw Gateway"
        EXTRACT["Extract user_id<br/>(opaque, VARCHAR 255)"]
        CTX["store.WithUserID(ctx)"]
        SCOPE["Per-user scoping:<br/>sessions, context files,<br/>memory, traces, agent shares"]
    end

    AUTH --> HDR
    HDR --> EXTRACT
    EXTRACT --> CTX
    CTX --> SCOPE

Identity Flow

Entry Point How user_id is provided Enforcement
HTTP API X-GoClaw-User-Id header Required
WebSocket user_id field in connect handshake Required
Channels Derived from platform sender ID (e.g., Telegram user ID) Automatic

Compound User ID Convention

The user_id field is opaque to GoClaw — it does not interpret or validate the format. For multi-tenant deployments, the recommended convention is:

tenant.{tenantId}.user.{userId}

This hierarchical format ensures natural isolation between tenants. Since user_id is used as a scoping key across all per-user tables (user_context_files, user_agent_profiles, user_agent_overrides, agent_shares, sessions, traces), the compound format guarantees that users from different tenants cannot access each other's data.

Where user_id is used

Component Usage
Session keys agent:{agentId}:{channel}:direct:{peerId} — peerId derived from user_id
Context files user_context_files table scoped by (agent_id, user_id)
User profiles user_agent_profiles table — first/last seen, workspace
User overrides user_agent_overrides — per-user provider/model preferences
Agent shares agent_shares table — user-level access control
Memory Per-user memory entries via context propagation
Traces traces table includes user_id for filtering
MCP grants mcp_user_grants — per-user MCP server access
Skills grants skill_user_grants — per-user skill access

6. Gateway Startup Sequence

sequenceDiagram
    participant CLI as CLI (cmd/root.go)
    participant GW as runGateway()
    participant PG as PostgreSQL
    participant Engine as Core Engine

    CLI->>GW: 1. Parse CLI flags + load config
    GW->>GW: 2. Resolve workspace + data dirs
    GW->>GW: 3. Create Message Bus

    GW->>PG: 4. Connect to Postgres (pg.NewPGStores)
    PG-->>GW: PG stores created
    GW->>GW: 5. Start tracing collector
    GW->>PG: 6. Register providers from DB
    GW->>PG: 7. Wire embedding provider to PGMemoryStore
    GW->>PG: 8. Backfill memory embeddings (background)

    GW->>GW: 9. Register config-based providers
    GW->>GW: 10. Create tool registry (filesystem, exec, web, memory, browser, TTS, subagent, MCP)
    GW->>GW: 11. Load bootstrap files (DB)
    GW->>GW: 12. Create skills loader + register skill_search tool
    GW->>GW: 13. Wire skill embeddings

    GW->>GW: 14. Create agents lazily (set ManagedResolver)
    GW->>GW: 15. wireManagedExtras (interceptors, cache subscribers)
    GW->>GW: 16. Wire managed HTTP handlers (agents, skills, traces, MCP)

    GW->>Engine: 17. Create gateway server (WS + HTTP)
    GW->>Engine: 18. Register RPC methods
    GW->>Engine: 19. Register + start channels (Telegram, Discord, Feishu, Zalo, WhatsApp)
    GW->>Engine: 20. Start cron, scheduler (4 lanes)
    GW->>Engine: 21. Start skills watcher + inbound consumer
    GW->>Engine: 22. Listen on host:port

7. Database Wiring

The wireManagedExtras() function in cmd/gateway_managed.go wires multi-tenant components:

flowchart TD
    W1["1. ContextFileInterceptor<br/>Routes read_file / write_file to DB"] --> W2
    W2["2. User Seeding Callback<br/>Seeds per-user context files on first chat"] --> W3
    W3["3. Context File Loader<br/>Loads per-user vs agent-level files by agent_type"] --> W4
    W4["4. ManagedResolver<br/>Lazy-creates agent Loops from DB on cache miss"] --> W5
    W5["5. Virtual FS Interceptors<br/>Wire interceptors on read_file + write_file + memory tools"] --> W6
    W6["6. Memory Store Wiring<br/>Wire PGMemoryStore on memory_search + memory_get tools"] --> W7
    W7["7. Cache Invalidation Subscribers<br/>Subscribe to MessageBus events"] --> W8
    W8["8. Delegation Tools<br/>DelegateManager + agent links"] --> W9
    W9["9. Team Tools<br/>team_tasks + team_message + team auto-linking"] --> W10
    W10["10. Team Mailbox<br/>team_message tool for peer communication"]

Cache Invalidation Events

Event Subscriber Action
cache:bootstrap ContextFileInterceptor InvalidateAgent() or InvalidateAll()
cache:agent AgentRouter InvalidateAgent() -- forces re-resolve from DB
cache:skills SkillStore BumpVersion()
cache:cron CronStore InvalidateCache()
cache:custom_tools DynamicToolLoader ReloadGlobal() + AgentRouter.InvalidateAll()

8. Scheduler Lanes

The scheduler uses a lane-based concurrency model. Each lane is a named worker pool with a bounded semaphore. Per-session queues control concurrency within each session.

flowchart TD
    subgraph Main["Lane: main (concurrency 2)"]
        M1[Channel messages]
        M2[WebSocket requests]
    end

    subgraph Sub["Lane: subagent (concurrency 4)"]
        S1[Subagent executions]
    end

    subgraph Del["Lane: team (concurrency 100)"]
        D1[Delegation executions]
    end

    subgraph Cron["Lane: cron (concurrency 1)"]
        C1[Cron job executions]
    end

    Main --> SEM1[Semaphore]
    Sub --> SEM2[Semaphore]
    Del --> SEM3[Semaphore]
    Cron --> SEM4[Semaphore]

    SEM1 --> Q[Per-Session Queue]
    SEM2 --> Q
    SEM3 --> Q
    SEM4 --> Q

    Q --> AGENT[Agent Loop]

Lane Defaults

Lane Concurrency Env Override Purpose
main 30 GOCLAW_LANE_MAIN Primary user chat sessions
subagent 50 GOCLAW_LANE_SUBAGENT Spawned subagents
team 100 GOCLAW_LANE_TEAM Agent team/delegation executions
cron 30 GOCLAW_LANE_CRON Scheduled cron jobs

Session Queue Concurrency

Per-session queues now support configurable maxConcurrent:

  • DMs: maxConcurrent = 1 (single-threaded per user)
  • Groups: maxConcurrent = 3 (multiple concurrent responses)
  • Adaptive throttle: When session history exceeds 60% of context window, concurrency drops to 1

Queue Modes

Mode Behavior
queue FIFO -- new messages wait until the current run completes
followup Merges incoming message into the pending queue as a follow-up
interrupt Cancels the active run and replaces it with the new message

Default queue config: capacity 10, drop policy old (drops oldest on overflow), debounce 800ms.

/stop and /stopall

  • /stop -- Cancel the oldest running task (others keep going)
  • /stopall -- Cancel all running tasks + drain the queue

Both are intercepted before the debouncer to avoid being merged with normal messages.


9. Graceful Shutdown

When the process receives SIGINT or SIGTERM:

  1. Broadcast shutdown event to all connected WebSocket clients.
  2. channelMgr.StopAll() -- stop all channel adapters.
  3. cronStore.Stop() -- stop cron scheduler.
  4. sandboxMgr.Stop() + ReleaseAll() -- release Docker containers.
  5. cancel() -- cancel root context, propagating to consumer + scheduler.
  6. Deferred cleanup: flush tracing collector, close memory store, close browser manager, stop scheduler lanes.
  7. HTTP server shutdown with a 5-second timeout (context.WithTimeout).

10. Config System

Configuration is loaded from a JSON5 file with environment variable overlay. Secrets are never persisted to the config file.

flowchart TD
    A{Config path?} -->|--config flag| B[CLI flag path]
    A -->|GOCLAW_CONFIG env| C[Env var path]
    A -->|default| D["config.json"]

    B & C & D --> LOAD["config.Load()"]
    LOAD --> S1["1. Set defaults"]
    S1 --> S2["2. Parse JSON5"]
    S2 --> S3["3. Env var overlay<br/>(GOCLAW_*_API_KEY)"]
    S3 --> S4["4. Apply computed defaults<br/>(context pruning, etc.)"]
    S4 --> READY[Config ready]

Key Config Sections

Section Purpose
gateway host, port, token, allowed_origins, rate_limit_rpm, max_message_chars
agents defaults (provider, model, context_window) + list (per-agent overrides)
tools profile, allow/deny lists, exec_approval, web, browser, mcp_servers, rate_limit_per_hour
channels Per-channel: enabled, token, dm_policy, group_policy, allow_from
database postgres_dsn read only from env var

Secret Handling

  • Secrets exist only in env vars or .env.local -- never in config.json.
  • GOCLAW_POSTGRES_DSN is tagged json:"-" and cannot be read from the config file.
  • MaskedCopy() replaces API keys with "***" when returning config over WebSocket.
  • StripSecrets() removes secrets before writing config to disk.
  • Config hot-reload via fsnotify watcher with 300ms debounce.

11. File Reference

File Purpose
cmd/root.go Cobra CLI entry point, flag parsing
cmd/gateway.go Gateway startup orchestrator (runGateway())
cmd/gateway_managed.go Database wiring (wireManagedExtras(), wireManagedHTTP())
cmd/gateway_callbacks.go Shared callbacks (user seeding, context file loading)
cmd/gateway_consumer.go Inbound message consumer (subagent, teammate routing)
cmd/gateway_providers.go Provider registration (config-based + DB-based)
cmd/gateway_methods.go RPC method registration
internal/config/config.go Config struct definitions
internal/config/config_load.go JSON5 loading + env overlay
internal/config/config_channels.go Channel config structs
internal/gateway/server.go WS + HTTP server, CORS, rate limiter setup
internal/gateway/client.go WebSocket client handling, read limit (512KB)
internal/gateway/router.go RPC method routing
internal/scheduler/lanes.go Lane definitions, semaphore-based concurrency
internal/scheduler/queue.go Per-session queue, queue modes, debounce
internal/store/stores.go Stores container struct (all 22+ store interfaces)
internal/store/types.go StoreConfig, BaseModel

Cross-References

Document Content
01-agent-loop.md Agent loop detail, sanitization pipeline, history management
02-providers.md LLM providers, retry logic, schema cleaning
03-tools-system.md Tool registry, policy engine, interceptors, custom tools, MCP grants
04-gateway-protocol.md WebSocket protocol v3, HTTP API, RBAC, identity propagation
05-channels-messaging.md Channel adapters, Telegram formatting, pairing, per-user scoping
06-store-data-model.md Store interfaces, PostgreSQL schema, session caching, custom tool store
07-bootstrap-skills-memory.md Bootstrap files, skills system, memory, skills grants
08-scheduling-cron.md Scheduler lanes, cron lifecycle
09-security.md Defense layers, encryption, rate limiting, RBAC, sandbox
10-tracing-observability.md Tracing collector, span hierarchy, OTel export, trace API
11-agent-teams.md Agent teams, task board, mailbox, delegation integration
12-extended-thinking.md Extended thinking, per-provider support, streaming