16 KiB
Model Steering System — Track, Hint & Guard
How GoClaw steers and assists small models (MiniMax, Qwen, Gemini Flash...) through 3 control layers.
1. Overview
Small models (< 70B params) commonly face 3 issues when running agent loops:
| Problem | Symptom |
|---|---|
| Losing direction | Uses up iteration budget without answering, loops meaningless tool calls |
| Forgetting context | Doesn't report progress, doesn't leverage existing information |
| Safety violations | Runs dangerous commands, falls to prompt injection, writes malicious code |
GoClaw addresses these with 3 steering layers operating concurrently:
flowchart LR
REQ([Request]) --> TRACK
subgraph TRACK["Track — Where to run?"]
direction TB
T1[Lane routing]
T2[Concurrency control]
T3[Session serialization]
end
TRACK --> GUARD
subgraph GUARD["Guard — What's allowed?"]
direction TB
G1[Input validation]
G2[Shell deny patterns]
G3[Skill content scan]
end
GUARD --> HINT
subgraph HINT["Hint — What should it do?"]
direction TB
H1[Budget warnings]
H2[Error guidance]
H3[Progress nudges]
end
HINT --> LOOP([Agent Loop])
Design principles:
- Track = infrastructure — the model doesn't know which lane it's running on
- Guard = hard boundary — blocks dangerous behavior regardless of model
- Hint = soft guidance — suggests via messages, model can ignore (but usually doesn't)
2. Track System (Lane-based Scheduling)
Track routes requests by work type. Each lane has its own concurrency, ensuring workloads don't contend for resources.
2.1 Lane Architecture
flowchart TD
SCHED[Scheduler] --> LM[Lane Manager]
LM --> L1["main (30)"]
LM --> L2["subagent (50)"]
LM --> L3["team (100)"]
LM --> L4["cron (30)"]
L1 --> Q1[SessionQueue A]
L1 --> Q2[SessionQueue B]
L2 --> Q3[SessionQueue C]
L3 --> Q4[SessionQueue D]
L4 --> Q5[SessionQueue E]
Q1 --> LOOP1([Agent Loop])
Q2 --> LOOP2([Agent Loop])
Q3 --> LOOP3([Agent Loop])
Q4 --> LOOP4([Agent Loop])
Q5 --> LOOP5([Agent Loop])
2.2 Lane Assignment
| Lane | Concurrency | Request Source | Purpose |
|---|---|---|---|
main |
30 | User chat via WS/channel | Primary conversation sessions |
subagent |
50 | Subagent announce | Child agents spawned by main |
team |
100 | Team task dispatch | Members in agent teams |
cron |
30 | Cron scheduler | Scheduled periodic jobs |
Lane assignment is deterministic — based on request type, not agent config.
2.3 Per-session Queue
Each session within a lane has its own queue:
- DM:
maxConcurrent = 1— serial, no overlap - Group:
maxConcurrent = 3— allows parallel replies - Adaptive throttle: When session history > 60% context window → reduces to 1
This prevents small models from overwhelming themselves when context is nearly full.
2.4 Reference Keywords
Scheduler, LaneManager, Lane, SessionQueue, LaneMain, LaneSubagent, LaneTeam, LaneCron, DefaultLanes(), ScheduleWithOpts()
3. Hint System (Contextual Guidance Injection)
Hints are messages injected into the conversation at strategic moments during the agent loop. Small models especially need hints because they tend to forget initial instructions as conversations grow long.
3.1 Overview of 8 Hint Types
flowchart TD
subgraph LOOP["Agent Loop Phases"]
PH2["Phase 2: Input Validation"]
PH3["Phase 3: Build Messages"]
PH4["Phase 4: LLM Iteration"]
PH5["Phase 5: Tool Execution"]
end
CH["Channel Formatting Hint"] -.-> PH3
GR["Group Reply Hint"] -.-> PH3
SR["System Prompt Reminders"] -.-> PH3
BH["Budget Hint (75%)"] -.-> PH4
SE["Skill Evolution Nudge (70%/90%)"] -.-> PH4
TN["Team Progress Nudge (every 6 iter)"] -.-> PH4
OT["Output Truncation Hint"] -.-> PH4
SH["Sandbox Hint"] -.-> PH5
TC["Task Creation Guide"] -.-> PH5
3.2 Detailed Breakdown
A. Budget Hints — Preventing Directionless Looping
When the model uses up its iteration budget without answering the user:
| Trigger | Message (summary) |
|---|---|
| 75% iterations used, no text response yet | "You've used 75% of your budget. Start synthesizing results." |
| Max iterations reached | Loop stops, returns final result |
Especially effective with small models — instead of letting them loop indefinitely, forces early summarization.
B. Output Truncation Hints — Error Recovery
When LLM response is cut off due to max_tokens:
"[System] Output was truncated. Tool call arguments are incomplete. Retry with shorter content — split writes or reduce text."
Small models often don't recognize their output was truncated. This hint helps them understand the cause and adjust.
C. Skill Evolution Nudges — Encouraging Self-Improvement
| Trigger | Content |
|---|---|
| 70% iteration budget | Suggests creating a skill to reuse the workflow |
| 90% iteration budget | Stronger reminder about skill creation |
Characteristics: i18n (en/vi/zh), ephemeral (only exists in current run, not persisted to session).
D. Team Task Progress Nudges — Progress Reporting Reminders
Every 6 iterations when the agent is working on a team task:
"[System] You're at iteration 12/20 (~60% budget) for task #3: 'Implement auth module'. Report progress now:
team_tasks(action="progress", percent=60, text="...")"
Small models tend to forget progress reporting → lead agent doesn't know the status → causes bottlenecks. This hint addresses it directly.
E. Sandbox Hints — Explaining Environment Errors
When a command running in a Docker sandbox encounters an error, hints are attached directly to the error output:
| Error Pattern | Hint |
|---|---|
| Exit code 127 / "command not found" | Binary not installed in sandbox image |
| "permission denied" / EACCES | Workspace mounted read-only |
| "network is unreachable" / DNS fail | --network none is enabled |
| "read-only file system" / EROFS | Writing outside workspace volume |
| "no space left" / ENOSPC | Disk/memory exhausted in container |
| "no such file" | File doesn't exist in sandbox |
Priority-based: checks exit code 127 first, then pattern matches in priority order.
F. Channel Formatting Hints — Platform-Specific
Injected into system prompt based on channel type:
- Zalo: "Use plain text, no markdown, no HTML"
- Group chat: Instructions on using
NO_REPLYtoken when a message doesn't need a response
G. Task Creation Guidance — Guiding Task Creation
When the model lists/searches team tasks, the response includes:
- List of members + their models
- 4 rules: self-contained descriptions, split complex tasks, match complexity to model strength, ensure independence
Especially useful for small models (MiniMax, Qwen) acting as lead agents — they tend to create vague tasks or misassign complexity.
H. System Prompt Reminders — Recency Zone Reinforcement
Injected at the end of the system prompt (recency zone — where the model pays the most attention):
- Remind to search memory before answering
- Reinforce persona/character if agent has custom identity
- Bootstrap onboarding nudges for new users
3.3 Summary Table
| Hint | Trigger | Ephemeral? | Injection Point |
|---|---|---|---|
| Budget 75% | iteration == max*3/4, no text yet | Yes | Message list (Phase 4) |
| Output Truncation | finish_reason == "length" |
Yes | Message list (Phase 4) |
| Skill Nudge 70% | iteration/max >= 0.70 | Yes | Message list (Phase 4) |
| Skill Nudge 90% | iteration/max >= 0.90 | Yes | Message list (Phase 4) |
| Team Progress | iteration % 6 == 0, has TeamTaskID | Yes | Message list (Phase 4) |
| Sandbox Error | Pattern match on stderr/exit code | No | Tool result suffix (Phase 5) |
| Channel Format | Channel type == "zalo" etc. | No | System prompt (Phase 3) |
| Group Reply | PeerKind == "group" | No | System prompt (Phase 3) |
| Task Creation | team_tasks list/search response | No | Tool result JSON (Phase 5) |
| Memory/Persona | Config flags | No | System prompt (Phase 3) |
4. Guard System (Safety Boundaries)
Guards create hard boundaries — they don't depend on model compliance. Even though small models are more susceptible to prompt injection, guards ensure dangerous behavior is blocked at the infrastructure level.
4.1 4-Layer Guard Architecture
flowchart TD
INPUT([User Message]) --> IG
subgraph IG["Layer 1: InputGuard"]
IG1["6 regex patterns"]
IG2["Action: log / warn / block / off"]
end
IG --> LOOP([Agent Loop iterations])
LOOP --> TOOL{Tool call?}
TOOL -->|exec / shell| SDG
TOOL -->|write SKILL.md| SCG
TOOL -->|No| RESP
subgraph SDG["Layer 2: Shell Deny Groups"]
SDG1["15 categories"]
SDG2["200+ regex patterns"]
SDG3["Per-agent overrides"]
end
subgraph SCG["Layer 3: Skill Content Guard"]
SCG1["25 security rules"]
SCG2["Line-by-line scan"]
end
SDG --> RESP([Response])
SCG --> RESP
RESP --> VG
subgraph VG["Layer 4: Voice Guard"]
VG1["Error → friendly fallback"]
VG2["Telegram voice only"]
end
4.2 Layer 1: InputGuard — Prompt Injection Detection
Scans every user message before it enters the agent loop.
| Pattern | Detects |
|---|---|
ignore_instructions |
"Ignore all previous instructions..." |
role_override |
"You are now a...", "Pretend you are..." |
system_tags |
<system>, [SYSTEM], [INST], <<SYS>>, <|im_start|>system |
instruction_injection |
"New instructions:", "Override:", "System prompt:" |
null_bytes |
\x00 characters (null byte injection) |
delimiter_escape |
"End of system", </instructions>, </prompt> |
4 action modes (config gateway.injection_action):
log— logs info, doesn't blockwarn— logs warning (default)block— rejects message, returns erroroff— disables scanning
Scans at 3 points:
- Incoming user message (Phase 2)
- Mid-run injected messages (
processInjectedMessage) - Results from
web_fetch/web_search(tool result scan)
Small models are more susceptible to injection than large models → InputGuard plays a more critical role with small models.
4.3 Layer 2: Shell Deny Groups — Command Safety
15 deny groups, all ON by default — admin must explicitly allow.
| Group | Example Patterns |
|---|---|
destructive_ops |
rm -rf, mkfs, dd if=, shutdown, fork bomb |
data_exfiltration |
curl | sh, wget POST, DNS lookup, /dev/tcp/ |
reverse_shell |
nc, socat, openssl s_client, Python/Perl socket |
code_injection |
eval $(), base64 -d | sh |
privilege_escalation |
sudo, su, doas, pkexec, runuser, nsenter |
dangerous_paths |
chmod/chown on system paths |
env_injection |
LD_PRELOAD, BASH_ENV, GIT_EXTERNAL_DIFF |
container_escape |
Docker socket, /proc/sys/, /sys/ |
crypto_mining |
xmrig, cpuminer, stratum+tcp:// |
filter_bypass |
sed -e, git --exec, rg --pre |
network_recon |
nmap, ssh/scp/sftp, tunneling |
package_install |
pip install, npm install, apk add |
persistence |
crontab, shell RC file writes |
process_control |
kill -9, killall, pkill |
env_dump |
env, printenv, /proc/*/environ, GOCLAW_* |
Special case: package_install → approval flow (not hard deny), all others → hard block.
Per-agent override: Admin can allow specific groups for specific agents via DB config.
4.4 Layer 3: Skill Content Guard
Scans SKILL.md content before writing the file. 25 regex rules detect:
- Shell injection & destructive ops
- Code obfuscation (
base64 -d,eval,curl | sh) - Credential theft (
/etc/passwd,.ssh/id_rsa,AWS_SECRET_ACCESS_KEY) - Path traversal (
../../..) - SQL injection (
DROP TABLE,TRUNCATE) - Privilege escalation (
sudo,chmod 777)
Hard reject — any violation → file is not written.
4.5 Layer 4: Voice Guard
Specialized for Telegram voice agents:
- When voice/audio processing encounters technical errors
- Replaces error messages with friendly fallback for end users
- Not a security guard — it's a UX guard
4.6 Summary Table
| Guard | Scope | Default Action | Configurable? |
|---|---|---|---|
| InputGuard | All user messages + injected + tool results | warn | Yes (log/warn/block/off) |
| Shell Deny | All exec/shell tool calls |
hard block | Yes (per-agent group override) |
| Skill Content | SKILL.md file writes | hard reject | No |
| Voice Guard | Telegram voice error replies | friendly fallback | No |
5. How the 3 Systems Work Together
flowchart TD
REQ([User Request]) --> TRACK_ROUTE
subgraph TRACK["TRACK — Where to run?"]
TRACK_ROUTE["Lane routing<br/>(main / subagent / team / cron)"]
TRACK_ROUTE --> QUEUE["Session queue<br/>(serialize per session)"]
QUEUE --> THROTTLE["Adaptive throttle<br/>(reduce concurrency when context is nearly full)"]
end
THROTTLE --> GUARD_INPUT
subgraph GUARD["GUARD — What's allowed?"]
GUARD_INPUT["InputGuard scan<br/>(6 injection patterns)"]
GUARD_INPUT --> LOOP_START
LOOP_START["Agent Loop starts"] --> TOOL_CALL{Tool call?}
TOOL_CALL -->|exec/shell| SHELL_DENY["Shell Deny Groups<br/>(200+ patterns)"]
TOOL_CALL -->|write skill| SKILL_GUARD["Skill Content Guard<br/>(25 rules)"]
TOOL_CALL -->|other| SAFE[Allow]
SHELL_DENY --> RESULT
SKILL_GUARD --> RESULT
SAFE --> RESULT
end
RESULT["Tool Result"] --> HINT_INJECT
subgraph HINT["HINT — What should it do?"]
HINT_INJECT["Sandbox hints<br/>(attached to error output)"]
HINT_INJECT --> BUDGET["Budget hints<br/>(75% warning, truncation recovery)"]
BUDGET --> PROGRESS["Progress nudges<br/>(team task every 6 iter)"]
PROGRESS --> SKILL_EVO["Skill evolution nudges<br/>(70%/90% budget)"]
end
SKILL_EVO --> LLM([LLM continues iteration])
LLM --> TOOL_CALL
Role Summary
| Layer | Question | Mechanism | Nature |
|---|---|---|---|
| Track | Where to run? | Lane + Queue + Semaphore | Infrastructure, invisible to model |
| Guard | What's allowed? | Regex pattern matching, hard deny | Security boundary, model-agnostic |
| Hint | What should it do? | Message injection into conversation | Soft guidance, model can ignore |
Why 3 Layers Instead of 1?
- Track doesn't depend on the model — operates at the scheduler level
- Guard doesn't trust the model — blocks dangerous behavior regardless of instructions
- Hint collaborates with the model — provides context that small models lack
When using large models (Claude, GPT-4): Guard is still needed, Hint is less critical. When using small models (MiniMax, Qwen, Gemini Flash): all 3 layers are critical.
Reference Keywords
| Keyword | File/Package |
|---|---|
Scheduler, LaneManager, Lane |
internal/scheduler/ |
SessionQueue, DefaultLanes() |
internal/scheduler/lanes.go, queue.go |
InputGuard, Scan(), guardPattern |
internal/agent/input_guard.go |
DenyGroupRegistry, DenyGroup |
internal/tools/shell_deny_groups.go |
GuardSkillContent(), GuardViolation |
internal/skills/guard.go |
MaybeSandboxHint(), MaybeFsBridgeHint() |
internal/tools/sandbox_hints.go |
buildChannelFormattingHint() |
internal/agent/systemprompt_sections.go |
buildCreateHint() |
internal/tools/team_tasks_read.go |
IsSilentReply() (NO_REPLY) |
internal/agent/sanitize.go |
i18n.MsgSkillNudge70Pct / 90Pct |
internal/i18n/ |
WithIterationProgress() |
internal/tools/ |