Claw Planet reference · v0a · first cut
last updated 2026-05-07
§ 1 Overview / § 1.3

Architecture diagram

How an OpenClaw deployment fits together — Gateway, agent runtime, channels, tools, models, memory — visualised as a flat schematic.

Note on verification: Diagram derived from the architecture described in docs.openclaw.ai/concepts/agent and the Gateway architecture page. Not yet running on any Sush-owned hardware.

The whole shape on one page

OpenClaw doesn’t ship a glossy architecture poster. The official docs describe each layer in prose and code samples, scattered across half a dozen pages. So below is our flat schematic of the whole runtime — every layer, every box you’ll meet — annotated and cross-linked.

Read this once. After that, every other page on the site refers back to a piece of this diagram (often by §-number).

Important: this diagram describes the single-agent default. Multi-agent setups (different agents per channel, or per project) are a layered config on top of what’s below. See §2.7 Multi-agent routing (when that ships).

The five layers

Conceptually, OpenClaw arranges itself in five horizontal layers, top to bottom:

Layer              | What sits here                                               | What it does
-------------------|--------------------------------------------------------------|-------------
Channels           | WhatsApp · Telegram · Slack · iMessage · Discord · 19+ more  | Receive messages from the world; deliver responses back
Gateway            | The one daemon process                                       | Routes messages, manages sessions, mediates tool calls
Agent runtime      | One agent per Gateway, built on the Pi core                  | Holds the system prompt, calls the model, decides what to do
Workspace + Memory | ~/.openclaw/workspace + memory engine                        | Persists identity, rules, and accumulated context
Models + Tools     | Anthropic / OpenAI / Google / local + read·exec·edit·write·browser·canvas·MCP | The brain (models) and the hands (tools) the agent uses

Each layer talks only to its immediate neighbours. Channels never talk to models directly — everything goes through the Gateway and the agent runtime.
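The neighbours-only rule can be encoded as a one-line check. This is a toy model of the stacking order in the table above, not OpenClaw code; the layer names are ours:

```python
# Toy model of the five layers, top to bottom, as listed in the table.
# Illustrative only -- the rule comes from the docs, the code from us.
LAYERS = [
    "channels",
    "gateway",
    "agent_runtime",
    "workspace_memory",
    "models_tools",
]

def may_talk(a: str, b: str) -> bool:
    """A layer may only talk to its immediate neighbours in the stack."""
    return abs(LAYERS.index(a) - LAYERS.index(b)) == 1

print(may_talk("channels", "gateway"))       # adjacent: allowed
print(may_talk("channels", "models_tools"))  # not adjacent: must route via the Gateway
```

Channels-to-models is the case called out in the prose: three layers apart, so it always goes through the Gateway and the agent runtime.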

A walkthrough of one message

Here’s the flow when you message your agent on Slack with “What’s on my calendar tomorrow?”:

  1. Slack channel receives the message. The DM-policy check runs: are you on the allowlist? (Yes — you paired earlier.)
  2. The channel hands the message to the Gateway.
  3. The Gateway looks up your session (or creates a new one) and routes the message to the agent runtime.
  4. The agent runtime loads the workspace files (SOUL, AGENTS, USER, etc.) into the system prompt.
  5. The system prompt + your message + any active memory go to the configured model (e.g. anthropic/claude-3-5-sonnet).
  6. The model decides this needs the calendar tool. It returns a tool call.
  7. The agent runtime invokes the tool (could be MCP-bridged, could be built-in exec calling a script).
  8. Tool result goes back to the model for synthesis.
  9. The model returns the final response.
  10. The agent runtime writes it to the session JSONL, then hands it back to the Gateway.
  11. The Gateway delivers it through the Slack channel as a reply.

That’s eleven hops. All of them happen on your machine except the model API calls (steps 5 and 8 cross the network to the model provider).
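Stripped of the boxes, the eleven hops reduce to a short loop. Here is a toy simulation of that flow — every function, the allowlist, and the canned replies are invented for illustration; none of this is OpenClaw source:

```python
# Toy simulation of the eleven-hop flow above. Stub functions only;
# the step numbers in the comments map back to the walkthrough.

ALLOWLIST = {"sush"}  # invented: stands in for the DM-policy allowlist

def model_call(prompt, content, want_tool):
    # Stand-in for the network hops to the model provider (steps 5 and 8).
    if want_tool:
        return {"tool": "calendar"}            # step 6: model returns a tool call
    return {"text": "Tomorrow: 2 meetings"}    # step 9: final synthesis

def handle_inbound(sender, message):
    if sender not in ALLOWLIST:                # 1. DM-policy check in the channel
        return None
    session = []                               # 2-3. Gateway finds/creates a session
    prompt = "SOUL + AGENTS + USER + ..."      # 4. workspace files -> system prompt
    reply = model_call(prompt, message, want_tool=True)      # 5. network hop
    if "tool" in reply:                        # 6. model asked for a tool
        result = f"ran {reply['tool']}"        # 7. runtime invokes it locally
        reply = model_call(prompt, result, want_tool=False)  # 8. network hop
    session.append(reply["text"])              # 10. persist to the session JSONL
    return reply["text"]                       # 11. Gateway -> Slack reply

print(handle_inbound("sush", "What's on my calendar tomorrow?"))
print(handle_inbound("stranger", "hi"))        # fails the allowlist check
```

The only branching is the tool loop in the middle; everything else is a straight pipe from channel to model and back.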

The diagram itself

A flat schematic showing the same flow as boxes and lines:

┌────────────┐   ┌──────────────┐   ┌────────────┐   ┌──────────────┐
│  WhatsApp  │   │   Telegram   │   │   Slack    │   │  Discord ... │   ← Channels
└──────┬─────┘   └──────┬───────┘   └─────┬──────┘   └──────┬───────┘
       │                │                 │                 │
       └────────────────┴─────────┬───────┴─────────────────┘

                          ┌───────▼────────┐
                          │    Gateway     │  ← One per machine
                          │  (daemon, ~18789) │
                          └───────┬────────┘

                       ┌──────────▼───────────┐
                       │   Agent runtime      │  ← Pi core
                       │  (one per Gateway)   │
                       └──────┬─────┬─────────┘
              ┌───────────────┘     │
              │                     │
   ┌──────────▼──────────┐  ┌───────▼─────────┐
   │  Workspace files    │  │  Active session │  ← Persistence
   │  (SOUL · AGENTS ·   │  │  (JSONL on disk)│
   │   USER · TOOLS ...) │  │                 │
   └─────────────────────┘  └─────────────────┘

   ┌──────────▼──────────┐
   │   Memory engine     │  ← Long-term
   │ (builtin / Honcho / │
   │       QMD)          │
   └─────────────────────┘

       Tools called by the agent runtime:
   ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
   │  read   │  exec   │  edit   │  write  │ browser │  canvas │
   └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
   ┌─────────┐
   │   MCP   │  (bridge to external MCP servers)
   └─────────┘

       Models called by the agent runtime:
   ┌──────────────────────────────────────────────┐
   │  anthropic / openai / google / openrouter /  │
   │  local Ollama / self-hosted LLMs             │
   └──────────────────────────────────────────────┘

A more polished SVG version of the same diagram appears in the front-matter banner of the home page — exactly the same shape, just rendered.

Why this layering matters

Three things to notice:

  1. The Gateway is the only thing exposed to the network — channels are clients of it, models are servers it calls, tools are local. Everything routes through one daemon. That’s good for security (one process to harden) and bad for operations (single point of failure — if the Gateway crashes, every channel goes silent).

  2. The agent runtime never talks to channels directly. It talks to the Gateway. So if you write a custom channel adapter, you only need to teach the Gateway about it — the agent runtime stays the same.

  3. Workspace files are read on every session start. If you change SOUL.md, the next session sees the new SOUL. If you change it mid-session, the change won’t take effect until the next session starts. (For mid-session change, you’d use steering messages.)
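The session-start read in point 3 amounts to concatenating the workspace files into the system prompt. A minimal sketch, assuming the `~/.openclaw/workspace` path from the table and the file names mentioned in step 4 of the walkthrough; the loader itself is invented:

```python
from pathlib import Path

# Illustrative sketch: assemble the system prompt from workspace files
# at session start. Directory and file names follow the docs; the
# function is ours, not OpenClaw source.
WORKSPACE = Path.home() / ".openclaw" / "workspace"
FILES = ["SOUL.md", "AGENTS.md", "USER.md", "TOOLS.md"]

def build_system_prompt(workspace: Path = WORKSPACE) -> str:
    parts = []
    for name in FILES:
        f = workspace / name
        if f.exists():            # a missing file is simply skipped
            parts.append(f.read_text())
    return "\n\n".join(parts)

# Edits to SOUL.md land here only when this runs again -- i.e. at the
# next session start, which is why mid-session changes need steering.
```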

What the diagram doesn’t show

  • Sandbox boundaries. When you turn on agents.defaults.sandbox.mode: "non-main", non-main sessions get isolated (Docker default). The diagram above shows the main session shape; sandboxed sessions wrap parts of it.
  • Multi-agent routing. Multiple agents per Gateway, each with its own workspace and sessions. We’ll cover that in §2.7 Multi-agent routing (when it ships).
  • Voice nodes. macOS / iOS / Android voice apps connect to the Gateway via Bonjour discovery — they’re channels, just on different transports.
  • Live Canvas. A canvas tool that the agent can drive to render visual artefacts. macOS-first, iOS in beta. It’s a special tool from the model’s perspective.
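The sandbox bullet above hinges on one config key. Here is a toy decision rule for how we read `agents.defaults.sandbox.mode: "non-main"` — the `"off"` value and the catch-all branch are our assumptions, not confirmed behaviour:

```python
# Toy routing rule for the sandbox note above. The "non-main" value
# comes from the config key quoted in the text; "off" and the
# everything-else branch are our assumptions.
def sandboxed(session_kind: str, mode: str = "non-main") -> bool:
    if mode == "off":
        return False
    if mode == "non-main":
        return session_kind != "main"
    return True  # assumed: any other mode sandboxes every session

print(sandboxed("main"))   # main session keeps the unsandboxed shape in the diagram
print(sandboxed("cron"))   # non-main sessions get the Docker-default isolation
```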

What we are NOT going to claim

We have not run OpenClaw end-to-end yet. The diagram describes the architecture in the docs and the README. Where we say “the message flows through X then Y,” we mean “the docs say it flows through X then Y.” When we eventually run it on Sush’s laptop and watch the message hops with --verbose, this page will get promoted from sourced-only to tested-by-sush with the real session log.

Sources

See also