Claw Planet reference · v0a · first cut
last updated 2026-05-07
§ 3 Connections / § 3.4

Models

The brain. Which model providers OpenClaw supports, how to configure model refs, and how to think about model failover and cost.

Note on verification: Compiled from the official model docs + README sponsor section. Specific model performance comparisons are not docs-derived; those are general-knowledge claims about model quality at last review.

What “model” means here

The model is the LLM — the brain that decides what the agent says. OpenClaw is model-agnostic: it talks to whichever provider you’ve configured. You can switch models without changing the agent runtime, the workspace, or the channels.

The model is the most expensive component in your stack (in API cost) and the most consequential (in output quality). Both are worth thinking about up front.

Supported providers

From the docs and README:

Provider             | OAuth                           | API key | Recommended for
Anthropic            |                                 | ✓       | Claude family — strongest agentic reasoning at last review
OpenAI               | ✓ (ChatGPT/Codex subscriptions) | ✓       | If you already pay for ChatGPT Plus/Pro/Codex
Google               |                                 | ✓       | Gemini family; cheap for chat
OpenRouter           |                                 | ✓       | Multi-model routing; good for trying many models with one key
Local (Ollama, etc.) | n/a                             | n/a     | Privacy-sensitive or offline; smaller models only on consumer hardware
Custom providers     | varies                          | varies  | Self-hosted commercial inference, Azure OpenAI, AWS Bedrock

The README’s “Sponsors” section lists OpenAI as the OAuth-capable subscription provider. If you’re paying for ChatGPT Plus, your subscription auth flows through directly, with no separate API key.

Model refs (the format)

Model refs in config use provider/model:

{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-3-5-sonnet"
    }
  }
}

For OpenRouter-style nested refs, include the provider prefix:

{ "model": "openrouter/moonshotai/kimi-k2" }

If you omit the provider, OpenClaw tries:

  1. An alias defined in your config
  2. A unique configured-provider match for that exact model id
  3. The configured default provider as fallback

If the configured default provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model pair rather than surfacing a stale default that points at a removed provider.
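The resolution order above can be sketched roughly as follows. The function and config field names (aliases, defaultProvider, providers) are illustrative assumptions, not OpenClaw's actual internals:

```python
# Illustrative sketch of the model-ref resolution order described above.
# Names and config shape are hypothetical, not OpenClaw's real API.

def resolve_model_ref(ref: str, config: dict) -> str:
    """Resolve a model ref to a full 'provider/model' string."""
    providers = config.get("providers", {})
    # A ref with a slash already names its provider; the remainder may
    # itself contain slashes (e.g. openrouter/moonshotai/kimi-k2).
    provider, sep, _rest = ref.partition("/")
    if sep and provider in providers:
        return ref
    # 1. An alias defined in your config
    alias = config.get("aliases", {}).get(ref)
    if alias:
        return alias
    # 2. A unique configured-provider match for that exact model id
    matches = [p for p, cfg in providers.items()
               if ref in cfg.get("models", [])]
    if len(matches) == 1:
        return f"{matches[0]}/{ref}"
    # 3. The configured default provider as fallback
    default = config.get("defaultProvider")
    if default:
        return f"{default}/{ref}"
    raise ValueError(f"cannot resolve model ref: {ref}")
```

The point of the sketch is the precedence: an explicit provider prefix always wins, aliases beat exact-id matching, and the default provider is the last resort.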

Model selection in 2026

This section is opinion as of 2026-05. The model landscape changes fast. Verify against current benchmarks before locking in.

For agentic work (tool use, multi-step reasoning)

  • anthropic/claude-3-5-sonnet — strongest agentic reasoning. The default in many serious deployments.
  • openai/gpt-4o — close second; faster, similar tool use quality.
  • google/gemini-2-flash-exp — cheaper; quality close enough for many tasks.

For cost-sensitive chat (no heavy tool use)

  • anthropic/claude-3-5-haiku — cheaper Claude variant; surprisingly capable for simple tool use.
  • openai/gpt-4o-mini — cheap-and-cheerful default for low-stakes tasks.
  • google/gemini-2-flash-lite — even cheaper.

For privacy / offline (local)

  • Llama 3.2 3B — runs on a Pi 5 8GB. Quality noticeably below cloud models but acceptable for narrow tasks.
  • Llama 3.1 8B — needs ~8GB RAM available; runs on a Mac M1+ or a Linux desktop with 16GB+.
  • Llama 3.3 70B — needs serious GPU; not for laptops.
# Quick local model setup with Ollama
ollama pull llama3.2:3b
# Then in openclaw.json:
{ "agents": { "defaults": { "model": "ollama/llama3.2:3b" } } }

Model failover

OpenClaw supports configuring multiple models, with failover when the primary fails. From the model failover docs:

{
  "agents": {
    "defaults": {
      "models": [
        "anthropic/claude-3-5-sonnet",
        "openai/gpt-4o",
        "google/gemini-2-flash-exp"
      ]
    }
  }
}

If the primary fails (rate limit, outage, auth error), OpenClaw tries the next. Auth-profile rotation is also supported — multiple credentials per provider for parallel quotas.

Why this matters: model providers have outages. If your only model is OpenAI and OpenAI is down for 30 minutes, your agent is silent for 30 minutes. With failover, you degrade gracefully.
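The failover behavior amounts to a simple loop over the configured list. A minimal sketch, assuming a generic complete(model, prompt) callable; the names and error handling are illustrative, not OpenClaw code:

```python
# Minimal failover sketch: try each configured model in order, moving on
# when a call raises. complete() is an assumed stand-in for a provider call.

def complete_with_failover(models, complete, prompt):
    """Try models in order; return (model, response) from the first success."""
    errors = []
    for model in models:
        try:
            return model, complete(model, prompt)
        except Exception as exc:  # rate limit, outage, auth error, ...
            errors.append((model, exc))
    raise RuntimeError(f"all models failed: {errors}")
```

Order the list by preference, not price: the first entry is your everyday model, and the later ones only see traffic when something upstream is broken.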

Cost notes (rough 2026-05 figures)

API call cost per 1M tokens at last check:

Model             | Input ($/1M) | Output ($/1M)
Claude 3.5 Sonnet | $3.00        | $15.00
Claude 3.5 Haiku  | $0.80        | $4.00
GPT-4o            | $2.50        | $10.00
GPT-4o mini       | $0.15        | $0.60
Gemini 2 Flash    | $0.075       | $0.30

A typical “personal assistant agent” usage profile (~1000 messages/month, ~2000 tokens/message average) lands at:

  • Claude 3.5 Sonnet: ~$15–25/month
  • GPT-4o mini: ~$1–3/month
  • Gemini 2 Flash: ~$0.50–1.50/month

Heavy agentic workflows (long sessions, multiple tool calls per message) can be 5–10× this.
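As a sanity check, these estimates are just token volume times the per-million rates. A minimal sketch, assuming a 70/30 input/output token split (an assumption, not a docs figure):

```python
# Back-of-envelope monthly cost: messages * tokens per message, split
# between input and output tokens. The 70/30 split is an assumption.
# Rates are the 2026-05 per-1M-token prices quoted above.

def monthly_cost(messages, tokens_per_msg, in_rate, out_rate, input_share=0.7):
    total = messages * tokens_per_msg
    return (total * input_share * in_rate
            + total * (1 - input_share) * out_rate) / 1e6

sonnet = monthly_cost(1000, 2000, 3.00, 15.00)  # ~$13.20/month
mini = monthly_cost(1000, 2000, 0.15, 0.60)     # ~$0.57/month
```

The published ranges above run somewhat higher, likely because system prompts and tool schemas add input tokens to every call.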

Cost-runaway pattern: see §6.3 pattern #3. Set provider-side spending alerts and hard caps before you tinker.

Auth profiles (per-agent)

Auth profiles live at ~/.openclaw/agents/<agentId>/agent/auth-profiles.json:

{
  "anthropic": { "apiKey": "sk-ant-..." },
  "openai":    { "apiKey": "sk-..." },
  "ollama":    { "host": "http://localhost:11434" }
}

Or via OAuth for OpenAI:

openclaw auth login openai
# opens a browser, signs in with ChatGPT/Codex subscription
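If you script around that file, the shape is just provider name to credential object. A small hypothetical helper; the path matches the docs above, but the functions themselves are not part of OpenClaw:

```python
# Illustrative helpers for reading an agent's auth-profiles.json.
# The file location matches the docs; the helpers are a sketch only.
import json
from pathlib import Path

def auth_profile_path(agent_id: str) -> Path:
    return (Path.home() / ".openclaw" / "agents" / agent_id
            / "agent" / "auth-profiles.json")

def pick_credentials(profiles_json: str, provider: str) -> dict:
    """Return the credential object for one provider from the file's JSON text."""
    profiles = json.loads(profiles_json)
    if provider not in profiles:
        raise KeyError(f"no credentials configured for provider {provider!r}")
    return profiles[provider]
```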

Things to try

  • Run the agent on Sonnet for a week, then Haiku for a week. Notice which tasks degrade. Adjust your default based on real workload, not first-principles.
  • Configure failover Claude → GPT-4o → Gemini. Stress-test by temporarily revoking Claude’s key. Watch the agent gracefully fall through.
  • Try a local 3B model on a Pi. See §2.6 Raspberry Pi for the setup. Quality calibration is worthwhile even if you go back to hosted models afterwards.
  • Set spending alerts at $5 / $10 / $25 thresholds on your model provider account. Cheap insurance.

What we are NOT going to claim

Model quality rankings change with every major release. The 2026-05 picks above are best-effort at time of writing. Benchmarks for agentic (tool-using, multi-step) work are still maturing; published benchmarks usually measure single-turn tasks. Trust your own A/B tests over any single source.

Sources