CLI coding agents — Claude Code · Codex CLI · Gemini CLI · Copilot CLI
Four shell-native AI coding CLIs. Same shape, different identities. Where each wins, where each lags, when to switch — and why most thoughtful people end up using two.
What this comparison is
With each of the four, you install a binary, run a command in your project, and the CLI reads files, runs commands, edits code, and talks to MCP servers, all from your terminal.
From a distance they look the same. Once you actually use them, they diverge fast — different default models, different auth, different sandboxing philosophies, different plan modes. This is the “which CLI fits which job” cut. Not a benchmark. Not a winner table. The honest take at the bottom is which one I’d pick for which kind of work.
| Dimension | Claude Code | Codex CLI | Gemini CLI | GitHub Copilot CLI |
|---|---|---|---|---|
| Install (recommended) | `curl -fsSL https://claude.ai/install.sh \| bash`. npm install deprecated; brew, WinGet, and PowerShell installer also supported | `npm install -g @openai/codex`. `brew install --cask codex` or binary release also supported | `npm install -g @google/gemini-cli`. `brew install gemini-cli` or `npx` (no install) also supported | `curl -fsSL https://gh.io/copilot-install \| bash`. Renamed from gh-copilot (deprecated Oct 2025) to standalone copilot-cli; brew/winget/npm also supported |
| Auth | Subscription OAuth (Pro/Max/Team/Enterprise) or API key. Also: Bedrock, Vertex, Foundry env vars; OAuth tokens; `apiKeyHelper` for rotating creds | Sign in with ChatGPT (Plus/Pro/Business/Edu/Enterprise) or API key. Also: device code (beta), enterprise access tokens, custom model providers | Personal Google OAuth, Gemini API key, or Vertex AI. OAuth path has no API key: just run `gemini` | GitHub OAuth (browser `/login`) or fine-grained PAT with Copilot Requests scope. Subject to org/enterprise Copilot policy |
| Free tier | None: paid plan required. Pro from $17/mo billed annually ($20/mo monthly), or pay-per-token via Anthropic API | None: ChatGPT Plus or API key required. The old $5 trial credit no longer exists in current docs | 1,000 req/day on personal Google OAuth (60 req/min cap) or AI Studio API key. Both free-tier paths give the same daily cap; OAuth uses no API key, AI Studio key gives model-selection control | Available with GitHub Copilot Free (limited premium requests). Each prompt counts against the plan's monthly premium-request quota |
| MCP support | Client (stdio · SSE · streamable-HTTP). Project-scoped `.mcp.json` + user-scoped `~/.claude.json`; not exposed as an MCP server itself | Client (stdio); built-in MCPs as first-class runtime servers (v0.130.0+). Destructive MCP tool calls always require approval | Client (stdio · SSE · streamable-HTTP). Configure in `~/.gemini/settings.json`; resource tools added v0.40.0; OAuth-aware | Client (stdio); ships with GitHub's MCP server by default. Custom servers via `/mcp`; OAuth client-credentials grant for headless auth; experimental MCP Tasks |
| Sandboxing | Permission modes (no container/VM). `default` · `acceptEdits` · `plan` · `auto` · `dontAsk` · `bypassPermissions`; protected: `.git`, `.claude/` | OS-enforced sandbox: `workspace-write` · `read-only` · `danger-full-access`. Network off by default; web search cached/live/disabled; `auto_review` checks data exfiltration | Opt-in sandbox: macOS Seatbelt · Docker/Podman · Windows Native · gVisor · LXC. Disabled by default; enable via `-s` flag or `GEMINI_SANDBOX` env var | Trust + per-tool approval (no container). Per-directory trust; `--allow-tool` · `--deny-tool` · `--allow-all-tools`; read-only `gh` cmds auto-approved |
| Plan mode | First-class: Shift+Tab cycles `default` → `acceptEdits` → `plan`. Also `/plan` prefix, `--permission-mode plan`, or settings default | No dedicated plan mode. Closest: `--sandbox read-only --ask-for-approval on-request` | First-class plan mode (enabled by default). Routes Pro for planning + Flash for implementation; `--yolo` flag bypasses (CLI-only) | First-class: Shift+Tab cycles ask/execute → plan. Plus experimental Autopilot mode via `/autopilot` or Shift+Tab cycle |
| Trusted folder behaviour | No folder-trust dialog. Uses permission rules + per-project MCP server approval instead | Detects version-control state; recommends Auto for git repos. Read-only mode recommended for non-VC folders; no formal trusted-folders file | Off by default; opt in via `settings.json`. When enabled: first-run dialog; untrusted runs in safe mode (no MCP, no extensions, no `.env`) | Prompted on first use per directory. Choose session-only · remember this folder · exit; persistent across sessions since v1.0.37 |
| Default model | Claude Sonnet 4.6 (auto-mode uses Opus 4.7 on eligible plans). Fast-mode default bumped to Opus 4.7 in 2.1.142 (`CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE=1` pins back to 4.6); switch via `/model` · `--model` · `ANTHROPIC_MODEL` env | gpt-5.5 (recommended); gpt-5.4 (alt high-end); gpt-5.4-mini (fast); gpt-5.3-codex (coding specialist). Also gpt-5.3-codex-spark research preview (Pro subs only). Switch via `/model` · `-m` flag · `config.toml` | gemini-2.5-pro (stable channel); gemini-3-pro-preview (preview channel). Per Gemini CLI docs: Auto (Gemini 3) routes between gemini-3-pro-preview and gemini-3-flash-preview; on the API side, gemini-3-pro-preview redirects to gemini-3.1-pro-preview since 2026-03-09. Gemma 4 also enabled by default via Gemini API since v0.42.0; plan mode auto-routes Pro for planning, Flash for implementation | Claude Sonnet 4.5 (per README). Also GPT-5 and others via `/model` picker; default subject to change |
| OS support | macOS · Linux · Windows native (+ WSL) | macOS · Linux · Windows | macOS · Linux · Windows (incl. Windows Native Sandbox) | macOS · Linux · Windows (PowerShell 6+) + WSL |
The four identities
If you only remember one thing per CLI:
- Claude Code: Anthropic’s premier coding agent. Subscription-first (Pro/Max/Team/Enterprise OAuth). The most polished plan/permission model of the four (`plan`/`acceptEdits`/`auto` modes cycled with Shift+Tab). Recently moved off npm to a curl installer as the recommended path.
- Codex CLI: OpenAI’s terminal coder. OS-enforced sandbox (`workspace-write` default, network off, web search defaults to a cached index). Strong default safety posture, less mode-switching ceremony. ChatGPT-account-first; the old $5 free credit is gone.
- Gemini CLI: Google’s terminal coder. The only one of the four with a meaningful no-cost tier: 1,000 req/day on a personal Google account (no card needed), or the same daily cap via an AI Studio API key with explicit model-selection control. The widest sandbox backend menu (Seatbelt, Docker, Podman, Windows Native, gVisor, LXC). Plan Mode is first-class and routes Pro for planning + Flash for implementation.
- GitHub Copilot CLI: GitHub’s terminal coder. Renamed in October 2025: the old `gh-copilot` extension is deprecated; the current product is the standalone `copilot-cli` binary. Defaults to Claude Sonnet 4.5 (not a GitHub-owned model). Ships with GitHub’s MCP server pre-wired. The CLI that’s easiest to use if your work already lives on github.com.
Where each wins
Claude Code
- The plan/approval model. `plan` mode is a first-class permission state: you can ask Claude to research and propose a multi-file edit before it writes anything, then approve, refine, or hand-edit the plan with `Ctrl+G`. Shift+Tab cycles through `default → acceptEdits → plan` mid-session. Of the four, this is the one that feels most like working with a thoughtful contractor rather than a junior who just starts typing.
- The breadth of the auth model. Six different ways to authenticate (subscription OAuth, API key, env-var token, OAuth long-lived token, `apiKeyHelper` for rotating creds, plus Bedrock/Vertex/Foundry routing). If you need to point Claude Code at a specific model gateway or rotating-credential setup, it’s the most accommodating.
- Auto mode with a classifier. On Max/Team/Enterprise/API, `auto` mode runs autonomously with a background classifier model checking each action for destructive or exfiltration-style behaviour. The other three either prompt for everything or rely on a sandbox.
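To make the mode cycle concrete, here is a minimal Python sketch of the Shift+Tab cycle described above. It is an illustrative model, not Claude Code's implementation: the function names are hypothetical, the wrap-around from `plan` back to `default` is an assumption, and the approval rule (edits auto-approved in `acceptEdits`, shell commands still prompted) is a simplified reading of how the modes behave.

```python
# Illustrative model only; names and rules here are assumptions, not Claude Code internals.
CYCLE = ("default", "acceptEdits", "plan")


def next_mode(mode: str) -> str:
    """Return the mode one Shift+Tab press would move to.

    Wrapping from "plan" back to "default" is an assumption.
    """
    index = CYCLE.index(mode)  # raises ValueError for an unknown mode
    return CYCLE[(index + 1) % len(CYCLE)]


def needs_approval(mode: str, action: str) -> bool:
    """Simplified rule of thumb: must a human approve this action?

    `action` is "edit" or "shell" (hypothetical labels for this sketch).
    """
    if action not in ("edit", "shell"):
        raise ValueError(f"unknown action: {action}")
    if mode == "plan":
        # In plan mode nothing is written until the plan itself is approved.
        return True
    if mode == "acceptEdits":
        # Assumption: file edits go through, shell commands still prompt.
        return action == "shell"
    if mode == "default":
        return True
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    mode = "default"
    for _ in range(3):
        print(mode, "-> edit needs approval:", needs_approval(mode, "edit"))
        mode = next_mode(mode)
```

The point of the sketch is that the mode is a single piece of session state, which is why switching mid-task is cheap.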
Codex CLI
- The strongest default sandbox. `workspace-write` mode plus network-off-by-default plus cached web search plus an `auto_review` reviewer agent that watches for data exfiltration. You don’t have to think about safety knobs; the defaults are tight.
- Custom model providers as a first-class concept. If you want to point Codex CLI at a non-OpenAI model (your own deployment, a self-hosted endpoint, another vendor via a proxy), the config supports it cleanly: Chat Completions API or Responses API, with auth via env var, OpenAI keyring, or no auth (for local models).
- The `codex-spark` research preview. A specialist model optimised for near-instant iteration on small tasks. Available on Pro subscriptions only at time of writing. Niche, but if you do a lot of rapid prompt-after-prompt iteration on small tasks, the latency difference shows.
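As a way to reason about the three sandbox modes named above, the following Python sketch models the write rule as a path check. This is a conceptual model only: Codex CLI enforces its sandbox at the OS level, not with Python path logic, and the function name `write_allowed` is hypothetical.

```python
from pathlib import Path

# Mode names are taken from the comparison table; the logic is a simplification.
SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")


def write_allowed(mode: str, workspace: str, target: str) -> bool:
    """Return True if a write to `target` would be permitted in this toy model."""
    if mode not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {mode}")
    if mode == "read-only":
        return False
    if mode == "danger-full-access":
        return True
    # workspace-write: only paths at or below the workspace root are writable.
    root = Path(workspace).resolve()
    path = Path(target).resolve()  # normalises ".." so escapes are caught
    return path == root or root in path.parents


if __name__ == "__main__":
    print(write_allowed("workspace-write", "/tmp/proj", "/tmp/proj/src/app.py"))
    print(write_allowed("workspace-write", "/tmp/proj", "/tmp/proj/../other/app.py"))
```

Resolving both paths before comparing them is the design choice that matters here: a naive string-prefix check would let `..` segments escape the workspace.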
Gemini CLI
- The free tier that’s actually free. A personal Google account (OAuth) gets you 1,000 requests/day: no card, no plan. An AI Studio API key gives you the same daily cap with explicit model-selection control. Either way, of the four, this is the one a student or a beginner can pick up without paying anything.
- The widest sandbox menu. macOS Seatbelt for native isolation. Docker/Podman for cross-platform. Windows Native Sandbox (uses `icacls` to set Low Mandatory Level on files; note that the changes persist after the sandboxed session ends, unlike VM snapshots). gVisor for Linux user-space-kernel isolation. LXC for full-system containers. If your security posture cares about which mechanism is doing the isolating, Gemini CLI lets you choose.
- Plan Mode with intelligent model routing. Plan mode is enabled by default; when active, Gemini routes the heavier Pro model for the planning phase and the cheaper Flash for execution. You get plan-quality reasoning without paying Pro rates for every line of generated code.
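The free-tier numbers quoted in this comparison (1,000 requests/day and a 60 requests/minute cap) are easier to judge with a little arithmetic. The sketch below is a back-of-the-envelope calculation; the function names are hypothetical and the limits are the ones stated in this article, which may change.

```python
# Limits as quoted in this comparison; treat them as inputs, not guarantees.
DAILY_CAP = 1_000
PER_MINUTE_CAP = 60


def minutes_to_exhaust(daily_cap: int = DAILY_CAP,
                       per_minute_cap: int = PER_MINUTE_CAP) -> float:
    """Minutes until the daily cap is gone if you run at the per-minute cap."""
    return daily_cap / per_minute_cap


def sustainable_rate(hours_worked: float, daily_cap: int = DAILY_CAP) -> float:
    """Average requests per minute that lasts the whole working window."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    # Never report a rate above the per-minute cap.
    return min(daily_cap / (hours_worked * 60), PER_MINUTE_CAP)


if __name__ == "__main__":
    print(round(minutes_to_exhaust(), 1))
    print(round(sustainable_rate(8), 2))
```

At the per-minute cap the daily allowance lasts under 17 minutes; spread across an eight-hour day it works out to roughly two requests per minute.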
GitHub Copilot CLI
- It already knows your GitHub world. Ships with GitHub’s MCP server pre-wired (issues, PRs, repo metadata, releases, workflow runs) without configuration. If your code lives on github.com, this is the lowest-friction CLI of the four for “agent that knows about my repo.”
- Plan mode plus experimental Autopilot. Shift+Tab cycles into a structured plan mode where Copilot asks clarifying questions before writing. The `--experimental` flag (or `/autopilot`) adds autopilot mode: the agent runs continuation after continuation until the task is done (capped at 5 by default via `--max-autopilot-continues`). Closer to “kick off and walk away” than the other three out of the box.
- Free plan availability. Copilot CLI ships with every GitHub Copilot plan, including the Copilot Free plan (which has a limited monthly premium-request quota). Of the four, this is the one with the lowest “I want to try a real agent CLI” cost for someone who already has a GitHub account.
- Custom agent profiles + repo-level instructions. `AGENTS.md` at the repo root, `.github/copilot-instructions.md`, and `~/.copilot/agents/` for personal agent profiles give it a richer customisation surface than the other three for “make this agent behave consistently across my team.”
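The autopilot behaviour described above (keep continuing until the task is done, with a cap of 5) is a capped loop. Here is a minimal Python sketch of that shape. It is not Copilot CLI's implementation: `run_autopilot` and `step` are hypothetical names, and whether the first run counts toward the cap is an assumption.

```python
from typing import Callable, Tuple


def run_autopilot(step: Callable[[int], bool],
                  max_continues: int = 5) -> Tuple[bool, int]:
    """Call `step` until it reports completion or the cap is reached.

    `step(i)` is a hypothetical callback that performs continuation number `i`
    and returns True when the task is finished. Returns (finished, runs_used).
    """
    for i in range(1, max_continues + 1):
        if step(i):
            return True, i
    # Cap reached without completion: hand control back to the human.
    return False, max_continues


if __name__ == "__main__":
    # A toy task that finishes on its third continuation.
    print(run_autopilot(lambda i: i == 3))
```

The cap is what separates “kick off and walk away” from an unbounded loop: when it is hit, the sketch returns control instead of retrying forever.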
Where each lags
Claude Code
- No free tier of any kind. Pro starts at $17/mo billed annually ($20/mo monthly). The alternative is pay-per-token via the Anthropic API (which works fine, but the OAuth-subscription experience is what most users come for).
- The `npm install` path is deprecated. You’re meant to use the curl/PowerShell installers or Homebrew/WinGet. If you installed it via npm out of habit, that install is no longer the recommended one.
- No built-in sandbox. The permission-mode system is sophisticated, but if your security posture wants OS-level isolation, you’re running Claude Code inside Docker or a VM yourself.
Codex CLI
- No first-class plan mode the way the other three have. The closest equivalent is `--sandbox read-only --ask-for-approval on-request`, which gets you safe browsing but not the “agent proposes a multi-step plan and you approve it” UX.
- The old $5 free credit is gone. You need a ChatGPT Plus subscription (≈$20/mo) or an API key (pay-per-token) to use it.
- No Shift+Tab mode cycling. Modes are set with flags at startup or with `/permissions` mid-session, which is less ergonomic than the cycle-and-go model.
Gemini CLI
- Trusted Folders is off by default. Some readers expect a “this folder is trusted” prompt on first launch. Gemini CLI doesn’t show one unless you opt in via `settings.json`. Documented behaviour, but easy to misread as “no trust system.”
- Windows Native Sandbox is persistent. Unlike Docker or Seatbelt, the `icacls` integrity-level changes survive after the sandbox session ends. Useful but surprising; readers should know.
GitHub Copilot CLI
- Still iterating fast. v1.0.48 as of mid-May 2026. Stable enough for daily use, but the changelog has had some breaking-ish renames (the gh-copilot deprecation in October 2025 wasn’t smooth; `docs.github.com/copilot/how-tos/use-copilot-for-common-tasks/use-copilot-in-the-cli` now redirects to a deprecation notice for the old extension). Newer features (Autopilot, MCP Tasks) are flagged experimental and shift between releases.
- Default model is not GitHub’s. Claude Sonnet 4.5 is the default, with GPT-5 and others available via `/model`. GitHub reserves the right to change the default. This is fine, but worth knowing if you assumed Copilot CLI was running a GitHub-owned model.
- No OS-enforced sandbox. Like Claude Code, the safety story is trust-prompts and per-tool approval. Good for productivity, less good if you want OS-level isolation.
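The per-tool approval model mentioned above (the `--allow-tool`, `--deny-tool`, and `--allow-all-tools` flags from the comparison table) can be pictured as a three-way decision. The Python sketch below is a toy model, not Copilot CLI's code: the `decide` function is hypothetical, the tool names are placeholders, and the precedence (deny beats allow) is an assumption worth checking against the current docs.

```python
from typing import FrozenSet


def decide(tool: str,
           allow: FrozenSet[str] = frozenset(),
           deny: FrozenSet[str] = frozenset(),
           allow_all: bool = False) -> str:
    """Return "deny", "allow", or "ask" for a requested tool.

    Assumption: an explicit deny wins over both an explicit allow
    and the allow-all switch.
    """
    if tool in deny:
        return "deny"
    if allow_all or tool in allow:
        return "allow"
    # Anything not covered falls back to an interactive prompt.
    return "ask"


if __name__ == "__main__":
    rules = {"allow": frozenset({"git status"}), "deny": frozenset({"rm"})}
    for tool in ("git status", "rm", "npm test"):
        print(tool, "->", decide(tool, **rules))
```

Note what this model does not provide: every decision is about whether to run a tool, not about what the tool can touch once it runs, which is the gap an OS-level sandbox fills.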
Decision sketch
Do you need a meaningful no-cost tier (no card, no plan)?
├─ Yes → Gemini CLI (1,000 req/day on personal Google account)
└─ No → continue
│
Is your code already on github.com (issues/PRs/Actions central)?
├─ Yes → GitHub Copilot CLI (ships with GitHub MCP server)
└─ No → continue
│
Do you care most about plan-before-execute UX?
├─ Yes → Claude Code (best plan/approval model of the four)
└─ No → continue
│
Do you care most about default sandbox tightness?
├─ Yes → Codex CLI (workspace-write + network-off + auto_review)
└─ No → pick by default model preference
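The same decision sketch can be written as a short function, which makes the priority order explicit: the first question answered “yes” wins. This is a direct transcription of the tree above into Python; the function and parameter names are hypothetical.

```python
def choose_cli(needs_free_tier: bool,
               github_centric: bool,
               wants_plan_ux: bool,
               wants_tight_sandbox: bool) -> str:
    """Walk the decision sketch top to bottom and return the first match."""
    if needs_free_tier:
        return "Gemini CLI"
    if github_centric:
        return "GitHub Copilot CLI"
    if wants_plan_ux:
        return "Claude Code"
    if wants_tight_sandbox:
        return "Codex CLI"
    return "pick by default model preference"


if __name__ == "__main__":
    # Code lives on github.com, and plan UX also matters: the earlier question wins.
    print(choose_cli(needs_free_tier=False, github_centric=True,
                     wants_plan_ux=True, wants_tight_sandbox=False))
```

Because the questions are checked in order, someone who answers “yes” to more than one gets only the first recommendation, which is one reason two CLIs often turn out to be the practical answer.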
They overlap more than they compete
For most builders, the right answer is two of the four, not one. Common combinations worth knowing:
- Copilot CLI for daily repo work + Claude Code for the gnarly multi-file refactors. Copilot CLI knows the repo + the PR/issue surface; Claude Code’s plan mode is better when you need a deliberate multi-step rewrite.
- Gemini CLI for personal projects + a paid CLI for work. The free tier is genuinely usable for personal side-project velocity; the paid tools have higher reliability and more polished plan UX for production code.
- Codex CLI as the “safe one” in CI/autonomous flows. The default-tight sandbox plus the auto-review reviewer agent make it a reasonable choice for “run this CLI inside a runner without a human watching.”
Picking one CLI to rule them all is a choice that costs you. Try two; switch on context.
What we are NOT claiming
- Which CLI writes the best code. Quality varies by language, framework, model selection, and how you prompt — across all four. We have no benchmark data and would not trust a benchmark even if we had one. The CLI shape doesn’t predict generated-code quality.
- Specific pricing in foreign currencies. Pricing is in USD here and tracks the vendor’s listed page; check your billing dashboard for local rates.
- Anything about model quality differences. That’s out-of-scope per Claw’s scope guardrails. We compare tool shapes, not model leaderboards.
- What will be true in six months. All four vendors ship fast. Versions in this comparison are current as of 15 May 2026. Free-tier numbers and default models in particular move.
Honest take
If you’re Sush — Microsoft Copilot SE, code lives mostly on github.com, builds Hugo and Astro sites against MCP servers all day, doesn’t want to run a Docker container around the agent — GitHub Copilot CLI is the daily driver. It’s been the daily driver. This site was built with it.
For the work where you want to think before you cut — a multi-file refactor, a careful migration, anything where the agent needs to plan before it acts — Claude Code’s plan mode wins. Worth the second subscription for the maybe-twice-a-week jobs that benefit from it.
If you’re a student or new builder with no budget, Gemini CLI’s free tier is genuinely the right starting place. 1,000 requests/day on a personal Google account is enough to learn what these tools can do without paying anything.
Codex CLI is the one I reach for least, but I’d pick it first if I were standing up an autonomous coding agent in CI with a security-conscious org behind me. The defaults are tighter than the other three’s, which matters when no human is watching.
What to read next
- §7.1 OpenClaw vs MCP-based stacks — the architectural comparison if you’re choosing between agent shapes, not coding CLI brands
- §7.3 M365 extensibility paths — if the code you’re shipping ends up inside Microsoft 365
- §7.4 Direct model APIs — when you want to call the model directly instead of through a CLI
- Claude Code overview — the Anthropic CLI deep-dive
- Codex CLI overview — the OpenAI CLI deep-dive
- Gemini CLI overview — the Google CLI deep-dive
- GitHub Copilot CLI overview — the GitHub CLI deep-dive
Sources
- https://github.com/anthropics/claude-code
- https://code.claude.com/docs/en/overview
- https://claude.com/pricing
- https://github.com/openai/codex
- https://developers.openai.com/codex
- https://developers.openai.com/codex/models
- https://github.com/google-gemini/gemini-cli
- https://www.geminicli.com/docs/
- https://www.geminicli.com/docs/tools/mcp-server
- https://github.com/github/copilot-cli
- https://docs.github.com/en/copilot/concepts/agents/about-copilot-cli
- https://github.com/github/copilot-cli/blob/main/changelog.md