Claude Code vs Codex CLI vs Gemini CLI vs Cline: The Agentic Coding CLI Comparison

By Bill · 18 min read
Agentic coding CLI comparison of Claude Code, Codex CLI, Gemini CLI, and Cline

Four tools, but really two questions. Do you want to be locked to one model or bring your own? And do you want to approve every action, or let the thing run? Pick your answer to those two and the four-way fight between Claude Code, Codex CLI, Gemini CLI, and Cline mostly sorts itself out. Everything else is detail.

I say "mostly" because one of the four just changed under everyone this month. Gemini CLI's free individual path ended on June 18, 2026, and Google is steering those users toward Antigravity CLI, so anything you knew about Gemini CLI's free tier is now out of date. This isn't a "which is best" race so much as a map of which corner of the matrix you're standing in, and which tool sits there with you.

The Short Version

If you only read this far, here's where I land on all four:

  • Claude Code is the polished, ecosystem-heavy pick: one model family (Anthropic's), deep tooling, configurable autonomy. No free tier; you pay $20/month minimum, and heavy users climb to a $200/month Max plan.
  • Codex CLI is the Rust-native, open-source OpenAI terminal agent with about 92,000 GitHub stars. It runs locally, can execute code inside a constrained sandbox, and is included across ChatGPT plans including Free, Go, Plus, Pro, Business, Edu, and Enterprise, with usage limits depending on plan. For regular daily use, Plus or higher is still the practical baseline, but "no free tier" is no longer accurate.
  • Gemini CLI changed the most: individual and free-tier access was cut off on June 18, 2026, and Google is steering those users to Antigravity CLI. Enterprise and paid API-key users keep Gemini CLI access.
  • Cline is the most flexible of the bunch: BYOK across 30+ providers, including local models via Ollama. It's free and open source; you only pay for inference. It asks permission for every edit by default, which you'll either love or switch off on day one.

The cost twist worth knowing up front: if you run a BYOK tool like Cline hard enough, variable API bills can blow past a flat subscription, and at that point a self-hosted model backend becomes the cheaper lane. More on that after the pricing math.

The Two Questions That Actually Decide This

Start a refactor in Cline and it stops at the first file edit and waits for you to say yes. Start the same task in Codex CLI with permissive approval settings, and it can inspect the repo, edit files, and run commands inside an OS-constrained sandbox. That gap, plus one other, is the whole comparison.

Question one is model strategy. Some tools lock you to one vendor's models; others let you bring your own key (BYOK) and point at whatever you want.

  • Locked: Claude Code runs Anthropic models (Opus 4.x family) for standard use. Third-party provider integrations exist, but they're not the default way you'd use it. Codex CLI runs OpenAI's current Codex model lineup, including GPT-5.5, GPT-5.4, and GPT-5.4 mini depending on surface and plan. Access is included across ChatGPT plans with usage limits, and API-key usage is available separately.
  • Flexible (BYOK): Cline connects to 30+ providers (Anthropic, OpenAI, Google, AWS Bedrock, Azure, OpenRouter, DeepSeek, Groq) plus local backends via Ollama and LM Studio. Gemini CLI in its enterprise form runs Google models through a paid API; its closed-source successor Antigravity stays in Google's ecosystem too.

Question two is autonomy posture. Does the tool ask before it acts, or act and tell you after?

  • Approval-required by default: Cline confirms every file edit and every terminal command before running it. Gemini CLI's Plan Mode does a read-only planning pass before it touches anything.
  • Configurable toward autonomous: Claude Code leans on its hooks system and auto-approve toggles, and goes fully hands-off with Agent Teams. Codex CLI can run commands inside an OS-constrained sandbox, with approval settings controlling when it stops for confirmation. Most of them also offer an unattended mode (Cline's -y flag, Gemini's YOLO flag) for when you genuinely want to walk away.

Plot the four on those two axes and they barely overlap. Claude Code sits at locked-but-configurable. Codex at locked-but-sandboxed-autonomous. Cline at flexible-but-approval-first. Gemini used to be the approval-friendly, Google-model option for individuals, but that role changed once the free individual Gemini CLI path was cut off.

Section takeaway: the four tools don't really compete on a single "best" axis. Each owns a different corner of model-strategy by autonomy, and your corner picks your tool more than any feature list does.
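If it helps to see the matrix as data, here's a minimal sketch. The axis labels are my own shorthand for the positions described above, not vendor terminology, and the lookup helper is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    model_strategy: str    # shorthand label, e.g. "locked" or "flexible"
    autonomy_default: str  # shorthand label for the default posture


# Positions as described in this article (labels are assumptions, not official terms).
MATRIX = {
    "Claude Code": Position("locked", "configurable"),
    "Codex CLI": Position("locked", "sandboxed"),
    "Gemini CLI": Position("google-models", "approval-friendly"),
    "Cline": Position("flexible", "approval-required"),
}


def tools_with(model_strategy):
    """Return the tools whose model strategy matches the given label."""
    return [name for name, pos in MATRIX.items() if pos.model_strategy == model_strategy]


print(tools_with("locked"))
print(MATRIX["Cline"])
```

No two tools share both labels, which is the point of the takeaway: pick the corner first and the tool follows.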

Two-axis matrix plotting Claude Code, Codex CLI, Gemini CLI, and Cline by model lock-in versus autonomy posture

The Quick Comparison

Here's the whole field on five axes, so you can find your corner before reading the per-tool detail below.

| | Claude Code | Codex CLI | Gemini CLI | Cline |
|---|---|---|---|---|
| Model strategy | Locked (Anthropic) | Locked (OpenAI GPT-5.x) | Google models (enterprise BYO API) | BYOK, 30+ providers incl. local Ollama |
| Autonomy default | Configurable (hooks, Agent Teams) | Sandboxed execution with approval controls | Approval-friendly (Plan Mode) | Approval-required per action |
| Entry pricing | $20/mo (no free tier) | Included on ChatGPT Free/Go with limits; Plus starts at $20/mo | Free tier ended June 18, 2026 | Free tool; pay per inference (BYOK) |
| Context window | Varies by Claude surface and model; do not assume one fixed CLI context | Not stated | 1M tokens | Inherits configured model |
| Primary model / benchmark | Opus 4.x (see benchmark caveats) | GPT-5.x (OpenAI stopped reporting SWE-bench) | Gemini 3 family | Whatever you configure |

Read the table as a routing tool, not a scoreboard. There's no column where one tool wins on every row, and the pricing row in particular is a moving target: Gemini's just changed, and Cline's "free" hides a variable inference bill I'll break down later. The per-tool sections below add the nuance the cells can't hold.

Claude Code

The thing Claude Code gets right is that it stops feeling like a single tool and starts feeling like one workflow stretched across surfaces. You start a task in the terminal, pick it up in the desktop app, and check it on the web at claude.ai/code, with the same context and the same agent throughout. For the kind of developer who lives in an AI-assisted loop all day, that continuity is the main reason to choose it.

Under the hood it's Anthropic-only for standard use. The Opus 4.x family does the work; third-party provider integrations exist and are documented, but they're the exception, not the mode you'll run in. If model flexibility is a hard requirement for you, this is where Claude Code is "locked," and it's a deliberate trade for a more polished, more integrated experience.

The capability list is deep and, more importantly, it's the kind of depth you actually use: Agent Teams and sub-agents (which landed with the Opus 4.6 integration earlier in 2026) for parallel work, a CLAUDE.md persistent-memory file the agent reads on every run, a hooks system with 30+ programmable lifecycle events, MCP support, and CI/CD integration for GitHub Actions, GitLab, and CircleCI. That hooks system is where the autonomy lives: you decide what auto-approves and what stops for a human.

Now the catch. There's no standalone free Claude Code lane. Pro is $20/month and includes Claude Code, but the real complaint from heavy users is that Pro's usage is shared across Claude surfaces rather than being a dedicated coding allowance. A heavy chat day can eat into the same subscription headroom you wanted for coding. For serious daily use, Anthropic pushes users toward Max: Max 5x is $100/month and Max 20x is $200/month, with 5x or 20x more usage than Pro. The Max tiers are clearer than Pro for heavy work, but you feel the price.

Where it sits: locked to one model, configurable toward autonomy. It's the pick for people who want one good model and a deep ecosystem, and who'll pay for polish.

Codex CLI

The headline number for Codex CLI is 92,000 GitHub stars, and if you saw "65K" somewhere recently, that figure was stale; it's 92K as of mid-June 2026. It's Apache 2.0, and here's the detail I find genuinely interesting: it's 96.2% Rust. OpenAI rewrote it from TypeScript to Rust in late 2025, and the current release is v0.141.0 (they ship roughly a release a day). That's a tool under heavy, active development.

The architecture is the strongest part of the pitch. Codex CLI runs as a lightweight local terminal agent that can inspect your repository, edit files, and run commands from the terminal. The important detail is not that it "just runs" by default; it is that its execution model is built around sandboxing and approval controls. In other words, you can push it toward more autonomous work, but you are still choosing how much it can do before it stops for confirmation.

On pricing, Codex is now included across ChatGPT Free, Go, Plus, Pro, Business, Edu, and Enterprise plans, with limits depending on plan. Free and Go are better treated as trial or light-use access. Plus starts at $20/month and is the practical baseline for regular individual use, while API-key usage remains a separate metered path.

The caveat is reliability perception. Some developer discussions and third-party writeups have reported cases where Codex behaved less like an executing agent and more like a suggestion engine, or appeared to misstate command results. I would not present that as a proven universal defect without stronger primary sourcing, but it is worth flagging as a reported concern if your workflow depends on unsupervised execution.

Where it sits: locked to OpenAI, sandboxed, and configurable toward autonomy. It is the pick if you want an OpenAI-native terminal agent, Rust-native speed, and real sandbox and approval controls, and if the reliability caveat does not disqualify it for your workflow.

Codex CLI sandbox and approval controls diagram showing file edits, commands, and test runs against a local codebase

Gemini CLI and What Just Replaced It

⚠ Heads up: this changed on June 18, 2026. Gemini CLI no longer serves requests for free individual users, Google AI Pro users, or Google AI Ultra users. Google is steering those users to Antigravity CLI, a closed-source successor built around an asynchronous multi-agent architecture. Enterprise users and paid API-key users keep Gemini CLI access.

The confusing part is that the Gemini CLI GitHub repository still exists and still presents Gemini CLI as an Apache 2.0 open-source terminal agent. The README may also still mention the old 60-requests-per-minute and 1,000-requests-per-day free tier, but Google's transition notice supersedes that for affected individual users. The code is still open; the consumer service path changed.

Before the cutoff, Gemini CLI was a strong open-source option: 100,000+ GitHub stars, Apache 2.0 licensing, Gemini 3 model access, a 1M-token context window, built-in tools, and a generous free tier. That free tier was the thing people talked about. Anything that still recommends Gemini CLI mainly because of "1,000 free requests a day" is now describing the pre-June-18 consumer experience.

Antigravity CLI was announced and made available in May 2026. Google said there would not be 1:1 feature parity at launch, but that Antigravity CLI would keep critical Gemini CLI concepts such as Agent Skills, Hooks, Subagents, and Extensions, now handled as Antigravity plugins. The tradeoff is obvious: Google is moving toward a unified agent platform, but the individual developer path is no longer the same open Gemini CLI story.

Where it sits: Gemini CLI used to be the approval-friendly Google-model option for individuals. After June 18, 2026, that role mostly belongs to enterprise and paid API-key users, while individual users are being pushed toward Antigravity CLI.

Timeline showing Gemini CLI free individual access ending June 18, 2026 and the transition to closed-source Antigravity CLI

Cline

Cline is the only tool here that lets you point your agent at a model running on your own machine. That single fact reshapes the cost conversation later, so keep it in mind.

The numbers first: Cline currently claims 8M+ developers or installs across platforms and about 63.6k GitHub stars. Its terminal-native CLI is already moving quickly, with CLI v3.0.29 released on June 20, 2026, so treat any exact version number as a snapshot. The more important product shift was Cline CLI 2.0, announced in February 2026 as a terminal-native product separate from the VS Code extension. It ships an interactive TUI, a Plan/Act toggle, a headless -y mode, parallel agents, ACP compliance, and stdin/stdout piping for CI/CD. Cline frames it as "orchestration, not authoring": a terminal-first workflow, not a VS Code clone.

The headline feature is model flexibility, and it's not marketing. Cline does BYOK across 30+ providers (Anthropic, OpenAI, Google, Bedrock, Azure, OpenRouter, DeepSeek, Groq) and, the part that matters for this article, local and self-hosted backends via Ollama and LM Studio. It's the only one of the four with native local-model support. If "run my agent against a model I host" is on your wishlist, Cline is the only tool here that says yes.

That flexibility also changes how you reason about the model underneath. With Claude Code or Codex you're betting on one vendor's roadmap. With Cline you can swap a frontier model for a cheaper one on routine work and only spend the expensive tokens where they earn it, or run a fast local model for the boilerplate and reserve a cloud API for the hard reasoning. It's more knobs to manage, which isn't free, but for anyone who's watched a single-vendor price or policy change wreck their workflow, the ability to repoint the agent in one config line is the actual selling point. If you're already running agents on a server, the same BYOK plumbing is what makes a VPS-hosted setup work.
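To make the "spend expensive tokens only where they earn it" idea concrete, here's a minimal routing sketch. The task categories and backend labels are hypothetical placeholders I made up for illustration; they are not Cline configuration keys or real model names.

```python
# Hypothetical task categories mapped to hypothetical backend labels.
ROUTES = {
    "boilerplate": "local-fast-model",        # e.g. something served locally
    "routine": "cheap-cloud-model",           # lower-cost API model
    "hard-reasoning": "frontier-cloud-model", # the expensive tokens
}


def pick_backend(task_kind):
    """Choose a backend label for a task; unknown kinds go to the frontier model."""
    return ROUTES.get(task_kind, "frontier-cloud-model")


for kind in ("boilerplate", "routine", "hard-reasoning", "unclassified"):
    print(kind, "->", pick_backend(kind))
```

Sending unknown work to the strongest model is a quality-first default; flip the fallback to the cheap label if cost matters more than first-try accuracy.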

The autonomy story is the inverse of Codex. By default, every file edit and every terminal command waits for your approval. That's the safest default of the four, and for an overnight, fire-and-forget run, it's also the most annoying. There's an auto-approve toggle (Shift+Tab in CLI 2.0) and per-action-type configuration, so you can dial it from "ask me everything" to "just go." You're choosing your own autonomy rather than accepting the tool's.
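Here's a rough sketch of what that dial looks like as a per-action-type policy. The action names and the policy structure are my own assumptions for illustration, not Cline's actual settings schema.

```python
from enum import Enum


class Decision(Enum):
    ASK = "ask"    # stop and wait for a human
    AUTO = "auto"  # proceed without confirmation


# Three hypothetical points on the dial, from most to least cautious.
ASK_EVERYTHING = {"edit_file": Decision.ASK, "run_command": Decision.ASK}
AUTO_EDITS_ONLY = {"edit_file": Decision.AUTO, "run_command": Decision.ASK}
JUST_GO = {"edit_file": Decision.AUTO, "run_command": Decision.AUTO}


def needs_approval(policy, action_type):
    """Return True if the action must wait for a human.

    Action types missing from the policy default to asking,
    which is the conservative choice.
    """
    return policy.get(action_type, Decision.ASK) is Decision.ASK


for name, policy in [("ask everything", ASK_EVERYTHING),
                     ("auto edits only", AUTO_EDITS_ONLY),
                     ("just go", JUST_GO)]:
    gates = {a: needs_approval(policy, a) for a in ("edit_file", "run_command")}
    print(name, gates)
```

The middle setting is the one that lets you build trust gradually: edits flow, but anything that executes still stops for you.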

Pricing is the easy part: Cline itself is free and open source. You pay only for inference, BYOK at provider rates. Which sounds great, and is, until you use it heavily enough that the API meter becomes the story.

One caveat I won't skip: when you see community benchmarks for "Cline," check whether they're actually testing Roo Code, a popular community-driven fork. Many "Cline" comparisons online are really Roo Code comparisons, so the numbers may not be apples-to-apples with the tool you'd install. And Cline has documented context-maintenance limits on large multi-file projects (cline/cline): a 600-hour usage study (bv_dev) flagged that it can struggle to hold project structure on big codebases. No tool has fully solved long-session context, but it's worth knowing where Cline's edge is.

Where it sits: flexible on models, approval-required by default (configurable to autonomous). The pick for model flexibility, a safety-first default, or local inference.

Cline BYOK architecture connecting to OpenAI, Anthropic, Google, OpenRouter, LM Studio, and Ollama with approval controls and local or VPS inference

Benchmarks: Read These Carefully

I'm not going to rank these tools by one SWE-bench Verified table. OpenAI has publicly argued that SWE-bench Verified is increasingly contaminated and now reports newer coding-agent performance through other benchmarks instead. Claude and Gemini benchmark numbers also move quickly, and Cline does not have one native score because it inherits whichever model you configure.

That makes benchmark numbers useful as loose model-family context, not as the deciding factor between these tools. If you choose between Claude Code, Codex CLI, Gemini CLI, and Cline based on a one-point benchmark gap, you are probably optimizing the wrong variable. In daily use, reliability, autonomy posture, model flexibility, and cost matter more than a headline score.

Pricing and the Real Cost of Heavy Use

The comparison that matters isn't the sticker price. It's what you actually spend once you're using one of these daily.

| Tool | Free tier | Entry paid | Active daily use | Heavy use |
|---|---|---|---|---|
| Claude Code | None for Claude Code | $20/mo (Pro) | $20–100/mo | $200/mo (Max 20x) |
| Codex CLI | Limited access on ChatGPT Free/Go | Plus starts at $20/mo | $20/mo for regular individual use; API optional | Higher subscription tier or metered API usage |
| Gemini to Antigravity | Gemini CLI free individual path ended June 18, 2026 | Enterprise or paid API key for Gemini CLI | Antigravity plan details need current checking | Not available |
| Cline | Free tool | BYOK inference cost | ~$8–12/mo moderate | ~$50–200+/mo heavy API use, and potentially more |

The interesting math is in the combinations, because a lot of people run a hybrid setup: a BYOK tool like Cline for daily work, plus a subscription tool for the big refactors.

  • Cline BYOK daily plus occasional Claude Code Pro: roughly $28–32/month.
  • Cline BYOK plus Claude Code Max 5x: roughly $108–112/month.
  • Pure Claude Code Max 20x, heavy daily: a flat $200/month.
  • Pure Cline BYOK can become expensive under heavy use. For example, if a Sonnet-class API session costs $2–5 and you run five sessions a day for twenty workdays, the monthly bill lands around $200–500. That is not a universal Cline cost; it is the point where variable API billing starts competing with flat subscriptions or self-hosted inference.

That last line is the one to sit with. Run a BYOK tool hard enough and the variable API bill quietly passes the flat Max subscription, and keeps climbing. A developer on Hacker News running three agents non-stop summed it up: "API tokens cost x40 compared to tokens in the subscription." Once you're in that zone, you've got two rational moves: switch to a flat subscription, or, if you want to keep BYOK and model flexibility, stop paying per token entirely.
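The arithmetic behind those hybrid and pure-BYOK figures is simple enough to check in a few lines. The inputs are the article's own illustrative numbers (a $2–5 session, five sessions a day, twenty workdays, $8–12 of moderate BYOK spend); the function names are mine.

```python
def monthly_api_cost(cost_per_session, sessions_per_day, workdays):
    """Variable API spend for a month of metered sessions."""
    return cost_per_session * sessions_per_day * workdays


def hybrid_cost(byok_range, subscription):
    """Add a flat subscription to a (low, high) BYOK spend range."""
    low, high = byok_range
    return (low + subscription, high + subscription)


# Pure BYOK under heavy use: $2-5 per session, 5 sessions/day, 20 workdays.
print(monthly_api_cost(2, 5, 20), monthly_api_cost(5, 5, 20))  # 200 500

# Moderate BYOK ($8-12/mo) plus Claude Code Pro ($20) or Max 5x ($100).
print(hybrid_cost((8, 12), 20))   # (28, 32)
print(hybrid_cost((8, 12), 100))  # (108, 112)
```

Swap in your own per-session cost and session count; the crossover against a flat $200 plan moves quickly with either one.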

Chart showing variable API costs rising with heavy use and crossing the cost of self-hosted inference on a fixed GPU VPS

When Variable API Costs Stop Making Sense

That $200–500/month variable bill is the anchor for a third option worth pricing out. Because Cline can talk to local and self-hosted backends through tools like Ollama, heavy users can move some inference onto infrastructure they control instead of metering every token through a cloud API. Cloudzy's one-click Ollama VPS and GPU VPS plans make that setup possible, but the right hardware depends on model size, quantization, concurrency, and current pricing. The practical question is simple: can a fixed GPU VPS beat your variable API bill? If the answer is yes, self-hosted inference becomes the cheaper lane.
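A back-of-envelope version of that question looks like the sketch below. The $300/month VPS price is a made-up placeholder, not a quoted plan price, and the model ignores everything except dollars: setup time, throughput, and any quality gap between a self-hosted model and a frontier API are all out of scope.

```python
def self_hosting_is_cheaper(monthly_api_bill, vps_monthly_price, residual_api_spend=0.0):
    """Compare a fixed VPS price (plus any API spend you keep) against the current API bill."""
    return vps_monthly_price + residual_api_spend < monthly_api_bill


def break_even_sessions_per_day(vps_monthly_price, cost_per_session, workdays=20):
    """Sessions per workday at which metered API spend equals the fixed VPS price."""
    return vps_monthly_price / (cost_per_session * workdays)


HYPOTHETICAL_VPS_PRICE = 300.0  # placeholder; check current pricing for real numbers

print(self_hosting_is_cheaper(500, HYPOTHETICAL_VPS_PRICE))  # True
print(self_hosting_is_cheaper(200, HYPOTHETICAL_VPS_PRICE))  # False
print(break_even_sessions_per_day(HYPOTHETICAL_VPS_PRICE, cost_per_session=3))  # 5.0
```

The `residual_api_spend` parameter covers the common case where you keep a cloud key for the hard reasoning and only move routine work to the box you rent.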

Which One Should You Use

Now that you've got the data, the routing is simple. Find your corner of the two-axis matrix and the tool is mostly decided:

  • Want one polished model and a deep ecosystem, willing to pay a subscription, comfortable configuring your own autonomy? Claude Code. Go straight to Max if you're heavy, and make peace with the shared-limit quirk on Pro.
  • Already in the OpenAI ecosystem and want a Rust-native terminal agent with sandboxing and approval controls? Codex CLI, provided the reliability caveat doesn't disqualify it for your work. If you've already been burned by an agent forgetting its role, weigh that hard before committing.
  • Were relying on Gemini CLI's free tier? That path closed on June 18, 2026. Evaluate Antigravity cautiously (it's closed-source with a parity gap), or move to a BYOK tool and stop depending on any one vendor's free offer.
  • Want model flexibility, an approval-by-default safety net, or local and self-hosted inference? Cline. It's the only one of the four that does all three, and the only one you can point at a model you host yourself.
  • Want to delegate genuine fire-and-forget overnight runs? Reach for the autonomy-configurable tools (Claude Code's Agent Teams or Cline with -y), not an approval-gated default. If you don't trust auto-approve yet (plenty of people don't), Cline's per-action gates let you build that trust gradually instead of flipping one scary switch.

A couple of pre-emptions, because these are the objections I'd raise myself. Favoring BYOK to dodge a subscription whose limits might shift under you is fair, right up until the BYOK bill stops being the cheaper option. And switching from your IDE assistant to a terminal agent isn't all-or-nothing; the hybrid setup above is most people's real answer, not a clean migration.

Frequently Asked Questions

What happened to Gemini CLI?

Gemini CLI's free individual tier ended on June 18, 2026. Individual and free-tier users lost access and were steered toward Antigravity CLI, a closed-source replacement. Enterprise and paid API-key users keep Gemini CLI access. The Gemini CLI repository remains Apache 2.0 and the open-source code stays public; it's the hosted service for individuals that was discontinued.

Does Claude Code require a subscription?

Yes. Claude Code has no free tier. The $20/month Pro plan includes it, but Pro's usage limits are shared with Claude Chat rather than being a dedicated coding allowance, which heavy users find limiting. For more headroom there are Max 5x ($100/month) and Max 20x ($200/month) plans.

Is Cline free, and does it work with Ollama?

Cline is free and open source under Apache 2.0; you only pay for AI inference through your own provider key (BYOK). It supports more than 30 providers, and it works with local and self-hosted backends including Ollama and LM Studio. That makes it the only tool in this comparison that can run against a model on your own hardware.

Is Codex CLI open source?

Yes. Codex CLI is Apache 2.0, has about 92,000 GitHub stars, and is written almost entirely in Rust (rewritten from TypeScript in late 2025). It is open source, and Codex access is included across ChatGPT Free, Go, Plus, Pro, Business, Edu, and Enterprise plans, with limits varying by plan. You can also use an OpenAI API key, but API usage is billed separately.

Which agentic coding CLI is best for large refactors or autonomous runs?

For fire-and-forget autonomous runs, use an autonomy-configurable tool (Claude Code with Agent Teams, Codex CLI with permissive approval settings, or Cline with the -y flag) rather than an approval-gated default. For high-stakes work on a repo you can't afford to break, run an approval-first workflow like Cline or a stricter Codex approval mode so you confirm risky changes before they land. The deciding factor is your tolerance for unsupervised edits, not the model alone.
