If you've been hearing about Claude, ChatGPT, Gemini, and other AI models but feel lost when people start talking about APIs, tokens, and rate limits — this guide is for you. We're going to break down exactly how these AI services work, what you're actually paying for, and which option makes sense for your situation.

What Even Is an AI Model?

Think of an AI model like a brain that's been trained on enormous amounts of text. Companies like Anthropic (Claude), OpenAI (GPT/ChatGPT), Google (Gemini), and Meta (Llama) build these models. Each company offers multiple versions — some smarter but slower and more expensive, others faster and cheaper but less capable. Take Anthropic's current lineup:

  • Claude Opus 4 — The big brain. Best reasoning, most expensive, slowest
  • Claude Sonnet 4 — The sweet spot. Very capable, reasonably priced, good speed
  • Claude Haiku 4 — The speedster. Fast, cheap, great for simple tasks

OpenAI has a similar lineup with GPT-4o, o1, o3, and Google has Gemini 2.5 Pro/Flash. The pattern is always the same: bigger model = smarter but costs more.

The Two Ways to Use AI: Subscription vs API

This is where most people get confused. There are fundamentally two ways to access these models:

1. Subscription (The Easy Way)

This is what most people use. You go to claude.ai, chatgpt.com, or gemini.google.com, sign up, and pay a monthly fee. That's it.

Claude Pro subscription: $20/month

  • Chat interface in your browser
  • Access to all Claude models (Opus, Sonnet, Haiku)
  • Usage limits (you get X messages per day, varies by model)
  • No technical setup required
  • Great for: asking questions, writing help, brainstorming, casual use

ChatGPT Plus: $20/month

  • Same concept, OpenAI's models
  • Access to GPT-4o, o1, DALL-E image generation
  • Usage limits on advanced models

What you're paying for: Convenience. A nice chat interface, no setup, predictable monthly cost. The tradeoff is usage limits — when you hit your limit, you either wait or get bumped to a weaker model.

2. API (The Power User Way)

API stands for Application Programming Interface. Instead of chatting in a browser, you send requests to the AI programmatically — through code, scripts, or tools like OpenClaw.

How API pricing works:

  • You pay per use, not per month
  • Pricing is based on tokens (roughly 4 characters = 1 token, or about 750 words = 1,000 tokens)
  • You pay separately for input tokens (what you send) and output tokens (what the AI writes back)
  • Output tokens cost more than input tokens (typically 3-5x more)
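The 4-characters-per-token rule of thumb is easy to sketch in code. This is a rough heuristic only; real tokenizers vary by model and will give different counts:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Real tokenizers vary by model, so treat this as a ballpark figure.
    """
    return max(1, len(text) // 4)

# An 8-character string lands at roughly 2 tokens under this heuristic.
print(estimate_tokens("abcdefgh"))  # → 2
```

Handy for eyeballing whether a long prompt is going to cost fractions of a cent or actual money.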

Claude API pricing (as of March 2026):

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|-------|-----------------------|------------------------|
| Opus 4 | $15.00 | $75.00 |
| Sonnet 4 | $3.00 | $15.00 |
| Haiku 4 | $0.80 | $4.00 |

To put that in perspective: a typical back-and-forth conversation might use 2,000 input tokens and 1,000 output tokens. On Sonnet 4, that's $0.006 for input plus $0.015 for output, about two cents per message. You'd need to send roughly 1,000 messages to spend $20.
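That arithmetic generalizes into a tiny cost function. The prices below are the Sonnet 4 figures from the table above; plug in any model's rates:

```python
def message_cost(input_tokens: int, output_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens / 1_000_000 * in_price_per_m
            + output_tokens / 1_000_000 * out_price_per_m)

# The example above: 2,000 input + 1,000 output tokens on Sonnet 4 ($3 in, $15 out)
cost = message_cost(2000, 1000, 3.00, 15.00)
print(f"${cost:.3f}")  # → $0.021
```

Swap in the Opus 4 rates ($15/$75) and the same message jumps to about ten cents, which is why model choice matters so much at volume.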

So Which Is Cheaper?

It depends entirely on how much you use it.

Subscription wins if:

  • You chat casually throughout the day
  • You want zero surprises on your bill
  • You don't want to think about tokens or pricing
  • You use it for general questions, writing, brainstorming

API wins if:

  • You use AI in bursts (some days heavy, some days nothing)
  • You're building tools, automations, or agents
  • You want to pick exactly which model to use for each task
  • You want to control costs precisely
  • You use lighter models (Haiku/Sonnet) most of the time

Here's the math: if you send fewer than ~1,000 Sonnet messages per month, the API is cheaper than a $20 subscription. If you're a heavy daily user who maxes out the subscription limits, the subscription is the better deal.
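That break-even point is just the subscription price divided by the per-message cost, using the two-cent Sonnet 4 example from earlier:

```python
subscription = 20.00   # dollars per month for a Pro plan
per_message = 0.021    # the ~2-cent Sonnet 4 example above

# Below this message count, pay-per-use beats the flat subscription
break_even = subscription / per_message
print(f"API is cheaper below ~{break_even:.0f} messages/month")  # → ~952
```

Your real number shifts with model choice and message length, so treat 1,000 as a rule of thumb, not a hard line.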

Enter OpenRouter: The Model Buffet

Here's where it gets interesting. What if you don't want to be locked into just Claude or just GPT? What if you want access to every model from every company through a single account?

That's exactly what OpenRouter does.

OpenRouter is a routing service. You create one account, add credits, and get access to 200+ models from every major provider — Claude, GPT, Gemini, Llama, Mistral, Qwen, DeepSeek, and dozens more. Same API format for all of them.

Why this matters:

  • One API key for everything — instead of signing up separately with Anthropic, OpenAI, Google, etc.
  • Model comparison — try Claude Sonnet, then GPT-4o, then Gemini Pro on the same prompt and see which gives better results
  • Free models — some models on OpenRouter are literally free (community-hosted Llama, Mistral, etc.)
  • Fallback routing — if one provider is down, OpenRouter can automatically route to another

OpenRouter pricing adds a small markup over direct API prices (usually 0-20%), but the convenience of one account and one API key for everything is worth it for most people.

How to Get Started with the API

If you want to try the API route, here's the simplest path:

Option A: Direct from Anthropic

1. Go to console.anthropic.com
2. Create an account
3. Add a credit card (you only pay for what you use)
4. Generate an API key
5. Use it in any tool that supports Claude (OpenClaw, Continue, Cursor, etc.)

Option B: Through OpenRouter

1. Go to openrouter.ai
2. Sign up (Google/GitHub login works)
3. Add credits ($5 minimum — this will last a long time on cheaper models)
4. Generate an API key
5. Use it anywhere — just point your tool to https://openrouter.ai/api/v1 instead of the direct provider URL
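To make that concrete, here's a minimal sketch of what an OpenRouter call looks like using only the Python standard library. It targets OpenRouter's OpenAI-compatible chat endpoint; the model name and the fallback key are placeholders:

```python
import json
import os
import urllib.request

def openrouter_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat-completion request for OpenRouter's OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = openrouter_request(
    "anthropic/claude-sonnet-4",    # any OpenRouter model ID works here
    "Explain tokens in one sentence.",
    os.environ.get("OPENROUTER_API_KEY", "sk-or-demo"),
)
# Actually sending it requires a valid key and credits:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```

In practice most people use a client library or a tool like OpenClaw rather than raw HTTP, but the request shape is the same underneath.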

OpenRouter is the recommended starting point for beginners because you get access to everything with one account, and $5 in credits will let you experiment with dozens of models before committing to anything.

What About Running Models Locally?

There's a third option nobody tells beginners about: running models on your own computer for free.

Tools like Ollama let you download and run open-source models (Llama, Qwen, Mistral, DeepSeek) directly on your machine. No API key, no monthly fee, no internet required.

The catch: You need decent hardware. A MacBook with 16GB+ RAM can run smaller models fine. For the really good models (70B+ parameters), you need 64GB+ RAM or a beefy GPU.

Good local models for beginners:

  • Llama 3.3 70B — Meta's flagship, excellent general-purpose
  • Qwen 3 32B — Great at reasoning and coding
  • DeepSeek R1 14B — Surprisingly capable for its size
  • Mistral Small 24B — Fast, good at following instructions

Local models are perfect for privacy-sensitive work, offline use, or just tinkering without worrying about costs.
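Once Ollama is running (it listens on localhost:11434 by default), talking to a local model follows the same request-building pattern as any API, just with no key. A sketch assuming Ollama's standard /api/generate endpoint and a model you've already pulled:

```python
import json
import urllib.request

def ollama_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a generate request against a local Ollama server (default port 11434)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = ollama_request("qwen3:32b", "Summarize in one line: tokens are billing units.")
# No API key needed; send it only if the Ollama server is actually running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```

The same pattern works for any model you've pulled with `ollama pull`, and nothing ever leaves your machine.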

The Real-World Setup Most Power Users Run

Here's what a typical OpenClaw power user setup looks like:

1. Anthropic API (direct) for Claude Opus/Sonnet — primary workhorse
2. OpenRouter for everything else — Gemini, GPT, Llama, exotic models
3. Ollama locally for quick tasks, privacy, and experimentation

OpenClaw can route to all three seamlessly. You configure each provider once, then just reference models by alias:

sonnet → anthropic/claude-sonnet-4
gemini → openrouter/google/gemini-2.5-flash
qwen3 → ollama/qwen3:32b

The tool picks the right provider automatically based on the model name. This gives you the best of every world — top-tier models when you need them, cheap models for routine stuff, and free local models for experimentation.
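The routing idea itself is simple: expand the alias, split the full model reference on its first slash, and dispatch on the provider prefix. A hypothetical sketch mirroring the aliases above (OpenClaw's real config format may differ):

```python
# Hypothetical alias table mirroring the examples above
ALIASES = {
    "sonnet": "anthropic/claude-sonnet-4",
    "gemini": "openrouter/google/gemini-2.5-flash",
    "qwen3": "ollama/qwen3:32b",
}

def resolve(name: str) -> tuple[str, str]:
    """Expand an alias if one exists, then split off the provider prefix."""
    full = ALIASES.get(name, name)
    provider, _, model = full.partition("/")
    return provider, model

print(resolve("gemini"))  # → ('openrouter', 'google/gemini-2.5-flash')
```

Everything past the first slash is passed through untouched, which is why nested IDs like `openrouter/google/gemini-2.5-flash` route correctly.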

Key Terms Cheat Sheet

| Term | What It Means |
|------|---------------|
| Token | A chunk of text (~4 characters). This is how AI usage is measured. |
| API Key | A secret password that identifies your account when making API calls |
| Context Window | How much text the model can "remember" in one conversation (measured in tokens) |
| Rate Limit | Maximum requests per minute/hour your account can make |
| Input Tokens | The text you send to the model (your question + any context) |
| Output Tokens | The text the model generates (the response) |
| Prompt | The instruction or question you give the model |
| System Prompt | Hidden instructions that shape how the model behaves |
| Temperature | How "creative" vs "predictable" the model's responses are (0 = robotic, 1 = creative) |
| Streaming | Getting the response word-by-word as it generates (instead of waiting for the full answer) |

Key Takeaways

  • Subscriptions ($20/month) are simple and predictable — great for casual daily use
  • APIs charge per token and can be much cheaper if you're not a heavy user
  • OpenRouter gives you one API key for 200+ models from every provider
  • Running models locally via Ollama is free but requires decent hardware
  • Most power users combine all three: direct API for primary models, OpenRouter for variety, Ollama for local
  • Start with OpenRouter if you're new — $5 in credits goes a surprisingly long way

The Bottom Line

The AI model landscape is confusing because there are genuinely multiple good options, and the right choice depends on your usage pattern. If you just want to chat with Claude a few times a day, the $20 subscription is a no-brainer. If you're building tools, running agents, or want to experiment with different models, go the API route — start with OpenRouter for simplicity, then add direct provider accounts as you figure out which models you use most. And if you have a decent computer, install Ollama and play with local models for free. You don't have to pick just one — the best setup is usually a combination.