How are tokens estimated?

This tool uses a fast in-browser heuristic of roughly 4 characters per token for English text. It's a close estimate for budgeting; for exact billing counts, use the provider's official tokenizer. Your pasted text is never uploaded.
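As a rough illustration, the heuristic can be sketched in Python as below. The function name `estimate_tokens` and the choice to round up are assumptions made for this sketch, not the tool's actual code.

```python
import math

# Heuristic from the text above: roughly 4 characters per token for English.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token count: character length / 4, rounded up.

    Rounding up is an assumption of this sketch; a real tokenizer
    will give different counts, especially for code or non-English text.
    """
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Summarize the following support ticket in two sentences."
print(estimate_tokens(prompt))
```

The result is only a budgeting figure; for billing-grade counts, run the same text through the provider's official tokenizer.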

Are the prices current?

Prices are illustrative reference values and change often. Always confirm against each provider's official pricing page before relying on a figure. The table is editable so you can paste in current numbers.

LLM API Cost Calculator

Paste a prompt to count tokens in your browser, set the expected output length, and compare cost per request — and at scale — across major models. Nothing you paste is uploaded.

Prices are illustrative — verify on each provider's pricing page

Cheapest option for your workload

Model	Per request	Per month

Per-request cost = (input tokens × input price + output tokens × output price) ÷ 1,000,000, with both prices in $ per 1M tokens.
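The formula translates directly into code. This is a minimal Python sketch; the function name `cost_per_request` and the example prices are made up for illustration, and prices are assumed to be in $ per 1M tokens.

```python
def cost_per_request(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
) -> float:
    """Cost in dollars for one request; prices are $ per 1M tokens."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


# Hypothetical prices, not taken from any provider's pricing page.
example = cost_per_request(1_000, 500, input_price=3.00, output_price=15.00)
print(f"${example:.4f} per request")
```

In this example the 500 output tokens account for more of the cost than the 1,000 input tokens, because the assumed output price is five times the input price.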

Token estimate uses a ~4-characters-per-token heuristic — good for budgeting, not exact billing. Prices below are illustrative reference values and go stale fast; edit any cell to drop in current pricing from the provider. Not affiliated with any provider.

Editable price table ($ per 1M tokens)

Model	Input	Output

Sources to verify: Anthropic · OpenAI · Google Gemini. Update the values above, then recalculate.

How to read this

Input tokens are what you send (prompt + context); output tokens are what the model generates. Providers typically charge several times more for output than for input (often 3–5×, sometimes more), so a model that produces long responses can cost far more than its input price suggests. For high-volume workloads, compare the per-month column: a fraction of a cent per request can become thousands of dollars a month at scale.
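To see how per-request cost scales, the comparison can be sketched in Python as follows. The model names, prices, and request volume are placeholders invented for this example, and the helper names are assumptions rather than the tool's own code.

```python
def per_request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars for one request; prices are $ per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_costs(prices, input_tokens, output_tokens, requests_per_month):
    """Map each model name to its projected monthly cost in dollars."""
    return {
        model: per_request_cost(input_tokens, output_tokens, inp, out)
        * requests_per_month
        for model, (inp, out) in prices.items()
    }


# Placeholder (input, output) prices in $ per 1M tokens; not real quotes.
PLACEHOLDER_PRICES = {
    "model-a": (3.00, 15.00),
    "model-b": (0.50, 1.50),
}

costs = monthly_costs(
    PLACEHOLDER_PRICES,
    input_tokens=1_000,
    output_tokens=500,
    requests_per_month=2_000_000,
)
cheapest = min(costs, key=costs.get)

for model, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{model}: ${cost:,.2f} per month")
print(f"Cheapest for this workload: {cheapest}")
```

With these placeholder numbers, a request that costs well under one cent still adds up to thousands of dollars a month, which is why the monthly figure is the one to compare.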

FAQ

Is my pasted text sent anywhere?

No. Token estimation runs entirely in your browser; nothing is uploaded.

Why is my estimate slightly off from the bill?

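The heuristic only approximates a model's actual tokenizer, and code and non-English text often tokenize less efficiently than English prose, so their real counts can be higher than the estimate. Treat the estimate as a budgeting figure, not a billing figure.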