Free GPT Token Counter & OpenAI Tokenizer
Count tokens and estimate API cost for GPT-5, GPT-4.1, GPT-4o, o3, o1, GPT-4, and GPT-3.5 in real time. Private, browser-only, no API key required.
Tokenize for
Pick a model to count its exact tokens.
GPT-5
OpenAI's flagship reasoning model.
Prices in USD, updated November 2025. Check OpenAI for the latest rates.
Built for Prompt Engineers, Developers & AI Builders
Know exactly how many tokens your prompt uses and what it will cost before you hit the OpenAI API.
Accurate GPT Tokenization
Uses the exact BPE tokenizers OpenAI uses in production (o200k_base and cl100k_base), so counts match the API.
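For reference, counting tokens programmatically looks roughly like this. The sketch assumes the gpt-tokenizer npm package and its per-encoding entry point for o200k_base; the import path and sample prompt are illustrative, not part of this tool.

```typescript
// Rough sketch: count tokens with the same o200k_base BPE vocabulary
// (import path assumed from the gpt-tokenizer package's documentation).
import { encode } from "gpt-tokenizer/encoding/o200k_base";

const prompt = "Summarize the attached report in three bullet points.";
const tokenIds = encode(prompt);          // array of BPE token IDs
console.log(`${tokenIds.length} tokens`); // the count the API bills for this input
```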
Every Current Model
Count tokens for GPT-5, GPT-5 mini, GPT-4.1, GPT-4o, GPT-4o mini, o3, o4-mini, o1, GPT-4, GPT-3.5 Turbo, and embedding models.
Live API Cost Estimator
Instantly see how much a prompt will cost in input and output tokens at the latest OpenAI pricing.
Context-Window Usage
Visual progress bar shows how close you are to the model's token limit, so you can trim before an API error.
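Under the hood this is a simple ratio. A sketch, assuming GPT-4o's 128,000-token window and the same gpt-tokenizer import as above; other models have different limits.

```typescript
// Sketch: percent of the context window a text consumes.
// 128_000 is GPT-4o's documented window; substitute your target model's limit.
import { encode } from "gpt-tokenizer/encoding/o200k_base";

const contextWindow = 128_000;
const input = "…your long prompt or pasted file contents…";
const usedTokens = encode(input).length;
const usedPct = (usedTokens / contextWindow) * 100;
console.log(`${usedPct.toFixed(2)}% of the context window used`);
```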
Token Visualization
See your text color-coded by token boundaries and inspect every raw token ID behind the scenes.
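The color-coded view is built by decoding each token ID back to its text. A minimal sketch, with the same assumed gpt-tokenizer import:

```typescript
// Sketch: split text into its token pieces to see where the boundaries fall.
// Note: a multi-byte character split across tokens may not decode cleanly on its own.
import { encode, decode } from "gpt-tokenizer/encoding/o200k_base";

const text = "Tokenization isn't always word-aligned.";
const pieces = encode(text).map((id) => decode([id]));
console.log(pieces); // one string per token, leading spaces included
```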
100% Private, Browser-Only
Tokenization runs fully client-side. Your prompts, documents, and API keys never leave your device.
Upload Any Text File
Drop in .txt, .md, .json, .csv, or source files up to 5 MB to check long-context token counts.
How to Use
Simple 7-step process
Step 1
Pick the model you're targeting (GPT-5, GPT-4o, o-series, GPT-3.5, embeddings).
Step 2
Paste your prompt, document, or conversation into the text box.
Step 3
See the exact token count update in real time as you type.
Step 4
Switch to the Visualization tab to see how your text is split into tokens.
Step 5
Review the estimated API cost and context-window usage in the sidebar.
Step 6
Upload a .txt, .md, .json, or code file to tokenize long inputs instantly.
Step 7
Copy the token IDs as JSON to debug prompts or feed them into other tools.
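The copied token IDs are just a JSON array of integers. A sketch, again assuming the gpt-tokenizer import path used above:

```typescript
// Sketch: the "Copy token IDs as JSON" output is a plain integer array.
import { encode } from "gpt-tokenizer/encoding/o200k_base";

const ids = encode("Hello, world!");
const payload = JSON.stringify(ids); // a JSON array of token IDs, ready to paste into other tools
console.log(payload);
```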
Frequently Asked Questions
Everything you need to know about this tool and how it works.
See Full FAQ
What is a token?
A token is the smallest unit of text that a language model processes. OpenAI's tokenizers split text using Byte Pair Encoding (BPE), where a token can be a whole word, part of a word, a single character, or whitespace. As a rough guide for English text:
1 token ≈ 4 characters
1 token ≈ 0.75 words
100 tokens ≈ 75 words
1,000 tokens ≈ 750 words (about 1.5 pages)
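Those ratios are only heuristics, but they are handy for quick estimates before running the real tokenizer. A sketch based on the ratios above; the function name is illustrative.

```typescript
// Rough English-only estimate from the ~4 chars/token and ~0.75 words/token rules above.
// Use the real BPE tokenizer when you need billing-accurate counts.
function estimateTokens(text: string): number {
  const byChars = text.length / 4;
  const byWords = text.trim().split(/\s+/).length / 0.75;
  return Math.round((byChars + byWords) / 2); // average the two heuristics
}

console.log(estimateTokens("The quick brown fox jumps over the lazy dog."));
```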
Which tokenizer does this tool use?
This tool uses the official gpt-tokenizer library with the exact BPE vocabularies OpenAI ships:
o200k_base — GPT-5, GPT-5 mini, GPT-5 nano, GPT-4.1, GPT-4.1 mini/nano, GPT-4o, GPT-4o mini, o1, o3, o4-mini
cl100k_base — GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, text-embedding-3-large, text-embedding-3-small
Token counts produced here match what the OpenAI API bills you for.
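If you need to pick a vocabulary per model in your own code, that mapping is easy to express as a lookup table. A sketch; the model ID strings are illustrative and abbreviated.

```typescript
// Model-to-vocabulary lookup, following the mapping listed above.
type Encoding = "o200k_base" | "cl100k_base";

const encodingForModel: Record<string, Encoding> = {
  "gpt-5": "o200k_base",
  "gpt-5-mini": "o200k_base",
  "gpt-4.1": "o200k_base",
  "gpt-4o": "o200k_base",
  "gpt-4o-mini": "o200k_base",
  "o1": "o200k_base",
  "o3": "o200k_base",
  "o4-mini": "o200k_base",
  "gpt-4": "cl100k_base",
  "gpt-4-turbo": "cl100k_base",
  "gpt-3.5-turbo": "cl100k_base",
  "text-embedding-3-large": "cl100k_base",
  "text-embedding-3-small": "cl100k_base",
};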
Do the token counts match the OpenAI API exactly?
Yes — for plain text input, the count is identical to what the OpenAI API and Playground report. The tool uses the same open-source BPE vocabulary that powers tiktoken. Note: chat completions also add a small constant overhead (roughly 3–4 tokens per message for role and separator tokens). For precise chat-message accounting, add ~4 tokens per message on top of the counted content.
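If you need that accounting yourself, here is a sketch that applies the ~4-token-per-message rule on top of content tokens. It is an approximation only, and the gpt-tokenizer import path is an assumption as before.

```typescript
// Sketch: approximate billable tokens for a chat request, adding ~4 tokens
// per message for role and separator overhead, per the rule of thumb above.
import { encode } from "gpt-tokenizer/encoding/o200k_base";

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function estimateChatTokens(messages: ChatMessage[]): number {
  const perMessageOverhead = 4; // approximate role/separator cost
  return messages.reduce(
    (sum, m) => sum + encode(m.content).length + perMessageOverhead,
    0,
  );
}
```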
How is the API cost estimated?
Cost estimates use OpenAI's published per-million-token prices (updated November 2025). The input cost multiplies your measured tokens by the model's input price; the output cost is based on the number of output tokens you expect. Always double-check OpenAI's pricing page before committing to a large job, since rates occasionally change.
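The arithmetic itself is simple. A sketch with placeholder prices; always substitute the current rates from OpenAI's pricing page.

```typescript
// Sketch: prices are quoted per million tokens, so cost is a straight ratio.
function estimateCostUSD(
  inputTokens: number,
  expectedOutputTokens: number,
  inputPricePerMillion: number,  // USD per 1M input tokens
  outputPricePerMillion: number, // USD per 1M output tokens
): number {
  return (
    (inputTokens / 1_000_000) * inputPricePerMillion +
    (expectedOutputTokens / 1_000_000) * outputPricePerMillion
  );
}

// e.g. a 12,000-token prompt with ~800 expected output tokens, at placeholder
// rates of $2.50 in and $10.00 out per million tokens:
console.log(estimateCostUSD(12_000, 800, 2.5, 10).toFixed(4)); // "0.0380"
```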
Is my text sent to a server?
No. All tokenization happens entirely in your browser using WebAssembly-free JavaScript. Your prompts, documents, and source code never touch our servers, are never logged, and never leave your device.
Why do different models give different token counts for the same text?
Each model family uses a different BPE vocabulary. Newer o200k_base models (GPT-5, GPT-4o, GPT-4.1, o-series) were trained on a larger, more efficient vocabulary, so they typically produce 10–15% fewer tokens than older cl100k_base models (GPT-4, GPT-3.5), especially for code, emoji, and non-English text.
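You can see the difference by tokenizing the same string with both vocabularies. A sketch; the per-encoding import paths are assumptions based on the gpt-tokenizer package, and the sample string is illustrative.

```typescript
// Sketch: compare how many tokens each vocabulary needs for the same text.
import { encode as encodeO200k } from "gpt-tokenizer/encoding/o200k_base";
import { encode as encodeCl100k } from "gpt-tokenizer/encoding/cl100k_base";

const sample = "const greeting = `こんにちは, ${name}! 🎉`;";
console.log("o200k_base:", encodeO200k(sample).length);
console.log("cl100k_base:", encodeCl100k(sample).length); // typically higher for code, emoji, CJK
```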
Why does non-English text use more tokens?
BPE vocabularies are trained primarily on English. Non-English characters — especially CJK (Chinese, Japanese, Korean) and right-to-left scripts — often fall back to multiple byte-level tokens, consuming 2–4× more tokens per character than English equivalents. The newer o200k_base tokenizer significantly improves this, which is why it's recommended for multilingual apps.
How can I reduce my token count?
Common ways to shrink a prompt without losing meaning:
Remove redundant instructions and examples.
Use shorter variable and placeholder names.
Replace Markdown bullet lists with comma-separated lines when possible.
Summarize long context with a cheaper model before feeding it to an expensive one.
Use newer models (GPT-4o, GPT-4.1, GPT-5) — their tokenizer is more efficient.
Can I count tokens for embedding models?
Yes. Pick text-embedding-3-large or text-embedding-3-small from the model dropdown. You'll see the input token count and the exact price per request — embeddings have no output cost.
Is this tool free?
Yes — the GPT Token Counter is completely free, with no rate limits, no sign-up, and no API key required. Tokenization runs locally in your browser.
Still have questions?
Can't find what you're looking for? We're here to help you get the answers you need.