LLM Tokens Simply Explained

Generated by AI, 2025.09

Great question: tokens in LLM APIs can be a bit confusing at first. Let me break it down clearly.

What Tokens Are

A token is the smallest unit of text that the model processes.
It’s not exactly a word. Tokens are usually pieces of words, full words, punctuation marks, or even spaces depending on the language and the tokenizer.
LLMs (like OpenAI GPT models) split text into tokens using a predefined tokenizer.

For example:

"cat" → 1 token
"cats" → might be split as "cat" + "s" → 2 tokens
"You're handsome." → splits into something like "You", "'re", " handsome", "." → 4 tokens

So tokens are subword units, optimized so the model can handle many languages and rare words efficiently.
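To make the subword idea concrete, here is a minimal sketch of a splitter that always takes the longest matching piece from a vocabulary. It is a toy under stated assumptions: the vocabulary TOY_VOCAB and the function toy_tokenize are made up for illustration, and real LLM tokenizers use learned vocabularies and different algorithms (such as byte-pair encoding), so their splits can differ.

```python
# Toy illustration only: a greedy longest-match splitter over a tiny,
# made-up vocabulary. Real tokenizers learn tens of thousands of pieces.
TOY_VOCAB = {"cat", "s", "You", "'re", " handsome", "."}  # hypothetical


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text by repeatedly taking the longest vocabulary piece that matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one is in the vocab.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No piece matched: emit the single character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenize("cat"))               # ['cat']
print(toy_tokenize("cats"))              # ['cat', 's']
print(toy_tokenize("You're handsome."))  # ['You', "'re", ' handsome', '.']
```

Because "cats" is not in the toy vocabulary, it falls back to "cat" + "s". That fallback is how a fixed vocabulary can still cover rare or unseen words.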

Why Tokens Matter in APIs

Billing – Most LLM APIs charge by tokens (input + output).
Limits – Each model has a maximum context window (e.g., GPT-4 Turbo 128k means up to 128,000 tokens total for input + output).
Efficiency – You want to minimize tokens when sending prompts to reduce cost and stay under limits.
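The billing and limit points above come down to simple arithmetic, sketched below. The function names and the per-1,000-token prices are placeholders I made up for illustration, not real rates; check your provider's pricing page and your model's actual context limit.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Return the estimated cost of one call, pricing input and output separately."""
    return (input_tokens / 1000) * input_price_per_1k + (
        output_tokens / 1000
    ) * output_price_per_1k


def fits_context(input_tokens, max_output_tokens, context_limit=128_000):
    """Check that the prompt plus the requested output stays within the limit."""
    return input_tokens + max_output_tokens <= context_limit


# Placeholder prices (hypothetical): 0.01 per 1k input tokens, 0.03 per 1k output tokens.
cost = estimate_cost(1_250, 500, input_price_per_1k=0.01, output_price_per_1k=0.03)
print(f"{cost:.4f}")                  # 0.0275
print(fits_context(120_000, 4_000))   # True
print(fits_context(127_000, 4_000))   # False
```

Input and output are priced separately here because many providers charge different rates for each.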

Handling Tokens in Practice

Estimate tokens: A rough rule is ~4 characters per token in English. So a 1,000-word English essay (≈5,000 characters) is ~1,200–1,300 tokens.
Check tokens precisely: Tokenizer libraries (like tiktoken in Python for OpenAI models) let you count tokens before sending.
Optimize prompts:
- Remove unnecessary whitespace/words.
- Use concise instructions.
- Summarize context instead of dumping raw text.
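As a quick sketch of the estimate and the whitespace tip above: the helpers below apply the ~4 characters per token rule and collapse extra whitespace. The names estimate_tokens and squeeze_whitespace are my own, and the result is only a rough guess for English text, not a real tokenizer count.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count (English rule of thumb)."""
    return math.ceil(len(text) / chars_per_token)


def squeeze_whitespace(text):
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return " ".join(text.split())


# The essay example: about 5,000 characters at ~4 characters per token.
print(math.ceil(5_000 / 4))                 # 1250

print(estimate_tokens("You're handsome."))  # 16 characters -> 4

prompt = "Summarize   this \n\n   text"
print(squeeze_whitespace(prompt))           # Summarize this text
```

Use the estimate for quick budgeting and switch to an exact tokenizer count when you are close to a limit.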

Example

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "You're handsome."
tokens = enc.encode(text)
print(tokens)        # list of token IDs
print(len(tokens))   # number of tokens

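This prints a list of integer token IDs and then their count. With cl100k_base the text should come out as about 4 tokens, matching the split shown earlier; the exact IDs and count depend on which encoding you load.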
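Token IDs alone are hard to read, so here is a small sketch that maps each ID back to its text piece. The helper token_pieces is my own name; it assumes the encoder object offers encode and decode_single_token_bytes, which tiktoken encodings do. The tiktoken usage is left in comments because it needs the package installed and, on first use, a download of the encoding file.

```python
def token_pieces(enc, text):
    """Return the text piece behind each token ID that enc produces for text."""
    return [
        # A token can be a partial UTF-8 sequence, so decode leniently.
        enc.decode_single_token_bytes(token_id).decode("utf-8", errors="replace")
        for token_id in enc.encode(text)
    ]


# Usage with tiktoken:
#   import tiktoken
#   enc = tiktoken.get_encoding("cl100k_base")
#   print(token_pieces(enc, "You're handsome."))
```

Printing the pieces next to the IDs makes it easy to see where spaces and punctuation end up.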

✅ So: tokens ≠ words. They’re the building blocks the LLM sees. "You're handsome." is about 4 tokens, not 2 words.

Do you want me to show you how to count tokens with your own text (e.g. transcripts, essays, code) so you can see cost/limits before calling the API?
