LLM Tokens Simply Explained | Generated by AI

2025.09

Great question — tokens in LLM APIs can be a bit confusing at first. Let me break it down clearly.


What Tokens Are

A token is a chunk of text: often a whole word, but sometimes a word piece, a punctuation mark, or even a single character. The tokenizer splits your input into these chunks before the model ever sees it.

For example, a common word like "handsome" may be a single token, while a rarer word like "tokenization" may be split into pieces such as "token" and "ization".

So tokens are subword units, optimized so the model can handle many languages and rare words efficiently.


Why Tokens Matter in APIs

  1. Billing – Most LLM APIs charge by tokens (input + output).
  2. Limits – Models have max token limits (e.g., GPT-4 Turbo 128k means up to 128,000 tokens total for input + output).
  3. Efficiency – You want to minimize tokens when sending prompts to reduce cost and stay under limits.
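Since billing is per token, you can estimate the cost of a call before making it. Here is a minimal sketch; the per-million-token prices are placeholders I chose for illustration, not real published rates, so substitute your provider's current pricing.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float = 10.0,
                  output_price_per_m: float = 30.0) -> float:
    """Estimated USD cost for one API call.

    Prices are per 1 million tokens and are placeholder values,
    not actual rates for any specific model.
    """
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# 1,000 input tokens and 500 output tokens at the placeholder rates:
print(estimate_cost(1_000, 500))  # 0.025
```

Output tokens are usually priced higher than input tokens, which is why long generations cost more than long prompts of the same size.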

Handling Tokens in Practice

Count tokens before you send a request. For OpenAI models, the tiktoken library encodes text with the same tokenizer the model uses, so you can check prompt length and estimated cost up front.

Example

import tiktoken

# cl100k_base is the encoding used by GPT-4 and GPT-3.5-turbo models
enc = tiktoken.get_encoding("cl100k_base")
text = "You're handsome."
tokens = enc.encode(text)
print(tokens)        # list of token IDs
print(len(tokens))   # number of tokens

This prints the list of token IDs followed by the token count. For this sentence, cl100k_base should produce 4 tokens ("You", "'re", " handsome", "."), though the exact IDs and count depend on the tokenizer version.


✅ So: tokens ≠ words. They’re the building blocks the LLM sees. "You're handsome." is about 3–4 tokens, not 2 words.
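When a tokenizer isn't handy, a common rule of thumb is roughly 4 characters per token for English text. This is only a heuristic for quick estimates, not a substitute for actually encoding the text:

```python
def rough_token_estimate(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English.

    A heuristic only; real counts come from the model's tokenizer.
    """
    return max(1, round(len(text) / 4))

text = "You're handsome."
print(len(text.split()))           # 2 words
print(rough_token_estimate(text))  # 4 estimated tokens
```

Note how the word count (2) undershoots the token count, which is exactly why budgeting by words rather than tokens leads to surprises.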


Do you want me to show you how to count tokens with your own text (e.g. transcripts, essays, code) so you can see cost/limits before calling the API?

