Scaling LLM Usage to Billions of Tokens | Generated by AI

2026.05

Question: How can someone consume around 10 billion tokens in a year using LLMs, and who are known examples of people doing this (like Huang Dongxu from TiDB)?

Answer:

If you actually want to reach billions of tokens per year, casual chatting won’t get you there. You need to treat LLM usage as a systematic, high-throughput workflow, not just occasional prompting.

Let’s break it down realistically.


1. What “10 billion tokens/year” actually means

10B tokens/year ≈ 27 million tokens/day ≈ 1.1 million tokens/hour, around the clock.

That’s not human-scale interaction anymore. That’s pipeline + automation scale.

So if you’re imagining “just prompting more,” that won’t work. You need machines prompting machines.
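As a sanity check, the raw arithmetic (assuming a 365-day year of round-the-clock usage):

```python
# Back-of-the-envelope arithmetic for 10 billion tokens per year.
TOKENS_PER_YEAR = 10_000_000_000

per_day = TOKENS_PER_YEAR / 365      # ~27.4 million tokens/day
per_hour = per_day / 24              # ~1.14 million tokens/hour
per_second = per_hour / 3600         # ~317 tokens/second, nonstop

print(f"{per_day:,.0f} tokens/day, {per_hour:,.0f} tokens/hour, "
      f"{per_second:,.0f} tokens/second")
```

For comparison, 317 tokens/second is faster than most people can read, sustained every second of the year.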


2. The only 3 ways people reach that scale

(A) Programmatic usage (most important)

Instead of typing prompts into a chat window by hand, you write programs that call the model's API in a loop.

Typical patterns: batch-processing a dataset, summarizing every document in a corpus, classifying or transforming records one at a time.

Example:

for item in dataset:                       # iterate over your corpus
    response = llm(prompt_template(item))  # one API call per record
    store(response)                        # persist, e.g. append to a JSONL file

This alone can generate millions of tokens per hour.


(B) Iterative pipelines (LLM calling LLM)

Heavy users don’t just call once; they chain stages, each step feeding its output into the next: draft, critique, revise, summarize, evaluate.

This multiplies token usage by 5–20x per task.
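A sketch of that multiplication effect. Here `llm()` is a stand-in that just records calls; a real version would wrap an API client:

```python
# Chained pipeline: each stage feeds the previous stage's output back in.
# llm() is a stand-in for a real API call; it records usage for illustration.
calls = []

def llm(prompt: str) -> str:
    calls.append(prompt)
    return f"output<{len(calls)}>"     # placeholder response

def run_task(task: str) -> str:
    draft = llm(f"Draft an answer: {task}")
    critique = llm(f"Critique this draft: {draft}")
    revised = llm(f"Revise using the critique: {draft} / {critique}")
    summary = llm(f"Summarize: {revised}")
    llm(f"Score the final answer 1-10: {summary}")
    return summary

run_task("explain vector clocks")
print(len(calls))  # 5 calls where a casual user would make 1
```

Add retries, multiple drafts, or longer chains and the per-task multiplier quickly reaches the 5–20x range mentioned above.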


(C) Long-context workflows

Using models with large context windows (hundreds of thousands of tokens), you can feed entire codebases, books, or document sets into a single request.

Activities: whole-repository code review, long-document analysis, question answering over entire books.

One run can consume hundreds of thousands of tokens, and repeated daily runs add up quickly.
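A rough way to estimate what one long-context run costs, using the common ~4 characters per token heuristic (a real tokenizer such as tiktoken would be more accurate):

```python
from pathlib import Path

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text/code.
    return len(text) // 4

def estimate_repo_tokens(root: str, suffixes=(".py", ".md")) -> int:
    # Sum estimated tokens over every matching file in a directory tree,
    # i.e. what feeding the whole repository into one request would cost.
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(errors="ignore"))
    return total

print(estimate_tokens("x" * 400_000))  # a 400K-character document → 100000
```

Run `estimate_repo_tokens(".")` on a medium-sized codebase and you will often see six-figure token counts for a single pass.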


3. Practical ways YOU can reach high token usage

If your goal is intentional (learning / building), here are concrete strategies:

1. Build a “thinking loop system”

Instead of asking once and taking the first answer, have the model critique and refine its own output for several rounds.

This turns one call into many, and one question into a full reasoning session.
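A minimal self-refinement loop, again with `llm()` as a stand-in for a real API call:

```python
# Self-refinement loop: ask once, then let the model critique and improve
# its own answer for a fixed number of rounds. llm() is a stand-in.
def llm(prompt: str) -> str:
    return f"response to: {prompt[:40]}"

def thinking_loop(question: str, rounds: int = 4) -> list:
    history = [llm(question)]                 # initial answer
    for _ in range(rounds):
        critique = llm(f"Find flaws in this answer: {history[-1]}")
        history.append(llm(f"Rewrite the answer, fixing: {critique}"))
    return history

answers = thinking_loop("How do B-trees handle splits?")
print(len(answers))  # 1 initial answer + 4 refinements = 5
```

Each round costs two calls (critique + rewrite), so four rounds turns one question into nine model calls.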


2. Use LLM for everything you read

You mentioned:

prompting + reading + iterating

Upgrade it: for every article, paper, or codebase you read, also have the model summarize it, quiz you on it, and relate it to what you already know.

You multiply token usage several times over for the same reading time.


3. Synthetic data generation

This is what serious builders do: generate training examples, Q&A pairs, code variants, and evaluation sets programmatically, thousands at a time.

This is exactly how people burn billions of tokens.
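A sketch of a synthetic-data loop. The `llm()` stub stands in for a real API call that would return a model-written example:

```python
import json

# Synthetic Q&A generation over a topic list. llm() is a stand-in for a
# real API call; in practice each call returns a model-written Q&A pair.
def llm(prompt: str) -> str:
    return json.dumps({"question": prompt, "answer": "placeholder"})

topics = ["indexes", "transactions", "replication"]
VARIANTS_PER_TOPIC = 3

dataset = []
for topic in topics:
    for i in range(VARIANTS_PER_TOPIC):
        pair = llm(f"Write Q&A variant #{i} about {topic}")
        dataset.append(json.loads(pair))

print(len(dataset))  # 3 topics x 3 variants = 9 generated examples
```

Scale the topic list to thousands of entries and add a validation pass per example, and token consumption grows multiplicatively.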


4. Continuous background jobs

Run scripts on a schedule: nightly summarizers, monitors that digest new content, batch jobs that process whatever accumulated during the day.

This is how you cross from millions of tokens into the billions.
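A skeleton for such a job. `fetch_new_items()` and `process()` are stand-ins; a real deployment would run this under cron or a task scheduler rather than a bare loop:

```python
import time

# Continuous background job: poll for new work, process it, sleep, repeat.
# fetch_new_items() and process() are stand-ins for real data sources and
# LLM calls; the loop is bounded here so the example terminates.
def fetch_new_items() -> list:
    return ["item-a", "item-b"]        # pretend two items arrived

def process(item: str) -> str:
    return f"summary of {item}"        # stand-in for an LLM call

processed = []
for _cycle in range(3):                # `while True:` in production
    for item in fetch_new_items():
        processed.append(process(item))
    time.sleep(0.01)                   # real jobs sleep for minutes or hours

print(len(processed))  # 3 cycles x 2 items = 6
```

The point is that the job keeps consuming tokens whether or not you are at the keyboard.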


5. Multi-agent systems

Create agents that plan, delegate subtasks to one another, critique results, and retry failures.

Each step = multiple LLM calls.

This explodes token usage naturally.
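A minimal planner/worker/critic loop to illustrate. Each "agent" is just a different prompt over the same stand-in `llm()`:

```python
# Minimal multi-agent sketch: a planner proposes steps, a worker executes
# each one, and a critic reviews the result. llm() is a stand-in; each
# agent role is just a different prompt prefix over the same model.
calls = 0

def llm(prompt: str) -> str:
    global calls
    calls += 1
    return f"[{calls}] {prompt[:30]}"

def planner(goal: str) -> list:
    llm(f"Plan steps for: {goal}")
    return ["research", "draft", "polish"]   # pretend the plan has 3 steps

def worker(step: str) -> str:
    return llm(f"Execute step: {step}")

def critic(result: str) -> str:
    return llm(f"Review: {result}")

for step in planner("write a design doc"):
    critic(worker(step))

print(calls)  # 1 planning call + 3 x (work + review) = 7
```

A real system adds retries, tool calls, and inter-agent messages, so the per-goal call count grows much faster than this.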


4. People known for extreme LLM usage

Huang Dongxu

Co-founder and CTO of PingCAP, the company behind TiDB; cited in the question as an example of someone consuming billions of tokens a year.

Andrej Karpathy

Former Tesla AI director and OpenAI founding member; known for building and publicly sharing LLM-heavy coding and learning workflows.

Sam Altman

CEO of OpenAI.

Emad Mostaque

Founder and former CEO of Stability AI.

5. Reality check (important)

If you’re doing:

manual prompting + reading

You might reach tens of millions of tokens per year at best.

To hit 10 billion, you need pipelines that run without you in the loop.

So the real shift is:

from “user” → “system builder”
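To see the gap concretely, here is the arithmetic under generous manual-usage assumptions (50 substantial chats a day at ~2,000 tokens each, both figures assumed for illustration):

```python
# How far even heavy manual usage falls short of the 10B target.
TARGET = 10_000_000_000

# Assumed manual ceiling: 50 substantial chats/day, ~2,000 tokens each.
manual_per_year = 50 * 2_000 * 365     # 36,500,000 tokens/year

shortfall = TARGET / manual_per_year
print(f"manual: {manual_per_year:,} tokens/year, "
      f"{shortfall:.0f}x short of 10B")
```

Even this heavy manual pace leaves you more than two orders of magnitude short, which is why automation is the only route.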


6. A better mental model

Think of tokens as fuel: a chat session sips it, a pipeline burns it.

If you want to burn fuel at scale, you need programs running continuously.


7. A practical path for you (based on your background)

Given you already work across full-stack and ML, you should try:

  1. Build a dataset generator
  2. Add:

    • evaluator loop
    • improvement loop
  3. Run it daily

That alone can push you into the hundreds-of-millions-of-tokens-per-year range.
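The three steps above can be sketched as one loop. Everything here is a stand-in (`llm()`, the scoring logic, the 7.0 quality bar are all assumptions for illustration):

```python
# Dataset generator with an evaluator loop and an improvement loop,
# matching steps 1-3 above. llm() and the quality bar are stand-ins.
QUALITY_BAR = 7.0

def llm(prompt: str) -> str:
    return f"output for: {prompt[:30]}"

def score(example: str) -> float:
    llm(f"Rate 0-10 for quality: {example}")   # evaluator call
    return 6.0                                 # stand-in: always below the bar

def generate(topic: str) -> str:
    example = llm(f"Generate a training example about {topic}")
    if score(example) < QUALITY_BAR:           # improvement loop triggers
        example = llm(f"Improve this example: {example}")
    return example

dataset = [generate(t) for t in ["joins", "sharding"]]
print(len(dataset))  # 2 examples, each costing 3 stand-in LLM calls
```

Run daily over a large topic list, each example costs a generate, an evaluate, and (often) an improve call, which is exactly the multiplier this section is about.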


If you want, I can design a personal “10B token system architecture” tailored to your skills (full-stack + ML).

