FineWeb GPT-2 Training Progress Report | Generated by AI


Training status: out-fineweb-gpt3 (GPT-2 124M on FineWeb-Edu)

Process: PID 1200744, python3.11 -u train.py config/train_fineweb_gpt3.py --compile=False
Started: 2026-05-12 11:49 → elapsed ≈ 45h 52m

Progress

          now        target     remaining
iter      ~14,570    19,073     ~4,500 (24%)
tokens    ~7.64 B    ~10.0 B    ~2.36 B

Tokens/step = 524,288 (bs 4 × block 1024 × grad_accum 128).
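
The progress figures follow directly from the tokens-per-step arithmetic. A minimal sketch, assuming a single training process (no data-parallel multiplier) and using the rounded iteration counts from this report; the constant names are illustrative and are not read from the actual config:

```python
# Values taken from the report; names are illustrative only.
BATCH_SIZE = 4            # sequences per micro-batch
BLOCK_SIZE = 1024         # tokens per sequence
GRAD_ACCUM_STEPS = 128    # micro-batches per optimizer step
CURRENT_ITER = 14_570     # approximate, as reported
TARGET_ITER = 19_073

# Assumption: one process, so no world-size factor is applied.
tokens_per_step = BATCH_SIZE * BLOCK_SIZE * GRAD_ACCUM_STEPS
tokens_seen = CURRENT_ITER * tokens_per_step
tokens_target = TARGET_ITER * tokens_per_step
remaining_iters = TARGET_ITER - CURRENT_ITER
remaining_frac = remaining_iters / TARGET_ITER

print(f"tokens/step : {tokens_per_step:,}")
print(f"tokens seen : {tokens_seen / 1e9:.2f} B of {tokens_target / 1e9:.2f} B")
print(f"remaining   : {remaining_iters:,} iters ({remaining_frac:.0%})")
```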

ETA
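
A rough ETA can be extrapolated from the elapsed time and iteration count reported above. This is only a sketch under two assumptions that the report does not confirm: the run started at iteration 0 at the reported start time (no resume from a checkpoint), and average throughput, including evaluation pauses, stays constant. The function name `estimate_remaining` is hypothetical.

```python
from datetime import timedelta


def estimate_remaining(elapsed: timedelta, current_iter: int, target_iter: int) -> timedelta:
    """Linear extrapolation: average seconds per iteration so far times iterations left."""
    if current_iter <= 0:
        raise ValueError("current_iter must be positive to estimate throughput")
    sec_per_iter = elapsed.total_seconds() / current_iter
    iters_left = max(target_iter - current_iter, 0)
    return timedelta(seconds=sec_per_iter * iters_left)


# Figures from the report: 45h 52m elapsed, ~14,570 of 19,073 iterations done.
remaining = estimate_remaining(timedelta(hours=45, minutes=52), 14_570, 19_073)
print(f"estimated time remaining: {remaining.total_seconds() / 3600:.1f} h")
```

Under those assumptions the extrapolation lands at about 14 hours; if the run was resumed or throughput changes, the estimate shifts accordingly.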

Loss trajectory (val)

step  5000 → 3.260
step 10000 → 3.112
step 12000 → 3.087
step 13000 → 3.035
step 14000 → 3.012   ← best so far
step 14500 → 3.030

Val loss is plateauing in the 3.01–3.06 band, while train loss is still drifting down (3.04 at step 14500). The cosine LR schedule is at ~1.4e-4 (decaying from 6e-4 to a minimum of 6e-5), so the remaining ~4.5k iters cover the final decay down to the 6e-5 floor.
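
The learning-rate figure can be sanity-checked with a linear-warmup plus cosine-decay schedule of the kind commonly used for GPT-2 training. This is a sketch, not the run's actual code: the peak and minimum LR come from the report, while `WARMUP_ITERS` and `LR_DECAY_ITERS` are assumed values that the report does not state.

```python
import math

LEARNING_RATE = 6e-4      # peak LR, from the report
MIN_LR = 6e-5             # LR floor, from the report
WARMUP_ITERS = 700        # assumption: warmup length is not stated in the report
LR_DECAY_ITERS = 19_073   # assumption: decay ends at the target iteration


def get_lr(it: int) -> float:
    """Linear warmup to LEARNING_RATE, then cosine decay down to MIN_LR."""
    if it < WARMUP_ITERS:
        return LEARNING_RATE * (it + 1) / (WARMUP_ITERS + 1)
    if it > LR_DECAY_ITERS:
        return MIN_LR
    decay_ratio = (it - WARMUP_ITERS) / (LR_DECAY_ITERS - WARMUP_ITERS)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes from 1 to 0
    return MIN_LR + coeff * (LEARNING_RATE - MIN_LR)


for step in (14_570, 17_000, 19_073):
    print(f"iter {step:>6}: lr = {get_lr(step):.2e}")
```

With these assumed settings the schedule gives a value close to 1.4e-4 at iteration 14,570 and reaches the 6e-5 floor at iteration 19,073; a different warmup length moves the mid-run value slightly.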

Health

Notes / decisions worth flagging

Want me to watch for the run finishing, or back up the current ckpt?

