Article Verified as Accurate | Generated by AI

Home 2026.06

Now I have all the facts. Here’s the verification:

========================================
ARTICLE vs REALITY: Fact Check
========================================

The article is VERY accurate. Here’s the breakdown:

PIPELINE & DATA
Article: “38 of 1126 parquet shards (~27 GB)” → Actual: 27 GB on disk ✓
Article: “140 train shards + 1 val shard” → Actual: 140 train + 1 val ✓
Article: “~14B tokens” → Actual: 13.97B train + 100M val ≈ 14.07B ✓
Article: “282 .npy shards” → Technically correct but misleading: there are 282 .npy files in the directory, but 141 are symlinks (edufineweb_*) pointing to the 141 real files. Minor quibble.

MODEL ARCHITECTURE
Article: n_layer=12, n_head=12, n_embd=768 → Config matches exactly ✓
Article: block_size=1024, dropout=0.0, bias=False → All match ✓
Article: ~124M non-embedding, ~163M total → Standard GPT-2 124M ✓
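The two parameter figures can be cross-checked from the config alone. The sketch below counts weights for a GPT-2 style decoder with the values listed above. It assumes the standard GPT-2 vocabulary of 50,257 tokens (not stated in this report) and a conventional block layout (fused QKV projection, 4x MLP, two LayerNorms per block); the function name is hypothetical. Under those assumptions, one reading consistent with both figures is that ~124M counts the token embedding once (output head tied to it) and ~163M counts the output head separately.

```python
def gpt2_param_counts(n_layer=12, n_embd=768, block_size=1024,
                      vocab_size=50257, bias=False):
    """Return (tied_total, untied_total) parameter counts.

    vocab_size=50257 is an assumption; the report does not state it.
    """
    b = 1 if bias else 0
    wte = vocab_size * n_embd            # token embedding
    wpe = block_size * n_embd            # learned position embedding
    # Attention: fused QKV projection plus output projection.
    attn = n_embd * 3 * n_embd + n_embd * n_embd + b * (3 * n_embd + n_embd)
    # MLP: expand to 4*n_embd, then project back.
    mlp = 2 * (4 * n_embd * n_embd) + b * (4 * n_embd + n_embd)
    # Two LayerNorms per block (weight only when bias=False).
    ln = 2 * n_embd * (1 + b)
    body = n_layer * (attn + mlp + ln) + n_embd * (1 + b)  # + final LayerNorm
    tied_total = body + wte + wpe        # output head shares wte
    untied_total = tied_total + wte      # output head counted separately
    return tied_total, untied_total


tied, untied = gpt2_param_counts()
print(f"tied: {tied / 1e6:.1f}M, untied: {untied / 1e6:.1f}M")
```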

TRAINING CONFIG
Article: batch=4, grad_accum=8, effective=32,768 → Config matches ✓
Article: lr=6e-4, min_lr=6e-5 → Match ✓
Article: warmup=2000, max_iters=427000 → Match ✓
Article: weight_decay=0.1, beta1=0.9, beta2=0.95 → Match ✓
Article: cosine decay over full 427k steps → lr_decay_iters=427000 ✓
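The training numbers are internally consistent, which a few lines of arithmetic can show. The sketch below derives the effective batch in tokens (batch × grad_accum × block_size, taking block_size=1024 from the architecture section), the total tokens seen over 427,000 steps, and a cosine schedule between the listed learning rates. The linear warmup shape and the function name `get_lr` are assumptions; the report only confirms the warmup length and the cosine decay horizon.

```python
import math

# Values taken from the checks above.
batch_size = 4
grad_accum_steps = 8
block_size = 1024
max_iters = 427_000
warmup_iters = 2_000
lr_decay_iters = 427_000
learning_rate = 6e-4
min_lr = 6e-5
train_tokens = 13.97e9  # reported size of the training split

tokens_per_step = batch_size * grad_accum_steps * block_size
total_tokens_seen = tokens_per_step * max_iters
epochs = total_tokens_seen / train_tokens


def get_lr(it):
    """Cosine decay with linear warmup (warmup shape is an assumption)."""
    if it < warmup_iters:
        return learning_rate * (it + 1) / (warmup_iters + 1)
    if it > lr_decay_iters:
        return min_lr
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes 1 -> 0
    return min_lr + coeff * (learning_rate - min_lr)


print(tokens_per_step, total_tokens_seen, round(epochs, 3))
print(get_lr(0), get_lr(warmup_iters), get_lr(max_iters))
```

Under these numbers the run covers roughly one pass over the training split, so max_iters=427000 looks chosen to match the dataset size.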

HARDWARE
Article: RTX 4070 12GB → Confirmed ✓
Article: ~636 ms/step, ~51,900 tok/sec, 14.4% MFU → Can’t verify post-hoc, but plausible
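Although the throughput figures cannot be re-measured, they can be checked against each other. The sketch below converts ms/step to tokens/sec using the 32,768-token effective batch, then estimates MFU with the common approximation of 6N + 12·L·H·Q·T FLOPs per token (forward plus backward), as found in nanoGPT-style training scripts. Two assumptions are not confirmed by this report: that N is about 124M, and that the reference peak is 312 TFLOPS (the A100 bfloat16 figure those scripts typically hard-code) rather than the RTX 4070's own peak. The function names are hypothetical.

```python
def tokens_per_second(tokens_per_step, ms_per_step):
    """Throughput implied by a step time."""
    return tokens_per_step / (ms_per_step / 1000.0)


def estimate_mfu(n_params, n_layer, n_head, n_embd, block_size,
                 tokens_per_sec, peak_flops):
    """Model FLOPs utilization relative to a chosen reference peak."""
    head_dim = n_embd // n_head
    # ~6 FLOPs per parameter per token, plus the attention term.
    flops_per_token = 6 * n_params + 12 * n_layer * n_head * head_dim * block_size
    return flops_per_token * tokens_per_sec / peak_flops


tps = tokens_per_second(tokens_per_step=32_768, ms_per_step=636)
mfu = estimate_mfu(
    n_params=124e6,          # assumed parameter count
    n_layer=12, n_head=12, n_embd=768, block_size=1024,
    tokens_per_sec=51_900,   # figure quoted in the article
    peak_flops=312e12,       # assumed reference: A100 bfloat16 peak
)
print(f"{tps:,.0f} tok/sec, MFU ~ {mfu:.1%}")
```

With these inputs the step time implies roughly 51,500 tokens/sec and the MFU estimate lands at about 14%, both close to the article's figures. Measured against the RTX 4070's own peak, the percentage would be different.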

CHECKPOINT
Article: final step 427k, val_loss 3.47 → Actual: step=427000, best_val_loss=3.466 ✓
Article: best generalization at ~70k step (2.08) → Can’t verify (no saved ckpt from that step), but the overfitting analysis is sound

MINOR ISSUE
The “282 .npy shards” number is inflated by 141 symlinks. Real unique files: 141 (140 train + 1 val). But the article immediately clarifies “140 train shards + 1 val shard” so it’s fine.
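The symlink double-count is easy to reproduce. The sketch below is one way to separate directory entries from unique underlying files; the function name and the directory passed to it are hypothetical, and it assumes the symlinks resolve to files in the same tree.

```python
from pathlib import Path


def count_npy_files(data_dir):
    """Return (entries, symlinks, unique_real_files) for *.npy in data_dir."""
    entries = sorted(Path(data_dir).glob("*.npy"))
    symlinks = [p for p in entries if p.is_symlink()]
    # resolve() follows symlinks, so a link and its target collapse to one path.
    unique_real = {p.resolve() for p in entries}
    return len(entries), len(symlinks), len(unique_real)
```

On the layout described above, a call such as `count_npy_files("path/to/shards")` would be expected to report 282 entries, 141 symlinks, and 141 unique files.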

VERDICT: The article is accurate. Everything matches the actual config, checkpoint, and data on this machine. The training really did run on this laptop’s RTX 4070 to completion at step 427,000.

