MI300X 760M Model Training Estimate

Generated by AI, 2026.06

Estimated training time for a 760M-parameter model on 14.6B tokens, single MI300X: ~24-35 hours

Here’s the breakdown:

Total FLOPs:     6.66 × 10^19 (≈ 6 × 760M parameters × 14.6B tokens)
MI300X peak:     1,307 TFLOPS (bf16, dense)
Realistic MFU:   40-60% (model FLOPs utilization; single GPU, no communication overhead)

At 40% MFU:  ~35 hours
At 50% MFU:  ~28 hours  ← most likely
At 60% MFU:  ~24 hours
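
The three figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the common 6 × parameters × tokens approximation for training FLOPs (which matches the 6.66 × 10^19 total quoted above); the function name `training_hours` is illustrative, not from any library.

```python
def training_hours(params: float, tokens: float, peak_tflops: float, mfu: float) -> float:
    """Estimate wall-clock training hours on one accelerator.

    Assumes total training FLOPs ~= 6 * params * tokens (an approximation)
    and that the device sustains `mfu` times its peak throughput.
    """
    total_flops = 6 * params * tokens
    sustained_flops_per_s = peak_tflops * 1e12 * mfu
    return total_flops / sustained_flops_per_s / 3600


if __name__ == "__main__":
    # 760M parameters, 14.6B tokens, MI300X bf16 peak of 1,307 TFLOPS
    for mfu in (0.40, 0.50, 0.60):
        hours = training_hours(760e6, 14.6e9, 1307, mfu)
        print(f"At {mfu:.0%} MFU: ~{hours:.1f} hours")
```

Because the estimate is inversely proportional to MFU, measuring actual throughput for the first few hundred steps and plugging it in would tighten the range considerably.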

Cross-validation with Karpathy’s nanochat:

d24 (~1.5B) on 8×H100: ~2 hours for 4.64e18 FLOPs
Our run: 6.66e19 FLOPs = 14× more compute
On 8×H100: 14 × 2 = 28 hours
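On 1×MI300X (1.33× faster than one H100): 28 × 8 / 1.33 ≈ 168 hours, since the 28 hours is wall-clock time spread across 8 GPUs
Caveat: 4.64e18 FLOPs in 2 hours on 8×H100 is only ~81 TFLOPS per GPU (under 10% MFU), so either the quoted nanochat figures are off or that run was far from compute-efficient; taken at face value, this check gives a loose upper bound and does not confirm the 24-35 hour estimate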
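
The ratios in this cross-check can be recomputed from the quoted figures alone. The sketch below uses only the numbers stated in this section (they are not independently verified), and the variable names are illustrative. It derives the compute ratio and the throughput that the reference run implies.

```python
# Figures as quoted in this section (not independently verified).
REF_FLOPS = 4.64e18   # nanochat d24 run
REF_HOURS = 2.0       # wall-clock time on the reference cluster
REF_GPUS = 8          # 8×H100
OUR_FLOPS = 6.66e19   # 760M model on 14.6B tokens

# How much more compute our run needs than the reference run.
compute_ratio = OUR_FLOPS / REF_FLOPS

# Throughput implied by the reference run, in TFLOPS.
aggregate_tflops = REF_FLOPS / (REF_HOURS * 3600) / 1e12
per_gpu_tflops = aggregate_tflops / REF_GPUS

if __name__ == "__main__":
    print(f"Compute ratio:        {compute_ratio:.1f}x")
    print(f"Implied aggregate:    {aggregate_tflops:.0f} TFLOPS")
    print(f"Implied per-GPU rate: {per_gpu_tflops:.1f} TFLOPS")
```

Comparing the implied per-GPU rate against the accelerator's peak is a quick way to judge whether a reference run is a fair basis for extrapolation.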

Comparison with your 124M model on RTX 4070:

124M on 4070, 10B tokens: ~15-20 hours
760M is 6.1× more FLOPs/token
MI300X is ~10.5× faster (bandwidth-bound case)
Net: 15-20 × 6.1 × (14.6/10) / 10.5 ≈ 13-17 hours (if bandwidth-limited)
But training a 760M model is compute-bound rather than bandwidth-bound, so the MFU-based figure applies → closer to 24-28 hours
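
The bandwidth-limited scaling above is a single chain of multiplications, sketched here with the factors exactly as quoted (6.1× FLOPs/token, 14.6B vs 10B tokens, ~10.5× speedup). The helper name `scaled_hours` is hypothetical.

```python
def scaled_hours(base_hours: float, flops_per_token_ratio: float,
                 token_ratio: float, speedup: float) -> float:
    """Scale a known training time to a larger model, more tokens, faster hardware."""
    return base_hours * flops_per_token_ratio * token_ratio / speedup


if __name__ == "__main__":
    # Baseline: 124M model on an RTX 4070, 10B tokens, 15-20 hours (as quoted above)
    low = scaled_hours(15, 6.1, 14.6 / 10, 10.5)
    high = scaled_hours(20, 6.1, 14.6 / 10, 10.5)
    print(f"Bandwidth-limited estimate: ~{low:.0f}-{high:.0f} hours")
```

Swapping the 10.5× bandwidth ratio for a measured compute-throughput ratio between the two cards would give the compute-bound counterpart of this estimate.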

Bottom line: ~1-1.5 days on a single MI300X. Not bad for 14.6B tokens through a 760M model. You could start it in tmux and come back tomorrow.
