MI300X 760M Model Training Estimate | Generated by AI

2026.06

Estimated training time for a 760M-parameter model on 14.6B tokens, on a single MI300X: ~24-35 hours

Here’s the breakdown:

Total FLOPs:     6.66 × 10^19
MI300X peak:     1,307 TFLOPS (bf16, dense)
Realistic MFU:   40-60% (single GPU, no communication overhead)

At 40% MFU:  ~35 hours
At 50% MFU:  ~28 hours  ← most likely
At 60% MFU:  ~24 hours
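
The figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the method used to generate the estimate: it assumes the common rule of thumb of about 6 FLOPs per parameter per token, which happens to give the 6.66 × 10^19 total quoted here. The function names total_flops and training_hours are made up for illustration.

```python
# Sketch of the training-time arithmetic.
# Assumption: total training FLOPs ~= 6 * parameters * tokens.

PARAMS = 760e6        # model size in parameters
TOKENS = 14.6e9       # training tokens
PEAK_TFLOPS = 1307    # MI300X bf16 peak, as quoted in the breakdown


def total_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs using the 6 * N * D rule of thumb."""
    return 6 * params * tokens


def training_hours(params: float, tokens: float,
                   peak_tflops: float, mfu: float) -> float:
    """Wall-clock hours at a given model FLOPs utilization (0 < mfu <= 1)."""
    sustained_flops_per_s = peak_tflops * 1e12 * mfu
    return total_flops(params, tokens) / sustained_flops_per_s / 3600


if __name__ == "__main__":
    print(f"Total FLOPs: {total_flops(PARAMS, TOKENS):.2e}")
    for mfu in (0.40, 0.50, 0.60):
        hours = training_hours(PARAMS, TOKENS, PEAK_TFLOPS, mfu)
        print(f"At {mfu:.0%} MFU: ~{hours:.1f} hours")
```

Changing PARAMS, TOKENS, or PEAK_TFLOPS gives the same style of estimate for other models or GPUs. The MFU value remains the main uncertainty.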

Cross-validation with Karpathy’s nanochat:

Comparison with your 124M model on RTX 4070:

Bottom line: ~1-1.5 days on a single MI300X. Not bad for 14.6B tokens through a 760M model. You could start it in tmux and come back tomorrow.

