Next Steps for Training Decision | Generated by AI

2026.06

Question: What should I do next after reviewing the Nanochat 760M MI300X setup document I uploaded?

Answer:

Based on the detailed breakdown in your uploaded file, here’s a structured next-step plan:


1. Decide on Training Strategy

You have two options:

Option A: Continue with current SDPA setup

Option B: Attempt speed optimizations

Recommendation: If reliability is your priority, stick with the current setup (Option A). If you want to experiment and can tolerate potential build issues, try Option B selectively, one optimization at a time.


2. Prepare Training


3. Run Pretraining


4. Post-Training Pipeline

After base pretraining:

  1. Evaluate the base model with the DCLM CORE benchmark.
  2. Run SFT on the chat data (identity_conversations.jsonl).
  3. Evaluate the SFT model with the chat benchmarks.
  4. Launch interactive chat:

    python -m scripts.chat_cli -p 'Hello!'
    python -m scripts.chat_web
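
The four steps above run in a fixed order, and each one only makes sense if the previous one succeeded. A minimal sketch of that ordering, assuming each stage can be launched as a single command: the `run_stages` helper is hypothetical, and the module names `scripts.base_eval`, `scripts.chat_sft`, and `scripts.chat_eval` are assumptions modeled on the `scripts.chat_cli` naming, so check them (and any launcher or flags they need) against your checkout before running for real.

```python
import subprocess
import sys


def run_stages(stages, dry_run=False):
    """Run (name, argv) stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    With dry_run=True, commands are only printed, never executed.
    """
    completed = []
    for name, argv in stages:
        print(f"[stage] {name}: {' '.join(argv)}")
        if dry_run:
            completed.append(name)
            continue
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"[stage] {name} failed (exit code {result.returncode}); stopping")
            break
        completed.append(name)
    return completed


# Stage commands. Only the chat_cli invocation comes from the plan above;
# the first three module names are ASSUMED and may differ in your checkout.
STAGES = [
    ("evaluate base model (CORE)", [sys.executable, "-m", "scripts.base_eval"]),  # assumed
    ("SFT on chat data", [sys.executable, "-m", "scripts.chat_sft"]),  # assumed
    ("evaluate SFT model", [sys.executable, "-m", "scripts.chat_eval"]),  # assumed
    ("chat smoke test", [sys.executable, "-m", "scripts.chat_cli", "-p", "Hello!"]),
]

# Print the plan without executing anything.
run_stages(STAGES, dry_run=True)
```

`scripts.chat_web` is left out on purpose: it starts a long-running server, so it is better launched by hand once the smoke test passes. Calling `run_stages(STAGES)` without `dry_run` would execute the commands for real.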
    

5. Optional Improvements for Future Runs


Suggested Immediate Next Step


Start by confirming your priority: reliability or speed. That determines whether you run run_mi300x_d24.sh now or experiment with the performance tweaks first.
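
As a rough illustration of that decision, the sketch below maps a stated priority to the next action. The `next_command` helper is hypothetical, and invoking the script through `bash` from the current directory is an assumption about how run_mi300x_d24.sh is meant to be launched.

```python
def next_command(priority):
    """Return the argv to run next, or None when experimentation comes first."""
    key = priority.strip().lower()
    if key == "reliability":
        # Option A: launch the existing script unchanged.
        # ASSUMPTION: the script is run with bash from the current directory.
        return ["bash", "run_mi300x_d24.sh"]
    if key == "speed":
        # Option B: there is no single command; try the tweaks one at a time first.
        return None
    raise ValueError(f"unknown priority: {priority!r}")


print(next_command("reliability"))  # ['bash', 'run_mi300x_d24.sh']
```

Returning `None` for the speed path is a deliberate choice here: Option B is a set of experiments to try selectively, not a single launch command.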

