SM120 Blackwell Inference Optimization | Generated by AI

2026.06

Question: What does “SM120” mean in the context of “Inference Optimization with AI Agents in the Workflow”?

Answer:

SM120 is the CUDA/PTX target name for NVIDIA GPUs with compute capability 12.0. “SM” stands for Streaming Multiprocessor, and the number is the compute capability written without the dot (12.0 becomes 120).

Specifically, SM120 corresponds to the consumer and workstation Blackwell GPUs (GeForce RTX 50 series, RTX PRO Blackwell), which launched in 2025. The data center Blackwell parts (B100, B200, GB200) are SM100, not SM120.

SM numbers follow compute capability, which mostly tracks NVIDIA’s architecture generations:

SM version	Architecture	GPUs
SM80	Ampere	A100, A30
SM86	Ampere	RTX 3090, A10
SM89	Ada Lovelace	RTX 4090
SM90	Hopper	H100, H200
SM100	Blackwell (data center)	B100, B200, GB200
SM103	Blackwell Ultra	B300, GB300
SM120	Blackwell (consumer/workstation)	GeForce RTX 50 series, RTX PRO Blackwell
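As a rough illustration of how these names are read, the last digit is the minor version and the digits before it are the major version. The helper below is a minimal sketch of that rule; `parse_sm` is a hypothetical name for illustration and not part of CUDA or PyTorch, and it deliberately ignores suffixed targets such as `sm_90a`.

```python
import re


def parse_sm(name):
    """Split a target name such as 'SM120' or 'sm_90' into (major, minor).

    Hypothetical helper for illustration only: the final digit is treated
    as the minor version and everything before it as the major version.
    """
    match = re.fullmatch(r"sm_?(\d+)(\d)", name.strip().lower())
    if match is None:
        raise ValueError(f"not a plain SM target name: {name!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for target in ("SM80", "SM89", "sm_90", "SM100", "SM120"):
        major, minor = parse_sm(target)
        print(f"{target}: compute capability {major}.{minor}")
```

Read this way, SM120 is compute capability 12.0 and SM89 is 8.9.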

So “SM120 Inference Optimization with AI Agents in the Workflow” is likely a talk or paper title from GTC 2025 or a similar NVIDIA event, about optimizing LLM inference on SM120-class Blackwell GPUs (compute capability 12.0) using agentic workflows.

In CUDA terms, you’d target this with (CUDA 12.8 or newer):

nvcc --gpu-architecture=sm_120 ...

or in PyTorch:

torch.cuda.get_device_capability()  # returns (12, 0) for SM120
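To connect the two snippets, the tuple PyTorch returns can be turned back into the `sm_XY` name that nvcc expects. The following is a minimal sketch, assuming PyTorch may or may not be installed; `sm_target` and `detect_sm_target` are hypothetical helper names, not PyTorch APIs.

```python
def sm_target(capability):
    """Format a (major, minor) compute capability as an nvcc target name."""
    major, minor = capability
    return f"sm_{major}{minor}"


def detect_sm_target():
    """Return the sm_XY name of the current GPU, or None if it cannot be read.

    Returns None when PyTorch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return sm_target(torch.cuda.get_device_capability())


if __name__ == "__main__":
    target = detect_sm_target()
    if target is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    else:
        print(f"Compile with: nvcc --gpu-architecture={target} ...")
```

On an SM120 GPU, the capability `(12, 0)` maps to `sm_120`, which matches the nvcc flag above.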

Do you have a link to the specific talk/paper? I can dig into the actual content if so.
