Overcoming Decision Friction for Cloud Computing | Generated by AI
Question: Why did you not use cloud GPUs earlier, and what reflections do you have now after finally using Alibaba Cloud’s GPU instance?
Answer:
Your Reflection — An Honest Self-Assessment
You identified the real blockers clearly:
| Blocker | Reality Check |
|---|---|
| “I have a local 4070, good enough” | An RTX 4070 delivers roughly 29 TFLOPS FP32, already more than a T4, but cloud GPUs scale: you can rent an A100/H100 when needed |
| “Cloud setup is too troublesome” | One-time friction. After the first setup, it’s faster than managing local hardware |
| “Lambda payment didn’t work” | A logistics problem, not a capability problem. Gave up too easily |
| “Chinese cloud UI is bad / unfamiliar” | UI preference — a very minor issue dressed up as a reason |
| “Bias against smaller GPU providers” | Justified sometimes, but cost you years |
The Core Insight You Had
“The most important thing is to train a model with GPT-3 / GPT-4 level capability. Everything else is secondary.”
This is the right framing. When you remember the goal, the excuses dissolve:
- UI language? Irrelevant to model quality.
- Payment friction? Solvable in one afternoon.
- Preference for local GPU? Fine for experiments, not for serious training runs.
What This Means Going Forward
- Compute is now unblocked. You have access to real GPU hours.
- Don’t optimize the environment — optimize the model:
  - Local 4070 → fast iteration, debugging, small experiments
  - Cloud GPU (T4 / A10 / A100) → actual training runs
- The two-to-three years lost were not about GPU access. They were about decision friction. Now that friction is gone.
One Reminder
Cloud GPU time costs money per hour. The discipline now shifts to:
- Be clear on what experiment you are running before you start the instance
- Stop the instance when not in use
- Track your runs — don’t repeat failed experiments
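The run-tracking discipline above can be sketched as a minimal experiment log. This is a hypothetical helper (the file name `runs.jsonl` and all function names are illustrative, not from any real tool): each run is appended with its estimated cost, and a quick check prevents re-running an experiment already logged as failed.

```python
import json
from pathlib import Path

LOG_PATH = Path("runs.jsonl")  # hypothetical append-only run log


def log_run(name: str, gpu: str, hours: float,
            rate_per_hour: float, result: str) -> dict:
    """Append one experiment record, with estimated cost, to the log."""
    record = {
        "name": name,
        "gpu": gpu,
        "hours": hours,
        "cost": round(hours * rate_per_hour, 2),  # hourly billing estimate
        "result": result,  # e.g. "ok" or "failed"
    }
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record


def already_failed(name: str) -> bool:
    """True if a run with this name was previously logged as failed."""
    if not LOG_PATH.exists():
        return False
    for line in LOG_PATH.read_text().splitlines():
        rec = json.loads(line)
        if rec["name"] == name and rec["result"] == "failed":
            return True
    return False
```

Checking `already_failed("some-experiment")` before starting an instance is the cheapest way to enforce the "don't repeat failed experiments" rule; the cost field keeps per-hour spending visible per run.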
The goal is real. The compute is now available. Execute.