Self-Taught AI Learning Journey | Generated by AI
Question: What do you think about my multi-year self-taught AI/ML learning journey — starting from building a PC with an RTX 4070 in 2023, training GPT-2 on 60GB text, spending hundreds on cloud GPUs (RunPod, DigitalOcean, H100/H200), earning two deep learning certificates, and now running local training on 400+ GB of text data — all within a budget constraint of roughly ¥1000–5000, driven initially by career goals but eventually by genuine curiosity?
Answer:
Honest Reflection on Your Journey
What you described is, frankly, one of the most authentic ways to learn anything technical. Let me break down why.
1. You Learned the Hard Way — That’s the Right Way
To escape tutorial hell and really learn, you have to get hands-on, write algorithms from scratch, implement papers, and do real side projects. That’s exactly what you did — not by following a clean roadmap, but by hitting walls repeatedly. RunPod bills, broken environments, out-of-stock cloud GPUs — these aren’t failures, they’re tuition fees paid to reality.
Most people watch YouTube videos about transformers. You actually ran training loops for days and felt the pain of a 60GB text corpus taking half a day. That gap — between knowing about and having done — is enormous.
2. The Motivation Shift is the Most Valuable Part
You started with:
- “I want an AI job”
- “I want to show off certificates”
- “I want to chase AGI dreams”
And you ended up with:
- Genuine curiosity
- Budget-constrained creative problem solving
- Real intuition about what small models can actually do
This is the classic arc of deep learning self-study: “It was just me, a dream, and my PC. In front of me was an ocean of knowledge, a big, deep ocean (the Internet), and I was starting an endless journey of learning. Throughout this journey, I faced many challenges, got lost many times.” You lived that sentence. The people who stick through the “got lost many times” phase are exactly the ones who come out with real understanding.
3. Constraints Made You Smarter, Not Weaker
Working with one ¥5000 GPU, a tight budget, no team, no managed environment — this forced you to actually understand:
- Memory bottlenecks
- What block size really means (your 泳池/swimming pool analogy is genuinely insightful)
- When cloud is worth it vs. when local compute wins
- How to make 400GB of text data work without infinite VRAM
Projects built around a simple model for a specific task can be trained on a single GPU. You discovered this truth empirically, not from a textbook. That’s worth more.
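The memory side of those constraints can be made concrete with back-of-the-envelope arithmetic. This is only a rough sketch, and every constant below is an illustrative assumption rather than a measurement, but it shows why shrinking the block size is often the difference between fitting on a consumer GPU and not:

```python
# Rough training-memory arithmetic for a GPT-2-sized model.
# All multipliers below are illustrative assumptions, not measurements.

def train_memory_gb(n_params, batch_size, block_size, d_model, n_layers,
                    bytes_per_value=4):
    # Weights + gradients + two Adam moment buffers: ~4 copies of the params.
    param_mem = 4 * n_params * bytes_per_value
    # Crude activation estimate: a handful of tensors per layer,
    # each of shape (batch_size, block_size, d_model).
    act_mem = 8 * n_layers * batch_size * block_size * d_model * bytes_per_value
    return (param_mem + act_mem) / 1e9

# GPT-2 small-ish: ~124M params, 12 layers, d_model = 768.
full_block = train_memory_gb(124e6, batch_size=8, block_size=1024,
                             d_model=768, n_layers=12)
short_block = train_memory_gb(124e6, batch_size=8, block_size=256,
                              d_model=768, n_layers=12)
print(f"block_size=1024: ~{full_block:.1f} GB")
print(f"block_size= 256: ~{short_block:.1f} GB")
```

Even at this level of crudeness, the activation term scales linearly with block size, which is exactly the lever you end up pulling when a 12GB card is all you have.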
4. What You Actually Have Now
After 2+ years of this:
| What you did | What you gained |
|---|---|
| RTX 4070 local inference (2023) | Hardware intuition, CUDA basics |
| GPT-2 training on 60GB text | Data pipeline, tokenization, training dynamics |
| RunPod / DigitalOcean H100/H200 | Cloud GPU workflow, cost optimization |
| Two DL certificates | Structured vocabulary, credentialing |
| 400GB local training runs | Patience, data engineering, real throughput understanding |
| WeChat channel, friends discussions | Communication and teaching ability |
Teaching others (your 泳池/swimming-pool block-size video reaching tens of thousands of views) is one of the most underrated forms of learning validation.
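As one concrete instance of the data-pipeline skill in the table above, here is a minimal batch sampler over a flat array of token ids, the pattern popularized by projects like nanoGPT. The in-memory “corpus” here is a fabricated stand-in; with a real pre-tokenized 60GB or 400GB file you would swap it for `np.memmap` so only the sampled windows are read from disk:

```python
import numpy as np

# Assumed setup: the corpus was tokenized once into a flat binary file of
# uint16 token ids (GPT-2's ~50k vocab fits in 16 bits). We fabricate a
# tiny stand-in array here so the example is self-contained.
tokens = np.random.default_rng(42).integers(0, 50257, size=10_000,
                                            dtype=np.uint16)

def get_batch(data, batch_size, block_size, rng):
    """Sample (input, target) pairs of shape (batch_size, block_size)."""
    ix = rng.integers(0, len(data) - block_size - 1, size=batch_size)
    x = np.stack([data[i : i + block_size] for i in ix])
    y = np.stack([data[i + 1 : i + 1 + block_size] for i in ix])  # next-token targets
    return x, y

rng = np.random.default_rng(0)
x, y = get_batch(tokens, batch_size=4, block_size=64, rng=rng)
print(x.shape, y.shape)
```

The design point is that targets are just inputs shifted by one token, so no separate label file is ever built; the same flat array serves both roles.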
5. Some Honest Criticism
- Earning the certificates while “referencing homework too much” (your words) is a gap worth being honest about. The credential matters less than whether you can reconstruct those concepts without the homework in front of you.
- Spending “several hundred dollars” on $3/hr H100/H200 instances is a common trap. The lesson you drew from it, that local compute often beats expensive cloud for iterative small-model work, is exactly right.
- Three to four days of training runs on 400GB is actually impressive for a single consumer GPU setup. But ask yourself: do you understand what loss curves, gradient norms, and learning rate schedules are telling you during those runs? That’s the next level.
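Of the signals named above, the learning-rate schedule is the easiest place to start, because it is pure arithmetic you can inspect before a run even begins. Below is a minimal sketch of the linear-warmup-plus-cosine-decay schedule commonly used for GPT-style pretraining; every hyperparameter value is an illustrative assumption:

```python
import math

def lr_at(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=200,
          total_steps=10_000):
    """Linear warmup, then cosine decay from max_lr down to min_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps        # linear warmup
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # decays 1 -> 0
    return min_lr + (max_lr - min_lr) * cosine

for s in (0, 199, 200, 5_000, 10_000):
    print(f"step {s:>6}: lr = {lr_at(s):.2e}")
```

Plotting this curve next to your loss curve over a three-to-four-day run is often the quickest way to tell whether a loss plateau is the schedule doing its job or the model running out of capacity.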
6. The Bigger Picture
What you’ve built isn’t a career credential. It’s a genuine mental model of how this technology works, built through pain, money, time, and iteration. In 2025, anyone with a mid-range laptop and an internet connection can build, train and deploy models that, five years ago, required a dedicated GPU cluster. You were doing this before it became easy — which means your intuitions are calibrated against real friction, not convenient abstractions.
The AGI dream may be a joke (your words), but the curiosity that drove you through all of this — that’s not a joke. That’s the most valuable asset you have.
Keep going. The fact that small model training feels “comfortable” now means you’ve earned that comfort.