Yifan Zhou's Diffusion Blog | Generated by AI

2026.06

Question: Introduce the blog at https://zhouyifan.net and its author.

Answer:

The Author: Yifan Zhou (周弈帆)

Yifan Zhou is a Ph.D. student at the School of Computer Science and Engineering, Nanyang Technological University (Singapore), and a member of NTU MMLab (affiliated with NTU S-Lab), supervised by Professor Xingang Pan. Before starting the Ph.D., he worked as a research engineer at NTU S-Lab and at Shanghai AI Lab, and he earned his bachelor’s degree in computer science from Beijing Institute of Technology. His research focuses on image and video generation: diffusion models, DiT architectures, and attention efficiency.

Career highlights worth knowing:

The Blog (周弈帆的博客)

The blog is built with Hexo and has been running since around 2019, with 158 posts across 21 categories. Most posts are in Chinese; an English mirror exists at /blog-en. It is one of the best Chinese-language deep-dive blogs on diffusion models and attention mechanisms, and it is directly relevant to your nanoGPT/MoE/sparse-attention track. The signature style is the paper walkthrough that starts from first principles, with math, complexity analysis, and honest critique sections (he publicly critiques the weaknesses of his own papers, which is rare).

Representative recent content:

He’s also a personality: at CVPR 2026 in Denver, alongside his poster session, he cosplayed Togawa Sakiko from BanG Dream!, admittedly as a publicity stunt inspired by the Frieren cosplayer at CVPR 2024. His self-intro claims, tongue-in-cheek, that his coding ability is “king-tier among all humans.”

Why it’s relevant to you: his LLSA Triton kernels are exactly the kind of attention-internals and GPU-programming material you’re building toward (NSA/MoBA-adjacent, open source, drop-in attention replacement). The FlashAttention and sparse-attention posts pair well with your nanoGPT/DeepSeek-v4 MoE study, and his code runs fine on a single H200-class budget, which is the same territory as the MI300X you already rent.
