Su Jianlin: Researcher, Blogger, and AI Contributor | Generated by AI
Question: Give a comprehensive introduction to Su Jianlin, his blog, and his work.
Answer:
About Su Jianlin
Su Jianlin (苏剑林), born in 1993 in a small village in Yunfu, Guangdong, China, is a Chinese researcher, blogger, and machine learning practitioner known online as BoJone. He holds a Bachelor’s degree in Mathematics from South China Normal University and a Master’s degree in Fundamental Mathematics (with a focus on mathematical biology applications) from Sun Yat-sen University (graduated in 2019).
From a young age, he showed a strong interest in science, particularly mathematics and chemistry, which later expanded to physics, astronomy, and computing. He idolizes physicist Richard Feynman and describes himself as someone who loves "drilling into the corners" of problems, persistently digging into details until a breakthrough comes. His hobbies include reading, writing, Chinese chess, cooking, and theoretical explorations across the sciences.
After graduation, he joined Zhuiyi Technology (a company focused on NLP and conversational AI) as a machine learning algorithm engineer. As of 2025 (age 32), he continues to contribute to the AI field; some articles mention a later role at Moonshot AI, though he is best known for his independent research and open-source work.
Su is highly regarded in the Chinese AI community for his clear, in-depth explanations of complex topics, often called “苏神” (God Su) by admirers. He has published papers on arXiv and in journals, including co-authorship on RoFormer (introducing Rotary Position Embedding, widely adopted in models like LLaMA, GPT variants, and Google’s models).
His Blog: 科学空间 (Scientific Spaces) - https://kexue.fm
Launched in 2009, “科学空间” (Kexue.fm, also accessible via spaces.ac.cn) is Su Jianlin’s personal blog dedicated to sharing knowledge in natural sciences and reflections on life. It started as a platform for exploring astronomy, mathematics, physics, chemistry, biology, and more general topics like photography and life insights.
Over time, the blog shifted heavily toward machine learning, natural language processing (NLP), deep learning optimization, generative models (e.g., diffusion models, VAEs, GANs), and Transformer architectures. It now hosts a large archive of articles spanning 16+ years, organized into categories including Mathematics, Physics-Chemistry, Big Data/Information Era, Astronomy, and Biology.
The blog’s style is technical yet accessible: long-form articles with mathematical derivations, code examples, and thoughtful commentary on recent papers. Recent posts (as of late 2025) focus on advanced “alchemy” (training tricks) in deep learning, such as learning rate schedules, optimizers (e.g., Muon), weight decay, diffusion models, and manifold-based gradient descent.
It encourages reader interaction through comments, supports open reprinting under CC license, and has received support from institutions like the National Astronomical Observatories (LAMOST project). Su also built tools like Cool Papers (papers.cool), an AI-assisted paper browsing site using models like Kimi for summaries and FAQs.
His Work and Contributions
Su Jianlin’s primary impact is in NLP and deep learning, bridging rigorous mathematics with practical implementations:
- Open-Source Projects (via GitHub @bojone, 12k+ followers):
- bert4keras: A lightweight, user-friendly Keras implementation of BERT and Transformers—highly popular (5k+ stars) for its clarity and examples.
- GlobalPointer: Efficient method for nested/non-nested named entity recognition.
- rerope: Rectified Rotary Position Embeddings.
- bytepiece: High-compression tokenizer.
- Keras-DDPM: Diffusion model implementations.
- Others like NBCE (Naive Bayes Context Extension) and various BERT enhancements.
- Key Research:
- Co-author of RoFormer (2021): Introduced Rotary Position Embedding (RoPE), improving long-sequence handling in Transformers; influential in modern LLMs.
- Work on variational inference, VAEs, residual networks for astronomy, and optimization analyses.
- Blog series on diffusion models (31+ parts), SSMs, MoE, energy-based models, and “making training more scientific” (e.g., SGD convergence, AdamW insights).
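The core idea behind RoPE, introduced in the RoFormer paper above, can be sketched in a few lines: each consecutive pair of feature dimensions is rotated by a position-dependent angle, so that the dot product between a rotated query and key depends only on their relative position. A minimal NumPy sketch (the function name and `base` parameter follow common convention; this is an illustration, not Su's reference implementation):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embedding to x of shape (seq_len, dim).

    Each feature pair (x[:, 2i], x[:, 2i+1]) at position m is rotated by
    the angle m * theta_i, where theta_i = base ** (-2i / dim).
    """
    seq_len, dim = x.shape
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    theta = base ** (-np.arange(0, dim, 2) / dim)     # (dim/2,)
    angles = pos * theta                              # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because rotations are orthogonal, RoPE preserves vector norms, and the inner product between a query rotated at position m and a key rotated at position n is a function of n - m alone, which is what makes it a *relative* position encoding despite being applied absolutely.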
His contributions emphasize theoretical depth, practical tools, and knowledge dissemination, significantly influencing Chinese AI practitioners through open-source code and educational writing.
References:
- https://kexue.fm
- https://kexue.fm/me.html
- https://github.com/bojone
- https://scholar.google.com.hk/citations?user=cdbdaksAAAAJ
- https://arxiv.org/abs/2104.09864 (RoFormer paper)
- https://blog.csdn.net/sinat_37574187/article/details/149498108 (Profile summary)
- https://blog.csdn.net/liyongqiang2420/article/details/117128888 (Introduction to Su and his blog)