Key Researchers in Embodied AI | Generated by AI
Top Researchers and Influencers in Embodied AI (Focus on Diffusion Policy, RT-2/RT-X, ACT, OpenVLA, Octo, and Related VLA Models)
The field of embodied AI — especially vision-language-action (VLA) models, diffusion-based policies, and imitation learning for robotics — has exploded recently. Here are the current “educational stars” (technical researchers who actively share insights, code, papers, and explanations). They’re the closest equivalents to Andrej Karpathy in this niche: deep experts who make complex robotics accessible through talks, blogs, tweets, or open-source releases.
On X (Twitter) – Must-Follow Accounts
These people post frequent updates on new papers, code releases, robot demos, and technical breakdowns:
- @DrJimFan Jim Fan (NVIDIA Senior Research Scientist) – Extremely active and insightful. Posts about foundation models for robotics, VLA scaling laws, RT-X/Open X-Embodiment, diffusion policies, and humanoid robots. One of the best for real-time commentary on the field.
- @svlevine Sergey Levine (UC Berkeley) – Professor whose lab (RAIL, part of BAIR) developed Octo and co-led OpenVLA and RT-X/Open X-Embodiment. Posts paper announcements, robot videos, and deep threads on imitation learning and diffusion policies.
- @chelseabfinn Chelsea Finn (Stanford) – Professor and co-lead on OpenVLA and many other VLA/foundation-model papers. Great for high-level explanations and new releases.
- @pabbeel Pieter Abbeel (UC Berkeley) – Pioneer in deep RL and imitation learning; his lab produced early imitation-learning work that modern policies such as ACT build on.
- @_akhaliq AK (Ahsen Khaliq) – Not a core researcher, but posts daily paper highlights that cover most new embodied AI papers (Diffusion Policy variants, VLA models, etc.) with links and quick summaries.
- @CovariantAI Covariant (company account; co-founded by Pieter Abbeel and others) – Shares real-world warehouse deployments of its robotics foundation model, RFM-1.
- @SongShuran Shuran Song (Stanford, previously Columbia) – Senior author on Diffusion Policy; posts about visuomotor policies and new diffusion-based methods in robotics.
- @tonyzzhao Tony Zhao (did the ACT work as a Stanford PhD student) – First author on ACT (Action Chunking with Transformers) and many follow-ups; very active with code releases and explanations.
- @hausman_k Karol Hausman (formerly Google DeepMind, now Physical Intelligence) – Core contributor to RT-1/RT-2/RT-X; posts about advances in robot foundation models.
- @LeRobotHF LeRobot (Hugging Face robotics team) – Posts open-source releases, tutorials, and comparisons of OpenVLA, Octo, Diffusion Policy, etc.
Other strong follows: @drfeifei (Fei-Fei Li, often called the “godmother of AI”, whose Stanford lab also works on embodied AI).
YouTube Channels & Technical Bloggers
Good technical YouTube content in embodied AI is still emerging (most is seminars or short demos), but these are the best for deep dives:
- Montréal Robotics and Embodied AI Lab (Mila) – Official channel with seminars from top researchers (Sergey Levine, Chelsea Finn, and Pieter Abbeel are the kind of speakers featured).
- UC Berkeley BAIR Robotics – Seminar series with talks on Octo, Diffusion Policy, ACT, OpenVLA, etc. Many videos titled “Octo: An Open-Source Generalist Robot Policy” or similar.
- Stanford Vision and Learning Lab (SVL) & Stanford AI Lab – Chelsea Finn and others give detailed talks on OpenVLA and VLA scaling.
- Google DeepMind Robotics – Occasional long-form videos on RT-2, RT-X, and AutoRT (their data-collection system).
- Hugging Face (LeRobot channel/section) – Practical tutorials on training and running policies such as ACT and Diffusion Policy in simulation and on real robots. Very hands-on with code.
- Yannic Kilcher – Not robotics-specific, but he does in-depth paper reviews (including RT-2, OpenVLA, Diffusion Policy) with code walkthroughs.
- The Gradient Podcast (hosted by Daniel Bashir) – Podcast-style interviews with embodied AI researchers (e.g., Sergey Levine, Chelsea Finn, Jim Fan).
For more seminar-style content, search YouTube for “RSS 2024 robotics” or “CoRL 2024” — almost every major paper (OpenVLA, Octo, Diffusion Policy variants) has a 10–15 min presentation video.
Key Labs / Projects to Watch (They Release Educational Material)
- UC Berkeley BAIR/RAIL (Sergey Levine) → Octo, parts of OpenVLA, many diffusion/imitation papers.
- Stanford robot learning labs (Chelsea Finn's IRIS Lab, Jeannette Bohg's IPRL) → OpenVLA was co-led from Stanford.
- Google DeepMind Robotics → RT-2, RT-X, AutoRT.
- Shuran Song's lab (Columbia, now Stanford) → the original Diffusion Policy.
- Physical Intelligence (π0 models) → New policies that pair a VLM backbone with diffusion-style (flow-matching) action generation; very active on X.
- Hugging Face LeRobot → Democratizes all the above with open-source code and notebooks.
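If you want to learn hands-on, start with the Hugging Face LeRobot repo. It ships pretrained policies such as ACT and Diffusion Policy that you can run in simulation within minutes; check the repo for which other models (OpenVLA- or Octo-style) are currently supported.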
These are the current “stars” driving the field forward with open-source code, detailed blogs, and public explanations — very similar spirit to Karpathy’s lectures but focused on robots instead of LLMs. The area moves insanely fast (multiple breakthrough papers every month), so following the X accounts above is the best way to stay current.