Ilya Sutskever's AI Reading List | Generated by AI


Ilya Sutskever, co-founder of OpenAI, shared a recommended reading list of roughly 30 key papers, blog posts, and courses on deep learning and AI with John Carmack around 2020, reportedly telling him that mastering them would cover “90% of what matters today” in the field. The full original list has never been published in its entirety (some items were reportedly lost to email deletions), but a widely circulated reconstruction contains 27 foundational items spanning convolutional networks, recurrent networks, transformers, information theory, and more. Below is that curated list, grouped by category for clarity. Each entry includes the title, authors, year, and type.

Convolutional Neural Networks

  1. CS231n: Convolutional Neural Networks for Visual Recognition - Fei-Fei Li, Andrej Karpathy, Justin Johnson - 2017 - Stanford Course
  2. ImageNet Classification with Deep Convolutional Neural Networks (AlexNet) - Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton - 2012 - Paper
  3. Deep Residual Learning for Image Recognition (ResNet) - Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun - 2015 - Paper
  4. Identity Mappings in Deep Residual Networks - Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun - 2016 - Paper
  5. Multi-Scale Context Aggregation by Dilated Convolutions - Fisher Yu, Vladlen Koltun - 2015 - Paper
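The two ResNet papers above center on one idea: an identity shortcut, y = relu(x + F(x)), that lets gradients flow through the network unchanged. A minimal NumPy sketch (weights, shapes, and the two-layer residual branch are illustrative, not the paper's exact architecture):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Sketch of a basic residual block: y = relu(x + F(x)).

    F(x) is a two-layer residual branch; the `+ x` identity shortcut
    is the key contribution of ResNet (He et al., 2015).
    """
    f = relu(x @ w1) @ w2   # residual branch F(x)
    return relu(x + f)      # identity shortcut, then nonlinearity

# Illustrative usage with random weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
```

The "Identity Mappings" follow-up paper refines exactly where the additions and activations sit so the shortcut path stays a pure identity.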

Recurrent Neural Networks

  1. Understanding LSTM Networks - Christopher Olah - 2015 - Blog Post
  2. The Unreasonable Effectiveness of Recurrent Neural Networks - Andrej Karpathy - 2015 - Blog Post
  3. Recurrent Neural Network Regularization - Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals - 2014 - Paper
  4. Neural Turing Machines - Alex Graves, Greg Wayne, Ivo Danihelka - 2014 - Paper
  5. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin - Dario Amodei et al. - 2016 - Paper
  6. Neural Machine Translation by Jointly Learning to Align and Translate (RNNsearch) - Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio - 2015 - Paper
  7. Pointer Networks - Oriol Vinyals, Meire Fortunato, Navdeep Jaitly - 2015 - Paper
  8. Order Matters: Sequence to Sequence for Sets (Set2Set) - Oriol Vinyals, Samy Bengio, Manjunath Kudlur - 2016 - Paper
  9. A Simple Neural Network Module for Relational Reasoning (Relation Networks) - Adam Santoro, David Raposo, David G. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap - 2017 - Paper
  10. Relational Recurrent Neural Networks - Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap - 2018 - Paper

Transformers

  1. Attention Is All You Need - Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin - 2017 - Paper
  2. The Annotated Transformer - Alexander Rush et al. (Harvard NLP) - 2018 - Blog Post
  3. Scaling Laws for Neural Language Models - Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei - 2020 - Paper
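The core operation in “Attention Is All You Need” is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (single head, no masking; shapes are illustrative):

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)  # query-key similarities
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights

# Illustrative usage: 3 queries attending over 6 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

The √d_k scaling keeps the dot products from growing with dimension and pushing the softmax into regions with vanishing gradients; the full paper adds masking and multi-head projections on top of this kernel.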

Information Theory

  1. A Tutorial Introduction to the Minimum Description Length Principle - Peter Grünwald - 2004 - Book Chapter
  2. Kolmogorov Complexity and Algorithmic Randomness (Chapter 14) - Alexander Shen, Vladimir A. Uspensky, Nikolay Vereshchagin - 2017 - Book Chapter
  3. The First Law of Complexodynamics - Scott Aaronson - 2011 - Blog Post
  4. Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton - Scott Aaronson, Sean M. Carroll, Lauren Ouellette - 2014 - Paper
  5. Machine Super Intelligence - Shane Legg - 2008 - Dissertation
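The MDL and Kolmogorov-complexity items above share one idea: the complexity of data is the length of its shortest description. Kolmogorov complexity itself is uncomputable, but any real compressor gives a crude, computable upper bound. A minimal sketch using Python's standard-library zlib (the strings are illustrative):

```python
import random
import zlib

def compressed_length(data: bytes) -> int:
    """zlib-compressed size in bytes: a rough, computable upper bound
    on description length (true Kolmogorov complexity is uncomputable)."""
    return len(zlib.compress(data, level=9))

random.seed(0)
regular = b"ab" * 500                                      # highly regular: tiny description
noisy = bytes(random.getrandbits(8) for _ in range(1000))  # looks random: barely compressible
```

Here `compressed_length(regular)` is far smaller than `compressed_length(noisy)` even though both inputs are 1000 bytes long, which is the intuition behind using description length as a complexity measure.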

Miscellaneous

  1. Keeping Neural Networks Simple by Minimizing the Description Length of the Weights - Geoffrey E. Hinton, Drew van Camp - 1993 - Paper
  2. Variational Lossy Autoencoder - Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel - 2017 - Paper
  3. GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism - Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen - 2018 - Paper
  4. Neural Message Passing for Quantum Chemistry - Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl - 2017 - Paper

Speculation about the three “missing” items often points to meta-learning papers (e.g., Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn et al., 2017) or to reinforcement-learning work on self-play, based on Sutskever’s talks from that period, but none of this is confirmed.




x-ai/grok-4-fast
