Ilya Sutskever's AI Reading List | Generated by AI


Ilya Sutskever, co-founder of OpenAI, shared a recommended reading list of roughly 30 key papers, blog posts, and courses on deep learning and AI with John Carmack around 2020, reportedly telling him that mastering them would cover “90% of what matters today” in the field. The full original list has never been published in its entirety (some items were reportedly lost to email deletions), but a widely circulated reconstruction contains 27 foundational items spanning convolutional networks, recurrent networks, transformers, information theory, and more. Below is that curated list, grouped by category for clarity. Each entry includes the title, authors, year, and type.

Convolutional Neural Networks

  1. CS231n: Convolutional Neural Networks for Visual Recognition - Fei-Fei Li, Andrej Karpathy, Justin Johnson - 2017 - Stanford Course
  2. ImageNet Classification with Deep Convolutional Neural Networks (AlexNet) - Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton - 2012 - Paper
  3. Deep Residual Learning for Image Recognition (ResNet) - Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun - 2015 - Paper
  4. Identity Mappings in Deep Residual Networks - Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun - 2016 - Paper
  5. Multi-Scale Context Aggregation by Dilated Convolutions - Fisher Yu, Vladlen Koltun - 2015 - Paper
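The two ResNet papers above center on one idea: an identity shortcut, y = relu(x + F(x)), that lets gradients flow through the network unchanged. A minimal NumPy sketch (weights, shapes, and the two-layer residual branch are illustrative, not the paper's exact architecture):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Sketch of a basic residual block: y = relu(x + F(x)).

    F(x) is a two-layer residual branch; the `+ x` identity shortcut
    is the key contribution of ResNet (He et al., 2015).
    """
    f = relu(x @ w1) @ w2   # residual branch F(x)
    return relu(x + f)      # identity shortcut, then nonlinearity

# Illustrative usage with random weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
```

The "Identity Mappings" follow-up paper refines exactly where the additions and activations sit so the shortcut path stays a pure identity.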

Recurrent Neural Networks

  1. Understanding LSTM Networks - Christopher Olah - 2015 - Blog Post
  2. The Unreasonable Effectiveness of Recurrent Neural Networks - Andrej Karpathy - 2015 - Blog Post
  3. Recurrent Neural Network Regularization - Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals - 2014 - Paper
  4. Neural Turing Machines - Alex Graves, Greg Wayne, Ivo Danihelka - 2014 - Paper
  5. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin - Dario Amodei et al. - 2016 - Paper
  6. Neural Machine Translation by Jointly Learning to Align and Translate (RNNsearch) - Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio - 2015 - Paper
  7. Pointer Networks - Oriol Vinyals, Meire Fortunato, Navdeep Jaitly - 2015 - Paper
  8. Order Matters: Sequence to Sequence for Sets (Set2Set) - Oriol Vinyals, Samy Bengio, Manjunath Kudlur - 2016 - Paper
  9. A Simple Neural Network Module for Relational Reasoning (Relation Networks) - Adam Santoro, David Raposo, David G. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap - 2017 - Paper
  10. Relational Recurrent Neural Networks - Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap - 2018 - Paper

Transformers

  1. Attention Is All You Need - Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin - 2017 - Paper
  2. The Annotated Transformer - Alexander Rush et al. (Harvard NLP) - 2018 - Blog Post
  3. Scaling Laws for Neural Language Models - Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei - 2020 - Paper
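The core operation in “Attention Is All You Need” is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (single head, no masking; shapes are illustrative):

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)  # query-key similarities
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights

# Illustrative usage: 3 queries attending over 6 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

The √d_k scaling keeps the dot products from growing with dimension and pushing the softmax into regions with vanishing gradients; the full paper adds masking and multi-head projections on top of this kernel.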

Information Theory

  1. A Tutorial Introduction to the Minimum Description Length Principle - Peter Grünwald - 2004 - Book Chapter
  2. Kolmogorov Complexity and Algorithmic Randomness (Chapter 14) - Alexander Shen, Vladimir A. Uspensky, Nikolay Vereshchagin - 2017 - Book Chapter
  3. The First Law of Complexodynamics - Scott Aaronson - 2011 - Blog Post
  4. Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton - Scott Aaronson, Sean M. Carroll, Lauren Ouellette - 2014 - Paper
  5. Machine Super Intelligence - Shane Legg - 2008 - Dissertation
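The MDL and Kolmogorov-complexity items above share one idea: the complexity of data is the length of its shortest description. Kolmogorov complexity itself is uncomputable, but any real compressor gives a crude, computable upper bound. A minimal sketch using Python's standard-library zlib (the strings are illustrative):

```python
import random
import zlib

def compressed_length(data: bytes) -> int:
    """zlib-compressed size in bytes: a rough, computable upper bound
    on description length (true Kolmogorov complexity is uncomputable)."""
    return len(zlib.compress(data, level=9))

random.seed(0)
regular = b"ab" * 500                                      # highly regular: tiny description
noisy = bytes(random.getrandbits(8) for _ in range(1000))  # looks random: barely compressible
```

Here `compressed_length(regular)` is far smaller than `compressed_length(noisy)` even though both inputs are 1000 bytes long, which is the intuition behind using description length as a complexity measure.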

Miscellaneous

  1. Keeping Neural Networks Simple by Minimizing the Description Length of the Weights - Geoffrey E. Hinton, Drew van Camp - 1993 - Paper
  2. Variational Lossy Autoencoder - Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel - 2017 - Paper
  3. GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism - Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen - 2018 - Paper
  4. Neural Message Passing for Quantum Chemistry - Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl - 2017 - Paper

Speculation about the three “missing” items often points to meta-learning papers (e.g., Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn et al., 2017) or to reinforcement-learning work on self-play, based on Sutskever’s talks from that period, but none of this is confirmed.




x-ai/grok-4-fast
