AI Papers
Foundational Concepts & Architectures:
- Backpropagation: The fundamental algorithm for training neural networks.
- Convolutional Neural Networks: A key architecture for image processing.
- Word2Vec (Efficient Estimation of Word Representations in Vector Space): Tomáš Mikolov et al.'s work on word embeddings.
- Sequence to Sequence Learning with Neural Networks: A foundational paper on sequence modeling.
- Attention is All You Need: The seminal paper introducing the Transformer architecture.
- Deep Residual Learning for Image Recognition (ResNet): Introduced residual connections, enabling much deeper networks.
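The core operation behind the Transformer entry above is scaled dot-product attention. Below is a minimal NumPy sketch of that single operation (one head, no masking or projections); the function name and shapes are illustrative, not taken from any particular implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    # Similarity of each query with every key, scaled to stabilize the softmax.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because each row of attention weights sums to 1, every output row is a convex combination of the value vectors.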
Large Language Models & Related Techniques:
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models: Explores prompting techniques for enhanced reasoning.
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: Combines retrieval with generation for improved performance.
- GPT-4 Technical Report: OpenAI's report on GPT-4's capabilities, limitations, and safety evaluations.
- Claude 3 Model Card: Anthropic's report on the Claude 3 model family.
- The Llama 3 Herd of Models: Meta's report on the Llama 3 model family.
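The chain-of-thought entry above can be made concrete with a toy prompt builder: the few-shot exemplar shows intermediate reasoning before its final answer, which the paper found elicits step-by-step reasoning from large models. The exemplar paraphrases the paper's running tennis-ball example; the function name is invented for illustration.

```python
def build_cot_prompt(question: str) -> str:
    """Prepend a worked reasoning exemplar, then ask the new question."""
    # Few-shot exemplar containing explicit intermediate steps.
    exemplar = (
        "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
        "How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
        "5 + 6 = 11. The answer is 11.\n\n"
    )
    # Cue the model to produce reasoning steps before its answer.
    return exemplar + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt("A store sells pens in packs of 4. How many pens are in 7 packs?")
print(prompt)
```

The resulting string would be sent as the prompt to a language model; without the exemplar and cue, the same model is more likely to answer directly and skip the reasoning.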
Specific Models & Applications:
- DeepSeek-V2 & DeepSeek-V3 Technical Reports: Mixture-of-experts language models from DeepSeek.
- Robust Speech Recognition via Large-Scale Weak Supervision (Whisper): OpenAI's robust multilingual speech recognition model.
- High-Resolution Image Synthesis with Latent Diffusion Models (Stable Diffusion): Introduces latent diffusion for efficient image generation.
- DALL-E 3 (Improving Image Generation with Better Captions): OpenAI's text-to-image model.
Benchmarks & Evaluations:
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues?: A benchmark evaluating whether models can resolve real software engineering tasks drawn from GitHub repositories.
Curated Lists:
- NeurIPS Test of Time Papers: A collection of influential papers recognized for their long-term impact.
- Ilya’s Top 30 AI Papers: A reading list attributed to Ilya Sutskever, collected at https://aman.ai/primers/ai/top-30-papers/.