Deep Learning: Classics and Trends

We’ve officially moved to   👉

Past events

Date Presenter Topic or Paper
[Slides & Recording on Piero’s website]
2020.04.17 Nikhil Dev Deshmudre [AlphaGo], [AlphaGo Zero], [Alpha Zero], [MuZero] [Slides]
[AlphaGo], [AlphaGo Zero], [Alpha Zero], [MuZero] [Slides] [Recording]
2020.04.03 Alyssa Dayan Mode-Adaptive Neural Networks for Quadruped Motion Control [Slides]
2020.03.27 Michela Paganini Empirical Observations in Pruned Networks & Tools for Reproducible Pruning Research
2020.03.20 Rapha Gontijo Lopes Affinity and Diversity: Quantifying Mechanisms of Data Augmentation [Slides] [Recording]
2020.03.13 Ian Thompson A Good View Is All You Need: Deep InfoMax (DIM) and Augmented Multiscale Deep InfoMax (AMDIM) [Slides] [Recording]
2020.02.28 Ashley Edwards Estimating Q(s, s’) with Deep Deterministic Dynamics Gradients [Slides]
2020.02.14 Xinchen Yan Conditional generative modeling and adversarial learning
2020.02.07 Yaroslav Bulatov einsum is all you need [Slides] [Recording]
2020.01.31 Rosanne Liu Selective Brain Damage: Measuring the Disparate Impact of Model Pruning
2020.01.24 Jeff Coggshall ReMixMatch and FixMatch
2020.01.17 Rosanne Liu Improving sample diversity of a pre-trained, class-conditional GAN by changing its class embeddings [Slides] [Recording]
2020.01.10 Zhuoyuan Chen Why Build an Assistant in Minecraft?
2019.11.22 Rosanne Liu On the “steerability” of generative adversarial networks [Slides] [Recording]
2019.11.15 Polina Binder Learning Deep Sigmoid Belief Networks with Data Augmentation
2019.11.08 Sanyam Kapoor Policy Search & Planning: Unifying Connections [1][2]
2019.11.01 Chris Olah Zoom in: Features and circuits as the basic unit of neural networks
2019.10.25 Renjie Liao Efficient Graph Generation with Graph Recurrent Attention Networks
2019.10.18 Nitish Shirish Keskar, Bryan McCann CTRL: A Conditional Transformer Language Model for Controllable Generation
2019.10.11 Subutai Ahmad Sparsity in the neocortex, and its implications for machine learning
2019.10.04 Eli Bingham Multiple Causes: A Causal Graphical View
2019.09.27 Xinyu Hu Learning Representations for Counterfactual Inference
2019.09.04 Jonathan Frankle The Latest Updates on the Lottery Ticket Hypothesis
2019.08.23 Ankit Jain Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems [Slides]
2019.08.16 Jiale Zhi Meta-Learning Neural Bloom Filters
2019.08.16 Ted Moskovitz Lookahead Optimizer: k steps forward, 1 step back
2019.07.26 Rui Wang Off-Policy Evaluation for Contextual Bandits and RL [1][2][3][4]
2019.07.19 Rosanne Liu Weight Agnostic Neural Networks [Slides] [Recording]
2019.07.12 Joost Huizinga A Distributional Perspective on Reinforcement Learning
2019.06.28 Ashley Edwards [ICML Preview] Learning Values and Policies from Observation [1][2]
2019.06.21 Stanislav Fořt [ICML Preview] Large Scale Structure of Neural Network Loss Landscapes
2019.06.07 Joey Bose [ICML Preview] Compositional Fairness Constraints for Graph Embeddings
2019.05.31 Yulun Li IntentNet: Learning to Predict Intention from Raw Sensor Data
2019.05.24 Thomas Miconi, Rosanne Liu, Janice Lan ICLR Recap, cont.
2019.05.17 Aditya Rawal, Jason Yosinski ICLR Recap
2019.04.26 JP Chen 3D-Aware Scene Manipulation via Inverse Graphics [Slides]
2019.04.19 Felipe Petroski Such Relational Deep Reinforcement Learning
2019.04.12 Piero Molino, Jason Yosinski Open mic
2019.04.05 Joel Lehman The copycat project: A model of mental fluidity and analogy-making
2019.03.29 Rosanne Liu Non-local Neural Networks [Slides]
2019.03.22 Yariv Sadan Learning deep representations by mutual information estimation and maximization [Slides]
2019.03.15 Chandra Khatri Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
2019.03.01 Nikhil Dev Deshmudre BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2019.02.22 Vashisht Madhavan Neural Turing Machines
2019.02.15 Open discussion GPT-2
2019.02.08 Adrien Ecoffet HyperNetworks
2019.02.01 Jiale Zhi Non-delusional Q-learning and value iteration
2019.01.25 Yulun Li Relational Recurrent Neural Networks
2019.01.18 Rui Wang Neural Ordinary Differential Equations
2019.01.11 Jonathan Simon Generating Humorous Portmanteaus using Word Embeddings [1][2] [Slides]
2018.12.21 Christian Perez Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles [Slides]
2018.12.14 Alexandros Papangelis Two trends in dialog [1][2]
2018.10.26 Aditya Rawal Stochastic Weight Averaging [Slides]
2018.10.12 Mahdi Namazifar Troubling Trends in Machine Learning Scholarship
2018.09.28 Yariv Sadan MINE: Mutual Information Neural Estimation [Slides]
2018.09.21 Jan-Matthis Lueckmann Glow and RealNVP [Slides]
2018.09.14 Jane Hung The YOLO series: v1, v2, v3
2018.09.07 Rosanne Liu Pooling is Neither Necessary nor Sufficient for Appropriate Deformation Stability in CNNs [Slides]
2018.08.31 Alican Bozkur Multimodal Unsupervised Image-to-Image Translation [Slides]
2018.08.24 Janice Lan The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks [Slides]
2018.08.17 Yariv Sadan Opening the black box of Deep Neural Networks via Information [Slides]
2018.08.10 Joost Huizinga Learning to Reinforcement Learn, and RL²: Fast Reinforcement Learning via Slow Reinforcement Learning [Slides]
2018.08.03 JP Chen Deep Convolutional Inverse Graphics Network [Slides]
2018.07.27 Lei Shu Attention is all you need [Slides]
2018.07.06 Neeraj Pradhan Auto-Encoding Variational Bayes, and ELBO
2018.06.29 Ankit Jain Dynamic Routing Between Capsules [Slides]
2018.06.22 Xinyu Hu Self-Normalizing Neural Networks [Slides]
2018.06.15 John Sears The Decline and Fall of Adam: [1][2] [Slides]
2018.06.08 Alex Gajewski GANs, etc. [1][2] [Slides]
2018.06.01 Jason Yosinski Sensitivity and Generalization in Neural Networks: an Empirical Study [Slides]

What it is

“A super influential reading group that has achieved cult-like status.” —John Sears

Deep Learning: Classics and Trends (DLCT) is a reading group I have been running since 2018. It started within Uber AI Labs, with the support of Zoubin, Ken and Jason, and the help of many, when we felt the need for a space to sample the overwhelmingly large number of papers, and to hold free-form, judgemental (JK), cozy discussions; or as Piero puts it, to “ask a million questions” without embarrassment.

Since then, it has grown much larger, opening up first to the broader machine learning community at Uber, then to the general public in 2019. Since March 2020, in light of COVID-19, we have held all meetings virtually, making them radically accessible to anyone from anywhere. Since June 2020, DLCT has operated under ML Collective, with a mission of making researchers more connected.

In August 2020, I moved this page under MLC, and all future updates will appear there.

This page is kept only as a memory. After 2+ years of running this, it is to me more than reading papers and listening to presentations. It has started to serve as an anchor for all of us to connect every once in a while amidst all the changes, shifts of emphasis, and chaos in the Bay Area, in AI research, and generally in this fast-paced world.

The best thing about it is the group of people that it enables to connect—seriously, the smartest and kindest researchers that I feel so lucky to have known and have worked with.


Wow you have scrolled all the way down here and are still reading?? Ok! Here’s more text about the scope and vision of this reading group.

  • Q: Why aren’t talks recorded?
    A: First and foremost, I want to create a safe, intimate and cozy space for all, where we can ask stupid questions, and hold honest discussions without much filtering; exposing and leaving everything you say permanently on the internet just won’t serve the purpose. Second, there is so much recorded content out there these days; I don’t feel like contributing to making the world even more crowded and overwhelming. And honestly, how often do you actually watch the things you told yourself you were gonna watch? Third, I value real, in-person communication and events way more than scaling up and popularizing. I love the feeling of urgency that only real-time permits: the feeling that if you miss it, you’d actually miss it. Like everything else in your real life.

  • Q: What was the initial idea of organizing a reading group like this?
    A: It started with the rather selfish idea that I wanted to know about papers that I don’t have time to read, and learn about topics my individual intelligence limits me from fully understanding. Besides, I enjoy being around people who are smarter and more knowledgeable than me, faster than me at working out twelve math equations on one slide, braver than me in asking stupid questions, and more patient than me in answering them, as well as those who value great presentations as much as I do.

  • Q: How much work is it for you?
    A: I never travel on Fridays now.

  • Q: Where do you see it going?
    A: I envision building a community where people work hard to tell science stories well. Each paper is a story. A great paper, apart from solid results and technical and scientific advances, stands out particularly in the way it tells the story. I hope we all value storytelling and talk-giving slightly more than we do now. This ties to an eventual wish that scientific writing moves towards being lucid and understandable. This reading group is a start.

    Here is how I see different levels of storytelling, in the format of a one-hour presentation, happening in this group.

    You can give a Level 0 talk, which is going through someone else’s paper—the storyline is already there. This is perhaps the most basic and involves the least work: you just need to understand it and retell it to others. (I assume as a researcher you already read papers, and this additional work of making it into a presentation would only help you understand it better yourself.) And best of all, when the audience asks hard questions, you can just say “I don’t know—not my work.”

    A Level 1 talk could mean presenting one of your own papers. The bar is higher because you are expected to know every detail of the project, but also lower because you probably already do. And good background coverage leading up to the exact problem and idea always helps.

    Then we have Level 2 talks, which usually cover a topic formed by understanding a field (however small it is) thoroughly, and having in mind a hierarchical chart or spiderweb of the fields leading to that particular one. You might be citing multiple papers, drawing connections and coming up with conclusions that are mainly your own.

  • Q: Do you have a high bar for talks given there?
    A: Yes, I do. But I also know we all have to start somewhere. And I myself was a horrible presenter not too long ago (and likely still am). But we all get better.