Topics in Deep Learning
This topics course aims to present the mathematical, statistical and computational challenges of building stable representations for high-dimensional data, such as images, text and audio. We will delve into selected topics of Deep Learning, discussing recent models from both supervised and unsupervised learning. Special emphasis will be on convolutional architectures, invariance learning, unsupervised learning and non-convex optimization.
Detailed Syllabus and Lectures
Lec1: Intro and Logistics
Lec2: Representations for Recognition: stability, variability. Kernel approaches / Feature extraction.
- The Elements of Statistical Learning, chapter 12, Hastie, Tibshirani, Friedman.
Lec3: Groups, Invariants and Filters.
Lec4: Scattering Convolutional Networks.
Lec5: Further Scattering: Properties and Extensions.
Lec6: Convolutional Neural Networks: Geometry and first Properties.
- Deep Learning, Y. LeCun, Y. Bengio & G. Hinton.
- Understanding Deep Convolutional Networks, S. Mallat.
Lec7: Properties of learnt CNN representations: Covariance and Invariance, redundancy, invertibility.
- Deep Neural Networks with Random Gaussian Weights: A universal Classification Strategy?, R. Giryes, G. Sapiro, A. Bronstein.
- Intriguing Properties of Neural Networks, C. Szegedy et al.
- Geodesics of Learnt Representations, O. Henaff & E. Simoncelli.
- Inverting Visual Representations with Convolutional Networks, A. Dosovitskiy, T. Brox.
- Visualizing and Understanding Convolutional Networks, M. Zeiler, R. Fergus.
Lec8: Connections with other models (Dict. Learning, Random Forests)
- Proximal Splitting Methods in Signal Processing, Combettes & Pesquet.
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, Beck & Teboulle.
- Learning Fast Approximations of Sparse Coding, K. Gregor & Y. LeCun.
- Task Driven Dictionary Learning, J. Mairal, F. Bach, J. Ponce.
- Exploiting Generative Models in Discriminative Classifiers, T. Jaakkola & D. Haussler.
- Improving the Fisher Kernel for Large-Scale Image Classification, F. Perronnin et al.
- NetVLAD, R. Arandjelovic et al.
Lec9: Other high level tasks: localization, regression, embedding, inverse problems.
- Object Detection with Discriminatively Trained Part-Based Models, Felzenszwalb, Girshick, McAllester and Ramanan, PAMI'10.
- Deformable Part Models are Convolutional Neural Networks, Girshick, Iandola, Darrell and Malik, CVPR'15.
- Rich Feature Hierarchies for accurate object detection and semantic segmentation, Girshick, Donahue, Darrell and Malik, CVPR'14.
- Graphical Models, message-passing algorithms and convex optimization, M. Wainwright.
- Conditional Random Fields as Recurrent Neural Networks, Zheng et al., ICCV'15.
- Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, Tompson, Jain, LeCun and Bregler, NIPS'14.
Lec10: Extensions to non-Euclidean domain. Representations of stationary processes. Properties.
- Dimensionality Reduction by Learning an Invariant Mapping, Hadsell, Chopra, LeCun, '06.
- Deep Metric Learning via Lifted Structured Feature Embedding, Oh Song, Xiang, Jegelka, Savarese, '15.
- Spectral Networks and Locally Connected Networks on Graphs, Bruna, Szlam, Zaremba, LeCun, '14.
- Spatial Transformer Networks, Jaderberg, Simonyan, Zisserman, Kavukcuoglu, '15.
- Intermittent Process Analysis with Scattering Moments, Bruna, Mallat, Bacry, Muzy, '14.
Lec11: Guest Lecture (W. Zaremba, OpenAI): Discrete Neural Turing Machines.
Lec12: Representations of Stationary Processes (contd). Sequential Data: Recurrent Neural Networks.
- Intermittent Process Analysis with Scattering Moments, Bruna, Mallat, Bacry and Muzy, Annals of Statistics, '13.
- A mathematical motivation for complex-valued convolutional networks, Tygert et al., Neural Computation'16.
- Texture Synthesis Using Convolutional Neural Networks, Gatys, Ecker, Bethge, NIPS'15.
- A Neural Algorithm of Artistic Style, Gatys, Ecker, Bethge, '15.
- Time Series Analysis and its Applications, Shumway, Stoffer, Chapter 6.
- Deep Learning, Goodfellow, Bengio, Courville, '16. Chapter 10.
Lec13: Recurrent Neural Networks (contd). Long Short Term Memory. Applications.
- Deep Learning, Goodfellow, Bengio, Courville, '16. Chapter 10.
- Generating Sequences with Recurrent Neural Networks, A. Graves.
- The Unreasonable Effectiveness of Recurrent Neural Networks, A. Karpathy.
- The Unreasonable Effectiveness of Character-level Language Models, Y. Goldberg.
Lec14: Unsupervised Learning: Curse of dimensionality, Density estimation. Graphical Models, Latent Variable models.
- Describing Multimedia Content Using Attention-based Encoder-Decoder Networks, K. Cho, A. Courville, Y. Bengio.
- Graphical Models, Exponential Families and Variational Inference, M. Wainwright, M. Jordan.
Lec15: Autoencoders. Variational Inference. Variational Autoencoders.
- Graphical Models, Exponential Families and Variational Inference, chapter 3, M. Wainwright, M. Jordan.
- Variational Inference with Stochastic Search, J. Paisley, D. Blei, M. Jordan.
- Stochastic Variational Inference, M. Hoffman, D. Blei, Wang, Paisley.
- Auto-Encoding Variational Bayes, Kingma & Welling.
- Stochastic Backpropagation and variational inference in deep latent Gaussian models, D. Rezende, S. Mohamed, D. Wierstra.
Lec16: Variational Autoencoders (contd). Normalizing Flows. Generative Adversarial Networks.
- Semi-supervised learning with Deep generative models, Kingma, Rezende, Mohamed, Welling.
- Importance Weighted Autoencoders, Burda, Grosse, Salakhutdinov.
- Variational Inference with Normalizing Flows, Rezende, Mohamed.
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics, Sohl-Dickstein et al.
- Generative Adversarial Networks, Goodfellow et al.
Lec17: Generative Adversarial Networks (contd).
- Generative Adversarial Networks, Goodfellow et al.
- Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, Denton, Chintala, Szlam, Fergus.
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford, Metz, Chintala.
Lec18: Maximum Entropy Distributions. Self-supervised models (analogies, video prediction, text, word2vec).
- Graphical Models, Exponential Families and Variational Inference, chapter 3, M. Wainwright, M. Jordan.
- An Introduction to MCMC for Machine Learning, Andrieu, de Freitas, Doucet, Jordan.
- Stochastic relaxation, Gibbs distributions and the Bayesian Restoration of Images, Geman & Geman.
- Distributed Representations of Words and Phrases and their compositionality, Mikolov et al.
- word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method, Goldberg & Levy.
Lec19: Self-supervised models (contd). Non-convex Optimization. Stochastic Optimization.
- Pixel Recurrent Neural Networks, A. van den Oord, N. Kalchbrenner, K. Kavukcuoglu.
- The tradeoffs of Large Scale Learning, Bottou, Bousquet.
- Introduction to Statistical Learning Theory, Bousquet, Boucheron, Lugosi.
Lec20: Guest Lecture (S. Chintala, Facebook AI Research), "The Adversarial Network Nonsense".
Lec21: Accelerated Gradient Descent, Regularization, Dropout.
- Convex Optimization: Algorithms and Complexity, S. Bubeck.
- Optimization, Simons Big Data Boot Camp, B. Recht.
- The Zen of Gradient Descent, M. Hardt.
- Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, M. Hardt, B. Recht, Y. Singer.
- Dropout: a simple way to prevent neural networks from Overfitting, Srivastava, Hinton et al.
Lec22: Dropout (contd). Batch Normalization, Tensor Decompositions.
- Dropout Training as Adaptive Regularization, Wager, Wang, Liang.
- Batch Normalization: accelerating Deep Network Training by Reducing internal covariate shift, Ioffe, Szegedy.
- Global Optimality in Tensor Factorization, Deep Learning and Beyond, Haeffele, Vidal.
Lec23: Guest Lecture (Yann Dauphin, Facebook AI Research), "Optimizing Deep Nets".
Lec24: Tensor Decompositions (contd), Spin Glasses.
- On the expressive power of Deep Learning: a tensor analysis, Cohen, Sharir, Shashua.
- Beating the Perils of non-convexity: Guaranteed Training of Neural Networks using Tensor methods, Janzamin, Sedghi, Anandkumar.
- The Loss Surfaces of Multilayer Networks, Choromanska, Henaff, Mathieu, Ben Arous, LeCun.