Realistic images concentrate on a low-dimensional manifold in input space.
How does the structure of real-world data impact learning in neural networks? This is a key challenge for the theory of deep learning. We started addressing this question by introducing a model for structured data sets that we call the “hidden manifold model” (HMM). The HMM is a statistical model that generates high-dimensional data from low-dimensional inputs. We showed experimentally that the HMM reproduces some effects seen when training neural networks on practical data sets for image classification. We also introduced analytical techniques [1,2] that allow us to study the dynamics and the performance of two-layer neural networks in this model.
SG, B. Loureiro, G. Reeves, M. Mézard, F. Krzakala, and L. Zdeborová, The Gaussian equivalence of generative models for learning with two-layer neural networks, arXiv:2006.14709
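The generative side of the HMM can be sketched in a few lines of NumPy: inputs in a high-dimensional space are obtained by pushing low-dimensional latent variables through a fixed feature matrix and a nonlinearity. The dimensions, the tanh nonlinearity, and the latent labelling rule below are illustrative choices, not fixed by the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden manifold model sketch: inputs in R^N generated from latent
# variables in R^D with D << N. All sizes, the tanh nonlinearity and
# the labelling rule are illustrative choices.
D, N, P = 10, 500, 1000           # latent dim, input dim, number of samples

F = rng.standard_normal((D, N))   # fixed feature matrix spanning the manifold
C = rng.standard_normal((P, D))   # latent coordinates of the P samples
X = np.tanh(C @ F / np.sqrt(D))   # inputs concentrate on a D-dim manifold in R^N

# Labels depend on the inputs only through the latent coordinates
w = rng.standard_normal(D)
y = np.sign(C @ w)
print(X.shape, y.shape)           # (1000, 500) (1000,)
```

Although each input has N = 500 components, the data set only has D = 10 effective degrees of freedom, which is the structural property the model is designed to capture.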
The learning algorithm shapes the path of neural networks in the loss landscape. Image courtesy of S. d'Ascoli
Understanding the dynamics of learning in neural networks is a key step towards understanding their performance. We are interested in the analysis of the standard algorithm used to train neural networks, backpropagation, but we are increasingly studying alternatives to backprop, such as feedback alignment algorithms.
M. Refinetti, S. d’Ascoli, R. Ohana, SG, The dynamics of learning with feedback alignment, arXiv:2011.12428
SG, M.S. Advani, A.M. Saxe, F. Krzakala, L. Zdeborová, Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup, Advances in Neural Information Processing Systems (NeurIPS) 6979-6989 (2019), arXiv:1906.08632
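Feedback alignment replaces the transposed forward weights in the backward pass with fixed random feedback connections, which avoids the weight-transport problem of backprop. A minimal sketch for a two-layer network on a toy teacher task is below; the network sizes, learning rate, and teacher are illustrative assumptions, not the setup of any particular paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-layer student network y = w2 @ tanh(W1 @ x).
# Feedback alignment: the backward pass uses a fixed random vector b
# instead of the transposed output weights w2.
N, K, lr = 20, 5, 0.05            # input dim, hidden units, learning rate
W1 = rng.standard_normal((K, N)) / np.sqrt(N)
w2 = rng.standard_normal(K) / np.sqrt(K)
b = rng.standard_normal(K)        # fixed feedback weights, never trained

def step(x, y_true):
    global W1, w2
    h = np.tanh(W1 @ x)
    err = w2 @ h - y_true
    delta = err * b * (1 - h**2)  # backprop would use w2 here instead of b
    W1 -= lr * np.outer(delta, x)
    w2 -= lr * err * h            # output layer trained exactly as in backprop
    return 0.5 * err**2

# Toy task: regress a random linear teacher on Gaussian inputs
w_teacher = rng.standard_normal(N) / np.sqrt(N)
losses = []
for _ in range(2000):
    x = rng.standard_normal(N)
    losses.append(step(x, w_teacher @ x))
print(np.mean(losses[:200]), np.mean(losses[-200:]))
```

Despite the backward weights being random and fixed, the forward weights align with them during training, which is what allows the error signals to become useful.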
Energetic efficiency of learning
Video abstract on the energetic efficiency of learning. Click to play!
Every organism needs to gather information about its noisy environment and build models from these data to support future decisions. Information processing, however, comes at a thermodynamic cost. Stochastic thermodynamics is a powerful framework to analyse this interplay between information processing and dissipation in small, fluctuating systems far from equilibrium. We analysed simple models of neural networks and showed that the total entropy production of a network, i.e. its dissipation, bounds the information it can infer from data or learn from a teacher.
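To illustrate the kind of bookkeeping stochastic thermodynamics involves, the sketch below tracks the heat a toy Langevin learner dissipates into its thermal bath while relaxing towards a teacher. The quadratic loss, temperature, and all sizes are illustrative assumptions, not the specific models analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# A single layer of weights relaxing towards a teacher under overdamped
# Langevin dynamics at temperature T, with the dissipated heat tracked
# along the trajectory. All parameters are illustrative.
N, T, dt, steps = 10, 0.1, 1e-3, 5000
w_teacher = rng.standard_normal(N)
w = np.zeros(N)

def grad_loss(w):
    return w - w_teacher          # quadratic loss 0.5 * |w - w_teacher|^2

heat = 0.0
for _ in range(steps):
    noise = np.sqrt(2 * T * dt) * rng.standard_normal(N)
    dw = -grad_loss(w) * dt + noise
    # Stratonovich (midpoint) convention for the heat increment
    force_mid = -(grad_loss(w) + grad_loss(w + dw)) / 2
    heat += force_mid @ dw
    w += dw

# Entropy produced in the medium is the dissipated heat over temperature
print(heat / T)
```

In this picture, the entropy produced along the trajectory is the quantity that bounds how much the weights can learn about the teacher, which is the trade-off the paragraph above describes.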