3 CFU – Pietro Rotondo

1. Introduction to supervised learning: neural networks as a variational ansatz,
generalisation, loss functions (classification and regression), gradient descent
and the backpropagation algorithm (a code sketch follows this list).
2. The problem of generalisation in overparametrised neural networks: bias-
variance tradeoff. One basic (but fundamental) theorem in Statistical Learning Theory.
3. The building block of a neural network: the perceptron model. Geometrical
interpretation. Linear programming (basics). Storage capacity (definition).
4. Analytical evaluation of the storage capacity: the combinatorial approach
(Cover's theorem; a numerical sketch follows this list) and the statistical
physics approach (Gardner volume). The replica trick. Gaussian integrals and
the saddle-point method. The teacher-student paradigm for generalisation.

5. Intermezzo: the Hopfield model for associative memories. Replica computation
of the storage capacity and the spin-glass phase transition (a retrieval sketch
in code follows this list).

6. Enhancing the expressivity of a perceptron: polynomial mappings.
Geometrical interpretation. Support vector machines. The equivalent Wolfe-dual
learning problem and non-linear kernels (a kernel-trick check follows this list).
7. State-of-the-art architectures for machine learning: deep neural networks.
Basic facts. Open problems in deep learning theory: the problem of
generalisation in overparametrised models. Loss landscape. Learning
algorithms.
8. One analytically solvable limit of deep learning: the infinite-width limit.
Gaussian processes. Gradient descent and the neural tangent kernel. Bayesian
learning and the NNGP kernel (the kernel recursion is sketched after this list).
Physical implications for feature learning.
9. Beyond the infinite-width limit: statistical physics of deep linear networks.
Why the problem is trivial. Why the problem is interesting. An exact
calculation for one-hidden-layer linear networks. Beyond linear networks.
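
As an illustration of point 1, here is a minimal sketch (not official course
material; the toy task, network width and learning rate are arbitrary choices
made only for illustration) of gradient descent with backpropagation for a
one-hidden-layer network:

# Illustrative sketch: one-hidden-layer network trained by gradient descent,
# with the backward pass (chain rule) written out explicitly.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(200, 1))   # toy regression data
y = np.sin(X)

H, lr = 20, 0.05                                # hidden width, learning rate (arbitrary)
W1 = rng.normal(0.0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 1.0 / np.sqrt(H), (H, 1)); b2 = np.zeros(1)

for step in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    loss = np.mean((pred - y) ** 2)             # squared loss (regression)

    # backward pass: gradients of the loss, layer by layer
    g_pred = 2.0 * (pred - y) / len(X)
    g_W2, g_b2 = h.T @ g_pred, g_pred.sum(0)
    g_h = g_pred @ W2.T
    g_pre = g_h * (1.0 - h ** 2)                # tanh'(z) = 1 - tanh(z)^2
    g_W1, g_b1 = X.T @ g_pre, g_pre.sum(0)

    # gradient-descent update
    W1 -= lr * g_W1; b1 -= lr * g_b1
    W2 -= lr * g_W2; b2 -= lr * g_b2

print("final training loss:", loss)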
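
Point 4 mentions Cover's theorem, which counts the linearly separable
dichotomies of P points in general position in N dimensions as
C(P, N) = 2 * sum_{k=0}^{N-1} binomial(P-1, k). The following small numerical
sketch (my own illustration, not course code) shows the separable fraction
C / 2^P collapsing around the storage capacity alpha = P/N = 2:

# Illustrative sketch of Cover's counting function and the alpha = 2 transition.
from math import comb

def separable_fraction(P, N):
    C = 2 * sum(comb(P - 1, k) for k in range(N))   # Cover's counting function
    return C / 2 ** P                                # fraction of separable dichotomies

for N in (10, 100, 1000):
    for alpha in (1.0, 1.5, 2.0, 2.5, 3.0):
        P = int(alpha * N)
        print(f"N={N:5d}  alpha={alpha:.1f}  fraction={separable_fraction(P, N):.3f}")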
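
For the Hopfield intermezzo of point 5, here is a minimal retrieval experiment
with Hebbian couplings J_ij = (1/N) * sum_mu xi_i^mu xi_j^mu; the system size,
load and noise level are arbitrary illustrative choices, kept well below the
critical capacity alpha_c ~ 0.138:

# Illustrative sketch: Hebbian Hopfield network retrieving a corrupted pattern.
import numpy as np

rng = np.random.default_rng(1)
N, P = 500, 25                                  # load alpha = P/N = 0.05 < 0.138
patterns = rng.choice([-1, 1], size=(P, N))

J = (patterns.T @ patterns).astype(float) / N   # Hebbian couplings
np.fill_diagonal(J, 0.0)                        # no self-coupling

# start from pattern 0 with 10% of the spins flipped
s = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
s[flip] *= -1

for sweep in range(10):                         # asynchronous zero-temperature dynamics
    for i in rng.permutation(N):
        s[i] = 1 if J[i] @ s >= 0 else -1

overlap = np.mean(s * patterns[0])              # retrieval quality m in [-1, 1]
print("overlap with the stored pattern:", overlap)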
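
For point 6, a quick numerical check (illustration only) of the kernel trick
behind the Wolfe dual: the degree-2 polynomial kernel K(x, z) = (x . z)^2
coincides with an ordinary inner product after the explicit monomial mapping
phi(x) = (x1^2, x2^2, sqrt(2) x1 x2); the two-dimensional input is an
arbitrary choice made here for readability:

# Illustrative check: polynomial kernel = inner product in an explicit feature space.
import numpy as np

def phi(x):
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

rng = np.random.default_rng(2)
x, z = rng.normal(size=2), rng.normal(size=2)

print("kernel value      :", np.dot(x, z) ** 2)
print("feature-map value :", np.dot(phi(x), phi(z)))   # same number up to rounding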
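
For point 8, a sketch of the NNGP kernel recursion of a fully connected ReLU
network in the infinite-width limit, using the closed form of
E[relu(u) relu(v)] for jointly Gaussian (u, v) (the arc-cosine kernel); the
weight/bias variances and the depth are illustrative choices, not the course's
conventions:

# Illustrative sketch: layer-by-layer NNGP kernel for an infinitely wide ReLU network.
import numpy as np

def nngp_kernel(X, depth, sigma_w2=2.0, sigma_b2=0.0):
    d = X.shape[1]
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d          # input-layer kernel
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        cos = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
        theta = np.arccos(cos)
        # E[relu(u) relu(v)] for (u, v) ~ N(0, [[K_xx, K_xz], [K_xz, K_zz]])
        E = np.outer(diag, diag) * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)
        K = sigma_b2 + sigma_w2 * E
    return K

X = np.random.default_rng(3).normal(size=(5, 10))    # 5 inputs in 10 dimensions
print(nngp_kernel(X, depth=3))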
