Dear all,
Heads up: Umut Şimşekli's in-person talk at the UvA is today:
Umut Şimşekli (INRIA/École Normale Supérieure, https://www.di.ens.fr/~simsekli/)
Monday November 14, 16h00-17h00
In person, at the University of Amsterdam
Location: Science Park 904, Room A1.24
Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms
Understanding generalization in deep learning has been one of the
major challenges in statistical learning theory over the last
decade. While recent work has illustrated that the dataset and the
training algorithm must be taken into account in order to obtain
meaningful generalization bounds, it is still theoretically not
clear which properties of the data and the algorithm determine the
generalization performance. In this talk, I will approach this
problem from a dynamical systems theory perspective and represent
stochastic optimization algorithms as random iterated function
systems (IFS). Such IFSs are well studied in the dynamical systems
literature and, under mild assumptions, can be shown to be ergodic
with an invariant measure that is often supported on sets with a
fractal structure. We will prove that the generalization error of a
stochastic optimization algorithm can be bounded based on the
‘complexity’ of the fractal structure that underlies its invariant
measure. Leveraging results from dynamical systems theory, we will
show that the generalization error can be explicitly linked to the
choice of the algorithm (e.g., stochastic gradient descent – SGD),
algorithm hyperparameters (e.g., step-size, batch-size), and the
geometry of the problem (e.g., Hessian of the loss). We will further
specialize our results to specific problems (e.g., linear/logistic
regression, one-hidden-layer neural networks) and algorithms
(e.g., SGD and preconditioned variants), and obtain analytical
estimates for our bound. For modern neural networks, we will develop
an efficient algorithm to compute this bound and support our theory
with a range of experiments.
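
To make the IFS viewpoint in the abstract a bit more concrete, here is a
minimal sketch (an illustration, not code from the paper): for least-squares
regression, each mini-batch induces an affine map in the parameters, and
mini-batch SGD amounts to repeatedly applying a randomly chosen map, i.e. a
random IFS whose long-run iterates sample the invariant measure. The toy
data, fixed batch partition, and step size below are illustrative assumptions.

# Sketch: mini-batch SGD on least-squares regression viewed as a random IFS
# of affine maps (illustrative assumptions, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = X w_true + noise (illustrative)
n, d = 200, 2
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0])
y = X @ w_true + 0.1 * rng.normal(size=n)

eta, batch_size = 0.1, 20
batches = np.split(rng.permutation(n), n // batch_size)  # fixed batch partition

# Each batch b induces an affine map f_b(w) = A_b w + c_b, since the SGD step
#   w <- w - (eta/|b|) X_b^T (X_b w - y_b)
# is affine in w. SGD with random batch selection = iterating a random IFS.
maps = []
for b in batches:
    Xb, yb = X[b], y[b]
    A = np.eye(d) - (eta / len(b)) * Xb.T @ Xb
    c = (eta / len(b)) * Xb.T @ yb
    maps.append((A, c))

# Iterate the IFS: at each step pick one of the maps uniformly at random.
# Under mild contractivity assumptions the iterates converge in distribution
# to an invariant measure, whose support can have a fractal structure.
w = np.zeros(d)
samples = []
for k in range(20000):
    A, c = maps[rng.integers(len(maps))]
    w = A @ w + c
    if k > 1000:  # discard burn-in, keep approximately stationary samples
        samples.append(w.copy())

samples = np.array(samples)
print("empirical mean of invariant measure:", samples.mean(axis=0))
print("least-squares solution:", np.linalg.lstsq(X, y, rcond=None)[0])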
The talk is based on the following publication:
Camuto, A., Deligiannidis, G., Erdogdu, M. A., Gurbuzbalaban, M.,
Simsekli, U., & Zhu, L. (2021). Fractal structure and
generalization properties of stochastic optimization algorithms.
Advances in Neural Information Processing Systems, 34, 18774-18788.
Seminar organizers:
Tim van Erven
Botond Szabo
https://mschauer.github.io/StructuresSeminar/
--
Tim van Erven <tim@timvanerven.nl>
www.timvanerven.nl