An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
Propagation of chaos: a review of models, methods and applications. I. Models and methods
LP Chaintron, A Diez - arXiv preprint arXiv:2203.00446, 2022 - arxiv.org
The notion of propagation of chaos for large systems of interacting particles originates in
statistical physics and has recently become a central notion in many areas of applied …
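For concreteness, the usual definition runs as follows (a sketch in our own notation, not quoted from the review): an exchangeable N-particle system (X^{1,N}, ..., X^{N,N}) is called μ-chaotic when, for every fixed k,

\mathrm{Law}\big(X^{1,N}, \dots, X^{k,N}\big) \;\xrightarrow[N \to \infty]{}\; \mu^{\otimes k},

that is, any fixed finite subset of particles becomes asymptotically independent with common law μ; propagation of chaos asks that this property, assumed at time 0, persist under the dynamics.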
Deep learning: a statistical viewpoint
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …
The shaped transformer: Attention models in the infinite depth-and-width limit
In deep learning theory, the covariance matrix of the representations serves as a proxy to
examine the network's trainability. Motivated by the success of Transformers, we study the …
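As a rough indication of the object involved (a generic sketch in our notation; the paper's precise scaling and setting differ): for width-n representations x^ℓ_α of inputs α at layer ℓ, one tracks

\Sigma^{\ell}_{\alpha\beta} \;=\; \frac{1}{n}\, x^{\ell}_{\alpha} \cdot x^{\ell}_{\beta},

and degeneracies of Σ^ℓ at large depth, such as rank collapse, signal that the architecture is hard to train.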
Wide neural networks of any depth evolve as linear models under gradient descent
A longstanding goal in deep learning research has been to precisely characterize training
and generalization. However, the often complex loss landscapes of neural networks have …
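The claim can be stated concretely (a standard NTK-regime sketch, in our notation): for a sufficiently wide network f(x; θ) trained by gradient flow from initialization θ_0,

f(x; \theta_t) \;\approx\; f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^{\top}\,(\theta_t - \theta_0),

so under squared loss the training outputs follow a linear ODE governed by the neural tangent kernel \Theta(x, x') = \nabla_\theta f(x; \theta_0)^{\top} \nabla_\theta f(x'; \theta_0), which stays essentially constant during training.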
Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
Recent works have cast some light on the mystery of why deep nets fit any data and
generalize despite being very overparametrized. This paper analyzes training and …
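For orientation, the analysis is organized around a data-dependent Gram matrix for two-layer ReLU networks (a hedged sketch in our transcription):

H^{\infty}_{ij} \;=\; \mathbb{E}_{w \sim \mathcal{N}(0, I)}\big[\, x_i^{\top} x_j \,\mathbb{1}\{ w^{\top} x_i \ge 0,\; w^{\top} x_j \ge 0 \} \,\big],

with the convergence rate of gradient descent controlled by the spectrum of H^∞ and generalization bounded by quantities of the form \sqrt{\, y^{\top} (H^{\infty})^{-1} y \,/\, n \,}.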
Surprises in high-dimensional ridgeless least squares interpolation
Interpolators—estimators that achieve zero training error—have attracted growing attention
in machine learning, mainly because state-of-the-art neural networks appear to be models of …
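Concretely (a standard formulation rather than a quote): with p > n and X ∈ ℝ^{n×p} of full row rank, the ridgeless fit is the minimum-ℓ2-norm interpolator

\hat\beta \;=\; \arg\min\{\, \|\beta\|_2 : X\beta = y \,\} \;=\; X^{\top} (X X^{\top})^{-1} y \;=\; \lim_{\lambda \to 0^+} (X^{\top} X + \lambda I)^{-1} X^{\top} y,

whose out-of-sample risk the paper characterizes in the high-dimensional limit.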
Gradient descent finds global minima of deep neural networks
Gradient descent finds a global minimum in training deep neural networks despite the
objective function being non-convex. The current paper proves gradient descent achieves …
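The flavor of the guarantee (schematic, in our notation): with enough overparameterization, the predictions u(k) at step k of gradient descent contract linearly to the labels y,

\| y - u(k) \|_2^2 \;\le\; \Big(1 - \tfrac{\eta \lambda_0}{2}\Big)^{k}\, \| y - u(0) \|_2^2,

where η is the step size and λ_0 > 0 lower-bounds the least eigenvalue of an associated Gram matrix kept positive along the trajectory.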
On the global convergence of gradient descent for over-parameterized models using optimal transport
Many tasks in machine learning and signal processing can be solved by minimizing a
convex function of a measure. This includes sparse spikes deconvolution or training a …
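Schematically (our notation, eliding the paper's exact assumptions), the problems in question take the form

\min_{\mu \in \mathcal{P}(\Theta)} \; F(\mu) \;=\; R\!\Big( \int_{\Theta} \Phi(\theta)\, d\mu(\theta) \Big), \qquad R \text{ convex};

training a two-layer network fits with \Phi(\theta) = (x \mapsto a\,\sigma(w^{\top} x)) for \theta = (a, w), and gradient descent on many neurons discretizes a Wasserstein gradient flow on F, which under homogeneity and initialization conditions reaches global minimizers.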
A mean field view of the landscape of two-layer neural networks
Multilayer neural networks are among the most powerful models in machine learning, yet the
fundamental reasons for this success defy mathematical understanding. Learning a neural …
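In schematic form (our notation): as the number of neurons grows, one-pass SGD on a two-layer network tracks a distributional dynamics of Wasserstein gradient-flow type,

\partial_t \rho_t \;=\; \nabla_{\theta} \cdot \big( \rho_t \, \nabla_{\theta} \Psi(\theta; \rho_t) \big),

where ρ_t is the distribution of neuron parameters and Ψ(θ; ρ) is the effective single-neuron potential induced by the population risk.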