Deep learning in electron microscopy
JM Ede - Machine Learning: Science and Technology, 2021 - iopscience.iop.org
Deep learning is transforming most areas of science and technology, including electron
microscopy. This review paper offers a practical perspective aimed at developers with …
Multi-task self-supervised learning for robust speech recognition
Despite the growing interest in unsupervised learning, extracting meaningful knowledge
from unlabelled audio remains an open challenge. To take a step in this direction, we …
An improved analysis of stochastic gradient descent with momentum
SGD with momentum (SGDM) has been widely applied in many machine learning tasks, and
it is often run with dynamic stepsizes and momentum weights tuned in a stagewise …
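To make the update rule concrete, here is a minimal sketch of heavy-ball SGDM with a stagewise stepsize and momentum schedule; the schedule, constants, and toy problem are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def sgdm_stagewise(grad_fn, w, stages):
    """Heavy-ball SGD with momentum, with stepsize and momentum weight
    set per stage. `stages` is a list of (num_iters, stepsize, momentum)
    tuples; the stagewise schedule here is illustrative."""
    v = np.zeros_like(w)                      # momentum buffer
    for num_iters, eta, beta in stages:
        for _ in range(num_iters):
            g = grad_fn(w)                    # stochastic gradient
            v = beta * v + g                  # accumulate momentum
            w = w - eta * v                   # heavy-ball step
    return w

# Example: noisy quadratic with a two-stage schedule (hypothetical values).
A = np.diag([1.0, 10.0])
grad_fn = lambda w: A @ w + 0.01 * np.random.randn(2)
w_out = sgdm_stagewise(grad_fn, np.array([5.0, -3.0]),
                       stages=[(500, 0.05, 0.9), (500, 0.005, 0.9)])
```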
Understanding the role of training regimes in continual learning
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn
multiple tasks sequentially. From the perspective of the well-established plasticity-stability …
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL)
capabilities, enabling them to solve unseen tasks solely based on input contexts without …
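For concreteness, a minimal sketch of sampling linear-regression pretraining tasks for such an in-context learning setup follows; the dimension, context length, and noise level are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def sample_icl_task(d=8, n_context=16, noise=0.1, rng=None):
    """Sample one linear-regression ICL task: a random weight vector w
    defines the task, and the prompt is n_context (x_i, y_i) pairs plus
    a query input. Dimensions and noise are illustrative choices."""
    rng = rng if rng is not None else np.random.default_rng()
    w = rng.standard_normal(d)                       # task-specific weights
    X = rng.standard_normal((n_context + 1, d))      # context inputs + query
    y = X @ w + noise * rng.standard_normal(n_context + 1)
    prompt = np.concatenate([X[:-1], y[:-1, None]], axis=1)  # (x_i, y_i) pairs
    return prompt, X[-1], y[-1]                      # prompt, query, target

# A pretraining set of T distinct tasks; the paper asks how large T must be.
tasks = [sample_icl_task() for _ in range(1000)]
```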
On the generalization benefit of noise in stochastic gradient descent
It has long been argued that minibatch stochastic gradient descent can generalize better
than large batch gradient descent in deep neural networks. However, recent papers have …
Closing the generalization gap of adaptive gradient methods in training deep neural networks
Adaptive gradient methods, which adopt historical gradient information to automatically
adjust the learning rate, despite the nice property of fast convergence, have been observed …
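As a concrete example of such an adaptive update, a generic Adam-style step is sketched below; it shows how historical first and second gradient moments rescale the learning rate per coordinate, and it is not the specific variant this paper proposes.

```python
import numpy as np

def adam_step(w, g, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step: exponential moving averages of the gradient
    (m) and squared gradient (v) adapt the stepsize per coordinate."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * g            # first moment estimate
    v = beta2 * v + (1 - beta2) * g * g        # second moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, (m, v, t)

# Usage: initialize state = (np.zeros_like(w), np.zeros_like(w), 0) and call
# w, state = adam_step(w, grad(w), state) each iteration.
```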
Last-iterate convergence: Zero-sum games and constrained min-max optimization
C Daskalakis, I Panageas - arXiv preprint arXiv:1807.04252, 2018 - arxiv.org
Motivated by applications in Game Theory, Optimization, and Generative Adversarial
Networks, recent work of Daskalakis et al [DISZ17] and follow-up work of Liang and …
Last iterate is slower than averaged iterate in smooth convex-concave saddle point problems
In this paper we study the smooth convex-concave saddle point problem. Specifically, we
analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well …
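For reference, a minimal sketch of the Extragradient update on a bilinear toy problem, returning both the last and the averaged iterate; the step size and example matrix are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def extragradient(grad_x, grad_y, x, y, eta=0.1, iters=2000):
    """Extragradient (EG) for min_x max_y f(x, y): take a gradient step to
    an extrapolated point, then update using the gradients evaluated there.
    Returns both the last iterate and the averaged iterate."""
    xs, ys = [], []
    for _ in range(iters):
        x_half = x - eta * grad_x(x, y)        # extrapolation (half) step
        y_half = y + eta * grad_y(x, y)
        x = x - eta * grad_x(x_half, y_half)   # update at extrapolated point
        y = y + eta * grad_y(x_half, y_half)
        xs.append(x)
        ys.append(y)
    return (x, y), (np.mean(xs, axis=0), np.mean(ys, axis=0))

# Bilinear saddle point f(x, y) = x^T A y (toy example).
A = np.array([[1.0, 0.5], [0.0, 1.0]])
last, avg = extragradient(lambda x, y: A @ y, lambda x, y: A.T @ x,
                          np.ones(2), np.ones(2))
```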
Generalized Polyak step size for first order optimization with momentum
In machine learning applications, it is well known that carefully designed learning rate (step
size) schedules can significantly improve the convergence of commonly used first-order …
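For context, the classical Polyak step size that this work generalizes is sketched below for plain gradient descent; the paper's momentum generalization is not reproduced here, the optimal value f* is assumed known, and the example problem is hypothetical.

```python
import numpy as np

def polyak_gd(loss_fn, grad_fn, w, f_star=0.0, iters=500, eps=1e-12):
    """Gradient descent with the classical Polyak step size
        eta_t = (f(w_t) - f*) / ||g_t||^2,
    which requires the optimal value f* (assumed known here)."""
    for _ in range(iters):
        g = grad_fn(w)
        eta = (loss_fn(w) - f_star) / (np.dot(g, g) + eps)
        w = w - eta * g
    return w

# Example: least squares with an exact solution, so f* = 0.
X = np.random.randn(50, 5)
y = X @ np.arange(1.0, 6.0)
loss = lambda w: 0.5 * np.mean((X @ w - y) ** 2)
grad = lambda w: X.T @ (X @ w - y) / len(y)
w_hat = polyak_gd(loss, grad, np.ones(5))
```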