On the optimization and generalization of multi-head attention

P Deora, R Ghaderi, H Taheri… - arXiv preprint arXiv …, 2023 - arxiv.org
The training and generalization dynamics of the Transformer's core mechanism, namely the
attention mechanism, remain under-explored. Moreover, existing analyses primarily focus on …

Emerging Directions in Bayesian Computation

S Winter, T Campbell, L Lin, S Srivastava… - Statistical …, 2024 - projecteuclid.org
Bayesian models are powerful tools for studying complex data, allowing the analyst to
encode rich hierarchical dependencies and leverage prior information. Most importantly …

Deep neural networks for parameterized homogenization in concurrent multiscale structural optimization

N Black, AR Najafi - Structural and Multidisciplinary Optimization, 2023 - Springer
Concurrent multiscale structural optimization is concerned with the improvement of
macroscale structural performance through the design of microscale architectures. The …

Stability and generalization analysis of gradient methods for shallow neural networks

Y Lei, R Jin, Y Ying - Advances in Neural Information …, 2022 - proceedings.neurips.cc
While significant theoretical progress has been achieved, unveiling the generalization
mystery of overparameterized neural networks remains largely elusive. In this paper, we …

Learning trajectories are generalization indicators

J Fu, Z Zhang, D Yin, Y Lu… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper explores the connection between learning trajectories of Deep Neural Networks
(DNNs) and their generalization capabilities when optimized using (stochastic) gradient …

Machine learning and the future of Bayesian computation

S Winter, T Campbell, L Lin, S Srivastava… - arXiv preprint arXiv …, 2023 - arxiv.org
Bayesian models are a powerful tool for studying complex data, allowing the analyst to
encode rich hierarchical dependencies and leverage prior information. Most importantly …

Toward better PAC-Bayes bounds for uniformly stable algorithms

S Zhou, Y Lei, A Kabán - Advances in Neural Information …, 2023 - proceedings.neurips.cc
We give sharper bounds for uniformly stable randomized algorithms in a PAC-Bayesian
framework, which improve the existing results by up to a factor of $\sqrt{n}$ (ignoring a log …

Generalization error bounds for iterative learning algorithms with bounded updates

J Fu, N Zheng - arXiv preprint arXiv:2309.05077, 2023 - arxiv.org
This paper explores the generalization characteristics of iterative learning algorithms with
bounded updates for non-convex loss functions, employing information-theoretic …

Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent

J Wang, H Chen - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Decentralized Stochastic Gradient Descent (D-SGD) is a communication-efficient approach
for learning from large, distributed datasets. Inspired by parallel …

Sharper Bounds for Uniformly Stable Algorithms with Stationary Mixing Process

S Fu, Y Lei, Q Cao, X Tian, D Tao - The Eleventh International …, 2023 - openreview.net
Generalization analysis of learning algorithms typically builds on the critical assumption that
training examples are independently and identically distributed, an assumption often violated in …