Efficient algorithms for learning from coarse labels

D Fotakis, A Kalavasis, V Kontonis… - … on Learning Theory, 2021 - proceedings.mlr.press
For many learning problems one may not have access to fine-grained label information; e.g.,
an image can be labeled as husky, dog, or even animal depending on the expertise of the …

SGD learns one-layer networks in WGANs

Q Lei, J Lee, A Dimakis… - … Conference on Machine …, 2020 - proceedings.mlr.press
Generative adversarial networks (GANs) are a widely used framework for learning
generative models. Wasserstein GANs (WGANs), one of the most successful variants of …

Learning (very) simple generative models is hard

S Chen, J Li, Y Li - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Motivated by the recent empirical successes of deep generative models, we study the
computational complexity of the following unsupervised learning problem. For an unknown …

A modular analysis of provable acceleration via Polyak's momentum: Training a wide ReLU network and a deep linear network

JK Wang, CH Lin, JD Abernethy - … Conference on Machine …, 2021 - proceedings.mlr.press
Incorporating a so-called “momentum” dynamic in gradient descent methods is widely used
in neural net training as it has been broadly observed that, at least empirically, it often leads …
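
For concreteness, here is a minimal sketch of the heavy-ball (Polyak momentum) update that the snippet refers to, applied to a toy quadratic objective; the step size and momentum value below are illustrative choices, not the ones analyzed in the paper.

```python
import numpy as np

# Toy quadratic objective f(w) = 0.5 * w^T A w - b^T w, with gradient A w - b.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])
grad = lambda w: A @ w - b

eta, beta = 0.1, 0.9          # step size and momentum parameter (illustrative values)
w_prev = w = np.zeros(2)

for _ in range(200):
    # Polyak's heavy-ball update: gradient step plus a multiple of the last displacement.
    w, w_prev = w - eta * grad(w) + beta * (w - w_prev), w

print(w, np.linalg.solve(A, b))   # iterate vs. the exact minimizer A^{-1} b
```

The only change relative to plain gradient descent is the extra beta * (w - w_prev) displacement term.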

Learning a 1-layer conditional generative model in total variation

A Jalal, J Kang, A Uppal… - Advances in Neural …, 2024 - proceedings.neurips.cc
A conditional generative model is a method for sampling from a conditional distribution
$p(y \mid x)$. For example, one may want to sample an image of a cat given the label "cat". A …
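
The paper's exact model is not reproduced here, but as a hypothetical illustration of what a one-layer conditional generator can look like, the sketch below pushes a condition vector and fresh Gaussian noise through a single ReLU layer; every dimension and parameter name is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-layer conditional generator: given a condition x (e.g. a label
# embedding), draw latent noise z ~ N(0, I) and output y = ReLU(W x + V z + b).
d_x, d_z, d_y = 4, 8, 16
W = rng.normal(size=(d_y, d_x))
V = rng.normal(size=(d_y, d_z))
b = rng.normal(size=d_y)

def sample_conditional(x, n_samples=5):
    """Draw n_samples from the model's conditional distribution p(y | x)."""
    z = rng.normal(size=(n_samples, d_z))
    return np.maximum(W @ x + z @ V.T + b, 0.0)   # ReLU activation

x_cat = rng.normal(size=d_x)            # stand-in embedding for the label "cat"
print(sample_conditional(x_cat).shape)  # (5, 16)
```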

Learning polynomial transformations via generalized tensor decompositions

S Chen, J Li, Y Li, AR Zhang - Proceedings of the 55th Annual ACM …, 2023 - dl.acm.org
We consider the problem of learning high dimensional polynomial transformations of
Gaussians. Given samples of the form f(x), where x ∼ N(0, I_r) is hidden and f: ℝ^r → ℝ^d is a …
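
As a hypothetical illustration of the data model described in the snippet (not code from the paper), the sketch below draws hidden seeds x ∼ N(0, I_r) and releases only the transformed samples f(x), with f taken to be a random quadratic map for concreteness.

```python
import numpy as np

rng = np.random.default_rng(1)
r, d, n = 3, 10, 1000          # hidden dimension, observed dimension, sample count

# Random degree-2 polynomial map f: R^r -> R^d with quadratic coefficients A,
# linear coefficients B, and constant c; these are illustrative, not the paper's construction.
A = rng.normal(size=(d, r, r))
B = rng.normal(size=(d, r))
c = rng.normal(size=d)

def f(x):
    return np.einsum('dij,i,j->d', A, x, x) + B @ x + c

X_hidden = rng.normal(size=(n, r))             # x ~ N(0, I_r), never shown to the learner
samples = np.array([f(x) for x in X_hidden])   # the learner only observes f(x)
print(samples.shape)                            # (1000, 10)
```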

Improved linear convergence of training CNNs with generalizability guarantees: A one-hidden-layer case

S Zhang, M Wang, J Xiong, S Liu… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
We analyze the learning problem of one-hidden-layer nonoverlapping convolutional neural
networks with the rectified linear unit (ReLU) activation function from the perspective of …

Lower bounds on the total variation distance between mixtures of two Gaussians

S Davies, A Mazumdar, S Pal… - International …, 2022 - proceedings.mlr.press
Mixtures of high dimensional Gaussian distributions have been studied extensively in
statistics and learning theory. While the total variation distance appears naturally in the …
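
Not from the paper, but to make the quantity concrete: the sketch below numerically estimates the total variation distance TV(p, q) = ½ ∫ |p − q| between two one-dimensional mixtures of two Gaussians on a grid; the mixture parameters are arbitrary examples.

```python
import numpy as np
from scipy.stats import norm

def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture evaluated on the grid x."""
    return sum(w * norm.pdf(x, m, s) for w, m, s in zip(weights, means, stds))

x = np.linspace(-15, 15, 200001)                           # dense grid for numerical integration
p = mixture_pdf(x, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])    # arbitrary example mixtures
q = mixture_pdf(x, [0.3, 0.7], [-1.0, 2.0], [1.0, 1.5])

# TV(p, q) = 0.5 * integral of |p - q|, approximated by the trapezoidal rule.
tv = 0.5 * np.trapz(np.abs(p - q), x)
print(f"estimated TV distance: {tv:.4f}")
```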

Agnostic learning of general ReLU activation using gradient descent

P Awasthi, A Tang, A Vijayaraghavan - arXiv preprint arXiv:2208.02711, 2022 - arxiv.org
We provide a convergence analysis of gradient descent for the problem of agnostically
learning a single ReLU function under Gaussian distributions. Unlike prior work that studies …
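
As a simplified sketch of the kind of procedure the snippet describes (not the paper's algorithm or its agnostic analysis), the code below runs plain gradient descent on the empirical squared loss for a single ReLU with standard Gaussian inputs and a noisy ReLU teacher.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 5, 20000

w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

X = rng.normal(size=(n, d))                                    # x ~ N(0, I_d)
y = np.maximum(X @ w_star, 0.0) + 0.1 * rng.normal(size=n)     # noisy ReLU labels

w = 0.1 * rng.normal(size=d)                                   # small random initialization
eta = 0.5
for _ in range(500):
    pred = np.maximum(X @ w, 0.0)
    # Gradient of the empirical squared loss; ReLU's derivative is the indicator X @ w > 0.
    grad = ((pred - y)[:, None] * (X @ w > 0)[:, None] * X).mean(axis=0)
    w -= eta * grad

print(np.linalg.norm(w - w_star))   # should shrink as gradient descent fits the data
```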

A mathematical framework for learning probability distributions

H Yang - arXiv preprint arXiv:2212.11481, 2022 - arxiv.org
The modeling of probability distributions, specifically generative modeling and density
estimation, has become an immensely popular subject in recent years by virtue of its …