Efficient algorithms for learning from coarse labels
D Fotakis, A Kalavasis, V Kontonis… - … on Learning Theory, 2021 - proceedings.mlr.press
For many learning problems one may not have access to fine-grained label information; e.g.,
an image can be labeled as husky, dog, or even animal depending on the expertise of the …
SGD learns one-layer networks in WGANs
Generative adversarial networks (GANs) are a widely used framework for learning
generative models. Wasserstein GANs (WGANs), one of the most successful variants of …
Learning (very) simple generative models is hard
Motivated by the recent empirical successes of deep generative models, we study the
computational complexity of the following unsupervised learning problem. For an unknown …
A modular analysis of provable acceleration via Polyak's momentum: Training a wide ReLU network and a deep linear network
Incorporating a so-called “momentum” dynamic in gradient descent methods is widely used
in neural net training as it has been broadly observed that, at least empirically, it often leads …
Learning a 1-layer conditional generative model in total variation
A conditional generative model is a method for sampling from a conditional distribution $p(y \mid x)$.
For example, one may want to sample an image of a cat given the label "cat". A …
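As a toy illustration of sampling from a conditional distribution $p(y \mid x)$ (the labels and label-dependent Gaussians here are assumptions for the sketch, not the model studied in the paper):

```python
import random

random.seed(0)

# Toy conditional model: p(y | x) is a Gaussian whose mean depends on the label x
# (the labels and means below are hypothetical)
means = {"cat": 0.0, "dog": 5.0}

def sample(label):
    # Draw y ~ N(mean(label), 1), i.e., one sample from p(y | x = label)
    return random.gauss(means[label], 1.0)

ys = [sample("cat") for _ in range(1000)]
empirical_mean = sum(ys) / len(ys)
print(abs(empirical_mean) < 0.5)  # empirical mean is near 0 for label "cat"
```

The learning question the paper asks is the reverse direction: recovering such a conditional sampler from data, with guarantees in total variation distance.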
Learning polynomial transformations via generalized tensor decompositions
We consider the problem of learning high-dimensional polynomial transformations of
Gaussians. Given samples of the form f(x), where x ∼ N(0, I_r) is hidden and f: ℝ^r → ℝ^d is a …
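A minimal sketch of this data-generating process, assuming small dimensions and a hypothetical degree-2 polynomial map f (not the paper's construction): the learner sees only f(x), while the Gaussian seed x stays hidden.

```python
import numpy as np

rng = np.random.default_rng(0)
r, d, n = 3, 5, 1000  # hidden dim, observed dim, sample count (assumed)

# Hypothetical polynomial map f: R^r -> R^d with linear and quadratic terms
A = rng.standard_normal((d, r))
B = rng.standard_normal((d, r))

def f(x):
    # Each output coordinate is a degree-2 polynomial in the hidden seed x
    return A @ x + (B @ x) ** 2

# Observed dataset: f(x) for hidden x ~ N(0, I_r)
samples = np.array([f(rng.standard_normal(r)) for _ in range(n)])
print(samples.shape)  # (1000, 5)
```

The learning problem is to recover (a distribution close to) the law of f(x) from such samples alone.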
Improved linear convergence of training cnns with generalizability guarantees: A one-hidden-layer case
We analyze the learning problem of one-hidden-layer nonoverlapping convolutional neural
networks with the rectified linear unit (ReLU) activation function from the perspective of …
Lower bounds on the total variation distance between mixtures of two gaussians
Mixtures of high dimensional Gaussian distributions have been studied extensively in
statistics and learning theory. While the total variation distance appears naturally in the …
Agnostic learning of general ReLU activation using gradient descent
We provide a convergence analysis of gradient descent for the problem of agnostically
learning a single ReLU function under Gaussian distributions. Unlike prior work that studies …
A mathematical framework for learning probability distributions
H Yang - arXiv preprint arXiv:2212.11481, 2022 - arxiv.org
The modeling of probability distributions, specifically generative modeling and density
estimation, has become an immensely popular subject in recent years by virtue of its …