Unraveling attention via convex duality: Analysis and interpretations of vision transformers
Vision transformers using self-attention or its proposed alternatives have demonstrated
promising results in many image-related tasks. However, the underpinning inductive bias of …
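For context, the self-attention map such convex-duality analyses target is the standard scaled dot-product form (a textbook definition; the notation here is assumed, not taken from the paper):
\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^\top}{\sqrt{d_k}} \right) V,
\]
where Q, K, and V are the query, key, and value matrices and d_k is the key dimension.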
Fast convex optimization for two-layer ReLU networks: Equivalent model classes and cone decompositions
We develop fast algorithms and robust software for convex optimization of two-layer neural
networks with ReLU activation functions. Our work leverages a convex re-formulation of the …
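As a sketch of the kind of convex re-formulation this line of work builds on (the known squared-loss program from the convex neural networks literature; the symbols P, D_i, and \beta are assumptions here, not the paper's notation): enumerating the ReLU activation patterns D_i = \mathrm{diag}(\mathbb{1}[X u \ge 0]) over the data matrix X turns the non-convex two-layer training problem into the group-sparse convex program
\[
\min_{\{v_i, w_i\}} \frac{1}{2} \Big\| \sum_{i=1}^{P} D_i X (v_i - w_i) - y \Big\|_2^2 + \beta \sum_{i=1}^{P} \big( \|v_i\|_2 + \|w_i\|_2 \big)
\quad \text{s.t.} \quad (2 D_i - I_n) X v_i \ge 0, \;\; (2 D_i - I_n) X w_i \ge 0,
\]
where the cone constraints force each pair (v_i, w_i) to respect the activation pattern encoded by D_i.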
Global optimality beyond two layers: Training deep ReLU networks via convex programs
Understanding the fundamental mechanism behind the success of deep neural networks is
one of the key challenges in the modern machine learning literature. Despite numerous …
Vector-output ReLU neural network problems are copositive programs: Convex analysis of two-layer networks and polynomial-time algorithms
We describe the convex semi-infinite dual of the two-layer vector-output ReLU neural
network training problem. This semi-infinite dual admits a finite-dimensional representation …
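For background on the copositive programs of the title (a standard definition, not this paper's notation): a symmetric matrix M is copositive if
\[
x^\top M x \ge 0 \quad \text{for all } x \ge 0,
\]
and a copositive program optimizes a linear objective over the cone of copositive matrices; such programs are NP-hard in general, which is why the title's polynomial-time algorithms for structured cases are notable.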
Demystifying batch normalization in ReLU networks: Equivalent convex optimization models and implicit regularization
Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training
of deep neural networks. Despite its empirical success, a full theoretical understanding of …
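For reference, BN is the usual per-feature normalization with learned scale and shift (textbook form):
\[
\mathrm{BN}(x) = \gamma \, \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} + \beta,
\]
where \mu_B and \sigma_B^2 are the mean and variance over the batch B and \gamma, \beta are learned parameters.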
A neural tangent kernel perspective of GANs
We propose a novel theoretical framework of analysis for Generative Adversarial Networks
(GANs). We reveal a fundamental flaw of previous analyses which, by incorrectly modeling …
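The neural tangent kernel of the title is the standard object (a textbook definition, not this paper's notation):
\[
K_\theta(x, x') = \nabla_\theta f(x; \theta)^\top \nabla_\theta f(x'; \theta),
\]
which, for sufficiently wide networks, stays approximately fixed at its initialization value throughout training.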
Path regularization: A convexity and sparsity-inducing regularization for parallel ReLU networks
Understanding the fundamental principles behind the success of deep neural networks is
one of the most important open questions in the current literature. To this end, we study the …
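For orientation, the standard path-norm regularizer from the literature sums, over all input-to-output paths, the product of absolute weights along each path (a common definition; the path regularization studied in this paper may differ in detail):
\[
R_{\mathrm{path}}(\theta) = \sum_{p \in \text{paths}} \prod_{e \in p} |w_e|.
\]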
Globally optimal training of neural networks with threshold activation functions
Threshold activation functions are highly preferable in neural networks due to their efficiency
in hardware implementations. Moreover, their mode of operation is more interpretable and …
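The threshold (unit-step) activation in question is, in its standard form,
\[
\sigma(t) = \mathbb{1}\{ t \ge 0 \} = \begin{cases} 1, & t \ge 0, \\ 0, & t < 0, \end{cases}
\]
whose binary output is what makes hardware implementations cheap: each neuron reduces to a comparator.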
Fixing the NTK: from neural network linearizations to exact convex programs
RV Dwaraknath, T Ergen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recently, theoretical analyses of deep neural networks have broadly focused on two
directions: 1) Providing insight into neural network training by SGD in the limit of infinite …
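The linearizations of the title refer to the standard first-order (NTK-regime) Taylor expansion of the network around its initialization \theta_0:
\[
f(x; \theta) \approx f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta - \theta_0),
\]
under which gradient descent reduces to kernel regression with the neural tangent kernel.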
Parallel deep neural networks have zero duality gap
Training deep neural networks is a challenging non-convex optimization problem. Recent
work has proven that strong duality holds (i.e., a zero duality gap) for regularized …
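In standard Lagrangian terms, for a Lagrangian \mathcal{L}(\theta, \lambda) of a constrained reformulation of the training problem, weak duality always gives d^* \le p^*, and a zero duality gap is the statement that the bound is tight:
\[
d^* = \max_{\lambda} \min_{\theta} \mathcal{L}(\theta, \lambda) \;\le\; \min_{\theta} \max_{\lambda} \mathcal{L}(\theta, \lambda) = p^*, \qquad \text{strong duality:} \; d^* = p^*.
\]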