Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review
The paper reviews and extends an emerging body of theoretical results on deep learning
including the conditions under which it can be exponentially better than shallow learning. A …
Tensor networks for dimensionality reduction and large-scale optimization: Part 2 applications and future perspectives
Part 2 of this monograph builds on the introduction to tensor networks and their operations
presented in Part 1. It focuses on tensor network models for super-compressed higher-order …
On the expressive power of deep learning: A tensor analysis
It has long been conjectured that hypotheses spaces suitable for data that is compositional
in nature, such as text or images, may be more efficiently represented with deep hierarchical …
Simple recurrent units for highly parallelizable recurrence
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in
parallelizing their state computations. In this work, we propose the Simple Recurrent Unit …
Theoretical issues in deep networks
While deep learning is successful in a number of applications, it is not yet well understood
theoretically. A theoretical characterization of deep learning should answer questions about …
Toward deeper understanding of neural networks: The power of initialization and a dual view on expressivity
We develop a general duality between neural networks and compositional kernel Hilbert
spaces. We introduce the notion of a computation skeleton, an acyclic graph that succinctly …
SGD learns the conjugate kernel class of the network
A Daniely - Advances in neural information processing …, 2017 - proceedings.neurips.cc
We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to
learn, in polynomial time, a function that is competitive with the best function in the conjugate …
Deriving neural architectures from sequence and graph kernels
The design of neural architectures for structured objects is typically guided by experimental
insights rather than a formal process. In this work, we appeal to kernels over combinatorial …
Deep randomized neural networks
C Gallicchio, S Scardapane - Recent Trends in Learning From Data …, 2020 - Springer
Randomized Neural Networks explore the behavior of neural systems where the
majority of connections are fixed, either in a stochastic or a deterministic fashion. Typical …