Mollifying networks

Entropy-SGD: Biasing gradient descent into wide valleys

P Chaudhari, A Choromanska, S Soatto… - Journal of Statistical …, 2019 - iopscience.iop.org

This paper proposes a new optimization algorithm called Entropy-SGD for training deep
neural networks that is motivated by the local geometry of the energy landscape. Local …

Cited by 817 · Related articles · All 14 versions

Regularization for deep learning: A taxonomy

J Kukačka, V Golkov, D Cremers - arXiv preprint arXiv:1710.10686, 2017 - arxiv.org

Regularization is one of the crucial ingredients of deep learning, yet the term regularization
has various definitions, and regularization methods are often studied separately from each …

Cited by 501 · Related articles · All 8 versions

Online deep learning: Learning deep neural networks on the fly

D Sahoo, Q Pham, J Lu, SCH Hoi - arXiv preprint arXiv:1711.03705, 2017 - arxiv.org

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch learning
setting, which requires the entire training data to be made available prior to the learning …

Cited by 385 · Related articles · All 9 versions

C-MIL: Continuation multiple instance learning for weakly supervised object detection

F Wan, C Liu, W Ke, X Ji, J Jiao… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Weakly supervised object detection (WSOD) is a challenging task when provided with image
category supervision but required to simultaneously learn object locations and object …

Cited by 275 · Related articles · All 7 versions

Active bias: Training more accurate neural networks by emphasizing high variance samples

HS Chang, E Learned-Miller… - Advances in Neural …, 2017 - proceedings.neurips.cc

Self-paced learning and hard example mining re-weight training instances to improve
learning accuracy. This paper presents two improved alternatives based on lightweight …

Cited by 390 · Related articles · All 7 versions

Empirical analysis of the Hessian of over-parametrized neural networks

L Sagun, U Evci, VU Guney, Y Dauphin… - arXiv preprint arXiv …, 2017 - arxiv.org

We study the properties of common loss surfaces through their Hessian matrix. In particular,
in the context of deep learning, we empirically show that the spectrum of the Hessian is …

Cited by 381 · Related articles · All 6 versions

Min-entropy latent model for weakly supervised object detection

F Wan, P Wei, J Jiao, Z Han… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com

Weakly supervised object detection is a challenging task when provided with image
category supervision but required to learn, at the same time, object locations and object …

Cited by 277 · Related articles · All 13 versions

Understanding the impact of entropy on policy optimization

Z Ahmed, N Le Roux, M Norouzi… - … on machine learning, 2019 - proceedings.mlr.press

Entropy regularization is commonly used to improve policy optimization in reinforcement
learning. It is believed to help with exploration by encouraging the selection of more …

Cited by 243 · Related articles · All 11 versions

Maximum mean discrepancy gradient flow

M Arbel, A Korba, A Salim… - Advances in Neural …, 2019 - proceedings.neurips.cc

We construct a Wasserstein gradient flow of the maximum mean discrepancy (MMD) and
study its convergence properties. The MMD is an integral probability metric defined for a …

Cited by 155 · Related articles · All 15 versions

Multi-level residual networks from dynamical systems view

B Chang, L Meng, E Haber, F Tung… - arXiv preprint arXiv …, 2017 - arxiv.org

Deep residual networks (ResNets) and their variants are widely used in many computer
vision applications and natural language processing tasks. However, the theoretical …

Cited by 184 · Related articles · All 3 versions
