Online batch selection for faster training of neural networks

X Wang, Y Chen, W Zhu - IEEE transactions on pattern analysis …, 2021 - ieeexplore.ieee.org

Curriculum learning (CL) is a training strategy that trains a machine learning model from
easier data to harder data, which imitates the meaningful learning order in human curricula …

Cited by 601 - Related articles - All 6 versions

Recent advances in convolutional neural networks

J Gu, Z Wang, J Kuen, L Ma, A Shahroudy, B Shuai… - Pattern recognition, 2018 - Elsevier

In the last few years, deep learning has led to very good performance on a variety of
problems, such as visual recognition, speech recognition and natural language processing …

Cited by 6446 - Related articles - All 7 versions

Data selection for language models via importance resampling

SM Xie, S Santurkar, T Ma… - Advances in Neural …, 2023 - proceedings.neurips.cc

Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and
domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting …

Cited by 91 - Related articles - All 5 versions

Meta-learning in neural networks: A survey

T Hospedales, A Antoniou, P Micaelli… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent
years. Contrary to conventional approaches to AI where tasks are solved from scratch using …

Cited by 2224 - Related articles - All 10 versions

Dataset cartography: Mapping and diagnosing datasets with training dynamics

S Swayamdipta, R Schwartz, N Lourie, Y Wang… - arXiv preprint arXiv …, 2020 - arxiv.org

Large datasets have become commonplace in NLP research. However, the increased
emphasis on data quantity has made it challenging to assess the quality of data. We …

Cited by 354 - Related articles - All 3 versions

Distributed prioritized experience replay

D Horgan, J Quan, D Budden, G Barth-Maron… - arXiv preprint arXiv …, 2018 - arxiv.org

We propose a distributed architecture for deep reinforcement learning at scale, that enables
agents to learn effectively from orders of magnitude more data than previously possible. The …

Cited by 899 - Related articles - All 4 versions

Prioritized training on points that are learnable, worth learning, and not yet learnt

S Mindermann, JM Brauner… - International …, 2022 - proceedings.mlr.press

Training on web-scale data can take months. But much computation and time is wasted on
redundant and noisy points that are already learnt or not learnable. To accelerate training …

Cited by 106 - Related articles - All 9 versions

Not all samples are created equal: Deep learning with importance sampling

A Katharopoulos, F Fleuret - International conference on …, 2018 - proceedings.mlr.press

Deep Neural Network training spends most of the computation on examples that
are properly handled, and could be ignored. We propose to mitigate this phenomenon with a …

Cited by 552 - Related articles - All 12 versions

Training region-based object detectors with online hard example mining

A Shrivastava, A Gupta, R Girshick - Proceedings of the IEEE …, 2016 - cv-foundation.org

The field of object detection has made significant advances riding on the wave of
region-based ConvNets, but their training procedure still includes many heuristics and …

Cited by 2936 - Related articles - All 9 versions

A-fast-rcnn: Hard positive generation via adversary for object detection

X Wang, A Shrivastava, A Gupta - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com

How do we learn an object detector that is invariant to occlusions and deformations? Our
current solution is to use a data-driven strategy--collect large-scale datasets which have …

Cited by 759 - Related articles - All 9 versions