Text data augmentation for deep learning
Natural Language Processing (NLP) is one of the most captivating applications of Deep Learning. In this survey, we consider how the Data Augmentation training strategy can …
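To make the data-augmentation idea the snippet refers to concrete, here is a minimal sketch of two classic token-level text augmentations (random swap and random deletion, in the spirit of "easy data augmentation" methods); the probabilities and example sentence are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch of two classic token-level text augmentations; the
# probabilities and example sentence are illustrative assumptions.
import random

def random_swap(tokens, n_swaps=1):
    """Swap two randomly chosen token positions, n_swaps times."""
    tokens = tokens[:]
    for _ in range(n_swaps):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]

sentence = "data augmentation enlarges the training set without new labels".split()
print(" ".join(random_swap(sentence)))
print(" ".join(random_deletion(sentence)))
```

Both operations produce label-preserving variants of a sentence, which is what lets them expand a training set without new annotation.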
Model complexity of deep learning: A survey
Model complexity is a fundamental problem in deep learning. In this paper, we conduct a systematic overview of the latest studies on model complexity in deep learning …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Memorization without overfitting: Analyzing the training dynamics of large language models
K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very large language models are not well understood. We empirically study exact memorization in …
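As a hedged illustration of what an "exact memorization" measure can look like, the sketch below counts how often a model's greedy next-token prediction reproduces a training sequence. The toy bigram model and this particular definition are assumptions for illustration, not necessarily the paper's exact setup.

```python
# Sketch of an exact-memorization measure: the fraction of continuation
# tokens a model predicts exactly under greedy decoding. BigramModel is
# a toy stand-in; the metric definition is an illustrative assumption.
from collections import Counter, defaultdict

class BigramModel:
    """Toy model: predicts the most frequent next token seen in training."""
    def __init__(self, corpus):
        self.table = defaultdict(Counter)
        for seq in corpus:
            for a, b in zip(seq, seq[1:]):
                self.table[a][b] += 1

    def argmax_next(self, prefix):
        counts = self.table.get(prefix[-1])
        return counts.most_common(1)[0][0] if counts else 0

def exact_memorization(model, token_ids, prefix_len=1):
    """Fraction of tokens after the prefix matched by greedy prediction."""
    hits = sum(
        model.argmax_next(token_ids[:t]) == token_ids[t]
        for t in range(prefix_len, len(token_ids))
    )
    return hits / max(len(token_ids) - prefix_len, 1)

train = [[1, 2, 3, 4, 2, 3, 5]]
print(exact_memorization(BigramModel(train), train[0]))  # ~0.83 here
```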
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Towards understanding sharpness-aware minimization
M Andriushchenko… - … Conference on Machine …, 2022 - proceedings.mlr.press
Sharpness-Aware Minimization (SAM) is a recent training method that relies on worst-case weight perturbations and significantly improves generalization in various …
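To make the "worst-case weight perturbation" idea concrete, here is a minimal sketch of one SAM step on a toy quadratic loss: an ascent step to the approximate worst-case point within an L2 ball of radius rho, then a descent step using the gradient evaluated there. The loss, rho, and learning rate are illustrative assumptions.

```python
# Minimal sketch of one Sharpness-Aware Minimization (SAM) step on a toy
# quadratic loss; rho, lr, and the loss itself are illustrative assumptions.
import numpy as np

def grad(w):
    # Gradient of the toy loss L(w) = 0.5 * ||w||^2.
    return w

def sam_step(w, rho=0.05, lr=0.1):
    # Ascent: approximate worst-case perturbation within an L2 ball of radius rho.
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent: apply the gradient evaluated at the perturbed point w + eps.
    return w - lr * grad(w + eps)

w = np.array([1.0, -2.0])
for _ in range(50):
    w = sam_step(w)
print(w)  # settles near the minimum at the origin
```

The two-step structure, ascend within a small neighborhood and then descend, is the core of the method; in practice both gradients are typically computed on the same minibatch.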
Underspecification presents challenges for credibility in modern machine learning
Machine learning (ML) systems often exhibit unexpectedly poor behavior when they are
deployed in real-world domains. We identify underspecification in ML pipelines as a key …
The computational limits of deep learning
Deep learning's recent history has been one of achievement: from triumphing over humans
in the game of Go to world-leading performance in image classification, voice recognition …