Text data augmentation for deep learning

C Shorten, TM Khoshgoftaar, B Furht - Journal of Big Data, 2021 - Springer
Natural Language Processing (NLP) is one of the most captivating applications of
Deep Learning. In this survey, we consider how the Data Augmentation training strategy can …

Model complexity of deep learning: A survey

X Hu, L Chu, J Pei, W Liu, J Bian - Knowledge and Information Systems, 2021 - Springer
Model complexity is a fundamental problem in deep learning. In this paper, we
conduct a systematic overview of the latest studies on model complexity in deep learning …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2024 - proceedings.neurips.cc
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models is not well understood. We empirically study exact memorization in …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Towards understanding sharpness-aware minimization

M Andriushchenko… - … Conference on Machine …, 2022 - proceedings.mlr.press
Sharpness-Aware Minimization (SAM) is a recent training method that relies on
worst-case weight perturbations which significantly improves generalization in various …

Underspecification presents challenges for credibility in modern machine learning

A D'Amour, K Heller, D Moldovan, B Adlam… - Journal of Machine …, 2022 - jmlr.org
Machine learning (ML) systems often exhibit unexpectedly poor behavior when they are
deployed in real-world domains. We identify underspecification in ML pipelines as a key …

The computational limits of deep learning

NC Thompson, K Greenewald, K Lee… - arXiv preprint arXiv …, 2020 - assets.pubpub.org
Deep learning's recent history has been one of achievement: from triumphing over humans
in the game of Go to world-leading performance in image classification, voice recognition …