Large language models in medicine

AJ Thirunavukarasu, DSJ Ting, K Elangovan… - Nature Medicine, 2023 - nature.com
Large language models (LLMs) can respond to free-text queries without being specifically
trained in the task in question, causing excitement and concern about their use in healthcare …

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2024 - proceedings.neurips.cc
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only

G Penedo, Q Malartic, D Hesslow… - Advances in …, 2023 - proceedings.neurips.cc
Large language models are commonly trained on a mixture of filtered web data and
curated "high-quality" corpora, such as social media conversations, books, or technical …

A pretrainer's guide to training data: Measuring the effects of data age, domain coverage, quality, & toxicity

S Longpre, G Yauney, E Reif, K Lee, A Roberts… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretraining is the preliminary and fundamental step in developing capable language models
(LMs). Despite this, pretraining data design is critically under-documented and often guided …

Red teaming ChatGPT via jailbreaking: Bias, robustness, reliability and toxicity

TY Zhuo, Y Huang, C Chen, Z Xing - arXiv preprint arXiv:2301.12867, 2023 - arxiv.org
Recent breakthroughs in natural language processing (NLP) have permitted the synthesis
and comprehension of coherent text in an open-ended way, thereby translating the …

Self-consuming generative models go mad

S Alemohammad, J Casco-Rodriguez, L Luzi… - arXiv preprint arXiv …, 2023 - arxiv.org
Seismic advances in generative AI algorithms for imagery, text, and other data types have led
to the temptation to use synthetic data to train next-generation models. Repeating this …

To repeat or not to repeat: Insights from scaling LLMs under token-crisis

F Xue, Y Fu, W Zhou, Z Zheng… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent research has highlighted the importance of dataset size in scaling language models.
However, large language models (LLMs) are notoriously token-hungry during pre-training …

When foundation model meets federated learning: Motivations, challenges, and future directions

W Zhuang, C Chen, L Lyu - arXiv preprint arXiv:2306.15546, 2023 - arxiv.org
The intersection of the Foundation Model (FM) and Federated Learning (FL) provides mutual
benefits, presents a unique opportunity to unlock new possibilities in AI research, and …