On the importance of pre-training data volume for compact language models
V Micheli, M d'Hoffschmidt, F Fleuret - arXiv preprint arXiv:2010.03813, 2020 - arxiv.org
Recent advances in language modeling have led to computationally intensive and resource-
demanding state-of-the-art models. In an effort towards sustainable practices, we study the …
BayLing: Bridging cross-lingual alignment and instruction following through interactive translation for large language models
Large language models (LLMs) have demonstrated remarkable prowess in language
understanding and generation. Advancing from foundation LLMs to instruction-following …
h2oGPT: Democratizing large language models
A Candel, J McKinney, P Singer, P Pfeiffer… - arXiv preprint arXiv …, 2023 - arxiv.org
Foundation Large Language Models (LLMs) such as GPT-4 represent a revolution in AI due
to their real-world applications through natural language processing. However, they also …
Knowledge fusion of large language models
While training large language models (LLMs) from scratch can generate models with distinct
functionalities and strengths, it comes at significant costs and may result in redundant …
Large language models suffer from their own output: An analysis of the self-consuming training loop
Large language models (LLM) have become state of the art in many benchmarks and
conversational LLM applications like ChatGPT are now widely used by the public. Those …
Typhoon: Thai large language models
K Pipatanakul, P Jirabovonvisut, P Manakul… - arXiv preprint arXiv …, 2023 - arxiv.org
Typhoon is a series of Thai large language models (LLMs) developed specifically for the
Thai language. This technical report presents challenges and insights in developing Thai …
Are larger pretrained language models uniformly better? Comparing performance at the instance level
Larger language models have higher accuracy on average, but are they better on every
single instance (datapoint)? Some work suggests larger models have higher out-of …
HyperTuning: Toward adapting large language models without back-propagation
Fine-tuning large language models for different tasks can be costly and inefficient, and even
methods that reduce the number of tuned parameters still require full gradient-based …
Can we trust the evaluation on ChatGPT?
ChatGPT, the first large language model (LLM) with mass adoption, has demonstrated
remarkable performance in numerous natural language tasks. Despite its evident …
Dolma: An open corpus of three trillion tokens for language model pretraining research
Language models have become a critical technology to tackling a wide range of natural
language processing tasks, yet many details about how the best-performing language …