Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

A Didolkar, A Goyal, NR Ke, S Guo, M Valko… - arXiv preprint arXiv …, 2024 - arxiv.org
Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and
reasoning processes. Today's best LLMs clearly possess some reasoning processes. The …

Data curation via joint example selection further accelerates multimodal learning

T Evans, N Parthasarathy, H Merzic… - arXiv preprint arXiv …, 2024 - arxiv.org
Data curation is an essential component of large-scale pretraining. In this work, we
demonstrate that jointly selecting batches of data is more effective for learning than selecting …
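
The snippet cuts off before the mechanism, so here is a minimal sketch of what "jointly selecting batches" could look like, assuming a trainable learner and a frozen reference model: each candidate gets a learnability score (learner loss minus reference loss), and the sub-batch is built greedily with a redundancy penalty so picks are scored relative to one another. The greedy penalty, `joint_batch_selection`, and the toy linear models are illustrative assumptions standing in for the paper's actual joint sampler, not the authors' code.

```python
import torch
import torch.nn.functional as F

def example_losses(model, inputs, targets):
    """Per-example cross-entropy loss under a model."""
    return F.cross_entropy(model(inputs), targets, reduction="none")

def joint_batch_selection(learner, reference, inputs, targets, batch_size):
    """Greedily build a sub-batch, scoring examples jointly.

    'Learnability' (learner loss minus reference loss) favours examples
    the learner has not yet mastered but the reference finds easy; the
    similarity penalty discourages near-duplicate picks, which is the
    'joint' aspect this sketch approximates.
    """
    with torch.no_grad():
        scores = (example_losses(learner, inputs, targets)
                  - example_losses(reference, inputs, targets))
        feats = F.normalize(inputs.flatten(1), dim=1)  # crude similarity features

    chosen = []
    for _ in range(batch_size):
        masked = scores.clone()
        if chosen:
            masked[chosen] = float("-inf")        # never re-pick an example
            sim = feats @ feats[chosen].T         # cosine similarity to picks so far
            masked = masked - sim.max(dim=1).values
        chosen.append(int(masked.argmax()))
    return chosen

# Toy usage with linear stand-ins for the learner and reference models:
learner, reference = torch.nn.Linear(16, 4), torch.nn.Linear(16, 4)
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
print(joint_batch_selection(learner, reference, x, y, batch_size=8))
```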

Large Language Model-guided Document Selection

X Kong, T Gunter, R Pang - arXiv preprint arXiv:2406.04638, 2024 - arxiv.org
Large Language Model (LLM) pre-training exhausts an ever-growing compute budget, yet
recent research has demonstrated that careful document selection enables comparable …
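
One plausible reading of "careful document selection" here is an LLM acting as a quality rater over a candidate pool. The sketch below is an assumption-laden illustration: `llm_quality_score`, its prompt, and the keep-threshold are all hypothetical, and the model call is stubbed out with a length heuristic so the script runs as-is.

```python
import textwrap

def llm_quality_score(document: str) -> float:
    """Hypothetical stand-in for a call to an instruction-tuned LLM.
    A real pipeline would send `prompt` to a model endpoint and parse
    the returned rating; here we fake it so the example is runnable."""
    prompt = textwrap.dedent(f"""\
        Rate the following document from 0 to 10 for how useful it would
        be as language-model pretraining data. Reply with a number only.

        {document}""")
    _ = prompt  # would be sent to the LLM in a real pipeline
    return min(10.0, len(document.split()) / 2)  # fake deterministic score

corpus = [
    "buy cheap watches click here click here",
    "The mitochondrion is the organelle responsible for oxidative "
    "phosphorylation, producing most of a cell's ATP supply.",
]

# Keep only documents the (stub) rater scores above a threshold.
kept = [doc for doc in corpus if llm_quality_score(doc) >= 4.0]
print(f"kept {len(kept)} of {len(corpus)} documents")
```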

Data-Centric AI in the Age of Large Language Models

X Xu, Z Wu, R Qiao, A Verma, Y Shu, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
This position paper proposes a data-centric viewpoint of AI research, focusing on large
language models (LLMs). We start by making the key observation that data is instrumental in …

MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models

Z Yu, S Das, C Xiong - arXiv preprint arXiv:2406.06046, 2024 - arxiv.org
Pretraining data selection has the potential to improve language model pretraining efficiency
by utilizing higher-quality data from massive web data corpora. Current data selection …
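
The abstract cuts off before describing the data influence models, so the following is a rough sketch of the general pattern under one assumption: an expensive influence oracle is probed on a small sample and distilled into a cheap regressor that then scores the full pool. The features, closed-form ridge regressor, and `measure_influence` oracle are synthetic stand-ins; a model-aware pipeline would also refresh the probe-and-fit steps periodically as pretraining advances.

```python
import numpy as np

rng = np.random.default_rng(0)

def measure_influence(doc_feats):
    """Oracle stand-in: in an influence-based pipeline this would be the
    change in a held-out reference loss after briefly training on the
    document. Here it is a synthetic function of the features."""
    return doc_feats @ np.array([0.5, -0.2, 0.8, 0.1]) \
        + rng.normal(0, 0.05, len(doc_feats))

# 1) Probe a small sample of the pool with the expensive oracle.
pool = rng.normal(size=(10_000, 4))           # feature vectors for candidate docs
probe_idx = rng.choice(len(pool), 256, replace=False)
probe_X, probe_y = pool[probe_idx], measure_influence(pool[probe_idx])

# 2) Fit a cheap influence model (closed-form ridge regression).
lam = 1e-2
w = np.linalg.solve(probe_X.T @ probe_X + lam * np.eye(4), probe_X.T @ probe_y)

# 3) Score the whole pool cheaply and keep the top-k documents.
pred_influence = pool @ w
top_k = np.argsort(pred_influence)[-1024:]
print("selected", len(top_k), "docs; mean predicted influence:",
      pred_influence[top_k].mean().round(3))
```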

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining

T Bai, L Yang, ZH Wong, J Peng, X Zhuang… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficient data selection is crucial to accelerate the pretraining of large language models
(LLMs). While various methods have been proposed to enhance data efficiency, limited …

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models

C Zhang, H Zhong, K Zhang, C Chai, R Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Data selection is of great significance in pre-training large language models, given the
wide variation in quality across available large-scale training corpora. To achieve this …
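
"Harnessing diversity" suggests spreading selection across the corpus rather than taking a global top-k by quality alone. A simplified cluster-then-select sketch follows; the embeddings, per-document quality scores, and cluster count are synthetic placeholders, and the paper's actual treatment of importance and diversity is likely more involved than this.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
docs = rng.normal(size=(5_000, 32))     # embeddings of candidate documents
quality = rng.random(5_000)             # per-doc quality scores (stand-in)

# Partition the pool so selection is spread across semantic regions.
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(docs)

# Take the highest-scoring documents per cluster instead of a global
# top-k, trading a little raw quality for coverage of the corpus.
per_cluster = 50
selected = np.concatenate([
    np.flatnonzero(labels == c)[np.argsort(quality[labels == c])[-per_cluster:]]
    for c in range(10)
])
print(len(selected), "documents selected across", len(set(labels)), "clusters")
```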

Uncertainties of latent representations in computer vision

M Kirchhof - arXiv preprint arXiv:2408.14281, 2024 - arxiv.org
Uncertainty quantification is a key pillar of trustworthy machine learning. It enables safe
reactions under unsafe inputs, like predicting only when the machine learning model detects …

Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs

YC Chan, G Pu, A Shanker, P Suresh, P Jenks… - arXiv preprint arXiv …, 2024 - arxiv.org
As large language models (LLMs) are applied to more use cases, creating high-quality, task-
specific datasets for fine-tuning becomes a bottleneck for model improvement. Using high …

Advanced Deep Learning Methods for Chemistry and Material Science

Z Shui - 2024 - search.proquest.com
In chemistry and material science, scientific discovery is usually achieved through a
combination of wet-lab experiments and first-principle computational methods. These …