Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and
reasoning processes. Today's best LLMs clearly possess some reasoning processes. The …
Data curation via joint example selection further accelerates multimodal learning
Data curation is an essential component of large-scale pretraining. In this work, we
demonstrate that jointly selecting batches of data is more effective for learning than selecting …
Large Language Model-guided Document Selection
Large Language Model (LLM) pre-training exhausts an ever-growing compute budget, yet
recent research has demonstrated that careful document selection enables comparable …
Data-Centric AI in the Age of Large Language Models
This position paper proposes a data-centric viewpoint of AI research, focusing on large
language models (LLMs). We start by making the key observation that data is instrumental in …
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
Pretraining data selection has the potential to improve language model pretraining efficiency
by utilizing higher-quality data from massive web data corpora. Current data selection …
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Efficient data selection is crucial to accelerate the pretraining of large language models
(LLMs). While various methods have been proposed to enhance data efficiency, limited …
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
C Zhang, H Zhong, K Zhang, C Chai, R Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Data selection is of great significance in pre-training large language models, given the
variation in quality within the large-scale available training corpora. To achieve this …
Uncertainties of latent representations in computer vision
M Kirchhof - arXiv preprint arXiv:2408.14281, 2024 - arxiv.org
Uncertainty quantification is a key pillar of trustworthy machine learning. It enables safe
reactions under unsafe inputs, like predicting only when the machine learning model detects …
Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs
As large language models (LLMs) are applied to more use cases, creating high-quality, task-
specific datasets for fine-tuning becomes a bottleneck for model improvement. Using high …
Advanced Deep Learning Methods for Chemistry and Material Science
Z Shui - 2024 - search.proquest.com
In chemistry and material science, scientific discovery is usually achieved through a
combination of wet-lab experiments and first-principle computational methods. These …