Scientific discovery in the age of artificial intelligence
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment
and accelerate research, helping scientists to generate hypotheses, design experiments …
Too large; data reduction for vision-language pre-training
This paper examines the problems of severe image-text misalignment and high redundancy
in the widely-used large-scale Vision-Language Pre-Training (VLP) datasets. To address …
Optimizing data collection for machine learning
Modern deep learning systems require huge data sets to achieve impressive performance,
but there is little guidance on how much or what kind of data to collect. Over-collecting data …
Performance scaling via optimal transport: Enabling data selection from partially revealed sources
Traditionally, data selection has been studied in settings where all samples from prospective
sources are fully revealed to a machine learning developer. However, in practical data …
Delegated classification
E Saig, I Talgam-Cohen… - Advances in Neural …, 2024 - proceedings.neurips.cc
When machine learning is outsourced to a rational agent, conflicts of interest might arise and
severely impact predictive performance. In this work, we propose a theoretical framework for …
Playground v2.5: Three insights towards enhancing aesthetic quality in text-to-image generation
D Li, A Kamko, E Akhgari, A Sabet, L Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we share three insights for achieving state-of-the-art aesthetic quality in text-to-
image generative models. We focus on three critical aspects for model improvement …
Full or Weak annotations? An adaptive strategy for budget-constrained annotation campaigns
JG Tejero, MS Zinkernagel, S Wolf… - Proceedings of the …, 2023 - openaccess.thecvf.com
Annotating new datasets for machine learning tasks is tedious, time-consuming, and costly.
For segmentation applications, the burden is particularly high as manual delineations of …
A meta-learning approach to predicting performance and data requirements
We propose an approach to estimate the number of samples required for a model to reach a
target performance. We find that the power law, the de facto principle to estimate model …
Machine learning-based label quality assurance for object detection projects in requirements engineering
N Pičuljan, Ž Car - Applied Sciences, 2023 - mdpi.com
Featured Application: Our machine learning-based label quality assurance demo showcases
the potential of our approach to improve object detection projects within the data …
Low fidelity data driven machine learning based optimisation method for box-wing configuration
M Hasan, A Khandoker, G Gessl, MA Hamid… - Aerospace Science and …, 2024 - Elsevier
Wing design optimization traditionally involves computationally expensive high-fidelity
simulations, limiting the exploration of design spaces. In this study, we propose a …