Confident learning: Estimating uncertainty in dataset labels

Advances, challenges and opportunities in creating data for trustworthy AI

W Liang, GA Tadesse, D Ho, L Fei-Fei… - Nature Machine …, 2022 - nature.com

As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …

Cited by 274 · Related articles · All 3 versions

A review of uncertainty quantification in deep learning: Techniques, applications and challenges

M Abdar, F Pourpanah, S Hussain, D Rezazadegan… - Information fusion, 2021 - Elsevier

Uncertainty quantification (UQ) methods play a pivotal role in reducing the impact of
uncertainties during both optimization and decision making processes. They have been …

Cited by 2081 · Related articles · All 12 versions

DiffuMask: Synthesizing images with pixel-level annotations for semantic segmentation using diffusion models

W Wu, Y Zhao, MZ Shou, H Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com

Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In
contrast, synthetic data can be freely available using a generative model (e.g., DALL-E …

Cited by 107 · Related articles · All 5 versions

Pervasive label errors in test sets destabilize machine learning benchmarks

CG Northcutt, A Athalye, J Mueller - arXiv preprint arXiv:2103.14749, 2021 - arxiv.org

We identify label errors in the test sets of 10 of the most commonly-used computer vision,
natural language, and audio datasets, and subsequently study the potential for these label …

Cited by 573 · Related articles · All 9 versions

Challenges in deploying machine learning: a survey of case studies

A Paleyes, RG Urma, ND Lawrence - ACM computing surveys, 2022 - dl.acm.org

In recent years, machine learning has transitioned from a field of academic research interest
to a field capable of solving real-world business problems. However, the deployment of …

Cited by 456 · Related articles · All 6 versions

FSD50K: an open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific,
with the exception of AudioSet, based on over 2M tracks from YouTube videos and …

Cited by 448 · Related articles · All 5 versions

DataPerf: Benchmarks for data-centric AI development

M Mazumder, C Banbury, X Yao… - Advances in …, 2024 - proceedings.neurips.cc

Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …

Cited by 110 · Related articles · All 6 versions

Dos and don'ts of machine learning in computer security

D Arp, E Quiring, F Pendlebury, A Warnecke… - 31st USENIX Security …, 2022 - usenix.org

With the growing processing power of computing systems and the increasing availability of
massive datasets, machine learning algorithms have led to major breakthroughs in many …

Cited by 371 · Related articles · All 30 versions

Learning from disagreement: A survey

AN Uma, T Fornaciari, D Hovy, S Paun, B Plank… - Journal of Artificial …, 2021 - jair.org

Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer
evidence that humans disagree, from objective tasks such as part-of-speech tagging to more …

Cited by 158 · Related articles · All 12 versions

Are we done with ImageNet?

L Beyer, OJ Hénaff, A Kolesnikov, X Zhai… - arXiv preprint arXiv …, 2020 - arxiv.org

Yes, and no. We ask whether recent progress on the ImageNet classification benchmark
continues to represent meaningful generalization, or whether the community has started to …

Cited by 383 · Related articles · All 2 versions