Do datasets have politics? Disciplinary values in computer vision dataset development

MK Scheuerman, A Hanna, E Denton - … of the ACM on Human-Computer …, 2021 - dl.acm.org
Data is a crucial component of machine learning. The field is reliant on data to train, validate,
and test models. With increased technical capabilities, machine learning research has …

Simulation intelligence: Towards a new generation of scientific methods

A Lavin, D Krakauer, H Zenil, J Gottschlich… - arXiv preprint arXiv …, 2021 - arxiv.org
The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific
computing, where a motif is an algorithmic method that captures a pattern of computation …

The data-production dispositif

M Miceli, J Posada - Proceedings of the ACM on human-computer …, 2022 - dl.acm.org
Machine learning (ML) depends on data to train and verify models. Very often, organizations
outsource processes related to data work (i.e., generating and annotating data and …

Forgetting practices in the data sciences

M Muller, A Strohmayer - Proceedings of the 2022 CHI Conference on …, 2022 - dl.acm.org
HCI engages with data science through many topics and themes. Researchers have
addressed biased dataset problems, arguing that bad data can cause innocent software to …

Reward reports for reinforcement learning

TK Gilbert, N Lambert, S Dean, T Zick… - Proceedings of the …, 2023 - dl.acm.org
Building systems that are good for society in the face of complex societal effects requires a
dynamic approach. Recent approaches to machine learning (ML) documentation have …

How to data in datathons

C Mougan, R Plant, C Teng, M Bazzi… - Advances in …, 2024 - proceedings.neurips.cc
The rise of datathons, also known as data or data science hackathons, has provided a
platform to collaborate, learn, and innovate quickly. Despite their significant potential …

Ethical considerations for responsible data curation

J Andrews, D Zhao, W Thong… - Advances in …, 2024 - proceedings.neurips.cc
Human-centric computer vision (HCCV) data curation practices often neglect privacy and
bias concerns, leading to dataset retractions and unfair models. HCCV datasets constructed …

A Data-centric AI Framework for Automating Exploratory Data Analysis and Data Quality Tasks

H Patel, S Guttula, N Gupta, S Hans… - ACM Journal of Data and …, 2023 - dl.acm.org
Democratisation of machine learning (ML) has been an important theme in the research
community for the last several years with notable progress made by the model-building …

From principles to practice: An accountability metrics catalogue for managing AI risks

B Xia, Q Lu, L Zhu, SU Lee, Y Liu, Z Xing - arXiv preprint arXiv:2311.13158, 2023 - arxiv.org
Artificial Intelligence (AI), particularly through the advent of large-scale generative AI (GenAI)
models such as Large Language Models (LLMs), has become a transformative element in …

The landscape of data and AI documentation approaches in the European policy context

M Micheli, I Hupont, B Delipetrev… - Ethics and Information …, 2023 - Springer
Nowadays, Artificial Intelligence (AI) is present in all sectors of the economy.
Consequently, both data (the raw material used to build AI systems) and AI have an …