Automated fact-checking to support professional practices: systematic literature review and meta-analysis

L Dierickx, CG Lindén, AL Opdahl - International Journal of …, 2023 - ojs3.ijoc.org
Fact-checking is a time-consuming process that automation can potentially make more
efficient. This study provides a comprehensive, multidisciplinary state of the art that …

The effects of data quality on machine learning performance

L Budach, M Feuerpfeil, N Ihde, A Nathansen… - arXiv preprint arXiv …, 2022 - arxiv.org
Modern artificial intelligence (AI) applications require large quantities of training and test
data. This need creates critical challenges not only concerning the availability of such data …

An artificial intelligence life cycle: From conception to production

D De Silva, D Alahakoon - Patterns, 2022 - cell.com
This paper presents the "CDAC AI life cycle," a comprehensive life cycle for the design,
development, and deployment of artificial intelligence (AI) systems and solutions. It …

Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data

N Seedat, J Crabbé, I Bica… - Advances in Neural …, 2022 - proceedings.neurips.cc
High model performance, on average, can hide that models may systematically
underperform on subgroups of the data. We consider the tabular setting, which surfaces the …

TRIAGE: Characterizing and auditing training data for improved regression

N Seedat, J Crabbé, Z Qian… - Advances in Neural …, 2024 - proceedings.neurips.cc
Data quality is crucial for robust machine learning algorithms, with the recent interest in
data-centric AI emphasizing the importance of training data characterization. However, current …

A hybrid machine learning approach for the load prediction in the sustainable transition of district heating networks

M Habib, TO Timoudas, Y Ding, N Nord, S Chen… - Sustainable cities and …, 2023 - Elsevier
Current district heating networks are undergoing a sustainable transition towards the 4th
and 5th generation of district heating networks, characterized by the integration of different …

Reimagining synthetic tabular data generation through data-centric AI: A comprehensive benchmark

L Hansen, N Seedat… - Advances in Neural …, 2023 - proceedings.neurips.cc
Synthetic data serves as an alternative in training machine learning models, particularly
when real-world data is limited or inaccessible. However, ensuring that synthetic data …

Data smells: Categories, causes and consequences, and detection of suspicious data in AI-based systems

H Foidl, M Felderer, R Ramler - … of the 1st International Conference on …, 2022 - dl.acm.org
High data quality is fundamental for today's AI-based systems. However, although data
quality has been an object of research for decades, there is a clear lack of research on …

Data assimilation for urban stormwater and water quality simulations using deep reinforcement learning

M Jeung, J Jang, K Yoon, SS Baek - Journal of Hydrology, 2023 - Elsevier
Hydrological models have been used to understand the transportation of water quantity and
quality in drainage systems, and the stormwater management model (SWMM) is one of the …

Advances in exploratory data analysis, visualisation and quality for data centric AI systems

H Patel, S Guttula, RS Mittal, N Manwani… - Proceedings of the 28th …, 2022 - dl.acm.org
It is widely accepted that data preparation is one of the most time-consuming steps of the
machine learning (ML) lifecycle. It is also one of the most important steps, as the quality of …