Data cleaning: Overview and emerging challenges
Detecting and repairing dirty data is one of the perennial challenges in data analytics, and
failure to do so can result in inaccurate analytics and unreliable decisions. Over the past few …
Trends in cleaning relational data: Consistency and deduplication
Data quality is one of the most important problems in data management, since dirty data
often leads to inaccurate data analytics results and wrong business decisions. Poor data …
A formal approach to finding explanations for database queries
As a consequence of the popularity of big data, many users with a variety of backgrounds
seek to extract high level information from datasets collected from various sources and …
Data quality: From theory to practice
W Fan - ACM SIGMOD Record, 2015 - dl.acm.org
Data quantity and data quality, like two sides of a coin, are equally important to data
management. This paper provides an overview of recent advances in the study of data …
Data x-ray: A diagnostic tool for data errors
A lot of systems and applications are data-driven, and the correctness of their operation
relies heavily on the correctness of their data. While existing data cleaning techniques can …
Data provenance
B Glavic - Foundations and Trends® in Databases, 2021 - nowpublishers.com
Data provenance has evolved from a niche topic to a mainstream area of research in
databases and other research communities. This article gives a comprehensive introduction …
XInsight: explainable data analysis through the lens of causality
In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the
underlying causes of the knowledge acquired by EDA is crucial. However, it remains under …
Qualitative data cleaning
Data quality is one of the most important problems in data management, since dirty data
often leads to inaccurate data analytics results and wrong business decisions. Data cleaning …