Profiling relational data: a survey

Z Abedjan, L Golab, F Naumann - The VLDB Journal, 2015 - Springer
Profiling data to determine metadata about a given dataset is an important and frequent
activity of any IT professional and researcher and is necessary for various use-cases. It …

Data profiling: A tutorial

Z Abedjan, L Golab, F Naumann - Proceedings of the 2017 ACM …, 2017 - dl.acm.org
is to understand the dataset at hand and its metadata. The process of metadata discovery is
known as data profiling. Profiling activities range from ad-hoc approaches, such as eye …

Data profiling revisited

F Naumann - ACM SIGMOD Record, 2014 - dl.acm.org
Data profiling comprises a broad range of methods to efficiently analyze a given data set. In
a typical scenario, which mirrors the capabilities of commercial data profiling tools, tables of …

Scalable discovery of unique column combinations

A Heise, JA Quiané-Ruiz, Z Abedjan… - Proceedings of the …, 2013 - dl.acm.org
The discovery of all unique (and non-unique) column combinations in a given dataset is at
the core of any data profiling effort. The results are useful for a large number of areas of data …

Discovering similarity inclusion dependencies

Y Kaminsky, EHM Pena, F Naumann - … of the ACM on Management of …, 2023 - dl.acm.org
Inclusion dependencies (INDs) are a well-known type of data dependency, specifying that
the values of one column are contained in those of another column. INDs can be used for …

Silkmoth: An efficient method for finding related sets with maximum matching constraints

D Deng, A Kim, S Madden, M Stonebraker - arXiv preprint arXiv …, 2017 - arxiv.org
Determining if two sets are related, that is, if they have similar values or if one set contains
the other, is an important problem with many applications in data cleaning, data integration …

Discovering conditional matching rules

Y Wang, S Song, L Chen, JX Yu, H Cheng - ACM Transactions on …, 2017 - dl.acm.org
Matching dependencies (MDs) have recently been proposed to make data dependencies
tolerant to various information representations, and found useful in data quality applications …

Extending inclusion dependencies with conditions

S Ma, W Fan, L Bravo - Theoretical Computer Science, 2014 - Elsevier
This paper introduces a class of conditional inclusion dependencies (CINDs), which extends
inclusion dependencies (INDs) by enforcing patterns of semantically related data values. We …

Inclusion dependencies reloaded

H Köhler, S Link - Proceedings of the 24th ACM International on …, 2015 - dl.acm.org
Inclusion dependencies form one of the most fundamental classes of integrity constraints.
Their importance in classical data management is reinforced by modern applications such …

Generalization of typed include dependencies with null values in databases

SV Zykin - information systems, 2023 - elibrary.ru
MSC2020: 68P15 Received July 7, 2023 Research article After revision August 1, 2023 Full
text in Russian Accepted August 2, 2023 The paper discusses a new type of dependency in …