Profiling relational data: a survey

Z Abedjan, L Golab, F Naumann - The VLDB Journal, 2015 - Springer
Profiling data to determine metadata about a given dataset is an important and frequent
activity of any IT professional and researcher and is necessary for various use-cases. It …

Data profiling revisited

F Naumann - ACM SIGMOD Record, 2014 - dl.acm.org
Data profiling comprises a broad range of methods to efficiently analyze a given data set. In
a typical scenario, which mirrors the capabilities of commercial data profiling tools, tables of …

[Book] Data profiling

Z Abedjan, L Golab, F Naumann, T Papenbrock - 2019 - Springer
Data profiling refers to the activity of collecting data about data, i.e., metadata. Most IT
professionals and researchers who work with data have engaged in data profiling, at least …

Scalable discovery of unique column combinations

A Heise, JA Quiané-Ruiz, Z Abedjan… - Proceedings of the …, 2013 - dl.acm.org
The discovery of all unique (and non-unique) column combinations in a given dataset is at
the core of any data profiling effort. The results are useful for a large number of areas of data …

Automatic discovery of data-centric and artifact-centric processes

EHJ Nooijen, BF van Dongen, D Fahland - … Management Workshops: BPM …, 2013 - Springer
Process discovery is a technique that allows for automatically discovering a process model
from recorded executions of a process as it happens in reality. This technique has …

Protecting data integrity of web applications with database constraints inferred from application code

H Huang, B Shen, L Zhong, Y Zhou - Proceedings of the 28th ACM …, 2023 - dl.acm.org
Database-backed web applications persist a large amount of production data and have high
requirements for integrity. To protect data integrity against application code bugs and …

DFD: Efficient functional dependency discovery

Z Abedjan, P Schulze, F Naumann - Proceedings of the 23rd ACM …, 2014 - dl.acm.org
The discovery of unknown functional dependencies in a dataset is of great importance for
database redesign, anomaly detection and data cleansing applications. However, as the …

Profiling and mining RDF data with ProLOD++

Z Abedjan, T Grütze, A Jentzsch… - 2014 IEEE 30th …, 2014 - ieeexplore.ieee.org
Before reaping the benefits of open data to add value to an organization's internal data, such
new, external datasets must be analyzed and understood already at the basic level of data …

Hitting set enumeration with partial information for unique column combination discovery

J Birnick, T Bläsius, T Friedrich, F Naumann… - Proceedings of the …, 2020 - dl.acm.org
Unique column combinations (UCCs) are a fundamental concept in relational databases.
They identify entities in the data and support various data management activities. Still, UCCs …

Sampling for big data profiling: A survey

Z Liu, A Zhang - IEEE Access, 2020 - ieeexplore.ieee.org
Due to the development of internet technology and computer science, data is exploding at
an exponential rate. Big data brings us new opportunities and challenges. On the one hand …