Generalized supervised meta-blocking
Entity Resolution is a core data integration task that relies on Blocking to scale to large
datasets. Schema-agnostic blocking achieves very high recall, requires no domain …
Towards building live open scientific knowledge graphs
Due to the large number and heterogeneity of data sources, it becomes increasingly difficult
to follow the research output and the scientific discourse. For example, a publication listed …
Towards Scalable Generation of Realistic Test Data for Duplicate Detection
Due to the increasing volume, volatility, and diversity of data in virtually all areas of our lives,
the ability to detect duplicates in potentially linked data sources is more important than ever …
Leveraging Machine Learning for Effective Data Management
S Sellami - Transactions on Large-Scale Data-and Knowledge …, 2024 - Springer
The exponential growth of heterogeneous data from diverse sources, such as social media,
IoT sensors, and transactional databases, poses significant challenges for effective …
Similarity-driven Schema Transformation for Test Data Generation
A flexible and versed generation of test data is an important aspect in
benchmarking algorithms for data integration. This includes the generation of …