Towards data-centric graph machine learning: Review and outlook

X Zheng, Y Liu, Z Bao, M Fang, X Hu, AWC Liew… - arXiv preprint arXiv …, 2023 - arxiv.org
Data-centric AI, with its primary focus on the collection, management, and utilization of data
to drive AI models and applications, has attracted increasing attention in recent years. In this …

Data cleaning and machine learning: a systematic literature review

PO Côté, A Nikanjam, N Ahmed, D Humeniuk… - Automated Software …, 2024 - Springer
Machine Learning (ML) is integrated into a growing number of systems for various
applications. Because the performance of an ML model is highly dependent on the quality of …

Better entity matching with transformers through ensembles

JF Low, BCM Fung, P Xiong - Knowledge-Based Systems, 2024 - Elsevier
In this paper, we introduce AttendEM, a framework for entity matching (EM), i.e., pairwise
identification of duplicates across databases. Eschewing the prevalent focus on text …

Dual data mapping with fine-tuned large language models and asset administration shells toward interoperable knowledge representation

D Shi, O Meyer, M Oberle, T Bauernhansl - Robotics and Computer …, 2025 - Elsevier
In the context of Industry 4.0, ensuring the compatibility of digital twins (DTs) with existing
software systems in the manufacturing sector presents a significant challenge. The Asset …

Deep active alignment of knowledge graph entities and schemata

J Huang, Z Sun, Q Chen, X Xu, W Ren… - Proceedings of the ACM on …, 2023 - dl.acm.org
Knowledge graphs (KGs) store rich facts about the real world. In this paper, we study KG
alignment, which aims to find alignment between not only entities but also relations and …

A Survey on Data Markets

J Zhang, Y Bi, M Cheng, J Liu, K Ren, Q Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
Data is the new oil of the 21st century. The growing trend of trading data for greater welfare
has led to the emergence of data markets. A data market is any mechanism whereby the …

Adaptive deep learning for entity resolution by risk analysis

Q Chen, Z Chen, Y Nafa, T Duan, W Pan… - Knowledge-Based …, 2023 - Elsevier
The state-of-the-art performance on entity resolution (ER) has been achieved by deep
learning. However, deep models usually need to be trained on large quantities of accurately …

Entity Matching by Pool-Based Active Learning

Y Han, C Li - Electronics, 2024 - mdpi.com
The goal of entity matching is to find the corresponding records representing the same entity
from different data sources. At present, in the mainstream methods, rule-based entity …

EMBA: Entity Matching using Multi-Task Learning of BERT with Attention-over-Attention.

J Zhang, H Sun, JC Ho - EDBT, 2024 - openproceedings.org
Entity matching is a crucial data integration process as it identifies whether two records refer
to the same real-world object. Since shared identifiers are not always available, learning to …

LLM+KG@VLDB'24 Workshop Summary

A Khan, T Wu, X Chen - arXiv preprint arXiv:2410.01978, 2024 - arxiv.org
The unification of large language models (LLMs) and knowledge graphs (KGs) has emerged
as a hot topic. At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in …