Challenges in deploying machine learning: a survey of case studies
In recent years, machine learning has transitioned from a field of academic research interest
to a field capable of solving real-world business problems. However, the deployment of …
Knowledge graph quality management: a comprehensive survey
B Xue, L Zou - IEEE Transactions on Knowledge and Data …, 2022 - ieeexplore.ieee.org
As a powerful expression of human knowledge in a structural form, knowledge graph (KG)
has drawn great attention from both the academia and the industry and a large number of …
Holistic evaluation of language models
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …
Can foundation models wrangle your data?
Foundation Models (FMs) are models trained on large corpora of data that, at very large
scale, can generalize to new tasks without any task-specific finetuning. As these models …
A benchmark for data imputation methods
S Jäger, A Allhorn, F Bießmann - Frontiers in Big Data, 2021 - frontiersin.org
With the increasing importance and complexity of data pipelines, data quality became one of
the key challenges in modern software applications. The importance of data quality has …
Holoclean: Holistic data repairs with probabilistic inference
We introduce HoloClean, a framework for holistic data repairing driven by probabilistic
inference. HoloClean unifies existing qualitative data repairing approaches, which rely on …
Creating embeddings of heterogeneous relational datasets for data integration tasks
R Cappuzzo, P Papotti… - Proceedings of the 2020 …, 2020 - dl.acm.org
Deep learning based techniques have been recently used with promising results for data
integration problems. Some methods directly use pre-trained embeddings that were trained …
Holodetect: Few-shot learning for error detection
We introduce a few-shot learning framework for error detection. We show that data
augmentation (a form of weak supervision) is key to training high-quality, ML-based error …
Data Integration: The Current Status and the Way Forward.
M Stonebraker, IF Ilyas - IEEE Data Eng. Bull., 2018 - cs.uwaterloo.ca
We discuss scalable data integration challenges in the enterprise inspired by our
experience at Tamr. We use multiple real customer examples to highlight the technical …