Data-centric artificial intelligence: A survey

D Zha, ZP Bhat, KH Lai, F Yang, Z Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler
of its great success is the availability of abundant and high-quality data for building machine …

A survey on programmatic weak supervision

J Zhang, CY Hsieh, Y Yu, C Zhang, A Ratner - arXiv preprint arXiv …, 2022 - arxiv.org
Labeling training data has become one of the major roadblocks to using machine learning.
Among various weak supervision paradigms, programmatic weak supervision (PWS) has …

WRENCH: A comprehensive benchmark for weak supervision

J Zhang, Y Yu, Y Li, Y Wang, Y Yang, M Yang… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent Weak Supervision (WS) approaches have had widespread success in easing the
bottleneck of labeling training data for machine learning by synthesizing labels from multiple …

PRBoost: Prompt-based rule discovery and boosting for interactive weakly-supervised learning

R Zhang, Y Yu, P Shetty, L Song, C Zhang - arXiv preprint arXiv …, 2022 - arxiv.org
Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity
on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set …

Explainable AI: Foundations, applications, opportunities for data management research

R Pradhan, A Lahiri, S Galhotra, B Salimi - Proceedings of the 2022 …, 2022 - dl.acm.org
Algorithmic decision-making systems are successfully being adopted in a wide range of
domains for diverse tasks. While the potential benefits of algorithmic decision-making are …

Understanding the influence of news on society decision making: application to economic policy uncertainty

P Trust, A Zahran, R Minghim - Neural Computing and Applications, 2023 - Springer
The abundance of digital documents offers a valuable chance to gain insights into public
opinion, social structure, and dynamics. However, the scale and volume of these digital …

Nemo: Guiding and contextualizing weak supervision for interactive data programming

CY Hsieh, J Zhang, A Ratner - arXiv preprint arXiv:2203.01382, 2022 - arxiv.org
Weak Supervision (WS) techniques allow users to efficiently create large training datasets
by programmatically labeling data with heuristic sources of supervision. While the success of …

Automatic Rule Induction for Interpretable Semi-Supervised Learning

R Pryzant, Z Yang, Y Xu, C Zhu, M Zeng - arXiv preprint arXiv:2205.09067, 2022 - arxiv.org
Semi-supervised learning has shown promise in allowing NLP models to generalize from
small amounts of labeled data. Meanwhile, pretrained transformer models act as black-box …

Self-supervised self-supervision by combining deep learning and probabilistic logic

H Lang, H Poon - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Labeling training examples at scale is a perennial challenge in machine learning.
Self-supervision methods compensate for the lack of direct supervision by leveraging prior …

Can Large Language Models Design Accurate Label Functions?

N Guan, K Chen, N Koudas - arXiv preprint arXiv:2311.00739, 2023 - arxiv.org
Programmatic weak supervision methodologies facilitate the expedited labeling of extensive
datasets through the use of label functions (LFs) that encapsulate heuristic data sources …