Data-centric artificial intelligence: A survey
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler
of its great success is the availability of abundant and high-quality data for building machine …
A survey on programmatic weak supervision
Labeling training data has become one of the major roadblocks to using machine learning.
Among various weak supervision paradigms, programmatic weak supervision (PWS) has …
WRENCH: A comprehensive benchmark for weak supervision
Recent Weak Supervision (WS) approaches have had widespread success in easing the
bottleneck of labeling training data for machine learning by synthesizing labels from multiple …
PRBoost: Prompt-based rule discovery and boosting for interactive weakly-supervised learning
Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity
on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set …
Explainable AI: Foundations, applications, opportunities for data management research
Algorithmic decision-making systems are successfully being adopted in a wide range of
domains for diverse tasks. While the potential benefits of algorithmic decision-making are …
Understanding the influence of news on society decision making: application to economic policy uncertainty
The abundance of digital documents offers a valuable chance to gain insights into public
opinion, social structure, and dynamics. However, the scale and volume of these digital …
Nemo: Guiding and contextualizing weak supervision for interactive data programming
Weak Supervision (WS) techniques allow users to efficiently create large training datasets
by programmatically labeling data with heuristic sources of supervision. While the success of …
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Semi-supervised learning has shown promise in allowing NLP models to generalize from
small amounts of labeled data. Meanwhile, pretrained transformer models act as black-box …
Self-supervised self-supervision by combining deep learning and probabilistic logic
Labeling training examples at scale is a perennial challenge in machine learning.
Self-supervision methods compensate for the lack of direct supervision by leveraging prior …
Can Large Language Models Design Accurate Label Functions?
Programmatic weak supervision methodologies facilitate the expedited labeling of extensive
datasets through the use of label functions (LFs) that encapsulate heuristic data sources …