On LLMs-driven synthetic data generation, curation, and evaluation: A survey
Within the evolving landscape of deep learning, the dilemma of data quantity and quality has
been a long-standing problem. The recent advent of Large Language Models (LLMs) offers …
A survey on programmatic weak supervision
Labeling training data has become one of the major roadblocks to using machine learning.
Among various weak supervision paradigms, programmatic weak supervision (PWS) has …
CoAnnotating: Uncertainty-guided work allocation between human and large language models for data annotation
Annotated data plays a critical role in Natural Language Processing (NLP) in training
models and evaluating their performance. Given recent developments in Large Language …
Language models in the loop: Incorporating prompting into weak supervision
We propose a new strategy for applying large pre-trained language models to novel tasks
when labeled training data is limited. Rather than apply the model in a typical zero-shot or …
Understanding programmatic weak supervision via source-aware influence function
Programmatic Weak Supervision (PWS) aggregates the source votes of multiple
weak supervision sources into probabilistic training labels, which are in turn used to train an …
Losses over labels: Weakly supervised learning via direct loss construction
Owing to the prohibitive costs of generating large amounts of labeled data, programmatic
weak supervision is a growing paradigm within machine learning. In this setting, users …
Leveraging instance features for label aggregation in programmatic weak supervision
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to
synthesize training labels efficiently. The core component of PWS is the label model, which …
Robust weak supervision with variational auto-encoders
Recent advances in weak supervision (WS) techniques help mitigate the enormous cost
and effort of human data annotation for supervised machine learning by automating it using …
How many validation labels do you need? Exploring the design space of label-efficient model ranking
The paper introduces LEMR, a framework that reduces annotation costs for model selection
tasks. Our approach leverages ensemble methods to generate pseudo-labels, employs …
Cross-task Knowledge Transfer for Extremely Weakly Supervised Text Classification
Text classification with extremely weak supervision (EWS) imposes stricter supervision
constraints compared to regular weakly supervised classification. Absolutely no labeled …