Tempura: Query analysis with structural templates

Polyjuice: Generating counterfactuals for explaining, evaluating, and improving models

T Wu, MT Ribeiro, J Heer, DS Weld - arXiv preprint arXiv:2101.00288, 2021 - arxiv.org

While counterfactual examples are useful for analysis and training of NLP models, current
generation methods either rely on manual labor to create very few counterfactuals, or only …

Cited by 255 · Related articles · All 11 versions

[PDF] nsf.gov

Patat: Human-AI collaborative qualitative coding with explainable interactive rule synthesis

SA Gebreegziabher, Z Zhang, X Tang, Y Meng… - Proceedings of the …, 2023 - dl.acm.org

Over the years, the task of AI-assisted data annotation has seen remarkable advancements.
However, a specific type of annotation task, the qualitative coding performed during thematic …

Cited by 57 · Related articles · All 7 versions

[PDF] arxiv.org

MultiCoNER: A large-scale multilingual dataset for complex named entity recognition

S Malmasi, A Fang, B Fetahu, S Kar… - arXiv preprint arXiv …, 2022 - arxiv.org

We present MultiCoNER, a large multilingual dataset for Named Entity Recognition that
covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as …

Cited by 103 · Related articles · All 4 versions

[PDF] aclanthology.org

GEMNET: Effective gated gazetteer representations for recognizing complex entities in low-context input

T Meng, A Fang, O Rokhlenko… - Proceedings of the 2021 …, 2021 - aclanthology.org

Named Entity Recognition (NER) remains difficult in real-world settings; current
challenges include short texts (low context), emerging entities, and complex entities (e.g. …

Cited by 71 · Related articles · All 3 versions

[PDF] acm.org

Scattershot: Interactive in-context example curation for text transformation

S Wu, H Shen, DS Weld, J Heer… - Proceedings of the 28th …, 2023 - dl.acm.org

The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an
LLM to their specific tasks with a small number of examples. However, users tend to include …

Cited by 24 · Related articles · All 7 versions

[PDF] arxiv.org

Supporting Sensemaking of Large Language Model Outputs at Scale

KI Gero, C Swoopes, Z Gu, JK Kummerfeld… - Proceedings of the CHI …, 2024 - dl.acm.org

Large language models (LLMs) are capable of generating multiple responses to a single
prompt, yet little effort has been expended to help end-users or system designers make use …

Cited by 15 · Related articles · All 4 versions

[PDF] acm.org

Intuitively assessing ML model reliability through example-based explanations and editing model inputs

H Suresh, KM Lewis, J Guttag… - Proceedings of the 27th …, 2022 - dl.acm.org

Interpretability methods aim to help users build trust in and understand the capabilities of
machine learning models. However, existing approaches often rely on abstract, complex …

Cited by 29 · Related articles · All 7 versions

[PDF] arxiv.org

ShortcutLens: A visual analytics approach for exploring shortcuts in natural language understanding dataset

Z Jin, X Wang, F Cheng, C Sun, Q Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Benchmark datasets play an important role in evaluating Natural Language Understanding
(NLU) models. However, shortcuts—unwanted biases in the benchmark datasets—can …

Cited by 11 · Related articles · All 6 versions

[PDF] arxiv.org

JailbreakHunter: A visual analytics approach for jailbreak prompts discovery from large-scale human-LLM conversational datasets

Z Jin, S Liu, H Li, X Zhao, H Qu - arXiv preprint arXiv:2407.03045, 2024 - arxiv.org

Large Language Models (LLMs) have gained significant attention but also raised concerns
due to the risk of misuse. Jailbreak prompts, a popular type of adversarial attack towards …

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

Exploring Empty Spaces: Human-in-the-Loop Data Augmentation

C Yeh, D Ren, Y Assogba, D Moritz… - arXiv preprint arXiv …, 2024 - arxiv.org

Data augmentation is crucial to make machine learning models more robust and safe.
However, augmenting data can be challenging as it requires generating diverse data points …