A survey of adversarial defenses and robustness in NLP

S Goyal, S Doddapaneni, MM Khapra… - ACM Computing …, 2023 - dl.acm.org
In the past few years, it has become increasingly evident that deep neural networks are not
resilient enough to withstand adversarial perturbations in input data, leaving them …

Robust natural language processing: Recent advances, challenges, and future directions

M Omar, S Choi, DH Nyang, D Mohaisen - IEEE Access, 2022 - ieeexplore.ieee.org
Recent natural language processing (NLP) techniques have achieved high
performance on benchmark datasets, primarily due to the significant improvement in the …

Prompt as triggers for backdoor attack: Examining the vulnerability in language models

S Zhao, J Wen, LA Tuan, J Zhao, J Fu - arXiv preprint arXiv:2305.01219, 2023 - arxiv.org
The prompt-based learning paradigm, which bridges the gap between pre-training and fine-
tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot …

Defending against alignment-breaking attacks via robustly aligned LLM

B Cao, Y Cao, L Lin, J Chen - arXiv preprint arXiv:2309.14348, 2023 - arxiv.org
Recently, Large Language Models (LLMs) have made significant advancements and are
now widely used across various domains. Unfortunately, there has been a rising concern …

Towards improving adversarial training of NLP models

JY Yoo, Y Qi - arXiv preprint arXiv:2109.00544, 2021 - arxiv.org
Adversarial training, a method for learning robust deep neural networks, constructs
adversarial examples during training. However, recent methods for generating NLP …

Evaluating the robustness of neural language models to input perturbations

M Moradi, M Samwald - arXiv preprint arXiv:2108.12237, 2021 - arxiv.org
High-performance neural language models have obtained state-of-the-art results on a wide
range of Natural Language Processing (NLP) tasks. However, results for common …

Searching for an effective defender: Benchmarking defense against adversarial word substitution

Z Li, J Xu, J Zeng, L Li, X Zheng, Q Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent studies have shown that deep neural networks are vulnerable to intentionally crafted
adversarial examples, and various methods have been proposed to defend against …

How should pre-trained language models be fine-tuned towards adversarial robustness?

X Dong, AT Luu, M Lin, S Yan… - Advances in Neural …, 2021 - proceedings.neurips.cc
The fine-tuning of pre-trained language models has achieved great success in many NLP fields. Yet,
it is strikingly vulnerable to adversarial examples, e.g., word substitution attacks using only …

Transformer models used for text-based question answering systems

K Nassiri, M Akhloufi - Applied Intelligence, 2023 - Springer
Question answering systems are frequently applied in the area of natural language
processing (NLP) because of their wide variety of applications. They consist of answering …

PRADA: Practical black-box adversarial attacks against neural ranking models

C Wu, R Zhang, J Guo, M de Rijke, Y Fan… - ACM Transactions on …, 2023 - dl.acm.org
Neural ranking models (NRMs) have shown remarkable success in recent years, especially
with pre-trained language models. However, deep neural models are notorious for their …