AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Exploiting programmatic behavior of llms: Dual-use through standard security attacks

D Kang, X Li, I Stoica, C Guestrin… - 2024 IEEE Security …, 2024 - ieeexplore.ieee.org
Recent advances in instruction-following large language models (LLMs) have led to
dramatic improvements in a range of NLP tasks. Unfortunately, we find that the same …

ByT5: Towards a token-free future with pre-trained byte-to-byte models

L Xue, A Barua, N Constant, R Al-Rfou… - Transactions of the …, 2022 - direct.mit.edu
Most widely used pre-trained language models operate on sequences of tokens
corresponding to word or subword units. By comparison, token-free models that operate …
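As a rough illustration of the contrast this entry describes, the sketch below (plain Python; the tiny subword vocabulary is hypothetical and not ByT5's or any real tokenizer's) compares a greedy subword-style segmentation of a word with the raw UTF-8 byte sequence a token-free, byte-level model would consume instead.

```python
# Toy illustration: subword pieces vs. raw UTF-8 bytes.
# The vocabulary below is hypothetical and only for illustration; real models
# learn vocabularies of tens of thousands of subword units.

def greedy_subword_split(word, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # fall back to a single character
            i += 1
    return pieces

toy_vocab = {"token", "ization", "un", "believ", "able"}

for word in ["tokenization", "unbelievable"]:
    subwords = greedy_subword_split(word, toy_vocab)
    # Raw UTF-8 byte values, used here as a stand-in for byte-level token ids.
    byte_ids = list(word.encode("utf-8"))
    print(f"{word!r}: {len(subwords)} subwords {subwords} "
          f"vs. {len(byte_ids)} byte ids {byte_ids[:6]}...")
```

The point of the comparison is only that the byte-level sequence is longer but needs no learned vocabulary, which is the trade-off the token-free line of work explores.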

User preference-aware fake news detection

Y Dou, K Shu, C Xia, PS Yu, L Sun - … of the 44th international ACM SIGIR …, 2021 - dl.acm.org
Disinformation and fake news have posed detrimental effects on individuals and society in
recent years, attracting broad attention to fake news detection. The majority of existing fake …

Language model tokenizers introduce unfairness between languages

A Petrov, E La Malfa, P Torr… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent language models have shown impressive multilingual performance, even when not
explicitly trained for it. Despite this, there are concerns about the quality of their outputs …
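The disparity this paper quantifies with real subword tokenizers can be previewed with a much cruder proxy: UTF-8 encodes Latin-script characters in one byte but most other scripts in two or three, so sequence lengths for roughly parallel sentences already diverge before any subword segmentation. The sketch below is plain Python; the translations are approximate and the byte count is only an illustrative stand-in for the token counts analysed in the paper.

```python
# Crude proxy for cross-lingual tokenization disparity: compare character and
# UTF-8 byte counts of roughly parallel sentences. The paper measures subword
# token counts from real LLM tokenizers; byte length is only a stand-in
# (Latin script: 1 byte/char, Cyrillic: 2, Devanagari: 3).

sentences = {
    "English": "How are you today?",
    "Russian": "Как у тебя дела сегодня?",  # approximate translation
    "Hindi":   "आज आप कैसे हैं?",            # approximate translation
}

base = len(sentences["English"].encode("utf-8"))
for lang, text in sentences.items():
    n_bytes = len(text.encode("utf-8"))
    print(f"{lang:8s} chars={len(text):3d} bytes={n_bytes:3d} "
          f"relative length={n_bytes / base:.2f}")
```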

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

JH Clark, D Garrette, I Turc, J Wieting - Transactions of the Association …, 2022 - direct.mit.edu
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet
nearly all commonly used models still require an explicit tokenization step. While recent …

CharacterBERT: Reconciling ELMo and BERT for word-level open-vocabulary representations from characters

HE Boukkouri, O Ferret, T Lavergne, H Noji… - arXiv preprint arXiv …, 2020 - arxiv.org
Due to the compelling improvements brought by BERT, many recent representation models
adopted the Transformer architecture as their main building block, consequently inheriting …

Charformer: Fast character transformers via gradient-based subword tokenization

Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung… - arXiv preprint arXiv …, 2021 - arxiv.org
State-of-the-art models in natural language processing rely on separate rigid subword
tokenization algorithms, which limit their generalization ability and adaptation to new …

Adversarial example detection for DNN models: A review and experimental comparison

A Aldahdooh, W Hamidouche, SA Fezza… - Artificial Intelligence …, 2022 - Springer
Deep learning (DL) has shown great success in many human-related tasks, which has led to
its adoption in many computer vision based applications, such as security surveillance …