Continual lifelong learning in natural language processing: A survey
Continual learning (CL) aims to enable information systems to learn from a continuous data
stream across time. However, it is difficult for existing deep learning architectures to learn a …
NusaCrowd: Open source initiative for Indonesian NLP resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
mT5: A massively multilingual pre-trained text-to-text transformer
L Xue - arXiv preprint arXiv:2010.11934, 2020
The recent" Text-to-Text Transfer Transformer"(T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …
Merging models with fisher-weighted averaging
MS Matena, CA Raffel - Advances in Neural Information Processing Systems, 2022
Averaging the parameters of models that have the same architecture and initialization can
provide a means of combining their respective capabilities. In this paper, we take the …
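The merging technique this abstract describes is concrete enough to sketch. Below is a minimal, illustrative take on Fisher-weighted parameter averaging in PyTorch: the diagonal Fisher is approximated by mean squared gradients over sample data, and each parameter is averaged across models with those Fisher values as weights. The helper names (`fisher_diagonal`, `fisher_merge`) and the `eps` smoothing term are our choices for illustration, not the paper's code.

```python
import torch

def fisher_diagonal(model, data_loader, loss_fn):
    """Approximate the diagonal Fisher as the mean squared gradient
    of the loss over a sample of data (a common approximation)."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}

def fisher_merge(models, fishers, eps=1e-8):
    """Merge same-architecture models by weighting each parameter
    with its diagonal Fisher importance, then renormalizing."""
    merged = {}
    for n, _ in models[0].named_parameters():
        num = sum(f[n] * dict(m.named_parameters())[n].detach()
                  for m, f in zip(models, fishers))
        den = sum(f[n] for f in fishers) + eps
        merged[n] = num / den
    return merged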
Pretrained transformers for text ranking: BERT and beyond
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …
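The formulation in this abstract (order a corpus's texts by relevance to a query) can be made concrete with a short sketch. The term-overlap `score` below is a deliberately toy stand-in for the BERT-style cross-encoders the book covers; `rank` is the generic top-k ordering step.

```python
from collections import Counter

def score(query: str, doc: str) -> float:
    """Toy relevance score (term overlap); in the book's setting this
    would be a BERT cross-encoder scoring the (query, doc) pair."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def rank(query: str, corpus: list[str], k: int = 10) -> list[str]:
    """Return the top-k texts from the corpus, ordered by relevance."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

corpus = ["neural ranking with transformers",
          "a survey of grapes",
          "bert for passage retrieval"]
print(rank("bert ranking", corpus, k=2))
```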
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet
nearly all commonly used models still require an explicit tokenization step. While recent …
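The abstract's contrast with explicit tokenization can be illustrated with a minimal codepoint-level input layer. CANINE itself hashes Unicode codepoints into embeddings using several hash functions plus strided downsampling; the single-hash `CodepointEmbedder` below, and its bucket and dimension sizes, are simplifications for illustration only.

```python
import torch
import torch.nn as nn

class CodepointEmbedder(nn.Module):
    """Minimal tokenization-free input layer: map each Unicode
    codepoint to an embedding via hashing (a CANINE-style idea;
    the real model uses multiple hashes and downsampling)."""
    def __init__(self, num_buckets: int = 16384, dim: int = 128):
        super().__init__()
        self.num_buckets = num_buckets
        self.embed = nn.Embedding(num_buckets, dim)

    def forward(self, text: str) -> torch.Tensor:
        ids = torch.tensor([ord(c) % self.num_buckets for c in text])
        return self.embed(ids)  # shape: (len(text), dim)

emb = CodepointEmbedder()
print(emb("no tokenizer needed").shape)
```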
Neural unsupervised domain adaptation in NLP---a survey
Deep neural networks excel at learning from labeled data and achieve state-of-the-art
results on a wide array of Natural Language Processing tasks. In contrast, learning from …
XTREME-R: Towards more challenging and nuanced multilingual evaluation
Machine learning has brought striking advances in multilingual natural language processing
capabilities over the past year. For example, the latest techniques have improved the state …
Rethinking embedding coupling in pre-trained language models
We re-evaluate the standard practice of sharing weights between input and output
embeddings in state-of-the-art pre-trained language models. We show that decoupled …
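The coupled/decoupled distinction this abstract re-evaluates is easy to show in PyTorch. In the toy module below, tying assigns one weight matrix to both the input embedding and the output projection, so the two variants differ in parameter count; the module is illustrative, not the paper's setup.

```python
import torch.nn as nn

class TinyLM(nn.Module):
    """Toy LM head illustrating tied vs. decoupled embeddings."""
    def __init__(self, vocab: int = 1000, dim: int = 64, tie: bool = True):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.lm_head = nn.Linear(dim, vocab, bias=False)
        if tie:
            # Coupled: one weight matrix serves both roles.
            self.lm_head.weight = self.embed.weight

tied, decoupled = TinyLM(tie=True), TinyLM(tie=False)
print(sum(p.numel() for p in tied.parameters()),      # vocab * dim
      sum(p.numel() for p in decoupled.parameters())) # 2 * vocab * dim
```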