Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to users' queries in natural language. From …

How to train your DRAGON: Diverse augmentation towards generalizable dense retrieval

SC Lin, A Asai, M Li, B Oguz, J Lin, Y Mehdad… - arXiv preprint arXiv …, 2023 - arxiv.org
Various techniques have been developed in recent years to improve dense retrieval (DR),
such as unsupervised contrastive learning and pseudo-query generation. Existing DRs …

PLAID: an efficient engine for late interaction retrieval

K Santhanam, O Khattab, C Potts… - Proceedings of the 31st …, 2022 - dl.acm.org
Pre-trained language models are increasingly important components across multiple
information retrieval (IR) paradigms. Late interaction, introduced with the ColBERT model …

Low-resource dense retrieval for open-domain question answering: A comprehensive survey

X Shen, S Vakulenko, M Del Tredici… - arXiv preprint arXiv …, 2022 - arxiv.org
Dense retrieval (DR) approaches based on powerful pre-trained language models (PLMs)
have achieved significant advances and become a key component for modern open-domain …

Aggretriever: A simple approach to aggregate textual representations for robust dense passage retrieval

SC Lin, M Li, J Lin - Transactions of the Association for Computational …, 2023 - direct.mit.edu
Pre-trained language models have been successful in many knowledge-intensive NLP
tasks. However, recent work has shown that models such as BERT are not “structurally …

Scalable and effective generative information retrieval

H Zeng, C Luo, B Jin, SM Sarwar, T Wei… - Proceedings of the ACM …, 2024 - dl.acm.org
Recent research has shown that transformer networks can be used as differentiable search
indexes by representing each document as a sequence of document ID tokens. These …

LED: Lexicon-enlightened dense retriever for large-scale retrieval

K Zhang, C Tao, T Shen, C Xu, X Geng, B Jiao… - Proceedings of the …, 2023 - dl.acm.org
Retrieval models based on dense representations in semantic space have become an
indispensable branch for first-stage retrieval. These retrievers benefit from surging advances …

Distillation from heterogeneous models for top-k recommendation

SK Kang, W Kweon, D Lee, J Lian, X Xie… - Proceedings of the ACM …, 2023 - dl.acm.org
Recent recommender systems have shown remarkable performance by using an ensemble
of heterogeneous models. However, it is exceedingly costly because it requires resources …

Tevatron: An efficient and flexible toolkit for neural retrieval

L Gao, X Ma, J Lin, J Callan - Proceedings of the 46th International ACM …, 2023 - dl.acm.org
Recent rapid advances in deep pre-trained language models and the introduction of large
datasets have powered research in embedding-based neural retrieval. While many …

Listwise generative retrieval models via a sequential learning process

Y Tang, R Zhang, J Guo, M de Rijke, W Chen… - ACM Transactions on …, 2024 - dl.acm.org
Recently, a novel generative retrieval (GR) paradigm has been proposed, where a single
sequence-to-sequence model is learned to directly generate a list of relevant document …