Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval returns a subset of candidate documents and later stages …
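
The retrieve-then-rerank pattern this survey covers, in a minimal sketch: a cheap first stage scans the whole corpus for candidates, and a costlier second stage re-scores only those. Both scorers here are toy stand-ins, not any particular model from the paper.

```python
# Minimal two-stage ranking pipeline: recall-oriented first stage,
# precision-oriented second stage applied only to the survivors.
from typing import Callable

def first_stage(query: str, corpus: list[str], k: int = 10) -> list[int]:
    """Cheap first stage: score every document by raw term overlap."""
    q_terms = set(query.lower().split())
    scores = [len(q_terms & set(doc.lower().split())) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:k]

def rerank(query: str, corpus: list[str], candidates: list[int],
           scorer: Callable[[str, str], float]) -> list[int]:
    """Expensive second stage: apply `scorer` to the candidates only."""
    return sorted(candidates, key=lambda i: scorer(query, corpus[i]), reverse=True)

corpus = ["neural ranking models", "sparse retrieval with bm25", "dense passage retrieval"]
cands = first_stage("dense retrieval", corpus, k=2)
ranked = rerank("dense retrieval", corpus, cands,
                scorer=lambda q, d: float(len(set(q.split()) & set(d.split()))))
print([corpus[i] for i in ranked])
```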

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides a thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …
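
BEIR reports nDCG@10 across its heterogeneous datasets. A minimal sketch of that metric, assuming `qrels` maps each query to graded relevance labels and `run` holds the evaluated model's rankings (linear gain, as in trec_eval):

```python
# nDCG@k over a run: DCG of the returned ranking divided by the DCG of
# the ideal ranking, averaged over queries.
import math

def ndcg_at_k(qrels: dict, run: dict, k: int = 10) -> float:
    total = 0.0
    for qid, ranking in run.items():
        rels = qrels.get(qid, {})
        dcg = sum(rels.get(doc, 0) / math.log2(rank + 2)
                  for rank, doc in enumerate(ranking[:k]))
        ideal = sorted(rels.values(), reverse=True)[:k]
        idcg = sum(r / math.log2(rank + 2) for rank, r in enumerate(ideal))
        total += dcg / idcg if idcg > 0 else 0.0
    return total / max(len(run), 1)

qrels = {"q1": {"d1": 2, "d3": 1}}          # graded relevance judgments
run = {"q1": ["d3", "d1", "d2"]}            # model's ranked docids
print(round(ndcg_at_k(qrels, run), 4))
```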

Transformer memory as a differentiable search index

Y Tay, V Tran, M Dehghani, J Ni… - Advances in …, 2022 - proceedings.neurips.cc
In this paper, we demonstrate that information retrieval can be accomplished with a single
Transformer, in which all information about the corpus is encoded in the parameters of the …
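
The DSI recipe in miniature: one seq2seq model is trained to map document text and queries to docid strings, and retrieval becomes plain generation. A sketch with Hugging Face `t5-small`; the docid `doc-42` and the single training step are illustrative, not the paper's exact setup, and an untuned checkpoint will not emit real docids.

```python
# Index-as-training, retrieve-as-generation with a single seq2seq model.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Indexing step: teach the model query (or doc text) -> docid string.
batch = tok(["question: what is dense retrieval?"], return_tensors="pt")
labels = tok(["doc-42"], return_tensors="pt").input_ids
loss = model(**batch, labels=labels).loss   # one indexing-step loss
loss.backward()                             # gradients flow into the "index"

# Retrieval step: beam search directly over docid strings.
out = model.generate(**batch, num_beams=4, max_new_tokens=8)
print(tok.decode(out[0], skip_special_tokens=True))
```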

Large dual encoders are generalizable retrievers

J Ni, C Qu, J Lu, Z Dai, GH Ábrego, J Ma… - arXiv preprint arXiv …, 2021 - arxiv.org
It has been shown that dual encoders trained on one domain often fail to generalize to other
domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual …
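
The dual-encoder architecture in miniature: queries and documents are encoded independently, so document vectors can be precomputed and scored with a dot product at search time. The towers below are toy stubs standing in for large transformer encoders (GTR-style models share one tower for both sides):

```python
# Two-tower scoring: encode each side separately, rank by dot product.
import torch
import torch.nn as nn

class TowerStub(nn.Module):  # stand-in for a transformer encoder
    def __init__(self, vocab: int = 1000, dim: int = 64):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)  # mean-pools token embeddings

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        return nn.functional.normalize(self.emb(ids), dim=-1)

query_tower, doc_tower = TowerStub(), TowerStub()
q = query_tower(torch.randint(0, 1000, (1, 8)))    # 1 query
d = doc_tower(torch.randint(0, 1000, (5, 32)))     # 5 documents, precomputable
scores = q @ d.T                                   # dot-product relevance
print(scores.argsort(descending=True))
```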

Sentence-t5: Scalable sentence encoders from pre-trained text-to-text models

J Ni, GH Abrego, N Constant, J Ma, KB Hall… - arXiv preprint arXiv …, 2021 - arxiv.org
We provide the first exploration of sentence embeddings from text-to-text transformers (T5).
Sentence embeddings are broadly useful for language processing tasks. While T5 achieves …
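
One of the Sentence-T5 variants the paper explores simply mean-pools the T5 encoder's outputs into a fixed-size vector. A sketch of that encoder-mean strategy with `t5-small` (untuned, so similarities are only rough):

```python
# Sentence embedding = mean of T5 encoder hidden states over real tokens.
import torch
from transformers import T5EncoderModel, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
enc = T5EncoderModel.from_pretrained("t5-small")

def embed(sentences: list[str]) -> torch.Tensor:
    batch = tok(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state      # (B, L, H)
    mask = batch.attention_mask.unsqueeze(-1)        # zero out padding
    return (hidden * mask).sum(1) / mask.sum(1)      # mean over tokens

e = embed(["dense retrieval", "sparse retrieval"])
print(torch.nn.functional.cosine_similarity(e[0], e[1], dim=0))
```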

Promptagator: Few-shot dense retrieval from 8 examples

Z Dai, VY Zhao, J Ma, Y Luan, J Ni, J Lu… - arXiv preprint arXiv …, 2022 - arxiv.org
Much recent research on information retrieval has focused on how to transfer from one task
(typically with abundant supervised data) to various other tasks where supervision is limited …
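
Promptagator's core move is to build a task-specific few-shot prompt from a handful of (document, query) pairs and have an LLM generate synthetic training queries for a dense retriever. A sketch of the prompt assembly; the example pairs and the `call_llm` placeholder are hypothetical, and the paper's exact prompt format may differ.

```python
# Few-shot prompt construction for synthetic query generation.
few_shot = [
    ("Aspirin inhibits platelet aggregation ...", "does aspirin thin blood"),
    ("The Treaty of Westphalia ended ...", "what ended the thirty years war"),
]  # up to 8 hand-written (document, query) pairs for the target task

def build_prompt(document: str) -> str:
    lines = []
    for doc, query in few_shot:
        lines.append(f"Document: {doc}\nQuery: {query}\n")
    lines.append(f"Document: {document}\nQuery:")
    return "\n".join(lines)

print(build_prompt("Dual encoders embed queries and documents separately ..."))
# synthetic_query = call_llm(build_prompt(doc))   # hypothetical LLM call
```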

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …
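
The monoBERT-style cross-encoder is a centerpiece of this book: query and document are fed through BERT jointly and a relevance score is read off a classification head. A sketch with an untuned `bert-base-uncased` head, so scores are meaningless until the model is fine-tuned on ranking data:

```python
# Cross-encoder reranking: score each (query, document) pair jointly.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1)  # single relevance logit

def score(query: str, doc: str) -> float:
    batch = tok(query, doc, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**batch).logits.item()

print(score("what is dense retrieval", "Dense retrieval encodes text ..."))
```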

Learning to tokenize for generative retrieval

W Sun, L Yan, Z Chen, S Wang, H Zhu… - Advances in …, 2024 - proceedings.neurips.cc
As a new paradigm in information retrieval, generative retrieval directly generates a ranked
list of document identifiers (docids) for a given query using generative language models …
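
Whatever docid tokenization is learned, generative retrieval must decode only strings that are valid docids. A common device is a prefix trie over the docid token sequences that restricts the candidate tokens at each decoding step; a minimal sketch (with Hugging Face models this plugs in via `generate`'s `prefix_allowed_tokens_fn`):

```python
# Prefix trie over docid token sequences for constrained decoding.
class DocidTrie:
    def __init__(self, docids: list[list[int]]):
        self.root = {}
        for seq in docids:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})

    def allowed(self, prefix: list[int]) -> list[int]:
        """Token ids that may follow `prefix` without leaving the docid set."""
        node = self.root
        for tok in prefix:
            node = node.get(tok)
            if node is None:
                return []
        return list(node)

trie = DocidTrie([[7, 3, 9], [7, 4, 1], [2, 8, 8]])
print(trie.allowed([7]))   # -> [3, 4]: only real docids can be completed
```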

Multilingual universal sentence encoder for semantic retrieval

Y Yang, D Cer, A Ahmad, M Guo, J Law… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce two pre-trained, retrieval-focused multilingual sentence encoding models,
respectively based on the Transformer and CNN model architectures. The models embed …
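
The released multilingual USE checkpoints map 16 languages into one embedding space, so a query in one language can score candidates in another. A sketch against the public TF Hub module (requires `tensorflow` and `tensorflow-text`; outputs are approximately normalized, so inner product approximates cosine similarity):

```python
# Cross-lingual semantic retrieval with the multilingual USE module.
import numpy as np
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401  (registers the SentencePiece ops)

embed = hub.load(
    "https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

queries = embed(["how do neural networks learn?"])
candidates = embed(["Wie lernen neuronale Netze?", "Rezept für Apfelkuchen"])
scores = np.inner(queries, candidates)   # approximate cosine similarities
print(scores)  # the German paraphrase should score higher
```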