Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

ColBERTv2: Effective and efficient retrieval via lightweight late interaction

K Santhanam, O Khattab, J Saad-Falcon… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …

Autoregressive search engines: Generating substrings as document identifiers

M Bevilacqua, G Ottaviano, P Lewis… - Advances in …, 2022 - proceedings.neurips.cc
Abstract Knowledge-intensive language tasks require NLP systems to both provide the
correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive …

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

From distillation to hard negative sampling: Making sparse neural IR models more effective

T Formal, C Lassance, B Piwowarski… - Proceedings of the 45th …, 2022 - dl.acm.org
Neural retrievers based on dense representations combined with Approximate Nearest
Neighbors search have recently received a lot of attention, owing their success to distillation …

SPLADE v2: Sparse lexical and expansion model for information retrieval

T Formal, C Lassance, B Piwowarski… - arXiv preprint arXiv …, 2021 - arxiv.org
In neural Information Retrieval (IR), ongoing research is directed towards improving the first
retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using …

PARADE: Passage Representation Aggregation for Document Reranking

C Li, A Yates, S MacAvaney, B He, Y Sun - ACM Transactions on …, 2023 - dl.acm.org
Pre-trained transformer models, such as BERT and T5, have shown to be highly effective at
ad hoc passage and document ranking. Due to the inherent sequence length limits of these …

A few brief notes on DeepImpact, COIL, and a conceptual framework for information retrieval techniques

J Lin, X Ma - arXiv preprint arXiv:2106.14807, 2021 - arxiv.org
Recent developments in representational learning for information retrieval can be organized
in a conceptual framework that establishes two pairs of contrasts: sparse vs. dense …

Pre-training methods in information retrieval

Y Fan, X Xie, Y Cai, J Chen, X Ma, X Li… - … and Trends® in …, 2022 - nowpublishers.com
The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to user's information need. In recent years …