Generator-retriever-generator: A novel approach to open-domain question answering

A Abdallah, A Jatowt - arXiv preprint arXiv:2307.11278, 2023 - arxiv.org
Open-domain question answering (QA) tasks usually require the retrieval of relevant
information from a large corpus to generate accurate answers. We propose a novel …
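
The snippet alludes to the standard retrieve-then-generate pattern that such pipelines build on. Below is a toy, self-contained sketch of that generic pattern only, not the paper's generator-retriever-generator architecture; the corpus, retrieve, and generate_answer stand-ins are invented for illustration:

```python
# Toy retrieve-then-generate pipeline: score passages by term overlap,
# then hand the top passages to an answer generator. Illustrative only;
# real systems use neural retrievers and a large language model.
corpus = {
    "d1": "Innsbruck is a city in the Austrian state of Tyrol.",
    "d2": "BM25 is a lexical ranking function used in search engines.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank passages by simple term overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda kv: len(q_terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

def generate_answer(query: str, passages: list[str]) -> str:
    """Stand-in for a seq2seq generator conditioned on retrieved text."""
    return f"Answer to {query!r} based on: " + " ".join(passages)

top = retrieve("Which state is Innsbruck in?")
print(generate_answer("Which state is Innsbruck in?", [corpus[d] for d in top]))
```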

SPRINT: A unified toolkit for evaluating and demystifying zero-shot neural sparse retrieval

N Thakur, K Wang, I Gurevych, J Lin - Proceedings of the 46th …, 2023 - dl.acm.org
Traditionally, sparse retrieval systems relied on lexical representations such as BM25 to
retrieve documents, and these dominated information retrieval tasks. With the onset of pre …
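
The BM25 function named in the snippet is compact enough to state directly. A minimal sketch with the standard k1/b parameterization, over a toy tokenized corpus (real systems add analyzers, stopwording, and inverted indexes):

```python
import math

# Minimal BM25 over a pre-tokenized corpus; k1 and b are the usual
# free parameters controlling term-frequency saturation and length
# normalization.
docs = [["sparse", "retrieval", "with", "bm25"],
        ["dense", "retrieval", "with", "transformers"]]
N = len(docs)
avgdl = sum(len(d) for d in docs) / N

def idf(term: str) -> float:
    df = sum(term in d for d in docs)
    return math.log(1 + (N - df + 0.5) / (df + 0.5))

def bm25(query: list[str], doc: list[str], k1: float = 1.2, b: float = 0.75) -> float:
    score = 0.0
    for t in query:
        tf = doc.count(t)
        if tf:
            norm = k1 * (1 - b + b * len(doc) / avgdl)
            score += idf(t) * tf * (k1 + 1) / (tf + norm)
    return score

print(bm25(["sparse", "retrieval"], docs[0]))
```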

Generative retrieval as multi-vector dense retrieval

S Wu, W Wei, M Zhang, Z Chen, J Ma, Z Ren… - Proceedings of the 47th …, 2024 - dl.acm.org
For a given query, generative retrieval generates identifiers of relevant documents in an
end-to-end manner using a sequence-to-sequence architecture. The relation between …
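
Identifier generation of this kind is commonly implemented with decoding constrained to valid docids, for example via a trie over identifier strings. A toy sketch of that constraint, with a deterministic stand-in for the decoder's token scores (the docids and scoring function are invented):

```python
# Greedy decoding constrained to a trie of valid docids: the device
# that lets a seq2seq model emit only real document identifiers.
docids = ["d-12", "d-17", "d-42"]

def build_trie(strings):
    trie = {}
    for s in strings:
        node = trie
        for ch in s + "$":          # "$" marks end-of-identifier
            node = node.setdefault(ch, {})
    return trie

def dummy_logit(prefix: str, ch: str) -> float:
    """Deterministic stand-in for the decoder's next-token score."""
    return sum(ord(c) for c in prefix + ch) % 97 / 97.0

def generate_docid(trie) -> str:
    prefix, node = "", trie
    while True:
        if "$" in node and len(node) == 1:
            return prefix           # reached a complete identifier
        # Restrict the decoder's choices to children that stay on the trie.
        choices = [ch for ch in node if ch != "$"]
        best = max(choices, key=lambda ch: dummy_logit(prefix, ch))
        prefix, node = prefix + best, node[best]

print(generate_docid(build_trie(docids)))
```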

Distillation for Multilingual Information Retrieval

E Yang, D Lawrie, J Mayfield - Proceedings of the 47th International ACM …, 2024 - dl.acm.org
Recent work in cross-language information retrieval (CLIR), where queries and documents
are in different languages, has shown the benefit of the Translate-Distill framework that …

Resources for Brewing BEIR: Reproducible Reference Models and Statistical Analyses

E Kamalloo, N Thakur, C Lassance, X Ma… - Proceedings of the 47th …, 2024 - dl.acm.org
BEIR is a benchmark dataset originally designed for zero-shot evaluation of retrieval models
across 18 different domain/task combinations. In recent years, we have witnessed the …
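
For readers who want to reproduce a zero-shot BEIR run, the open-source beir package exposes a short evaluation loop. A sketch assuming that package's quickstart API; the dataset and model names are examples only:

```python
# Zero-shot evaluation on one BEIR dataset with the `beir` package
# (pip install beir); dataset/model choices here are just examples.
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

dataset = "scifact"
url = f"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=64)
retriever = EvaluateRetrieval(model, score_function="dot")
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg)
```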

Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard

E Kamalloo, N Thakur, C Lassance, X Ma… - arXiv preprint arXiv …, 2023 - arxiv.org
BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across
18 different domain/task combinations. In recent years, we have witnessed the growing …

Balanced Knowledge Distillation with Contrastive Learning for Document Re-ranking

Y Yang, S He, Y Qiao, W Xie, T Yang - Proceedings of the 2023 ACM …, 2023 - dl.acm.org
Knowledge distillation is commonly used in training a neural document ranking model by
employing a teacher to guide model refinement. As a teacher may not be correct in all cases …
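
Teacher-guided refinement of this kind typically sums a contrastive term over a query's candidate list with a divergence to the teacher's scores. A PyTorch-style sketch under that assumption; the plain weight lam stands in for whatever balancing scheme the paper proposes:

```python
import torch
import torch.nn.functional as F

def distill_contrastive_loss(student_scores, teacher_scores, pos_idx, lam=0.5):
    """student_scores/teacher_scores: (batch, n_candidates) ranking scores;
    pos_idx: index of the relevant document among each query's candidates."""
    # Contrastive (InfoNCE-style) term: positive vs. in-list negatives.
    contrastive = F.cross_entropy(student_scores, pos_idx)
    # Distillation term: match the teacher's score distribution via KL.
    kl = F.kl_div(
        F.log_softmax(student_scores, dim=-1),
        F.softmax(teacher_scores, dim=-1),
        reduction="batchmean",
    )
    return contrastive + lam * kl

s = torch.randn(2, 8, requires_grad=True)
t = torch.randn(2, 8)
loss = distill_contrastive_loss(s, t, torch.tensor([0, 3]))
loss.backward()
print(float(loss))
```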

SPLATE: Sparse late interaction retrieval

T Formal, S Clinchant, H Déjean… - Proceedings of the 47th …, 2024 - dl.acm.org
The late interaction paradigm introduced with ColBERT stands out in the neural Information
Retrieval space, offering a compelling effectiveness-efficiency trade-off across many …
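
Late interaction scores a document by matching each query token embedding against its best document token embedding and summing the maxima (ColBERT's MaxSim operator). A NumPy sketch with random stand-ins for the token embeddings:

```python
import numpy as np

def late_interaction_score(q_emb: np.ndarray, d_emb: np.ndarray) -> float:
    """ColBERT-style MaxSim: q_emb is (n_q, dim), d_emb is (n_d, dim),
    both assumed L2-normalized so dot products are cosine similarities."""
    sim = q_emb @ d_emb.T                 # (n_q, n_d) token-token similarities
    return float(sim.max(axis=1).sum())   # best doc token per query token

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 128)); q /= np.linalg.norm(q, axis=1, keepdims=True)
d = rng.normal(size=(50, 128)); d /= np.linalg.norm(d, axis=1, keepdims=True)
print(late_interaction_score(q, d))
```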

End-to-End Retrieval with Learned Dense and Sparse Representations Using Lucene

H Chen, C Lassance, J Lin - arXiv preprint arXiv:2311.18503, 2023 - arxiv.org
The bi-encoder architecture provides a framework for understanding machine-learned
retrieval models based on dense and sparse vector representations. Although these …
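
The unifying observation behind the bi-encoder framing is that dense and sparse models both score by an inner product between query and document representations; only the vector type differs. A minimal sketch of that shared abstraction, with hard-coded stand-ins for encoder outputs:

```python
import numpy as np

# Bi-encoder scoring is an inner product in both regimes; the encoder
# decides whether vectors are dense arrays or sparse term->weight maps.
def dense_score(q: np.ndarray, d: np.ndarray) -> float:
    return float(q @ d)

def sparse_score(q: dict[str, float], d: dict[str, float]) -> float:
    return sum(w * d[t] for t, w in q.items() if t in d)

q_dense, d_dense = np.array([0.1, 0.9, 0.3]), np.array([0.2, 0.8, 0.5])
q_sparse = {"neural": 1.2, "retrieval": 0.7}
d_sparse = {"retrieval": 0.9, "lucene": 0.4}

print(dense_score(q_dense, d_dense))      # dense inner product
print(sparse_score(q_sparse, d_sparse))   # inner product over shared terms
```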

Weighted KL-Divergence for Document Ranking Model Refinement

Y Yang, Y Qiao, S He, T Yang - … of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Transformer-based retrieval and reranking models for text document search are often
refined through knowledge distillation together with contrastive learning. A tight distribution …
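
One way to tighten the distribution match the snippet mentions is to reweight the per-candidate KL terms, for example toward the teacher's top-ranked documents. A sketch under that assumption; the weighting scheme below is illustrative, not the paper's formulation:

```python
import torch
import torch.nn.functional as F

def weighted_kl(student_scores, teacher_scores, weights):
    """Per-candidate weighted KL(teacher || student) over ranking scores.
    All tensors are (batch, n_candidates); weights lets some documents
    contribute more to the refinement signal than others."""
    p = F.softmax(teacher_scores, dim=-1)
    log_q = F.log_softmax(student_scores, dim=-1)
    kl_terms = p * (p.log() - log_q)          # elementwise KL contributions
    return (weights * kl_terms).sum(dim=-1).mean()

t = torch.randn(2, 8)
s = torch.randn(2, 8, requires_grad=True)
# Illustrative weights: emphasize the teacher's top-ranked candidates.
ranks = t.argsort(dim=-1, descending=True).argsort(dim=-1)
w = 1.0 / (ranks.float() + 1.0)
loss = weighted_kl(s, t, w)
loss.backward()
print(float(loss))
```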