SLIM: Sparsified late interaction for multi-vector retrieval with inverted indexes

M Li, SC Lin, X Ma, J Lin - Proceedings of the 46th International ACM …, 2023 - dl.acm.org
This paper introduces Sparsified Late Interaction for Multi-vector (SLIM) retrieval with
inverted indexes. Multi-vector retrieval methods have demonstrated their effectiveness on …

Fast, differentiable and sparse top-k: a convex analysis perspective

ME Sander, J Puigcerver, J Djolonga… - International …, 2023 - proceedings.mlr.press
The top-$k$ operator returns a $k$-sparse vector, where the non-zero values correspond
to the $k$ largest values of the input. Unfortunately, because it is a discontinuous function …

Rethinking the role of token retrieval in multi-vector retrieval

J Lee, Z Dai, SMK Duddu, T Lei… - Advances in …, 2024 - proceedings.neurips.cc
Multi-vector retrieval models such as ColBERT [Khattab et al., 2020] allow token-level
interactions between queries and documents, and hence achieve state of the art on many …

CITADEL: Conditional token interaction via dynamic lexical routing for efficient and effective multi-vector retrieval

M Li, SC Lin, B Oguz, A Ghoshal, J Lin… - arXiv preprint arXiv …, 2022 - arxiv.org
Multi-vector retrieval methods combine the merits of sparse (e.g., BM25) and dense (e.g., DPR)
retrievers and have achieved state-of-the-art performance on various retrieval tasks. These …

Towards Effective and Efficient Sparse Neural Information Retrieval

T Formal, C Lassance, B Piwowarski… - ACM Transactions on …, 2024 - dl.acm.org
Sparse representation learning based on Pre-trained Language Models has seen a growing
interest in Information Retrieval. Such approaches can take advantage of the proven …

Generative retrieval as multi-vector dense retrieval

S Wu, W Wei, M Zhang, Z Chen, J Ma, Z Ren… - Proceedings of the 47th …, 2024 - dl.acm.org
For a given query, generative retrieval generates identifiers of relevant documents in an
end-to-end manner using a sequence-to-sequence architecture. The relation between …

An Analysis on Matching Mechanisms and Token Pruning for Late-interaction Models

Q Liu, G Guo, J Mao, Z Dou, JR Wen, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
With the development of pre-trained language models, the dense retrieval models have
become promising alternatives to the traditional retrieval models that rely on exact match …

Improving dual-encoder training through dynamic indexes for negative mining

N Monath, M Zaheer, K Allen… - … Conference on Artificial …, 2023 - proceedings.mlr.press
Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for
training such dual encoders is an accurate estimation of gradients from the partition function …

SPLATE: Sparse late interaction retrieval

T Formal, S Clinchant, H Déjean… - Proceedings of the 47th …, 2024 - dl.acm.org
The late interaction paradigm introduced with ColBERT stands out in the neural Information
Retrieval space, offering a compelling effectiveness-efficiency trade-off across many …

CSurF: Sparse Lexical Retrieval through Contextualized Surface Forms

Z Fan, L Gao, J Callan - Proceedings of the 2023 ACM SIGIR …, 2023 - dl.acm.org
Lexical exact-match systems perform text retrieval efficiently with sparse matching signals
and fast retrieval through inverted lists, but naturally suffer from the mismatch between …