Semantic models for the first-stage retrieval: A comprehensive review
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where first-stage retrieval returns a subset of candidate documents and later stages …
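The snippet only names the two-stage architecture, so here is a minimal sketch of the idea, assuming a cheap term-overlap scorer as the recall stage (a stand-in for BM25 or a bi-encoder) and a plug-in scorer as the precision stage (a stand-in for a cross-encoder); all function names and the toy corpus are illustrative, not from the paper.

```python
from collections import Counter
from typing import Callable

def first_stage_retrieve(query: str, corpus: list[str], k: int = 100) -> list[int]:
    """Cheap recall stage: score every document by term overlap with the
    query and keep the top-k candidate indices."""
    q_terms = Counter(query.lower().split())
    scores = []
    for idx, doc in enumerate(corpus):
        d_terms = Counter(doc.lower().split())
        overlap = sum(min(q_terms[t], d_terms[t]) for t in q_terms)
        scores.append((overlap, idx))
    scores.sort(reverse=True)
    return [idx for _, idx in scores[:k]]

def rerank(query: str, corpus: list[str], candidates: list[int],
           scorer: Callable[[str, str], float], k: int = 10) -> list[int]:
    """Expensive precision stage: apply a stronger relevance scorer only
    to the candidate subset returned by the first stage."""
    rescored = sorted(candidates, key=lambda i: scorer(query, corpus[i]), reverse=True)
    return rescored[:k]

# Toy usage with a trivial scorer standing in for a learned re-ranker.
corpus = ["dense retrieval with bi-encoders", "sparse retrieval with BM25",
          "re-ranking with cross-encoders"]
cands = first_stage_retrieve("retrieval with encoders", corpus, k=3)
final = rerank("retrieval with encoders", corpus, cands,
               scorer=lambda q, d: len(set(q.split()) & set(d.split())))
print(final)
```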
Information retrieval: recent advances and beyond
KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
C-Pack: Packaged resources to advance general Chinese embedding
We introduce C-Pack, a package of resources that significantly advance the field of general
Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a …
Large language models for information retrieval: A survey
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …
Learning to retrieve prompts for in-context learning
In-context learning is a recent paradigm in natural language understanding, where a large
pre-trained language model (LM) observes a test instance and a few training examples as …
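A minimal sketch of the retrieve-demonstrations-then-prompt pattern the abstract refers to, assuming a simple bag-of-words similarity in place of the learned prompt retriever the paper actually trains; `build_prompt` and the toy data are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def bag_of_words_sim(a: str, b: str) -> float:
    """Cosine similarity over word counts; a stand-in for a learned
    prompt retriever."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(test_input: str, train_set: list[tuple[str, str]], k: int = 3) -> str:
    """Pick the k training pairs most similar to the test instance and
    format them as in-context demonstrations before the test input."""
    ranked = sorted(train_set, key=lambda ex: bag_of_words_sim(test_input, ex[0]),
                    reverse=True)
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in ranked[:k])
    return f"{demos}\nInput: {test_input}\nOutput:"

train = [("translate cat to french", "chat"), ("translate dog to french", "chien")]
print(build_prompt("translate bird to french", train, k=2))
```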
Text and code embeddings by contrastive pre-training
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …
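Embedding models of this kind are commonly trained with a contrastive (InfoNCE) objective using in-batch negatives; the sketch below shows that loss on paired query/document embeddings, with the batch size, dimension, and temperature chosen purely for illustration rather than matching the paper's recipe.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q_emb: torch.Tensor, d_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE with in-batch negatives: the i-th query should match the
    i-th document; every other document in the batch acts as a negative."""
    q = F.normalize(q_emb, dim=-1)
    d = F.normalize(d_emb, dim=-1)
    logits = q @ d.T / temperature      # (batch, batch) similarity matrix
    targets = torch.arange(q.size(0))   # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy batch of 4 query/document embedding pairs of dimension 8.
q = torch.randn(4, 8)
d = torch.randn(4, 8)
print(in_batch_contrastive_loss(q, d).item())
```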
Generate rather than retrieve: Large language models are strong context generators
Knowledge-intensive tasks, such as open-domain question answering (QA), require access
to a large amount of world or domain knowledge. A common approach for knowledge …
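A minimal sketch of the generate-then-read pattern the abstract points to: prompt an LLM to write a background document, then answer conditioned on that generated context instead of retrieved passages. The `generate` callable and the prompt wording are placeholders for whatever model or API is used, not an interface defined by the paper.

```python
from typing import Callable

def generate_then_read(question: str, generate: Callable[[str], str]) -> str:
    """Generate-then-read: ask the model for a background document about
    the question, then answer conditioned on that generated context."""
    context = generate(f"Generate a background document that helps answer: {question}")
    answer = generate(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return answer

# Usage with any text-generation callable (local model, API client, ...):
# answer = generate_then_read("Who wrote The Selfish Gene?", generate=my_llm)
```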
ColBERTv2: Effective and efficient retrieval via lightweight late interaction
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …
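ColBERT-style late interaction scores a query against a document by matching per-token embeddings rather than single vectors; below is a minimal MaxSim sketch (sum over query tokens of the best-matching document token), with random tensors standing in for encoder outputs and dimensions chosen arbitrarily.

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_tokens: torch.Tensor, doc_tokens: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) relevance: for each query token embedding,
    take its highest similarity among the document token embeddings, then sum."""
    q = F.normalize(query_tokens, dim=-1)   # (num_query_tokens, dim)
    d = F.normalize(doc_tokens, dim=-1)     # (num_doc_tokens, dim)
    sim = q @ d.T                           # token-level similarity matrix
    return sim.max(dim=1).values.sum()      # best document token per query token

# Stand-ins for per-token embeddings from the query and document encoders.
score = maxsim_score(torch.randn(8, 128), torch.randn(180, 128))
print(score.item())
```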
BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …
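Zero-shot retrieval quality on BEIR is usually reported as nDCG@10; here is a small self-contained sketch of that metric (with linear gains), independent of the BEIR toolkit's own evaluation code, using illustrative document ids and judgments.

```python
import math

def ndcg_at_k(ranked_doc_ids: list[str], relevance: dict[str, int], k: int = 10) -> float:
    """nDCG@k: discounted cumulative gain of the ranking, normalized by the
    DCG of the ideal ranking; `relevance` maps doc id to a graded judgment."""
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy example: two judged documents, one retrieved at rank 1, one missed.
print(ndcg_at_k(["d1", "d5", "d9"], {"d1": 2, "d3": 1}, k=10))
```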