Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

C-Pack: Packaged resources to advance general Chinese embedding

S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce C-Pack, a package of resources that significantly advance the field of general
Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a …

Large language models for information retrieval: A survey

Y Zhu, H Yuan, S Wang, J Liu, W Liu, C Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
As a primary means of information acquisition, information retrieval (IR) systems, such as
search engines, have integrated themselves into our daily lives. These systems also serve …

Learning to retrieve prompts for in-context learning

O Rubin, J Herzig, J Berant - arXiv preprint arXiv:2112.08633, 2021 - arxiv.org
In-context learning is a recent paradigm in natural language understanding, where a large
pre-trained language model (LM) observes a test instance and a few training examples as …

Text and code embeddings by contrastive pre-training

A Neelakantan, T Xu, R Puri, A Radford, JM Han… - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are useful features in many applications such as semantic search and
computing text similarity. Previous work typically trains models customized for different use …

Generate rather than retrieve: Large language models are strong context generators

W Yu, D Iter, S Wang, Y Xu, M Ju, S Sanyal… - arXiv preprint arXiv …, 2022 - arxiv.org
Knowledge-intensive tasks, such as open-domain question answering (QA), require access
to a large amount of world or domain knowledge. A common approach for knowledge …

ColBERTv2: Effective and efficient retrieval via lightweight late interaction

K Santhanam, O Khattab, J Saad-Falcon… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural information retrieval (IR) has greatly advanced search and other knowledge-
intensive language tasks. While many neural IR methods encode queries and documents …

BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models

N Thakur, N Reimers, A Rücklé, A Srivastava… - arXiv preprint arXiv …, 2021 - arxiv.org
Existing neural information retrieval (IR) models have often been studied in homogeneous
and narrow settings, which has considerably limited insights into their out-of-distribution …