Dense text retrieval based on pretrained language models: A survey
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …
RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models
To better support information retrieval tasks such as web search and open-domain question
answering, growing effort is made to develop retrieval-oriented language models, e.g., …
TOME: A two-stage approach for model-based retrieval
Recently, model-based retrieval has emerged as a new paradigm in text retrieval that
discards the index in the traditional retrieval model and instead memorizes the candidate …
Making large language models a better foundation for dense retrieval
Dense retrieval needs to learn discriminative text embeddings to represent the semantic
relationship between query and document. It may benefit from the use of large language …
Master: Multi-task pre-trained bottlenecked masked autoencoders are better dense retrievers
Pre-trained Transformers (e.g., BERT) have been commonly used in existing dense retrieval
methods for parameter initialization, and recent studies are exploring more effective pre …
Llama2vec: Unsupervised adaptation of large language models for dense retrieval
Dense retrieval calls for discriminative embeddings to represent the semantic relationship
between query and document. It may benefit from the use of large language models …
Learning Discrete Document Representations in Web Search
R Huang, D Zhang, W Lu, H Li, M Wang, D Shi… - Proceedings of the 29th …, 2023 - dl.acm.org
Product quantization (PQ) has been usually applied to dense retrieval (DR) of documents
thanks to its competitive time, memory efficiency and compatibility with other approximate …
TriSampler: A Better Negative Sampling Principle for Dense Retrieval
Negative sampling stands as a pivotal technique in dense retrieval, essential for training
effective retrieval models and significantly impacting retrieval performance. While existing …
Retromae-2: Duplex masked auto-encoder for pre-training retrieval-oriented language models
To better support information retrieval tasks such as web search and open-domain question
answering, growing effort is made to develop retrieval-oriented language models, e.g., …
Improving News Recommendation via Bottlenecked Multi-task Pre-training
X Xiao, Q Li, S Liu, K Zhou - Proceedings of the 46th International ACM …, 2023 - dl.acm.org
Recent years have witnessed the boom of deep neural networks in online news
recommendation service. As news articles mainly consist of textual content, pre-trained …