Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering

Y Qu, Y Ding, J Liu, K Liu, R Ren, WX Zhao… - arXiv preprint arXiv …, 2020 - arxiv.org
In open-domain question answering, dense passage retrieval has become a new paradigm
to retrieve relevant passages for finding answers. Typically, the dual-encoder architecture is …

Overview of the TREC 2019 deep learning track

N Craswell, B Mitra, E Yilmaz, D Campos… - arXiv preprint arXiv …, 2020 - arxiv.org
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc
ranking in a large data regime. It is the first track with large human-labeled training sets …

SemEval-2022 task 11: Multilingual complex named entity recognition (MultiCoNER)

S Malmasi, A Fang, B Fetahu, S Kar… - Proceedings of the …, 2022 - aclanthology.org
We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity
Recognition (MULTICONER). Divided into 13 tracks, the task focused on methods to identify …

MultiCoNER: A large-scale multilingual dataset for complex named entity recognition

S Malmasi, A Fang, B Fetahu, S Kar… - arXiv preprint arXiv …, 2022 - arxiv.org
We present MultiCoNER, a large multilingual dataset for Named Entity Recognition that
covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as …

Simplified data wrangling with ir_datasets

S MacAvaney, A Yates, S Feldman, D Downey… - Proceedings of the 44th …, 2021 - dl.acm.org
Managing the data for Information Retrieval (IR) experiments can be challenging. Dataset
documentation is scattered across the Internet and once one obtains a copy of the data …

PAIR: Leveraging passage-centric similarity relation for improving dense passage retrieval

R Ren, S Lv, Y Qu, J Liu, WX Zhao, QQ She… - arXiv preprint arXiv …, 2021 - arxiv.org
Recently, dense passage retrieval has become a mainstream approach to finding relevant
information in various natural language processing tasks. A number of studies have been …

GEMNET: Effective gated gazetteer representations for recognizing complex entities in low-context input

T Meng, A Fang, O Rokhlenko… - Proceedings of the 2021 …, 2021 - aclanthology.org
Named Entity Recognition (NER) remains difficult in real-world settings; current
challenges include short texts (low context), emerging entities, and complex entities (e.g. …

Topic-oriented adversarial attacks against black-box neural ranking models

YA Liu, R Zhang, J Guo, M de Rijke, W Chen… - Proceedings of the 46th …, 2023 - dl.acm.org
Neural ranking models (NRMs) have attracted considerable attention in information retrieval.
Unfortunately, NRMs may inherit the adversarial vulnerabilities of general neural networks …

MIMICS: A large-scale data collection for search clarification

H Zamani, G Lueck, E Chen, R Quispe, F Luu… - Proceedings of the 29th …, 2020 - dl.acm.org
Search clarification has recently attracted much attention due to its applications in search
engines. It has also been recognized as a major component in conversational information …