Dense text retrieval based on pretrained language models: A survey
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to users' queries in natural language. From …
RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering
In open-domain question answering, dense passage retrieval has become a new paradigm
to retrieve relevant passages for finding answers. Typically, the dual-encoder architecture is …
Overview of the TREC 2019 deep learning track
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc
ranking in a large data regime. It is the first track with large human-labeled training sets …
SemEval-2022 Task 11: Multilingual complex named entity recognition (MultiCoNER)
We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity
Recognition MULTICONER. Divided into 13 tracks, the task focused on methods to identify …
MultiCoNER: A large-scale multilingual dataset for complex named entity recognition
We present MultiCoNER, a large multilingual dataset for Named Entity Recognition that
covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as …
Simplified data wrangling with ir_datasets
Managing the data for Information Retrieval (IR) experiments can be challenging. Dataset
documentation is scattered across the Internet and once one obtains a copy of the data …
PAIR: Leveraging passage-centric similarity relation for improving dense passage retrieval
Recently, dense passage retrieval has become a mainstream approach to finding relevant
information in various natural language processing tasks. A number of studies have been …
GEMNET: Effective gated gazetteer representations for recognizing complex entities in low-context input
Named Entity Recognition (NER) remains difficult in real-world settings; current
challenges include short texts (low context), emerging entities, and complex entities (e.g. …
Topic-oriented adversarial attacks against black-box neural ranking models
Neural ranking models (NRMs) have attracted considerable attention in information retrieval.
Unfortunately, NRMs may inherit the adversarial vulnerabilities of general neural networks …
MIMICS: A large-scale data collection for search clarification
Search clarification has recently attracted much attention due to its applications in search
engines. It has also been recognized as a major component in conversational information …