Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations

J Lin, X Ma, SC Lin, JH Yang, R Pradeep… - Proceedings of the 44th …, 2021 - dl.acm.org
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and
dense representations. It aims to provide effective, reproducible, and easy-to-use first-stage …

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Anserini: Reproducible ranking baselines using Lucene

P Yang, H Fang, J Lin - Journal of Data and Information Quality (JDIQ), 2018 - dl.acm.org
This work tackles the perennial problem of reproducible baselines in information retrieval
research, focusing on bag-of-words ranking models. Although academic information …

Declarative experimentation in information retrieval using PyTerrier

C Macdonald, N Tonellotto - Proceedings of the 2020 ACM SIGIR on …, 2020 - dl.acm.org
The advent of deep machine learning platforms such as Tensorflow and Pytorch, developed
in expressive high-level languages such as Python, have allowed more expressive …

Reduce, reuse, recycle: Green information retrieval research

H Scells, S Zhuang, G Zuccon - … of the 45th International ACM SIGIR …, 2022 - dl.acm.org
Recent advances in Information Retrieval utilise energy-intensive hardware to produce state-
of-the-art results. In areas of research highly related to Information Retrieval, such as Natural …

PyTerrier: Declarative experimentation in Python from BM25 to dense retrieval

C Macdonald, N Tonellotto, S MacAvaney… - Proceedings of the 30th …, 2021 - dl.acm.org
PyTerrier is a Python-based retrieval framework for expressing simple and complex
information retrieval (IR) pipelines in a declarative manner. While making use of the long …

BERT-QE: contextualized query expansion for document re-ranking

Z Zheng, K Hui, B He, X Han, L Sun, A Yates - arXiv preprint arXiv …, 2020 - arxiv.org
Query expansion aims to mitigate the mismatch between the language used in a query and
in a document. However, query expansion methods can suffer from introducing non-relevant …

Shopping queries dataset: A large-scale ESCI benchmark for improving product search

CK Reddy, L Màrquez, F Valero, N Rao… - arXiv preprint arXiv …, 2022 - arxiv.org
Improving the quality of search results can significantly enhance users' experience and
engagement with search engines. In spite of several recent advancements in the fields of …

Using word embeddings in Twitter election classification

X Yang, C Macdonald, I Ounis - Information Retrieval Journal, 2018 - Springer
Word embeddings and convolutional neural networks (CNN) have attracted extensive
attention in various classification tasks for Twitter, e.g. sentiment classification. However, the …

[HTML] Overview of the TREC 2017 precision medicine track

K Roberts, D Demner-Fushman… - The... text REtrieval …, 2017 - ncbi.nlm.nih.gov
For many complex diseases, there is no “one size fits all” solution for patients with a
particular diagnosis. The proper treatment for a patient depends upon genetic …