A survey on trajectory data management, analytics, and learning
Recent advances in sensor and mobile devices have enabled an unprecedented increase
in the availability and collection of urban trajectory data, thus increasing the demand for …
Techniques for inverted index compression
GE Pibiri, R Venturini - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
The data structure at the core of large-scale search engines is the inverted index, which is
essentially a collection of sorted integer sequences called inverted lists. Because of the …
Graph convolutional neural networks for web-scale recommender systems
Recent advancements in deep neural networks for graph-structured data have led to state-of-
the-art performance on recommender system benchmarks. However, making these methods …
[BOOK][B] The data matching process
P Christen, P Christen - 2012 - Springer
This chapter provides an overview of the data matching process, and describes the five
major steps involved in this process: data pre-processing (cleaning and standardisation) …
[BOOK][B] Data clustering: theory, algorithms, and applications
The monograph Data Clustering: Theory, Algorithms, and Applications was published in
2007. Starting with the common ground and knowledge for data clustering, the monograph …
An efficiency study for SPLADE models
C Lassance, S Clinchant - Proceedings of the 45th International ACM …, 2022 - dl.acm.org
Latency and efficiency issues are often overlooked when evaluating IR models based on
Pretrained Language Models (PLMs) in reason of multiple hardware and software testing …
Data-Centric Systems and Applications
The rapid growth of the Web in the past two decades has made it the largest publicly
accessible data source in the world. Web mining aims to discover useful information or …
PLAID: an efficient engine for late interaction retrieval
Pre-trained language models are increasingly important components across multiple
information retrieval (IR) paradigms. Late interaction, introduced with the ColBERT model …
Efficient and effective tree-based and neural learning to rank
As information retrieval researchers, we not only develop algorithmic solutions to hard
problems, but we also insist on a proper, multifaceted evaluation of ideas. The literature on …
Inverted index compression and query processing with optimized document ordering
Web search engines use highly optimized compression schemes to decrease inverted index
size and improve query throughput, and many index compression techniques have been …