Time-Optimal Top-k Document Retrieval

G Navarro, Y Nekrich - SIAM Journal on Computing, 2017 - SIAM
Let \mathcal{D} be a collection of D documents, which are strings over an alphabet of size σ,
of total length n. We describe a data structure that uses linear space and reports k most …

Efficient indexing for semantic search

F Lashkari, F Ensan, E Bagheri, AA Ghorbani - Expert Systems with …, 2017 - Elsevier
The increasing performance and wider spread use of automated semantic annotation and
entity linking platforms has empowered the possibility of using semantic information in …

Document retrieval on repetitive string collections

T Gagie, A Hartikainen, K Karhu, J Kärkkäinen… - Information Retrieval …, 2017 - Springer
Most of the fastest-growing string collections today are repetitive, that is, most of the
constituent documents are similar to many others. As these collections keep growing, a key …

Document retrieval hacks

SJ Puglisi, B Zhukova - 19th International Symposium on …, 2021 - drops.dagstuhl.de
Given a collection of strings, document listing refers to the problem of finding all the strings
(or documents) where a given query string (or pattern) appears. Index data structures that …

Grammar compressed sequences with rank/select support

A Ordóñez, G Navarro, NR Brisaboa - Journal of Discrete Algorithms, 2017 - Elsevier
Sequence representations supporting not only direct access to their symbols, but also
rank/select operations, are a fundamental building block in many compressed data …

DSSM with text hashing technique for text document retrieval in next-generation search engine for big data and data analytics

HS Chiranjeevi, M Shenoy, S Prabhu… - … on engineering and …, 2016 - ieeexplore.ieee.org
Digital world is coming, where data has become big data with ever increase in large volume of
digital information available in terms of text documents. This tends for data extraction …

Improved queryable representations of rasters

A Pinto, D Seco, G Gutiérrez - 2017 Data Compression …, 2017 - ieeexplore.ieee.org
We present two compact representations of rasters, which are used in GIS to represent
temperatures, elevations, and other spatial attributes, that support queries on the positions …

Question processing for Arabic question answering system

HM Al Chalabi - 2015 - search.proquest.com
Due to very fast growth of information in the last few decades, getting precise information in
real time is becoming increasingly difficult. Search engines such as Google and Yahoo are …

Lempel–Ziv compressed structures for document retrieval

H Ferrada, G Navarro - Information and Computation, 2019 - Elsevier
Document retrieval structures index a collection of string documents, to retrieve those that
are relevant to query strings p: document listing retrieves all documents where p appears; …

Practical Compact Indexes for Top-k Document Retrieval

S Gog, R Konow, G Navarro - Journal of Experimental Algorithmics (JEA …, 2017 - dl.acm.org
We present a fast and compact index for top-k document retrieval on general string
collections, in which given a string pattern, the index returns the k documents where it …