Spaces, trees, and colors: The algorithmic landscape of document retrieval on sequences
G Navarro - ACM Computing Surveys (CSUR), 2014 - dl.acm.org
Document retrieval is one of the best-established information retrieval activities since
the'60s, pervading all search engines. Its aim is to obtain, from a collection of text …
the'60s, pervading all search engines. Its aim is to obtain, from a collection of text …
[HTML][HTML] Wavelet trees for all
G Navarro - Journal of Discrete Algorithms, 2014 - Elsevier
The wavelet tree is a versatile data structure that serves a number of purposes, from string
processing to computational geometry. It can be regarded as a device that represents a …
processing to computational geometry. It can be regarded as a device that represents a …
Alphabet-independent compressed text indexing
D Belazzougui, G Navarro - ACM Transactions on Algorithms (TALG), 2014 - dl.acm.org
Self-indexes are able to represent a text asymptotically within the information-theoretic lower
bound under the k th order entropy model and offer access to any text substring and indexed …
bound under the k th order entropy model and offer access to any text substring and indexed …
The wavelet matrix: An efficient wavelet tree for large alphabets
The wavelet tree is a flexible data structure that permits representing sequences S [1, n] of
symbols over an alphabet of size σ, within compressed space and supporting a wide range …
symbols over an alphabet of size σ, within compressed space and supporting a wide range …
Improved grammar-based compressed indexes
We introduce the first grammar-compressed representation of a sequence that supports
searches in time that depends only logarithmically on the size of the grammar. Given a text T …
searches in time that depends only logarithmically on the size of the grammar. Given a text T …
Compact querieable representations of raster data
G De Bernardo, S Álvarez-García, NR Brisaboa… - String Processing and …, 2013 - Springer
Abstract In Geographic Information Systems (GIS) the attributes of the space (altitude,
temperature, etc.) are usually represented using a raster model. There are no compact …
temperature, etc.) are usually represented using a raster model. There are no compact …
Colored range queries and document retrieval
Colored range queries are a well-studied topic in computational geometry and database
research that, in the past decade, have found exciting applications in information retrieval. In …
research that, in the past decade, have found exciting applications in information retrieval. In …
Lightweight lempel-ziv parsing
We introduce a new approach to LZ77 factorization that uses \O(n/d) words of working space
and \O(dn) time for any d≥ 1 (for polylogarithmic alphabet sizes). We also describe carefully …
and \O(dn) time for any d≥ 1 (for polylogarithmic alphabet sizes). We also describe carefully …
[HTML][HTML] On compressing permutations and adaptive sorting
We prove that, given a permutation π over [1.. n] formed of nRuns sorted blocks of sizes
given by the vector R=< r 1,…, r nRuns>, there exists a compressed data structure encoding …
given by the vector R=< r 1,…, r nRuns>, there exists a compressed data structure encoding …