Access, rank, and select in grammar-compressed strings

G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

Cited by 112 - Related articles - All 7 versions

[PDF] arxiv.org

Fully functional suffix trees and optimal text searching in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Journal of the ACM (JACM), 2020 - dl.acm.org

Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Cited by 190 - Related articles - All 12 versions

[PDF] arxiv.org

At the roots of dictionary compression: string attractors

D Kempa, N Prezza - Proceedings of the 50th Annual ACM SIGACT …, 2018 - dl.acm.org

A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …

Cited by 148 - Related articles - All 17 versions

[PDF] siam.org

Optimal-time text indexing in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Proceedings of the Twenty-Ninth Annual ACM …, 2018 - SIAM

Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Cited by 128 - Related articles - All 13 versions

[PDF] unive.it

Towards a definitive measure of repetitiveness

T Kociumaka, G Navarro, N Prezza - Latin American Symposium on …, 2020 - Springer

Unlike in statistical compression, where Shannon's entropy is a definitive lower bound, no
such clear measure exists for the compressibility of repetitive sequences. Since statistical …

Cited by 61 - Related articles - All 5 versions

[PDF] acm.org

Balancing straight-line programs

M Ganardi, A Jeż, M Lohrey - Journal of the ACM (JACM), 2021 - dl.acm.org

We show that a context-free grammar of size m that produces a single string w of length n (such a
grammar is also called a string straight-line program) can be transformed in linear time into a …

Cited by 64 - Related articles - All 9 versions

[PDF] arxiv.org

Toward a definitive compressibility measure for repetitive sequences

T Kociumaka, G Navarro… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

While the kth order empirical entropy is an accepted measure of the compressibility of
individual sequences on classical text collections, it is useful only for small values of k and …

Cited by 42 - Related articles - All 8 versions

[HTML] sciencedirect.com

Universal compressed text indexing

G Navarro, N Prezza - Theoretical Computer Science, 2019 - Elsevier

The rise of repetitive datasets has lately generated a lot of interest in compressed self-
indexes based on dictionary compression, a rich and heterogeneous family of techniques …

Cited by 55 - Related articles - All 12 versions

[PDF] arxiv.org

Grammar-compressed indexes with logarithmic search time

F Claude, G Navarro, A Pacheco - Journal of Computer and System …, 2021 - Elsevier

Let a text T[1..n] be the only string generated by a context-free grammar with g
(terminal and nonterminal) symbols, and of size G (measured as the sum of the lengths of …

Cited by 34 - Related articles - All 5 versions

[PDF] helsinki.fi

Block trees

D Belazzougui, M Cáceres, T Gagie… - Journal of Computer and …, 2021 - Elsevier

Let string S[1..n] be parsed into z phrases by the Lempel-Ziv algorithm. The
corresponding compression algorithm encodes S in O(z) space, but it does not support …

Cited by 26 - Related articles - All 10 versions