An upper bound and linear-space queries on the LZ-End parsing
Lempel–Ziv (LZ77) compression is the most commonly used lossless compression
algorithm. The basic idea is to greedily break the input string into blocks (called “phrases”) …
Sensitivity of string compressors and repetitiveness measures
T Akagi, M Funakoshi, S Inenaga - Information and Computation, 2023 - Elsevier
The sensitivity of a string compression algorithm C asks how much the output size C (T) for
an input string T can increase when a single character edit operation is performed on T. This …
Toward a definitive compressibility measure for repetitive sequences
T Kociumaka, G Navarro… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
While the k-th order empirical entropy is an accepted measure of the compressibility of
individual sequences on classical text collections, it is useful only for small values of k and …
Near-optimal quantum algorithms for bounded edit distance and Lempel-Ziv factorization
Measuring sequence similarity and compressing texts are among the most fundamental
tasks in string algorithms. In this work, we develop near-optimal quantum algorithms for the …
r-indexing the eBWT
The extended Burrows-Wheeler Transform (eBWT) was introduced by Mantaci et
al. [TCS 2007] to extend the definition of the BWT to a collection of strings. As opposed to …
Grammar boosting: A new technique for proving lower bounds for computation over compressed data
R De, D Kempa - Proceedings of the 2024 Annual ACM-SIAM …, 2024 - SIAM
Computation over compressed data is a new paradigm in the design of algorithms and data
structures that can reduce space usage and speed up computation by orders of magnitude …
Bi-directional r-indexes
Y Arakawa, G Navarro… - 33rd Annual Symposium …, 2022 - drops.dagstuhl.de
Indexing highly repetitive texts is important in fields such as bioinformatics and versioned
repositories. The run-length compression of the Burrows-Wheeler transform (BWT) provides …