An upper bound and linear-space queries on the LZ-End parsing

D Kempa, B Saha - Proceedings of the 2022 Annual ACM-SIAM …, 2022 - SIAM
Lempel–Ziv (LZ77) compression is the most commonly used lossless compression
algorithm. The basic idea is to greedily break the input string into blocks (called “phrases”) …

[HTML][HTML] Sensitivity of string compressors and repetitiveness measures

T Akagi, M Funakoshi, S Inenaga - Information and Computation, 2023 - Elsevier
The sensitivity of a string compression algorithm C asks how much the output size C (T) for
an input string T can increase when a single character edit operation is performed on T. This …

Toward a definitive compressibility measure for repetitive sequences

T Kociumaka, G Navarro… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
While the th order empirical entropy is an accepted measure of the compressibility of
individual sequences on classical text collections, it is useful only for small values of and …

Near-optimal quantum algorithms for bounded edit distance and lempel-ziv factorization

D Gibney, C Jin, T Kociumaka, SV Thankachan - Proceedings of the 2024 …, 2024 - SIAM
Measuring sequence similarity and compressing texts are among the most fundamental
tasks in string algorithms. In this work, we develop near-optimal quantum algorithms for the …

[HTML][HTML] r-indexing the eBWT

C Boucher, D Cenzato, Z Lipták, M Rossi… - Information and …, 2024 - Elsevier
Abstract The extended Burrows-Wheeler Transform (eBWT) was introduced by Mantaci et
al.[TCS 2007] to extend the definition of the BWT to a collection of strings. As opposed to …

RLBWT tricks

NK Brown, T Gagie, M Rossi - arXiv preprint arXiv:2112.04271, 2021 - arxiv.org
Until recently, most experts would probably have agreed we cannot backwards-step in
constant time with a run-length compressed Burrows-Wheeler Transform (RLBWT), since …

Wheeler maps

A Baláž, T Gagie, A Goga, S Heumos… - Latin American …, 2024 - Springer
Motivated by challenges in pangenomic read alignment, we propose a generalization of
Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text T [1.. n] and an …

Grammar boosting: A new technique for proving lower bounds for computation over compressed data

R De, D Kempa - Proceedings of the 2024 Annual ACM-SIAM …, 2024 - SIAM
Computation over compressed data is a new paradigm in the design of algorithms and data
structures that can reduce space usage and speed up computation by orders of magnitude …

r-indexing the eBWT

C Boucher, D Cenzato, Z Lipták, M Rossi… - … Symposium on String …, 2021 - Springer
Abstract The extended Burrows Wheeler Transform (eBWT eBWT) was introduced by
Mantaci et al. TCS 2007 to extend the definition of the BWT BWT to a collection of strings. In …

Bi-directional r-indexes

Y Arakawa, G Navarro… - 33rd Annual Symposium …, 2022 - drops.dagstuhl.de
Indexing highly repetitive texts is important in fields such as bioinformatics and versioned
repositories. The run-length compression of the Burrows-Wheeler transform (BWT) provides …