Indexing highly repetitive string collections, part II: Compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
Dynamic suffix array with polylogarithmic queries and updates
D Kempa, T Kociumaka - Proceedings of the 54th Annual ACM SIGACT …, 2022 - dl.acm.org
The suffix array SA[1..n] of a text T of length n is a permutation of {1, …, n} describing the
lexicographical ordering of suffixes of T and is considered to be one of the most important …
Searching and indexing genomic databases via kernelization
T Gagie, SJ Puglisi - Frontiers in Bioengineering and Biotechnology, 2015 - frontiersin.org
The rapid advance of DNA sequencing technologies has yielded databases of thousands of
genomes. To search and index these databases effectively, it is important that we take …
An upper bound and linear-space queries on the LZ-End parsing
Lempel–Ziv (LZ77) compression is the most commonly used lossless compression
algorithm. The basic idea is to greedily break the input string into blocks (called “phrases”) …
Dynamic index and LZ factorization in compressed space
In this paper, we propose a new dynamic compressed index of O(w) space for a dynamic
text T, where w = O(min(z log N log∗ M, N)) is the size of the signature encoding of T, z is …
A space-optimal grammar compression
Y Takabatake, H Sakamoto - 25th Annual European …, 2017 - drops.dagstuhl.de
A grammar compression is a context-free grammar (CFG) deriving a single string
deterministically. For an input string of length N over an alphabet of size sigma, the smallest …
Indexing highly repetitive string collections
G Navarro - arXiv preprint arXiv:2004.02781, 2020 - arxiv.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
Grammar-compressed self-index with Lyndon words
We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired
by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data …
Grammar index by induced suffix sorting
We propose a new compressed text index built upon a grammar compression based on
induced suffix sorting (Nunes et al., DCC'18). We show that this grammar exhibits a locality …
Linear-size CDAWG: New repetition-aware indexing and grammar compression
In this paper, we propose a novel approach to combine compact directed acyclic word
graphs (CDAWGs) and grammar-based compression. This leads us to an efficient self-index …