Computational graph pangenomics: a tutorial on data structures and their applications
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …
Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
Refining the r-index
Abstract Gagie, Navarro and Prezza's r-index (SODA, 2018) promises to speed up DNA
alignment and variation calling by allowing us to index entire genomic databases, provided …
A comparison of index-based Lempel-Ziv LZ77 factorization algorithms
A Al-Hafeedh, M Crochemore, L Ilie… - ACM Computing …, 2012 - dl.acm.org
Since 1977, when Lempel and Ziv described a kind of string factorization useful for text
compression, there has been a succession of algorithms proposed for computing “LZ …
Optimal-time queries on BWT-runs compressed indexes
T Nishimoto, Y Tabei - arXiv preprint arXiv:2006.05104, 2020 - arxiv.org
Indexing highly repetitive strings (i.e., strings with many repetitions) for fast queries has
become a central research topic in string processing, because it has a wide variety of …
Linear time Lempel-Ziv factorization: Simple, fast, small
Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in
many diverse applications, including data compression, text indexing, and pattern discovery …
Inducing enhanced suffix arrays for string collections
Constructing the suffix array for a string collection is an important task that may be performed
by sorting the concatenation of all strings. In this article we present algorithms gSAIS and g …
Inducing suffix and LCP arrays in external memory
T Bingmann, J Fischer, V Osipov - Journal of Experimental Algorithmics …, 2016 - dl.acm.org
We consider full text index construction in external memory (EM). Our first contribution is an
inducing algorithm for suffix arrays in external memory, which runs in sorting complexity …
Inducing the LCP-array
J Fischer - Workshop on Algorithms and Data Structures, 2011 - Springer
We show how to modify the linear-time construction algorithm for suffix arrays based on
induced sorting (Nong et al., DCC'09) such that it computes the array of longest common …
Weighted ancestors in suffix trees revisited
The weighted ancestor problem is a well-known generalization of the predecessor problem
to trees. It is known to require $\Omega(\log\log n)$ time for queries provided $O(n\mathop …