Indexing highly repetitive string collections, part II: Compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
represent them within their compressed space while at the same time offering indexed …
Data Structures to Represent a Set of k-long DNA Sequences
R Chikhi, J Holub, P Medvedev - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
The analysis of biological sequencing data has been one of the biggest applications of
string algorithms. The approaches used in many such applications are based on the …
string algorithms. The approaches used in many such applications are based on the …
Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
versioned text collections—has become an important problem since the turn of the …
MONI: a pangenomic index for finding maximal exact matches
Recently, Gagie et al. proposed a version of the FM-index, called the r-index, that can store
thousands of human genomes on a commodity computer. Then Kuhnle et al. showed how to …
thousands of human genomes on a commodity computer. Then Kuhnle et al. showed how to …
Resolution of the burrows-wheeler transform conjecture
D Kempa, T Kociumaka - Communications of the ACM, 2022 - dl.acm.org
Abstract The Burrows-Wheeler Transform (BWT) is an invertible text transformation that
permutes symbols of a text according to the lexicographical order of its suffixes. BWT is the …
permutes symbols of a text according to the lexicographical order of its suffixes. BWT is the …
Recycler: an algorithm for detecting plasmids from de novo assembly graphs
R Rozov, A Brown Kav, D Bogumil, N Shterzer… - …, 2017 - academic.oup.com
Motivation Plasmids and other mobile elements are central contributors to microbial
evolution and genome innovation. Recently, they have been found to have important roles in …
evolution and genome innovation. Recently, they have been found to have important roles in …
Optimal-time text indexing in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
versioned text collections—has become an important problem since the turn of the …
[HTML][HTML] Wheeler graphs: A framework for BWT-based data structures
Abstract The famous Burrows–Wheeler Transform (BWT) was originally defined for a single
string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs …
string but variations have been developed for sets of strings, labeled trees, de Bruijn graphs …
A learned approach to design compressed rank/select data structures
We address the problem of designing, implementing, and experimenting with compressed
data structures that support rank and select queries over a dictionary of integers. We shine a …
data structures that support rank and select queries over a dictionary of integers. We shine a …
Dynamic suffix array with polylogarithmic queries and updates
D Kempa, T Kociumaka - Proceedings of the 54th Annual ACM SIGACT …, 2022 - dl.acm.org
The suffix array SA [1.. n] of a text T of length n is a permutation of {1,…, n} describing the
lexicographical ordering of suffixes of T and is considered to be one of the most important …
lexicographical ordering of suffixes of T and is considered to be one of the most important …