Compressed full-text indexes
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …
Indexing compressed text
P Ferragina, G Manzini - Journal of the ACM (JACM), 2005 - dl.acm.org
We design two compressed data structures for the full-text indexing problem that support
efficient substring searches using roughly the space required for storing the text in …
Efficient and accurate quantitative profiling of alternative splicing patterns of any complexity on a laptop
Alternative splicing (AS) is a widespread process underlying the generation of transcriptomic
and proteomic diversity and is frequently misregulated in human disease. Accordingly, an …
Wavelet trees for all
G Navarro - Journal of Discrete Algorithms, 2014 - Elsevier
The wavelet tree is a versatile data structure that serves a number of purposes, from string
processing to computational geometry. It can be regarded as a device that represents a …
Compressed representations of sequences and full-text indexes
Given a sequence S = s_1 s_2 … s_n of integers smaller than r = O(polylog(n)), we show how S
can be represented using nH_0(S) + o(n) bits, so that we can know any s_q, as well as …
Succinct suffix arrays based on run-length encoding
A succinct full-text self-index is a data structure built on a text T = t_1 t_2 ... t_n, which takes little
space (ideally close to that of the compressed text), permits efficient search for the …
Exploring genome characteristics and sequence quality without a reference
JT Simpson - Bioinformatics, 2014 - academic.oup.com
Motivation: The de novo assembly of large, complex genomes is a significant challenge with
currently available DNA sequencing technology. While many de novo assembly software …
Practical implementation of rank and select queries
R González, S Grabowski, V Mäkinen… - Poster Proc. Volume of …, 2005 - academia.edu
Research on succinct data structures has made significant progress in recent years. An
essential building block of many of those techniques is a data structure to perform rank and …
[Book] Handbook of computational molecular biology
S Aluru - 2005 - taylorfrancis.com
The enormous complexity of biological systems at the molecular level must be answered
with powerful computational methods. Computational biology is a young field, but has seen …
Rank/select operations on large alphabets: a tool for text indexing
We consider a generalization of the problem of supporting rank and select queries on binary
strings. Given a string of length n from an alphabet of size σ, we give the first representation …