MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices
The study of metagenomics has been much benefited from low-cost and high-throughput
sequencing technologies, yet the tremendous amount of data generated make analysis like …
Indexing highly repetitive string collections, part II: Compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
[BOOK][B] Modern information retrieval
R Baeza-Yates, B Ribeiro-Neto - 1999 - people.ischool.berkeley.edu
Information retrieval (IR) has changed considerably in recent years with the expansion of the
World Wide Web and the advent of modern and inexpensive graphical user interfaces and …
Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
Compressed full-text indexes
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …
Fully functional static and dynamic succinct trees
G Navarro, K Sadakane - ACM Transactions on Algorithms (TALG), 2014 - dl.acm.org
We propose new succinct representations of ordinal trees and match various space/time
lower bounds. It is known that any n-node static tree can be represented in 2n + o(n) bits so …
Wavelet trees for all
G Navarro - Journal of Discrete Algorithms, 2014 - Elsevier
The wavelet tree is a versatile data structure that serves a number of purposes, from string
processing to computational geometry. It can be regarded as a device that represents a …
Succinct de Bruijn graphs
A Bowe, T Onodera, K Sadakane, T Shibuya - International workshop on …, 2012 - Springer
We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers
in a DNA sequence of length N has m edges, it can be represented in 4m + o(m) bits. This is …
Navigating bottlenecks and trade-offs in genomic data analysis
Genome sequencing and analysis allow researchers to decode the functional information
hidden in DNA sequences as well as to study cell to cell variation within a cell population …
Space-efficient preprocessing schemes for range minimum queries on static arrays
J Fischer, V Heun - SIAM Journal on Computing, 2011 - SIAM
Given a static array of n totally ordered objects, the range minimum query problem is to build
a data structure that allows us to answer efficiently subsequent on-line queries of the form …