Compressed full-text indexes

G Navarro, V Mäkinen - ACM Computing Surveys (CSUR), 2007 - dl.acm.org
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …

Survey and taxonomy of lossless graph compression and space-efficient graph representations

M Besta, T Hoefler - arXiv preprint arXiv:1806.01799, 2018 - arxiv.org
Various graphs such as web or social networks may contain up to trillions of edges.
Compressing such datasets can accelerate graph processing by reducing the amount of I/O …

The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds

P Ferragina, G Vinciguerra - Proceedings of the VLDB Endowment, 2020 - dl.acm.org
We present the first learned index that supports predecessor, range queries and updates
within provably efficient time and space bounds in the worst case. In the (static) context of …

From theory to practice: Plug and play with succinct data structures

S Gog, T Beller, A Moffat, M Petri - … Copenhagen, Denmark, June 29–July 1 …, 2014 - Springer
Engineering efficient implementations of compact and succinct structures is time-consuming
and challenging, since there is no standard library of easy-to-use, highly optimized, and …

Xenome—a tool for classifying reads from xenograft samples

T Conway, J Wazny, A Bromage, M Tymms… - …, 2012 - academic.oup.com
Motivation: Shotgun sequence read data derived from xenograft material contains a mixture
of reads arising from the host and reads arising from the graft. Classifying the read mixture to …

Wavelet trees for all

G Navarro - Journal of Discrete Algorithms, 2014 - Elsevier
The wavelet tree is a versatile data structure that serves a number of purposes, from string
processing to computational geometry. It can be regarded as a device that represents a …

Succinct de Bruijn graphs

A Bowe, T Onodera, K Sadakane, T Shibuya - International workshop on …, 2012 - Springer
We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers
in a DNA sequence of length N has m edges, it can be represented in 4m + o(m) bits. This is …

The theory and practice of genome sequence assembly

JT Simpson, M Pop - Annual review of genomics and human …, 2015 - annualreviews.org
The current genomic revolution was made possible by joint advances in genome
sequencing technologies and computational approaches for analyzing sequence data. The …

Succinct colored de Bruijn graphs

MD Muggli, A Bowe, NR Noyes, PS Morley… - …, 2017 - academic.oup.com
Motivation: In 2012, Iqbal et al. introduced the colored de Bruijn graph, a variant of
the classic de Bruijn graph, which is aimed at 'detecting and genotyping simple and complex …

On compressing and indexing repetitive sequences

S Kreft, G Navarro - Theoretical Computer Science, 2013 - Elsevier
We introduce LZ-End, a new member of the Lempel–Ziv family of text compressors, which
achieves compression ratios close to those of LZ77 but is much faster at extracting arbitrary …