Variable-order de Bruijn graphs

C Marchet, C Boucher, SJ Puglisi, P Medvedev… - Genome …, 2021 - genome.cshlp.org

High-throughput sequencing data sets are usually deposited in public repositories (e.g., the
European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached …

Cited by 113 · Related articles · All 18 versions

Survey and taxonomy of lossless graph compression and space-efficient graph representations

M Besta, T Hoefler - arXiv preprint arXiv:1806.01799, 2018 - arxiv.org

Various graphs such as web or social networks may contain up to trillions of edges.
Compressing such datasets can accelerate graph processing by reducing the amount of I/O …

Cited by 123 · Related articles · All 20 versions

Succinct de Bruijn graphs

A Bowe, T Onodera, K Sadakane, T Shibuya - International workshop on …, 2012 - Springer

We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers
in a DNA sequence of length N has m edges, it can be represented in 4m + o(m) bits. This is …

Cited by 253 · Related articles · All 16 versions

FMLRC: Hybrid long read error correction using an FM-index

JR Wang, J Holt, L McMillan, CD Jones - BMC Bioinformatics, 2018 - Springer

Background: Long read sequencing is changing the landscape of genomic research,
especially de novo assembly. Despite the high error rate inherent to long read technologies …

Cited by 147 · Related articles · All 13 versions

Accurate self-correction of errors in long reads using de Bruijn graphs

L Salmela, R Walve, E Rivals, E Ukkonen - Bioinformatics, 2017 - academic.oup.com

Motivation: New long read sequencing technologies, like PacBio SMRT and Oxford
NanoPore, can produce sequencing reads up to 50 000 bp long but with an error rate of at …

Cited by 150 · Related articles · All 20 versions

Representation of k-Mer Sets Using Spectrum-Preserving String Sets

A Rahman, P Medvedev - Journal of Computational Biology, 2021 - liebertpub.com

Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to
represent a set of k-mers is important for improving the scalability of bioinformatics analyses …

Cited by 65 · Related articles · All 9 versions

Indexing variation graphs

J Sirén - 2017 Proceedings of the nineteenth workshop on …, 2017 - SIAM

Variation graphs, which represent genetic variation within a population, are replacing
sequences as reference genomes. Path indexes are one of the most important tools for …

Cited by 107 · Related articles · All 4 versions

Data Structures to Represent a Set of k-long DNA Sequences

R Chikhi, J Holub, P Medvedev - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

The analysis of biological sequencing data has been one of the biggest applications of
string algorithms. The approaches used in many such applications are based on the …

Cited by 72 · Related articles · All 3 versions

BLight: efficient exact associative structure for k-mers

C Marchet, M Kerbiriou, A Limasset - Bioinformatics, 2021 - academic.oup.com

Motivation: A plethora of methods and applications share the fundamental need to associate
information to words for high-throughput sequence analysis. Doing so for billions of k-mers …

Cited by 36 · Related articles · All 7 versions

Overlap graphs and de Bruijn graphs: data structures for de novo genome assembly in the big data era

R Rizzi, S Beretta, M Patterson, Y Pirola, M Previtali… - Quantitative …, 2019 - Springer

Background: De novo genome assembly relies on two kinds of graphs: de Bruijn graphs and
overlap graphs. Overlap graphs are the basis for the Celera assembler, while de Bruijn …

Cited by 50 · Related articles · All 8 versions

Data structures based on k-mers for querying large collections of sequencing data sets