Disk-based k-mer counting on a PC

SC Manekar, SR Sathe - GigaScience, 2018 - academic.oup.com

The rapid development of high-throughput sequencing technologies means that hundreds of
gigabytes of sequencing data can be produced in a single study. Many bioinformatics tools …

Cited by 106 Related articles All 9 versions

[PDF] oup.com

KMC 3: counting and manipulating k-mer statistics

M Kokot, M Długosz, S Deorowicz - Bioinformatics, 2017 - academic.oup.com

Counting all k-mers in a given dataset is a standard procedure in many bioinformatics
applications. We introduce KMC3, a significant improvement of the former KMC2 algorithm …

Cited by 507 Related articles All 10 versions

[PDF] springer.com

Data compression for sequencing data

S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer

Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …

Cited by 124 Related articles All 13 versions

[PDF] oup.com

KMC 2: fast and resource-frugal k-mer counting

S Deorowicz, M Kokot, S Grabowski… - …, 2015 - academic.oup.com

Motivation: Building the histogram of occurrences of every k-symbol long substring of
nucleotide data is a standard step in many bioinformatics applications, known under the …

Cited by 304 Related articles All 10 versions

[PDF] oup.com

IVA: accurate de novo assembly of RNA virus genomes

M Hunt, A Gall, SH Ong, J Brener, B Ferns… - …, 2015 - academic.oup.com

Motivation: An accurate genome assembly from short read sequencing data is critical for
downstream analysis, for example allowing investigation of variants within a sequenced …

Cited by 208 Related articles All 14 versions

[PDF] oup.com

BLESS: bloom filter-based error correction solution for high-throughput sequencing reads

Y Heo, XL Wu, D Chen, J Ma, WM Hwu - Bioinformatics, 2014 - academic.oup.com

Motivation: Rapid advances in next-generation sequencing (NGS) technology have led to
exponential increase in the amount of genomic information. However, NGS reads contain far …

Cited by 172 Related articles All 10 versions

[PDF] springer.com

Gerbil: a fast and memory-efficient k-mer counter with GPU-support

M Erbert, S Rechner, M Müller-Hannemann - Algorithms for Molecular …, 2017 - Springer

Background: A basic task in bioinformatics is the counting of k-mers in genome sequences.
Existing k-mer counting tools are most often optimized for small k < 32 and suffer from …

Cited by 101 Related articles All 18 versions

[PDF] springer.com

Simplitigs as an efficient and scalable representation of de Bruijn graphs

K Břinda, M Baym, G Kucherov - Genome biology, 2021 - Springer

Abstract de Bruijn graphs play an essential role in bioinformatics, yet they lack a universal
scalable representation. Here, we introduce simplitigs as a compact, efficient, and scalable …

Cited by 44 Related articles All 23 versions

[PDF] plos.org

These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure

Q Zhang, J Pell, R Canino-Koning, AC Howe… - PloS one, 2014 - journals.plos.org

K-mer abundance analysis is widely used for many purposes in nucleotide sequence
analysis, including data preprocessing for de novo assembly, repeat detection, and …

Cited by 95 Related articles All 12 versions

[PDF] oup.com

Turtle: Identifying frequent k-mers with cache-efficient algorithms

RS Roy, D Bhattacharya, A Schliep - Bioinformatics, 2014 - academic.oup.com

Motivation: Counting the frequencies of k-mers in read libraries is often a first step in the
analysis of high-throughput sequencing data. Infrequent k-mers are assumed to be a result …

Cited by 91 Related articles All 15 versions