- Academic resource search

Data structures based on k-mers for querying large collections of sequencing data sets

C Marchet, C Boucher, SJ Puglisi, P Medvedev… - Genome …, 2021 - genome.cshlp.org

High-throughput sequencing data sets are usually deposited in public repositories (e.g., the
European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached …

Cited by 99 | Related articles | All 18 versions

Creating and using minimizer sketches in computational genomics

H Zheng, G Marçais, C Kingsford - Journal of Computational …, 2023 - liebertpub.com

Processing large data sets has become an essential part of computational genomics.
Greatly increased availability of sequence data from multiple sources has fueled …

Cited by 6 | Related articles | All 5 versions

Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs

G Holley, P Melsted - Genome biology, 2020 - Springer

Memory consumption of de Bruijn graphs is often prohibitive. Most de Bruijn graph-based
assemblers reduce the complexity by compacting paths into single vertices, but this is …

Cited by 151 | Related articles | All 16 versions

Sparse and skew hashing of k-mers

GE Pibiri - Bioinformatics, 2022 - academic.oup.com

Motivation: A dictionary of k-mers is a data structure that stores a set of n distinct k-mers and
supports membership queries. This data structure is at the heart of many important tasks in …

Cited by 43 | Related articles | All 18 versions

REINDEER: efficient indexing of k-mer presence and abundance in sequencing datasets

C Marchet, Z Iqbal, D Gautheret, M Salson… - …, 2020 - academic.oup.com

Motivation: In this work we present REINDEER, a novel computational method that performs
indexing of sequences and records their abundances across a collection of datasets. To the …

Cited by 62 | Related articles | All 25 versions

Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2

J Khan, M Kokot, S Deorowicz, R Patro - Genome biology, 2022 - Springer

The de Bruijn graph is a key data structure in modern computational genomics, and
construction of its compacted variant resides upstream of many genomic analyses. As the …

Cited by 26 | Related articles | All 15 versions

Representation of k-Mer Sets Using Spectrum-Preserving String Sets

A Rahman, P Medvedev - Journal of Computational Biology, 2021 - liebertpub.com

Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to
represent a set of k-mers is important for improving the scalability of bioinformatics analyses …

Cited by 58 | Related articles | All 9 versions

Small Searchable κ-Spectra via Subset Rank Queries on the Spectral Burrows-Wheeler Transform

JN Alanko, SJ Puglisi, J Vuohtoniemi - SIAM Conference on Applied and …, 2023 - SIAM

The κ-spectrum of a string is the set of all distinct substrings of length κ occurring in the
string. This is a lossy but computationally convenient representation of the information in the …

Cited by 12 | Related articles | All 2 versions

Conway–Bromage–Lyndon (CBL): an exact, dynamic representation of k-mer sets

I Martayan, B Cazaux, A Limasset, C Marchet - Bioinformatics, 2024 - academic.oup.com

In this article, we introduce the Conway–Bromage–Lyndon (CBL) structure, a compressed,
dynamic and exact method for representing k-mer sets. Originating from Conway and …

Cited by 3 | Related articles | All 3 versions

Efficient minimizer orders for large values of k using minimum decycling sets

D Pellow, L Pu, B Ekim, L Kotlar, B Berger… - Genome …, 2023 - genome.cshlp.org

Minimizers are ubiquitously used in data structures and algorithms for efficient searching,
mapping, and indexing of high-throughput DNA sequencing data. Minimizer schemes select …

Cited by 6 | Related articles | All 10 versions