MetaGraph: Indexing and analysing nucleotide archives at petabase-scale
The amount of biological sequencing data available in public repositories is growing
exponentially, forming an invaluable biomedical research resource. Yet, making all this …
Meta-colored compacted de Bruijn graphs
The colored compacted de Bruijn graph (c-dBG) has become a fundamental tool used
across several areas of genomics and pangenomics. For example, it has been widely …
kmerDB: a database encompassing the set of genomic and proteomic sequence information for each species
The decrease in sequencing expenses has facilitated the creation of reference genomes
and proteomes for an expanding array of organisms. Nevertheless, no established …
Label-guided seed-chain-extend alignment on annotated De Bruijn graphs
Motivation Exponential growth in sequencing databases has motivated scalable De Bruijn
graph-based (DBG) indexing for searching these data, using annotations to label nodes with …
Designing efficient randstrobes for sequence similarity analyses
M Karami, A Soltani Mohammadi, M Martin… - …, 2024 - academic.oup.com
Motivation Substrings of length k, commonly referred to as k-mers, play a vital role in
sequence analysis. However, k-mers are limited to exact matches between sequences …
MegIS: High-Performance, Energy-Efficient, and Low-Cost Metagenomic Analysis with In-Storage Processing
Metagenomics, the study of the genome sequences of diverse organisms in a common
environment, has led to significant advances in many fields. Since the species present in a …
MetaGraph-MLA: Label-guided alignment to variable-order De Bruijn graphs
The amount of data stored in genomic sequence databases is growing exponentially, far
exceeding traditional indexing strategies' processing capabilities. Many recent indexing …
Conway-Bromage-Lyndon (CBL): an exact, dynamic representation of k-mer sets
In this paper, we introduce the Conway-Bromage-Lyndon (CBL) structure, a compressed,
dynamic and exact method for representing k-mer sets. Originating from Conway and …
Movi: a fast and cache-efficient full-text pangenome index
Efficient pangenome indexes are promising tools for many applications, including rapid
classification of nanopore sequencing reads. Recently, a compressed-index data structure …
Where the patterns are: repetition-aware compression for colored de Bruijn graphs
We describe lossless compressed data structures for the colored de Bruijn graph (or, c-dBG).
Given a collection of reference sequences, a c-dBG can be essentially regarded as a …