Computational graph pangenomics: a tutorial on data structures and their applications
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …
Hardware acceleration of genomics data analysis: challenges and opportunities
The significant decline in the cost of genome sequencing has dramatically changed the
typical bioinformatics pipeline for analysing sequencing data. Where traditionally, the …
SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads
Structural variants (SVs) account for a large amount of sequence variability across genomes
and play an important role in human genomics and precision medicine. Despite intense …
Representation of k-Mer Sets Using Spectrum-Preserving String Sets
A Rahman, P Medvedev - Journal of Computational Biology, 2021 - liebertpub.com
Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to
represent a set of k-mers is important for improving the scalability of bioinformatics analyses …
The Statistics of k-mers from a Sequence Undergoing a Simple Mutation Process Without Spurious Matches
k-mer-based methods are widely used in bioinformatics, but there are many gaps in our
understanding of their statistical properties. Here, we consider the simple model where a …
Disk compression of k-mer sets
K-mer based methods have become prevalent in many areas of bioinformatics. In
applications such as database search, they often work with large multi-terabyte-sized …
Sequencing technologies and analyses: where have we been and where are we going?
A wave of technologies transformed sequencing over a decade ago into the high-throughput
era, demanding research in new computational methods to analyze these data. The …
Kevlar: a mapping-free framework for accurate discovery of de novo variants
De novo genetic variants are an important source of causative variation in complex genetic
disorders. Many methods for variant discovery rely on mapping reads to a reference …
KAGE: fast alignment-free graph-based genotyping of SNPs and short indels
Genotyping is a core application of high-throughput sequencing. We present KAGE, a
genotyper for SNPs and short indels that is inspired by recent developments within graph …
Pangenomic genotyping with the marker array
T Mun, NSK Vaddadi, B Langmead - Algorithms for Molecular Biology, 2023 - Springer
We present a new method and software tool called rowbowt that applies a pangenome
index to the problem of inferring genotypes from short-read sequencing data. The method …